Big data question: how to generate a variable efficiently and aggregate

Question

I have a file with tens of millions of observations, keyed by a string identifier, which I load as a datastore:

V1      V2           V3   V4
KLM88   2001-06-30   10   COMPANY1
KLM88   2000-12-31   20   COMPANY1
MNH7C   2001-09-30   23   COMPANY1
MNH7C   2001-06-30   15   COMPANY1
MNH7C   2000-12-31    6   COMPANY1
HG9LB   2000-12-31    2   COMPANY1

I also have a MAT-file with some extra information that maps each value of the first variable to a country:

KLM88   COUNTRYA
MNH7C   COUNTRYA
HG9LB   COUNTRYB

I want to aggregate my dataset by country, date, and company, so that the end result is:

2001-09-30   23   COMPANY1   COUNTRYA
2001-06-30   25   COMPANY1   COUNTRYA
2000-12-31   26   COMPANY1   COUNTRYA
2000-12-31    2   COMPANY1   COUNTRYB

I know I can do this by reading the data chunk by chunk and assigning the country in a for loop. However, that takes a huge amount of time. Are there any other suggestions for how to do this? I am fairly new to tall arrays, mapreduce, and related concepts, so I am not sure how to get to the result I want more efficiently.

Answer 1

Sean de Wolski on 25 Jul 2017

Use the table join function.

https://www.mathworks.com/help/releases/R2017a/matlab/ref/join.html
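
As a rough illustration of this suggestion, here is a minimal sketch that uses the sample rows from the question as small in-memory tables. The lookup variable name Country and the table names data, lookup, joined, and result are assumptions made for the example, not names from the original post. The aggregation step uses findgroups and splitapply, which is one possible way to sum by group and is not something the answer itself prescribes.

```matlab
% Sample rows from the question, held in memory as a stand-in for the datastore.
V1 = {'KLM88'; 'KLM88'; 'MNH7C'; 'MNH7C'; 'MNH7C'; 'HG9LB'};
V2 = datetime({'2001-06-30'; '2000-12-31'; '2001-09-30'; ...
               '2001-06-30'; '2000-12-31'; '2000-12-31'}, ...
              'InputFormat', 'yyyy-MM-dd');
V3 = [10; 20; 23; 15; 6; 2];
V4 = repmat({'COMPANY1'}, 6, 1);
data = table(V1, V2, V3, V4);

% Lookup table as it might be stored in the MAT-file.
% The variable name Country is an assumption.
lookup = table({'KLM88'; 'MNH7C'; 'HG9LB'}, ...
               {'COUNTRYA'; 'COUNTRYA'; 'COUNTRYB'}, ...
               'VariableNames', {'V1', 'Country'});

% Attach the country to every observation by matching on V1.
% join requires each key to appear exactly once in the lookup table.
joined = join(data, lookup, 'Keys', 'V1');

% Form one group per (date, company, country) combination and sum V3 within each group.
[G, dates, companies, countries] = findgroups(joined.V2, joined.V4, joined.Country);
totals = splitapply(@sum, joined.V3, G);

result = table(dates, totals, companies, countries, ...
               'VariableNames', {'V2', 'V3', 'V4', 'Country'});
disp(result)
```

On this sample the totals should match the ones in the question: 23, 25, and 26 for COUNTRYA, and 2 for COUNTRYB. For the full dataset, the same calls could in principle be applied to a tall table created from the datastore (for example data = tall(ds), assuming ds is the datastore), with gather called on the final result. That has not been tested here and should be checked against the tall array documentation for your release.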