splitapply doesn't split well into bins

Amit Ifrach on 11 Oct 2021
Commented: Matt J on 13 Oct 2021
לק"י
Hi guys,
I wanted splitapply to split my data into 90 different bins, but for some reason it returns only 50.
Here is what I did:
First, I loaded 'cell1areas' (size 18800x1), a variable that contains a vector of areas.
Then I created the 'bins' or 'groups' from 0 to 90000 in steps of 1000 in the 'edges' variable.
After that, I applied the discretize function to the area vector. The maximum value of the result, dis, is 62 (max(dis)).
I used isfinite to build 'valid', which checks whether each value is a finite number rather than NaN.
Finally, I called splitapply with @sum to sum all the values in each group.
The problem is that spltsum has only 50 'bins' (vector elements) instead of the desired 90 (the number of bins in edges), or even 62, given that the largest bin number discretize returned is 62 and not 90.
Thanks in advance, this community is great and really helpful!
The code:
edges=[0 0:1000:90000 90000];
dis=discretize(cell1areas, edges);
valid=isfinite(cell1areas);
spltsum=splitapply(@sum , cell1areas(valid) , findgroups(dis(valid)) );

Accepted Answer

Matt J on 11 Oct 2021
Edited: Matt J on 13 Oct 2021
You can use accumarray instead.
spltsum=accumarray(dis(valid), cell1areas(valid) , [90,1]);
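To illustrate why the splitapply version comes back shorter, here is a minimal sketch on toy data. The values in `areas`, the 5-bin `edges`, and the names `nBins`, `occupied`, `sumsOccupied` and `sumsAll` are made up for this example and are not from the original post. findgroups renumbers only the bins that actually occur, so empty bins get no group; accumarray with an explicit size keeps one slot per bin.

```matlab
% Toy data (hypothetical values, not the poster's cell1areas)
areas = [150; 250; 2500; 2700; 4100];
edges = 0:1000:5000;                  % 6 edges -> 5 bins
nBins = numel(edges) - 1;

bin   = discretize(areas, edges);     % bin numbers 1, 1, 3, 3, 5
valid = ~isnan(bin);                  % drops values that fall outside the edges

% findgroups renumbers the occupied bins as 1..K, so empty bins disappear
[g, occupied] = findgroups(bin(valid));
sumsOccupied  = splitapply(@sum, areas(valid), g);            % one sum per occupied bin

% accumarray with an explicit size keeps every bin (empty bins sum to 0)
sumsAll = accumarray(bin(valid), areas(valid), [nBins, 1]);   % one sum per bin
```

If the splitapply result is needed anyway, `occupied` holds the original bin number for each of its elements, so `sumsAll(occupied)` lines up with `sumsOccupied`.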
5 Comments
Amit Ifrach on 13 Oct 2021
לק"י
Thanks!
One more (last) question: I want the data to be split into the bins defined by:
edges=[0 0:1000:90000 90000];
But as far as I understand, accumarray arbitrarily divides the data into 90 bins without paying attention to the required bin widths (because of the last argument, [90,1]). Is that true?
spltsum=accumarray(dis(valid), cell1areas(valid) , [90,1]);
If so, I need a way to split the data by the edges vector alone.
Or, to put it another way:
I assume accumarray only sums the values in cell1areas that share the same bin number (an integer).
The binning of cell1areas is done beforehand by the discretize function (the dis variable in this example).
accumarray then only sums all the values in cell1areas that have the same bin number (as assigned in dis).
If so, why do I need to pass [90,1] to accumarray? It should know that I want 90 bins of width 1000 up to 90000, not arbitrary values that MATLAB thinks are suitable for dividing the data I give it.
Thanks!
Matt J on 13 Oct 2021
Not all 90 bins contain counts. If you don't tell accumarray how many bins you have, it will assume you only have max(dis(valid)) bins.
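A small sketch of that default, using made-up indices and values (`subs`, `vals`, `a` and `b` are hypothetical names, not from the thread). The size argument only sets the length of the output; which bin each value lands in is still decided entirely by the indices that discretize produced.

```matlab
subs = [1; 1; 3];                     % bin indices; bins 2, 4 and 5 are empty
vals = [10; 20; 30];

a = accumarray(subs, vals);           % no size given: length is max(subs) = 3
b = accumarray(subs, vals, [5, 1]);   % size given: length 5, empty bins are 0
```

Here `a` is `[30; 0; 30]` and `b` is `[30; 0; 30; 0; 0]`: the first three sums are identical, and the explicit size only appends the trailing empty bins.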
