
Thank you for looking at my question! I have included a brief introduction below; any suggestions or comments would be greatly appreciated!

A traditional histogram is generated from an array (e.g. sample_array = [1,1,1,2,2,3,3,3,3,4]) using h = histogram(sample_array,nbins). In this example, with nbins = 4, I would get a simple histogram in which the height of each column is the number of times the corresponding value occurs in the sample array.

However, in my work I need some of the entries to be arrays rather than single values. For example:

sample_array = [1,1,[1,2],2,2,3,[2,3,4,5],3,4];

I am aware this does not work as written, since MATLAB concatenates the nested brackets into one flat array. For convenience I am instead using a cell array to hold the data:

sample_cell = {1,1,[1,2],2,2,3,[2,3,4,5],3,4};

What I need is the histogram of sample_cell in which EACH ENTRY of the cell has EQUAL WEIGHT, i.e. each entry contributes a total weight of 1, split evenly among its elements. The corresponding weights would be as follows:

sample_weight = {1,1,[1/2,1/2],1,1,1,[1/4,1/4,1/4,1/4],1,1};

From this, the resulting histogram would have the following counts in the bins for 1 through 4:

Bin: Count

1: 2.5

2: 2.75

3: 2.25

4: 1.25
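Spelled out, each count is the sum of the weights attached to every occurrence of that value. This is only a worked check of the numbers above, using the weights already listed in sample_weight:

```latex
\begin{aligned}
\text{bin 1:}\quad & 1 + 1 + \tfrac{1}{2} = 2.5 \\
\text{bin 2:}\quad & \tfrac{1}{2} + 1 + 1 + \tfrac{1}{4} = 2.75 \\
\text{bin 3:}\quad & 1 + \tfrac{1}{4} + 1 = 2.25 \\
\text{bin 4:}\quad & \tfrac{1}{4} + 1 = 1.25
\end{aligned}
```

The value 5 in [2,3,4,5] carries the remaining weight of 1/4, so the four listed counts sum to 8.75 rather than to 9, the number of cell entries.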

I am looking for a way to generate this histogram that does not rely on the least common multiple of the sizes of the entries. (I have a temporary solution that uses this quantity, but I cannot scale it up, because I am dealing with very large prime numbers that result in LCM > 10^9.)

Again, any help or suggestions that you might have would be greatly appreciated!

David Young on 6 Aug 2015
Edited: David Young on 6 Aug 2015

If all the samples are positive integers, and the bins are all centred on the positive integers and with unit width, as in the initial example, you can just do this:

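    % data
    sample_cell = {1,1,[1,2],2,2,3,[2,3,4,5],3,4};

    % put all the samples into one vector
    samples = cat(2, sample_cell{:});

    % each cell entry has total weight 1, split evenly among its elements
    weight_cell = cellfun(@(a) ones(size(a))/length(a), sample_cell, ...
        'UniformOutput', false);
    weights = cat(2, weight_cell{:});

    % sum the weights for each integer value
    counts = accumarray(samples(:), weights(:)).';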
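As a quick check, here is a self-contained sketch that runs the same steps on the data from the question and prints the result. It assumes, as stated above, that every sample is a positive integer. The only change is numel in place of length, which is equivalent for vectors.

```matlab
% Self-contained check of the integer-valued case (data from the question).
sample_cell = {1,1,[1,2],2,2,3,[2,3,4,5],3,4};

% Flatten the values and build matching weights: each cell entry carries a
% total weight of 1, split evenly among its elements.
samples = cat(2, sample_cell{:});
weight_cell = cellfun(@(a) ones(size(a))/numel(a), sample_cell, ...
    'UniformOutput', false);
weights = cat(2, weight_cell{:});

% Sum the weights per integer value.
counts = accumarray(samples(:), weights(:)).';

disp(counts)    % values: 2.5  2.75  2.25  1.25  0.25

% The weights of each entry sum to 1, so the counts sum to the number of entries.
assert(abs(sum(counts) - numel(sample_cell)) < 1e-12);
```

The first four values match the table in the question. The fifth, 0.25, comes from the value 5 in [2,3,4,5]; use counts(1:4) if only bins 1 to 4 are wanted.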

If this isn't the case (as in your more accurate example in the comments), you have to modify the code above by putting the samples into bins before weighting and counting them. The code then looks like this:

    % data and histogram parameters
    sample_cell = {[0,0.41],0.32,[0.13,0.67,0.2],0.9,[0.3,1,0.89]};
    edges = 0:0.1:1;

    % put all the samples into one vector, and make a vector of their weights
    samples = cat(2, sample_cell{:});
    weight_cell = cellfun(@(a) ones(size(a))/length(a), sample_cell, ...
        'UniformOutput', false);
    weights = cat(2, weight_cell{:});

    % work out which bin of the histogram each sample falls into
    % (discretize returns NaN for samples outside the edges, which accumarray rejects)
    bins = discretize(samples, edges);

    % now form the counts, applying the weight of each sample; the explicit
    % size keeps empty trailing bins so that wtdcounts matches centres
    nbins = numel(edges) - 1;
    wtdcounts = accumarray(bins(:), weights(:), [nbins 1]).';

    % normalise so that the counts sum to 1
    normcounts = wtdcounts/sum(wtdcounts);

    % plot like a histogram, with bars centred on the bin midpoints
    centres = conv(edges, [0.5 0.5], 'valid');
    bar(centres, normcounts, 1);

This gives the same results as the code in your comment, but should be a great deal more economical, I think.
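If the binning step is needed in several places, it could be wrapped in a small function. The sketch below is one possible packaging, not part of the code above. The name weightedCellCounts is hypothetical, and dropping samples that fall outside the edges is an assumption about the desired behaviour; another choice would be to raise an error.

```matlab
function wtdcounts = weightedCellCounts(sample_cell, edges)
% weightedCellCounts  Hypothetical helper (not a built-in MATLAB function).
% Each cell entry contributes a total weight of 1, split evenly among its
% elements. Samples outside the edges are dropped (an assumption).

% Flatten values and weights into matching column vectors.
vals = cellfun(@(a) a(:), sample_cell, 'UniformOutput', false);
wts = cellfun(@(a) ones(numel(a), 1)/numel(a), sample_cell, ...
    'UniformOutput', false);
samples = cat(1, vals{:});
weights = cat(1, wts{:});

% discretize returns NaN for samples outside the edges; skip those.
bins = discretize(samples, edges);
keep = ~isnan(bins);

% Fix the output length at the number of bins, so empty bins stay as zeros.
nbins = numel(edges) - 1;
wtdcounts = accumarray(bins(keep), weights(keep), [nbins, 1]).';
end
```

With the data above it would be called as wtdcounts = weightedCellCounts(sample_cell, edges). The result always has numel(edges)-1 elements, so it can be passed to bar together with centres.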

