
How to detect repetition in data?

Jacqueline on 2 Jul 2013
Hello,
So I have various data files containing different information, such as engine speed, engine torque, etc. Each file has about 10,000 points, one per second (so the data was gathered over more than two hours). I'm trying to analyze the data so that if the value stays the same for 60 seconds, that is flagged as an error in the data. For example, if the engine speed was 79.356 for 60 consecutive data points, there is an error.
How do I go about doing this?

Accepted Answer

Evan on 2 Jul 2013
Edited: Evan on 2 Jul 2013
Do you only want to identify adjacent points of repetition in your data, or any points that are not unique? If you want the former, you could try loading the data into a numerical matrix and using the "diff" command across your vector. Any point where the difference is zero is a repeat of the value before it. In this way you can determine the beginning, extent, etc. of data repetition for whatever conditions you need to meet to throw an error.
Example:
data = 100*rand(1,10000); %random dataset
data(1,50:120) = 79.356; %set some data to constant value
datarep = ~diff(data); %true where a point equals the one before it
Now, you can count the run-length of each set of repeated data. There might be other ways of doing it, but for run lengths I often convert to a string and use "regexp."
s = regexprep(num2str(datarep),' ',''); %convert to string, remove spaces
[ids, runs] = regexp(s,'1+','start','match'); %start index and text of each run of ones
l = cellfun(@numel,runs); %number of ones in each run
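In this way, ids will tell you where each set of repeated values starts, and l will tell you the length of each. Since each 1 in datarep marks a point equal to the one before it, a run of length l(k) covers l(k)+1 identical data points starting at index ids(k). This gives you enough information to check whether your error conditions are met.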
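For the 60-second check specifically, the same run information can be obtained without the string conversion. The following is only a minimal sketch under stated assumptions: the names minRun, isSame, d, runStart, runEnd, runLen, badStart and badLen are chosen here for illustration, the threshold of 60 samples comes from the question, and "the same" is assumed to mean exactly equal values.

```matlab
data = 100*rand(1,10000);         % random dataset, as in the example above
data(1,50:120) = 79.356;          % 71 identical samples starting at index 50

minRun = 60;                      % assumed threshold: 60 identical samples in a row

isSame = diff(data) == 0;         % true where a sample equals its predecessor
d = diff([0, isSame, 0]);         % +1 where a run of trues starts, -1 just after it ends
runStart = find(d == 1);          % index of the first sample in each constant run
runEnd   = find(d == -1);         % index of the last sample in each constant run
runLen   = runEnd - runStart + 1; % number of identical samples in each run

bad      = runLen >= minRun;      % runs long enough to count as an error
badStart = runStart(bad);         % where each offending run begins
badLen   = runLen(bad);           % how many samples each offending run spans
```

If badStart is empty, no stretch of at least minRun identical samples was found. If the logged values carry small numerical noise, comparing abs(diff(data)) against a tolerance instead of testing for exact zero may be more appropriate; that tolerance would be an additional assumption.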
1 Comment
Jacqueline on 3 Jul 2013
I kind of get what you're doing. I understand the diff function, but you lost me with the rest. When I use the diff function on a variable, it gives me a long list of numbers. I want to know if/where there are 60 zeros in a row, because that is where the data has not changed for 60 seconds. How do I do that?


More Answers (1)

Kwen on 2 Jul 2013
I would use a loop and the unique function.
I'm not sure your problem is consistent, though: even with the unique function you can have values whose counts add up to more than 60, but that would not necessarily be an error?
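One possible reading of the loop suggestion, assuming only adjacent repeats matter, is a running counter over the data. This sketch does not use unique, since unique reports which values occur but ignores whether they are next to each other. The names minRun, count and errStart are hypothetical, and the test data is borrowed from the accepted answer.

```matlab
data = 100*rand(1,10000);     % random dataset (same setup as the accepted answer)
data(1,50:120) = 79.356;      % constant stretch to be detected

minRun = 60;                  % assumed threshold, in samples
count = 1;                    % length of the current run of equal values
errStart = [];                % start index of each run that reaches minRun

for k = 2:numel(data)
    if data(k) == data(k-1)
        count = count + 1;
        if count == minRun    % record each long run once, when it first qualifies
            errStart(end+1) = k - minRun + 1;
        end
    else
        count = 1;            % value changed, so start a new run
    end
end
```

After the loop, errStart holds the index at which each offending run begins, or is empty if no value was held for minRun consecutive samples.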
