Plot a curve that splits data into two sets

RPatel on 4 Aug 2017
Commented: Image Analyst on 14 Aug 2017
Hello,
I have data points that represent two classes (collisions avoided and probable collisions). My goal is to plot a curve (a polynomial equation) that splits the data points in a chosen ratio (say, 90% collisions avoided to 10% probable collisions). Note that the data points of the two classes lie very close to each other.
I have tried the 'fit' function in MATLAB, and for a polynomial of degree 8, here is what I get (see the image). But it doesn't split the data as required.
I am looking at Support Vector Machines for binary classification (I am not an expert in this domain), but I am not sure whether they would help. How can I get the data segregation I want?
Best,
Raj

Answers (3)

Greg Heath on 4 Aug 2017
Your data is extremely discontinuous. The best you can hope for is a decision tree.
Hope this helps
Thank you for formally accepting my answer
Greg
  2 Comments
RPatel on 14 Aug 2017
Thanks Greg for your suggestion, but it will not help my study...
Image Analyst on 14 Aug 2017
Too bad, because I think that's your best shot at a possible solution. Since your data overlaps so much, I think those two parameters are not enough to do the discrimination. You'd best look for a third or fourth parameter, like acceleration, velocity vector angles, or something similar. If you can't, then I think a treebagger/random forest/decision tree type of approach is the best you can hope for, like Greg said. See the scatterplot example on https://www.mathworks.com/help/stats/ensemble-methods.html#bsx62vu
Actually, your ad hoc convex hull example is somewhat related to a treebagger type of solution. It also sounds a bit like DBSCAN: https://en.wikipedia.org/wiki/DBSCAN
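To make the decision tree suggestion concrete, here is a minimal sketch of a depth-limited tree on two features. It is written in plain Python rather than MATLAB purely as a language-neutral illustration; the toy (distance, velocity) points, the label convention (1 = probable collision, 0 = collision avoided) and all function names are assumptions made up for this example, not anything from the original data. Each leaf stores the fraction of probable collisions among the training points that land in it, so with overlapping classes a leaf reports a ratio rather than a hard yes/no.

```python
# Minimal depth-limited decision tree on two features (illustrative sketch only).
# Toy data and label convention are hypothetical: 1 = probable collision, 0 = avoided.
points = [(5, 30), (8, 28), (10, 25), (12, 20), (40, 10), (45, 12), (50, 15), (55, 8)]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

def gini(ys):
    """Gini impurity of a list of 0/1 labels."""
    if not ys:
        return 0.0
    p = sum(ys) / len(ys)
    return 2.0 * p * (1.0 - p)

def best_split(pts, ys):
    """Return the (feature_index, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(ys)
    n = len(pts)
    for f in range(len(pts[0])):
        values = sorted(set(p[f] for p in pts))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for p, y in zip(pts, ys) if p[f] <= t]
            right = [y for p, y in zip(pts, ys) if p[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score - 1e-12:
                best, best_score = (f, t), score
    return best

def build_tree(pts, ys, depth=3):
    """Leaf: fraction of class 1. Internal node: (feature, threshold, left, right)."""
    split = best_split(pts, ys) if depth > 0 else None
    if split is None:
        return sum(ys) / len(ys)
    f, t = split
    left = [(p, y) for p, y in zip(pts, ys) if p[f] <= t]
    right = [(p, y) for p, y in zip(pts, ys) if p[f] > t]
    return (f, t,
            build_tree([p for p, _ in left], [y for _, y in left], depth - 1),
            build_tree([p for p, _ in right], [y for _, y in right], depth - 1))

def predict_proba(tree, point):
    """Walk down the tree and return the leaf's fraction of probable collisions."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if point[f] <= t else right
    return tree

tree = build_tree(points, labels)
print(tree)
print(predict_proba(tree, (7, 29)), predict_proba(tree, (48, 11)))
```

On real, heavily overlapping data the leaves would hold intermediate fractions, and an ensemble of such trees (the treebagger/random forest idea mentioned above) would usually be more stable than a single tree.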



John D'Errico on 4 Aug 2017
But why would a polynomial regression fit have any chance of satisfying this goal? It would be pure random chance if it came even close. It is especially wrong to hope that such a fit, based purely on distance as the independent variable, would have a chance.
It seems you are looking for a nonlinear discriminant curve, based on both velocity and distance. I'd suggest neural nets, but just because you want to see a 90% success rate does not mean any such function exists. You could just as easily have insisted on a 99.99% target success rate. If wishes were horses, beggars would ride.
What you need to be modeling is a boolean result, thus collision or not, as a function of TWO independent variables: vehicle velocity and inter-vehicle distance. Again, use a tool of your choice. But a polynomial regression is still NOT the tool I would ever advise here.
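As one possible illustration of modeling a boolean outcome as a function of two variables, here is a small logistic regression sketch in plain Python (a swapped-in technique chosen for brevity, not the neural net mentioned above, and not MATLAB code). The toy (distance, velocity) data, the label convention (1 = probable collision, 0 = collision avoided) and every function name are assumptions for this example. The fitted model gives a collision probability for each point; a contour of that probability at a chosen cutoff plays the role of the discriminant curve, and moving the cutoff changes the ratio of the two classes on either side. Whether any cutoff reaches a given target ratio depends entirely on the data.

```python
import math

# Toy data, invented for illustration: (distance, velocity) and 0/1 outcome.
points = [(5, 30), (8, 28), (10, 25), (12, 20), (40, 10), (45, 12), (50, 15), (55, 8)]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # assumed: 1 = probable collision, 0 = avoided

def standardize(rows):
    """Scale each column to zero mean and unit standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, b, x):
    """Modeled probability that point x is a probable collision."""
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))

def fit_logistic(rows, ys, lr=0.5, steps=2000):
    """Plain batch gradient descent on the logistic loss."""
    w, b, n = [0.0] * len(rows[0]), 0.0, len(rows)
    for _ in range(steps):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x, y in zip(rows, ys):
            err = predict_proba(w, b, x) - y
            grad_b += err
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
        b -= lr * grad_b / n
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
    return w, b

def fraction_avoided(probs, ys, cutoff):
    """Among points with collision probability below cutoff, the share labeled avoided."""
    kept = [y for p, y in zip(probs, ys) if p < cutoff]
    return kept.count(0) / len(kept) if kept else float("nan")

scaled = standardize(points)
w, b = fit_logistic(scaled, labels)
probs = [predict_proba(w, b, x) for x in scaled]
predicted = [1 if p >= 0.5 else 0 for p in probs]
print(predicted)
for cutoff in (0.1, 0.5, 0.9):
    print(cutoff, fraction_avoided(probs, labels, cutoff))
```

With only the two raw features the boundary is a straight line in the (distance, velocity) plane; adding squared or product terms as extra columns would bend it into a curve, at the cost of more parameters to fit.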
  1 Comment
RPatel on 4 Aug 2017
Hello John,
I have never had to do something of this sort before, and I have no idea about the diverse tools MATLAB offers to solve this kind of issue. 'Polynomial regression fit' is just one of them that I came across and tried.
Indeed, I would like to have a different curve for each success rate (90%, 99%, etc.).
I will have a look at neural nets to see if it helps. Thanks for your comments :)



RPatel on 14 Aug 2017
As there doesn't seem to be any solution to this, here is what I did:
I found the centroid and chose the x% of points closest to it. Then I plotted a convex hull around those points. Next, I check whether a particular point of interest lies inside or outside the convex hull. Using this, I manage to get the percentage of collisions avoided to probable collisions (for the points inside the hull).
Hope this helps others who might face a similar situation...
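The procedure described above can be sketched roughly as follows, in plain Python as a language-neutral illustration (in MATLAB, `convhull` and `inpolygon` cover the hull and the inside test). The post does not say which points the centroid is taken over, so this sketch assumes it is the centroid of the "collisions avoided" class and that the hull is built from the closest fraction of that class; the toy coordinates and all function names are likewise made up for the example.

```python
import math

# Toy 2-D points, invented for illustration only.
avoided = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (10, 10)]
probable = [(1, 1), (3, 2), (6, 6), (9, 1)]

def centroid(pts):
    """Mean position of a list of (x, y) points."""
    return (sum(p[0] for p in pts) / len(pts), sum(p[1] for p in pts) / len(pts))

def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(pts):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(pts))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def inside_hull(hull, p):
    """True if p lies inside or on the boundary of a counter-clockwise convex hull."""
    n = len(hull)
    if n < 3:
        return False
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))

def hull_of_closest(pts, fraction):
    """Convex hull of the given fraction of pts that lie closest to their centroid."""
    cx, cy = centroid(pts)
    ranked = sorted(pts, key=lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2)
    keep = max(3, math.ceil(fraction * len(pts)))
    return convex_hull(ranked[:keep])

hull = hull_of_closest(avoided, 0.8)
n_avoided = sum(inside_hull(hull, p) for p in avoided)
n_probable = sum(inside_hull(hull, p) for p in probable)
print(hull)
print(n_avoided, n_probable, n_avoided / (n_avoided + n_probable))
```

Varying the fraction passed to `hull_of_closest` grows or shrinks the hull, which is how different avoided-to-probable ratios inside the region would be explored.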
