Distance-based clustering for 10-20 million 3D points

Hi.
I am looking for an efficient way to cluster 10-20 million unorganized 3D points by distance: given a distance threshold, every point closer than that threshold to a neighbour should end up in the same cluster as that neighbour.
Any implementation of DBSCAN (or similar) able to deal with the kind and amount of data I have described would do the job.
Thanks.

10 comments

I tried a 13M-point dataset with MATLAB's new dbscan function and with pcsegdist, and gave up after 2 h of processing in both cases.
Ideally, I'm looking for something that can do this in under 2-3 min on a regular PC, and I am open to using trees or voxelization.
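For reference, the clustering described in the question (join every point to any neighbour closer than a threshold) amounts to finding the connected components of a neighbour graph. Below is a minimal Python sketch of that idea using a KD-tree, assuming NumPy and SciPy are available. The function name `threshold_clusters` is made up for illustration, and nothing here has been timed on 13M points.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components
from scipy.spatial import cKDTree


def threshold_clusters(points, max_dist):
    """Label points so that points linked by hops of at most max_dist share a label."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    tree = cKDTree(points)
    # All index pairs (i, j) with i < j whose distance is at most max_dist.
    pairs = tree.query_pairs(r=max_dist, output_type="ndarray")
    # Sparse graph with one edge per close pair.
    graph = coo_matrix(
        (np.ones(len(pairs), dtype=np.int8), (pairs[:, 0], pairs[:, 1])),
        shape=(n, n),
    )
    # Each connected component of the graph is one cluster.
    n_clusters, labels = connected_components(graph, directed=False)
    return n_clusters, labels


pts = [[0, 0, 0], [0.5, 0, 0], [1.0, 0, 0], [5, 0, 0], [5.4, 0, 0]]
n_clusters, labels = threshold_clusters(pts, max_dist=0.6)
```

The number of pairs grows quickly with point density, so on a dense cloud of this size the pair list may not fit in memory. That is one reason to consider voxelizing first.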
Elif Özer on 24 May 2020
Edited: Elif Özer on 24 May 2020
Hello, did you find a solution for your problem?
Hi. Yes, but outside MATLAB, I'm afraid.
:(( What did you do? Can you explain? If you are willing to share, my email is ozer15@itu.edu.tr.
Carlos Cabo on 25 May 2020
Edited: Carlos Cabo on 25 May 2020
There are a few simple, public options. One of the easiest to implement could be the DBSCAN function from sklearn.cluster in Python, combined with a voxelization. Of course, you could try to run something similar to what I describe in the initial post without voxelizing, but it wouldn't finish in 2-3 min with 13M points.
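A rough sketch of what "DBSCAN plus voxelization" could look like, assuming NumPy and scikit-learn. This is not necessarily the exact pipeline used above; the function name `voxel_dbscan` and the choice of clustering voxel indices rather than voxel centres are assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import DBSCAN


def voxel_dbscan(points, voxel_size):
    """Cluster points by grouping occupied voxels that touch each other."""
    points = np.asarray(points, dtype=float)
    # Integer voxel index of every point.
    vox = np.floor((points - points.min(axis=0)) / voxel_size).astype(np.int64)
    # One row per occupied voxel; `inverse` maps each point to its voxel row.
    occupied, inverse = np.unique(vox, axis=0, return_inverse=True)
    inverse = inverse.ravel()  # keep it 1-D regardless of NumPy version
    # eps just above sqrt(3) links voxels that share a face, edge or corner;
    # min_samples=1 makes every occupied voxel a core sample (no noise label).
    voxel_labels = DBSCAN(eps=1.8, min_samples=1).fit_predict(occupied)
    # Every point inherits the label of its voxel.
    return voxel_labels[inverse]


pts = np.array([[0, 0, 0], [0.4, 0, 0], [0.9, 0.2, 0],
                [10, 10, 10], [10.3, 10, 10]])
labels = voxel_dbscan(pts, voxel_size=0.5)
```

With the voxel size set to the distance threshold d, two points closer than d always fall in the same or in touching voxels, so a true cluster is never split. The price is precision: points in touching voxels can be up to 2·sqrt(3)·d apart, so clusters separated by a small gap may be merged. DBSCAN only sees the occupied voxels, which are usually far fewer than the points.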
There is a dbscan in MATLAB now. It's in the Statistics and Machine Learning Toolbox.
>> which dbscan
C:\Program Files\MATLAB\R2020a\toolbox\stats\stats\dbscan.m
clc; clear; clear all;
pntCld = pcread('ism_train_cat.pcd');
number=pntCld.Count
points=pntCld.Location;
X=points;
Y=points;
D = pdist2(X,Y);
radius=2;
minpts=10;
idx = dbscan(D,radius,minpts,'Distance','precomputed')
subplot(1,2,1)
pcshow(pntCld)
subplot(1,2,2)
PlotClusterinResult(points, idx);
title(['DBSCAN Clustering (\epsilon = ' num2str(radius) ', MinPts = ' num2str(minpts) ')']);
My aim is to cluster point cloud data into dense and sparse areas and to obtain the points of each cluster. dbscan gives me a 2D solution, and right now I can't identify the dense areas in 3D. How can I solve this problem? Is there any way to convert it to a 3D figure?
It seems that you are using only X and Y coordinates.
Also, the way you assign values to X and Y doesn't seem to be appropriate.
Moreover, if I understood your aim correctly, DBSCAN may not be the best solution: your point cloud seems to be quite homogeneous and there is no visible separation between groups of points.
The name DBSCAN often leads to some confusion: it does cluster data based on density, but it usually cannot separate groups of points with different densities unless (i) they are separated (there are gaps between them, and those gaps are bigger than the spacing of the points expected to be in the same group), or (ii) you apply DBSCAN iteratively at different 'scales' (i.e. with different parameters: distance threshold and/or min_n_points).
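A small synthetic illustration of case (i), assuming NumPy and scikit-learn. The grids, spacings and parameters are invented for this example and are not taken from the point cloud discussed above.

```python
import numpy as np
from sklearn.cluster import DBSCAN


def grid(nx, ny, spacing, x0):
    """Flat (z = 0) regular grid of nx * ny points starting at x = x0."""
    xs = x0 + spacing * np.arange(nx)
    ys = spacing * np.arange(ny)
    gx, gy = np.meshgrid(xs, ys)
    return np.column_stack([gx.ravel(), gy.ravel(), np.zeros(gx.size)])


def count_clusters(points, eps, min_samples):
    """Number of DBSCAN clusters, not counting the noise label -1."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points)
    return len(set(labels.tolist()) - {-1})


dense = grid(10, 10, 0.1, x0=0.0)                        # spacing 0.1, x up to 0.9
touching = np.vstack([dense, grid(4, 4, 0.3, x0=1.1)])   # sparse patch, 0.2 gap
separated = np.vstack([dense, grid(4, 4, 0.3, x0=2.0)])  # sparse patch, 1.1 gap

# eps must be at least 0.3 to hold the sparse patch together.
n_touching = count_clusters(touching, eps=0.35, min_samples=3)
n_separated = count_clusters(separated, eps=0.35, min_samples=3)
print(n_touching, n_separated)
```

With these numbers the touching case should come out as a single cluster and the separated case as two. Any eps large enough to keep the sparse patch connected also bridges the 0.2 gap, so the density difference alone does not split the groups.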
@Image Analyst: If the function hasn't changed since the 2019a version, I have tried it, and it doesn't seem to be very efficient even with just a few million points in 3D.
It doesn't seem to use any space partitioning structure (or at least I didn't find any reference to one).
Ali on 14 Jul 2020
@Carlos, you have to downsample the point cloud first; this is the approach recommended in the MATLAB documentation (see pcdownsample).
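For readers working outside MATLAB, here is a minimal NumPy sketch of grid-average downsampling, which is roughly what the 'gridAverage' option of pcdownsample does as far as I understand it. The function name `grid_average` is hypothetical, and MATLAB's exact cell alignment may differ.

```python
import numpy as np


def grid_average(points, grid_step):
    """Replace all points in each grid cell with their centroid."""
    points = np.asarray(points, dtype=float)
    # Integer cell index of every point.
    cells = np.floor((points - points.min(axis=0)) / grid_step).astype(np.int64)
    _, inverse, counts = np.unique(
        cells, axis=0, return_inverse=True, return_counts=True
    )
    inverse = inverse.ravel()  # keep it 1-D regardless of NumPy version
    # Sum the coordinates per cell, then divide by the cell population.
    sums = np.zeros((len(counts), points.shape[1]))
    np.add.at(sums, inverse, points)
    return sums / counts[:, None], inverse


pts = [[0.0, 0, 0], [0.2, 0, 0], [1.5, 0, 0]]
centroids, inverse = grid_average(pts, grid_step=1.0)
```

The returned `inverse` array records which centroid each original point was merged into, so labels computed on the downsampled cloud can be copied back to the full cloud with `labels[inverse]`.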


Asked: 6 Sep 2019

Answered: 22 Oct 2020
