input: C matrix 2xN (2D points)
output: C matrix 2xM (2D points) with the same number of points or fewer (M <= N).
Let’s say we have a 2xN matrix C that contains several 2D points, and it looks something like this:
What we want is to merge “close” points into a single point, namely the average of the points in the group. For example, in the second image, every group of points inside a blue circle becomes one point whose coordinates are the average of all points in that circle. By “close” I mean that the distance between the points is smaller than DELTA (a known scalar). So the wanted output is:
As for the running time of the algorithm, I don’t have an upper limit, but I call this method several times…
I am using Matlab, and what I have tried is this:
function C = ReduceClosePoints(C ,l_boundry)
    x_size = abs(l_boundry(1,1)-l_boundry(1,2)); %220
    DELTA = x_size/10;
    T = [];
    for i=1:size(C,2)
        sum = C(:,i);
        n=1;
        for j=1:size(C,2)
            if i~=j %not same point
                D = DistancePointToPoint(C(:,i),C(:,j));
                if D < DELTA
                    sum = sum + C(:,j);
                    n=n+1;
                end
            end
        end
        sum = sum./n; %new point -> save in T matrix
        T = [T sum];
    end
    C = T;
end
And it’s not working 🙁
Also I am new to Matlab.
Thank you for your help!!
1
You need to do this in two steps. First, group nearby points together. Then, replace each group by one point.
Let s be a nonempty subset of your points and distance(p,s) the smallest distance between p and some element of s. Now here is some pseudocode for creating groups of “point clusters”:
input: pointlist
groups:={} // will contain groups of points, disjoint subsets of "pointlist"
for each p in pointlist
    let newgroup:={p}
    for each g in groups where distance(p,g)<DELTA
        newgroup := join(newgroup, g) // join the two sets
        remove g from groups
    next g
    add newgroup to groups
next p
output: groups
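Since the question uses Matlab, the pseudocode above could be sketched roughly as follows. This is only one possible translation: the function name GroupClosePoints and the choice to represent groups as a vector of integer labels (instead of a list of sets) are my assumptions, not something from the question.

```matlab
function labels = GroupClosePoints(C, DELTA)
% Hypothetical helper: C is a 2xN matrix of points, DELTA a positive scalar.
% labels is 1xN; points i and j are in the same group iff labels(i) == labels(j).
    N = size(C, 2);
    labels = zeros(1, N);
    nextLabel = 0;
    for i = 1:N
        % Euclidean distance from point i to every point processed so far
        d = hypot(C(1, 1:i-1) - C(1, i), C(2, 1:i-1) - C(2, i));
        % groups that contain at least one point closer than DELTA
        hit = unique(labels(d < DELTA));
        % start a new group for point i and merge all touched groups into it
        nextLabel = nextLabel + 1;
        labels(i) = nextLabel;
        labels(ismember(labels(1:i-1), hit)) = nextLabel;
    end
    % renumber the surviving labels to 1..M
    [~, ~, labels] = unique(labels);
    labels = labels(:).';
end
```

For example, GroupClosePoints([0 1 10 11 2; 0 0 0 0 0], 1.5) puts the points at x = 0, 1, 2 into one group and the points at x = 10, 11 into another, even though the points at x = 0 and x = 2 are farther apart than DELTA.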
Note that this algorithm does not guarantee that each pair of points of a group has a distance < DELTA, only that each found group cannot be divided into two smaller groups with distance >= DELTA. You have to decide for yourself if that is sufficient, but as @DieterLücking has pointed out, don’t expect to find an algorithm to solve the problem exactly the way you have described it originally.
Once you have your groups of points calculated, you can replace each group by the median value of its points.
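A minimal sketch of this second step, assuming the groups are given as a 1xN vector of integer labels in 1..M (my own representation; the name ReduceGroups is hypothetical). It uses the mean because the question asks for the average; swapping mean for median gives the per-coordinate median instead.

```matlab
function T = ReduceGroups(C, labels)
% Hypothetical helper: C is 2xN, labels is 1xN with values in 1..M.
% T is 2xM; column k is the mean of all points with label k.
    M = max([0, labels]);          % 0 guards against an empty label vector
    T = zeros(2, M);
    for k = 1:M
        T(:, k) = mean(C(:, labels == k), 2);   % or median(..., 2)
    end
end
```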
Concerning running time: it depends heavily on how fast you can compute the distance function. So in case the straightforward implementation of subsets by lists or arrays is not fast enough (which you should test first before starting to optimize!), you might consider using an improved data structure here. For example, you could keep track of the “bounding box” of all elements of a subset, since if distance(p, boundingbox(s))>=DELTA, then you can be sure that distance(p, s)>=DELTA as well.
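To illustrate the bounding-box test, here is a small sketch. The function name DistanceToBoundingBox is hypothetical, and for brevity it recomputes the box from the group's points on every call; in a real implementation you would presumably store the box corners per group and update them when groups are joined.

```matlab
function d = DistanceToBoundingBox(p, S)
% Hypothetical helper: p is a 2x1 point, S is a 2xK matrix (K >= 1) of group points.
% d is the distance from p to the axis-aligned bounding box of S (0 if p is inside).
    lo = min(S, [], 2);                    % lower-left corner of the box
    hi = max(S, [], 2);                    % upper-right corner of the box
    gap = max(max(lo - p, p - hi), 0);     % per-axis distance outside the box
    d = norm(gap);
end
```

Because every point of S lies inside the box, this value never exceeds distance(p, s), so a result >= DELTA lets you skip the exact distance computation for that group.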
2
The problem is not trivial!
Assume three points A, B, C where the distance d(A, B) is less than DELTA and d(B, C) is less than DELTA, but d(A, C) is not. Now you could pick the pair with the minimum distance to join (let’s say A and B) and get an intermediate result AB and C. Assume further that the distance d(AB, C) is less than DELTA; you will then get a point ABC which is not the ‘average’ of the three points.
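A small numeric illustration of this, with made-up coordinates and a made-up DELTA chosen only so that the conditions above hold:

```matlab
% Illustrative values only, not taken from the question
A = [0; 0];  B = [4; 0];  C = [8; 0];
DELTA = 7;

dAB = norm(A - B);            % 4, less than DELTA
dBC = norm(B - C);            % 4, less than DELTA
dAC = norm(A - C);            % 8, not less than DELTA

AB  = (A + B) / 2;            % join A and B first -> [2; 0]
dABC = norm(AB - C);          % 6, less than DELTA, so AB and C are joined too
ABC = (AB + C) / 2;           % pairwise average -> [5; 0]

trueMean = (A + B + C) / 3;   % average of all three points -> [4; 0]
```

The pairwise result weights C twice as much as A or B, which is why ABC differs from trueMean.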
I guess putting a grid over all points and clamping point coordinates to that grid might be better.
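One way the grid idea might look in Matlab is sketched below. The name SnapToGrid, the use of DELTA as the cell size, and the choice to average the points that fall into the same cell (rather than returning the cell corner or center) are all my assumptions.

```matlab
function T = SnapToGrid(C, DELTA)
% Hypothetical helper: C is 2xN with N >= 1, DELTA is the grid cell size.
% T is 2xM with one point per occupied grid cell (the mean of that cell's points).
    cells = floor(C.' / DELTA);              % Nx2 integer cell coordinates
    [~, ~, idx] = unique(cells, 'rows');     % idx(i) = index of the cell of point i
    counts = accumarray(idx, 1);             % number of points per cell
    T = [accumarray(idx, C(1, :).') ./ counts, ...
         accumarray(idx, C(2, :).') ./ counts].';
end
```

Note that two points which are very close but lie on opposite sides of a cell border end up in different cells, so this is an approximation of the “closer than DELTA” rule, not an exact implementation of it.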