I am currently working on a program that takes in an array of 4 columns and roughly 200,000 rows. The columns represent x, y, z coordinates and a flag denoting whether or not the point is red (1 for yes, 0 for no). I am searching for a way to go through these points and find the largest bundle of red points within the array.
Currently, my idea is to start with a random integer between 0 and len(x) - 1, which gives me a random index at which to begin checking points. I already have a function that uses a KDTree to find the nearest neighbor of that random point and checks whether or not it is red. From here, I begin to get lost. My gut tells me I should search for the next closest point, check whether it is red, and so on. However, I feel like I am missing a lot, and in particular I'm not sure how to verify that I have actually found the largest bundle.
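For reference, here is roughly what I have so far (a simplified sketch; I'm using scipy's `cKDTree`, and the random array below just stands in for my real data):

```python
import numpy as np
from scipy.spatial import cKDTree

# Stand-in for my real data: an (N, 4) array of [x, y, z, red_flag]
rng = np.random.default_rng(0)
data = rng.uniform(-5, 5, size=(200_000, 4))
data[:, 3] = (rng.random(200_000) < 0.1).astype(float)  # ~10% red

tree = cKDTree(data[:, :3])

def nearest_is_red(idx):
    """Check whether the nearest neighbor of point `idx` is red."""
    # k=2 because the closest hit is the query point itself
    _, nbrs = tree.query(data[idx, :3], k=2)
    return bool(data[nbrs[1], 3])

start = rng.integers(len(data))  # random starting index
print(nearest_is_red(start))
```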
If it helps, it is safe to assume that this bundle of red points is spherically symmetric, and while its center won't be exactly at the origin (0, 0, 0), it can be assumed to lie no more than 1 unit from (0, 0, 0).
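To make that concrete, this is how I picture probing a single candidate center under that assumption (reusing `tree` and `data` from the sketch above; the radius of 0.5 is just a guess on my part, not something I know how to choose):

```python
def count_red_in_ball(center, radius):
    """Count total and red points within `radius` of a candidate center."""
    idxs = tree.query_ball_point(center, r=radius)
    n_red = int(data[idxs, 3].sum())
    return len(idxs), n_red

# e.g. probe a ball of guessed radius 0.5 around the origin
total, red = count_red_in_ball([0.0, 0.0, 0.0], 0.5)
```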
It would be a great help to me if someone could point me in the right direction. Thank you!