I have a cyclic 1-D sequence of tuples of numeric values, and I’m trying to figure out the best way to find the length of the cycle. This is complicated by two factors:
- The values in one cycle are not guaranteed to match the values in another cycle exactly. One reason is that the cycle is not guaranteed to be an integer number of samples long. Indeed, part of the reason for detecting the cycle length is to resample the sequence so that the cycle length becomes an integer, and then to determine the consensus values for the cycle.
- As stated, this is not a sequence of individual numeric values but a sequence of tuples. If the elements were single numbers, autocorrelation would be a trivial solution. However, each tuple can reasonably be viewed as a point in n-dimensional space, and the distance between two points in that space makes for a straightforward difference function between two values.
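
To make the difference function concrete, here is a minimal sketch, assuming the sequence is held as a list of equal-length tuples. The names `mean_lag_distance` and `best_lag` and the candidate lag range are purely illustrative, and the synthetic circle data stands in for real samples. It averages the Euclidean distance between each sample and the sample `lag` steps later, then picks the integer lag with the smallest average.

```python
import math  # math.dist requires Python 3.8+
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def mean_lag_distance(seq: Sequence[Point], lag: int) -> float:
    """Average Euclidean distance between seq[i] and seq[i + lag]."""
    n = len(seq) - lag
    if lag <= 0 or n <= 0:
        raise ValueError("lag must be between 1 and len(seq) - 1")
    return sum(math.dist(seq[i], seq[i + lag]) for i in range(n)) / n


def best_lag(seq: Sequence[Point], min_lag: int, max_lag: int) -> int:
    """Integer lag in [min_lag, max_lag] with the smallest mean distance."""
    return min(range(min_lag, max_lag + 1),
               key=lambda k: mean_lag_distance(seq, k))


# Synthetic stand-in data: 2-D points on a circle, repeating every 25 samples.
samples = [(math.cos(2 * math.pi * i / 25), math.sin(2 * math.pi * i / 25))
           for i in range(500)]
estimate = best_lag(samples, 5, 40)
```

Each lag costs O(n) time and no extra storage beyond the sequence itself, so scanning L candidate lags is O(n·L) time. Two caveats: multiples of the true cycle length also give minima, so the candidate range matters, and with a non-integer cycle length the integer lag found this way is only a first approximation.
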
The total number of samples is in the hundreds of thousands or millions, so as to capture potentially 100+ cycles (to get the best resolution for the consensus). Anything with an O(n^2) space requirement over the full sequence is therefore not really viable.
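
For reference, the resampling and consensus step could be sketched as below once a (possibly fractional) cycle length is known. This is only an illustration under stated assumptions: `fold_consensus`, `period`, and `bins` are hypothetical names, the first sample is taken to be at phase 0, and samples are simply averaged per phase bin rather than interpolated. It makes a single pass and stores only `bins` running sums, so memory does not grow with the number of samples.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def fold_consensus(seq: Sequence[Point], period: float, bins: int) -> List[Point]:
    """Fold the sequence onto one cycle and average the samples in each phase bin."""
    dims = len(seq[0])
    sums = [[0.0] * dims for _ in range(bins)]
    counts = [0] * bins
    for i, point in enumerate(seq):
        phase = (i / period) % 1.0             # position within the cycle, in [0, 1)
        b = min(int(phase * bins), bins - 1)   # guard against rounding up to `bins`
        counts[b] += 1
        for d, value in enumerate(point):
            sums[b][d] += value
    # Empty bins are reported as NaN rather than guessed.
    return [tuple(s / c for s in row) if c else tuple(math.nan for _ in row)
            for row, c in zip(sums, counts)]


# Synthetic stand-in data: 100 cycles of a circle with a cycle length of 10.5 samples.
period = 10.5
samples = [(math.cos(2 * math.pi * i / period), math.sin(2 * math.pi * i / period))
           for i in range(1050)]
consensus = fold_consensus(samples, period, bins=10)
```

The result is a sequence with an integer cycle length (`bins`), where each entry is the mean of every sample that fell into that part of the cycle. Any error in `period` smears the phases over many cycles, which is why the cycle length needs to be known accurately first.
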