In my search for a suitable solution, I came across approaches such as numpy.loadtxt and numpy.genfromtxt that seemed promising at first glance but did not work out of the box.
The Challenges
Each record (data row) in my dataset spans several lines of the file. Here are the first two records:
[array([[ 1, 0, 0, 0, 0],
[ 0, 1, 0, 0, 0],
[ 0, 0, 1, 0, 0],
[ 1, 0, -1, 0, 0],
[ 0, 0, 0, 1, 0],
[ 0, 0, 0, 0, 1],
[-2, -1, 0, -1, -1],
[-2, -2, 0, -1, 0]]), 24, 4, 0, 232, 988, 1464, 10, 8, 246, 12]
[array([[ 1, 0, 0, 0, 0],
[-1, 0, 0, 0, 0],
[ 0, 1, 0, 0, 0],
[ 0, 0, 1, 0, 0],
[ 0, 0, 0, 1, 0],
[-1, -1, 0, 1, 0],
[ 0, 0, 0, 0, 1],
[ 0, -2, -2, -2, -1]]), 28, 4, 0, 244, 1036, 1536, 10, 8, 260, 13]
...
The file is not small: as more data is collected, it can grow to as much as 3 GB.
What I tried so far
Reading the data with np.genfromtxt('polytopes_5d_reflexive.txt') does not work out of the box, because the file is not a plain table with one record per line.
Plain line-by-line reading does not help either, since the number of lines that make up one data row varies.
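To make the problem concrete, here is a minimal sketch of one direction I could imagine: collect lines until the square brackets balance, then parse the collected text as a single record. It assumes that every record has exactly the shape shown above (one array(...) followed by integers) and that the file contains nothing else. The name iter_records is made up for illustration. Because it is a generator, only one record is held in memory at a time, which should matter for a 3 GB file.

```python
import ast
import io


def iter_records(lines):
    """Yield (matrix, scalars) for each record in an iterable of text lines.

    Assumption: a record is complete once its '[' and ']' counts balance.
    """
    buffer = []
    depth = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        buffer.append(line)
        depth += line.count("[") - line.count("]")
        if depth == 0:
            text = " ".join(buffer)
            buffer = []
            # Drop the "array(" ... ")" wrapper so that what remains
            # is a plain nested list literal.
            text = text.replace("array(", "").replace(")", "")
            matrix, *scalars = ast.literal_eval(text)
            yield matrix, scalars


# Small in-memory stand-in for the real file (made-up 2-column data).
sample = io.StringIO(
    "[array([[ 1, 0],\n"
    " [-2, -1]]), 24, 4]\n"
    "[array([[ 0, 1],\n"
    " [ 1, 0],\n"
    " [ 3, 4]]), 28, 5]\n"
)

for matrix, scalars in iter_records(sample):
    print(len(matrix), "rows:", matrix, "scalars:", scalars)
```

For the real data, an open file object would take the place of the StringIO, and each matrix could be passed to numpy.asarray. I have not checked how fast this is on a file of the full size.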
The Background
The saying goes “mathematical data is inexpensive”. In our case, we generate such data ourselves, namely five-dimensional polytopes, in order to examine properties such as their number of vertices, their volume, and others.
I am not asking for a ready-made solution, but I would be very grateful for a pointer in the right direction.