I have some very large files (more than 100 million lines), and I need to read their last line.
As a Linux user, I use `tail` for that in a shell or script.
Is there a way to quickly read the last line of a file in Python?
Perhaps using `seek`, but I'm not familiar with it.
The best I have come up with is this:
from subprocess import run as srun
file = "/my_file"
proc = srun(['/usr/bin/tail', '-1', file], capture_output=True)
last_line = proc.stdout
All the other pure-Python code I tried was slower than calling the external /usr/bin/tail.
I also read these threads, which do not meet my needs:
How to implement a pythonic equivalent of tail -F?
Head and tail in one line
I want fast execution without loading the whole file into memory.
Edit:
I tried what I understood from the comments, and I get some strange behavior:
>>> with open("./Python/nombres_premiers", "r") as f:
...     a = f.seek(0,2)
...     l = ""
...     for i in range(a-2,0,-1):
...         f.seek(i)
...         l = f.readline() + l
...         if l[0]=="\n":
...             break
...
1023648626
1023648625
1023648624
1023648623
1023648622
1023648621
1023648620
1023648619
1023648618
1023648617
1023648616
>>> l
'\n2001098251\n001098251\n01098251\n1098251\n098251\n98251\n8251\n251\n51\n1\n'
>>> with open("./Python/nombres_premiers", "r") as f:
...     a = f.seek(0,2)
...     l = ""
...     for i in range(a-2,0,-1):
...         f.seek(i)
...         l = f.readline()
...         if l[0]=="\n":
...             break
...
1023648626
1023648625
1023648624
1023648623
1023648622
1023648621
1023648620
1023648619
1023648618
1023648617
1023648616
>>> l
'\n'
How can I get l = 2001098251?
tail doesn't read the whole file: it seeks to a chunk near the end and scans from there. Doing the same thing yourself, assuming the last line fits within `bufsize` bytes, could look like:
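def last_line(f, bufsize=4096):
    end_off = f.seek(0, 2)                # jump to the end; returns the file size
    f.seek(max(end_off - bufsize, 0), 0)  # back up by at most bufsize
    lastline = None
    # The first line read here may be a partial one; that is harmless
    # as long as the real last line is shorter than bufsize.
    while (line := f.readline()):
        if line[-1] == '\n':
            lastline = line
        else:
            break  # last line is not yet completely written; ignore it
    return lastline[:-1] if lastline is not None else None

import sys
# Text mode works here for single-byte content such as digits.
with open(sys.argv[1], 'r') as f:
    print(last_line(f))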
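The function above only inspects the final `bufsize` of the file, so a last line longer than that comes back truncated. As a rough sketch of one way around this (not what tail itself does internally), the variant below opens the file in binary mode and keeps stepping backwards one chunk at a time until it finds the newline that precedes the last line. The names `last_line_backward` and `chunk_size` are made up for this example, and unlike the function above it also returns a final line that has no terminating newline yet.

```python
import os

def last_line_backward(path, chunk_size=4096):
    """Return the last line of `path` as bytes, without its line terminator."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()          # current read position, starts at end of file
        data = b""
        while pos > 0:
            step = min(chunk_size, pos)
            pos -= step
            f.seek(pos)
            data = f.read(step) + data
            # Search for a newline before the final byte, so that the newline
            # terminating the last line is not mistaken for its beginning.
            cut = data.rfind(b"\n", 0, len(data) - 1)
            if cut != -1:
                data = data[cut + 1:]
                break
    # Drop a single trailing "\n" or "\r\n", if present.
    if data.endswith(b"\n"):
        data = data[:-1]
    if data.endswith(b"\r"):
        data = data[:-1]
    return data
```

Usage would be something like `last_line_backward("/my_file").decode()`. Only the tail end of the file is ever read, so memory use depends on the length of the last line rather than on the size of the file. For extremely long lines the repeated concatenation and rescanning get slow, which is acceptable for a sketch but worth optimizing in real code.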
Note that if you want to continue to read new content as the file is edited over time, you should use inotify to watch for changes. /a/78969468/14122 demonstrates this.
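inotify is Linux-specific and Python's standard library has no binding for it, so a third-party package is needed to use it. Where that is not an option, a plain polling fallback can be sketched with the standard library alone: remember the byte offset reached so far and, on each poll, read only what was appended after it. This is a different technique from inotify, and `read_appended` is a hypothetical helper name chosen for this example. It assumes the file only grows; truncation or log rotation is not handled.

```python
def read_appended(path, offset=0):
    """Return (complete_lines, new_offset) for data appended after `offset`."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    # Keep only complete lines; a partial final line is left for the next call.
    end = data.rfind(b"\n") + 1
    lines = data[:end].splitlines()
    return lines, offset + end
```

A caller would loop with something like `lines, offset = read_appended(path, offset)` followed by a short `time.sleep(...)`, starting from the file size if only new content is of interest.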