I am using the following code to parse multiline JSON objects, separated by commas, from a stream stored in a .json file:
def stream_read_json(fn):
    import json
    start_pos = 0
    with open(fn, 'r', encoding='utf-8') as f:
        while True:
            try:
                obj = json.load(f)
                yield obj
                return
            except json.JSONDecodeError as e:
                f.seek(start_pos)
                json_str = f.read(e.pos)
                obj = json.loads(json_str, encoding = 'utf-8')
                start_pos += e.pos
                yield obj
The first object is parsed correctly, but the following ones are not.
While testing different values for f.seek(start_pos), I see that they are inconsistent with the index (e.pos) reported by except json.JSONDecodeError as e.
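A minimal sketch of one possible source of this mismatch, assuming the file contains non-ASCII characters (the sample data below is made up for illustration). json.load(f) decodes the string returned by f.read(), so e.pos is a character index into that string, whereas f.seek() on a text-mode file only accepts 0 or a value previously returned by f.tell():

```python
import json

# Hypothetical sample: two multiline objects separated by a comma,
# with one non-ASCII character so that characters and bytes differ.
text = '{\n  "name": "Zoë"\n},\n{\n  "name": "Bob"\n}'

try:
    json.loads(text)
except json.JSONDecodeError as e:
    # e.pos is an index into the decoded str, not a file offset.
    char_pos = e.pos

# Length in bytes of the same prefix once encoded as UTF-8.
byte_pos = len(text[:char_pos].encode('utf-8'))

# The prefix up to e.pos is exactly the first object.
first = json.loads(text[:char_pos])

print(char_pos, byte_pos, repr(text[char_pos]))
```

Here the character index and the byte length of the same prefix differ by one, and the character at e.pos is the separating comma, which a following json.load call would hit first.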
How can I ensure the objects will be parsed correctly?
I tried to find the correct f.seek(start_pos) value for the second JSON object at the debug prompt.