I am attempting to parse binary data. The data is marked with sentinels ‘0x232349’ and ‘0x232353’. However, the following code:
from bitstring import BitArray

data = read_file()
data = BitArray(data)
sweep_sent = BitArray('0x232349')
imu_sent = BitArray('0x232353')
list(data.findall(sweep_sent))
list(data.findall(imu_sent))
returns nothing: both calls produce an empty list. I know that the sentinels are there, both because I can see them in the binary file in a hex editor and because the following function:
def find_sentinel(bitarr, bytestr):
    # Convert the 3-byte string to an integer
    target = int.from_bytes(bytestr.encode(), 'big')
    n = len(bitarr)
    indices = []
    # We need to make sure we have at least 24 bits (3 bytes) to compare
    if n < 24:
        return indices
    # Slide a 24-bit (3-byte) window over the bitarray, one byte at a time
    for i in range(0, n - 23, 8):  # 8-bit steps, 24 bits is 3 bytes
        # Extract the 24-bit segment
        segment = bitarr[i:i+24]
        # Convert the segment to an integer
        segment_int = segment.uint
        # Compare to the target
        if segment_int == target:
            indices.append(i // 8)  # Record the starting byte index of the match
    return indices  # Empty list if no match is found
works correctly. However, I would like to use bitstring’s findall() for a variety of reasons, especially since performance is important here.
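For comparison, the expected byte offsets can be cross-checked without bitstring. This is only a minimal sketch, assuming the file contents are available as a plain bytes object. The helper name find_all_bytes and the sample data are made up for illustration and are not part of the code above.

```python
def find_all_bytes(data: bytes, sentinel: bytes) -> list[int]:
    """Return every byte offset at which sentinel occurs in data."""
    offsets = []
    pos = data.find(sentinel)
    while pos != -1:
        offsets.append(pos)
        # Resume one byte past the last hit so overlapping matches are kept.
        pos = data.find(sentinel, pos + 1)
    return offsets


# Synthetic sample standing in for the real file contents (an assumption).
sample = b"\x00\x01##I\xff\xfe##S\x10##I"

sweep_offsets = find_all_bytes(sample, bytes.fromhex("232349"))  # b"##I"
imu_offsets = find_all_bytes(sample, bytes.fromhex("232353"))    # b"##S"

print(sweep_offsets)  # [2, 11]
print(imu_offsets)    # [7]
```

These offsets are in bytes, like the indices returned by find_sentinel(), so the two can be compared directly on the same input.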