I have already written code that validates and reads the WAV header, based on research from this website. What I want to know is how the data in the data section is stored. The samples are 16-bit values placed right next to each other. I thought that generating a 440 Hz sine wave in Audacity and exporting it would make the pattern obvious, and the bytes do look neater, but they still seem like nonsense. I've looked everywhere I could think of on the internet, so if you know the answer, or think you could figure it out, please do. Here is my code if you want to play around with it.
## CHECK FILE VALIDITY ##
# Import required libraries
import sys
from numpy import take
# Make a function to read the byte data from the file
def postobytes(data, pos):
    return bytes(take(data, pos).tolist())
# Open the file
fileName = sys.argv[1]
data = list(open(fileName, 'rb').read())
# Define the checking variables
headerCorrect1 = b'RIFF'
headerCorrect2 = b'WAVEfmt '
range1 = [0,1,2,3]
range2 = [8,9,10,11,12,13,14,15] # 'WAVEfmt ' is 8 bytes long
# Check the header data
if postobytes(data, range1) == headerCorrect1 and postobytes(data, range2) == headerCorrect2:
    print('Valid RIFF header')
else:
    raise ValueError('Invalid RIFF header')
## GET AND PRINT FILE LENGTH ##
# WAV stores integers little-endian, so reverse the bytes into big-endian order
def bigtolittle(data):
    return data[::-1]
# Read an unsigned integer from the given byte positions
def intfrompos(data, pos):
    return int.from_bytes(bigtolittle(postobytes(data, pos)), 'big')
# Get the RIFF chunk size (the file size minus the first 8 bytes)
lengthRange = [4,5,6,7]
length = intfrompos(data, lengthRange)
# Make human readable length
print('\nFile Size:')
if length/1000 < 1000:
    print(str(round(length/100)/10)+'K')
elif length/1000000 < 1000:
    print(str(round(length/10000)/100)+'M')
else:
    raise NotImplementedError('File too big, gigabytes and up not implemented')
## GET DATA AND FIND DATA START ##
# Get sample rate
srRange = [24,25,26,27]
sampleRate = intfrompos(data, srRange)
print('\nSample Rate:\n'+str(sampleRate)+'Hz')
# Get bits per sample
bpsRange = [34,35]
bitsPerSample = intfrompos(data, bpsRange)
print('\nBits per Sample:\n'+str(bitsPerSample)+' Bits')
if bitsPerSample != 16:
    raise ValueError('Unsupported or invalid Bits per Sample')
# Find data starter
for pos, byte in enumerate(data):
    if byte == 100:
        if data[pos+1] == 97:
            if data[pos+2] == 116:
                if data[pos+3] == 97:
                    print('\nData Header at '+str(pos))
                    dhsp = pos # Data header start position
                    break
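print('\nPlaying WAV file')
audio = []
dlRange = [dhsp+4,dhsp+5,dhsp+6,dhsp+7]
dl = intfrompos(data, dlRange) # Length of the data chunk in bytes
for pt, unused in enumerate(data):
    pos = dhsp+pt+8
    # Stop at the end of the data chunk or the end of the file
    if pt >= dl or pos >= len(data):
        break
    byte = data[pos]
    audio.append(byte)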
# FOR STACK OVERFLOW HELPERS: The audio list is all you need to mess with. Below is where you can do that.
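For anyone experimenting with the audio list, here is a minimal sketch of one way to read it. It assumes the file is uncompressed PCM, in which case each 16-bit sample is a signed little-endian integer and the channels are interleaved (left, right, left, right for stereo). The names decode_pcm16 and num_channels are made up for this illustration and are not part of the script above.

```python
import struct

def decode_pcm16(raw, num_channels=1):
    """Turn raw data-chunk bytes into frames of signed 16-bit samples.

    Assumes uncompressed PCM: little-endian, signed, interleaved by channel.
    """
    raw = bytes(raw)  # accepts a list of ints (0-255) or a bytes object
    frame_size = 2 * num_channels  # 2 bytes per sample, one sample per channel
    usable = len(raw) - (len(raw) % frame_size)  # drop any trailing partial frame
    count = usable // 2
    # '<' means little-endian, 'h' means signed 16-bit integer
    samples = struct.unpack('<%dh' % count, raw[:usable])
    # Group the interleaved samples into one tuple per frame
    return [samples[i:i + num_channels] for i in range(0, count, num_channels)]

# The byte pairs 00 00, FF 7F, 00 80, FF FF decode to 0, 32767, -32768, -1
example = decode_pcm16([0x00, 0x00, 0xFF, 0x7F, 0x00, 0x80, 0xFF, 0xFF])
```

Calling decode_pcm16(audio) on a mono 440 Hz sine exported at 44100 Hz should give values that rise and fall smoothly and repeat roughly every 100 samples, since 44100 / 440 is about 100.2.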
I had expected to find an answer, but couldn’t.
Donovan Black is a new contributor to this site. Take care in asking for clarification, commenting, and answering.