I am doing some digital archaeology and have recovered data from tapes. The files are very large (gigabytes). Since the archive software is very old and possibly proprietary, I want to write some custom tools.
I need to find a byte sequence in a file and remove it (that is, copy the file to another file without it). There are several types of byte sequences to remove. They are backup software markers: a header followed by some varying content such as a checksum, with a known total length.
Then I need to split the file into many files using other byte markers. I can detect a header but cannot decode the data in it, so I split between headers. The resulting files range from a few kilobytes to hundreds of megabytes.
I have done this in MS Visual Basic, but using "read all bytes", so I hit a 2-gigabyte limit.
Is there an easy (and fast) way to do this by reading the file in portions, using something like the "FileSystem.Seek Method", as on old computers?
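A minimal sketch of what reading in portions could look like, assuming .NET's `System.IO.FileStream`, whose `Seek` takes a `Long` offset and so is not bound by the 2-gigabyte limit. The helper name `CopyRange`, the 64 KiB buffer size, and the temporary demo files are assumptions made for illustration only.

```vbnet
Imports System
Imports System.IO

Module RangeCopy
    ' Copies the bytes [startPos, endPos) from input to output through a
    ' small fixed-size buffer, so the whole file is never held in memory.
    Sub CopyRange(input As Stream, output As Stream, startPos As Long, endPos As Long)
        Dim buffer(65536 - 1) As Byte               ' 64 KiB, an arbitrary choice
        input.Seek(startPos, SeekOrigin.Begin)
        Dim remaining As Long = endPos - startPos
        While remaining > 0
            Dim want As Integer = CInt(Math.Min(CLng(buffer.Length), remaining))
            Dim got As Integer = input.Read(buffer, 0, want)
            If got = 0 Then Exit While              ' end of file reached early
            output.Write(buffer, 0, got)
            remaining -= got
        End While
    End Sub

    Sub Main()
        ' Demo on a small temporary file; a large file is handled the same way.
        Dim srcPath As String = Path.GetTempFileName()
        Dim dstPath As String = Path.GetTempFileName()
        File.WriteAllBytes(srcPath, New Byte() {10, 11, 12, 13, 14, 15})
        Using src As New FileStream(srcPath, FileMode.Open, FileAccess.Read),
              dst As New FileStream(dstPath, FileMode.Create, FileAccess.Write)
            CopyRange(src, dst, 2, 5)               ' copies bytes 12, 13, 14
        End Using
        Console.WriteLine(File.ReadAllBytes(dstPath).Length)   ' prints 3
    End Sub
End Module
```

With a helper like this, removing a marker would amount to copying the range before it and the range after it into the same output file, and splitting would amount to copying each range between two headers into its own file.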
Some of my dirty tricks:
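' Start and stop offsets of each piece found between markers.
Private Structure Radom
    Dim startas As List(Of Integer)   ' offset where a piece begins (at its marker)
    Dim stopas As List(Of Integer)    ' offset where the piece ends
End Structure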
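' Finds every occurrence of "search" in the first "limitas" bytes of "bytes"
' and returns the pieces lying between consecutive markers.
Private Function FindIt(bytes As Byte(), search As Byte(), limitas As UInt32) As Radom
    Dim index As Integer = 0
    Dim i As Integer = 0          ' number of markers found so far
    Dim l As Integer = 0          ' offset of the most recent marker
    Dim lastStart As Long = CLng(limitas) - search.Length   ' last offset where a marker can still fit
    Dim z As Radom
    z.startas = New List(Of Integer)
    z.stopas = New List(Of Integer)
    While index <= lastStart
        index = Array.IndexOf(bytes, search(0), index)
        If index < 0 OrElse index > lastStart Then
            Exit While
        End If
        If Confirmdit(bytes, search, index) Then
            If i > 0 Then
                ' Close the previous piece at the start of this marker.
                z.startas.Add(l)
                z.stopas.Add(index)
            End If
            l = index
            i += 1
            index += search.Length   ' continue right after the marker, without skipping a byte
        Else
            index += 1
        End If
    End While
    If i > 0 Then
        ' The last piece runs to the end of the data. Testing i rather than l
        ' keeps the piece when the only marker sits at offset 0.
        z.startas.Add(l)
        z.stopas.Add(CInt(limitas) - 1)
    End If
    Return z
End Function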
' Returns True when "ieskom" occurs in "buferis" starting at offset "start".
Private Function Confirmdit(buferis As Byte(), ieskom As Byte(), start As Integer) As Boolean
    ' The pattern cannot match if it would run past the end of the buffer.
    If start + ieskom.Length > buferis.Length Then
        Return False
    End If
    For i As Integer = 0 To ieskom.Length - 1
        If buferis(start + i) <> ieskom(i) Then
            Return False
        End If
    Next
    Return True
End Function
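The same search could probably be done without loading the whole file. This is a minimal sketch, assuming the marker is much shorter than the chunk. It reads the stream chunk by chunk, carries the last `pattern.Length - 1` bytes over to the next chunk so that a marker straddling a chunk boundary is still found, and reports offsets as `Long`. The names `FindAll` and `Matches` and the default chunk size are hypothetical.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO

Module StreamSearch
    ' Returns the absolute offset of every occurrence of pattern in the stream.
    Function FindAll(input As Stream, pattern As Byte(),
                     Optional chunkSize As Integer = 1 << 20) As List(Of Long)
        If pattern.Length = 0 Then Throw New ArgumentException("empty pattern")
        Dim hits As New List(Of Long)
        Dim keep As Integer = pattern.Length - 1     ' bytes carried between chunks
        Dim buffer(keep + chunkSize - 1) As Byte
        Dim filled As Integer = 0                    ' valid bytes in buffer
        Dim basePos As Long = 0                      ' file offset of buffer(0)
        Do
            Dim got As Integer = input.Read(buffer, filled, buffer.Length - filled)
            If got = 0 Then Exit Do                  ' end of stream
            filled += got
            Dim i As Integer = 0
            While i <= filled - pattern.Length
                ' Jump to the next candidate: a byte equal to the first pattern byte.
                i = Array.IndexOf(buffer, pattern(0), i, filled - pattern.Length + 1 - i)
                If i < 0 Then Exit While
                If Matches(buffer, pattern, i) Then hits.Add(basePos + i)
                i += 1
            End While
            ' Move the unsearched tail to the front for the next round.
            Dim tail As Integer = Math.Min(keep, filled)
            If tail > 0 Then Array.Copy(buffer, filled - tail, buffer, 0, tail)
            basePos += filled - tail
            filled = tail
        Loop
        Return hits
    End Function

    ' The caller guarantees that start + pattern.Length does not exceed the valid data.
    Private Function Matches(data As Byte(), pattern As Byte(), start As Integer) As Boolean
        For k As Integer = 0 To pattern.Length - 1
            If data(start + k) <> pattern(k) Then Return False
        Next
        Return True
    End Function

    Sub Main()
        ' Tiny in-memory demo with a 3-byte chunk so a marker crosses a boundary.
        Dim data As Byte() = {1, 2, &HAA, &HBB, 3, &HAA, &HBB}
        Dim marker As Byte() = {&HAA, &HBB}
        Using ms As New MemoryStream(data)
            Console.WriteLine(String.Join(", ", FindAll(ms, marker, 3)))   ' prints 2, 5
        End Using
    End Sub
End Module
```

For a real tape image, a `FileStream` would be passed in place of the `MemoryStream`. The returned offsets could then serve as the range boundaries for removing markers or splitting between headers.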