Drawing on various implementations found online, I have written a Bencode parser and am trying to compute the info hash of any input .torrent file. However, my parser produces correct output only up to the info dictionary (up to the number of pieces entry); after that I get undefined characters in the output. Please help me correct this error.
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.TreeMap;

public class BencodeMine {
    String stringinput;
    int head = 0;
    int tail;
    //char[] input;
    byte[] input;

    public BencodeMine(String stringinput) throws IOException {
        this.stringinput = stringinput;
        tail = stringinput.length() - 1;
        //input = stringinput.toCharArray();
        // input = stringinput.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream bOutput = new ByteArrayOutputStream(12);
        bOutput.write(stringinput.getBytes());
        input = bOutput.toByteArray();
    }

    public Object parse() {
        byte head = read(input);
        if (head == 'i')
            return parseInt();
        else if (Character.isDigit(head))
            return parseString();
        else if (head == 'l')
            return parseList();
        else if (head == 'd')
            return parseDictionary();
        return "-2";
    }

    public byte read(byte[] input) {
        if (input[head] == 'i') {
            head++;
            //tail--;
            return 'i';
        } else if (Character.isDigit(input[head])) {
            return input[head];
        } else if (input[head] == 'l') {
            head++;
            //tail--;
            return 'l';
        } else if (input[head] == 'd') {
            head++;
            tail--;
            return 'd';
        } else return 'e';
    }

    public long parseInt() {
        long ans = 0;
        while (input[head] != 'e' && input[head] != ':') {
            ans *= 10;
            ans += (char) input[head] - '0';
            head++;
        }
        head++;
        return ans;
    }

    public String parseString() {
        long count = parseInt();
        StringBuilder sb = new StringBuilder();
        ArrayList<Byte> trial = new ArrayList<>();
        while (count != 0 && head <= tail) {
            sb.append((char) input[head]);
            trial.add(input[head]);
            head++;
            count--;
        }
        return sb.toString();
    }

    public ArrayList<Object> parseList() {
        ArrayList<Object> ans = new ArrayList<>();
        while (true) {
            if (head > tail) break;
            ans.add(parse());
            if (input[head] == 'e') {
                head++;
                break;
            }
        }
        return ans;
    }

    public TreeMap<String, Object> parseDictionary() {
        TreeMap<String, Object> ans = new TreeMap<>();
        while (true) {
            if (head > tail || input[head] == 'e')
                break;
            String key = parseString();
            ans.put(key, parse());
            //if(input[head]=='e'){
            //head++;
            // break;
            // }
        }
        return ans;
    }
}
The output I am getting:
{announce=http://nyaa.tracker.wf:7777/announce, announce-list=[[http://nyaa.tracker.wf:7777/announce], [http://anidex.moe:6969/announce], [udp://tracker.coppersurfer.tk:6969/announce], [udp://tracker.opentrackr.org:1337/announce], [*udp://9.rarbg.to:2710/announce], [udp://9.rarbg.me:2710/announce], [udp://tracker.leechers-paradise.org:6969/announce], [udp://tracker.internetwarriors.net:1337/announce], [udp://tracker.cyberia.is:6969/announce], [udp://exodus.desync.com:6969/announce], [udp://tracker3.itzmx.com:6961/announce], [udp://tracker.torrent.eu.org:451/announce], [udp://tracker.tiny-vps.com:6969/announce], [udp://retracker.lanta-net.ru:2710/announce], [http://open.acgnxtracker.com:80/announce], [wss://tracker.openwebtorrent.com], [udp://open.stealth.si:80/announce]], comment=https://nyaa.si/view/1823558, created by=NyaaV2, creation date=1716553995, encoding=utf-8, info={length=571089672, name=[New-raws] Pokᅢᄅmon (2023) – 51 [1080p] [AMZN].mkv, piece length=262144, pieces=@ メᄑᄑ9ᄑQwᄑZSᄑᄑᄑ2Zᄑᄑ5ᄑᄑᄑᄑᄑ퐈ノ0ᄑ~6ᄑᄑᄑ^ᄑᄑᄑpᄑᄑᄑᄑ:<^,ᄑᄑvᄑQE6ᄑ@Kᄑᄑᄑᄑᄑᄑ<ᄑᄑxᄑᄑᄑᄑᄑᄑqᄑ>ᄑᄑᄑᄑ?ᄑᄑtXᄑb9qB)ᄑᄑᄑ ᄑsᄑ₩ヌンᄑᄑfᄑrzᄑᄑᄑrbIᄑ’ ᄑᄑlᄑ파リMᄑkᄑDᄑwᄑ4 ᄑ’ᄑCq3ᄑᄑ%3ᄑ#ᄑpᅴニD(표ᄃsᄑᄑkᄑᄑ%pᄑWᄑᄑIᄑfᄑeᄑz_=ᄑ/ᄑ ᄑBᄑᄑ+M[Vᄑ!.푀ᄉᄑ<}}
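The garbling starts at `pieces`, which in a .torrent file is a concatenation of 20-byte SHA-1 digests rather than text. Since the constructor takes a `String` and calls `getBytes()` (which uses the platform default charset), the file's bytes have presumably already been decoded into characters before the parser sees them; the calling code is not shown, so this is an assumption. A minimal sketch of why that step can be lossy is below. The class name `RoundTripDemo` and the sample bytes are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class RoundTripDemo {
    // True if decoding raw with cs and re-encoding with cs yields the same bytes.
    static boolean survivesRoundTrip(byte[] raw, Charset cs) {
        byte[] back = new String(raw, cs).getBytes(cs);
        return Arrays.equals(raw, back);
    }

    public static void main(String[] args) {
        // Made-up bytes standing in for part of a binary `pieces` value.
        byte[] raw = {(byte) 0x40, (byte) 0xE3, (byte) 0xFF, (byte) 0x39};

        // 0xE3 0xFF is not valid UTF-8, so the decoder substitutes U+FFFD
        // and the original bytes cannot be recovered.
        System.out.println(survivesRoundTrip(raw, StandardCharsets.UTF_8));      // false

        // ISO-8859-1 maps every byte value to exactly one char and back.
        System.out.println(survivesRoundTrip(raw, StandardCharsets.ISO_8859_1)); // true
    }
}
```

If the same effect applies here, the bytes of `pieces` are already damaged before `parseString()` runs, and the byte count in the length prefix no longer matches the re-encoded data.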
I tried taking the input as a byte[]. I know torrent files may contain bytes that are not valid UTF-8, so perhaps that is the trouble, but if that is the problem, how do I convert the byte[] elements back to UTF-8 chars?
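For the info hash itself, a conversion back to characters may not be needed at all: the hash is the SHA-1 of the exact bencoded bytes of the `info` value, so one possible approach is to locate that byte range and hash it directly. The following is a rough sketch under that assumption, not a full parser. The names `InfoHashSketch`, `skip`, `infoSlice`, `sha1Hex`, and `sample` are hypothetical, the sample data is a made-up miniature torrent, and there is no validation of malformed input.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

public class InfoHashSketch {
    // Returns the index just past the bencoded value that starts at pos.
    static int skip(byte[] in, int pos) {
        byte b = in[pos];
        if (b == 'i') {                        // integer: i<digits>e
            while (in[pos] != 'e') pos++;
            return pos + 1;
        }
        if (b == 'l' || b == 'd') {            // list or dictionary: l...e / d...e
            pos++;
            while (in[pos] != 'e') pos = skip(in, pos);
            return pos + 1;
        }
        int len = 0;                           // byte string: <length>:<raw bytes>
        while (in[pos] != ':') len = len * 10 + (in[pos++] - '0');
        return pos + 1 + len;
    }

    // Returns the raw bencoded bytes of the top-level "info" value.
    static byte[] infoSlice(byte[] in) {
        if (in[0] != 'd') throw new IllegalArgumentException("top level is not a dictionary");
        int pos = 1;
        while (in[pos] != 'e') {
            int keyEnd = skip(in, pos);
            int colon = pos;
            while (in[colon] != ':') colon++;
            String key = new String(in, colon + 1, keyEnd - colon - 1, StandardCharsets.ISO_8859_1);
            int valueEnd = skip(in, keyEnd);
            if (key.equals("info")) return Arrays.copyOfRange(in, keyEnd, valueEnd);
            pos = valueEnd;
        }
        throw new IllegalArgumentException("no info key found");
    }

    // SHA-1 of data as 40 lowercase hex characters.
    static String sha1Hex(byte[] data) {
        try {
            StringBuilder sb = new StringBuilder();
            for (byte b : MessageDigest.getInstance("SHA-1").digest(data))
                sb.append(String.format("%02x", b & 0xff));
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Made-up miniature torrent whose `pieces` value holds two non-text bytes.
    static byte[] sample() {
        byte[] text = "d8:announce3:abc4:infod6:lengthi3e4:name1:x6:pieces2:"
                .getBytes(StandardCharsets.US_ASCII);
        byte[] out = Arrays.copyOf(text, text.length + 4);
        out[text.length] = (byte) 0xFF;
        out[text.length + 1] = (byte) 0x80;
        out[text.length + 2] = 'e';            // closes the info dictionary
        out[text.length + 3] = 'e';            // closes the top-level dictionary
        return out;
    }

    public static void main(String[] args) {
        System.out.println(sha1Hex(infoSlice(sample())));
    }
}
```

For a real file, the input would come from something like `Files.readAllBytes(path)` rather than from a `String`, so the bytes are never decoded before hashing.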