I am trying to run an Android APK through my analysis tool, specifically Microsoft “Company Portal” (version 5.0.6295.0 as of this writing; you can download it from the Play Store and pull it from a device via adb).
This tool is supposed to extract the string resources from the resources.arsc file, but it does not seem to parse the ResStringPool header correctly: it gets StringCount and StyleCount values that are too large, causing the read to go beyond the boundary. Every other app I’ve tested has been fine, and I’ve confirmed that this app installs fine on several devices.
I’ve also noticed that there are different results using different versions of Google’s AAPT tool.
For reference, here’s a comparison of the output for 2 apps:
$ aapt dump strings CompanyPortal.apk | head
# Outputs: String pool of 41756 unique UTF-8 non-sorted strings, 41756 entries and 41743 styles using 3531384 bytes: (incorrect)
$ aapt2 dump strings CompanyPortal.apk | head
# Outputs: String pool of 41801 unique UTF-8 non-sorted strings, 41801 entries and 529 styles using 3203980 bytes: (correct)
$ aapt dump strings NormalApp.apk | head
# Outputs: String pool of 12217 unique UTF-8 non-sorted strings, 12217 entries and 13 styles using 589388 bytes:
$ aapt2 dump strings NormalApp.apk | head
# Outputs: String pool of 12217 unique UTF-8 non-sorted strings, 12217 entries and 13 styles using 589388 bytes:
So it seems clear that my tool behaves the same as AAPT v1, and I need to somehow incorporate whatever change AAPT2 made to get an accurate count. When I extract the resources.arsc from the 2 apps I see:
$ od -x CompanyPortalresources.arsc | head
0000000 0002 000c a0b8 0054 0001 0000 0001 001c
0000020 e278 0035 a31c 0000 a30f 0000 0100 0000 # (3531384, 41756, 41743)
0000040 18c8 0005 3f78 0033 0000 0000 0003 0000
0000060 009a 0000 0133 0000 01cc 0000 025b 0000
$ od -x NormalAppresources.arsc | head
0000000 0002 000c e5a8 000e 0001 0000 0001 001c
0000020 fe4c 0008 2fb9 0000 000d 0000 0100 0000 # (589388, 12217, 13)
0000040 bf34 0000 fd74 0008 0000 0000 0036 0000
0000060 005c 0000 0074 0000 0096 0000 00ba 0000
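For reference, here is a minimal Python sketch of how the header fields fall out of the CompanyPortal bytes above. It assumes the `ResStringPool_header` layout from AOSP's `ResourceTypes.h` (a 12-byte `ResTable_header` followed by type, headerSize, size, stringCount, styleCount, flags, stringsStart, stylesStart) and that `od -x` ran on a little-endian machine. The helper name `parse_string_pool_header` is hypothetical, and Python is used only for illustration.

```python
import struct

# First 48 bytes of CompanyPortal's resources.arsc, copied from the od -x dump.
# od -x prints 16-bit words in host byte order, so each word is converted back
# to two little-endian bytes (assumption: the dump came from a little-endian host).
OD_WORDS = (
    "0002 000c a0b8 0054 0001 0000 0001 001c "
    "e278 0035 a31c 0000 a30f 0000 0100 0000 "
    "18c8 0005 3f78 0033 0000 0000 0003 0000"
)
data = b"".join(int(w, 16).to_bytes(2, "little") for w in OD_WORDS.split())

RES_TABLE_HEADER_SIZE = 12  # type (u16), headerSize (u16), size (u32), packageCount (u32)
UTF8_FLAG = 1 << 8


def parse_string_pool_header(buf, offset):
    """Decode a ResStringPool_header that starts at `offset` (hypothetical helper)."""
    names = ("type", "header_size", "size", "string_count",
             "style_count", "flags", "strings_start", "styles_start")
    return dict(zip(names, struct.unpack_from("<HHIIIIII", buf, offset)))


hdr = parse_string_pool_header(data, RES_TABLE_HEADER_SIZE)

# In a pool laid out the usual way, the string and style offset arrays
# (4 bytes per entry) sit between the header and the string data.
expected_strings_start = hdr["header_size"] + 4 * (hdr["string_count"] + hdr["style_count"])

print(hdr)
print("UTF-8 pool:", bool(hdr["flags"] & UTF8_FLAG))
print("strings_start consistent:", hdr["strings_start"] == expected_strings_start)
```

For these bytes the check holds with the counts that aapt reports, so the header as stored looks internally consistent; the sketch does not explain where aapt2's 529 comes from.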
With a hex lookup we can see where the incorrect values are coming from, so I first assumed it was just a different encoding, but after looking at all the representations in this web hex tool I still can’t see where aapt2 is pulling the correct values from. Specifically, I don’t see 0211 anywhere in the hex dump of the file.
- CompanyPortal (aapt): 3531384 => 0x0035e278, 41756 => 0x0000a31c, 41743 => 0x0000a30f
- CompanyPortal (aapt2): 3203980 => 0x0030e38c, 41801 => 0x0000a349, 529 => 0x00000211
- NormalApp (both): 589388 => 0x0008fe4c, 12217 => 0x00002fb9, 13 => 0x0000000d
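Eyeballing `od -x` output for a value can miss it when the 4 bytes start at an odd offset, because the value is then split across word boundaries. A small sketch of searching the raw bytes instead, assuming the counts are stored as little-endian 32-bit integers; `find_u32_le` is a hypothetical helper name, and the sample is only the 16 bytes at offset 0x10 of the CompanyPortal dump:

```python
import struct


def find_u32_le(buf, value):
    """Return every byte offset where `value` appears as a little-endian uint32."""
    needle = struct.pack("<I", value)
    hits = []
    pos = buf.find(needle)
    while pos != -1:
        hits.append(pos)
        pos = buf.find(needle, pos + 1)
    return hits


# The 16 bytes at file offset 0x10 of CompanyPortal's resources.arsc,
# written in file order (already swapped back from the od -x word order).
sample = bytes.fromhex("78e235001ca300000fa3000000010000")

for value in (3531384, 41756, 41743, 529):
    print(value, struct.pack("<I", value).hex(), find_u32_le(sample, value))
```

To scan a whole file, the same function can be called on `open("resources.arsc", "rb").read()`; hits are offsets from the start of the file.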
Below are other links I’ve used for reference, but as far as I can tell, they all seem to suggest reading Int32 at the same spot to get the counts, but I’m not good at reading C/C++ so maybe I’m missing something. Any tips/help would be much appreciated, thanks for your time.
- General description of resources.arsc
- Apk parsing tool jadx (written in Java)
- AOSP platform tools
- Android studio AAPT2