I have a fairly unique problem to deal with, and I feel like either I'm not smart enough to solve it or it's impossible (I really hope it is possible, though).
I have a C# assembly that is obfuscated rather simply: the name of every symbol (class, field, method, property, event), except method parameter names, is replaced with a garbage string (like “padfgsbjklnh”).
When I manually reverse engineer a certain version, let's say version 1, and rename the symbols with appropriate names, I have this “progress” that I've made. However, when the assembly is later updated to version 2 (content added, changed or removed), it is mostly the same data, but since it's a new version it is obfuscated again. All of my progress is discarded and I am basically back to square one.
I am trying to figure out a way to save my manual changes and transfer them to the new assembly automatically.
Now, there are a couple of limitations that make this problem really hard:
- I cannot rely on symbol names for the mapping (except method param/arg names, which are original and reliable). The symbol names are randomized with each assembly update and are unusable.
- The symbol order in the PE/COFF metadata section of the assembly cannot be relied on, since it's not consistent.
- The symbol metadata can be helpful, but cannot be relied on 100%. Since a symbol can change (e.g. its data type), it is not a bulletproof piece of data that can be used as a unique key.
- The solution I am aiming for should work with any update. This means it should still recognize a symbol I have already named after that symbol has changed. For example, if a field is defined as `int <name_is_irrelevant>;` and I have manually deobfuscated it to the appropriate name `attempts_left`, I should still be able to tell that this symbol is indeed `attempts_left` if its signature changes to `long <name_is_irrelevant>;`.
I have tried a solution where a signature hash map is built from the symbol metadata (signature, params, implmap, attributes, etc.) for both the old and the new version, and the names from the old version are then remapped onto the symbols in the new version that have the same signature. As it sounds, this was not really reliable: the slightest change (like changing a field from `int` to `long`) erases the information for an already deobfuscated symbol, and I have to resolve it manually again.
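For reference, exact-signature remapping of this kind looks roughly like the minimal sketch below. It is written in Python for brevity rather than C#, and everything in it is an assumption for illustration: the symbol dictionaries stand in for whatever a metadata reader would produce, the key only covers kind, type and parameters, and names are only carried over when a key is unique on both sides.

```python
import hashlib

def signature_key(symbol):
    """Hash the metadata that does not depend on the obfuscated name."""
    parts = [symbol["kind"], symbol["type"], *symbol.get("params", [])]
    return hashlib.sha256("|".join(parts).encode()).hexdigest()

def group_by_key(symbols):
    """Group symbols by signature key so ambiguous keys can be detected."""
    groups = {}
    for symbol in symbols:
        groups.setdefault(signature_key(symbol), []).append(symbol)
    return groups

def remap_exact(old_symbols, new_symbols, known_names):
    """Return {new obfuscated name: readable name} for unique exact matches."""
    old_groups = group_by_key(old_symbols)
    new_groups = group_by_key(new_symbols)
    result = {}
    for key, olds in old_groups.items():
        news = new_groups.get(key, [])
        if len(olds) == 1 and len(news) == 1:
            old_name = olds[0]["name"]
            if old_name in known_names:
                result[news[0]["name"]] = known_names[old_name]
    return result

# Hypothetical version 1 and version 2 symbols (all names are made up).
old_symbols = [
    {"name": "padfgsbjklnh", "kind": "field", "type": "int"},
    {"name": "qwertzuiop", "kind": "method", "type": "void",
     "params": ["string userName"]},
]
new_symbols = [
    {"name": "zzxcvbnmas", "kind": "field", "type": "long"},  # int -> long
    {"name": "mnbvcxylkj", "kind": "method", "type": "void",
     "params": ["string userName"]},
]
known_names = {"padfgsbjklnh": "attempts_left", "qwertzuiop": "set_user_name"}

remapped = remap_exact(old_symbols, new_symbols, known_names)
print(remapped)
```

In this made-up data only the method survives the update. The field's key changes together with its type, so `attempts_left` is dropped, which is exactly the failure described above.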
I thought about using behavior patterns instead: analyzing which symbol is used where and how, and using that to “tag/categorize/label” it uniquely. But this might break if, for example, a field is no longer used in a certain method.
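As a rough illustration of the usage-based idea, the sketch below describes each symbol by a set of features and pairs old and new symbols by how much those sets overlap (Jaccard similarity) rather than by exact equality. It is Python for brevity, and the feature strings, the threshold value and the greedy pairing are all assumptions, not a tested design. The features lean on the parameter names of the methods that touch a field, since those names survive obfuscation.

```python
def jaccard(a, b):
    """Overlap of two feature sets, from 0.0 (disjoint) to 1.0 (identical)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def match_by_usage(old_features, new_features, threshold=0.4):
    """Greedily pair old and new symbols, best overlap first, one-to-one."""
    scored = sorted(
        ((jaccard(f_old, f_new), old, new)
         for old, f_old in old_features.items()
         for new, f_new in new_features.items()),
        key=lambda item: item[0],
        reverse=True,
    )
    used_old, used_new, mapping = set(), set(), {}
    for score, old, new in scored:
        if score < threshold:
            break  # everything after this point is too dissimilar
        if old in used_old or new in used_new:
            continue
        mapping[old] = new
        used_old.add(old)
        used_new.add(new)
    return mapping

# Hypothetical features: the field type plus the parameter lists of the
# methods that read or write the field.
old_features = {
    "padfgsbjklnh": {"type:int", "read_by:(password)",
                     "written_by:(password)",
                     "read_by:(userName,password)", "written_by:()"},
    "hgfdsapoiu": {"type:string", "read_by:(userName)"},
}
new_features = {
    # Type changed to long and one writer disappeared.
    "zzxcvbnmas": {"type:long", "read_by:(password)",
                   "written_by:(password)", "read_by:(userName,password)"},
    "mnbvcxylkj": {"type:string", "read_by:(userName)"},
}
known_names = {"padfgsbjklnh": "attempts_left"}

mapping = match_by_usage(old_features, new_features)
carried = {mapping[old]: name for old, name in known_names.items()
           if old in mapping}
print(carried)
```

With this made-up data the field still pairs with its new counterpart even though its type changed and one usage disappeared, because three of the six distinct features are shared. A lost usage lowers the score instead of breaking the match outright, but symbols with few or generic usages could still collide, so low-scoring pairs would need manual review.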
I think this is a really hard problem and possibly impossible with my requirements, so any help/brainstorming is really appreciated!