Wednesday, 20 April 2011

Deobfuscating Java projects

Lately I've been digging into Minecraft. Normally digging into Minecraft means something like this:

Instead, I was digging into something like this:

import java.io.DataInputStream;
import java.io.DataOutputStream;

public class a extends gc
{
  public int a;
  public int b;
  public int c;

  public void a(DataInputStream paramDataInputStream)
  {
    this.a = paramDataInputStream.readInt();
    this.b = paramDataInputStream.readInt();
    this.c = paramDataInputStream.readByte();
  }

  public void a(DataOutputStream paramDataOutputStream) {
    paramDataOutputStream.writeInt(this.a);
    paramDataOutputStream.writeInt(this.b);
    paramDataOutputStream.writeByte(this.c);
  }
}

Obfuscated Java code :-(

What is obfuscated code?

By obfuscating your Java classes you want to prevent others from taking your source, modifying it and releasing it again under another brand. There are tools which obfuscate your code automatically. These tools normally do the following things:

  • Put all artifacts (classes, interfaces, ...) into the default package
  • Rename all artifacts and members to something generic like 'a'

These changes make it very hard for humans to understand what the code does. But there are a few things that can't be changed that easily during obfuscation:

  • Access modifiers
  • Types
  • Signatures in general
  • Import statements
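To make the list above concrete, here is a minimal sketch of how such a name-independent "fingerprint" could be read from a compiled class via reflection. It is not taken from any existing tool: the class `SignatureFingerprint`, the helper `typeName` and the two sample classes are hypothetical names made up for this illustration. The sketch also assumes that only primitives and `java.*` types keep their names through obfuscation, so every other type is replaced by a `?` placeholder.

```java
import java.io.DataInputStream;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: describes a class by the things obfuscation leaves alone.
class SignatureFingerprint {

    // Hypothetical sample classes: same shape, different names.
    public static class PositionPacket {
        public int x;
        public int y;
        public void read(DataInputStream in) { }
    }

    public static class A {
        public int a;
        public int b;
        public void a(DataInputStream paramDataInputStream) { }
    }

    // Assumption: primitives and java.* types survive obfuscation unchanged;
    // any other type may have been renamed, so it becomes the placeholder "?".
    private static String typeName(Class<?> type) {
        Class<?> base = type;
        StringBuilder suffix = new StringBuilder();
        while (base.isArray()) {
            base = base.getComponentType();
            suffix.append("[]");
        }
        boolean stable = base.isPrimitive() || base.getName().startsWith("java.");
        return (stable ? base.getName() : "?") + suffix;
    }

    // Returns a sorted list of features built from modifiers and types only;
    // field and method names are deliberately ignored.
    public static List<String> of(Class<?> type) {
        List<String> features = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            if (field.isSynthetic()) {
                continue; // skip compiler-generated members
            }
            features.add("field " + Modifier.toString(field.getModifiers())
                    + " " + typeName(field.getType()));
        }
        for (Method method : type.getDeclaredMethods()) {
            if (method.isSynthetic()) {
                continue;
            }
            StringBuilder sb = new StringBuilder("method ");
            sb.append(Modifier.toString(method.getModifiers())).append(' ');
            sb.append(typeName(method.getReturnType())).append('(');
            Class<?>[] params = method.getParameterTypes();
            for (int i = 0; i < params.length; i++) {
                if (i > 0) {
                    sb.append(',');
                }
                sb.append(typeName(params[i]));
            }
            features.add(sb.append(')').toString());
        }
        Collections.sort(features); // declaration order is not guaranteed
        return features;
    }

    public static void main(String[] args) {
        System.out.println(of(A.class));
        System.out.println(of(A.class).equals(of(PositionPacket.class)));
    }
}
```

In this sketch the readable `PositionPacket` and the renamed `A` end up with the same fingerprint, because nothing in it depends on a name that an obfuscator would touch.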

What does all this have to do with Minecraft?

Mojang obfuscates its Minecraft releases so nobody can steal their code. Unfortunately, because Minecraft has no APIs of its own, this makes it very hard for developers to extend it. So the community helped itself and created Bukkit, an API for Minecraft. Bukkit is so important precisely because of the Minecraft obfuscation: with every Minecraft release (major or minor) the obfuscated names of the classes and their members change. Bukkit is the abstraction layer that hides these obfuscation changes.

But this means a lot of work for the Bukkit developers [1]. They need to undo the Minecraft obfuscation by hand for every Minecraft release.

The plan

Now there is a chance to make undoing the obfuscation easier. But it will only work when the following conditions are met:

  • You have the deobfuscated source of a former version
  • The software is developed iteratively, so only relatively small amounts of the code change between two versions

With these assumptions you can try to match artifacts from the old source statistically with artifacts from the new obfuscated source. You just need to compare the things that don't change during obfuscation. There's a good chance that the two artifacts with the highest similarity are actually the same artifact. This allows you to transfer the artifact's name and package from the old source into the new source.
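The matching idea can be sketched roughly as follows. This is only an illustration under assumptions, not the code of any existing tool: `ArtifactMatcher` and its methods are hypothetical names, each artifact is assumed to be described by a set of name-independent feature strings (modifiers plus types), and Jaccard similarity is just one possible way to measure "equality" between two such sets.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: matches obfuscated artifacts to known ones by similarity.
class ArtifactMatcher {

    // Jaccard similarity: shared features divided by all features (0.0 to 1.0).
    public static double similarity(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0;
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        return (double) intersection.size() / union.size();
    }

    // For every obfuscated artifact, picks the known artifact with the highest
    // similarity. Artifacts without any shared feature stay unmatched.
    // Simplification: several obfuscated artifacts may map to the same name.
    public static Map<String, String> match(Map<String, Set<String>> known,
                                            Map<String, Set<String>> obfuscated) {
        Map<String, String> renames = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> candidate : obfuscated.entrySet()) {
            String bestName = null;
            double bestScore = 0.0;
            for (Map.Entry<String, Set<String>> original : known.entrySet()) {
                double score = similarity(candidate.getValue(), original.getValue());
                if (score > bestScore) {
                    bestScore = score;
                    bestName = original.getKey();
                }
            }
            if (bestName != null) {
                renames.put(candidate.getKey(), bestName);
            }
        }
        return renames;
    }
}
```

The result maps each obfuscated name (such as `a`) to the best-matching fully qualified name from the old source. Because sets drop duplicates, a class with three `int` fields looks the same as one with a single `int` field here. A more careful version would probably count features and enforce a one-to-one assignment.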

The project

I like the idea of automatically deobfuscating Java code. So I've hacked something together and released it on GitHub. java-deobscurify tries to find matching artifacts in the clear-text and the obfuscated source. It is still a work in progress. Let's say pre-alpha... but I hope it will produce reasonable output in the near future.

References

  1. When will Bukkit be updated for Minecraft 1.5?