Pages

Wednesday, May 30, 2012

Compiling SWT Applications with GCJ, Part 2

In the previous part, I compiled a simple SWT application using GCJ, but the size of the resulting EXE file was less than optimal:

27.724.288 Bytes

This is the result of linking with the very large swt.o - which was generated from swt.jar. A simple way to verify that we can do better: I will remove one of the .class files from swt.jar, and then try to recompile the test application SWTTest.

Here are the steps to follow:
  1. Extract the JAR contents.
  2. Remove a .class file that is not needed.
  3. Recreate the JAR.
  4. Rebuild the application.

For a simple experiment I decided to try and remove the Browser class. Afterwards, the resulting application size was reduced to 25.269.248 Bytes. So it already makes a bit of a difference just from removing this one unneeded class. But how can I optimally remove all unneeded classes?

Let me use a really naive procedure to try and optimize this. The SWT library can be thought of as a list of classes which we can number C_1, C_2, ..., C_N. I will arbitrarily choose a class C_x and remove it. Then, I will try to rebuild swt.o (this time without C_x) and finally to link it with the test application SWTTest. If the build/link succeeds, then the class C_x is evidently not needed for SWTTest.

However, a failure to build swt.o in this scenario does not necessarily mean that SWTTest needs the class C_x. For example, it could be the case that a different class in swt.jar is dependent on C_x (which we tried to remove) even though SWTTest does not need it. In fact, if we could find all of C_x's dependencies first and remove them, then the build of swt.o and subsequently of SWTTest would succeed.

One way to solve this problem is to sort the list of classes in order of increasing dependency, such that the least dependent classes come early and the most dependent packages come later. We can refer to this ordering informally as "build ordering" to build a project, the compiler must presumably build the classes in this order. Note that there may be more than one valid build ordering.

The procedure to produce this type of ordering is known as Topological Sort, which is described well in this article: patrick dewane gathers wool: Linear Ordering of Dependencies with a Topological Sort. After performing the topological sort, the classes are in our required build ordering.

Let me now define my naive algorithm more precisely:
Given as input: a list of classes and a project which depends on one or more classes in this list.
To be determined: the minimum list of required classes to build the project successfully.
  1. Sort all classes using the topological sort, such that the classes form a "build ordering". That is, if class B is dependent on class A, then class A must appear before class B in the sorted list. Let us refer to this sorted list of classes as L.
  2. Let i be an iterator for L = L[1], L[2], .... L[L.length]. Initially, i points to the last class in L. That is, i = L.length.
  3. Consider the class C = L[i]. Remove C from the JAR file and try to build the project. If the build fails, restore the removed class C to the JAR file. If the build succeeds, let L[i] = NULL.
  4. Let i = i - 1 and repeat step 3. Stop when i = 0 (no more classes are left).
  5. Remove all NULL entries from list L to produce list LStar. The list LStar[1], LStar[2], ..., LStar[LStar.length] is the minimum list of required classes.
Notes:
  • The inputs for our test case are: the list of classes in org.eclipse.swt, and the project SWTTest (a simple "Hello World" project that uses SWT).
  • For Step 1 (Topological sort): To simplify checking the dependency of each class, let us assume a tool exists called ClassDependsOn which returns the dependency list of a given .class file.
  • For Step 3 (Removing a class): When removing a class called SomeClass, there may be multiple files associated with this class named SomeClass.class, SomeClass$1.class, SomeClass$2.class, .... When removing SomeClass, it is necessary to remove all of these files.
  • For Step 3 (Restoring a class): To facilitate easily restoring a removed class, do not actually delete .class files when removing the class. Instead, move them outside of the JAR file. A simple way to do this is as follows: Suppose you are building the JAR files from inside the directory called C:\Foo. In this case, removing the class SomeClass means to move the file SomeClass.class which is located in the directory C:\Foo\com\mycompany\mypackage into a directory called C:\Foo.trash\com\mycompany\mypackage. Notice that the trash folder C:\Foo.trash should have an identical folder structure as C:\Foo.
  • For Step 3 (Testing the build): To test the build, it seems it is necessary to make a JAR file first and then build the JAR, resulting in swt.o. Afterwards compile and link the SWTTest project as described in Compiling SWT Applications with GCJ, Part 1 to verify the build process.

No comments:

Post a Comment