bytecode
This should be an in-depth but terse guide to Java, with examples to really get a grip on the internals.
hello
HelloWorld.java :
| hello world | |
|---|---|
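The listing itself did not survive here. A minimal sketch of what it presumably looked like, assuming only the class name from the file name and the "Jo!" output mentioned further down (the use of println and the exact layout are assumptions):

```java
public class HelloWorld {
    // Entry point: static, so it can be called without creating a HelloWorld object.
    public static void main(String[] args) {
        System.out.println("Jo!");
    }
}
```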
To run this program, the main method must be static. The class still compiles without static, but then there is no entry point and the launcher reports:
Error: Main method is not static in class HelloWorld, please define the main method as: public static void main(String[] args)
This is logical: when the JVM starts there is no instantiated HelloWorld object to call main on, and a static method belongs to the class itself, so it can be called without one.
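The difference can be sketched with two hypothetical methods (the class and method names are made up for illustration): the static one is called on the class, the other one only works once an object exists.

```java
public class StaticVsInstance {
    // Belongs to the class: callable without any object.
    public static String viaClass() {
        return "no object needed";
    }

    // Belongs to an instance: needs a StaticVsInstance object first.
    public String viaObject() {
        return "needs an object";
    }

    public static void main(String[] args) {
        System.out.println(StaticVsInstance.viaClass());
        System.out.println(new StaticVsInstance().viaObject());
    }
}
```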
Let's take apart helloworld.
javap
Of course it prints "Jo!", but there is a way to make things more transparent: javap, the Java class file disassembler. Without options it does not give the source back, it just prints the class declaration and the signatures of its members:
| javap | |
|---|---|
| output | |
|---|---|
Note the implicit default constructor that the compiler added.
But you can go further and print the bytecode with -c:
| decompile class file | |
|---|---|
Each method header is followed by its code. The number in front of each instruction is its index in the method's bytecode array, in other words the byte offset of that opcode. It rises faster after instructions that carry operands, such as invocations with a constant pool index.
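As a rough sketch of where those offsets come from: each offset is the sum of the sizes of the instructions before it. The sizes below follow the JVM instruction encoding (one opcode byte plus operand bytes); the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class BytecodeOffsets {
    /** Returns "offset: name" entries; each offset is the sum of the sizes before it. */
    public static List<String> layout(String[] names, int[] sizes) {
        List<String> lines = new ArrayList<>();
        int offset = 0;
        for (int i = 0; i < names.length; i++) {
            lines.add(offset + ": " + names[i]);
            offset += sizes[i]; // 1 opcode byte plus its operand bytes
        }
        return lines;
    }

    public static void main(String[] args) {
        // Constructor: aload_0 (1 byte), invokespecial + 2-byte pool index (3), return (1).
        System.out.println(layout(
                new String[] {"aload_0", "invokespecial", "return"},
                new int[] {1, 3, 1}));
        // main: getstatic + 2-byte index (3), ldc + 1-byte index (2),
        // invokevirtual + 2-byte index (3), return (1).
        System.out.println(layout(
                new String[] {"getstatic", "ldc", "invokevirtual", "return"},
                new int[] {3, 2, 3, 1}));
    }
}
```

This prints [0: aload_0, 1: invokespecial, 4: return] and [0: getstatic, 3: ldc, 5: invokevirtual, 8: return].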
- aload_0 pushes local variable 0 onto the operand stack; in an instance method that is "this".
- invokespecial is used for invoking instance initialization methods (constructors), among others; the comment after it shows which one. In this case it calls the constructor of java.lang.Object, the mother of all objects.
- return returns from the method without a value.
The main method does not start with aload_0, since it is a static method and does not have a this.
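- getstatic gets a static field, here the out field of the System class (java.lang is imported by default), and pushes its value, a reference to a PrintStream object, onto the stack.
- ldc pushes a constant from the constant pool onto the stack, such as an int, a float or a String.
- invokevirtual invokes the instance method referenced by constant pool entry #4, here the PrintStream method that prints the string.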
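To make the stack traffic of main concrete, here is a small sketch that replays the three instructions by hand on a Deque. This is only an illustration of the push/pop order, not how the JVM is implemented; the class name is hypothetical and println is assumed to be the method behind the invokevirtual.

```java
import java.io.PrintStream;
import java.util.ArrayDeque;
import java.util.Deque;

public class OperandStackSketch {
    /** Replays getstatic / ldc / invokevirtual and returns the stack depth at "return". */
    public static int run(PrintStream out, String constant) {
        Deque<Object> stack = new ArrayDeque<>();

        stack.push(out);      // getstatic: push the reference stored in System.out
        stack.push(constant); // ldc: push the String constant

        // invokevirtual: pop the argument first, then the receiver, and call the method.
        String argument = (String) stack.pop();
        PrintStream receiver = (PrintStream) stack.pop();
        receiver.println(argument);

        return stack.size();  // return: nothing is left on the stack
    }

    public static void main(String[] args) {
        run(System.out, "Jo!");
    }
}
```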
byte level
Even lower down, you can look at the bytes of the .class file themselves.
As a quickstart, you could represent the whole file with this C struct:
Here is a complete breakdown of HelloWorld.class:
| helloworld.class binary | |
|---|---|
This is the first line, which always starts with the magic number 0xcafebabe.
| offset | size | description |
|---|---|---|
| 0 | 4 | the magic number 0xcafebabe |
| 4 | 2 | minor version of the class file format |
| 6 | 2 | major version of the class file format |
Bytes 4-7 specify the class file version, here minor 0 and major 0x34. The major version is a sequential number linked to a specific Java release.
In this case the major version is 0x34 (52), which corresponds to Java SE 8. With minor version 0 this is Java 8 code, which matches the JDK used to compile it.
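The header fields from the table above can be read with a few lines of Java. This is a minimal sketch: the class name is hypothetical, and the "major - 44" rule of thumb for the Java release holds for major versions from 49 (Java 5) upward.

```java
import java.nio.ByteBuffer;

public class ClassHeader {
    /** Decodes magic, minor and major version from the first 8 bytes of a class file. */
    public static String describe(byte[] classBytes) {
        // ByteBuffer is big-endian by default, just like the class file format.
        ByteBuffer buf = ByteBuffer.wrap(classBytes);
        int magic = buf.getInt();            // offset 0, 4 bytes
        int minor = buf.getShort() & 0xFFFF; // offset 4, 2 bytes
        int major = buf.getShort() & 0xFFFF; // offset 6, 2 bytes
        if (magic != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        return "major=" + major + " minor=" + minor + " java=" + (major - 44);
    }

    public static void main(String[] args) {
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0x00, 0x00, 0x00, 0x34};
        System.out.println(describe(header)); // prints "major=52 minor=0 java=8"
    }
}
```

To try it on a real file, the byte array could come from java.nio.file.Files.readAllBytes instead of the literal used here.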
Next comes the constant pool table :
| offset | size | description |
|---|---|---|
| 8 | 2 | constant pool count (one more than the highest constant pool index) |
| 10 | cpsize | this is the constant pool and it is of variable size |
So offset 10 is the last fixed offset; to find out where the constant pool ends we have to read through it. The count here is 0x20, so 32. But there are not actually 32 entries: indices start at 1 (not 0), so the valid indices are 1 to 31, and long and double constants take up two slots each. In general the number of slots is the count - 1.
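Reading through the constant pool to find where it ends can be sketched as follows. The per-tag payload sizes follow the class file format; the class name and the tiny hand-made test array are assumptions for illustration, not the bytes of the real HelloWorld.class.

```java
import java.nio.ByteBuffer;

public class ConstantPoolSize {
    /** Returns the offset of the first byte after the constant pool. */
    public static int endOfConstantPool(byte[] classBytes) {
        ByteBuffer buf = ByteBuffer.wrap(classBytes);
        buf.position(8);                      // skip magic, minor, major
        int count = buf.getShort() & 0xFFFF;  // constant pool count
        for (int index = 1; index < count; index++) { // indices start at 1
            int tag = buf.get() & 0xFF;
            int size;
            switch (tag) {
                case 1: // Utf8: 2-byte length followed by that many bytes
                    size = 2 + (buf.getShort(buf.position()) & 0xFFFF);
                    break;
                case 3: case 4: // Integer, Float
                    size = 4;
                    break;
                case 5: case 6: // Long, Double: 8 bytes and two slots
                    size = 8;
                    index++;
                    break;
                case 7: case 8: case 16: case 19: case 20: // Class, String, MethodType, Module, Package
                    size = 2;
                    break;
                case 15: // MethodHandle
                    size = 3;
                    break;
                case 9: case 10: case 11: case 12: case 17: case 18: // refs, NameAndType, (Invoke)Dynamic
                    size = 4;
                    break;
                default:
                    throw new IllegalArgumentException("unknown tag " + tag);
            }
            buf.position(buf.position() + size);
        }
        return buf.position();
    }

    public static void main(String[] args) {
        byte[] tiny = {
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 0x34,
            0x00, 0x04,                    // count 4: slots 1..3
            0x01, 0x00, 0x03, 'J', 'o', '!', // slot 1: Utf8 "Jo!"
            0x05, 0, 0, 0, 0, 0, 0, 0, 1   // slots 2 and 3: one Long
        };
        System.out.println(endOfConstantPool(tiny)); // prints 25
    }
}
```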
Each entry now has this format:
Since info contains different information for each tag type, it is here represented as a byte array.
In table form :
| offset | size | description |
|---|---|---|
| 0 | 1 | tag indicating the entry type |
| 1 | var | tag specific information |
In our example the tag is 0x0a (10), which is CONSTANT_Methodref.
Filled in with the data, it becomes:
| offset | size | description |
|---|---|---|
| 0 | 1 | tag: 0x0a (CONSTANT_Methodref) |
| 1 | 2 | class index: 0x0007 |
| 3 | 2 | name and type index: 0x0010 |
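The five bytes of this entry can be decoded like this. It is a minimal sketch with a hypothetical class name; both indices point to other constant pool slots.

```java
import java.nio.ByteBuffer;

public class MethodrefEntry {
    /** Decodes a CONSTANT_Methodref entry: 1-byte tag, 2-byte class index, 2-byte name/type index. */
    public static String decode(byte[] entry) {
        ByteBuffer buf = ByteBuffer.wrap(entry); // big-endian, like the class file
        int tag = buf.get() & 0xFF;
        if (tag != 10) {
            throw new IllegalArgumentException("not a Methodref, tag " + tag);
        }
        int classIndex = buf.getShort() & 0xFFFF;
        int nameAndTypeIndex = buf.getShort() & 0xFFFF;
        return "Methodref class=#" + classIndex + " nameAndType=#" + nameAndTypeIndex;
    }

    public static void main(String[] args) {
        byte[] entry = {0x0a, 0x00, 0x07, 0x00, 0x10};
        System.out.println(decode(entry)); // prints "Methodref class=#7 nameAndType=#16"
    }
}
```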
The rest of the entries would be :
Under construction, I just got this far.