The Scala Cafebabe Bytecode Generation Library

IMPORTANT: This page is now outdated. Cafebabe has moved to GitHub and so has the documentation.

Introduction

This page is an introduction to the byte code generation library “Cafebabe”. Rather than providing an API reference, we present the capabilities of the library through examples.

Types in .class files

Type                              in Java code         in .class files
Void                              void                 V
Integer                           int                  I
Boolean                           boolean              Z
Byte                              byte                 B
Single precision floating point   float                F
Double precision floating point   double               D
Long integer                      long                 J
Short integer                     short                S
Character                         char                 C
Object reference                  package.ClassName    Lpackage/ClassName;
Array                             type[]               [type

For instance, an array of arrays of strings is written as:

String [][]

…in Java, and represented as [[Ljava/lang/String; in .class files.

Method signatures are written as the concatenation of the parameters in parentheses, followed by the return type. For instance:

String repString(int times, String original) { ... }

…has the signature (ILjava/lang/String;)Ljava/lang/String;, and:

int foo(double d, boolean test, byte b) { ... }

…has the signature (DZB)I.
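
The encoding rules above can be sketched as a small, self-contained Scala helper. This is an illustration only and not part of Cafebabe: the Descriptors object and its two functions are hypothetical names, and the sketch assumes that class names are given fully qualified and dot-separated.

```scala
object Descriptors {
  // Primitive Java type names and their one-letter .class file encodings.
  private val primitives: Map[String, String] = Map(
    "void" -> "V", "int" -> "I", "boolean" -> "Z", "byte" -> "B",
    "float" -> "F", "double" -> "D", "long" -> "J", "short" -> "S",
    "char" -> "C"
  )

  /** Encodes one Java type name, e.g. "java.lang.String[][]". */
  def typeDescriptor(javaType: String): String = {
    val trimmed = javaType.trim
    if (trimmed.endsWith("[]"))
      // Each trailing [] becomes one leading '['.
      "[" + typeDescriptor(trimmed.dropRight(2))
    else
      // Anything that is not a primitive is treated as an object reference.
      primitives.getOrElse(trimmed, "L" + trimmed.replace('.', '/') + ";")
  }

  /** Encodes a method signature: parameters in parentheses, then the return type. */
  def methodDescriptor(paramTypes: Seq[String], returnType: String): String =
    paramTypes.map(typeDescriptor).mkString("(", "", ")") + typeDescriptor(returnType)
}
```

For instance, Descriptors.methodDescriptor(Seq("double", "boolean", "byte"), "int") yields (DZB)I, matching the second example above.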

Examples

Creating a .class file

To create a handler to a class file representation, simply write:

val classFile = new ClassFile("TestClass", None)

…where the first parameter is the (fully qualified) class name, and the second one is the name of a potential parent class wrapped in an Option value. Use:

val classFile = new ClassFile("TestSubClass", Some("TestClass"))

…to specify a parent class different from the default Java Object class.

Once all the fields, methods and code chunks for the methods have been set, you can write the .class file itself to the disk by using:

classFile.writeToFile("./TestClass.class")

Note that to comply with the JVM specification, the file name and path should match the qualified class name in the standard way (subdirectories represent packages, etc.).
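
The standard mapping from a qualified class name to a file path can be sketched as follows. This helper is not part of Cafebabe; ClassFilePaths and classFilePath are hypothetical names, and the sketch assumes the class name is dot-separated.

```scala
object ClassFilePaths {
  /** Relative path of the .class file for a dot-separated qualified class name. */
  def classFilePath(qualifiedName: String): String =
    // Each package component becomes a subdirectory; the class name becomes the file name.
    qualifiedName.replace('.', '/') + ".class"
}
```

A class in the default package, such as TestClass, maps to TestClass.class, while a class in a (made-up) package org.example maps to org/example/TestClass.class.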

Adding fields

To add a field to a class, use the following method:

val fh: FieldHandler = classFile.addField("Ljava/lang/String;", "name")

…where the first argument is the field's type properly encoded, and the second one the field's name. Fields are protected and non-final by default. Note that the returned FieldHandler has no particular use.

Adding methods

To add a method, use:

val mh: MethodHandler = classFile.addMethod("I", "sayHello", "IZLjava/lang/String;")

This adds a method named sayHello, which returns an integer and takes as arguments an integer, a Boolean and a string. An accepted alternative syntax is the following:

val mh: MethodHandler = classFile.addMethod("I", "sayHello", "I", "Z", "Ljava/lang/String;")

…where the parameter types are explicitly separated.
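
To see why the two forms describe the same parameter list, the concatenated string can be split back into one descriptor per parameter. The following is a self-contained sketch rather than Cafebabe code; ParamSplitter and splitParams are hypothetical names introduced for illustration.

```scala
object ParamSplitter {
  /** Splits concatenated parameter descriptors into one descriptor per parameter. */
  def splitParams(descriptors: String): List[String] = {
    // Returns the index just past the single descriptor that starts at `from`.
    def endOfOne(from: Int): Int = {
      require(from < descriptors.length, "truncated descriptor")
      descriptors.charAt(from) match {
        case '[' => endOfOne(from + 1) // array: skip the bracket, then read the element type
        case 'L' =>
          // object reference: runs up to and including the closing ';'
          val semi = descriptors.indexOf(';', from)
          require(semi >= 0, "object type without closing ';'")
          semi + 1
        case _ => from + 1 // primitive type: a single character
      }
    }

    @annotation.tailrec
    def loop(from: Int, acc: List[String]): List[String] =
      if (from >= descriptors.length) acc.reverse
      else {
        val end = endOfOne(from)
        loop(end, descriptors.substring(from, end) :: acc)
      }

    loop(0, Nil)
  }
}
```

Applied to "IZLjava/lang/String;", this returns the three descriptors I, Z and Ljava/lang/String; used in the second form.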

Methods are public by default. The main usage of the MethodHandler is to get a CodeHandler to actually attach some code to the method (see below).

Attaching code to methods

To attach code to a method, one needs to recover a CodeHandler instance referring to that method. The way to do this is the following:

val ch: CodeHandler = methodHandler.codeHandler

Alternatively, the handler can be recovered directly when the method is added, as follows:

val ch: CodeHandler = classFile.addMethod("V", "foo", "I").codeHandler

Standard Java Byte Codes

A CodeHandler instance can be used to add byte codes to a method body using a “C++ stream”-like notation. All Java byte codes are supported. The following is the code for a non-static method that multiplies its first (integer) argument by two and returns the result:

ch << ILOAD_1 << DUP << IADD << IRETURN

Java byte codes are written using exactly the names used in the JVM specification, except that they are fully capitalized, as in the above example.
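
To check that this four-instruction sequence really doubles the argument, it can be traced on a toy operand stack. The interpreter below is a self-contained sketch: the four case objects are local stand-ins that merely share their names with the instructions, not Cafebabe's byte code objects, and MiniStackMachine is a hypothetical name. Slot 0 of the locals holds `this` for a non-static method (a placeholder value here), so the first argument lives in slot 1.

```scala
object MiniStackMachine {
  // Local stand-ins for the four instructions used above.
  sealed trait Op
  case object ILOAD_1 extends Op
  case object DUP extends Op
  case object IADD extends Op
  case object IRETURN extends Op

  /** Runs straight-line code and returns the value handed to IRETURN. */
  def run(code: List[Op], locals: Vector[Int]): Int = {
    @annotation.tailrec
    def step(rest: List[Op], stack: List[Int]): Int = (rest, stack) match {
      case (ILOAD_1 :: tail, _)            => step(tail, locals(1) :: stack)    // push local slot 1
      case (DUP :: tail, top :: below)     => step(tail, top :: top :: below)   // duplicate the top value
      case (IADD :: tail, b :: a :: below) => step(tail, (a + b) :: below)      // pop two values, push their sum
      case (IRETURN :: _, top :: _)        => top                               // return the top value
      case _ => throw new IllegalStateException("malformed code or stack underflow")
    }
    step(code, Nil)
  }
}
```

With 21 in slot 1, the stack goes through [21], [21, 21] and [42] before the result is returned.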

Abstract Byte Codes

In addition to the “standard” Java byte codes, Cafebabe defines some “abstract” byte codes that exist purely for convenience, most often to avoid dealing explicitly with constant pool entries or to handle labels and jumps automatically. By convention, their names are written in CamelCase to distinguish them from standard byte codes.

You can find a list of the available abstract byte codes here.

Getting slots for local variables

In order to get slot indices which are known not to be used at a given program point, we use the following method:

val fresh: Int = ch.getFreshVar

When we are done working on a local variable (for instance because it was only used in a temporary computation), we can notify the CodeHandler that the slot has become free again:

ch.freeVar(fresh)

The CodeHandler keeps track of the slots handed out this way and does not reuse a slot for a new variable until it has been freed. Note that all slots should be acquired and released through these two methods, as this is the only way to guarantee that the computed maximum number of locals is accurate.
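
The bookkeeping described above can be pictured with a toy allocator. This is a model for illustration and not Cafebabe's actual implementation: the SlotAllocator class, its firstFree parameter and its maxLocals method are assumptions, as is the choice to hand out the lowest unused index. It also ignores that long and double values occupy two slots.

```scala
import scala.collection.mutable

/** Toy model of local variable slot bookkeeping. */
class SlotAllocator(firstFree: Int) {
  private val inUse = mutable.Set.empty[Int]
  private var highWater = firstFree // one past the highest slot ever handed out

  /** Returns the lowest slot index that is currently unused. */
  def getFreshVar: Int = {
    val slot = Iterator.from(firstFree).find(i => !inUse(i)).get
    inUse += slot
    highWater = math.max(highWater, slot + 1)
    slot
  }

  /** Marks a slot as reusable. */
  def freeVar(slot: Int): Unit = {
    inUse -= slot
  }

  /** Number of local variable slots the method needs overall. */
  def maxLocals: Int = highWater
}
```

In this model, a freed slot is handed out again by the next request, while the reported maximum only grows. A slot written to without going through the allocator would not be counted, which is why bypassing the two methods can make the maximum inaccurate.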

Getting fresh label names

To get fresh label names for flow-control abstract byte codes, the following helper is available:

ch.getFreshLabel("afterloop")

The label name is guaranteed to be fresh and will be prefixed with the given string.
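
One simple way to obtain such names is to append a counter to the prefix. The sketch below illustrates the idea only; the LabelNamer class and the prefix_N naming scheme are assumptions, and Cafebabe may format its labels differently.

```scala
/** Toy model of fresh label generation. */
class LabelNamer {
  private var counter = 0

  /** Returns a name that starts with `prefix` and has not been returned before. */
  def getFreshLabel(prefix: String): String = {
    val name = prefix + "_" + counter
    counter += 1 // the counter never repeats, so neither does the name
    name
  }
}
```

Requesting the same prefix twice therefore yields two distinct names, which makes it safe to generate the same code pattern (a loop, for instance) several times within one method.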

Freezing: Finalizing the code

When the body of a method is completely specified, one needs to call the .freeze method on its CodeHandler:

ch.freeze

This:

  • translates all abstract byte codes into actual byte codes
  • computes the jump offsets and inserts them at the right places
  • computes the maximum stack height
  • computes the maximum number of local variables

The last two numbers are required by the class file format.
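
As an illustration of the third step, the maximum stack height of straight-line code is the largest running total of each instruction's net effect on the stack. The sketch below is self-contained and is not how Cafebabe necessarily computes it (code with jumps requires following every control-flow path); StackHeight and maxStack are hypothetical names.

```scala
object StackHeight {
  /** Maximum stack height of straight-line code, given each instruction's net stack effect. */
  def maxStack(effects: Seq[Int]): Int =
    // Running totals give the stack height after each instruction, starting from an empty stack.
    effects.scanLeft(0)(_ + _).max
}
```

For the sequence ILOAD_1, DUP, IADD, IRETURN, the net effects are +1, +1, -1 and -1, so the heights are 1, 2, 1 and 0, and the maximum is 2.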

Miscellaneous

Default constructors

A default constructor can be automatically generated for a class file by calling:

classFile.addDefaultConstructor

Main method

A MethodHandler for a public, static method returning void, called main and with a string array as a single argument can be obtained by using:

val mainHandler = classFile.addMainMethod

…the code can then be attached using its CodeHandler as usual.

Debug printing

As long as the code is not “frozen”, it can be printed out by calling:

codeHandler.print

This is used mostly for debugging jumps and labels, as method and field accesses are already encoded with constant pool indices at this point (displayed as RawBytes(n)).

Additional comments can be added to the stream of bytecodes using:

codeHandler << Comment("The loop starts here")

These comments are displayed in the debug print-out, but do not generate any bytecode.

Source code and binaries

Let us know if you find a bug.

Authors

Cafebabe was written by Philippe Suter and Sebastian Gfeller.