Synchronizing GC in Java and V8

In the last post I wrote that I work on a project that involves a lot of interoperability between Java and V8 JavaScript engine. Here is an interesting problem I was investigating the last couple of days.

Both V8 and JVM use garbage collector for memory management. While using GC provides a lot of benefits sometimes having two garbage collectors in a single process can be tricky though. Suppose we have a super-charged version of LiveConnect where we have access to the full Java API.

var file = new java.io.File("readme.txt");

console.log("length=" + file.length());

These two lines of JavaScript may seem quite simple at first glance. We create an instance of java.io.File and call one of its methods. The tricky part is that we are doing this from JavaScript and we must take care that the actual Java instance would not be GC’ed before we callĀ length method. In other words, we should provide some form of memory management. Suppose we decide to use JNI global references and we call NewGlobalRef every time when we create a new Java object from JavaScript. Accordingly we call DeleteGlobalRef when V8 makes Java object unreachable from JavaScript.

Let’s see a more complicated scenario.

var outStream = new java.io.FileOutputStream("log.txt");

var eventCallback = new com.example.EventCallback({
    onDataReceived: function(data) {
       outStream.write(data);
    }
});

var listener = new com.example.EventListener(eventCallback);

In this case we create an instance of com.example.EventCallback and provide its implementation in JavaScript. Now suppose that all these three JavaScript objects become unreachable and V8 is ready to GC them. Just because all of these objects are unreachable in JavaScript it does not mean that their actual counterparts in Java are unreachable. It’s possible that listener and eventCallback objects are still reachable through a stack of a listener Java thread.

gcchain

Now comes the interesting detail. While in JavaScript eventCallback has a reference to outStream through the function onDataReceived there is no such reference in Java and it is legitimate for Java GC to collect outStream object. The next time when the callback object calls write method there won’t be a corresponding Java object and the application will fail.

There are several solutions to this problem. One of them is to maintain the reachability in Java GC heap graph in sync with the one in JavaScript. After all, if there is an edge connecting eventCallback and outStream Java GC won’t try to collect the latter.

There are two options:

  • sync Java heap graph automatically
  • sync Java heap graph manually

As usual there is a trade-off. While the first option is very desirable there is a price to pay. We should analyze every closure in V8 that is GC’ed and traverse all objects reachable from there. This could slow down the GC by orders of magnitude.

The second option also has drawbacks. In general, JavaScript developers are not used to manual memory management. Introducing new memory management API could cause a lot of discomfort to the less experienced JavaScript developers.

scope(eventCallback, outStream);

Event if we make the API nice and simple, there is a burden of the mental model that JavaScript developers have to maintain. I tend to prefer this option though because many C/C++ developers proved it is possible to build high quality software using manual memory management.

In closing I would say that there are other solutions to this problem. I’ll discuss them in another blog post.