android – Never Ending Journey

Memory management in NativeScript for Android

Note: This post will be a bit different from the previous ones. It’s intended to provide a brief history of why the current NativeScript for Android implementation is designed the way it is. So, this post will be most useful for my Telerik ex-colleagues. Think of it as a kind of historical documentation. Also, it is a chance to have a peek inside a developer’s mind 😉

I already gave you a hint about my current affairs. In February I took the opportunity to pursue new ventures in a new company. The fact that my new office is in the building right next to Telerik HQ gives me an opportunity to keep close connections with my former colleagues. During one coffee break I was asked about the current memory management implementation. As I am no longer with Telerik, my former colleagues are missing some important history that explains why this feature is implemented this way. I tried to explain that particular technical issue briefly in a previous post; however, I couldn’t go into much depth because NativeScript had not been announced yet. So, here I’ll try to provide more details.

Note: Keep in mind that this post is about NativeScript for Android, so I will focus only on that platform.

On the very first day of the project, we decided that we should explore what can be done with JavaScript-to-Java bidirectional marshalling. So, we set a simple goal: make an app with a single button that increments a counter. Let’s see what the Android docs say about the button widget.

 public class MyActivity extends Activity {
     protected void onCreate(Bundle savedInstanceState) {
         super.onCreate(savedInstanceState);

         setContentView(R.layout.content_layout_id);

         final Button button = findViewById(R.id.button_id);
         button.setOnClickListener(new View.OnClickListener() {
             public void onClick(View v) {
                 // Code here executes on main thread after user presses button
             }
         });
     }
 }

After so many years, this is still the first code fragment you see on the site. And so it should be. This code fragment captures the very essence of what the button widget is and how it is used. We wanted to provide a JavaScript syntax that feels familiar to Java developers. So, we ended up with the following syntax:

var button = new android.widget.Button(context);
button.setOnClickListener(new android.view.View.OnClickListener({
   onClick: function() {
      // do some work
   }
}));

This example has been shown countless times in the NativeScript docs and in various presentation slides and materials. It is part of our first and main test/demo app.

Motivation: we wanted to provide JavaScript syntax which is familiar to existing Android developers.

This decision brings an important implication, namely the usage of JavaScript closures. To understand why closures are important for the implementation, we could take a look at the following simple, but complete, Java example.

package com.example;

import android.app.Activity;
import android.os.Bundle;
import android.view.View;
import android.widget.Button;
import android.widget.LinearLayout;
import android.widget.TextView;

public class MyActivity extends Activity {
    private int count = 0;

    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);

        LinearLayout layout = new LinearLayout(this);
        layout.setFitsSystemWindows(false);
        layout.setOrientation(LinearLayout.VERTICAL);

        final TextView txt = new TextView(this);
        layout.addView(txt);

        Button btn = new Button(this);
        layout.addView(btn);
        btn.setText("Increment");
        btn.setOnClickListener(new View.OnClickListener() {
            @Override
            public void onClick(View view) {
                txt.setText("Count:" + (++count));
            }
        });

        setContentView(layout);
    }
}

Behind the scenes, the Java compiler generates an anonymous class that we can decompile and inspect closely. For the purpose of this post I am going to use the fernflower decompiler. Here is the output for the MyActivity$1 class.

package com.example;

import android.view.View;
import android.view.View.OnClickListener;
import android.widget.TextView;

class MyActivity$1 implements OnClickListener {
   // $FF: synthetic field
   final TextView val$txt;
   // $FF: synthetic field
   final MyActivity this$0;

   MyActivity$1(MyActivity this$0, TextView var2) {
      this.this$0 = this$0;
      this.val$txt = var2;
   }

   public void onClick(View view) {
      this.val$txt.setText("Count:" + MyActivity.access$004(this.this$0));
   }
}

We can see that the Java compiler generates code that:
1) captures the variable txt
2) deals with the ++count expression

This means that the click handler object holds references to the objects it accesses in its closure. We can call this class stateful as it has class members. Fairly trivial observation.
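The same observation can be reproduced without Android. The following is a minimal sketch in plain Java; the class and method names (CaptureDemo, makeHandler, syntheticFields) are made up for illustration. It uses reflection to list the compiler-generated fields of an anonymous class that captures a local variable and touches a field of its enclosing instance. The exact field names depend on the compiler; with javac they typically look like val$label and this$0.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of NativeScript or Android.
class CaptureDemo {
    private int count = 0;

    // Returns an anonymous class instance that captures the local 'label'
    // and increments the enclosing instance's 'count' field.
    Runnable makeHandler() {
        final StringBuilder label = new StringBuilder("Count:");
        return new Runnable() {
            @Override
            public void run() {
                label.append(++count);
            }
        };
    }

    // Lists the compiler-generated (synthetic) fields of the object's class.
    static List<String> syntheticFields(Object o) {
        List<String> names = new ArrayList<>();
        for (Field f : o.getClass().getDeclaredFields()) {
            if (f.isSynthetic()) {
                names.add(f.getType().getSimpleName() + " " + f.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Runnable handler = new CaptureDemo().makeHandler();
        // Prints the captured state the compiler stored on the handler object.
        System.out.println(syntheticFields(handler));
    }
}
```

The handler object carries its captured state as ordinary fields, which is why the Java garbage collector can follow those references without any extra help.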

Let’s take a look again at the previous JavaScript code.

var button = new android.widget.Button(context);
button.setOnClickListener(new android.view.View.OnClickListener({
   onClick: function() {
      // do some work
   }
}));

We access the button widget and call its setOnClickListener method with some argument. This means that a Java object which implements OnClickListener must have been instantiated so that the button can use it later. You can find the class implementation for that object in your project platform directory:

[proj_dir]/platforms/android/src/main/java/com/tns/gen/android/view/View_OnClickListener.java

Let’s see what the actual implementation is.

package com.tns.gen.android.view;

public class View_OnClickListener
       implements android.view.View.OnClickListener {
  public View_OnClickListener() {
    com.tns.Runtime.initInstance(this);
  }

  public void onClick(android.view.View param_0)  {
    java.lang.Object[] args = new java.lang.Object[1];
    args[0] = param_0;
    com.tns.Runtime.callJSMethod(this, "onClick", void.class, args);
  }
}

As we can see, this class acts as a proxy and doesn’t have fields. We can call this class stateless: it stores no information that could describe its closure, if any.

So, we saw that the Java compiler generates classes that keep track of their closures, while NativeScript generates classes that don’t. This is a simple implication of the fact that JavaScript is a dynamic language and the information about lexical scope is not enough for full static analysis. The full information about JavaScript closures can be obtained at run time only.
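To make the contrast concrete, here is a minimal sketch in plain Java of a stateless proxy that forwards every call by name. ToyRuntime and ClickProxy are hypothetical stand-ins invented for this illustration; they are not the real com.tns.Runtime or the generated View_OnClickListener, and a Java lambda plays the role of the JavaScript function.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for the script side: maps each proxy instance to its
// named implementations. This is NOT the real com.tns.Runtime.
class ToyRuntime {
    private static final Map<Object, Map<String, Function<Object[], Object>>> IMPLS =
            new IdentityHashMap<>();

    static void register(Object proxy, String method, Function<Object[], Object> impl) {
        IMPLS.computeIfAbsent(proxy, k -> new HashMap<>()).put(method, impl);
    }

    static Object call(Object proxy, String method, Object... args) {
        return IMPLS.get(proxy).get(method).apply(args);
    }
}

// Stateless proxy: it declares no fields, so nothing on the Java side
// records which objects the registered implementation has captured.
class ClickProxy {
    void onClick(Object view) {
        ToyRuntime.call(this, "onClick", view);
    }

    // Usage example: the counter lives in the closure, not in the proxy.
    static int demo() {
        final int[] counter = {0};
        ClickProxy proxy = new ClickProxy();
        ToyRuntime.register(proxy, "onClick", args -> { counter[0]++; return null; });
        proxy.onClick("view-1");
        proxy.onClick("view-2");
        return counter[0];
    }
}
```

In this sketch the captured counter is reachable only through the registered function, never through the proxy object itself. That is the gap the rest of this post is about.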

The ovals diagram I used in my previous post visualizes the missing object reference to the closed-over object. So, now we have an understanding of what happens in the NativeScript runtime for Android. The current NativeScript, version 3.3 at the time of writing, provides a mechanism to “compensate” for the missing object references. To put it simply, for each JavaScript closure accessible from Java we traverse all reachable Java objects in order to keep them alive until the closure becomes unreachable from Java. Although we can describe the current solution in a single sentence, that doesn’t mean it has no drawbacks. This solution can be very slow if an object with a large hierarchy, like global, is reachable from some closure. In that case, we will traverse the whole V8 heap on each GC.

Back in 2014, when we hit this issue for the first time, we discussed the option to customize part of the V8 garbage collector in order to provide faster heap traversal. The drawback is a slower upgrade cycle for V8, which means that the JavaScriptCore engine would provide more features at any given point in time. For example, it is not easy to explain to developers why they can use class syntax on iOS but not on Android.

Motivation: we wanted to keep V8 customization to a minimum so we can achieve relative feature parity by upgrading the V8 engine as soon as possible.

So, now we know that traversing the V8 heap can be slow. What else? The current implementation is incomplete and driven case by case. This means that it is updated when important and common memory usage patterns show up. For example, currently we don’t traverse Map and Set objects.
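The traversal idea, and what it means to skip a kind of object, can be sketched in plain Java. This is a toy model under stated assumptions, not the runtime’s code: HeapSketch, Node, Kind and the followMaps switch are all hypothetical names, and the "heap" is just a small hand-built graph.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Toy model of a script heap; all names here are hypothetical.
class HeapSketch {
    enum Kind { PLAIN, JAVA_PROXY, MAP }

    static final class Node {
        final String name;
        final Kind kind;
        final List<Node> refs = new ArrayList<>();
        Node(String name, Kind kind) { this.name = name; this.kind = kind; }
        Node ref(Node other) { refs.add(other); return this; }
    }

    // Walks everything reachable from 'closure' and returns the names of the
    // Java-backed objects found. When followMaps is false, MAP nodes are
    // visited but their contents are not, which models an untraversed kind.
    static List<String> javaObjectsToKeepAlive(Node closure, boolean followMaps) {
        List<String> keep = new ArrayList<>();
        Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<Node> work = new ArrayDeque<>();
        work.push(closure);
        while (!work.isEmpty()) {
            Node n = work.pop();
            if (!seen.add(n)) continue;           // already visited
            if (n.kind == Kind.JAVA_PROXY) keep.add(n.name);
            if (n.kind == Kind.MAP && !followMaps) continue;
            for (Node r : n.refs) work.push(r);
        }
        return keep;
    }

    // A closure that captures a Java object directly ('h') and another one
    // that is reachable only through a map ('o').
    static Node sample() {
        Node o = new Node("o", Kind.JAVA_PROXY);
        Node m = new Node("m", Kind.MAP).ref(o);
        Node h = new Node("h", Kind.JAVA_PROXY);
        return new Node("run", Kind.PLAIN).ref(h).ref(m);
    }
}
```

With followMaps set to false the object named o is never reported, so nothing would keep its Java counterpart alive. The cost side is visible too: the loop touches every node reachable from the closure, so a closure that can reach a very large object graph makes each pass expensive.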

Let’s see what can happen in practice. Create a default app.

tns create app1

Run the app and make sure it works as expected.

Now, we have to design a user scenario in which the runtime will crash. We know that the current implementation doesn’t traverse Map and Set objects. So, we have to create a Java object which is reachable only through, let’s say, a Map object. This is only the first part of our exercise. We must also make sure the Map is reachable through a closure. Finally, we must give the GC a chance to collect the Java object before we use it. So, let’s code it.

function crash() {
    var m = new Map();
    m.set('o', new java.lang.Object() /* via the map only */);
    var h = new android.os.Handler(android.os.Looper.getMainLooper());
    h.post(new java.lang.Runnable({
        run: function() {
            console.log(m.get('o').hashCode());
        }
    }));
}

That’s all. Finally, we have to integrate crash into our application. We can do so by modifying the onTap handler in [proj_dir]/app/main-view-model.js as follows:

viewModel.onTap = function() {
    crash();
    gc();
    java.lang.Runtime.getRuntime().gc();
    this.counter--;
    this.set("message", getMessage(this.counter));
}

Run the app and click the button. You should get an error screen similar to the following one.

Motivation: we wanted to evolve V8 heap traversal on a case-by-case basis in order to traverse as little as possible.

Understanding this memory usage pattern (create an object, set up its reachability, run the GC, use the object) is a simple but powerful tool. With the current implementation the fix for Map and Set is similar to this one. Also, it is critical for any further changes to realize that, in the current implementation, the missing references to the captured objects are the only reason for this error. This is well documented in the form of unit tests.

So far we discussed the drawbacks of the current implementation. Let’s say a few words about its advantages. First and foremost, it keeps the current memory management model familiar to existing Java and JavaScript developers. This is important in order to attract new developers. If two technologies, X and Y, solve similar problems and offer similar licenses, tools, etc., developers favor the one with the simpler “mental model”. While introducing an alloc/free or try/finally approach is powerful, it does not attract new developers because it sets a higher entry barrier than a less explicit approach. Another advantage, mostly for the platform developers, is that the current approach aligns well with many optimizations that can be applied, for example introducing GC generations in the NativeScript runtime. Also, it allows per-application fine-tuning of existing V8 flags (e.g., gc_interval, incremental_marking, minor_mc). Tweaking V8 flags won’t have a general impact when manual memory management is applied. In my opinion, tuning these flags is yet another way to help the regular Joe shoot himself in the foot, but providing sane defaults and applying adaptive schemes could very possibly be a huge win.

It is important to note that whatever approach is applied, it must be done carefully because of the risk of an OOM exception. Schemes like GC generations should take the memory weight of objects into account. This will make the current approaches, which use time and/or memory pressure heuristics, obsolete. In general, such a generational approach should pay off well.

I hope I shed more light on this challenging problem. Looking forward to seeing how the team is going to approach it. Good luck!

How Android Instant Run Works

Today I installed Android Studio 2.0 Beta 7 and I really enjoyed one of its new features, namely Instant Run. We built a similar feature for NativeScript, named LiveSync, so I was curious how Instant Run works.

The first thing I noticed is instant-run.jar, placed in the <my project>/build/intermediates/incremental-runtime-classes/debug directory. This library provides the Server class in the com.android.tools.fd.runtime package, which is where the most interesting things are. This class is a simple wrapper around android.net.LocalServerSocket, which opens a Unix domain socket named after the application package. And indeed, during a code update you can see messages like this in the logcat window.

com.example.testapp I/InstantRun: Received connection from IDE: spawning connection thread

In short, when you change your code the IDE generates a new *.dex file which is uploaded to the files/instant-run/dex-temp directory, which is private to your application package. It then communicates with the local server, which uses the com.android.tools.fd.runtime.Restarter class. The Restarter class is an interesting one. It uses a technique very similar to the one we use in the NativeScript Companion App to either restart the app or recreate the activities. The latter is a nice feature which the current {N} companion app doesn’t support. I find it a bit risky but it will probably work for most scenarios. I guess we could consider implementing something similar for {N}.

So far we know the basics of the Instant Run feature. Let’s see what these *.dex files are and how they are used. For the purpose of this article I am going to pick a scenario where I change some code inside the Application’s onCreate method. Note that in this scenario Instant Run won’t work, since this method is called only once. I picked this scenario just to show that the feature has some limitations. Nevertheless, the generated code shows clearly how this feature is designed and how it should work in general. Take a look at the following implementation.

package com.example.testapp;
public class MyApplication extends Application {
    private static MyApplication app;
    private String msg;
    public MyApplication() {
        app = this;
    }
    @Override
    public void onCreate() {
        super.onCreate();
        int pid = android.os.Process.myPid();
        msg = "Hello World! PID=" + pid;
    }
    public static String getMessage() {
        return app.msg;
    }
}

Let’s change msg as follows

msg = "Hello New World! PID=" + pid;

Now if I click the Instant Run button, the IDE will generate a new classes.dex file inside the <my project>/build/intermediates/reload-dex/debug directory. This file contains the MyApplication$override class, where we can see the following code.

public static void onCreate(MyApplication $this) {
  Object[] arrayOfObject = new Object[0];
  MyApplication.access$super($this, "onCreate.()V", arrayOfObject);
  AndroidInstantRuntime.setStaticPrivateField($this, MyApplication.class, "app");
  int pid = Process.myPid();
  AndroidInstantRuntime.setPrivateField($this, "Hello New World! PID=" + pid, MyApplication.class, "msg");
}

By now it should be easy to guess how the original onCreate method is rewritten. The rewritten MyApplication.class file is located in the <my project>/build/intermediates/transforms/instantRun/debug/folders/1/5/main/com/example/testapp folder.

public void onCreate() {
  IncrementalChange localIncrementalChange = $change;
  if (localIncrementalChange != null) {
    localIncrementalChange.access$dispatch("onCreate.()V", new Object[] { this });
    return;
  }
  super.onCreate();
    
  app = this;
  int pid = Process.myPid();
  this.msg = ("Hello World! PID=" + pid);
}

As you can see, there is nothing special. During compilation the new gradle-core-2.0.0-beta7.jar library uses classes like com.android.build.gradle.internal.incremental.IncrementalChangeVisitor to instrument the compiled code so it can support the Instant Run feature.
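The dispatch pattern from the two decompiled listings can be reproduced in a few lines of plain Java. This is a hedged miniature, not the Instant Run runtime: only the names $change, IncrementalChange, access$dispatch and the "onCreate.()V" method id come from the output above, while Greeter, GreeterOverride and this setPrivateField helper are hypothetical and written just for the sketch.

```java
import java.lang.reflect.Field;

// Miniature of the patching scheme; a sketch, not the real runtime classes.
interface IncrementalChange {
    Object access$dispatch(String methodId, Object... args);
}

// Plays the role of the instrumented class.
class Greeter {
    static volatile IncrementalChange $change; // null until a patch is installed
    private String msg = "";

    void onCreate() {
        IncrementalChange localIncrementalChange = $change;
        if (localIncrementalChange != null) {
            // A patch exists: run the new body instead of the original one.
            localIncrementalChange.access$dispatch("onCreate.()V", new Object[] { this });
            return;
        }
        msg = "Hello World!";
    }

    String getMessage() { return msg; }
}

// Plays the role of the generated override class.
class GreeterOverride implements IncrementalChange {
    @Override
    public Object access$dispatch(String methodId, Object... args) {
        if ("onCreate.()V".equals(methodId)) {
            // The new body lives outside Greeter, so it needs reflection
            // to write Greeter's private field.
            setPrivateField(args[0], "Hello New World!", Greeter.class, "msg");
            return null;
        }
        throw new IllegalArgumentException("Unknown method: " + methodId);
    }

    static void setPrivateField(Object target, Object value, Class<?> owner, String name) {
        try {
            Field f = owner.getDeclaredField(name);
            f.setAccessible(true);
            f.set(target, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Assigning Greeter.$change = new GreeterOverride() switches every later onCreate call to the new body without reloading the Greeter class. A call that has already completed is not replayed, which matches the limitation of the Application.onCreate scenario described earlier.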

I hope this post sheds some light on how Android Instant Run feature works.

Efficient IO in Android

What could be simpler than a file copy? Well, it turned out that I underestimated such an easy task.

Here is the scenario. During the very first NativeScript for Android application startup the runtime extracts all JavaScript asset files to the internal device storage. The source code is quite simple and it was based on this example.

static final int BUFSIZE = 100000;

private static void copyStreams(InputStream is, FileOutputStream fos) {
    BufferedOutputStream os = null;
    try {
        byte data[] = new byte[BUFSIZE];
        int count;
        os = new BufferedOutputStream(fos, BUFSIZE);
        while ((count = is.read(data, 0, BUFSIZE)) != -1) {
            os.write(data, 0, count);
        }
        os.flush();
    } catch (IOException e) {
        Log.e(LOGTAG, "Exception while copying: " + e);
    } finally {
        try {
            if (os != null) {
                os.close();
            }
        } catch (IOException e2) {
            Log.e(LOGTAG, "Exception while closing the stream: " + e2);
        }
    }
}

It is important to note that in our code the BUFSIZE constant has the value 100000, while in the original example the value is 5192. While this code works as expected, it turns out to be quite slow.

In our scenario we extract around 200 files, and on an LG Nexus 5 device it takes around 5.75 seconds. This is a lot of time. It turned out that most of this time is spent inside the garbage collector.

D/dalvikvm(8611): GC_FOR_ALLOC freed 265K, 2% free 17131K/17436K, paused 8ms, total 8ms
D/dalvikvm(8611): GC_FOR_ALLOC freed 398K, 4% free 16930K/17636K, paused 11ms, total 11ms
D/dalvikvm(8611): GC_FOR_ALLOC freed 197K, 4% free 16930K/17636K, paused 7ms, total 7ms
... around 650 more lines

The first optimization I made was to turn the data variable into a class member.

static final int BUFSIZE = 100000;

static final byte data[] = new byte[BUFSIZE];

private static void copyStreams(InputStream is, FileOutputStream fos) {
   // remove 'data' local variable
}

I thought this would solve the GC problem, but when I ran the application I was greeted with the following familiar log messages.

D/dalvikvm(8408): GC_FOR_ALLOC freed 248K, 2% free 17212K/17496K, paused 7ms, total 8ms
D/dalvikvm(8408): GC_FOR_ALLOC freed 417K, 4% free 17029K/17696K, paused 8ms, total 8ms
D/dalvikvm(8408): GC_FOR_ALLOC freed 199K, 4% free 17029K/17696K, paused 7ms, total 7ms
... around 330 more lines

This time it took around 2.25 seconds to extract the files, and the GC kicked in 330 times instead of 660. It was better, but it wasn’t what I wanted: the GC ran half as often as before, but that was still too much.

The next thing I tried was to set BUFSIZE to 4096 instead of 100000.

static final int BUFSIZE = 4096;

This time it took around 0.85 seconds to extract the assets and the GC kicked in 8 times.

D/dalvikvm(8218): GC_FOR_ALLOC freed 323K, 3% free 17137K/17496K, paused 8ms, total 8ms
D/dalvikvm(8218): GC_FOR_ALLOC freed 673K, 5% free 16947K/17684K, paused 8ms, total 9ms
D/dalvikvm(8218): GC_FOR_ALLOC freed 512K, 5% free 16947K/17684K, paused 8ms, total 9ms
... just 5 more lines

It was a nice improvement, but I thought it should be faster than this. I was still puzzled by this relatively high level of GC activity, so I decided to read the online documentation.

A specialized OutputStream for class for writing content to an (internal) byte array. As bytes are written to this stream, the byte array may be expanded to hold more bytes.

I should have read this before I started. It was a good lesson for me.

Once I knew what happens inside BufferedOutputStream, I decided simply not to use it. I call the write method of FileOutputStream directly, and voilà. The time to extract the assets is around 0.65 seconds and the GC kicks in 4 times at most.
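For reference, here is a minimal sketch of what such a copy loop can look like. It is not the actual NativeScript source: StreamCopy, copy and copyBytes are names made up for this example, and plain OutputStream is used so the snippet runs outside Android. The relevant detail is that the BufferedOutputStream(OutputStream, int) constructor allocates its own internal byte array of the requested size, so wrapping every file’s stream means one extra buffer allocation per file on top of the data array.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper; a sketch of the approach, not the runtime's code.
class StreamCopy {
    static final int BUFSIZE = 4096;

    // One buffer shared by all calls, so copying many files allocates
    // nothing per file. Not thread-safe: assumes a single extracting thread.
    private static final byte[] DATA = new byte[BUFSIZE];

    // Copies 'in' to 'out' without any intermediate BufferedOutputStream
    // and returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out) throws IOException {
        long total = 0;
        int count;
        while ((count = in.read(DATA, 0, BUFSIZE)) != -1) {
            out.write(DATA, 0, count);
            total += count;
        }
        out.flush();
        return total;
    }

    // Usage example on in-memory streams.
    static byte[] copyBytes(byte[] source) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            copy(new ByteArrayInputStream(source), out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

In the asset extraction scenario the second argument would be the FileOutputStream of the target file, and the caller remains responsible for closing both streams.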

Out of curiosity I decided to try to bypass the GC using the libzip C library. It took less than 0.2 seconds to extract the assets. Another option is to use AAssetManager from the NDK, but I haven’t tried it yet. Anyway, it seems that IO processing is one of those areas where unmanaged code outperforms Java.