Mobile – Never Ending Journey

How Android Instant Run Works

Today I installed Android Studio 2.0 Beta 7 and I really enjoyed one of its new features, namely Instant Run. We built a similar feature for NativeScript, named LiveSync, so I was curious how Instant Run works.

The first thing I noticed is instant-run.jar, placed in the <my project>/build/intermediates/incremental-runtime-classes/debug directory. This library provides the Server class in the com.android.tools.fd.runtime package, which is where the most interesting things happen. This class is a simple wrapper around android.net.LocalServerSocket, which opens a Unix domain socket named after the application package. And indeed, during a code update you can see messages like the following in the logcat window.

com.example.testapp I/InstantRun: Received connection from IDE: spawning connection thread

In short, when you change your code the IDE generates a new *.dex file, which is uploaded to the files/instant-run/dex-temp directory, which is private to your application package. The IDE then communicates with the local server, which uses the com.android.tools.fd.runtime.Restarter class. The Restarter class is an interesting one. It uses a technique very similar to the one we use in the NativeScript Companion App to either restart the app or recreate the activities. The latter is a nice feature which the current {N} companion app doesn’t support. I find it a bit risky, but it will probably work for most scenarios. I guess we could consider implementing something similar for {N}.

So far we know the basics of the Instant Run feature. Let’s see what these *.dex files are and how they are used. For the purpose of this article I am going to pick a scenario where I change some code inside the Application’s onCreate method. Note that in this scenario Instant Run won’t work, since this method is called only once. I picked this scenario just to show that the feature has some limitations. Nevertheless, the generated code clearly shows how the feature is designed and how it should work in general. Take a look at the following implementation.

package com.example.testapp;

import android.app.Application;

public class MyApplication extends Application {
    private static MyApplication app;
    private String msg;
    public MyApplication() {
        app = this;
    }
    @Override
    public void onCreate() {
        super.onCreate();
        int pid = android.os.Process.myPid();
        msg = "Hello World! PID=" + pid;
    }
    public static String getMessage() {
        return app.msg;
    }
}

Let’s change msg as follows

msg = "Hello New World! PID=" + pid;

Now if I click the Instant Run button, the IDE generates a new classes.dex file inside the <my project>/build/intermediates/reload-dex/debug directory. This file contains the MyApplication$override class, in which we can see the following code.

public static void onCreate(MyApplication $this) {
  Object[] arrayOfObject = new Object[0];
  MyApplication.access$super($this, "onCreate.()V", arrayOfObject);
  AndroidInstantRuntime.setStaticPrivateField($this, MyApplication.class, "app");
  int pid = Process.myPid();
  AndroidInstantRuntime.setPrivateField($this, "Hello New World! PID=" + pid, MyApplication.class, "msg");
}

By now it should be easy to guess how the original onCreate method is rewritten. The rewritten MyApplication.class file is located in the <my project>/build/intermediates/transforms/instantRun/debug/folders/1/5/main/com/example/testapp folder.

public void onCreate() {
  IncrementalChange localIncrementalChange = $change;
  if (localIncrementalChange != null) {
    localIncrementalChange.access$dispatch("onCreate.()V", new Object[] { this });
    return;
  }
  super.onCreate();
    
  app = this;
  int pid = Process.myPid();
  this.msg = ("Hello World! PID=" + pid);
}

As you can see, there is nothing special. During compilation the new gradle-core-2.0.0-beta7.jar library uses classes such as com.android.build.gradle.internal.incremental.IncrementalChangeVisitor to instrument the compiled code so that it can support Instant Run.
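The dispatch pattern in the rewritten onCreate can be imitated in a few lines of plain JavaScript. This is only a simulation for illustration: the real mechanism is Java bytecode instrumentation, and every name below (FAKE_PID, accessDispatch, patchedMyApplication) is a hypothetical stand-in that mirrors the decompiled output rather than an actual Instant Run API.

```javascript
var FAKE_PID = 42; // assumption: a fixed value instead of Process.myPid()

function MyApplication() {
    this.msg = null;
}

// Slot that stays null until a patched class is delivered.
MyApplication.$change = null;

MyApplication.prototype.onCreate = function () {
    var change = MyApplication.$change;
    if (change !== null) {
        // A patch is installed: forward the call and pass `this` explicitly.
        change.accessDispatch("onCreate.()V", [this]);
        return;
    }
    // Original body, used as long as no patch is present.
    this.msg = "Hello World! PID=" + FAKE_PID;
};

// Stand-in for the generated MyApplication$override class.
var patchedMyApplication = {
    accessDispatch: function (signature, args) {
        if (signature === "onCreate.()V") {
            args[0].msg = "Hello New World! PID=" + FAKE_PID;
            return undefined;
        }
        throw new Error("No patched method for " + signature);
    }
};

var app = new MyApplication();
app.onCreate();
var before = app.msg;

MyApplication.$change = patchedMyApplication; // roughly what a hot swap does
app.onCreate();
var after = app.msg;

console.log(before + " -> " + after);
```

The point of the sketch is that the patched behavior only becomes visible when the method is called again, which is why a method that runs once, such as Application.onCreate, is a poor fit for this kind of update.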

I hope this post sheds some light on how the Android Instant Run feature works.

What’s New in Chakra JavaScript Engine

A few weeks ago I decided to install Windows 10 Mobile Insider Preview on my Nokia Lumia 630 and played with it a little. I had completely forgotten about it until yesterday, when I saw a notification for a pending software update (10.0.12562.84). So I grabbed the opportunity to see what has changed in the Chakra JavaScript engine and the JsRT API.

Overview

I am going to highlight only some of the architectural changes in Chakra; for more information you can read the MSDN documentation. Firstly, Microsoft decided to create a new chakra.dll library and keep the old jscript9.dll library for compatibility reasons. This is a good decision because it allows shorter release cycles and provides some space for experimentation as well. Secondly, it seems that Microsoft is all into performance optimizations right now. Some of the most important optimizations are:

concurrent JIT compiler
new simple JIT compiler (when bailout happens)
improved polymorphic inline cache
equivalent object type specialization
bounds checking elimination (array optimization)
minified code optimization (sounds interesting and very promising)
concurrent mark-and-sweep GC (mark phase)

Lastly, with ECMAScript 6 on the way, Microsoft decided to provide better support for it, which is a big win for everybody.

JsRT

This is where it becomes interesting. As I work on the NativeScript project, I would like to access WinRT APIs from JavaScript. In fact, Microsoft already supports this scenario in WinJS, but I am interested in accessing all WinRT APIs and being able to build XAML-based UI from JavaScript. Last September I blogged about how to embed Chakra in Windows Phone 8.1, but back then this scenario was practically unsupported by Microsoft. There wasn’t even a jscript9.lib import library for ARM.

I am happy to say that those days are gone. Now, JsRT provides better support for WinRT projections. This is done through the following APIs:

JsProjectWinRTNamespace
JsInspectableToObject
JsObjectToInspectable

Let’s see how this works (I assume you have already installed Windows 10 Technical Preview and Visual Studio 2015 RC). Create a new WinRT library project (Visual C++ -> Windows -> Windows Universal -> Windows Runtime Component). In my case I named it WindowsRuntimeComponent1 and created a simple Greeter class as follows.

namespace WindowsRuntimeComponent1
{
    public ref class Greeter sealed
    {
    public:
        Platform::String^ SayHello()
        {
            return ref new Platform::String(L"Hello");
        }
    };
}

Create an empty app (Visual C++ -> Windows -> Windows Universal -> Blank App) and add a reference to the WindowsRuntimeComponent1 project. You have to define the USE_EDGEMODE_JSRT macro in order to use the new JsRT API, and link against chakrart.lib as well. Projecting WinRT classes is as easy as follows.

JsErrorCode err = JsProjectWinRTNamespace(L"WindowsRuntimeComponent1");
assert(JsNoError == err);

Now we are ready to consume the projected WinRT classes from JavaScript.

var g = new WindowsRuntimeComponent1.Greeter();
var s = g.sayHello();

I have to say that the debugging experience is almost perfect. I say “almost” only because I don’t see script debugging for ARM devices. I guess that since this is Visual Studio 2015 RC, it is kind of expected. Also, you can always use the script debugger on the Windows Phone emulator, since it runs x86 code.

You can find the sample project on GitHub.

Conclusion

Using the new JsRT together with the Windows 10 Universal Application Platform (UAP) makes it easy to write apps that use JavaScript scripting. The good thing is that UAP guarantees that your apps will work across all kinds of devices. There are some important limitations though:

cannot use XAML types (I guess it is still related to WebHostHidden attribute)
cannot extend types from JavaScript (again related to XAML)
cannot access Chakra in WinJS apps from WinRT components

I guess if you don’t want to build JavaScript/native bridges then the new JsRT is good enough. Resolving the above-mentioned issues will allow writing much more sophisticated apps though. Right now, you can use JsRT for simple scripting and nothing else. Making Chakra engine an open-source project will solve these and other issues. It will allow people to contribute to and customize the engine. Will it ever happen? Only time will tell.

NativeScript Performance – Part 2

For the last two weeks I was busy measuring and optimizing the performance of NativeScript for Android. My main focus was the application startup time, and I would like to share some good news.

Results

Let’s first see the results and then I will dig into the details. As in the previous tests, I used the same test devices:

Device1 – Nexus 5, Android 4.4.1, build KOT49E
Device2 – Nexus 6, Android 5.0.1, build LRX22C

I used the same application as well. Here are the results:

For Device1 the first startup time was reduced from average 3.1419 seconds to average 2.8262 seconds (10% improvement) [*]
For Device2 the first startup time was reduced from average 3.541 seconds to average 3.3147 seconds (6% improvement) [*]
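The percentages above follow directly from the averages. As a quick check, here is a small sketch that recomputes them; the function name improvementPercent is my own and not part of any tool used in the measurements.

```javascript
// Relative improvement of the new time over the old one, in percent.
function improvementPercent(beforeSeconds, afterSeconds) {
    return (beforeSeconds - afterSeconds) / beforeSeconds * 100;
}

var device1Improvement = improvementPercent(3.1419, 2.8262);
var device2Improvement = improvementPercent(3.541, 3.3147);

console.log("Device1: " + device1Improvement.toFixed(1) + "%"); // Device1: 10.0%
console.log("Device2: " + device2Improvement.toFixed(1) + "%"); // Device2: 6.4%
```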

Details

Before I dig into the details, I would like to give you a quick reminder how I measured the times. As in the previous tests I used the built-in time/perf info that Android ActivityManager provides. It is not the best measuring tool but it is good enough for our purposes.

After detailed profiling with DDMS and NDK profilers I identified two areas for improvements:

asset extraction
proxy property access

Assets

The old implementation for asset extraction was based on AssetManager. While its API is very convenient, it is not well suited for optimal memory allocation. As a result, using AssetManager along with java.io.* classes generates a lot of temporary objects, which triggers the GC quite often. The solution we chose is to use the libzip C library. It is fast and, more importantly, it doesn’t mess with the GC.

For applications similar in size to the test app, using libzip doesn’t help much. The actual improvement is around 30-40 milliseconds. However, for big apps (e.g. 500+ files) libzip really shines. You can easily get an improvement of 300-500ms, and in some scenarios more than a second. This was a good reason to reimplement the Java code in C++ and give NativeScript the ability to scale really well.

Java Object Wrappers

Proxies are an experimental ECMAScript 6 feature. In V8 (and, as a matter of fact, in any other JavaScript engine), direct property access is much faster than access through a proxy. This is easy to understand when you think about how the JIT compiler emits code to access traditional properties. Also, while proxies are good for scripting simple object access, they don’t scale to more complex scenarios. Over time it becomes harder to implement the correct dispatch logic.

I am glad to say that we now use plain JavaScript objects to wrap Java objects. We also build the correct prototype chain to map the Java class hierarchy. This gives us an excellent opportunity to cache runtime objects at a more granular level. And as we are going to see, caching changes everything.
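To make the idea more concrete, here is a minimal sketch of wrappers that share cached prototypes mirroring a Java class hierarchy. It is not the actual NativeScript runtime code: the javaClasses table, callJava and wrap are hypothetical names, and callJava merely returns a string where a real bridge would make a JNI call.

```javascript
// Hypothetical metadata for a tiny Java class hierarchy.
var javaClasses = {
    "java.lang.Object": { parent: null, methods: ["toString", "hashCode"] },
    "java.util.Date": { parent: "java.lang.Object", methods: ["compareTo", "getTime"] }
};

var prototypeCache = {}; // class name -> prototype object, built only once
var buildCount = 0;      // how many prototypes were actually constructed

// Stand-in for the native call a real bridge would perform.
function callJava(className, methodName, target, args) {
    return className + "." + methodName;
}

function getPrototype(className) {
    if (prototypeCache[className]) {
        return prototypeCache[className];
    }
    var info = javaClasses[className];
    // The parent prototype is resolved first, so the chain maps the Java hierarchy.
    var parentProto = info.parent ? getPrototype(info.parent) : Object.prototype;
    var proto = Object.create(parentProto);
    info.methods.forEach(function (name) {
        proto[name] = function () {
            return callJava(className, name, this, Array.prototype.slice.call(arguments));
        };
    });
    buildCount++;
    prototypeCache[className] = proto;
    return proto;
}

// A wrapper is a plain object: one handle plus a shared, cached prototype.
function wrap(className, javaHandle) {
    var wrapper = Object.create(getPrototype(className));
    wrapper.javaHandle = javaHandle;
    return wrapper;
}

var d1 = wrap("java.util.Date", 1);
var d2 = wrap("java.util.Date", 2);
console.log(d1.compareTo(d2)); // java.util.Date.compareTo
console.log(d1.hashCode());    // java.lang.Object.hashCode
console.log(buildCount);       // 2
```

Creating the second wrapper builds nothing new, and the inherited hashCode call is resolved through the ordinary prototype lookup that the engine already optimizes well.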

While using libzip helped a little bit, it is easy to do the math and see that using prototype chains is the main factor for the improved startup time.

Let’s see how the new caches impact other scenarios. Take a look at the following code fragment.

var JavaDate = java.util.Date;
var start = new Date();
for (var i=0; i<10000; i++) {
    var d1 = new JavaDate();
    var d2 = new JavaDate();
    d1.compareTo(d2);
    d2.compareTo(d1);
}
var end = new Date();
console.log("time=" + (end.getTime() - start.getTime()));

This is not a real-world scenario. I wrote this code solely for testing purposes. My intent here is to exercise some Java-intensive code. Also, note that using JavaScript Date.getTime is not the best way to measure time, but as we are going to see it is good enough for our purposes.

Here are the results.

On Device1 – using proxy objects it takes more than 12.5 seconds, using prototype chain it takes less than 2.6 seconds
On Device2 – using proxy objects it takes more than 11.6 seconds, using prototype chain it takes less than 2.2 seconds

In my opinion, there is no need for any further or more precise benchmarks. Simply put, using prototype chains along with proper caching is much faster than proxy objects.

Further Improvements

So far, we saw that the first startup of a simple application like CutenessIO takes around 3 seconds. Can we make it faster?

First, we have to set some reasonable expectations. Let’s see how fast HelloWorld applications written in Java and NativeScript start up. For the Java version I used the standard Eclipse project template (which is very similar to the one in Android Studio). I stripped out all things like menus and fancy themes. My main goal was to make it as simple as possible (which is not much different from the standard empty project). I did the same for the NativeScript project.

Here are the results.

On Device1 – Java 200 milliseconds[*], NativeScript 641.5 milliseconds[*]
On Device2 – Java 333.5 milliseconds[*], NativeScript 875.3 milliseconds[*]

So, we have to investigate where the difference comes from. For the purpose of this article, I am going to pick Device1 (the analysis for Device2 is the same).

Let’s analyze a particular run.

Time for loading libNativeScript library: 7ms
Time for extracting assets: 30ms
Time for V8 initialization: 150ms
Time for calling Application.onCreate in JavaScript: 60ms
Time for calling Activity.onCreate in JavaScript: 100ms
Time from Application object initialization to Activity initialization: 510ms
Time to display main activity: 658ms

As we can see, the total time for asset extraction and V8 initialization is 180ms, which is roughly the time needed for a pure Java application to start. So far, it seems unlikely that we can reduce this time.

The total time spent running JavaScript is 160ms (60ms in Application.onCreate plus 100ms in Activity.onCreate). This is a bit surprising. I would love to see the time spent in V8 be, say, 400ms, because this would mean that running JavaScript is 78% (400/510) of all time. A high percentage of time spent inside V8 is a good thing because it would give us an opportunity to optimize the performance. However, this will not be the case for most applications. We can think of NativeScript as a way to command the Java world from JavaScript. Hence, most of the work is done in Java. That’s the nature of NativeScript.
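The arithmetic behind these figures is easy to reproduce from the run listed above. The sketch below only restates the measured numbers; the property names are my own labels.

```javascript
// Timings (in milliseconds) from the Device1 run analyzed above.
var run = {
    extractAssets: 30,
    initV8: 150,
    applicationOnCreate: 60,
    activityOnCreate: 100,
    applicationToActivity: 510
};

var setupTime = run.extractAssets + run.initV8;                  // 180
var javaScriptTime = run.applicationOnCreate + run.activityOnCreate; // 160

// Share of the Application-to-Activity window actually spent in JavaScript.
var actualShare = javaScriptTime / run.applicationToActivity * 100;
// Share in the wished-for case where V8 accounts for 400ms.
var wishedShare = 400 / run.applicationToActivity * 100;

console.log("setup: " + setupTime + "ms, JavaScript: " + javaScriptTime + "ms");
console.log("actual share: " + actualShare.toFixed(0) + "%");  // actual share: 31%
console.log("wished share: " + wishedShare.toFixed(0) + "%");  // wished share: 78%
```

So JavaScript accounts for roughly a third of the 510ms window in this run, well below the hypothetical 78%.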

So, we spent 160ms running a few lines of JavaScript. Can we do better? A careful analysis showed that most of this time is spent in JNI infrastructure calls and data marshalling. It seems hard to reduce, but not impossible. A possible option is to tweak the V8 engine and/or use libffi to generate thunks.

Another 200ms is spent in some run-once plumbing code. With a little effort, we could refactor the runtime to support components/modules and gain some performance. Finally, some time is spent inside the Java GC.

In closing, I would say that currently NativeScript for Android is performing well. There are no major performance issues. The current implementation is approaching the point where no big performance wins can be easily achieved. But easy is not interesting 😉 Stay tuned.

On NativeScript Performance

Overview

Last week NativeScript made it into public beta, and in just a few days we got a tremendous amount of feedback. One question that came up over and over again was, “How do NativeScript apps perform?” In this post, I want to explain the details behind performance and share some great news with you about the upcoming release of NativeScript.

How it started

Like other new projects, NativeScript started from the idea of taking a new look at cross-platform mobile development with JavaScript. In the beginning, we had to determine if the concept of NativeScript was even feasible. Should we translate JavaScript into Java? What about Objective-C back into JavaScript? During this exploratory phase, we learned that the answer was actually much simpler than this thanks to the JavaScript bridge that exists for both iOS and Android. Well, thanks to Android fragmentation, this is only partially true. Let me explain…

Challenges

Working on a project like NativeScript is anything but easy. There are many challenges imposed by working with two very different runtimes like Dalvik and V8. Add the restricted environment in Android and you will get the idea. Controlling object lifetime when you have two garbage collectors, efficient type marshalling, the lack of 64-bit integers in JavaScript, correctly working with different UTF-8 encodings, and overloaded method resolution are just a few examples. All of these are nontrivial problems.

Statically Generated Bindings

One specific problem is extending/subclassing Java types from JavaScript. It is astonishing how a simple task like working with a UI widget becomes a challenging technical problem. You need look no further than the Button documentation and its seemingly innocent example.

button.setOnClickListener(new View.OnClickListener() {
    public void onClick(View v) {
        // Perform action on click
    }
});

While the Java compiler is there to generate an anonymous class that implements the View.OnClickListener interface, there is no such facility in JavaScript. We solved this problem by generating proxy classes (bindings). Basically, we generated *.java source files and compiled them to *.class files, which in turn were compiled to *.dex files. You can find these *.dex files in the assets/bindings folder of every NativeScript for Android project. The total size of these files is more than 12MB, which is quite a lot.

Here begins the interesting part. Android 5 comes with a new runtime (ART). One of the major changes in ART is the ahead-of-time (AOT) compiler. Now you can imagine what happens when the AOT compiler has to compile more than 12MB of *.dex files on the very first run of any NativeScript for Android application. That’s right, it takes a long time. The problem is less apparent in Android 4.x but it is still there.

Dynamically Generated Bindings

The solution is obvious. We simply need to generate bindings at runtime instead of at compile time. The immediate advantage is that we generate bindings only for those classes that we actually extend in JavaScript. The fewer the bindings, the less work for the AOT compiler.
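The on-demand idea can be sketched as a lazily filled cache. This is only an illustration of the principle, not the real binding generator: generateBinding, extend and the naming scheme of the proxy classes are hypothetical.

```javascript
var bindingCache = Object.create(null); // Java class name -> generated binding
var generatedCount = 0;

// Hypothetical stand-in for emitting and loading a proxy class at runtime.
function generateBinding(javaName) {
    generatedCount++;
    return {
        javaName: javaName,
        proxyName: javaName.replace(/[.$]/g, "_") + "_Proxy"
    };
}

// A binding is generated the first time a class is extended and reused afterwards.
function extend(javaName, overrides) {
    if (!(javaName in bindingCache)) {
        bindingCache[javaName] = generateBinding(javaName);
    }
    return { binding: bindingCache[javaName], overrides: overrides };
}

var listenerA = extend("android.view.View$OnClickListener", { onClick: function () {} });
var listenerB = extend("android.view.View$OnClickListener", { onClick: function () {} });

console.log(listenerA.binding.proxyName); // android_view_View_OnClickListener_Proxy
console.log(generatedCount);              // 1
```

Classes that are never extended from JavaScript never reach generateBinding, so nothing is produced for them at all.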

We started working on the new binding generator right after the first private beta. We were almost done in time for the public beta. However, almost doesn’t count. We decided to play it safe and release the first beta with statically generated bindings. The good news is that the new binding generator is already merged in the master branch (only two days after the public beta announcement).

Today I ran some basic performance tests on the following devices:

Device1 – Nexus 5, Android 4.4.1, build KOT49E
Device2 – Nexus 6, Android 5.0.1, build LRX22C

For the tests I used the built-in time/perf info that Android OS provides. You probably have seen similar information in your logcat console.

I/ActivityManager(770): START u0 {act=android.intent.action.MAIN cat=[android.intent.category.LAUNCHER] flg=0x10200000 cmp=com.tns/.NativeScriptActivity} from pid 1030
....
I/ActivityManager(770): Displayed com.tns/.NativeScriptActivity: +3s614ms
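If you want to collect these numbers from several runs, the Displayed line is easy to parse. Here is a minimal sketch; parseDisplayedTime is my own helper, and the assumption that the duration consists of optional minute and second parts followed by milliseconds is based on the log line above rather than on a documented format.

```javascript
// Extracts the component name and the elapsed time in milliseconds
// from an ActivityManager "Displayed" line; returns null for other lines.
function parseDisplayedTime(line) {
    var match = /Displayed\s+(\S+):\s+\+(?:(\d+)m)?(?:(\d+)s)?(\d+)ms/.exec(line);
    if (!match) {
        return null;
    }
    var minutes = match[2] ? parseInt(match[2], 10) : 0;
    var seconds = match[3] ? parseInt(match[3], 10) : 0;
    var millis = parseInt(match[4], 10);
    return {
        component: match[1],
        millis: (minutes * 60 + seconds) * 1000 + millis
    };
}

var displayedLine = "I/ActivityManager(770): Displayed com.tns/.NativeScriptActivity: +3s614ms";
var parsed = parseDisplayedTime(displayedLine);
console.log(parsed.component + " took " + parsed.millis + "ms");
// com.tns/.NativeScriptActivity took 3614ms
```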

Here are the results:

For Device1 the first start-up time was reduced from average 60.761 seconds to average 3.1419 seconds
For Device2 the first start-up time was reduced from average 39.384 seconds to average 3.541 seconds

Subsequent start-up time for both devices is ~2.5 seconds or less.

What’s next

There is a lot of room for performance improvement. Currently NativeScript for Android uses JavaScript proxy objects to get a callback when a Java field is accessed or a Java method is invoked. The problem is that proxy objects (interceptors) are not fast. We plan to replace them with plain JavaScript objects that have a properly constructed prototype chain with accessors instead of interceptors. Another benefit of using prototype chains with accessors is that we will support the JavaScript instanceof operator.
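The difference between the two approaches, including the instanceof benefit, can be shown with a small sketch. It is illustrative only: readJavaField stands in for a JNI field read, JavaObject and JavaView are hypothetical wrapper constructors, and the standard Proxy object is used here as an approximation of the interceptors the runtime currently relies on.

```javascript
// Stand-in for reading a Java field through JNI.
function readJavaField(handle, fieldName) {
    return fieldName + "@" + handle;
}

// Constructor functions mirror the Java hierarchy, so instanceof works.
function JavaObject(handle) {
    this.handle = handle;
}
function JavaView(handle) {
    JavaObject.call(this, handle);
}
JavaView.prototype = Object.create(JavaObject.prototype);
JavaView.prototype.constructor = JavaView;

// The accessor is defined once on the prototype and shared by all wrappers.
Object.defineProperty(JavaView.prototype, "id", {
    get: function () { return readJavaField(this.handle, "id"); },
    enumerable: true
});

// Interceptor-style wrapper: every property lookup goes through the trap.
function proxyWrap(handle) {
    return new Proxy({}, {
        get: function (target, name) { return readJavaField(handle, String(name)); }
    });
}

var view = new JavaView(7);
var proxied = proxyWrap(7);

console.log(view.id);                     // id@7
console.log(proxied.id);                  // id@7
console.log(view instanceof JavaView);    // true
console.log(view instanceof JavaObject);  // true
console.log(proxied instanceof JavaView); // false
```

Both wrappers return the same field value, but only the prototype-based one takes part in the prototype chain, which is what makes instanceof meaningful.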

Another area for improvement is memory management. Currently, we generate a lot of temporary Java objects, which may trigger the Java GC unnecessarily often. Moving some parts of the runtime from Java to C++ is a viable option that we are going to explore.

Conclusion

In closing, I would like to say that we are astounded by how popular NativeScript has become in such a short amount of time. We have learned so much in building the NativeScript runtime, and our experience in that process helps us improve NativeScript every single day. We’re looking forward to version 1. Building truly native mobile applications with native performance using JavaScript is the future, and the future is now.

First Impressions using Windows 10 Technical Preview for phones

Windows 10 Technical Preview for phones was released two days ago and today I decided to give it a try. The installation process on my Nokia 630 was very smooth and completed in about 30 minutes, including the migration of the old data. Finally, I ended up with WP10 OS version 9941.12498.

After I used WP10 for about 6 hours I can say that this build is quite stable. The UI and all animations are very responsive. So far I didn’t experience any crashes. The only glitch I found is that the brightness setting is not preserved after restart and it is set automatically to HIGH. All my data including photos, music and documents were preserved during the upgrade.

There are many productivity improvements in WP10. Action Center and the Settings menu are much better organized. It seems that IE can render some sites better than before, though I am not sure if it is the new IE rendering engine or just that the sites’ HTML has been optimized.

I checked to see whether there are changes in the Chakra JavaScript engine, but it seems the list of exported JsRT functions is the same as before. The actual version of jscript9.dll is 11.0.9941.0 (fbl_awesome1501.150206-2235).

I tested all of my previously installed apps (around 60) and they all work great. The perceived performance is the same, except for Lumia Panorama which I find slower and Minecraft PE which I find faster.

There are many new things for the developers as well. I guess one of the most interesting changes in WP10 is the improved speech support API. Using the speech API is really simple.

using Windows.Phone.Speech.Synthesis;

var s = new SpeechSynthesizer();
s.SpeakTextAsync("Hello world");

WP10 comes with two predefined voice profiles.

using Windows.Phone.Speech.Synthesis;

foreach (var vi in InstalledVoices.All)
{
    var si = new SpeechSynthesizer();
    si.SetVoice(vi);
    await si.SpeakTextAsync(vi.Description);
}

The actual values of vi.Description are as follows.

Microsoft Zira Mobile - English (United States)
Microsoft Mark Mobile - English (United States)

You can hear how Zira and Mark actually sound below.

I find Mark’s voice a little bit more realistic.

This is all I got for today. In closing I would say it seems that WP10 has much more to offer. Stay tuned.