Efficient IO in Android

What could be simpler than a file copy? Well, it turned out that I underestimated such an easy task.

Here is the scenario. During the very first NativeScript for Android application startup the runtime extracts all JavaScript asset files to the internal device storage. The source code is quite simple and it was based on this example.

static final int BUFSIZE = 100000;

private static void copyStreams(InputStream is, FileOutputStream fos) {
    BufferedOutputStream os = null;
    try {
        byte data[] = new byte[BUFSIZE];
        int count;
        os = new BufferedOutputStream(fos, BUFSIZE);
        while ((count = is.read(data, 0, BUFSIZE)) != -1) {
            os.write(data, 0, count);
        }
        os.flush();
    } catch (IOException e) {
        Log.e(LOGTAG, "Exception while copying: " + e);
    } finally {
        try {
            if (os != null) {
                os.close();
            }
        } catch (IOException e2) {
            Log.e(LOGTAG, "Exception while closing the stream: " + e2);
        }
    }
}

It is important to note that in our code the BUFSIZE constant has the value 100000, while in the original example the value is 5192. While this code works as expected, it turns out to be quite slow.

In our scenario we extract around 200 files and on LG Nexus 5 device it takes around 5.75 seconds. This is a lot of time. It turned out that most of this time is spent inside the garbage collector.

D/dalvikvm(8611): GC_FOR_ALLOC freed 265K, 2% free 17131K/17436K, paused 8ms, total 8ms
D/dalvikvm(8611): GC_FOR_ALLOC freed 398K, 4% free 16930K/17636K, paused 11ms, total 11ms
D/dalvikvm(8611): GC_FOR_ALLOC freed 197K, 4% free 16930K/17636K, paused 7ms, total 7ms
... around 650 more lines

The first thing I optimized was to make data variable a class member.

static final int BUFSIZE = 100000;

static final byte data[] = new byte[BUFSIZE];

private static void copyStreams(InputStream is, FileOutputStream fos) {
   // remove 'data' local variable
}

I thought this would solve the GC problem, but when I ran the application I was greeted with the following familiar log messages.

D/dalvikvm(8408): GC_FOR_ALLOC freed 248K, 2% free 17212K/17496K, paused 7ms, total 8ms
D/dalvikvm(8408): GC_FOR_ALLOC freed 417K, 4% free 17029K/17696K, paused 8ms, total 8ms
D/dalvikvm(8408): GC_FOR_ALLOC freed 199K, 4% free 17029K/17696K, paused 7ms, total 7ms
... around 330 more lines

This time it took around 2.25 seconds to extract the files, and the GC kicked in about 330 times instead of about 660. That is half as often as before, which was better, but it was still too much and it wasn’t what I wanted.

The next thing I tried was to set BUFSIZE to 4096 instead of 100000.

static final int BUFSIZE = 4096;

This time it took around 0.85 seconds to extract the assets and the GC kicked in 8 times.

D/dalvikvm(8218): GC_FOR_ALLOC freed 323K, 3% free 17137K/17496K, paused 8ms, total 8ms
D/dalvikvm(8218): GC_FOR_ALLOC freed 673K, 5% free 16947K/17684K, paused 8ms, total 9ms
D/dalvikvm(8218): GC_FOR_ALLOC freed 512K, 5% free 16947K/17684K, paused 8ms, total 9ms
... just 5 more lines

It was a nice improvement but I thought it should be faster than this. I was still puzzled with this relatively high level of GC activity so I decided to read the online documentation.

A specialized OutputStream for class for writing content to an (internal) byte array. As bytes are written to this stream, the byte array may be expanded to hold more bytes.

I should have read this before I started. It was a good lesson for me.

Once I knew what happens inside BufferedOutputStream, I decided simply not to use it. I now call the write method of FileOutputStream directly, and voilà: the time to extract the assets is around 0.65 seconds and the GC kicks in 4 times at most.
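A minimal sketch of what that final version might look like, assuming the shared buffer from the earlier step and the 4096-byte BUFSIZE. The class name StreamCopier and the returned byte count are illustrative additions, and the parameter is widened to OutputStream only so the sketch can be exercised without a real file. Note that new BufferedOutputStream(fos, BUFSIZE) allocates its own BUFSIZE-byte array on every call, so dropping the wrapper removes one buffer allocation per extracted file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class StreamCopier {
    static final int BUFSIZE = 4096;

    // Allocated once and reused for every file, so the copy loop itself
    // creates no garbage. Access is synchronized because it is shared.
    private static final byte[] DATA = new byte[BUFSIZE];

    // Copies everything from 'is' to 'os' without an intermediate
    // BufferedOutputStream. Returns the number of bytes copied.
    static long copyStreams(InputStream is, OutputStream os) throws IOException {
        long total = 0;
        synchronized (DATA) {
            int count;
            while ((count = is.read(DATA, 0, BUFSIZE)) != -1) {
                os.write(DATA, 0, count);
                total += count;
            }
        }
        os.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // In-memory streams stand in for the asset and the target file.
        byte[] asset = new byte[10000];
        ByteArrayOutputStream target = new ByteArrayOutputStream();
        long copied = copyStreams(new ByteArrayInputStream(asset), target);
        System.out.println("copied " + copied + " bytes");
    }
}
```

In the real extraction code the second argument would be the FileOutputStream for the target file, closed by the caller in a finally block as before.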

Out of curiosity I decided to try to bypass the GC using the libzip C library. It took less than 0.2 seconds to extract the assets. Another option is to use the AAssetManager API from the NDK, but I haven’t tried it yet. Anyway, it seems that IO processing is one of those areas where unmanaged code outperforms Java.

First Impressions using Windows 10 Technical Preview for phones

Windows 10 Technical Preview for phones was released two days ago and today I decided to give it a try. The installation process on my Nokia 630 was very smooth and completed in about 30 minutes, including the migration of the old data. Finally, I ended up with WP10 OS version 9941.12498.

After using WP10 for about 6 hours I can say that this build is quite stable. The UI and all animations are very responsive. So far I haven’t experienced any crashes. The only glitch I found is that the brightness setting is not preserved after a restart and is set automatically to HIGH. All my data including photos, music and documents were preserved during the upgrade.

There are many productivity improvements in WP10. Action Center and the Settings menu are much better organized. It seems that IE can render some sites better than before, though I am not sure whether it is the new IE rendering engine or just that the sites’ HTML has been optimized.

I checked to see whether there are changes in the Chakra JavaScript engine but it seems the list of exported JsRT functions is the same as before. The actual version of jscript9.dll is 11.0.9941.0 (fbl_awesome1501.150206-2235).

I tested all of my previously installed apps (around 60) and they all work great. The perceived performance is the same, except for Lumia Panorama which I find slower and Minecraft PE which I find faster.

There are many new things for the developers as well. I guess one of the most interesting changes in WP10 is the improved speech support API. Using the speech API is really simple.

using Windows.Phone.Speech.Synthesis;

var s = new SpeechSynthesizer();
s.SpeakTextAsync("Hello world");

WP10 comes with two predefined voice profiles.

using Windows.Phone.Speech.Synthesis;

foreach (var vi in InstalledVoices.All)
{
    var si = new SpeechSynthesizer();
    si.SetVoice(vi);
    await si.SpeakTextAsync(vi.Description);
}

The actual values of vi.Description are as follows.

Microsoft Zira Mobile - English (United States)
Microsoft Mark Mobile - English (United States)

You can hear how Zira and Mark actually sound below.

I find Mark’s voice a little bit more realistic.

This is all I got for today. In closing I would say it seems that WP10 has much more to offer. Stay tuned.

Object Oriented Programming: An Evolutionary Approach

This post is not about the book Object Oriented Programming: An Evolutionary Approach by Brad Cox. I decided to use the book’s title because the author nailed the connection between software and evolution. It is a good book, by the way. I recommend it.

Last week a coworker sent me a link to the React.js Conf 2015 Keynote video in which they introduced React Native. Because I work on NativeScript, I was curious to see how Facebook solves problems similar to ours. So, I finally got some time and watched the video. The presentation is short and probably the most important slide is the following one.

[Image: React Native slide]

But this blog post is not about React Native. The thing that triggered me to write is something that the speaker said (the transcription is mine).

This is a component. This, we feel, is the proper separation of concerns for applications.

I couldn’t agree more. Components have been around for many years (don’t get me wrong, I am not bringing up one of those everything new is well-forgotten old themes). Yet, components and component-based development are not as widely accepted as I think they should be. It is probably because so many people and companies saw value in software components and started defining and building them the way they thought was right. This process brought all the confusion about what a component really is.

Besides the fact that the term component is quite overloaded, it is important to note that many have tried to (re)define it at different times and in different environments and contexts. Nevertheless, some properties of software components were defined in exactly the same manner during the 70’s, 80’s, 90’s and later. Let’s see what these properties are.

  • binary standard/compatibility
  • separation of interface and implementation
  • language agnostic

These are some of the fundamental properties of any software component. More recent component definitions include properties like:

  • versioning
  • transaction support
  • security
  • etc.

But this blog post is not about software components either. It is about software evolution. We can define software evolution as variation in software over time. This definition is not complete but it is good enough. It can be applied on different levels, whether it is the software industry as a whole or a small application. It is important to say that software evolution is a result of our understanding of software, including exact knowledge, culture and beliefs, at any point in time. We can reuse the analogy of terms like mutation, crossover, hybrid and so on to describe processes in software evolution.

Combining different ideas is one of the primary factors in software evolution. And this is where we need software components. There is a common comparison between software components and LEGO blocks. The analogy of software genes might be another alternative. An application’s DNA is defined by its genes.

Software evolution is not a linear process. Do you remember Twitter’s dance between client-side and server-side rendering? It is a great example of the survival of the fittest principle. So, what will be the next thing in software evolution? I don’t think anybody knows the answer. So far, software components seem to be a practical way to go. Seeing big companies like Facebook emphasize composability is a good sign.

The best way to predict your future is to create it.
– Abraham Lincoln

The Quiet Horror of instanceof Operator

During the last months I was busy with NativeScript more than ever. While my work keeps me busy with embedding V8 JavaScript engine I rarely have the chance to write JavaScript. Recently I had to deal with mapping Java OOP inheritance into JavaScript and more specifically I had to fix a failing JavaScript unit test which uses instanceof operator. So I grabbed the opportunity to dig more into instanceof internals.

It is virtually impossible to talk about the instanceof operator without mentioning the typeof operator first. According to the MDN documentation:

The typeof operator returns a string indicating the type of the unevaluated operand.

As described, the typeof operator does not seem useful. Probably the most interesting thing is the use of the word unevaluated. This allows us to test whether a particular symbol is defined. For example

if (typeof x !== 'undefined')

will execute without ReferenceError even when x is not present.

Let’s look at the instanceof documentation:

The instanceof operator tests whether an object has in its prototype chain the prototype property of a constructor.

After digging into the instanceof operator I was even more puzzled. While the typeof operator has been present since the first edition of ECMAScript, it seems that the language designer(s) didn’t have a clear idea about the instanceof operator. It is mentioned as a reserved keyword in the second edition of ECMAScript and is finally introduced in the third edition. The operator definition is clear but I have trouble finding meaningful uses. Let’s see the following common example.

if (x instanceof Foo) {
   x.bar();
}

I feel uneasy with the assumption that if x has Foo’s prototype somewhere in its prototype chain then it is safe to assume that bar exists. Mixing properties of a nominal type system with JavaScript just doesn’t seem intuitive to me. I guess there are some practical scenarios where the typeof and instanceof operators are useful, but my guess is that their number is limited.

Embedding Chakra JavaScript Engine on Windows Phone

Today I am going to show you how to embed the Chakra JavaScript engine in a Windows Phone 8.1 app. Please note that at the time of writing this app won’t pass Microsoft Windows Store certification requirements. I won’t be surprised though if Microsoft reconsiders their requirements in the future.

Last year Microsoft released JsRT which exposes C-style API for embedding Chakra JavaScript engine. To use the API you only need to include jsrt.h and add a reference to jsrt.lib. On my machine the header file is located at

C:\Program Files (x86)\Windows Kits\8.1\Include\um\jsrt.h

and the lib files (for x86 and x64 accordingly) are located at

C:\Program Files (x86)\Windows Kits\8.1\Lib\winv6.3\um\x86\jsrt.lib
C:\Program Files (x86)\Windows Kits\8.1\Lib\winv6.3\um\x64\jsrt.lib

Curiously, there is no jsrt.lib for ARM architecture. It is even more interesting that JsRT is not exposed in Windows Phone SDK. E.g. you won’t find jsrt.h file in

C:\Program Files (x86)\Windows Phone Kits\8.1\Include

nor will you find jsrt.lib in

C:\Program Files (x86)\Windows Phone Kits\8.1\lib\ARM
C:\Program Files (x86)\Windows Phone Kits\8.1\lib\x86

However, this shouldn’t discourage us. The first thing we should check is that the JsRT API is exposed on Windows Phone 8.1. I know it is there because IE11 shares the same source code for desktop and mobile and because Windows Phone 8.1 supports the WinRT programming model. Anyway, let’s check it.

Find flash.vhd file. On my machine it is located at

C:\Program Files (x86)\Microsoft SDKs\Windows Phone\v8.1\Emulation\Images

Use Disk Management and attach flash.vhd file via Action->Attach VHD menu. Navigate to \Windows\System32 folder on MainOS partition and copy JSCRIPT9.DLL somewhere. Open Visual Studio command prompt and run the following command

dumpbin /exports JSCRIPT9.DLL >jscript9.def

Open the jscript9.def file in your favorite editor and make sure you see the full JsRT API listed there. Edit the file so it becomes like this one https://gist.github.com/anonymous/88e44e8931cc8d118da9. Run the following command from the Visual Studio command prompt

lib /def:jscript9.def /out:jsrt.lib /machine:ARM

This will generate an import library so you can use all exports defined in the JSCRIPT9.DLL library. We are almost ready.

We generated jsrt.lib import library for ARM architecture, what’s next? In order to use jsrt.h header in our Windows Phone 8.1 project we must edit it a little bit. First copy it and its dependencies to your project. Here is the list of all the files you should copy

  • activdbg.h
  • activprof.h
  • ActivScp.h
  • DbgProp.h
  • jsrt.h

In case you don’t want JavaScript debugging support you can copy the jsrt.h file only and replace all pointers to the interfaces from the ActiveScript API with void*. Once you copy the header files you must edit them to switch to the Windows Phone API. To do so, you have to replace WINAPI_PARTITION_DESKTOP with WINAPI_PARTITION_PHONE_APP. It may sound like a lot of work but it is just a change of a few lines. You can see the change here.

That’s it. Now you can use the new header and lib files in your project. You can find the full source code at https://github.com/mslavchev/chakra-wp81.

In closing I would like to remind you that at present this app won’t pass Windows Store certification requirements. Here is the list of the requirement violations

Supported API test (FAILED)
    This API is not supported for this application type - Api=CoGetClassObject. Module=api-ms-win-core-com-l1-1-1.dll. File=ChakraDemoApp.exe.
    This API is not supported for this application type - Api=JsCreateContext. Module=jscript9.dll. File=ChakraDemoApp.exe.
    This API is not supported for this application type - Api=JsCreateRuntime. Module=jscript9.dll. File=ChakraDemoApp.exe.
    This API is not supported for this application type - Api=JsDisposeRuntime. Module=jscript9.dll. File=ChakraDemoApp.exe.
    This API is not supported for this application type - Api=JsRunScript. Module=jscript9.dll. File=ChakraDemoApp.exe.
    This API is not supported for this application type - Api=JsSetCurrentContext. Module=jscript9.dll. File=ChakraDemoApp.exe.
    This API is not supported for this application type - Api=JsStartDebugging. Module=jscript9.dll. File=ChakraDemoApp.exe.
    This API is not supported for this application type - Api=JsStringToPointer. Module=jscript9.dll. File=ChakraDemoApp.exe.

Hopefully Microsoft will revisit their requirements.

Running JavaScriptCore on Windows Phone 8.1

After the first release of NativeScript I decided to spend some time playing with JavaScriptCore engine. We use it in NativeScript bridge for iOS and so far I heard good words about it from my colleagues. So I decided to play with JavaScriptCore and compare it to V8 engine.

At present NativeScript supports Android and iOS platforms only. We have plans to add support for Windows Phone as well and I thought it would be nice to have some experience with JavaScriptCore on Windows Phone before we make a choice between Chakra and JavaScriptCore.

The first difference I noticed between JavaScriptCore and V8 is that it has an API much closer to C style while the V8 API is entirely written in C++. This is not an issue and sometimes I consider it as an advantage.

The second difference, in my opinion, is that the JavaScriptCore API is simpler and exposes almost no extension points. From this point of view I consider the V8 API the better one. Compared to JavaScriptCore, V8 provides a much richer API for controlling internals of the engine like JIT compilation, object heap management and garbage collection.

Nevertheless it was fun to play with JavaScriptCore engine. You can find a sample project at GitHub.

Synchronizing GC in Java and V8

In the last post I wrote that I work on a project that involves a lot of interoperability between Java and V8 JavaScript engine. Here is an interesting problem I was investigating the last couple of days.

Both V8 and the JVM use a garbage collector for memory management. While using GC provides a lot of benefits, having two garbage collectors in a single process can sometimes be tricky. Suppose we have a super-charged version of LiveConnect where we have access to the full Java API.

var file = new java.io.File("readme.txt");

console.log("length=" + file.length());

These two lines of JavaScript may seem quite simple at first glance. We create an instance of java.io.File and call one of its methods. The tricky part is that we are doing this from JavaScript and we must take care that the actual Java instance is not GC’ed before we call the length method. In other words, we should provide some form of memory management. Suppose we decide to use JNI global references and we call NewGlobalRef every time we create a new Java object from JavaScript. Accordingly, we call DeleteGlobalRef when V8 makes the Java object unreachable from JavaScript.
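The same lifetime rule can be sketched on the Java side without JNI. This is an analogue, not the JNI global reference mechanism itself: a hypothetical ObjectTable keeps a strong reference to each Java object created from JavaScript, keyed by an integer handle, and drops it when the bridge learns that V8 has collected the JavaScript wrapper. All class and method names here are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: plays the role of NewGlobalRef/DeleteGlobalRef
// using an ordinary strong-reference table.
final class ObjectTable {
    private final Map<Integer, Object> strongRefs = new HashMap<>();
    private int nextHandle = 1;

    // Called when JavaScript creates a Java object: pins it against Java GC.
    synchronized int retain(Object javaObject) {
        int handle = nextHandle++;
        strongRefs.put(handle, javaObject);
        return handle;
    }

    // Called when JavaScript invokes a method on the object.
    synchronized Object get(int handle) {
        Object o = strongRefs.get(handle);
        if (o == null) {
            throw new IllegalStateException("Stale handle: " + handle);
        }
        return o;
    }

    // Called when V8 reports the JavaScript wrapper as unreachable.
    synchronized void release(int handle) {
        strongRefs.remove(handle);
    }

    synchronized int size() {
        return strongRefs.size();
    }

    public static void main(String[] args) {
        ObjectTable table = new ObjectTable();
        int handle = table.retain(new java.io.File("readme.txt"));
        System.out.println(table.get(handle)); // still pinned, safe to use
        table.release(handle);                 // now eligible for Java GC
    }
}
```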

Let’s see a more complicated scenario.

var outStream = new java.io.FileOutputStream("log.txt");

var eventCallback = new com.example.EventCallback({
    onDataReceived: function(data) {
       outStream.write(data);
    }
});

var listener = new com.example.EventListener(eventCallback);

In this case we create an instance of com.example.EventCallback and provide its implementation in JavaScript. Now suppose that all these three JavaScript objects become unreachable and V8 is ready to GC them. Just because all of these objects are unreachable in JavaScript it does not mean that their actual counterparts in Java are unreachable. It’s possible that listener and eventCallback objects are still reachable through a stack of a listener Java thread.

[Figure: GC reference chain]

Now comes the interesting detail. While in JavaScript eventCallback has a reference to outStream through the function onDataReceived, there is no such reference in Java and it is legitimate for the Java GC to collect the outStream object. The next time the callback object calls the write method there won’t be a corresponding Java object and the application will fail.

There are several solutions to this problem. One of them is to maintain the reachability in Java GC heap graph in sync with the one in JavaScript. After all, if there is an edge connecting eventCallback and outStream Java GC won’t try to collect the latter.

There are two options:

  • sync Java heap graph automatically
  • sync Java heap graph manually

As usual there is a trade-off. While the first option is very desirable there is a price to pay. We should analyze every closure in V8 that is GC’ed and traverse all objects reachable from there. This could slow down the GC by orders of magnitude.

The second option also has drawbacks. In general, JavaScript developers are not used to manual memory management. Introducing new memory management API could cause a lot of discomfort to the less experienced JavaScript developers.

scope(eventCallback, outStream);

Even if we make the API nice and simple, there is the burden of the mental model that JavaScript developers have to maintain. I tend to prefer this option though, because many C/C++ developers have proved it is possible to build high quality software using manual memory management.
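One way the Java side of such a scope call could work is sketched below, under stated assumptions: a hypothetical ScopeRegistry records an edge from the owner to the dependent, so that the Java GC sees the same reachability as JavaScript does. The names are invented for this sketch. A WeakHashMap is used so that the edge disappears once the owner itself becomes unreachable; note that it compares keys with equals/hashCode rather than identity, and that a dependent which refers back to its owner would keep the owner alive.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical Java-side counterpart of scope(eventCallback, outStream).
final class ScopeRegistry {
    // owner -> objects that must stay alive as long as the owner is alive
    private static final Map<Object, List<Object>> EDGES = new WeakHashMap<>();

    // Adds an edge owner -> dependent to the Java heap graph.
    static synchronized void scope(Object owner, Object dependent) {
        EDGES.computeIfAbsent(owner, k -> new ArrayList<>()).add(dependent);
    }

    // Returns a snapshot of the objects kept alive by the given owner.
    static synchronized List<Object> dependentsOf(Object owner) {
        List<Object> deps = EDGES.get(owner);
        if (deps == null) {
            return Collections.emptyList();
        }
        return Collections.unmodifiableList(new ArrayList<>(deps));
    }

    public static void main(String[] args) {
        Object eventCallback = new Object(); // stands in for the Java callback
        Object outStream = new Object();     // stands in for the FileOutputStream
        scope(eventCallback, outStream);
        System.out.println(dependentsOf(eventCallback).contains(outStream));
    }
}
```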

In closing I would say that there are other solutions to this problem. I’ll discuss them in another blog post.

Java and V8 Interoperability

The project I currently work on involves a lot of Java/JavaScript (V8 JavaScript engine) interoperability. Fortunately, Java provides JNI and V8 has a nice C++ API which make the integration process very smooth. Most of the Java-JNI-V8 type marshaling is quite straightforward but there is one exception.

The JNI uses modified UTF-8 strings to represent various string types. Modified UTF-8 strings are the same as those used by the Java VM. Modified UTF-8 strings are encoded so that character sequences that contain only non-null ASCII characters can be represented using only one byte per character, but all Unicode characters can be represented.

I was well aware of this fact since the beginning of the project but somehow I neglected it. Until recently, when one of my colleagues showed me a peculiar bug that turned out to be related to the process of marshaling a non-trivial Unicode string.
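To make the difference concrete, here is a small sketch that compares standard UTF-8 with modified UTF-8 for a few strings. It relies on the fact that DataOutputStream.writeUTF emits modified UTF-8 preceded by a two-byte length; the class and helper names are made up for this illustration. Plain ASCII encodes identically, while U+0000 and characters outside the Basic Multilingual Plane do not.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class ModifiedUtf8Demo {
    // Encodes 's' as modified UTF-8 and strips the 2-byte length prefix
    // that writeUTF puts in front of the data.
    static byte[] modifiedUtf8(String s) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeUTF(s);
            out.flush();
            byte[] withLength = bytes.toByteArray();
            return Arrays.copyOfRange(withLength, 2, withLength.length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Encodes 's' as standard UTF-8.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String ascii = "hello";
        String nul = "\u0000";            // U+0000
        String emoji = "\uD83D\uDE00";    // U+1F600, a surrogate pair in UTF-16

        // ASCII: 5 bytes in both encodings.
        System.out.println(standardUtf8(ascii).length + " vs " + modifiedUtf8(ascii).length);
        // U+0000: 1 byte in standard UTF-8, 2 bytes (0xC0 0x80) in modified UTF-8.
        System.out.println(standardUtf8(nul).length + " vs " + modifiedUtf8(nul).length);
        // U+1F600: 4 bytes in standard UTF-8, 6 bytes in modified UTF-8
        // because each surrogate is encoded separately as 3 bytes.
        System.out.println(standardUtf8(emoji).length + " vs " + modifiedUtf8(emoji).length);
    }
}
```

A consumer that expects standard UTF-8 will not decode the six-byte surrogate form correctly, which is the kind of mismatch a non-trivial Unicode string can trigger.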

At first, I tried a few quick and dirty workarounds, which only proved that the root of the problem was more complex than it seemed. Then I realized that the jstring type is not the best type when it comes to string interoperability with the V8 engine. I decided to use the jbyteArray type instead of jstring, though I had some concerns about the performance overhead.

private static native void doSomething(byte[] strData);

String s = "some string";
byte[] strData = s.getBytes("UTF-8");
doSomething(strData);

The code doesn’t look ugly though the string version looks better. I did microbenchmarks and it turned out the performance is good enough for my purposes. Nevertheless, I decided to compare the performance with Nashorn JavaScript engine. As expected, Nashorn implementation was faster because it uses the same internal string format as the JVM.

On Agile Practices

I have recently read the article What Agile Teams Think of Agile Principles by Laurie Williams and it got me thinking. The study conclusion is as follows:

The authors of the Agile Manifesto and the original 12 principles spelled out the essence of the agile trend that has transformed the software industry over more than a dozen years. That is, they nailed it.

Here are the top 10 agile practices from the case study.

Agile practice | Mean | Standard deviation
Continuous integration | 4.5 | 0.8
Short iterations (30 days or less) | 4.5 | 0.8
“Done” criteria | 4.5 | 0.8
Automated tests run with each build | 4.4 | 0.9
Automated unit testing | 4.4 | 0.9
Iterations review/demos | 4.3 | 0.8
“Potentially shippable” features at the end of each iteration | 4.3 | 0.9
“Whole” multidisciplinary team with one goal | 4.3 | 0.8
Synchronous communication | 4.4 | 0.9
Embracing changing requirements | 4.3 | 0.8

These are indeed practices rather than exact science and I am going to elaborate more on this topic. But first I would like to recap a few things from the history of the software industry.

Making successful software is hard. Many software projects failed in the past and many software projects are failing now. There are lots of studies that confirm it. Some studies claim that more than 50% of all software projects fail. In order to improve the rate of successful projects we tried to adopt know-how from other industries. The software industry adopted metaphors like building software and software engineering. We started to apply waterfall methodologies and rigorous scientific methods for defining software requirements like UML. We tried many things in order to build better software but not much changed.

Then people came up with the idea of agile software methodology. Agile manifesto states:

  • Individuals and interactions over processes and tools
  • Working software over comprehensive documentation
  • Customer collaboration over contract negotiation
  • Responding to change over following a plan

Agile methodology proposes a different mindset. We started to put emphasis on things like creativity and self-organizing teams more than engineering. Nowadays, we use metaphors like writing software much more often than 15 years ago. Some people go further by comparing programmers with writers, and consequently claim that in order to build good software we need good writers instead of good engineers. While I find such claims a bit controversial, there are many people who share similar opinions. In general, today we talk about software craftsmanship instead of software engineering.

These two approaches are not mutually exclusive. I see good trends of merging both of them whenever it is reasonable. It is natural for people to select the best from both worlds. Still, agile methodology is considered young. Most software companies still publish their job offerings as “Software Engineer Wanted” instead of “Software Craftsman Wanted”. This is only one example of what we have inherited in the IT industry. No matter how much an IT company boasts about how agile it is, the fact is that we need time to fully adopt the new mindset. The good thing is that the new mindset focuses on the individual and I think this is the key to better software.