Java and V8 Interoperability

The project I currently work on involves a lot of Java/JavaScript (V8 JavaScript engine) interoperability. Fortunately, Java provides JNI and V8 has a nice C++ API, which makes the integration process very smooth. Most of the Java-JNI-V8 type marshaling is quite straightforward, but there is one exception.

The JNI uses modified UTF-8 strings to represent various string types. Modified UTF-8 strings are the same as those used by the Java VM: they are encoded so that character sequences containing only non-null ASCII characters can be represented using one byte per character, while all Unicode characters can still be represented.

I was well aware of this fact from the beginning of the project, but somehow I neglected it. That changed recently, when one of my colleagues showed me a peculiar bug that turned out to be related to the marshaling of a non-trivial Unicode string.

At first, I tried a few quick-and-dirty workarounds, just to prove that the root of the problem was more complex than it seemed. Then I realized that the jstring type is not the best choice when it comes to string interoperability with the V8 engine. I decided to use the jbyteArray type instead of jstring, though I had some concerns about the performance overhead.

// Pass the string to native code as UTF-8 encoded bytes instead of a jstring
private static native void doSomething(byte[] strData);

String s = "some string";
byte[] strData = s.getBytes(StandardCharsets.UTF_8); // java.nio.charset.StandardCharsets; avoids the checked UnsupportedEncodingException
doSomething(strData);

The code doesn’t look ugly, though the jstring version reads better. I ran microbenchmarks and it turned out that the performance is good enough for my purposes. Nevertheless, I decided to compare the performance with the Nashorn JavaScript engine. As expected, the Nashorn implementation was faster because it uses the same internal string format as the JVM.

On Agile Practices

I recently read the article What Agile Teams Think of Agile Principles by Laurie Williams and it got me thinking. The study’s conclusion is as follows:

The authors of the Agile Manifesto and the original 12 principles spelled out the essence of the agile trend that has transformed the software industry over more than a dozen years. That is, they nailed it.

Here are the top 10 agile practices from the case study.

Agile practice | Mean | Standard deviation
Continuous integration | 4.5 | 0.8
Short iterations (30 days or less) | 4.5 | 0.8
“Done” criteria | 4.5 | 0.8
Automated tests run with each build | 4.4 | 0.9
Automated unit testing | 4.4 | 0.9
Iteration reviews/demos | 4.3 | 0.8
“Potentially shippable” features at the end of each iteration | 4.3 | 0.9
“Whole” multidisciplinary team with one goal | 4.3 | 0.8
Synchronous communication | 4.4 | 0.9
Embracing changing requirements | 4.3 | 0.8

These are indeed practices rather than exact science, and I am going to elaborate more on this topic. But first I would like to recap a few things from the history of the software industry.

Making successful software is hard. Many software projects failed in the past and many software projects are failing now. There are lots of studies that confirm this; some claim that more than 50% of all software projects fail. In order to improve the rate of successful projects, we tried to adopt know-how from other industries. The software industry adopted metaphors like building software and software engineering. We started to apply waterfall methodologies and rigorous scientific methods for defining software requirements, like UML. We tried many things in order to build better software, but not much changed.

Then people came up with the idea of the agile software methodology. The Agile Manifesto states:

  • Individuals and interactions over processes and tools
  • Working software over comprehensive documentation
  • Customer collaboration over contract negotiation
  • Responding to change over following a plan

Agile methodology proposes a different mindset. We started to put emphasis on things like creativity and self-organizing teams more than on engineering. Nowadays, we use metaphors like writing software much more often than 15 years ago. Some people go further by comparing programmers to writers and, consequently, claiming that in order to build good software we need good writers instead of good engineers. While I find such claims a bit controversial, there are many people who share similar opinions. In general, today we talk about software craftsmanship instead of software engineering.

These two approaches are not mutually exclusive. I see good trends of merging the two whenever it is reasonable; it is natural for people to select the best from both worlds. Still, the agile methodology is considered young. Most software companies still publish their job offerings as “Software Engineer Wanted” instead of “Software Craftsman Wanted”. This is only one example of what we have inherited in the IT industry. No matter how much an IT company boasts about how agile it is, the fact is that we need time to fully adopt the new mindset. The good thing is that the new mindset focuses on the individual, and I think this is the key to better software.

ECMAScript 262 And Browser Compatibility

I was curious to check how well the current browsers implement the ECMAScript 262 specification. At the time of writing I have the following browsers installed on my machine:

  • Google Chrome (34.0.1847.116 m)
  • Microsoft IE (11.0.9600.17031 / 11.0.7)
  • Mozilla Firefox (28.0)

I guess Chrome, IE and Firefox represent at least 90% of all desktop browsers. I tested them against the test suite provided by ECMA (http://test262.ecmascript.org) and it turned out that none of them passed all the tests. Here is the list of failing tests for each browser.

I know that some of the smartest guys in IT have worked on these projects for many years. So, the next time I hear that JavaScript is a simple scripting language, I won’t take it seriously.

JustMock Lite is now open source

A few weeks ago something important happened. I am very happy to say that JustMock Lite is now open source. You can find it on GitHub.

In this post I would like to share some bits of history. JustMock was my first project at Telerik. It was created by a team of two: the managed API was designed and implemented by my friend and colleague Mehfuz, and I implemented the unmanaged CLR profiler. The project was done in a very short time; I spent six weeks implementing all the functionality for so-called elevated mocking, which includes mocking static methods, non-virtual methods and sealed types. After a few iterations, JustMock was released early in April 2010.

I remember my very first day at Telerik. I had a meeting with Hristo Kosev and together we set the project goals. It turned out JustMock was just an appetizer for JustTrace. Back then we did not have much experience with the CLR unmanaged profiling API, and Hristo wanted to extend the Telerik product family with a performance and memory profiling tool. So, the plan was to start with JustMock and gain know-how before we built JustTrace. Step by step we extended the team, and the JustMock/JustTrace team was created. Here is the door sign that the team used to have.

[image: the JustMock/JustTrace team door sign]

Later the team changed its name to MATTeam (mocking and tracing team).

Looking back, I think we built two really good products. As far as I know, at the time of writing this post JustMock is still the only tool that can mock most of the types from the mscorlib.dll assembly. JustTrace also has its merits: it was the first .NET profiler with support for profiling managed Windows Store apps. I left MATTeam a year ago and I hope I can soon tell you what I am working on now. Stay tuned.

Native code profiling with JustTrace

The latest JustTrace version (Q1 2014) has some neat features. It is now possible to profile unmanaged applications with JustTrace, and in this post I am going to show you how easy it is to profile native applications.

For the sake of simplicity I am going to profile the notepad.exe editor, as it is available on every Windows machine. First, we need to set up the symbol path folder so that JustTrace can correctly decode the native call stacks. This folder is the place where all the required *.pdb files should be.

[image: JustTrace symbol path settings]

In most scenarios, we want to profile code we wrote from within Visual Studio. If your build generates *.pdb files, then it is not required to set up the symbols folder. However, in order to analyze the call stacks collected from notepad.exe, we must download the debug symbols from the Microsoft Symbol Server. The easiest way to obtain the debug symbol files is to use symchk.exe, which comes with the Microsoft Debugging Tools for Windows. Here is how we can download the notepad.pdb file.

symchk.exe c:\Windows\System32\notepad.exe /s SRV*c:\symbols*http://msdl.microsoft.com/download/symbols

[Note that in order to decode full call stacks, you may need to download *.pdb files for other dynamic libraries, such as user32.dll and kernelbase.dll. With symchk.exe you can download debug symbol files for more than one module at once. For more details you can check the Using SymChk page.]

Now we are ready to profile the notepad.exe editor. Navigate to the New Profiling Session->Native Executable menu, enter the path to notepad.exe and click the Run button. Once notepad.exe is started, open some large file and use the timeline UI control to select the time interval of interest.

[image: JustTrace timeline for the native profiling session]

In closing, I would say that JustTrace has become a versatile profiling tool that is no longer constrained to the .NET world. There are plenty of unmanaged applications written in C or C++, and JustTrace can help improve their performance. You should give it a try.

Notes on Asynchronous I/O in .NET

Yesterday I worked on a pet project and I needed to read some large files in an asynchronous manner. The last time I had to solve a similar problem was back in the days of .NET 2.0, so I was familiar with the FileStream constructors that have a bool isAsync parameter and with the BeginRead/EndRead methods. This time, however, I decided to use the newer Task-based API.

After working for a while, I noticed that there was a lot of repetition and my code was quite verbose. I googled for an asynchronous I/O library and picked a popular one. Indeed, the library hid the unwanted verbosity and the code became nice and tidy. After I finished the feature I was working on, I decided to run some performance tests. Oops, the performance was not good. It seemed like the bottleneck was the file I/O. I started JustDecompile and quickly found out that the library was using the FileStream.ReadAsync method. So far, so good.

Without much thinking, I ran my app under WinDbg and set a breakpoint at the kernel32!ReadFile function. Once the breakpoint was hit, I examined the stack:

0:007> ddp esp
0577f074  720fcf8b c6d04d8b
0577f078  000001fc
0577f07c  03e85328 05040302
0577f080  00100000
0577f084  0577f0f8 00000000
0577f088  00000000

Hmm, a few things are wrong here. The breakpoint is hit on thread #7 and the OVERLAPPED argument is NULL. It seems like ReadAsync is executed on a new thread and the read operation is synchronous. After some poking with JustDecompile I found the reason: the FileStream object was created via the FileStream(string path, FileMode mode) constructor, which sets useAsync to false.
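
For reference, here is a minimal sketch (the helper class, method name and buffer size are mine, not taken from the library or the original code) of how to open the stream so that ReadAsync really issues an asynchronous read:

using System.IO;
using System.Threading.Tasks;

static class AsyncReadSketch
{
    // new FileStream(path, FileMode.Open) does NOT request overlapped I/O, so ReadAsync
    // falls back to a synchronous read. Passing useAsync: true (or FileOptions.Asynchronous)
    // opens the file handle for overlapped, i.e. truly asynchronous, I/O.
    public static async Task<int> ReadFirstChunkAsync(string path)
    {
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
            FileShare.Read, bufferSize: 4096, useAsync: true))
        {
            var buffer = new byte[4096];
            return await stream.ReadAsync(buffer, 0, buffer.Length);
        }
    }
}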

I created a small isolated project to test the ReadAsync behavior further. I used a constructor that explicitly sets useAsync to true, set the same breakpoint and examined the stack:

0:000> ddp esp
00ffed54  726c0e24 c6d44d8b
00ffed58  000001f4
00ffed5c  03da5328 84838281
00ffed60  00100000
00ffed64  00000000
00ffed68  02e01e34 00000000
00ffed6c  e1648b9e

This time the read operation is started on the main thread and an OVERLAPPED argument is passed to the ReadFile function.

0:000> dd 02e01e34 
02e01e34  00000000 00000000 04c912f4 00000000
02e01e44  00000000 00000000 72158e40 02da30fc
02e01e54  02da318c 00000000 00000000 00000000
0:000> ? 04c912f4 
Evaluate expression: 80286452 = 04c912f4

A double check with SysInternals’ Process Monitor confirms it.

[image: Process Monitor capture of the read operation]

I emailed the author of the library and he was kind enough to respond immediately. At first, he pointed me to the following MSDN page, which demonstrates “correct” FileStream usage, but after a short discussion he acknowledged the unexpected behavior.

[image: the FileStream sample from the MSDN page]

I don’t think this is a correct pattern, and I quickly found at least two other MSDN resources that use an explicit useAsync argument for the FileStream constructor:

In closing, I would say that simply using the ReadAsync API doesn’t guarantee that the actual read operation will be executed in an asynchronous manner. You should be careful which FileStream constructor you use. Otherwise you could end up with a new thread that executes the I/O operation synchronously.

501 Must-Write Programs

In the spirit of the 501 Must-Visit… book series, I decided to write this post. My ambitions are much smaller, so don’t expect a comprehensive “guide” to computer programming. Nevertheless, I think it would be useful to show you short programs that demonstrate important computing ideas and techniques.

Fortunately, as software developers, we don’t need to know 501 things in order to accomplish our daily tasks. So my intention is to write a list of programs with a brief explanation for each one. I will extend the list whenever I find programs good enough to represent important techniques. At first sight, the programs in the list may seem random or unusual, but they all demonstrate important aspects of commonly used concepts in computer programming. Here are the first ten, in no particular order:

  1. Nth Fibonacci number – I know, I know… it is too simple. However, I think we can often learn from simple things, so let’s see what there is to learn here. Fibonacci numbers are one of the canonical examples of recursion, and they can be used to explain time complexity in a very accessible manner as well. It’s also good to mention Binet’s formula and the memoization technique (see the sketch after this list).
  2. Conway’s Game of Life – I include this program in the list simply because it is beautiful. It is an excellent example of how simple rules can lead to complex behavior. Cellular automata are used in computability theory, mathematics, physics, complexity science and theoretical biology, to mention just a few fields. To the curious readers, I also recommend reading about the Wolfram code.
  3. Quine (self-replicating program) – It is really fun to write a quine in your favorite (Turing-complete) programming language. While writing a quine is fun and good for your creativity, it is worth noting its deep connection with mathematical logic and fixed points in mathematics.
  4. Currency converter – While this program may not seem fun or a programming challenge, it is a handy tool and it will make you familiar with the units-of-measure concept, which is important in numerical calculations.
  5. String searching – I guess this is one of the most common tasks in our work; every now and then we have to search for a given substring. While the naive implementation is easy to write, you will quickly find out that it is impractical for large strings due to its time complexity. Then you will enter the wonderful world of the Knuth–Morris–Pratt and Rabin–Karp string search algorithms. There is a lot of scientific literature on the topic, and much of it comes from bioinformatics and computational mathematics.
  6. Huffman coding – It is, by far, the easiest introduction to information theory and lossless data compression. While there are much better data compression algorithms nowadays, the idea of Shannon entropy is still in use.
  7. Line counting – Well, you may find this one too lame. Counting the lines of a text file is not a big deal as long as the file is small and fits into main memory. Once it doesn’t, you quickly arrive at the idea of a data buffer. While buffering is not rocket science, it is one of the most practical and useful techniques in most code bases.
  8. Calculate a definite integral – Yes, you read it right. At first you may wonder how calculus is related to computer programming. I chose this relatively simple program because it demonstrates important ideas used in numerical analysis, such as approximation and sampling. Both ideas are very important in computer programming because, in a broader sense, everything in computer science is discrete. If your programming language allows passing functions as parameters, then you can write this program in a functional programming style.
  9. Graph traversal – Graphs are one of the most commonly used data structures. While both depth-first search and breadth-first search are easy to understand, together they represent a duality that is common in computer science.
  10. Tic-tac-toe game – Writing this game is one of the easiest introductions to game theory. The simplicity of the game allows you to grasp hard ideas like NP-complete and NP-hard problems, backtracking algorithms, alpha-beta pruning and the minimax rule.
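
To make the first item concrete, here is a minimal memoized Fibonacci sketch (the class and method names are mine, chosen just for illustration):

using System.Collections.Generic;
using System.Numerics;

static class Fibonacci
{
    // Memoization cache seeded with the two base cases.
    private static readonly Dictionary<int, BigInteger> cache =
        new Dictionary<int, BigInteger> { { 0, 0 }, { 1, 1 } };

    // Memoization turns the naive O(2^n) recursion into O(n):
    // each value is computed once and then served from the cache.
    public static BigInteger Nth(int n)
    {
        BigInteger result;
        if (!cache.TryGetValue(n, out result))
        {
            result = Nth(n - 1) + Nth(n - 2);
            cache[n] = result;
        }
        return result;
    }
}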

How to solve SOS/DAC mismatch

Have you ever experienced the following SOS/DAC mismatch error in WinDbg?

Failed to load data access DLL, 0x80004005
Verify that 1) you have a recent build of the debugger (6.2.14 or newer)
2) the file mscordacwks.dll that matches your version of mscorwks.dll is
in the version directory
3) or, if you are debugging a dump file, verify that the file
mscordacwks_<arch>_<arch>_<version>.dll is on your symbol path.
4) you are debugging on the same architecture as the dump file.
For example, an IA64 dump file must be debugged on an IA64
machine.

There are a lot of blog posts and articles that explain the cause of this error. The solution is simple: find the correct mscordacwks.dll version. And this is the moment when your pain starts. Well, not any more. I wrote a simple static C# class that downloads the correct mscordacwks.dll file for you. You can use it as easily as follows:

DacHelper.DownloadDac(@"C:\mydump.dmp", @"C:\symbols");

You can extend the class to download the sos.dll file as well, or to support ARM or IA64 processors. Keep in mind that you have to compile the source code with the /unsafe option enabled. Also don’t forget to include the following two files (with the correct bitness) from the WinDbg folder in your path:

  • dbghelp.dll
  • symsrv.dll

Enjoy 🙂

Source Code (zip)

Introduction to JsRT

I recently started working on a new project, part of which is about embedding the V8 JavaScript engine. So far, my experience with V8 is very good. The object model is nice and clean, and despite the lack of good documentation it is easy to work with. I strongly recommend using V8, but in this post I am going to show you another option.

After I gained some know-how with V8, I decided to explore the IE 11 Chakra JavaScript engine. Last week Microsoft announced IE 11 availability for Windows 7, and I guess Chakra will become a good alternative to V8. Using the Chakra engine may be tempting for developers who do not want to build V8 from source code, or who want to avoid static linking in order to keep their executable files small.

For the sake of this introductory post, I am going to show you how to implement simple printf-like functionality with Chakra (please note that I am going to provide just an overview; for more details see [1]). Suppose we want to implement a very simple print function that accepts a format string, an integer and a string, and formats the output as shown below:

native.printf('number=%#x string=%s\n', 255, 'test')

We are going to embed the Chakra engine in a simple console app that runs this script and just outputs the result.

[Note: for the sake of brevity I am going to omit the error handling. The source code contains all the error checks though.]

The first thing we have to do is create a new Chakra JavaScript runtime. The runtime represents a complete JavaScript execution environment and has a single thread of execution.

JsRuntimeHandle runtime;
JsCreateRuntime(JsRuntimeAttributeNone, JsRuntimeVersion11, nullptr, &runtime);

Once we have created a runtime we have to create an execution context. There can be multiple execution contexts that are active on a thread at the same time.

JsContextRef context;
JsCreateContext(runtime, nullptr, &context);
JsSetCurrentContext(context);

Now it is time to execute the script. This is done via the JsRunScript function.

wstring script(L"native.printf('number=%#x string=%s\\n', 255, 'test')");
JsValueRef result;
JsSourceContext contextCookie = 0;
JsRunScript(script.c_str(), contextCookie, L"source", &result);

Right now, you are probably wondering where this native.printf thing comes from. That’s right, I left that part out on purpose because I want to show the very basic workflow first:

  • create a runtime
  • create a context
  • run the script

Let’s see what is needed to make native.printf work. Every JavaScript runtime environment has one root object, called the global object, which holds all top-level objects. So, in our case, we have to create a new object and make it accessible through a native property on the global object. Then we have to create another object, actually a function, and make it accessible through a printf property on the native object.

JsValueRef global;
JsGetGlobalObject(&global);

JsPropertyIdRef nativeProp;
JsGetPropertyIdFromName(L"native", &nativeProp);

JsValueRef nativeObj;
JsCreateObject(&nativeObj);

JsPropertyIdRef printfProp;
JsGetPropertyIdFromName(L"printf", &printfProp);

JsValueRef printfFunc;
JsCreateFunction(PrintFormat, nullptr, &printfFunc);

JsSetProperty(nativeObj, printfProp, printfFunc, true);
JsSetProperty(global, nativeProp, nativeObj, true);

The final missing thing is the PrintFormat callback function.

JsValueRef CALLBACK PrintFormat(JsValueRef callee, bool isConstructCall, JsValueRef *arguments, unsigned short argumentCount, void *callbackState)
{
	const wchar_t *format;
	size_t length;
	JsStringToPointer(arguments[1], &format, &length);

	VARIANT variant;
	JsValueToVariant(arguments[2], &variant);

	const wchar_t *str;
	JsStringToPointer(arguments[3], &str, &length);

	wprintf(format, variant.intVal, str);

	return JS_INVALID_REFERENCE;
}

That’s all. We implemented all the functionality required to execute the native.printf function.

In closing, I would say that using the Chakra engine is fairly easy. The API is not object-oriented, and some less experienced developers may see that as a drawback. On the other hand, it is easy to incorporate the Chakra C-style API into an existing OO code base, or to use it from a scripting language like Python.

Download source code (zip)

Further reading:

[1] MSDN

Declarative mocking

Mocking complements test-driven development (TDD), allowing developers to write small and concise unit tests for components with external dependencies that would otherwise be hard or impossible to test. As software becomes more and more distributed and loosely coupled, mocking becomes an intrinsic part of the TDD process. While there are good tools and established best practices for mocking in .NET, most of the currently widely used approaches are imperative. Imperative code tends to be verbose and less expressive, and it describes how a mocking behavior is achieved rather than what behavior is desired. On the other hand, new technologies nowadays make it possible to build declarative mocking tools and frameworks.

Let’s start with a few examples (I try to avoid artificial examples, ICalculator-like ones for instance, because they don’t reflect the properties of real projects). Suppose you work on a mobile social app that consumes an external weather service. The app sends the current coordinates (latitude and longitude) and gets back JSON data as a string. You define the service interface as:

public interface IWeatherService
{
    string GetCurrent(float latitude, float longitude);
}

The service implementation does a REST call to get the data. For my current location, the REST call looks like the following:

http://api.openweathermap.org/data/2.5/weather?lat=42.7&lon=23.3
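
A possible implementation sketch (the class name and the use of WebClient are my assumptions, not taken from the original code) might look like this:

using System.Globalization;
using System.Net;

public class OpenWeatherMapService : IWeatherService
{
    // Builds the request URL shown above and returns the raw JSON response as a string.
    public string GetCurrent(float latitude, float longitude)
    {
        var url = string.Format(
            CultureInfo.InvariantCulture,
            "http://api.openweathermap.org/data/2.5/weather?lat={0}&lon={1}",
            latitude, longitude);

        using (var client = new WebClient())
        {
            return client.DownloadString(url);
        }
    }
}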

Once the app gets the data it should suggest places where the user can meet with friends. Depending on the current weather, the app should suggest an indoor or outdoor place. A possible implementation of this feature may look like the following:

public enum Sky
{
    Cloudy,
    PartiallyCloudy,
    Clear
}

public enum PlaceType
{
    Indoor,
    Outdoor
}

public class WeatherModel : Model
{
    private readonly IWeatherService weatherService;

    public WeatherModel(IWeatherService weatherService)
    {
        if (weatherService == null)
        {
            throw new ArgumentNullException("weatherService");
        }
        this.weatherService = weatherService;
    }

    public PlaceType SuggestPlaceType(float latitude, float longitude)
    {
        var sky = this.GetCurrentSky(latitude, longitude);

        return sky.Equals(Sky.Clear)
                ? PlaceType.Outdoor
                : PlaceType.Indoor;
    }

    private Sky GetCurrentSky(float latitude, float longitude)
    {
        var data = this.weatherService.GetCurrent(latitude, longitude);

        dynamic json = JsonConvert.DeserializeObject(data);

        var value = json.weather[0].main.Value as string;

        var sky = (Sky)Enum.Parse(typeof(Sky), value);

        return sky;
    }

    // the rest is omitted for brevity
}

The implementation is quite straightforward. It provides a simple design for dependency injection via the WeatherModel constructor, and the SuggestPlaceType method keeps the logic simple by delegating most of the work to a private method.

As we said before, the implementation of IWeatherService does a REST call. This requires that the test server(s) have an internet connection available, which is a serious restriction because most test environments are not connected to the internet.

To solve this issue we can use any modern mocking framework (e.g. Moq, JustMock, NSubstitute, FakeItEasy and so on). In this case I am going to use JustMock.

[TestMethod]
public void TestSuggestedPlaceType()
{
    // Arrange
    var weatherSvc = Mock.Create<IWeatherService>();

    var latitude = 42.7f;
    var longitude = 23.3f;
    var expected = "{'coord':{'lon':23.3,'lat':42.7},'sys':{'country':'BG','sunrise':1380428547,'sunset':1380471081},'weather':[{'id':800,'main':'Clear','description':'Sky is Clear','icon':'01d'}],'base':'gdps stations','main':{'temp':291.15,'pressure':1015,'humidity':72,'temp_min':291.15,'temp_max':291.15},'wind':{'speed':1,'deg':0},'rain':{'3h':0},'clouds':{'all':0},'dt':1380439800,'id':6458974,'name':'Stolichna Obshtina','cod':200}";
    Mock.Arrange(() => weatherSvc.GetCurrent(latitude, longitude)).Returns(expected);

    // Act
    var model = new WeatherModel(weatherSvc);
    var suggestedPlaceType = model.SuggestPlaceType(latitude, longitude);

    // Assert
    Assert.AreEqual(PlaceType.Outdoor, suggestedPlaceType);
}

I prefer the Arrange-Act-Assert (AAA) pattern for writing unit tests because it makes them simple and easy to read. As we can see, in this scenario the unit test is quite concise: 2 lines for the arrangement, 2 lines for the action, 1 line for the assertion, and a few lines for local variable definitions and comments. In fact, any modern mocking library can do it in a few lines; it doesn’t matter whether I use JustMock or Moq or something else.

The point is, in such simple scenarios any mocking framework will result in simple and readable unit tests. Before we continue, I would like to remind you that both JustMock and Moq are imperative mocking frameworks, and so are NSubstitute, FakeItEasy and many others. This means that we have to explicitly tell the mocking framework how the desired behavior should be achieved.

So far, we saw that imperative mocking frameworks do very well in simple scenarios. Let’s see an example where they don’t do so well, and how declarative mocking can help. Suppose you work on an invoice module for a CRM system. There is a requirement that the invoice module should send an email when there are more than 3 delayed invoices for a customer. A possible implementation may look as follows:

public interface ISpecializedList<T>
{
    void Add(T item);

    void Reset();

    uint Count { get; }

    // the rest is omitted for brevity
}

public interface ICustomerHistory
{
    ISpecializedList<Invoice> DelayedInvoices { get; }

    // the rest is omitted for brevity
}

public class InvoiceManager
{
    private readonly ICustomerHistory customerHistory;

    public static readonly uint DelayedInvoiceCountThreshold = 3;

    public InvoiceManager(ICustomerHistory customerHistory)
    {
        if (customerHistory == null)
        {
            throw new ArgumentNullException("customerHistory");
        }
        this.customerHistory = customerHistory;
    }

    public void MarkInvoiceAsDelayed(Invoice invoice)
    {
        var delayedInvoices = this.customerHistory.DelayedInvoices;

        delayedInvoices.Add(invoice);

        if (delayedInvoices.Count > DelayedInvoiceCountThreshold)
        {
            this.SendReport(invoice.Customer);
        }
    }

    private void SendReport(Customer customer)
    {
        // send report via email

        this.ReportSent = true;
    }

    public bool ReportSent
    {
        get; private set;
    }

    // the rest is omitted for brevity
}

Let’s write the unit test. I am going to use JustMock.

[TestMethod]
public void TestSendReportWhenDelayOrderThresholdIsExceeded()
{
    // Arrange
    var history = Mock.Create<ICustomerHistory>();

    uint count = 0;

    Mock.Arrange(() => history.DelayedInvoices.Add(Arg.IsAny<Invoice>())).DoInstead(new Action(() =>
    {
        Mock.Arrange(() => history.DelayedInvoices.Count).Returns(++count);
    }));

    // Act
    var invoiceMananger = new InvoiceManager(history);
    invoiceMananger.MarkInvoiceAsDelayed(new Invoice());
    invoiceMananger.MarkInvoiceAsDelayed(new Invoice());
    invoiceMananger.MarkInvoiceAsDelayed(new Invoice());
    invoiceMananger.MarkInvoiceAsDelayed(new Invoice());

    // Assert
    Assert.IsTrue(invoiceMananger.ReportSent);
}

This time the unit test looks quite complicated. We have to use the DoInstead method to simulate the internal workings of the ISpecializedList<T> implementation. In other words, we have code duplication: first, there is code that increments the Count property in the ISpecializedList<T> implementation we use in production; second, there is code that increments the Count property in our test, for the sole purpose of the test. Also, note that we now have a count local variable in our test.

Let’s compare the two scenarios and see why the last test is so complicated. In the first scenario we don’t have mutable object state, while in the second one we have to take care of the Count property. This is an important difference. Good program design usually says that a method with a return value doesn’t change the object state, while a method without a return value does change it. After all, it is common sense.

Suppose we have to write a unit test for the following method:

public void CreateUser(string username, string password) { ... }

This method doesn’t return a value. However, it changes the system state. Usually, when we write a unit test for a void method, we assert that the system state has changed. For example, we can assert that we can log in with the provided username and password.
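
As a sketch (UserManager and CanLogin are hypothetical names used only for illustration), such a test could look like this:

[TestMethod]
public void TestCreateUser()
{
    // Arrange: UserManager stands in for the (hypothetical) system under test
    var userManager = new UserManager();

    // Act: the void method changes the system state
    userManager.CreateUser("john", "s3cr3t");

    // Assert: observe the state change indirectly, e.g. by logging in
    Assert.IsTrue(userManager.CanLogin("john", "s3cr3t"));
}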

Another option is to change the method signature so the method returns a value:

public bool /* success */ CreateUser(string username, string password) { ... }
// or
public int /* user id */ CreateUser(string username, string password) { ... }

However this is not always possible or meaningful.

So, we see that mocking even a simple interface like ISpecializedList<T> complicates the unit tests. This is a consequence of the imperative mocking approach. Let’s see a hypothetical solution based on FakeItEasy-like syntax.

[TestMethod]
public void TestAddItem()
{
    // Arrange
    var list = A.Fake<ISpecializedList<Invoice>>();

    A.CallTo(() => list.Add(A<Invoice>.Ignored)).Ensures(() => list.Count == list.Count + 1);

    // Act
    list.Add(new Invoice());
    list.Add(new Invoice());
    list.Add(new Invoice());

    // Assert 
    Assert.AreEqual(3, list.Count);
}

In this case we removed the need for a count local variable and made the test shorter and more expressive. The Ensures method accepts a lambda expression that describes the next object state. For example, we can arrange the Reset method as follows:

A.CallTo(() => list.Reset()).Ensures(() => list.Count == 0);

Let’s see two more examples. We can arrange a mock for IDbConnection as follows:

IDbConnection cnn = ...;

A.CallTo(() => cnn.CreateCommand()).Returns(new SqlCommand());
A.CallTo(() => cnn.Open()).Ensures(() => cnn.State == ConnectionState.Open);
A.CallTo(() => cnn.Close()).Ensures(() => cnn.State == ConnectionState.Closed
						&& A.FailureWhen<InvalidOperationException>(() => cnn.BeginTransaction()));
A.CallTo(() => cnn.Database).FailsWith<NotImplementedException>();
A.CallTo(() => cnn.BeginTransaction()).FailsWhen<InvalidOperationException>(() => cnn.State != ConnectionState.Open);

This code fragment shows how we can describe the state machine behind an IDbConnection instance. Similarly, we can arrange a mock for TextReader as follows:

TextReader reader = ...;

A.CallTo(() => reader.Read()).Requires(() => reader.CanRead);
A.CallTo(() => reader.Read()).Returns(0);
A.CallTo(() => reader.Close()).Ensures(() => A.FailureWhen<InvalidOperationException>(() => reader.Read()));
A.FailureWhen<Exception>(() => reader.ReadBlockAsync(null, 0, 0));

While a fluent API can help with declarative mocking, it surely has its limits. Both the Requires and Ensures methods describe invariants, but the lambda expressions become harder to read as they grow in size. So I started looking for improvements.

First, I decided to try Roslyn. It turns out that Roslyn is quite a good framework for my purposes. My current approach is to define mock types as regular classes (I found some limitations of this approach and more research is needed). Instead of using a fluent API, I can define a mock type directly in the source code.

public mock class MockList<T> : ISpecializedList<T>
{
    public void Add(T item)
        ensures Count == Count + 1;

    public void Reset()
        ensures Count == 0;

    public uint Count
    {
        get;
    }
}

I borrowed the ensures clause from Spec# and added a mock modifier to the class definition. Then I used the Roslyn API to hook in and emit a simple throw new NotImplementedException(); for every method.
[image: screenshot of the generated code]

I also emitted a DescriptionAttribute for every ensures clause. I guess it would be better to emit a reference to a custom attribute defined in another assembly, but for now I decided to keep it simple. Now we can rewrite the previous TestAddItem test as follows:

[TestMethod]
public void TestAddItem()
{
    // Arrange
    var list = new MockList<Invoice>();

    // Act
    list.Add(new Invoice());
    list.Add(new Invoice());
    list.Add(new Invoice());

    // Assert 
    Assert.AreEqual(3, list.Count);
}

With the current implementation this test will fail with a NotImplementedException, but the test itself is short and easy to read. For further development I see two options. The first one is to make Roslyn emit the correct ILASM corresponding to the expressions defined via the requires and ensures clauses. The second option is to emit an interface rather than a class, and to keep the requires and ensures clauses encoded as attributes; then, at runtime, the mocking API can create types that enforce the defined invariants. I think the second option is more flexible than the first one.

Besides Roslyn, there is another approach that can make mocking easier. Recently I came upon the concept of prorogued programming. Using this technique, the developer can train the mocks used in the unit tests so that the desired behavior is achieved during the test runs. While this approach may seem semi-automated, I find it very attractive. I think it has a lot of advantages, and with good tooling support it may turn out to be a better way to go.

What’s next? I will research the Roslyn approach further. There are two options:

  • (static) using Roslyn API to emit ILASM at compile time
  • (dynamic) using Roslyn to emit interfaces and metadata and then using mocking API at runtime to provide the actual implementation

Both options have tradeoffs and a careful analysis is needed. Prorogued programming seems a very suitable technique for making mocking easier, so I need to investigate it further. Stay tuned.

Further reading: