Wednesday, March 19, 2014

Embedded containers + ATDD = the end of unit testing?

Things seem to be shifting in the world of Java web application and web services development.

The impact of node.js on the wider software community and the rising adoption of bootstrap frameworks such as Dropwizard and Spring Boot are nudging us to think differently about how we deploy our applications.  It's also an opportunity to think about how we develop these applications, so let's consider how some intrinsic efficiencies in these approaches might allow us to rethink some sacred cows.

Most of the teams and projects I've ever been on have considered tests as two distinct battles in a war for quality.

Whether we call them "black box" or "acceptance" or "integration" tests (or even just "the specs"), we mean that we are testing whether the feature behaves according to the specification, exercising the entire application through some form of public API, like RESTful web services, a CLI or a UI.

On the other hand, we write lower level unit tests that are really about something different: does each small component of my implementation do what it's supposed to do?  Good unit testing should help drive us toward good class, function and component design: S.O.L.I.D., etc.

Having the opportunity to build some fun tools recently using Dropwizard and Spring Boot, I found myself reflecting on whether I really needed any unit tests.  Heresy?  It sure felt dirty to me too for a bit, but I'll enumerate some facts about this project to help illustrate the discussion:
  • The application is powered by Spring Boot and makes heavy use of RESTful web services, Vert.x and WebSockets.
  • The application starts in less than 4 seconds using embedded Tomcat (Jetty was even faster).
  • As is true with many apps we write, we provide a rich JavaScript UI that is completely powered by our public RESTful web services.
  • The JSON media type objects that the production code consumes and produces are shared by the testing code.
  • We used IntelliJ IDEA and wrote our tests in TestNG (JUnit would have been fine too).
  • Because our IDE was not only running the tests but also spinning up the in-process application, we got production code coverage even while testing through the RESTful API.  Stop and consider that.
  • We used MongoDB, and for tests an embedded MongoDB.
  • Our tests use a Jersey client to talk to the running server on localhost over HTTP.
  • Fairly simple domain model: roughly 10 core domain entities.
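To make the shared media-type point concrete: a class along these lines (the Widget type and its fields are invented for illustration, not from the project) can be compiled into both the server and the test suite, so the test client deserializes responses into exactly the type the resource produced.

```java
// Hypothetical shared media-type object: the server serializes it to JSON
// and the tests deserialize responses back into the very same class, so the
// contract between test and production code can never silently drift.
public class Widget {
    private String id;
    private String name;

    public Widget() { }                        // no-arg constructor for JSON binding
    public Widget(String id, String name) {
        this.id = id;
        this.name = name;
    }

    public String getId() { return id; }
    public String getName() { return name; }
    public void setId(String id) { this.id = id; }
    public void setName(String name) { this.name = name; }
}
```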
Being strong advocates of TDD (that is, test-first development), we'd write some simple failing tests against a new RESTful endpoint.  As we developed to make the tests pass, we wrote as little code as we needed to, while still pausing to think about good design and a nice separation of concerns.  We wrote very few unit-level tests of these implementation classes.
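A sketch of the shape of such a test, using the JDK's built-in HttpServer as a stand-in for the embedded Tomcat/Jetty container and plain HttpURLConnection in place of the Jersey client (the /api/widgets endpoint and its payload are invented for illustration):

```java
import com.sun.net.httpserver.HttpServer;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;

public class EmbeddedServerTest {
    // Start an in-process HTTP server on an ephemeral port; a stand-in for
    // the embedded container Spring Boot or Dropwizard would spin up.
    static HttpServer startApp() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("localhost", 0), 0);
        server.createContext("/api/widgets", exchange -> {
            byte[] body = "[{\"id\":1,\"name\":\"sprocket\"}]".getBytes("UTF-8");
            exchange.getResponseHeaders().add("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();
        return server;
    }

    // The "acceptance test" step: talk to the running server over real
    // HTTP on localhost, exactly as a Jersey client (or a browser) would.
    static String fetchWidgets(int port) throws IOException {
        URL url = new URL("http://localhost:" + port + "/api/widgets");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
            return in.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        HttpServer app = startApp();
        try {
            String json = fetchWidgets(app.getAddress().getPort());
            if (!json.contains("sprocket"))
                throw new AssertionError("expected widget payload, got: " + json);
            System.out.println("PASS: " + json);
        } finally {
            app.stop(0);
        }
    }
}
```

Because the server runs in the same JVM as the test, a coverage tool attached to the test run sees every production line the request touches.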

Our first working code had no persistence at all; it simply used some Vert.x event bus handling.  Our next iteration used the file system, "inside" Vert.x.

Some new requirements caused us to think further about persistence, and we chose MongoDB.  Some of the few unit tests we did write were for our MongoDB repository, but even here it's questionable whether to call them unit tests: in fact we ran them against an embedded MongoDB, which gave us good confidence that we were persisting and retrieving correctly while exercising our persistence repository API.
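The shape of such a repository test, sketched here with a hypothetical WidgetRepository interface and an in-memory stand-in so the example runs without a database (the real tests ran the same kind of round-trip assertions against the embedded-MongoDB-backed implementation):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

public class RepositoryRoundTripTest {
    // Hypothetical persistence API; the real project exercised a
    // MongoDB-backed implementation against an embedded MongoDB instance.
    interface WidgetRepository {
        String save(Map<String, Object> widget);   // returns the generated id
        Map<String, Object> findById(String id);
    }

    // In-memory stand-in so this sketch runs anywhere.
    static class InMemoryWidgetRepository implements WidgetRepository {
        private final Map<String, Map<String, Object>> store = new ConcurrentHashMap<>();

        public String save(Map<String, Object> widget) {
            String id = UUID.randomUUID().toString();
            store.put(id, new HashMap<>(widget));
            return id;
        }

        public Map<String, Object> findById(String id) {
            return store.get(id);
        }
    }

    // The test exercises only the repository's public API, so swapping the
    // in-memory implementation for the Mongo-backed one changes nothing here.
    static void roundTrip(WidgetRepository repo) {
        Map<String, Object> widget = new HashMap<>();
        widget.put("name", "sprocket");
        String id = repo.save(widget);
        Map<String, Object> loaded = repo.findById(id);
        if (loaded == null || !"sprocket".equals(loaded.get("name")))
            throw new AssertionError("round trip failed: " + loaded);
    }

    public static void main(String[] args) {
        roundTrip(new InMemoryWidgetRepository());
        System.out.println("PASS");
    }
}
```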

Near our initial release to our team we stepped back and examined the catalog of tests we'd created: perhaps 10% could qualify as unit tests.

Because our tests exercised the public API instead of implementation classes, we truly didn't have to change them as the implementation changed underneath: refactoring is easier with this approach.

Perhaps the single most important data point here is that, because of the embedded server tests, we were able to see our production code coverage while running our tests in TestNG.  We hadn't necessarily been aiming for it, but we achieved almost 100% coverage, purely by testing through the public RESTful API.

This was revelatory; we had managed to create a high quality application with very few unit tests but still achieved almost perfect code coverage.  And the implementation was good too: certainly not the most pristine class design we've ever implemented, but much better than just good enough.

I don't think it's controversial that if you had to choose between acceptance tests and unit tests, you would choose acceptance tests:
  • they test the true system
  • they tend to give you much more confidence as to the health of the application
  • yup, refactoring really is easier
Some of the real disadvantages of ATDD are also mitigated by this approach.
  • Deploying your application to external systems takes time and resources (instead, your app is embedded in the test)
  • Having your application run on external systems increases brittleness due to network failures and other external system failures
  • Having your tests exercise external systems through a public API is much slower than a unit test.  True, even with an embedded server it's still slower than a true unit test, but without any of the brittleness.  And with an embedded approach you can more easily take advantage of the parallelization tools built into testing frameworks and CI systems.
  • At least two extra steps in a continuous delivery pipeline are eliminated (deploying to a test environment and running acceptance tests against it).
So getting back to that choice between acceptance and unit testing: maybe it's a false choice.

Perhaps ATDD under these circumstances with these new approaches is more than just good enough, maybe it's worth aiming for.
