Wednesday, March 19, 2014

Embedded containers + ATDD = the end of unit testing?

Things seem to be shifting in the world of Java web application and web services development.

The impact of node.js on the wider software community and the rising adoption of bootstrap frameworks such as Dropwizard and Spring Boot are nudging us to think differently about how we deploy our applications.  It's also an opportunity to think about how we develop these applications, so let's consider how some intrinsic efficiencies in these approaches might allow us to rethink some sacred cows.

Most of the teams and projects I've ever been on have considered tests as two distinct battles in a war for quality.

Whether we call them "black box" or "acceptance" or "integration" tests (or even just "the specs"), we mean tests that check whether a feature behaves according to its specification, exercising the entire application through some form of public API, like RESTful web services, a CLI or a UI.

On the other hand, we write lower level unit tests that are really about something different: does each small component of my implementation do what it's supposed to do?  Good unit testing should help drive us toward good class, function and component design: S.O.L.I.D., etc.

Having the opportunity to build some fun tools recently using Dropwizard and Spring Boot, I found myself reflecting on whether I really needed any unit tests.  Heresy?  It sure felt dirty to me too for a bit, but I'll enumerate some facts about this project to help illustrate the discussion:
  • The application is powered by Spring Boot and makes heavy use of RESTful web services, Vert.x and WebSockets.
  • The application starts in less than 4 seconds using embedded Tomcat (Jetty was even faster)
  • As is true with many apps we write, we provide a rich javascript UI that is completely powered by our public RESTful web services.
  • The JSON media type objects that the production code consumes and produces are shared by the testing code.
  • Using IntelliJ, and writing tests in TestNG (JUnit would have been fine too).
  • Because our IDE was not only running the tests but also spinning up the in-process application, we got production code coverage even when testing through the RESTful API.  Stop and consider that.
  • We use MongoDB, and for tests we use an embedded MongoDB.
  • Our tests use a Jersey client to talk to the running server on localhost over HTTP.
  • Fairly simple domain model: ~10 core domain entities.
Being strong advocates of TDD (that is, test-first development), we'd write some simple failing tests against a new RESTful endpoint.  As we developed to make the tests pass, we wrote as little code as we needed to, while still pausing to think about good design and nice separation of concerns.  We wrote very few unit-level tests of these implementation classes.
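
To make that concrete, here's a minimal sketch of the style of test we'd start with, written against the public API using the JAX-RS client (Jersey provides the implementation).  The resource path, payload, port and expected status are hypothetical, and starting the Spring Boot application in-process is elided; this illustrates the approach rather than our actual code.
import javax.ws.rs.client.Client;
import javax.ws.rs.client.ClientBuilder;
import javax.ws.rs.client.Entity;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

import org.testng.annotations.BeforeClass;
import org.testng.annotations.Test;

import static org.testng.Assert.assertEquals;

public class WidgetResourceAcceptanceTest {
  // Hypothetical endpoint; assumes the application was started in-process on this port.
  private static final String WIDGETS_URL = "http://localhost:8080/api/widgets";

  private Client client;

  @BeforeClass
  public void createClient() {
    // In the real project the Spring Boot application is also started here, in the same JVM.
    client = ClientBuilder.newClient();
  }

  @Test
  public void createsAWidgetThroughThePublicApi() {
    Response response = client.target(WIDGETS_URL)
        .request(MediaType.APPLICATION_JSON)
        .post(Entity.json("{\"name\":\"first widget\"}"));

    // Red until the endpoint exists- which is exactly where TDD starts.
    assertEquals(response.getStatus(), 201);
  }
}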

Our first working code had no persistence at all; it simply used some Vert.x event bus handling.  Our next iteration used the file system, "inside" Vert.x.

Some new requirements caused us to think further about persistence, and we chose MongoDB.  Some of the few unit tests we did write were for our MongoDB repository, but even here it's questionable whether to call them unit tests- in fact we ran them against an embedded MongoDB, which gave us good confidence that we were persisting and retrieving correctly while exercising our persistence repository API.
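
Those repository tests looked roughly like the sketch below.  UserRepository and User are hypothetical stand-ins for our real domain, and the wiring that starts the embedded MongoDB before the suite is elided; the point is that the test exercises the repository's public API against a real (if embedded) database.
import org.testng.annotations.Test;

import static org.testng.Assert.assertEquals;

public class UserRepositoryTest {
  // UserRepository and User are hypothetical; a @BeforeClass (elided here)
  // starts the embedded MongoDB and wires the repository to it.
  private UserRepository repository;

  @Test
  public void persistsAndRetrievesAUser() {
    User saved = repository.save(new User("grace", "grace@example.com"));

    User found = repository.findById(saved.getId());

    assertEquals(found.getEmailAddress(), "grace@example.com");
  }
}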

Near our initial release to our team we stepped back and examined the catalog of tests we'd created: perhaps 10% could qualify as unit tests.

Because our tests instead exercised the public API, we truly didn't have to change them as our implementation changed underneath:  Refactoring is easier with this approach.

Perhaps the single most important data point to consider here is that because of the embedded server tests, we were able to see our production code coverage while running our tests in TestNG.  We hadn't necessarily been aiming at it, but we achieved almost 100% coverage, purely testing through the public RESTful API.

This was revelatory; we had managed to create a high quality application with very few unit tests but still achieved almost perfect code coverage.  And the implementation was good too: certainly not the most pristine class design we've ever implemented, but much better than just good enough.

I don't think it's controversial that if you had to choose between acceptance tests and unit tests, you would choose acceptance tests:
  • they test the true system
  • they tend to give you much more confidence as to the health of the application
  • yup, refactoring really is easier
Some of the real disadvantages of ATDD are also mitigated by this approach.
  • Deploying your application to external systems takes time and resources (instead, your app is embedded in the test)
  • Having your application run on external systems increases brittleness due to network failures and other external system failures
  • Having your tests talk to external systems through some public API is much slower than a unit test. True, even with an embedded server it's still slower than a true unit test, but without any of the brittleness.  And with an embedded approach you can more easily take advantage of the parallelization tools built into testing frameworks and CI systems.
  • At least two extra steps in a continuous delivery pipeline are eliminated (deploying to a test environment and running the acceptance tests against it)
So getting back to that choice between acceptance and unit testing: maybe it's a false choice.

Perhaps ATDD under these circumstances with these new approaches is more than just good enough, maybe it's worth aiming for.




Monday, April 23, 2012

Experiment: an A/B or Switch Testing framework

I've been reading and thinking a lot about Lean Startup lately, and about the technology and infrastructure its techniques require.

One of the great concepts, "validated learning", can be accomplished via A/B testing: pushing out a change or feature that some users get and some don't, and then measuring the success of that change.

I consider this an experiment.  So I decided to start building a little experiment framework, in Java, that will (eventually) also include a RESTful web application for creating and consuming experiments.

At this point I'm really trying to focus on a clean, easy-to-use API.  I've also added persistence out of the box, doing database migrations with Liquibase and data access with Spring JDBC (partially because I'm a little sick of ORM...)
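
To give a flavor of what I mean by a clean API, here's a self-contained, hypothetical sketch of the shape I'm playing with- none of these names are final, and the real framework will persist experiments and results rather than keep everything in memory.
import java.util.Arrays;
import java.util.List;

public class Experiment {
  private final String name;
  private final List<String> variants;

  private Experiment(String name, List<String> variants) {
    this.name = name;
    this.variants = variants;
  }

  public static Experiment named(String name, String... variants) {
    return new Experiment(name, Arrays.asList(variants));
  }

  // Deterministically buckets a user so they always see the same variant.
  public String variantFor(String userId) {
    int bucket = ((name + userId).hashCode() & Integer.MAX_VALUE) % variants.size();
    return variants.get(bucket);
  }

  public static void main(String[] args) {
    Experiment checkout = Experiment.named("new-checkout-button", "control", "green-button");
    System.out.println("user-42 sees: " + checkout.variantFor("user-42"));
  }
}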

I may use the experience to try out some different persistence technologies as well, but first I've got a bit more work to do to create a small web app that can be exercised through RESTful web services.

Would you use it?

Tuesday, April 3, 2012

Buildable project on GitHub

I've created a new little Java project in GitHub called buildable.

It's a set of build-time Java annotations and an annotation processor that make it easy to create POJO builders with fluent interfaces, as described in "Growing Object-Oriented Software, Guided by Tests".

I've been using this simple little pattern for several years across a few companies and many projects, and it's one of those things that can greatly simplify writing unit tests, especially on a domain model.
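
For anyone who hasn't seen the test data builder pattern, here's a hypothetical, hand-written example of the kind of builder the annotation processor generates for you (the real generated code will differ); it assumes a simple User class with a (name, emailAddress) constructor.
public class UserBuilder {
  private String name = "Default Name";
  private String emailAddress = "default@example.com";

  public static UserBuilder aUser() {
    return new UserBuilder();
  }

  public UserBuilder named(String name) {
    this.name = name;
    return this;
  }

  public UserBuilder withEmailAddress(String emailAddress) {
    this.emailAddress = emailAddress;
    return this;
  }

  public User build() {
    return new User(name, emailAddress);
  }
}

A test then only spells out the fields it actually cares about, e.g. User grace = UserBuilder.aUser().named("Grace").build(); everything else gets a sensible default.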

The GitHub wiki page explains more.

Sunday, December 5, 2010

Attraction vs Elegance, Fighting Software Entropy

I've recently joined a team working on some neat applications in a high tech domain, and my interests in OO quality and craftsmanship have been renewed. After watching some older videos (here and here) between Corey Haines and J.B. Rainsberger, I wanted to expand on one of J.B.'s ideas here: the notion of attraction.

Originally I was just thinking that I prefer that adjective over something hifalutin, empty, and often undeserved, like elegance. But after starting to grok what they were talking about, I realized the notion has a more meaningful purpose. Attractive software:
  • Attracts other code to it.
  • Attracts developers to use it.
  • Attracts developers to share attractive patterns.
  • Fights entropy.
I'll try to demonstrate below, using Java and a simple problem domain: sending a user a welcome message. And though I practice TDD religiously, I'm going to omit all tests for the sake of brevity.

Let's start with two simple Java classes:
public class User {
  private String emailAddress;
  private String name;
  public User(String name, String emailAddress){
    if (!isValid(emailAddress)){
      throw new IllegalArgumentException("Not a valid email address.");
    }
    this.name = name;
    this.emailAddress = emailAddress;
  }
  public void sendWelcomeMessage(EmailSender sender){
    sender.send("greeter@mysite.com", emailAddress, "welcome", "Welcome to the site " + name + "!");
  }
  private boolean isValid(String emailAddress){
    // do some validation- a naive placeholder check for this example
    return emailAddress != null && emailAddress.contains("@");
  }
}

public interface EmailSender {
  void send(String fromAddress, String toAddress, String subject, String body);
}

Ok, so if we're developers responsible for working on this User class, we'll notice a few smells:
  1. It has too many responsibilities: it models a user but also validates email addresses.
  2. It has some knowledge about who is supposed to send the welcome message.

And maybe it doesn't make sense to send emails from the User, but I like this API as it doesn't expose the internals of the User- it's telling it to send an email, not asking for its emailAddress. I guess that's an example of a curried object.

Let's address the first by introducing a Value Object for an email address. I know I could use javax.mail.internet.InternetAddress directly, but I want something simpler and easier to work with. In fact I'll use it for validation and weaken its checked AddressException down to the more friendly unchecked IllegalArgumentException.

import javax.mail.internet.AddressException;
import javax.mail.internet.InternetAddress;

public class EmailAddress {
  private InternetAddress value;
  public EmailAddress(String address){
    try {
      this.value = new InternetAddress(address); // validates for us
    } catch (AddressException ae){
      throw new IllegalArgumentException(ae);
    }
  }
  public String value(){
    return value.toString();
  }
}

Now let's refactor the User class, using this new abstraction.
public class User {
  private String name;
  private EmailAddress emailAddress;
  public User(String name, EmailAddress emailAddress){
    this.emailAddress = emailAddress;
    this.name = name;
  }
  public void sendWelcomeMessage(EmailSender sender){
    sender.send("greeter@mysite.com", emailAddress.value(), "welcome", "Welcome to the site " + name + "!");
  }
}
Ok, this is better: we've removed the first smell, since the User class no longer has the extra responsibility of validating email addresses.

But I'd argue this is the stage where some attraction starts happening. The EmailAddress class might be a better place to hang the greeter email address. Perhaps it won't be the final place in our application, I could certainly think of better places to put it, but I'd argue it's better than in the User class.

public class EmailAddress {
  public static final EmailAddress GREETER = new EmailAddress("greeter@mysite.com");
  ...
}
And refactoring the User class to use it...
public void sendWelcomeMessage(EmailSender sender){
    sender.send(EmailAddress.GREETER.value(), emailAddress.value(), "welcome", "Welcome to the site!");
  }

Ok, that's better, but there's still more attraction going on. If someone responsible for the EmailSender interface were to look at it, might they realize that (re)using our EmailAddress object makes their API better too?

public interface EmailSender {
  void send(EmailAddress from, EmailAddress to, String subject, String body);
}

So we'd refactor our User to be even better...
...
  public void sendWelcomeMessage(EmailSender sender){
    sender.send(EmailAddress.GREETER, emailAddress, "welcome", "Welcome to the site!");
  }
...
But now I think I see another type of attraction going on: this Value Object spreading to other parts of the EmailSender API. Value Objects are a great place to put validation- so why not create Subject and Body value objects, where (for example) we could isolate the logic for an email subject's maximum length?

public interface EmailSender {
  void send(EmailAddress from, EmailAddress to, Subject subject, Body body);
}
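
Here's a sketch of what such a Subject value object might look like- the 200-character limit is an assumption for illustration, not a real requirement:
public class Subject {
  private static final int MAX_LENGTH = 200;
  private final String value;

  public Subject(String value) {
    if (value == null || value.trim().isEmpty()) {
      throw new IllegalArgumentException("A subject is required.");
    }
    if (value.length() > MAX_LENGTH) {
      throw new IllegalArgumentException("Subjects may not exceed " + MAX_LENGTH + " characters.");
    }
    this.value = value;
  }

  public String value() {
    return value;
  }
}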

This was intentionally a trivial and contrived example application, and I'd never call this code elegant. Nor would anyone else. But perhaps they might call it attractive.  Or at the least they might say it's more attractive than it was.

And we would have made some small steps fighting some entropy in our application, don't you think?

Thursday, February 25, 2010

The other build. The one that SHOULD be broken.

I was recently on site with a client where I was sharing my experiences with Test Driven Development (TDD), and how it can help with different types of problems. Note that this client has nothing but sharp, enthusiastic, and friendly employees. Unfortunately their development process has severely limited how quickly they can release new products to the market. A large new infusion of capital was called for to modernize their systems and their practices. My little software consulting shop was hired to help with both endeavors: help improve the software and help improve the process.

This particular week, I met with the head of the QA department and had a good one hour discussion about the merits of TDD and specifically my experiences in different types of organizations: teams with technologically-savvy QA staff and those without.

This positive first meeting led this manager to ask me to demonstrate to a larger audience a very specific technique: how do you test something that isn't built yet?

Developers practicing disciplined TDD do this often, many times a day, with our unit and integration tests. We write tests that at first don't even compile, much less pass and "go green". Our IDE's feedback mechanisms are practically screaming at us to fix these problems. We get to the point where we don't even consider this practice odd (or "backwards") at all. Many of us come to like it so much that we never want to go back. But from a product QA perspective, where the focus is (mostly) on acceptance testing, I realized that perhaps we Agilistas, as a community, have done a poor job communicating how this can happen. For example, I was asked, "How in the world can I test a web page that hasn't been built yet?".

Having the most experience solving this particular problem using Selenium, I rehearsed for about 20 minutes before the meeting to make sure I could pull off something useful. I made sure I coded a few simple, but non-trivial acceptance tests that failed (because the web page didn't exist yet), and then incrementally developed the (dreamed up) feature so that within 20 minutes or so all the tests passed.

Then the meeting: I proceeded to do the same demonstration for the larger audience, which took about two hours. Why so much longer? Well, first of all, it was less a demonstration and more of a discussion: we discussed the techniques; the technology (Java, Spring, Groovy, Selenium, MVC); switching roles between product owner, tester, and developer; and encouraging them to do the same. They also came up with a much better idea for a feature to build in this demonstration, which happened to be slightly more complex but still feasible. I resisted the urge to fall back on exactly what I had rehearsed, but felt a little like I was working without a net.

I'll share the feature because I think it's important to illustrate just how small a scope we aimed for: the beginnings of a Help section of a web site. We wanted a Help-Menu page with links to other Help sections: help with cookies and help with signing in. They wanted the page accessible both when the user was signed in and when they weren't yet. As we continued in the product owner role, we realized we didn't want the "Trouble signing in?" link on the page when you've already signed in. And lastly, we needed a link, from the top navigation, to this new Help-Menu page, visible in both user states (signed in and not). That was it. Pretty simple.

The real meat of "How do you test a web page that doesn't exist yet?" was mostly explaining how we can use Selenium's API and locators to expect certain text or behavior on any web page, even one that hasn't been built. We had a healthy discussion about how some locators (such as overly specified XPath expressions) end up influencing and restricting how the developers build the page. In our case, I recommended using the link text as the locator, so that the developer would be free to develop the markup however she saw fit. So after writing just a few Selenium/JUnit test cases, we had 4 tests that failed. 4 red lines. Perfect.
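
In WebDriver terms (the demo used the Selenium API of the day, and these URLs and link texts are only illustrative), those first failing tests looked something like this sketch:
import static org.junit.Assert.assertTrue;

import java.util.List;

import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;

public class HelpMenuAcceptanceTest {
  private WebDriver driver;

  @Before
  public void openBrowser() {
    driver = new FirefoxDriver();
  }

  @After
  public void closeBrowser() {
    driver.quit();
  }

  @Test
  public void anonymousUsersSeeTheTroubleSigningInLink() {
    driver.get("http://localhost:8080/help");
    // Locating by link text leaves the developer free to write whatever markup she likes.
    // findElement throws if the link is missing- so this stays red until the page exists.
    driver.findElement(By.linkText("Trouble signing in?"));
  }

  @Test
  public void signedInUsersDoNotSeeTheTroubleSigningInLink() {
    signIn(); // hypothetical helper- the demo's sign-in steps are elided here
    driver.get("http://localhost:8080/help");
    List<WebElement> links = driver.findElements(By.linkText("Trouble signing in?"));
    assertTrue("Signed-in users should not see the sign-in help link", links.isEmpty());
  }

  private void signIn() {
    // Elided: drive the (not yet built) sign-in page.
  }
}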

Switching now to the developer role, I designed and implemented the smallest increment of work I could think of. And of course- it was the smallest amount of work I could think of to make just one test pass. Boom- it did. 3 red lines. One green.

I really wanted to hammer home the importance of this to the team: this was, on a very small scale, the very essence of TDD. Product owners, testers and developers working very closely together early in the life-cycle of the feature. The feature isn't releasable based on some arbitrary, subjective metric: it's ready when the tests pass. Who wrote the tests? The product owner, tester, and developer- together. And also- visible progress.

Then within another 20 minutes we finished the entire story and made all 4 tests go green.

We then attempted some extrapolation: imagine what our daily standup meetings would be like if we agreed to always have them around a dedicated projector showing two screens: yesterday's red and green bars, and today's red and green bars. I realized we had happened onto something important that isn't expressed often at all in Agile development: a Continuous Integration build that we intend to be broken. Most of the mindshare around Continuous Integration is about having a build that is always green- and when the build does break (and it rarely will), it is addressed immediately. But that's a different sort of build: that one is extremely valuable from a development integration perspective, but arguably worthless from a "delivering business value" perspective. And our "epiphany" was that most executive sponsors and stakeholders don't give the smallest damn about development integration- they care about progress. They shared how their executives are often frustrated because they don't see the results of all their hard work until just before the team releases to production- which, at the moment, is infrequent.

Now imagine having this projector always on, centrally located in an open and informative workspace, where stakeholders can look at it on their way to get coffee. They'd see red bars for items like "Signed in users do not see the 'Trouble signing in?' section on the Help Menu", but green bars for items like "Both signed in and non-signed in users can navigate to the Help Menu easily from the top navigation area". The next day (or an hour later) they walk by and see all green for the Help Menu feature's tests.

How could there not be a huge increase in trust between the stakeholders and the team if this happened daily or hourly? How could this not improve communication between all roles on the team? How could this not shift the end-of-iteration burden off QA's shoulders (where it sits in every single waterfall process)? And wouldn't it be fun too?

Of course, these practices have been written about- Steve Freeman and Nat Pryce write eloquently and practically about them in their excellent "Growing Object-Oriented Software, Guided by Tests" book- perhaps my favorite of 2009.

And some thought leaders like Jeff Patton are starting to talk less specifically about Agile when trying to help teams. Instead they argue that visibility is everything: just make all these unintentionally hidden things visible, and most good software teams will invent their own improvements without needing to know anything about Agile.

In conclusion, TDD is practiced at several levels in the life-cycle of shipping new business value: unit testing, integration testing, acceptance testing, even deployment testing. Let's start sharing more about the other levels- the other builds, not just the development integration builds. The ones that fail most of the time, but get greener and greener as we move through the iteration: illustrating progress, quality and commitment.

What do you think?

Tuesday, August 11, 2009

Quality Governance goes down a little easier when automated

It's been a while since my last post. I've recently started a new job contracting for a company in the health care industry. As one of the earliest developers on the project, I was tasked with setting up some continuous integration (we decided on Hudson) and the beginnings of some software quality governance (PMD for static analysis and Cobertura for test coverage).

Our client has some fairly strict requirements in terms of static analysis and test coverage, but fortunately they have a PMD ruleset already defined- and their test coverage requirement amounts to 95%.

As we're early in the project, we're ramping up the team and still doing lots of 'infrastructure' work like defining Ant tasks to automate certain build steps (although realistically that sort of thing never ends on an evolutionary project). I've never worked on a team where there was much "formal" governance- the aspirations of the team, in terms of code quality, were stated and agreed upon, but never governed with strict rules. We tended to pair program occasionally and scrum regularly, so that any wide discrepancies in practices were ferreted out.

But having automated our static analysis rules with the Hudson PMD plugin, and our test coverage with the Cobertura plugin, I must confess I'm liking the strict governance. Now the build fails not just when a test fails, but also when there aren't enough tests (the coverage thresholds aren't met) or when the static analysis rules are violated.

I've been a big fan of Test Driven Development for quite a few years now, and although I've been fortunate to work on teams where pair programming wasn't out of the ordinary, it's always been a struggle to maintain high code coverage as the project evolves. Having a client that mandates challenging goals and strict governance will actually be a blessing.

As new developers join the team, it won't seem like such an artificial or arbitrary goal to have 95% coverage and strict analysis metrics. For one thing, it's being mandated by the client. But more importantly, it's now (and forever more) baked into the build- these developers will quickly understand that their work will not be accepted unless it passes the build, and in this case that means governance as well as functionality. I'm really hoping it makes the overall quality goals consistent, maintainable and achievable.

It also reinforces the Test Driven approach. For example, say up to this point we have an initial domain model, some services, some controllers and some stubbed out objects. The moment we add a persistence layer, we'll have to test it- the code coverage rules will fail if we don't. It might be easy (and very tempting) to just whip up some simple CRUD operations on a DAO without any tests (and let's face it, with frameworks like Hibernate, we pretty much know that they'll work). But our build process won't allow us. This means we have to think about how we're going to test our persistence layer very early- do we go with DBUnit? Do we grow something ourselves? Those are nice questions to have answered (and working) before we write much persistence code. Not to say that we must have it perfect out of the gate- but we must have it covered.

Having the governance automated makes swallowing those sort of pills just a little bit easier.

Wednesday, October 8, 2008

New release of SocialVino

Today I released the first revision to www.socialvino.com since the launch.

The three major new features are:
  • changing your password
  • pagination of browsing members
  • pagination of browsing wines
Soon I'll be adding welcome emails, but I need to set up an asynchronous messaging service, like OpenMQ, and get that working with the Grails plugin. I also plan on adding the ability to flag wines as suspicious, so that abusers don't spam the site with nonsensical wines, and to flag wine reviews as offensive. My ideal would be to allow the members to police themselves. And honestly, I don't feel the site really needs this yet, but it seems like an interesting challenge.