Mocking Groovy's HTTPBuilder

I ran into a head-scratcher today when trying to unit test some Groovy code. The code under test interacts with an HTTP web service using Groovy's great HTTPBuilder, which wraps Apache's HttpClient. Obviously, I wanted to mock the interaction with the HTTP server to limit the scope of my tests.

Groovy makes it easy to create simple mocks using maps. To mock a class with a map, you create a map keyed by the names of the methods to be mocked, with closures as the values that supply the mock implementations. For example, if we wish to mock out HTTPBuilder, which has a "post" method, we can do so using the map defined by mapMock.

class HTTPBuilder {
    def post(...) { /* real implementation */ }
}

def mapMock = ["post": { /* mock implementation */ }]

This map-mock approach was working great for mocking out the post, put, and delete methods in HTTPBuilder, but the get method was giving me quite a bit of trouble. The closure in my get method mock was never executed.

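After taking a step back, I realized that the map's own get method (the one used to return the value at a specific key) was being called instead of the closure stored under the "get" key. Because a real method on the map takes precedence over a closure stored as a value, the call was treated as an ordinary key lookup and my mock closure never ran.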

The simple solution was to switch to an Expando mock instead of a map mock.

def expandoMock = new Expando()
expandoMock.get = { /* mock implementation */ }

I know I'm late to the train, but Groovy is a breath of fresh air compared to Java.

Master's Thesis & Open-Source Tool

On July 15th, I successfully defended my Master's Thesis in Biomedical Informatics at Vanderbilt University. This defense was the culmination of 2 years of work. The thesis focuses on extracting organizational structure and relationships from the audit logs of clinical information systems. This work has potential applications in improving the delivery of care and the security of patients' private medical data.

As part of this work, I developed an open-source tool for analyzing audit logs. Licensed under an Apache 2.0 License, the Healthcare Organizational Relational Network Extraction Toolkit (HORNET) is a Python framework for plugins that analyze healthcare audit logs. The tool is fully functional, but is not yet polished enough for use by healthcare administrators.

The project is hosted on Google Code. You can visit the project site as well as view the latest documentation.

I am writing a journal publication that describes this tool, its methods, and results from Vanderbilt University Medical Center. I will link to that publication when it is available, but until that time, I can release my thesis abstract.

A Framework for the Automatic Discovery of Policy from Healthcare Access Logs

by John M. Paulett

Healthcare organizations are often stymied in their efforts to prevent insider attacks that violate patient privacy. Numerous high-profile privacy breaches involving celebrities have brought this deficiency to the public's attention. In response, recent legislation aims to improve this situation by means of regulations and sanctions. While the public and government may demand more privacy safeguards, the current state-of-the-art tools in healthcare security, such as access control and auditing, will still be limited in their ability to solve the issue technically. These technologies are theoretically sound and tested in other industries, yet are suboptimal because no feasible methods exist for generating the policies these systems must act upon, due to the inherent complexities of modern healthcare organizations.

To address this shortcoming, we present a novel open-source framework, which mines low-level statistics of how users interact within the organization from the access logs of the organization's information systems. Our framework is scalable and capable of handling real world data integrity issues. We demonstrate the use of our tool by modeling the Vanderbilt University Medical Center. Additionally, we compare our framework's model to traditional experts who would attempt to manually generate a similar model.

Programming Clojure Review

When Stuart Halloway's Programming Clojure came out in May, I picked up a copy and have been reading through it and practicing with the Project Euler problems.

First off, it is a great book! Second off, it introduces a seriously interesting programming language.

Clojure is a Lisp dialect designed to run on the Java Virtual Machine (JVM). This combination is what makes Clojure very powerful: you get the power of a mature virtual machine with access to any existing Java libraries, combined with the dynamic, functional style of Lisp. Imagine being able to continue to use the code and libraries you and others have spent years developing from a new programming environment.

Layering a language on top of the JVM is not a new concept. Jython, JRuby, Groovy, and others did it years ago. But to some extent, these languages serve as a mere face-lift to the verbose syntax of Java. These languages were ported or created for the JVM to harness the power of existing Java libraries and platforms, while providing a prettier language.

While Clojure does offer a new syntax, it has a much more fundamental contribution to the Java world: strong concurrency primitives. (It should be noted that Scala offers this benefit as well.)

Clojure takes a hard-line approach to the arch-enemy of concurrency: shared state. Clojure allows programmers to easily write concurrent programs that can execute on multiple processors or cores. This ability comes from several facets of Clojure:

  • Immutable data
  • Preferring "pure" functions by making the programmer explicitly state where shared state is accessed
  • Multiple models for transactions and locks

Almost anyone who has experience writing threaded Java code knows how difficult it is to ensure that multiple threads can execute in parallel without causing awful race conditions and subtle bugs. Luckily, Clojure addresses these shortcomings by using its own concurrency models.

Stuart's book begins by discussing the syntax of Clojure and demonstrates Clojure's ability to interact with regular Java classes. The book moves into the list-based world of Lisp with functional programming techniques, including lazy evaluation. The book then moves into advanced topics, including concurrency, macros, and Clojure's form of polymorphism, multimethods. The book concludes with a short chapter on testing Clojure code, working with SQL databases, and doing web development.

Throughout the book, we work on building an Ant replacement in Clojure. The most interesting take-away from this ongoing example is the use of actual Clojure code for the build DSL, removing the need for Ant's build.xml. The code-as-data concept is very elegant, resulting in a DSL that is very clear yet lacks XML's verbosity.

I also found the Snake game to be an excellent example of an application sharing state in a safe way using the Clojure transaction primitives.

The book gave me a great appreciation of the Lisp family of languages. The only wart that bothered me about Clojure was that it seems that at times the programmer must be too aware of the specific implementation of Clojure on the JVM. For instance, Clojure's recursion is at times hampered by the lack of Tail Call Optimization on the JVM. Because of this lack, the programmer must determine which work-around is most appropriate for his problem. Regardless, Clojure feels very clean and precise.

The book also clearly provides best practices and examples of idiomatic Clojure.

I look forward to trying Clojure out in my projects. As I mentioned, I have been working through the Project Euler problems (my answers are definitely not ideal).

I would highly recommend the book to anyone who works in Java. I also believe the book is an excellent introduction to functional programming--I have read the Real World Haskell and Programming Erlang books with some difficulty, but Programming Clojure just clicked in my mind.

Install Eclipse Galileo (3.5) on Ubuntu Jaunty (9.04)

Eclipse 3.5, codenamed "Galileo," was released this week! While there is a team actively working on building an Ubuntu deb package, they do not yet have a package for Eclipse 3.5. I put together some super simple instructions for installing Eclipse 3.5.

I am going to perform a per-user installation into my home directory. If multiple people use Eclipse on the same computer, you may want to modify these instructions to install into /opt/. I am going to put the installation in ~/bin/packages/eclipse3.5. First, create the installation directory (change according to your own tastes):

mkdir -p ~/bin/packages
cd ~/bin/packages

Now download the appropriate tar.gz file from Eclipse. I am going to grab it from Amazon's CloudFront.
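If you are not sure which of the two archives you need, a small helper along these lines can map the output of `uname -m` to an archive name. This is only a sketch: the function name is made up for illustration, and the file names are an assumption based on the glob used in the tar command in this post, so check them against what the download page actually offers.

```shell
#!/bin/bash
# Hypothetical helper: map a machine architecture (as printed by `uname -m`)
# to the name of the matching Eclipse archive.
# The archive names are assumed, not taken from the download page.
eclipse_archive_for() {
    case "$1" in
        x86_64) echo "eclipse-java-galileo-linux-gtk-x86_64.tar.gz" ;;  # 64-bit
        i?86)   echo "eclipse-java-galileo-linux-gtk.tar.gz" ;;         # 32-bit (i386..i686)
        *)      echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}
```

Running `eclipse_archive_for "$(uname -m)"` prints the archive name for the current machine, or an error on anything other than 32-bit or 64-bit x86.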

For 64-bit Ubuntu:


For standard 32-bit Ubuntu:


Now unzip, and rename the directory (I want multiple versions of Eclipse):

tar xzvf eclipse-java-galileo-linux-gtk*.tar.gz
mv eclipse eclipse3.5

Great, almost there. I am going to create a file so that I can launch Eclipse from the command line. Create a new file ~/bin/eclipse, and in that file, put:

#!/bin/bash
~/bin/packages/eclipse3.5/eclipse -vmargs -Xms128M -Xmx512M -XX:PermSize=128M -XX:MaxPermSize=512M &> /dev/null &

(You can later change these values if you get out-of-memory errors from Eclipse.) Lastly, make the file executable:

chmod u+x ~/bin/eclipse
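If you expect to tweak the memory settings often, one option is to let the launcher read them from environment variables. The following is a minimal sketch of that idea, not what I actually run: the variable names ECLIPSE_HOME, ECLIPSE_XMX, and ECLIPSE_MAXPERM and both function names are hypothetical, and the defaults are simply the values from the launcher file in this post.

```shell
#!/bin/bash
# Sketch of a launcher with adjustable memory settings.
# ECLIPSE_HOME, ECLIPSE_XMX and ECLIPSE_MAXPERM are hypothetical variable names.

eclipse_vmargs() {
    # Print the JVM arguments, falling back to the defaults when a variable is unset.
    echo "-Xms128M -Xmx${ECLIPSE_XMX:-512M} -XX:PermSize=128M -XX:MaxPermSize=${ECLIPSE_MAXPERM:-512M}"
}

launch_eclipse() {
    # Default to the per-user install location unless ECLIPSE_HOME is set.
    local eclipse_home="${ECLIPSE_HOME:-$HOME/bin/packages/eclipse3.5}"
    # Word splitting of the vmargs string is intentional: each flag must be its own argument.
    # Output is discarded and Eclipse is started in the background.
    "$eclipse_home/eclipse" -vmargs $(eclipse_vmargs) > /dev/null 2>&1 &
}
```

To use it as the launcher, add a final line that calls `launch_eclipse`. Something like `ECLIPSE_XMX=1024M eclipse` then raises the maximum heap for a single run without editing the file.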

Install plugins

Yet again, Eclipse has changed its update manager (each time it gets better). I am going to add a few plugins for Python, Clojure, and Mercurial. Go to Help > Install New Software, click the "Available Software Sites" link, and add your update sites. For me they include:

Add Icon to the Panel

I like having an icon on my panel to quickly launch Eclipse.

To do so, right-click on your panel in a place with no other panel tool. Select "Add to Panel", then create a "Custom Application Launcher". You can enter /home/<USERNAME>/bin/eclipse (put in your username) as the command to run, and if you click the icon on the left, you can use the Eclipse icon in ~/bin/packages/eclipse3.5/.

Leave a comment if you run into issues or have a better method! You can also see my previous instructions for Eclipse 3.4, if you run into any issues--there were lots of great comments!

Vanderbilt Projects in the News

Some of the projects that I have been involved in at Vanderbilt University's Informatics Center have been featured in the news recently:

  • The New York Times had a feature about our use of the ILOG (now IBM) business rules engine to send out pager or SMS messages to physicians when patients have critical lab results.
  • Infection Control Today had an article about our sepsis detection application, which monitors patients' vital signs and lab results in real time to alert physicians to patients who may have sepsis.