Master's Thesis & Open-Source Tool

On July 15th, I successfully defended my Master's Thesis in Biomedical Informatics at Vanderbilt University. This defense was the culmination of 2 years of work. The thesis focuses on extracting organizational structure and relationships from the audit logs of clinician information systems. This work has potential applications in the improvement of delivery of care and improving the security of patients private medical data.

As part of this work, I developed an open-source tool for analyzing audit logs. Licensed under an Apache 2.0 License, the Healthcare Organizational Relational Network Extraction Toolkit (HORNET) is a Python framework for plugins that analyze healthcare audit logs. The tool is fully functional, but is not yet polished enough for use by healthcare administrators.

The project is hosted on Google Code ( You can visit the project site as well as view the latest documentation

I am writing a journal publication that describes this tool, its methods, and results from Vanderbilt University Medical Center. I will link to that publication when it is available, but until that time, I can release my thesis abstract.

A Framework for the Automatic Discovery of Policy from Healthcare Access Logs

by John M. Paulett

Healthcare organizations are often stymied in their efforts to prevent insider attacks that violate patient privacy. Numerous high-profile privacy breaches involving celebrities have brought this deficiency to the public's attention. In response, recent legislation aims to improve this situation by means of regulations and sanctions. While the public and government may demand more privacy safeguards, the current state-of-the-art tools in healthcare security, such as access control and auditing, will still be limited in their ability to solve the issue technically. These technologies are theoretically sound and tested in other industries, yet are suboptimal because no feasible methods exist for generating the policies these systems must act upon, due to the inherent complexities of modern healthcare organizations.

To address this shortcoming, we present a novel open-source framework, which mines low-level statistics of how users interact within the organization from the access logs of the organization's information systems. Our framework is scalable and capable of handling real world data integrity issues. We demonstrate the use of our tool by modeling the Vanderbilt University Medical Center. Additionally, we compare our framework's model to traditional experts who would attempt to manually generate a similar model.