January 2009

Parsing HL7 with Python

I've had a need to parse Health Level 7 (HL7) version 2.x messages from Python, thus I created python-hl7. The library allows for easy, key-based access of all the elements in an HL7 message. See the release announcement for download information.

HL7 is a communication protocol and message format for health care data. It is the de facto standard for transmitting data between clinical information systems and between clinical devices. The version 2.x series, which is often is a pipe delimited format is currently the most widely accepted version of HL7 (version 3.0 is an XML-based format).

As an example, let's create a HL7 message:

>>> message = 'MSH|^~\&|GHH LAB|ELAB-3|GHH OE|BLDG4|200202150930||ORU^R01|CNTRL-3456|P|2.4\r'
>>> message += 'PID|||555-44-4444||EVERYWOMAN^EVE^E^^^^L|JONES|196203520|F|||153 FERNWOOD DR.^^STATESVILLE^OH^35292||(206)3345232|(206)752-121||||AC555444444||67-A4335^OH^20030520\r'
>>> message += 'OBR|1|845439^GHH OE|1045813^GHH LAB|1554-5^GLUCOSE|||200202150730||||||||555-55-5555^PRIMARY^PATRICIA P^^^^MD^^LEVEL SEVEN HEALTHCARE, INC.|||||||||F||||||444-44-4444^HIPPOCRATES^HOWARD H^^^^MD\r'
>>> message += 'OBX|1|SN|1554-5^GLUCOSE^POST 12H CFST:MCNC:PT:SER/PLAS:QN||^182|mg/dl|70_105|H|||F\r'

We call the hl7.parse() command with string message:

>>> import hl7
>>> h = hl7.parse(message)

We get a n-dimensional list back:

>>> type(h)
<type 'list'>

There were 4 segments (MSH, PID, OBR, OBX):

>>> len(h)

We can extract individual elements of the message:

>>> h[3][3][1]
>>> h[3][5][1]

We can look up segments by the segment identifer:

>>> pid = hl7.segment('PID', h)
>>> pid[3][0]