by Luke Shulman

How do you find a patient?

It seems all too common that a provider or health analyst would ask “How many patients are taking a statin?” or “How many patients didn’t have their physical?” These are the fundamental questions of population health. The advent of electronic health data was supposed to make answering these questions easy, and compared to a system on paper, it does. So, when we want to know

“Percentage of the following patients - all considered at high risk of cardiovascular events - who were prescribed or were on statin therapy during the measurement period:” –CMS347

as in a quality measure, what is this referring to?

This week CMS announced a new technical standard for asking these questions called Clinical Quality Language. If you think we don’t need a technical standard for analytical questions, think again, and let me show you why we might.

Consider how you would query your EHR for “How many patients are taking a statin?”


This screenshot, taken from the user manual of an EHR I will not name, illustrates the problem. The default is to search by individual drug name, meaning the user needs to look up more than 49 separate statin medications. Also, is it searching brand names or generic names?

In defense of this EHR, there is an option to search by drug class. But statins aren’t a drug class option. Atorvastatin, one of the most commonly prescribed, is classified as an “antilipemic agent”. So while we all know it’s a statin, that’s not the formal name.

Enter the population health products. This definitely helps. Here is the query to identify patients who were on a statin in 2016.


select distinct DRUG_ERA.person_id
from DRUG_ERA
join CONCEPT on DRUG_ERA.drug_concept_id = CONCEPT.CONCEPT_ID
where YEAR(drug_exposure_end_date) >= 2016 and YEAR(drug_exposure_start_date) <= 2016
and concept_code IN (259255,404011,404013,476351,597984,597990,597993,617311,757705,859419,859751,197903,197904,310404,310405,312962,314231,433849,476345,757702,861643,904458,904467)

This is fairly straightforward SQL. The data warehouse, in this case an instance of the OMOP Common Data Model, has a table called DRUG_ERA that holds a calculated range of exposure for each drug a patient is taking. Therefore, we simply set up a query for all patients whose exposure started in 2016 or earlier and did not end until 2016 or later.
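The overlap test behind that WHERE clause can be sketched in a few lines of Python. This is only an illustration of the logic, not part of any OMOP tooling; the function name `exposed_during_year` and the sample exposure dates are hypothetical.

```python
from datetime import date


def exposed_during_year(start: date, end: date, year: int) -> bool:
    """Return True if the exposure interval [start, end] touches the given year.

    Mirrors the SQL condition: YEAR(end) >= year AND YEAR(start) <= year.
    """
    return end.year >= year and start.year <= year


# Hypothetical exposure ranges, for illustration only: label -> (start, end)
eras = {
    "started 2015, ended 2017": (date(2015, 6, 1), date(2017, 3, 1)),
    "entirely in 2016": (date(2016, 2, 1), date(2016, 5, 1)),
    "ended in 2015": (date(2014, 1, 1), date(2015, 12, 31)),
    "started in 2017": (date(2017, 1, 1), date(2017, 6, 1)),
}

# Keep only the exposures that overlap calendar year 2016.
on_statin_2016 = [
    label for label, (start, end) in eras.items()
    if exposed_during_year(start, end, 2016)
]
print(on_statin_2016)
```

Only the first two ranges qualify: one spans all of 2016 and the other sits inside it, while the remaining two end before or start after the year.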

However, there are several things my query uses that either don’t exist or would be different in other systems.
* YEAR(): In SQL Server, this function returns the year of any input date as an integer. On other systems, the syntax would be EXTRACT(YEAR FROM drug_exposure_end_date).
* DRUG_ERA: This is obviously a great table for research and population health, as it has calculated periods of exposure to certain drugs. But do I have a guarantee that every system queried will have the same structure?
* RxNorm codes: RxNorm codes are generally the default vocabulary for drug normalization. But nothing in my syntax declares that I am using RxNorm codes. I just need to know that the primary concept used here is not Medispan or another encoding.
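To make two of these points concrete, here is a rough sketch that runs a similar query against an in-memory SQLite database from Python. It assumes the CONCEPT table carries a `vocabulary_id` column, as it does in the OMOP Common Data Model, and it reuses the column names from my query above. The sample rows and the `OtherVocab` vocabulary are invented for illustration. Note that SQLite needs yet another spelling for the year extraction, `strftime('%Y', ...)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE concept (
        concept_id INTEGER PRIMARY KEY,
        concept_code TEXT,
        vocabulary_id TEXT
    );
    CREATE TABLE drug_era (
        person_id INTEGER,
        drug_concept_id INTEGER,
        drug_exposure_start_date TEXT,
        drug_exposure_end_date TEXT
    );
    -- Hypothetical rows: the same code string appears in two vocabularies.
    INSERT INTO concept VALUES
        (1, '259255', 'RxNorm'),
        (2, '259255', 'OtherVocab');
    INSERT INTO drug_era VALUES
        (1, 1, '2015-06-01', '2017-03-01'),
        (2, 2, '2016-01-01', '2016-06-01'),
        (3, 1, '2014-01-01', '2015-12-31');
    """
)

# The vocabulary is declared explicitly instead of being assumed,
# and the year is extracted with SQLite's strftime rather than YEAR().
query = """
    SELECT DISTINCT de.person_id
    FROM drug_era AS de
    JOIN concept AS c ON de.drug_concept_id = c.concept_id
    WHERE c.vocabulary_id = ?
      AND c.concept_code IN ('259255')
      AND CAST(strftime('%Y', de.drug_exposure_end_date) AS INTEGER) >= ?
      AND CAST(strftime('%Y', de.drug_exposure_start_date) AS INTEGER) <= ?
    ORDER BY de.person_id
"""
person_ids = [row[0] for row in conn.execute(query, ("RxNorm", 2016, 2016))]
print(person_ids)
```

Person 2 has an exposure in 2016, but the matching code belongs to a different vocabulary, so the explicit `vocabulary_id` filter excludes them. Without that filter the query would silently count them as being on a statin.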

OK, so now we see the problem. The basic tools in the EHRs lack the expressiveness to do this reporting reliably. The population health products, even the open source ones, do so in a way that isn’t going to be consistent across multiple systems. There have been several efforts to address this problem in the past, and some notable success stories.

The Sentinel Initiative has deployed the exact same data model and SAS analytical code at multiple research partners in both the public and private space. Sponsored by the FDA, the program allows the agency or other drug researchers to deploy the same analytical package to multiple sources and get the same result. Now in its 10th year, the group has a number of success stories and maybe in a small way has helped change the way drugs are monitored and approved.

Other initiatives have tried to leverage various HL7 standards to achieve the same results.

Enter Clinical Quality Language

So, this week CMS announced that going forward all of its quality measure standards would be expressed in a new technology, Clinical Quality Language (CQL). How does CQL help our problem? Well, let’s take a look at the syntax, which I borrowed from the CMS Measure 347 Draft.

using QDM

//PART 1
valueset "High intensity statin therapy": 'urn:oid:2.16.840.1.113762.1.4.1047.97'
valueset "Low intensity statin therapy": 'urn:oid:2.16.840.1.113762.1.4.1047.107'
valueset "Moderate intensity statin therapy": 'urn:oid:2.16.840.1.113762.1.4.1047.98'
valueset "Statin Grouper": 'urn:oid:2.16.840.1.113762.1.4.1110.19'

//PART 2
parameter "Measurement Period" default Interval[DateTime(2016,1,1), DateTime(2016,12,31)]

context Patient

//PART 3
define "Statin Active":
	["Medication, Active": "Low intensity statin therapy"]
		union ["Medication, Active": "Moderate intensity statin therapy"]
		union ["Medication, Active": "High intensity statin therapy"]
		union ["Medication, Active": "Statin Grouper"]

define "Statin Discharge":
	["Medication, Discharge": "Low intensity statin therapy"]
		union ["Medication, Discharge": "Moderate intensity statin therapy"]
		union ["Medication, Discharge": "High intensity statin therapy"]
		union ["Medication, Discharge": "Statin Grouper"]

Here we see a fully expressive query language designed solely to express questions about patients. In part 1, we are able to state which groups of raw codes we want to use, each with a globally unique identifier for the grouping. That means the developer never has to copy and paste a specific drug, procedure, or diagnosis code.

In part 2, we have the period defined as a year-long interval. In part 3, we can define a new data concept, “Statin Active”, which refers to our code groupings and also refers to “Medication, Active”, a known data type of all active medications the patient is taking across various periods, similar to the DRUG_ERA table from our SQL query above. I can also check separately whether the patient was given the medication at discharge, using “Medication, Discharge”. A patient going home with a few doses is very different from a patient seeing their PCP who is on a monthly dose, and that difference is easy to express here.
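For readers who don’t read CQL yet, here is a rough Python analogue of what parts 1 and 3 express. It is a sketch of the idea only, not how a CQL engine is implemented. The codes inside each value set are placeholders (real value sets are resolved from their OIDs), and the `Medication`, `retrieve`, and `statins` names are hypothetical.

```python
from dataclasses import dataclass

# Placeholder value sets: name -> set of drug codes.
# Real value sets would be looked up by OID; these codes are invented.
VALUE_SETS = {
    "Low intensity statin therapy": {"A1", "A2"},
    "Moderate intensity statin therapy": {"B1"},
    "High intensity statin therapy": {"C1"},
    "Statin Grouper": {"A1", "B1", "C1", "D1"},
}


@dataclass(frozen=True)
class Medication:
    data_type: str  # e.g. "Medication, Active" or "Medication, Discharge"
    code: str


def retrieve(records, data_type, value_set_name):
    """Rough analogue of the CQL retrieve [data_type: value_set]."""
    codes = VALUE_SETS[value_set_name]
    return {m for m in records if m.data_type == data_type and m.code in codes}


def statins(records, data_type):
    """Union of the four retrieves, like the 'Statin Active' definition."""
    result = set()
    for name in VALUE_SETS:
        result |= retrieve(records, data_type, name)
    return result


# One fake patient record with three medication entries.
patient = [
    Medication("Medication, Active", "B1"),
    Medication("Medication, Discharge", "C1"),
    Medication("Medication, Active", "Z9"),  # not in any statin value set
]

statin_active = statins(patient, "Medication, Active")
statin_discharge = statins(patient, "Medication, Discharge")
print(statin_active)
print(statin_discharge)
```

The same value sets are reused for both definitions. Only the data type changes, which is what lets the measure tell an active prescription apart from a discharge medication without repeating a single drug code.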

As a query language, CQL does fix those three issues I highlighted above. But it could also be seen as a completely new programming language whose implementation across a myriad of health systems is by no means assured. An EHR vendor, commenting on their pilot of CQL for use in their system, was very direct in their criticism:

In summary we find that CQL is probably not a viable approach given the open-ended nature of what appears to be a new fully-blown programming language. As a software vendor, we probably cannot devote the effort needed to create an interpreter, compiler or translator to enable evaluation of the entirety of the CQL/ELM. Supporting the full power of the CQL would probably be too expensive an undertaking. source

CQL will eventually be rooted in FHIR, the new resource and RESTful interoperability standard. It has an expressive (if verbose) syntax and successfully abstracts the logic of clinical questions away from the underlying architecture of the systems it runs on. Later versions of FHIR include an endpoint for arbitrary CQL to be executed across the available resources.

With the medical directors and informatics folks I work with, I’d love to be able to say “ok define what you mean by taking a statin.” I’d love to be able to show them how we differentiate between a statin “order” and a statin “dispense event”. I’d love to be able to define in a single artifact all of the tricks and logical loops that medical science can come up with.

For those reasons, I love CQL and am really excited about its adoption. But there are a few problems. First, that CQL I wrote above, with its elegant expressions and definitions? Yeah, I can’t run it. I don’t know if it works. I am pretty sure it would work: I checked its syntax, and it works when tested on Bonnie, a testing tool for measures and CQL. But there isn’t a good reference implementation for executing CQL across a large clinical store. The main library for execution, CQL_Execution_Framework, runs on staged flat files as a prototype. A few groups have tied it into JSON document databases, but these don’t seem fully baked and none really extend outside the JavaScript realm. This is something that’s really missing. How would CQL integrate into a broader execution framework? Hopefully with this CMS announcement, we will see some of those efforts start to bear fruit.

By the way, here is my result from Bonnie, happily returning my one fake patient, Sam Sifton.


ONC has kept the development process for CQL and the quality measures very open, a stance that I wholeheartedly approve of. The quote I used above came from a public discussion on that forum, but since I am not sure that the author (or their company) ever meant for it to be publicized, I’ve posted it without their name attached.