My new favorite web application is the Google Books N-gram viewer, which I first learned about while reading “Uncharted” by Erez Aiden and Jean-Baptiste Michel (2013). The N-gram viewer lets the user easily chart the usage of a word or phrase over time as a percentage of the total words in the vast digital archive of Google Books. As a quick exercise, I compared the usage of two phrases, “electronic medical record” and “electronic health record” (the comparison is a frequent water cooler conversation in informatics circles), and realized that the output is worth sharing. The results were very clear: in 2001 there was an inversion in usage. Before 2001, “Electronic Medical Record” was the more common expression; after 2001, “Electronic Health Record” became the more frequently used. This shift precedes by several years the national policy changes aimed at promoting electronic health record technologies. Below is the interactive view; sorry for the whitespace at the bottom of the embedded N-grams.
Why 2001? Unfortunately, N-gram analysis is a descriptive tool and won’t answer this question. But I can offer some informed speculation. In the late 1990s, the EMR industry was preoccupied with concerns about Y2K. Rerunning the analysis with “y2K” included shows the following trends. Fortuitously, the first time I ran this, I used a lowercase “y” in “y2K”. You can rerun it with “Y2K” to see the trend in mentions of “Y2K”, but it will wash out the EHR/EMR trend lines.
While we can’t infer causality, the coincidence of the drop in Y2K discussion with the rise in EHR usage in 2001 is striking. With the Y2K issue behind them, EMR/EHR suppliers could focus on the future, and they realized that they needed to broaden their scope. In the early 2000s, in both industry and academia, the vision for medical informatics began to shift from systems designed to manage disease (hospital and physician practice) toward supporting personal health and other broader concepts with the long-term potential to prevent disease.
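For readers who export the yearly series from the viewer, the inversion described above can also be located programmatically rather than by eye. The following is only a minimal sketch: the function name `crossover_year` is my own, and the numbers are made-up placeholders in arbitrary units, not actual N-gram output.

```python
from typing import Dict, Optional


def crossover_year(a: Dict[int, float], b: Dict[int, float]) -> Optional[int]:
    """Return the first year in which series b exceeds series a,
    given that b did not exceed a in the previous shared year."""
    years = sorted(set(a) & set(b))  # only compare years present in both series
    for prev, cur in zip(years, years[1:]):
        if b[prev] <= a[prev] and b[cur] > a[cur]:
            return cur
    return None  # no inversion found in the overlapping years


# Placeholder values for illustration only (arbitrary units, NOT real data).
emr = {1999: 3.0, 2000: 3.2, 2001: 3.1, 2002: 3.0}
ehr = {1999: 1.0, 2000: 2.5, 2001: 3.4, 2002: 4.0}

print(crossover_year(emr, ehr))
```

With real exported data, the same function would report the year in which “electronic health record” first overtakes “electronic medical record”.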
There are a number of important caveats to this N-gram (and others). First and foremost, usage is not an indicator of which phrase is the “best” or most accurate. The N-gram simply measures the behavior of authors of published documents, specifically books in this analysis. Second, the rate at which Google Books can scan documents introduces a lag of 5-6 years, so 2008 is the most recent year available for analysis. The technology world has experienced immense change during those 5-6 intervening years. Since 2008 there have been major national policy changes aimed at promoting the adoption of EMR/EHR technologies, especially the Meaningful Use program. I would therefore expect the total volume of mentions of both EMR and EHR to be much higher now, though I would conjecture that EHR is still the more frequent term.
By definition, Google Books is limited to information found by scanning books and does not include the popular or industry press or peer-reviewed journals. To assess whether “electronic health record” was used earlier in books or in peer-reviewed journal articles, I performed a quick query of PubMed. This showed that “Electronic Health Record” didn’t begin to spike in journal references until 2004, though a PubMed search is limited to the text of the title and abstract.
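A per-year PubMed count like the one described above could be reproduced with NCBI’s E-utilities `esearch` endpoint. This is a sketch of one possible approach, not necessarily how the original query was run; the helper name `pubmed_count_url` is hypothetical, and the code only builds the request URL (restricting the phrase to title and abstract with the `[tiab]` field tag) without contacting the server.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pubmed_count_url(phrase: str, year: int) -> str:
    """Build an esearch URL asking how many PubMed records published in
    `year` contain `phrase` in the title or abstract."""
    params = {
        "db": "pubmed",
        "term": f'"{phrase}"[tiab]',  # exact phrase, title/abstract only
        "datetype": "pdat",           # filter on publication date
        "mindate": str(year),
        "maxdate": str(year),
        "rettype": "count",           # return only the hit count
    }
    return f"{ESEARCH}?{urlencode(params)}"


for year in (2003, 2004, 2005):
    print(pubmed_count_url("electronic health record", year))
```

Fetching each URL and reading the count it returns for a range of years would give a journal-side trend line to set beside the N-gram.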
I experimented with comparing N-grams for the acronyms, EMR against EHR. While the trend for EHR matches that of “Electronic Health Record”, with a sharp spike after 2001, usage of EMR varies, most likely because it also represents other commonly used expressions, including “electromagnetic radiation”.
Why does this matter? The inconsistent use of these terms generates considerable confusion, even among experts. This can impair productive discussions about the sources of healthcare data for research (see the “healthcare” N-gram below). The Electronic Medical Record is meant to describe the information systems used within a healthcare organization and is considered to be a legally binding “source of truth”, with stringent regulatory expectations. The Electronic Health Record is meant to leave room for the broader vision of integrating personal health information, captured outside of clinical settings, with EMR and other information sources. Likewise, EHR is intended to convey a new level of interoperability that transcends EMRs. While the usage of EHR has clearly surpassed EMR, reflecting the very appealing vision for the EHR, the majority of actual data available for research remains in the EMR subset of the EHR. In fact, it would be very interesting to assess the impact of Meaningful Use on the proportion of actual data stored in the EMR side of the equation, reflecting the successful expansion in use of EMR capabilities compared to the EHR functionalities (personal health record, interoperability). Efforts to develop the health record component have generally been unsuccessful (Google Health, Microsoft HealthVault), though the integration of patient portal functionality with commercial EMR systems is gaining traction due to recent expansions of Meaningful Use requirements. Early, though primarily anecdotal, successes of Health Information Exchanges (HIEs) have provided a taste of the interoperability vision of the EHR.
Ultimately, my preference has become to aim for an appropriate level of precision: if I really am referring to information exclusively derived from an EMR, I should use that expression and shouldn’t inflate the scope of what I’m doing by using the EHR label. If I am indeed discussing broader interoperability or patient-provided data sources, then EHR is the more appropriate term. It will be interesting to revisit this N-gram in 5-6 years to see how usage patterns have changed to reflect either the vision or the reality.
My primary goal in this exercise was to demonstrate an accessible but informative example of a new tool, the N-gram viewer. This query showed that the N-gram viewer can provide supporting information for a wide variety of discussions and can open up several lines of inquiry (why 2001 and not 2004, 1998, etc.). Comparison with PubMed can supplement an N-gram analysis. In later posts I’ll share some N-gram analyses that raise further questions. But for now, I’ll conclude with another intriguing N-gram, the impending inversion of “health care” and “healthcare”.