HMS - Countway Library of Medicine - Director's Blog: Informatics


2014-09-23

Worrisome Trends in Ebola

My colleague John Brownstein just shared these worrying projections for Ebola epidemiology.

2014-08-06

Near real-time tracking of Ebola

If you want to see just how Ebola is spreading (and soon regressing?), this article provides the view and insight into how it’s done. Ten years ago, these data would percolate slowly, compiled by public health authorities, and eventually appear in an academic or lay publication. Now, through the foundations of computational epidemiology established by Dr. Brownstein, we are able to directly monitor this in-progress disaster. Now that we have a good afferent circuit, let’s hope the efferent circuit is even more robust.

2014-03-27

Classification system worthy of legislative intervention?

The ICD-10 diagnostic classification system is due to be adopted by US healthcare systems this year. However, new legislation would defer that change by at least a year. Who would have thought that a group of statisticians working on a list of the causes of death in 1891 would cause billions of dollars to be spent and thousands of hours invested in debating what that list should look like?

This fine system of classification that would have made Linnaeus’ head spin includes classic diagnoses such as:

V91.07XD (burn due to water-skis on fire, subsequent encounter)
W56.22XA (struck by orca, initial encounter)
W56.12XA (struck by sea lion, initial encounter)
V97.33XD (sucked into jet engine, subsequent encounter)

The orcas may have to wait another year before getting their medical recognition.

2013-07-17

Innovation to grow (and track the growth of children).

I've written in the past about the bottleneck in innovation in electronic health records and how the design of such systems with substitutable "apps" would go a long way towards overcoming that bottleneck. In that context, hats off to the design team at Fjord who just won the prestigious Red Dot design award for their growth chart/growth tracking applications that are now being adopted by several vendors.

2013-04-08

Getting Big About Mapping Dengue

Here's a very nice application of lightly used data sources about Dengue, a scourge of underdeveloped countries. As in so many areas of public health, this huge health burden is woefully under-documented. In the absence of a vetted vaccine, understanding where it is endemic is essential for the application of scarce preventive resources. This group of investigators has cleverly used a number of public but under-used data sources, including the published literature, to create a predictive map of where 390 million infections per year are occurring.


2013-01-08

Epidemic or epiphenomenon?

A number of crowd-sourced infection monitors such as FluNearYou (by our own Dr. J. Brownstein) have reported an apparent upsurge in influenza-like illnesses over the last week. The CDC has not yet reported the same trend. If the CDC then confirms this early warning, it will represent a transition from proof of principle of the citizen as health monitor (see here and here) to general public health utility. If not, then we may be witnessing an outbreak of hypochondria.

2012-12-03

Take this ontology and shove it. Or, why classification matters.

I was recently called out by one of my colleagues for saying that ontologies were boring, this despite my own doctoral work on knowledge representation. Motivating my glib comment was an image of a group of pasty-faced individuals gathered around a large boardroom table and discussing which angel fit on which pin. Events from this past weekend are a reminder of why such glibness is not helpful.

The American Psychiatric Association has just approved a set of updates, revisions and changes to the reference manual (DSM5) used to diagnose mental disorders. Among the changes are those redefining the inclusion and exclusion criteria for autistic disorders. By changing which children are classified as having an autistic disorder, parents will be made to feel more or less comfortable having a child carrying the diagnosis. Just as importantly, insurance companies and school programs might shift their criteria that determine which child and family gets what kind of support and at what cost. In the near term, clinical trials for the treatment of autism may not include the same patients as they would have prior to this retaxonomization.

So, are ontologies boring? Perhaps. But they certainly belong to the class of hugely important societal constructs.

Hat tip: David Osterbur.

2012-11-20

Learning from the FDA

It is not too often that one is driven to read a report from a regulatory agency. Even rarer are the instances when we find prismatic examples of engineering and organizational leadership in these reports. That makes this strategic plan of the FDA Information Management and Office of Information Management all the more remarkable. As an inducement to read the full report, here are some results that the informatics and IT departments of many organizations, academic and industrial, would envy.

Reducing the number of servers from 397 to 265 (by a virtualization and hosting approach).
Not coincidentally, availability (i.e., time not spent down) increased from 98.3% to 99.9996% (the difference between over 6 days and 30 seconds of unscheduled downtime).
Billions of intrusion attempts against FDA IT Systems annually with no major information security breaches.
Supporting annual 5-15% increases in IT capability without increases in budget.
Training budget for personnel eliminated and training activities of personnel increased based on savings from reduced external consultant fees.
Annual decrease in cost of data storage.
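
The availability figures in the list above can be turned into downtime with a little arithmetic. The sketch below is a back-of-the-envelope check, not something taken from the FDA report: it assumes a 365-day year of continuous service, and the function name is made up for illustration. Under those assumptions, 98.3% availability works out to about 6.2 days of downtime per year, and 99.9996% to roughly two minutes per year, so the 30-second figure quoted from the report may rest on a different measurement window or rounding.

```python
# Assumption: a 365-day year of continuous (24x7) service.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
SECONDS_PER_DAY = 24 * 60 * 60


def annual_downtime_seconds(availability_percent: float) -> float:
    """Return the downtime per year, in seconds, implied by an availability percentage."""
    return (1 - availability_percent / 100) * SECONDS_PER_YEAR


# The two availability levels quoted in the post.
before = annual_downtime_seconds(98.3)
after = annual_downtime_seconds(99.9996)

print(f"98.3%    availability: {before / SECONDS_PER_DAY:.1f} days of downtime per year")
print(f"99.9996% availability: {after:.0f} seconds of downtime per year")
```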

2012-08-08

Billions and billions of gene expression measurements.

Let's say you are looking for a disease biomarker. Hopefully, one better than prostate specific antigen. Next time you or your student reach for a pipette to see if a gene is expressed in a particular tissue or disease, perhaps you should first check the public databases of gene expression. As outlined in this article, we have now hit the one million array mark. That is, one million arrays measuring gene expression across thousands of conditions (tissues, diseases, pharmacological or environmental perturbations). And each array measures tens of thousands of genes, so these corpora hold billions of gene expression measurements. That means you'll immediately be able to see if your favorite gene is uniquely expressed in a tissue in a specific disease. Or not.

Another way to think about these corpora is that they constitute one of the largest open access biomedical libraries. A model for clinical research to emulate?

Hat tip: Atul Butte.

2012-07-24

Unstandardized standards

An insightful naïf learning about the difficulty of sharing one electronic health record from one hospital to another might reasonably ask "Why don't they just create a standard for data sharing so that I can install or delete health apps at will and view my data on several different electronic health record systems?" An expert will then inform that impertinent naïf that it's much more complicated than she understands and that the standards already exist. When challenged, the expert will cite several august committees which have ratified standards such as the Continuity of Care Document (CCD). At this point, our naive protagonist should refer the expert to this blog entry by Josh Mandel. If by then, the expert is not holding his hands to his ears, he will explain that all standards are evolving entities and that these challenges are just the expected missteps on the path of convergent evolution to interoperable samadhi.

2012-03-01

Get paid to play

Earlier, I described the SHRINE distributed query system across 6 million patients with 10 billion facts. If you are a member of the Harvard Medical School faculty (with employment at one of the affiliated hospitals) you now have the opportunity to get money and glory (more the latter than the former) to spin clinical data into biomedical gold. Details on the context can be found here: http://catalyst.harvard.edu/services/pilotfunding/shrine.html

If you have questions, use this email contact.

2012-02-24

This terminology goes one louder.

There has been considerable controversy about the merit and risk of upgrading the terminology that is used in the USA to bill for most healthcare transactions: ICD9 to ICD10. However, given some of the concerns about the adequacy of ICD10, many are now advocating that we skip ICD10 (with costs of millions of dollars per large hospital and tens of billions of dollars, nationwide) and immediately proceed "one louder" to ICD11. It is argued that the investment will then be far more durable and with a more favorable impact on cost and quality accounting in healthcare. Others argue that we should go for the bird in the hand. No doubt many librarians could opine knowledgeably about the costs and benefits of changing classification systems and Linnaeus would be impressed by how many now labor to classify diseases, drugs and procedures.

Hat tip: Ken Mandl.

2012-02-16

Research by the numbers

What if you could mine the 10 billion medical facts across 6 million (anonymous) patients in five Harvard affiliated hospitals to ask an important and timely question? What are the other diseases or disorders associated with autism? How has the pharmacological treatment of inflammatory diseases changed over the last five years? Are there gender differences in prevalence of the infections in autoimmune diseases? How is the prevalence of diabetes mellitus changing in young adults?

For the first time, if you are an eligible faculty member (or one of their fellows) in one of the five hospitals, you can productively seek answers to these questions. The Shared Health Research Information Network (SHRINE) helps researchers overcome one of the greatest problems in population-based research: compiling large groups of well-characterized patients. Eligible investigators may use the SHRINE web-based query tool to determine the aggregate total number of patients at participating hospitals who meet a given set of inclusion and exclusion criteria. The criteria are currently demographics, diagnoses, medications, and selected laboratory values. Because counts are aggregate, patient privacy is protected.
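
To make the idea of an aggregate count concrete, here is a toy sketch of a query with inclusion and exclusion criteria run separately at each site, with only the counts combined. It is not SHRINE's implementation or API: the site names, the patient records, and the `count_matching` function are all invented for illustration, and only diagnoses are modeled.

```python
def count_matching(patients, include, exclude):
    """Count patients who carry every `include` diagnosis and none of the `exclude` ones."""
    return sum(
        1
        for diagnoses in patients
        if include <= diagnoses and not (exclude & diagnoses)
    )


# Hypothetical data: each patient is just a set of diagnosis labels.
sites = {
    "hospital_a": [
        frozenset({"autism", "epilepsy"}),
        frozenset({"autism", "diabetes"}),
        frozenset({"asthma"}),
    ],
    "hospital_b": [
        frozenset({"autism"}),
        frozenset({"autism", "epilepsy"}),
        frozenset({"diabetes"}),
    ],
}

# Each site answers with a single number; no patient-level record leaves the site.
per_site = {
    name: count_matching(patients, include={"autism"}, exclude={"diabetes"})
    for name, patients in sites.items()
}
total = sum(per_site.values())

print(per_site)
print("Aggregate total:", total)
```

The design point the sketch tries to capture is that the investigator sees only totals, which is why the post can say that privacy is protected by aggregation.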

So, whether you are seeking a study cohort, preliminary studies for a grant proposal, or evaluating an epidemiological hypothesis, take this new tool for a spin and start translating this large mass of hard-won data into useful biomedical knowledge.

2011-10-09

Faster, cheaper and in control

Let's say you have a problem (e.g., aligning the world's literature to define the phylogenesis of the components of the current world-wide written corpus for scholarly attribution and automatic detection of plagiarism) that requires a computational solution. But it's taking days for the software to run. Buying a faster, bigger computer might provide some speedup, but what if you could get a 1,000-fold improvement through a better implementation of the algorithm at the core of your software? Here's your chance to see if it can be done through a contest hosted by the Harvard Catalyst. Will the Overmind answer your most difficult computational questions?

2011-09-27

Out in the Open

This impressive compilation from GOOD (the data issue) documents the rapid growth of Application Programming Interfaces that provide third-party software developers with access to, and the ability to repurpose, large and very useful data sets. This growth is driven both by altruism and self-interest and represents a dramatic refutation of the skepticism towards the open data movement of merely a decade ago.

Hat tip: David Kreda.

2011-09-26

Take two aspirin and an algorithm and call me in the morning.

This note from the American Medical Association nicely summarizes the recent approval of certification in Clinical Informatics by the American Board of Medical Specialties. It represents the closest encounter between clinical training and librarianship to date. We'll see what it portends for relative compensation.

2011-09-08

Weighty searches

Billions of Google searches may seem to be evanescent, ephemeral, electronic abstractions, but this article suggests that they leave a weighty, grimy residue. The company’s electrical consumption (mostly the data centers) is said to create a carbon footprint of 1.5 million tons in a single year. That is possibly much less than the footprint left by the car/bus trips and phone calls that have been made unnecessary by web searches. But it does suggest that better search engines (i.e., those that provide the sought-for answer in fewer searches) will also be greener, even without more efficient computational hardware.

2011-07-22

The unbearable effectiveness of data

Researchers in artificial intelligence (AI) of the 1980s, librarians and aficionados of the Semantic Web have a shared faith: the unique value of human-designed knowledge structures, whether they be taxonomies, ontologies or metadata. These knowledge representations are seen as providing important leverage in information retrieval, knowledge discovery, and decision support. In this context, I was recently reminded by Alal Eran of an article by researchers at Google about the value of BIG data. These researchers (one of whom wrote Paradigms of Artificial Intelligence Programming: Case Studies in Common Lisp, a wonderful book widely appreciated by the AI community that includes applications for expert systems) describe how statistical methods applied to trillion-word corpora can automatically support the aforementioned information tasks without requiring human annotation or categorization. It may be that human-derived annotations (whether crowd-sourced from the web or carefully curated in the monasteries of the ivory tower) can be used synergistically with purely statistical learning methods, but that has yet to be convincingly demonstrated. Until then, those of us working on genomic research will see how far we can get just with data, particularly those obtained in the course of healthcare.

For those of us in libraries and those of us who are librarians, there is now an active, unresolved debate about what value there is in human annotations and metadata. If there is value, at what cost? And if it is cost-effective, how do we demonstrate the efficacy? Our universities' leaders will be interested in the answers and so will our colleagues at Google.

2011-03-08

Let the games begin!

Do you think that you can create the new software app that will revolutionize healthcare? Do you agree that substitutability will allow us all to innovate healthcare practice? As detailed on the challenge.gov website, there is now a very short-term opportunity to "walk the talk" for a modest prize and immodest glory.

2011-03-03

Neat or scruffy?

Is your desk topped by the monumental accretions of your work or does it retain its pure sheen of Scandinavian simplicity? It turns out that the dichotomy between the "Neats" and the "Scruffies" cuts across several broad swathes of the human condition. Among these are the arcane arts of taxonomization and representation so well known to librarians, botanists, and engineers working on electronic health record interoperability. On the latter topic, the President's Council of Advisors on Science and Technology (PCAST) has issued a report on how health information technology will or will not be effectively used to improve healthcare. Given the work we are pursuing on substitutability, our own Ben Adida shared a perspective on the report.