Just over ten years ago, a pivotal piece of legislation, the HITECH Act, was enacted to motivate the implementation of electronic health records (EHRs) and supporting technology. Today, with EHR adoption at over 90 percent, we face a new challenge: generating real-world data with the quality, completeness and accuracy necessary for the most valuable applications across the healthcare ecosystem.
A major hurdle in surfacing real-world data comes from interpreting the complex clinical context and nuances within a patient’s health record. Trained clinicians, with their expertise, are best suited to this task. However, as researchers seek data on rarer cohorts, statistical techniques like machine learning are helping to scale human data curation efforts. These same techniques are also emerging as valuable tools for researchers as they explore novel approaches to increase data quality and completeness, and to identify meaningful patterns in large sets of data. Together, these tools are proving essential for realizing the promise of real-world data.
Learn more about emerging technology in RWE
Model-assisted cohort selection with bias analysis for generating large-scale cohorts from the EHR for oncology research
A team at Flatiron developed a technique that combines machine learning and natural language processing with human review to build large-scale research cohorts without sacrificing quality. See how it was applied to the selection of metastatic breast cancer patients.
The Role of Machine Learning & NLP in Real-World Evidence (45:37)
Flatiron team members discuss the evolving role of machine learning and natural language processing (NLP) in real-world evidence along with appropriate applications for these technologies. Learn more about machine learning and AI.
Opportunities and challenges in leveraging electronic health record data in oncology
Berger, M. L., et al. (2016). Future Oncology, 12(10), 1261–1274.
Two case studies illustrate both the opportunities and the challenges of extracting structured and unstructured data from EHRs.