A major hurdle in surfacing real-world data comes from interpreting the complex clinical context and nuances within a patient’s health record. Because of their expertise, trained clinicians are best suited to this task. However, as researchers seek data on rarer cohorts, statistical techniques like machine learning are helping to scale human data curation efforts.
These same techniques are also emerging as valuable tools for researchers as they explore novel approaches to increase data quality and completeness, and to identify meaningful patterns in large sets of data. Together, these tools are proving essential for realizing the promise of real-world data.
Model-assisted cohort selection with bias analysis for generating large-scale cohorts from the EHR for oncology research
A team at Flatiron developed a technique that combines machine learning and natural language processing with human review to build large-scale research cohorts without sacrificing quality. See how it was applied to the selection of metastatic breast cancer patients.
Flatiron team members discuss the evolving role of machine learning and natural language processing (NLP) in real-world evidence along with appropriate applications for these technologies.
Opportunities and challenges in leveraging electronic health record data in oncology
Berger, M. L., et al. (2016). Future Oncology, 12(10), 1261–1274.
Two case studies illustrate both the opportunities and the challenges of extracting structured and unstructured data from EHRs.