Zach Weinberg: Hi, everyone. Really sucks to follow Amy. It's like the hardest job in the world. I always like to start with a little story. So, I called my wife last night and I said I've got two really big problems. Number one, Amy is talking in front of me, and number two, I don't have a joke. And I need a joke. And she goes, "Well, who's in the audience?" And I said, well, it's mostly physicians and scientists and engineers. She gave me the best advice possible. She said, "Just pander to the audience." So, here's my joke. It's not gonna be good. Why are statistical programming languages the best programming languages? Because they are. Okay. It's a pretty good one, I think. So, first and foremost, I want to join Amy in welcoming everyone to our first Research Summit. Our expectation, our hope, is that this is gonna be a recurring summit, something we do multiple times, maybe once a year or once every two years, but thank you everyone for joining us today.
So, Amy talked about where real world evidence is going, what the 40,000-foot view looks like over the next five years. What I wanna do is zoom in and talk about the 400-foot view, and in particular, what we're gonna do over the next 48 hours and how we're gonna talk about real world evidence today and tomorrow. So, I want to start, first and foremost, with why we're here in Washington, D.C. Or, I guess, technically, Maryland. There are two trends I want to cover. The first trend has to do with the growing demand for evidence in oncology. And for folks that have worked with Flatiron before, you've probably seen some version of this slide, but generally speaking, we have three key trends that are increasing the demand for evidence in oncology and, in particular, in oncology therapeutics.
First and foremost, we have more drugs coming to market than ever before. R&D is getting better, tools for scientists are getting better, and we expect the number of new therapies in development to grow exponentially. The second is that we're beginning to use those therapies in combination with each other, with different mechanisms of action. And then, finally, we have better diagnostic tools, such as next-gen sequencing, to sub-segment individual populations. So, this is a great problem to have, right? This is the science moving ahead. But when I look at this, and when we first learned about this problem six or seven years ago, I thought, this is a math problem. This is multiple therapies in combination with each other in really novel cohorts. The number of questions that we're gonna need to answer is going to grow, and it's not gonna grow linearly, it's gonna grow exponentially. And this is what's coming.
What's interesting about this is it's not just a pharma problem, it's not just a biotech problem or a physician problem or a patient problem. This is a regulatory challenge, as well. And groups like the FDA or EMA or payers or health authorities are all recognizing that this is going to be a challenge for them over the next 10 years. How do we make decisions? What's the dataset that we have to make those decisions?
So, Amy talked about this rapid adoption of electronic health records in oncology. It's actually true in healthcare, broadly. This is the second trend. And what's interesting about this trend, and I'll tie it back to Flatiron for a second, is that had this trend not happened, had the HITECH Act not been passed, I don't think Flatiron as a company would have been able to exist. This is the infrastructure. This is the fundamental shift in how data is captured that's allowed Flatiron as a company to even appear in the first place. And I've said this publicly, and to Nat and our investors: if we had tried to start this company even three or four years before we did, I don't think it would have worked. But now, what's interesting about this EHR adoption is we can think about aggregating source data, or electronic health record data, at scale, and we can do this without having to get on a plane, go to a cancer center, and pull a chart out of a manila folder. We can actually begin to use computers to aggregate this data.
But as everyone knows, the data is only as good as its source, and there's a tremendous amount of post-processing work that we need to do to get from source data to regulatory-grade. In particular, there are two problems here. One is rigorously processing the individual data points for high quality. The other is developing analytic methods to use this data, because it is observational and it is retrospective.
So, the goal for the next 48 hours and subsequent conferences, as well, is to focus, in particular, on this middle layer where that question mark sits. So, for Flatiron, and again for the next 48 hours, what we want to do is bring transparency to this process. What is in that question mark? What are all the steps that are happening? How are we doing things? And actually show, both with Flatiron examples as well as with our partners' examples, of what's going well and also what's not going well.
I want to talk for two seconds about what this process looks like. I pinged our product team and said, hey, can you give me just a timeline view of what happens from source data to analytics and interpretation? And the answer that came back was, "Well, this is gonna be a really complicated thing to put on a slide, because there's actually a tremendous number of steps." This is the 20,000-foot view of what actually goes on. We take an initial study design, we select a cohort, we do a whole bunch of work to actually generate that dataset, we do QA and QC on the dataset, it loops back around, and then, finally, we do a set of analytics and interpretation. But there is a tremendous amount of depth and detail in each and every one of these steps, and that's what we're gonna focus on: actually going into that detail.
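For readers who think in code, the loop just described can be sketched as a handful of stages: study design, cohort selection, dataset generation, QA/QC, and analytics. This is purely a hypothetical illustration in Python; the function names, the study design fields, and the simple missing-value QA check are assumptions made for the sake of the sketch, not Flatiron's actual pipeline.

```python
# Illustrative sketch only: study design -> cohort selection -> dataset
# generation -> QA/QC -> analytics. All names and logic are hypothetical.
from dataclasses import dataclass, field

@dataclass
class Study:
    design: dict
    cohort: list = field(default_factory=list)
    dataset: list = field(default_factory=list)
    qa_passed: bool = False

def select_cohort(study, patients):
    # Keep only patients matching the study's (toy) inclusion criterion.
    dx = study.design["diagnosis"]
    study.cohort = [p for p in patients if p["diagnosis"] == dx]
    return study

def generate_dataset(study):
    # "Abstract" the variables of interest from each cohort record.
    variables = study.design["variables"]
    study.dataset = [{v: p.get(v) for v in variables} for p in study.cohort]
    return study

def run_qa(study):
    # Toy QA/QC: pass only if no required value is missing.
    study.qa_passed = all(None not in row.values() for row in study.dataset)
    return study

def analyze(study):
    # Final analytics step; here, just count evaluable patients.
    if not study.qa_passed:
        raise ValueError("dataset failed QA; loop back and re-abstract")
    return {"n": len(study.dataset)}

patients = [
    {"diagnosis": "NSCLC", "stage": "IV", "diagnosis_date": "2017-03-01"},
    {"diagnosis": "CRC", "stage": "III", "diagnosis_date": "2016-11-20"},
    {"diagnosis": "NSCLC", "stage": "III", "diagnosis_date": "2018-01-15"},
]
study = Study(design={"diagnosis": "NSCLC", "variables": ["stage", "diagnosis_date"]})
result = analyze(run_qa(generate_dataset(select_cohort(study, patients))))
print(result)  # {'n': 2}
```

The point of the sketch is the shape of the process, not the content: in practice each stage hides enormous depth, and the QA step loops back into dataset generation rather than failing outright.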
So, before we do that, I wanna talk for two seconds about what the conference is not. This is not meant to be a Flatiron sales conference. You've heard from Amy and myself, and from here on out we wanna talk about the science. It's not a dog and pony show, even though it may look like one. That's not a real dog with a real hat, by the way. So, while we do have Flatiron presentations, we also have 19 non-Flatiron presenters at this conference, and for future conferences we hope that number increases over time. We want the percentage of people up on the stage who are Flatiron employees to actually come down as the science gets better.
So, let's talk about the goals for two seconds. Three core goals, and these won't come as a surprise. First and foremost, we wanna pull back the curtain on real world data and analytic methods. How do we actually do this? Let's show some of the details. The second is we wanna talk about applications of real world evidence. Amy laid out a compelling five-year vision. What are the current applications? What's working, and also what's not working? Where does RWE not actually have a place? And then, finally, what we're excited about in getting everyone in a room together is that this is the first inning. This is a long journey over the next 10 years, and we wanna hear from our partners and our customers on different ideas, ways that we can share data, methods, and learnings so we can improve this and do this work faster.
So, I'm gonna anchor, hopefully, the rest of the conference on this idea called the regulatory-grade quality checklist. Some folks may be familiar with this idea. We published it in December, I believe, and we've made some tweaks and improvements based on feedback and additional learnings, so this is a publication that we expect to live and grow as we learn more. So, I wanna start with what the checklist looks like today. And, in particular, what I wanna do is define each of these individual terms. What do we mean when we say clinical depth, or provenance, or scalability?
So, I'm gonna start with clinical depth. Clinical depth asks the question: does the dataset capture the important features and clinically relevant characteristics of disease, of treatment, and of outcome? For completeness: is the information available on a sufficient proportion of patients to enable clinically meaningful analyses? Have we benchmarked this data against some sort of gold standard? Longitudinality: do we have the ability to follow an individual patient longitudinally over time throughout their disease course, from diagnosis through outcome? Timeliness, or recency: can we monitor the evolving treatment landscape, and can we do analyses and access insights on a timely cohort? Is this a cohort that is 30 days old, or two years old, or five years old? Provenance: do we have traceability through the stack? Can we take an individual data point and trace it all the way back to the initial source document? Scalability: can we take this data model and scale it to larger cohorts, to different populations? Can we do this across different source systems, across multiple electronic health records? Generalizability: is the population representative of the broader patient population of interest? Have we found potential biases? Have we identified them? Have we addressed them? And then, maybe most importantly, quality monitoring: do we have processes and methods to actually check for data accuracy and quality?
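One way to make the checklist concrete is to treat it as a simple data structure that a dataset either satisfies or fails, dimension by dimension. The sketch below is a hypothetical illustration in Python: the dimension names follow the talk, but the scoring scheme, the threshold, and the `evaluate` function are all assumptions for the sake of the example, not part of the published checklist.

```python
# Hypothetical sketch: the checklist dimensions as data, plus a toy evaluator.
CHECKLIST = [
    "clinical depth",      # clinically relevant features of disease/treatment/outcome
    "completeness",        # available on a sufficient proportion of patients
    "benchmarking",        # compared against some gold standard
    "longitudinality",     # follows patients from diagnosis through outcome
    "timeliness",          # cohort recent enough to reflect current practice
    "provenance",          # each data point traces back to a source document
    "scalability",         # data model extends to larger cohorts and other EHRs
    "generalizability",    # representative of the broader population of interest
    "quality monitoring",  # ongoing processes check accuracy and quality
]

def evaluate(dataset_scores, threshold=1.0):
    """Return the checklist dimensions a dataset fails to satisfy.

    dataset_scores maps dimension name -> score in [0, 1]; a missing
    dimension counts as failing. The threshold is an illustrative choice.
    """
    return [d for d in CHECKLIST if dataset_scores.get(d, 0.0) < threshold]

scores = {d: 1.0 for d in CHECKLIST}
scores["benchmarking"] = 0.5  # e.g. not yet compared against a gold standard
print(evaluate(scores))  # ['benchmarking']
```

The useful property of writing it this way is that a dataset's gaps become an explicit, inspectable list rather than a vague overall impression of quality.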
So, this is a public document. This is something we are putting out into the field, and we hope it guides the conversation around real world evidence over the next few years. So, for the next two days, our goal is to take this checklist and bring it to life through the various sessions and into the details. Before we get to the actual substance, I wanna give a quick preview of some of the upcoming sessions and how they relate back to the checklist. This is not meant to be an exhaustive overview; these are just a few of the sessions we're gonna cover, but I wanna highlight a few different things. In the next session, the one following me, we're gonna talk about leveraging real world evidence for regulatory use. We're gonna focus on a few key features of regulatory-grade RWE: completeness of data, provenance of data, and quality monitoring systems. You're gonna hear from leaders from FDA, from Amgen, from Janssen, and get their perspectives on using real world evidence for regulatory purposes. Maybe the one I'm most excited about is our machine learning and NLP session. Our Flatiron engineering and product teams are gonna talk about where ML and NLP work, where they don't work, and what the right ways are to apply these kinds of modern technologies. We're gonna talk about how Flatiron uses these approaches to reduce the burden of data processing, how we scale our cohorts, and how we identify sub-cohorts using novel machine learning techniques.
Tomorrow, there are gonna be two sessions, in the morning I believe, about constructing external control arms. This is, obviously, one of the most impactful, if not the highest-impact, opportunities in real world evidence, but it's really early days. And so we wanna share our learnings on what's working and what's not working. We're really optimistic about the opportunity to use real world evidence as an external control, both as a complement and potentially a supplement to clinical trial data, and we wanna talk about how we're actually doing this.
Maybe the most interesting thing at the conference is our Abstraction Lab. So, most folks know about unstructured data processing, and they realize that we at Flatiron use a massive distributed team of expert workers to go through and look at this data, and we call that process abstraction. What most people have never actually seen is how that process works under the hood. They've never seen our tools. And so what we have right outside the conference is a set of, essentially, fake patients, where you can come in and actually use the Flatiron tool set to do abstraction. Most important in all of this is that we're gonna track who is the best at this. Gotta create a little bit of competition here. We're also gonna give out prizes to the top three winners. I was asked earlier if the prize would be free data. I asked our CFO. He laughed and didn't even give me an answer, so I think that's a no. But we are gonna give out prizes to the top abstractors during the session.
And then, finally, tomorrow afternoon, we're gonna close the conference with Dr. Bobby Green, who you saw in the video earlier. Bobby is a medical oncologist at Flatiron. He's a Flatiron employee. He oversees the clinical accuracy and the feature set for physicians in our electronic health record. What's really interesting about Bobby is that he's also a practicing medical oncologist. He sees patients one day a week at a clinic in Palm Beach, Florida, and he's an active user of the EMR. So, when you watch Bobby do rounds with patients and do documentation, the system that he's putting that data into is Flatiron's own EHR. And he's gonna talk about the potential impact of real world evidence on patients and how to get better therapies to market faster.
So, we are really excited to have everybody here. We have over 150 attendees; we thought it was gonna be more like 100, and we got to about 170 or so, which is great. There are 33 different organizations. We have 13 case studies. So, this conference has already somewhat exceeded our expectations in terms of attendance and people and cases. Most important for us is the non-Flatiron presenters, and we have 19 of them. As I mentioned before, we hope in future sessions the majority of speakers are non-Flatiron employees, so that this no longer fits on one slide, or maybe the headshots get really small, some way we can fit 50 or 60 people on the slide. And then, finally, before I hand it off, I just wanna thank everybody for attending and, in particular, for sharing. I know in many of the sessions we're gonna talk about things that typically customers would not share with each other. This is what we're really excited about. We think these case studies, these examples, these methods need to come out into the open, and we need transparency in terms of how everyone is doing things.