Current Research and Scholarly Interests
We develop methods to analyze large unstructured data sets for data-driven medicine. We use ontologies to annotate, index, and analyze Big Data in biomedicine to enable data-driven decision making in medicine and health care. Our research group is part of the Center for Biomedical Informatics Research at Stanford and the National Center for Biomedical Ontology.
Data-driven medicine: The goal is to combine machine learning, text mining, and prior knowledge in medical ontologies to discover hidden trends, build risk models, drive data-driven decision making, and support comparative effectiveness studies. We have developed methods that transform unstructured patient notes into a de-identified, temporally ordered patient-feature matrix (imagine it as row = patient, column = medical concept, 1 = present, 0 = absent). With the resulting high-throughput data, we can monitor for adverse drug events, learn drug-drug interactions, identify off-label drug usage, generate practice-based evidence for difficult-to-test clinical hypotheses, generate phenotypic fingerprints, and build predictive models. We also work on combining multiple information sources for drug safety surveillance, an effort that was recently the focus of a commentary titled “Advancing the Science of Pharmacovigilance.”
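As a rough illustration of the patient-feature matrix described above, the following minimal Python sketch builds a binary matrix from concept annotations. It assumes the notes have already been de-identified and annotated with ontology concepts; the tuple layout, the sample records, and the function name `build_patient_feature_matrix` are hypothetical and are not the group's actual pipeline.

```python
from collections import defaultdict

# Hypothetical, already de-identified annotations:
# (patient_id, day_offset_from_first_note, medical_concept)
annotations = [
    ("p1", 0, "hypertension"),
    ("p1", 30, "myocardial infarction"),
    ("p2", 5, "hypertension"),
    ("p2", 12, "rofecoxib"),
    ("p1", 10, "rofecoxib"),
]

def build_patient_feature_matrix(annotations):
    """Return (patients, concepts, matrix, first_seen).

    matrix[i][j] is 1 if concept j was ever mentioned for patient i, else 0.
    first_seen[patient][concept] keeps the earliest day offset, which
    preserves the temporal ordering of concepts within each patient.
    """
    patients = sorted({p for p, _, _ in annotations})
    concepts = sorted({c for _, _, c in annotations})
    first_seen = defaultdict(dict)
    for patient, day, concept in annotations:
        previous = first_seen[patient].get(concept)
        if previous is None or day < previous:
            first_seen[patient][concept] = day
    matrix = [
        [1 if concept in first_seen[patient] else 0 for concept in concepts]
        for patient in patients
    ]
    return patients, concepts, matrix, first_seen

patients, concepts, matrix, first_seen = build_patient_feature_matrix(annotations)
```

Keeping the earliest day offset per concept is one simple way to retain temporal order; it allows questions such as whether a drug mention preceded an adverse event mention for the same patient.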
Annotation Analytics: To understand the “gene lists” that result from the analysis of high-throughput data, researchers routinely use Gene Ontology (GO) based analyses. With available methods for automated annotation and the existence of over 200 biomedical ontologies, we can move beyond GO alone to enrichment analysis using disease ontologies.
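Enrichment analysis of this kind is commonly framed as a one-sided hypergeometric test: given a background set of genes, some of which are annotated with an ontology term, how surprising is the number of annotated genes in the list of interest? The sketch below is a minimal illustration under that assumption and is not necessarily the statistic used by the group; the function name and the example counts are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(population, annotated, sample, hits):
    """One-sided hypergeometric p-value P(X >= hits).

    population: number of genes in the background set
    annotated:  background genes annotated with the ontology term
    sample:     size of the gene list being analyzed
    hits:       genes in the list that carry the annotation
    """
    total = comb(population, sample)
    upper = min(annotated, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total

# Hypothetical counts: 20 background genes, 5 annotated with a disease term,
# and a gene list of 4 in which 3 carry that annotation.
p_value = hypergeom_enrichment_p(population=20, annotated=5, sample=4, hits=3)
```

The same calculation applies whether the term comes from GO or from a disease ontology; only the annotation source changes. In practice, p-values computed across many terms would also need a multiple-testing correction.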