Shah Lab

Our group develops methods to analyze large unstructured data sets for data-driven medicine. We use ontology-based approaches to annotate, index, and analyze Big Data in biomedicine to enable data-driven decision making in medicine and health care. We have developed methods that transform unstructured patient notes into a de-identified, temporally ordered patient-feature matrix. With the resulting high-throughput data, we can monitor for adverse drug events, learn drug-drug interactions, identify off-label drug usage, generate practice-based evidence for difficult-to-test clinical hypotheses, generate phenotypic fingerprints, and build predictive models. We have an active effort around combining multiple information sources for drug safety surveillance. We also apply our methods to understand the “gene lists” resulting from analysis of high-throughput data. Researchers routinely use Gene Ontology-based analyses, and we believe that it’s time to stop using just GO and move to enrichment analysis using disease ontologies.