In this issue, we present five review articles focusing on active and emerging areas of bioinformatics. We identified these areas as ones in which there has recently been a critical mass of initial publications that set the direction of the field and lay out the key scientific challenges going forward. Much of the current activity in bioinformatics is informed and inspired by the recent increased interest in ‘precision medicine’ that US President Barack Obama highlighted in his January 2015 State of the Union address. It is clear that computational techniques will be essential in the design and delivery of precise healthcare.

Li et al. provide an overview of the progress in computational methods for drug repositioning. The cost of drug development is very high, and once approved, a drug can be used by healthcare providers for ‘off-label’ indications. Although most drugs are approved based on their safety and efficacy in the context of a limited set of disease indications, it is also possible that they have salutary effects for other diseases. Indeed, the unwanted ‘side effects’ of a drug may be an important clue to its utility for other diseases. In this review, the authors summarize the key data sources available for computational approaches, review the computational techniques used to associate drugs with new indications and provide a perspective on the most compelling current use cases. Success in drug repositioning leverages the huge investment in drugs that are on the market, and also opens up opportunities for useful drug combinations.

Tyler et al. provide a useful review on our current understanding of pleiotropy—the association of a single genetic locus with multiple phenotypes. There has been great interest in pleiotropy as an indicator of potential shared genetic architecture between phenotypes that may previously have been considered unrelated. The discovery of relations can suggest new molecular mechanisms, clinical correlations and potentially unexpected drug response (because of the role of a target in multiple phenotypes). This review focuses on computational methods to discover and characterize pleiotropy, and examines how these characterizations are useful for understanding the underlying genetic architecture and opportunities for novel discoveries.

Khare et al. summarize the current interest in using crowdsourcing in biomedical research. Crowdsourcing generally refers to engaging large numbers of people, often without formal scientific credentials, who contribute effort on behalf of science. The authors define two groups of users: those who provide data based on their behavior (e.g. search logs, Facebook posts, tweets) that can be used for discovery or hypothesis generation, and those who actively provide labor to support scientific research. Each of these user types creates challenges and opportunities for scientists (while raising interesting issues of authorship and human subjects research). This review summarizes recent work on this relatively new and fascinating phenomenon.

Gonzalez et al. provide a useful review of text and data mining for biomedical discovery, particularly in the context of precision medicine. After defining the basic concepts and methods in text mining, they review recent emerging applications including the extraction of molecular pathways, the prediction of gene function, drug repositioning, data integration and pharmacogenomics. As long as our scientific colleagues insist on reporting their results in natural language, the computational analysis of text, and the creation of structured databases from information reported in natural language, will remain a challenge.

Finally, Greene et al. provide a review of the competencies required for professionals working in biomedical data science or ‘big data’ in biomedicine. Existing curricula focusing on computational biology and biomedical informatics have recently been challenged to expand to accommodate the great academic and industrial interest in biomedical data science. The authors discuss the typical differences between modern data science curricular needs and those present in more traditional programs. Notably, they suggest additional courses that may augment existing curricula, including courses in ‘biological information flow’, ‘statistical challenges of big data’ and ‘computational challenges of big data’.

We hope you enjoy these reviews of important emerging areas, and join us in marveling at the fantastic opportunities and challenges that continue to provide a rich scientific research agenda for bioinformatics.