Stanford houses a substantial set of corpora for all sorts of linguistic research. Follow the links below for more information.
Our corpora are available in digital form on Stanford AFS and in hard copy from the Corpus TA.
We have most of the corpora released by the Linguistic Data Consortium, as well as a number of other corpora and databases.
We provide tools and utilities for working with corpora, along with instructions for using them.
Corpora can contribute to research across all subfields of linguistics. The annual CorpusFest lunch features a diverse range of Stanford research projects that use corpora.
The Corpus TA can answer your questions and help with any stage of corpus-related projects.