This map shows the dispersal of phonemes (solid arrow) compared with dispersal of genetic traits (dashed arrow). (Illustration: Creanza et al.)

Human dispersal and the evolution of languages show strong link, Stanford biologists find

In the largest comparison of genetic and linguistic data ever attempted, Stanford biologists find that features of language show a strong link to the geographic dispersal of human populations.

Geneticists have famously tracked small differences in the human genetic code to trace the evolution and spread of humans out of Africa. Languages can change more quickly than genes and are not necessarily inherited from one's parents, but linguists can follow similar clues to uncover how languages have changed and migrated over millennia.

Now, scientists at Stanford and other universities have combined large databases of globally distributed linguistic and genetic data, revealing in greater detail how languages might change in parallel with genes.

The results were recently published in Proceedings of the National Academy of Sciences.

The researchers combined genetic data from 246 populations worldwide with data on 728 phonemes from 2,082 languages. Phonemes are the smallest units of sound that can distinguish one word's meaning from another's. "To" means something different from "do," so "t" and "d" are distinct phonemes, said lead author Nicole Creanza, a postdoctoral fellow at Stanford. There are roughly 40 phonemes in the English language.

Through an advanced statistical analysis, the authors found that geographic distance was linked to both genetic and phonemic distance. On average, the closer two languages or two genetic samples were to each other geographically, the more similar they were, even when the languages compared were not in the same language family.

This suggests that some nearby languages may have borrowed sounds from one another even if they are not closely related.
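For readers who want a feel for what comparing geographic and phonemic distance can look like, here is a minimal sketch in Python. It is not the authors' analysis, which this article does not describe in detail. It assumes phonemic distance can be approximated by the Jaccard distance between two phoneme inventories and geographic distance by the great-circle (haversine) distance. The three "languages," their coordinates and their inventories are made-up toy values, not data from the study.

```python
import math
from itertools import combinations


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * earth_radius_km * math.asin(math.sqrt(min(1.0, a)))


def jaccard_distance(inventory_a, inventory_b):
    """1 minus the share of phonemes the two inventories have in common."""
    union = inventory_a | inventory_b
    if not union:
        return 0.0
    return 1.0 - len(inventory_a & inventory_b) / len(union)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data: (latitude, longitude) and a phoneme inventory.
# These are invented for illustration and do not represent real languages.
languages = {
    "lang_a": ((0.0, 0.0), {"p", "t", "k", "m", "n"}),
    "lang_b": ((0.0, 5.0), {"p", "t", "k", "m", "s"}),
    "lang_c": ((0.0, 40.0), {"b", "d", "g", "m", "s"}),
}

geo_distances = []
phoneme_distances = []
for name_1, name_2 in combinations(languages, 2):
    (lat1, lon1), inv1 = languages[name_1]
    (lat2, lon2), inv2 = languages[name_2]
    geo_distances.append(haversine_km(lat1, lon1, lat2, lon2))
    phoneme_distances.append(jaccard_distance(inv1, inv2))

# A positive value means more distant pairs tend to share fewer phonemes.
correlation = pearson(geo_distances, phoneme_distances)
print(f"correlation between geographic and phonemic distance: {correlation:.2f}")
```

With only three toy languages the correlation says nothing about real languages. A real analysis would also have to account for the fact that pairwise distances are not independent of one another, which a plain Pearson correlation ignores.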

In general, phoneme differences paired well with patterns of genetic variation on a local scale, which Creanza said might suggest a connection between historic human dispersals and patterns of linguistic variation. While the relationship between genes and geography represents a global pattern, there was a limit to the distance over which such phonemic differences corresponded to geographic distance.

"When language samples were more than 10,000 kilometers apart, the relationship between phoneme differences and geographic distance broke down," Creanza said. "Outside of that radius, two languages' locations did not give us information about how similar their sounds would be. Because languages can change quickly, we didn't know in advance how fast this signal would degrade."

The researchers also found a notable contrast between genes and languages. Geographic isolation is well known to reduce genetic diversity, but geographically isolated languages actually showed greater variance in their phonemes than languages with many neighbors.

The authors said that future studies could shed light on the extent to which genetic and geographic relationships can help explain phoneme evolution.

"Studies of human evolutionary history benefit from a multipronged approach and drawing on many disciplines that study the human past," said corresponding author Sohini Ramachandran of Brown University. "This study's integration of genetic data with linguistic data, and methodologies for studying the geographic distribution of variation in both data sets, highlight an integrative approach that we hope will be used by more researchers in the future."

Bjorn Carey, Stanford News Service: (650) 725-1944, bccarey@stanford.edu

Nicole Creanza, Department of Biology: creanza@stanford.edu

Marcus Feldman, Department of Biology: mfeldman@stanford.edu