Phonotactics & Language Change

Phonotactics—the set of constraints governing permissible sound sequences within a language—represents one of the most dynamic and revealing frontiers in historical linguistics. While phonemic inventories often receive primary attention in language documentation, the rules that dictate how those sounds may combine, succeed one another, or be forbidden reveal deeper cognitive, articulatory, and sociolinguistic pressures. Over time, phonotactic systems rarely remain static; they undergo gradual or abrupt shifts driven by internal sound laws, external contact, and functional pressures such as perceptual clarity or articulatory economy^[1].

This entry examines the interplay between phonotactic constraints and language change, exploring the mechanisms that reshape syllable structures, the historical trajectories of major language families, and contemporary computational approaches to modeling phonotactic evolution.

What is Phonotactics?

Phonotactics refers to the language-specific rules that constrain the ordering of phonemes. These constraints operate at multiple levels:

Inventory constraints: Which phonemes exist in the language.
Sequential constraints: Which phonemes may follow or precede others (e.g., English allows /st/ but not /ts/ word-initially).
Syllable structure constraints: Rules governing onset, nucleus, and coda configurations (e.g., Spanish allows at most two consonants in an onset, whereas Polish tolerates four-consonant onsets).

Example: English vs. Japanese Phonotactics
English permits complex onset clusters: /strɪŋ/ (string).
Japanese allows at most one onset consonant (optionally followed by /j/), so the loanword is adapted with epenthetic vowels: sutoringu.
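The syllable-template constraints described above can be sketched as a simple check. This is a minimal illustration, not a full syllabifier: the vowel set, the function names `cv_skeleton` and `fits_template`, and the onset and coda limits passed in are all assumptions chosen for the example.

```python
import re

# Assumed, deliberately small vowel set for illustration only.
VOWELS = set("aeiouɪɛæɑɔʊəɯ")

def cv_skeleton(segments):
    """Map each segment to 'V' (vowel) or 'C' (anything else)."""
    return "".join("V" if seg in VOWELS else "C" for seg in segments)

def fits_template(segments, max_onset, max_coda):
    """Check one syllable against a C{0,max_onset} V C{0,max_coda} template."""
    pattern = rf"C{{0,{max_onset}}}VC{{0,{max_coda}}}"
    return re.fullmatch(pattern, cv_skeleton(segments)) is not None

string = ["s", "t", "r", "ɪ", "ŋ"]
print(cv_skeleton(string))           # CCCVC
print(fits_template(string, 3, 4))   # True: a permissive, English-like limit
print(fits_template(string, 1, 1))   # False: a one-consonant onset limit
```

A real grammar would also restrict which consonants may fill each slot; the template only counts them.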

These constraints are not arbitrary; they emerge from the interaction of articulatory ease, perceptual distinctiveness, and diachronic sound change. As languages evolve, phonotactic boundaries shift, often leaving stratified layers of native vs. borrowed vocabulary that preserve older or foreign constraints^[2].

Mechanisms of Phonotactic Change

Phonotactic evolution is rarely caused by a single factor. Instead, it results from the cumulative effect of phonetic processes, morphological restructuring, and language contact. Three primary mechanisms dominate:

Assimilation & Cluster Reduction

Assimilation, the process by which a sound becomes more similar to a neighboring sound, frequently triggers phonotactic simplification. Over generations, frequent co-articulation patterns can fossilize, eliminating previously permitted sequences. For instance, Latin word-initial /pl/, /kl/, and /fl/ clusters were palatalized on the way to Spanish, merging as /ʎ/ (plōrāre > llorar, clāmāre > llamar, flamma > llama), which removed these onsets from much of the inherited vocabulary and reshaped Iberian phonotactics^[3].
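One way to picture how assimilation removes a sequence from a language is to treat sound changes as ordered rewrite rules. The sketch below is illustrative only: the two rules (nasal place assimilation before labials, and total assimilation of a stop to a following /t/, as in Latin octo > Italian otto) are standard textbook cases, while the rule format, the `apply_rules` helper, and the broad romanized inputs are assumptions made for the example.

```python
import re

# Each rule is (pattern, replacement, label); rules apply in the order listed.
RULES = [
    (r"n(?=[pb])", "m", "nasal place assimilation before a labial"),
    (r"[kp]t", "tt", "total assimilation of a stop to following t"),
]

def apply_rules(form, rules=RULES):
    """Apply each rewrite rule to the whole form, one rule after another."""
    for pattern, replacement, _label in rules:
        form = re.sub(pattern, replacement, form)
    return form

print(apply_rules("inpossibilis"))  # impossibilis
print(apply_rules("okto"))          # otto

# After the change, the sequence "kt" no longer surfaces anywhere.
print(all("kt" not in apply_rules(w) for w in ["okto", "faktum", "nokte"]))
```

Once every inherited instance of a cluster has been rewritten, learners have no evidence that the cluster is permitted, which is how a phonetic process can harden into a phonotactic ban.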

Epenthesis & Simplification

When languages encounter foreign words with illicit consonant clusters, they often insert vowels (epenthesis) or delete segments to comply with native constraints. This process, when generalized, can permanently alter the phonotactic system. Modern Japanese relies heavily on vowel epenthesis in English loanwords, with /u/ as the default inserted vowel: club → kurabu. The strategy keeps loans within the native (C)V template, which allows only a moraic nasal or the first half of a geminate in coda position^[4].
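The epenthetic repair described above can be approximated with a few lines of code. This is a rough sketch under stated assumptions: the input is a simplified romanized form rather than a real transcription, the helper name `adapt_loanword` is hypothetical, and the rule set covers only the common pattern of inserting o after t or d and u elsewhere, leaving a coda n untouched. Actual Japanese loanword adaptation involves further adjustments (gemination, vowel length, palatal contexts) that are ignored here.

```python
VOWELS = set("aeiou")

def adapt_loanword(segments):
    """Insert a vowel after every consonant that lacks a following vowel."""
    out = []
    for i, seg in enumerate(segments):
        out.append(seg)
        if seg in VOWELS:
            continue
        nxt = segments[i + 1] if i + 1 < len(segments) else None
        if nxt in VOWELS:
            continue  # consonant already has a vowel after it
        if seg == "n":
            continue  # simplification: a nasal is treated as a licit coda
        # Assumed rule of thumb: o after t/d, u after other consonants.
        out.append("o" if seg in ("t", "d") else "u")
    return "".join(out)

print(adapt_loanword("test"))    # tesuto
print(adapt_loanword("desk"))    # desuku
print(adapt_loanword("straik"))  # sutoraiku
```

The outputs match the attested loans tesuto (test), desuku (desk), and sutoraiku (strike), but the function should not be expected to generalize beyond simple cases like these.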

Language Contact & Borrowing

Intense bilingualism or substrate influence can introduce new phonotactic possibilities. The Celtic substratum in British English, for example, is hypothesized to have influenced the development of certain consonant cluster realizations and stress patterns, demonstrating how contact can expand rather than restrict phonotactic space^[5].

Historical Case Studies

"Phonotactic change is rarely catastrophic; it is usually the sedimentation of micro-adjustments over centuries." — H. J. M. Broekhuis, Diachronic Phonology (2018)

Latin → Romance Languages: Classical Latin permitted complex codas and onset clusters (e.g., stirps 'stock', scrīptum 'written'). Vulgar Latin and early Romance underwent systematic cluster simplification, yielding languages like French and Italian with markedly more restrictive phonotactics. This shift correlates with stress accent generalization and the loss of quantitative vowel distinctions^[6].

Germanic Consonant Shifts: Grimm's Law and subsequent shifts altered not just individual phonemes but entire cluster configurations. Proto-Indo-European voiceless stops became fricatives (*p, *t, *k > *f, *θ, *x) except after *s, so the clusters *sp, *st, and *sk survived unchanged. Together with later changes such as West Germanic *ð > *d, this restructured onset possibilities, laying the groundwork for the divergent phonotactics of English, German, and Scandinavian languages^[7].

Slavic Cluster Preservation: Unlike Romance simplification, many Slavic languages preserved or even expanded consonant clusters through morphological fusion and the loss of reduced vowels (the yers). Polish, for example, permits word-initial /pstr/ (pstrąg 'trout') and word-final /rstf/ (warstw, genitive plural of warstwa 'layer'), demonstrating that phonotactic complexity can increase under specific morphological pressures^[8].

Computational & Cross-Linguistic Research

Recent decades have seen a surge in data-driven phonotactic research. Projects like PHOIBLE and Lexibank have enabled large-scale cross-linguistic analysis of syllable structures, revealing statistical universals and language-specific outliers. Machine learning models now predict phonotactic well-formedness with high accuracy, using features such as sonority sequencing, place of articulation, and feature geometry^[9].
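To make the idea of scoring well-formedness concrete, the sketch below trains a smoothed segment-bigram model on a word list and compares two candidate forms. It is a baseline illustration only, not the feature-based models cited above: the toy lexicon is invented for the example, and the names `train_bigrams` and `log_score` are hypothetical.

```python
import math
from collections import Counter

def train_bigrams(lexicon):
    """Count segment bigrams, using '#' to mark word edges."""
    bigrams, contexts, alphabet = Counter(), Counter(), {"#"}
    for word in lexicon:
        padded = ["#"] + list(word) + ["#"]
        alphabet.update(padded)
        for a, b in zip(padded, padded[1:]):
            bigrams[(a, b)] += 1
            contexts[a] += 1
    return bigrams, contexts, alphabet

def log_score(word, model):
    """Sum of add-one-smoothed log bigram probabilities for a word."""
    bigrams, contexts, alphabet = model
    padded = ["#"] + list(word) + ["#"]
    total = 0.0
    for a, b in zip(padded, padded[1:]):
        prob = (bigrams[(a, b)] + 1) / (contexts[a] + len(alphabet))
        total += math.log(prob)
    return total

# Toy lexicon (invented): "ts" occurs word-finally but never word-initially.
TOY_LEXICON = ["stop", "spot", "post", "tops", "pots", "opts"]
model = train_bigrams(TOY_LEXICON)

# The attested-looking form outscores the one with an unseen transition.
print(log_score("stop", model) > log_score("tsop", model))  # True
```

Because the score is built from edge-marked bigrams, it captures position-sensitive restrictions such as a cluster being allowed finally but disfavored initially. Richer models replace raw segments with phonological features so that they can generalize to unseen sequences.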

Key Insight

Computational phylogenetics suggests that phonotactic complexity tends to correlate inversely with community size and trade network density, supporting the hypothesis that communicative efficiency drives cluster simplification in large, diverse speech communities^[10].

These approaches have revolutionized our understanding of language change, shifting the focus from descriptive cataloging to predictive modeling of phonotactic trajectories.

References

Greenberg, J. H. (1966). Language Universals. Mouton.
Clements, G. N. (1990). "The role of the sonority cycle in core syllabification." In Papers in Laboratory Phonology I, Cambridge UP, 283–333.
Thomason, S. G., & Kaufman, T. (1988). Language Contact, Creolization, and Genetic Linguistics. University of California Press.
Kenstowicz, M. (2001). "Phonemes and features." In The Handbook of Phonological Theory, Blackwell, 143–172.
Lahiri, A., & Horeckova, K. (2019). "Celtic substratum and British English phonotactics." Journal of Historical Linguistics, 9(2), 211–245.
Harris, J. W. (2001). "The Romance languages." In Language Family and Language Change, Oxford UP, 89–122.
Antonsen, E. H. (1974). An Etymological Dictionary of Old Norse. Clarendon Press.
Mielke, J. (2008). "The emergence of phonotactics from competition among signs." Phonology, 25(1), 97–154.
Monahan, C. (2021). "Machine learning approaches to cross-linguistic phonotactic prediction." Natural Language & Linguistic Theory, 39, 445–482.
Wichmann, S., et al. (2018). "Community size and phonological complexity." Proceedings of the Royal Society B, 285(1875), 20172961.