
Statistical learning by 8-month-old infants
J R Saffran et al. Science. 1996.
Learners rely on a combination of experience-independent and experience-dependent mechanisms to extract information from the environment. Language acquisition involves both types of mechanisms, but most theorists emphasize the relative importance of experience-independent mechanisms. The present study shows that a fundamental task of language acquisition, segmentation of words from fluent speech, can be accomplished by 8-month-old infants based solely on the statistical relationships between neighboring speech sounds. Moreover, this word segmentation was based on statistical learning from only 2 minutes of exposure, suggesting that infants have access to a powerful mechanism for the computation of statistical properties of the language input.
During early development, the speed and accuracy with which an organism extracts environmental information can be extremely important for its survival. Some species have evolved highly constrained neural mechanisms to ensure that environmental information is properly interpreted, even in the absence of experience with the environment (1). Other species are dependent on a period of interaction with the environment that clarifies the information to which attention should be directed and the consequences of behaviors guided by that information (2). Depending on the developmental status and the task facing a particular organism, both experience-independent and experience-dependent mechanisms may be involved in the extraction of information and the control of behavior.
In the domain of language acquisition, two facts have supported the interpretation that experience-independent mechanisms are both necessary and dominant. First, highly complex forms of language production develop extremely rapidly (3). Second, the language input available to the young child is both incomplete and sparsely represented compared to the child's eventual linguistic abilities (4). Thus, most theories of language acquisition have emphasized the critical role played by experience-independent internal structures over the role of experience-dependent factors (5).
It is undeniable that experience-dependent mechanisms are also required for the acquisition of language. Many aspects of a particular natural language must be acquired from listening experience. For example, acquiring the specific words and phonological structure of a language requires exposure to a significant corpus of language input. Moreover, long before infants begin to produce their native language, they acquire information about its sound properties (6). Nevertheless, given the daunting task of acquiring linguistic information from listening experience during early development, few theorists have entertained the hypothesis that learning plays a primary role in the acquisition of more complicated aspects of language, favoring instead experience-independent mechanisms (7). Young humans are generally viewed as poor learners, suggesting that innate factors are primarily responsible for the acquisition of language.
Here we investigate the nature of the experience-dependent factors involved in language acquisition. In particular, we ask whether infants are in fact better learners than has previously been assumed, thus potentially reducing the extent to which experience-independent structures must be posited. The results demonstrate that infants possess powerful mechanisms suited to learning the types of structures exemplified in linguistic systems. Experience may therefore play a more important role in the acquisition of language than existing theories suggest.
One task faced by all language learners is the segmentation of fluent speech into words. This process is particularly difficult because word boundaries in fluent speech are marked inconsistently by discrete acoustic events such as pauses (8). Although it has recently been demonstrated that infants can segment words from fluent speech and subsequently recognize them when presented in isolation (9), it is not clear what information is used by infants to discover word boundaries. This problem is complicated by the variable acoustic structure of speech across different languages, suggesting that infants must discover which, if any, acoustic cues correlated with word boundaries are relevant to their native language (10); there is no invariant acoustic cue to word boundaries present in all languages.
One important source of information that can, in principle, define word boundaries in any natural language is the statistical information contained in sequences of sounds. Over a corpus of speech there are measurable statistical regularities that distinguish recurring sound sequences that comprise words from the more accidental sound sequences that occur across word boundaries (11). Within a language, the transitional probability from one sound to the next will generally be highest when the two sounds follow one another within a word, whereas transitional probabilities spanning a word boundary will be relatively low (12). For example, given the sound sequence pretty#baby, the transitional probability from pre to ty is greater than the transitional probability from ty to ba. Previously, we showed that adults and children can use information about transitional probabilities to discover word boundaries in an artificial language corpus of nonsense words presented as continuous speech, with no acoustic cues to word boundaries (13).
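The transitional probabilities discussed above can be estimated from a syllable sequence by simple counting. The following is a minimal sketch, assuming the common definition TP(Y|X) = (frequency of the pair XY) / (frequency of X); the function name and the toy corpus are illustrative assumptions and are not taken from the study.

```python
from collections import Counter

def transitional_probabilities(syllables):
    """Return {(x, y): estimated P(y | x)} over adjacent syllable pairs."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # Count each syllable only where it has a successor, so the
    # probabilities out of any one syllable sum to 1.
    first_counts = Counter(syllables[:-1])
    return {(x, y): n / first_counts[x] for (x, y), n in pair_counts.items()}

# Hypothetical toy corpus: "pretty" followed by different words.
corpus = ["pre", "ty", "ba", "by",
          "pre", "ty", "dog", "gy",
          "pre", "ty", "ba", "by"]
tp = transitional_probabilities(corpus)

within_word = tp[("pre", "ty")]   # "pre" is always followed by "ty"
across_words = tp[("ty", "ba")]   # "ty" is followed by "ba" only some of the time
```

In this toy corpus the within-word transition from pre to ty is estimated at 1.0, whereas the transition from ty to ba, which spans a word boundary, is estimated at 2/3.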
We asked whether 8-month-old infants can extract information about word boundaries solely on the basis of the sequential statistics of concatenated speech. We used the familiarization-preference procedure developed by Jusczyk and Aslin (9). In this procedure, infants are exposed to auditory material that serves as a potential learning experience. They are subsequently presented with two types of test stimuli: (i) items that were contained within the familiarization material and (ii) items that are highly similar but (by some critical criterion) were not contained within the familiarization material. During a series of test trials that immediately follows familiarization, infants control the duration of each test trial by their sustained visual fixation on a blinking light (14). If infants have extracted the crucial information about the familiarization items, they may show differential durations of fixation (listening) during the two types of test trials (15). We used this procedure to determine whether infants can acquire the statistical properties of sound sequences from brief exposures.
In our first experiment, 24 8-month-old infants from an American-English language environment were familiarized with 2 min of a continuous speech stream consisting of four three-syllable nonsense words (hereafter, "words") repeated in random order (16). The speech stream was generated by a speech synthesizer in a monotone female voice at a rate of 270 syllables per minute (180 words in total). The synthesizer provided no acoustic information about word boundaries, resulting in a continuous stream of coarticulated consonant-vowel syllables, with no pauses, stress differences, or any other acoustic or prosodic cues to word boundaries. A sample of the speech stream is the orthographic string bidakupadotigolabubidaku.... The only cues to word boundaries were the transitional probabilities between syllable pairs, which were higher within words (1.0 in all cases, for example, bida) than between words (0.33 in all cases, for example, kupa).
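The statistical structure of such a familiarization stream can be illustrated with a short simulation. This is a sketch under stated assumptions, not the procedure used to build the actual stimuli: the words bidaku, padoti, and golabu are read off the sample string above, the fourth word (tupiro) is an assumption because it does not appear in the sample, and the rule that a word never immediately follows itself is inferred from the between-word probability of 0.33 with four words.

```python
import random
from collections import Counter

# Three words are read off the sample string; the fourth is an assumption.
WORDS = ["bidaku", "padoti", "golabu", "tupiro"]

def syllabify(word):
    """Split a consonant-vowel nonsense word into two-letter syllables."""
    return [word[i:i + 2] for i in range(0, len(word), 2)]

def make_stream(n_words, rng):
    """Concatenate words in random order, never repeating a word back to back."""
    stream, previous = [], None
    for _ in range(n_words):
        word = rng.choice([w for w in WORDS if w != previous])
        stream.extend(syllabify(word))
        previous = word
    return stream

def transitional_probabilities(syllables):
    """Estimate P(y | x) for every adjacent syllable pair (x, y)."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}

rng = random.Random(0)           # fixed seed so the sketch is reproducible
stream = make_stream(180, rng)   # 180 words = 540 syllables (2 min at 270 per minute)
tp = transitional_probabilities(stream)

within = tp[("bi", "da")]    # exactly 1.0: "bi" occurs only inside "bidaku"
between = tp[("ku", "pa")]   # expected near 1/3; the exact value depends on the sample
```

Because every syllable belongs to exactly one word, within-word estimates are exactly 1.0, while between-word estimates only approximate 1/3 in a finite random sample.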
To assess learning, each infant was presented with repetitions of one of four three-syllable strings on each test trial. Two of these three-syllable strings were "words" from the artificial language presented during familiarization, and two were three-syllable "nonwords" that contained the same syllables heard during familiarization but not in the order in which they appeared as words (17).
The infants showed a significant test-trial discrimination between word and nonword stimuli (18), with longer listening times for nonwords (Table 1). This novelty preference, or dishabituation effect, indicates that 8-month-olds recognized the difference between the novel and the familiar orderings of the three-syllable strings. Thus, 8-month-old infants are capable of extracting serial-order information after only 2 min of listening experience.
Of course, simple serial-order information is an insufficient cue to word boundaries. The learner must also be able to extract the relative frequencies of co-occurrence of sound pairs, where relatively low transitional probabilities signal word boundaries. Our next experiment examined whether 8-month-olds could perform the more difficult statistical computations required to distinguish words (that is, recurrent syllable sequences) from syllable strings spanning word boundaries (that is, syllable sequences occurring more rarely). To take an English example, pretty#baby, we wanted to see if infants can distinguish a word-internal syllable pair like pretty from a word-external syllable pair like ty#ba.
Another 24 8-month-old infants from an American-English language environment were familiarized with 2 min of a continuous speech stream consisting of three-syllable nonsense words similar in structure to the artificial language used in our first experiment (19). This time, however, the test items for each infant consisted of two words and two "part-words." The part-words were created by joining the final syllable of a word to the first two syllables of another word. Thus, the part-words contained three-syllable sequences that the infant had heard during familiarization but statistically, over the corpus, did not correspond to words (20). These part-words could only be judged as novel if the infants had learned the words with sufficient specificity and completeness that sequences crossing a word boundary were relatively unfamiliar.
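The construction of part-words can be sketched in a few lines. The words below are taken from the sample stream of the first experiment purely for illustration; the word set actually used in the second experiment is not listed in this text, and the helper names are hypothetical.

```python
def syllabify(word):
    """Split a consonant-vowel nonsense word into two-letter syllables."""
    return [word[i:i + 2] for i in range(0, len(word), 2)]

def part_word(first, second):
    """Join the final syllable of `first` to the first two syllables of `second`."""
    return "".join(syllabify(first)[-1:] + syllabify(second)[:2])

# Illustrative words only (from the first experiment's sample string).
words = ["bidaku", "padoti", "golabu"]
part_words = [part_word(a, b) for a in words for b in words if a != b]

example = part_word("bidaku", "padoti")  # "ku" + "pa" + "do"
```

Each part-word is a syllable sequence that does occur in the stream whenever its two source words are adjacent, but its first transition spans a word boundary and so carries a lower transitional probability than the transitions inside a word.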
Despite the difficulty of this word versus part-word discrimination, infants showed a significant test-trial discrimination between the word and part-word stimuli (21), with longer listening times for part-words (Table 1). Thus, 2 min of exposure to concatenated speech organized into "words" was sufficient for 8-month-old infants to extract information about the sequential statistics of syllables. Moreover, this novelty preference cannot be attributed to a total lack of experience with the three-syllable sequences forming part-words, as was the case with the nonwords in the first experiment. Rather, infants succeeded in learning and remembering particular groupings of three-syllable strings: those strings containing higher transitional probabilities surrounded by lower transitional probabilities.
The infants' performance in these studies is particularly impressive given the impoverished nature of the familiarization speech stream, which contained no pauses, intonational patterns, or any other cues that, in normal speech, probabilistically supplement the sequential statistics inherent in the structure of words. Equally impressive is the fact that 8-month-old infants in both experiments were able to extract information about sequential statistics from only 2 min of listening experience. Although experience with speech in the real world is unlikely to be as concentrated as it was in these studies, infants in more natural settings presumably benefit from other types of cues correlated with statistical information.
Our results raise the intriguing possibility that infants possess experience-dependent mechanisms that may be powerful enough to support not only word segmentation but also the acquisition of other aspects of language. It remains unclear whether the statistical learning we observed is indicative of a mechanism specific to language acquisition or of a general learning mechanism applicable to a broad range of distributional analyses of environmental input (22). Regardless, the existence of computational abilities that extract structure so rapidly suggests that it is premature to assert a priori how much of the striking knowledge base of human infants is primarily a result of experience-independent mechanisms. In particular, some aspects of early development may turn out to be best characterized as resulting from innately biased statistical learning mechanisms rather than innate knowledge. If this is the case, then the massive amount of experience gathered by infants during the first postnatal year may play a far greater role in development than has previously been recognized.