The common-sense view holds that children learn their language from their parents as a direct product of interacting with them. Yet it remains to be clarified how this happens and, specifically, which cognitive mechanisms are involved in the learning process. Common knowledge would suggest that children learn their language from their parents by means of imitation alone. However, before imitation can even take place, children’s ability to scan the communicative context should be considered. Moreover, relatively little effort has been made to investigate the role of imitation in children’s language learning. Imitation in language is discussed in some early works, such as Tiedemann (1897), who observed the attempts of his six-month-old child to follow his mother’s gaze and to pronounce the syllable “ma”. More recent literature on the topic shows that infants who perform better in gaze-following tasks appear to acquire vocabulary at a more rapid pace. ‘Gaze following’ is defined as the ability to follow the attentional focus of another person; it emerges during the second half of the first year of life, as confirmed by several contributions such as Butterworth and Jarrett (1991) and Gredebäck, Fikke and Melinder (2010).
During the 1970s, the child’s linguistic environment was a major focus of language acquisition research. However, this interest subsequently decreased and the focus shifted to the child. One main reason for this loss of interest concerns Child Directed Speech (CDS). According to Dominey and Dodane (2004, p. 125): “In this CDS, ‘motherese’, or ‘parentese’, the segmental structure is deformed in favour of the exaggerated prosodic structure”. In other words, CDS is a distinct register compared with Adult Directed Speech (ADS) because of its exaggerated intonation, which produces “great swooping curves of sound over an extended pitch range” (Saxton, 2017, p. 88). While the segmental structure pertains to the articulatory characteristics of an utterance, the prosodic structure serves the function of grouping and giving prominence to the elements which make up the speech signal (Harrington & Tabain, 2014). For instance, mothers seem rather prone to raise their pitch when children are emotionally engaged. CDS therefore enables caregivers to emphasise the information conveyed by prosodic cues (that is, the information carried by the musicality of speech) and to adapt it to the child’s perceptual capabilities. Some syllables in speech are in fact produced with more energy, and are usually referred to as “stressed”, owing to differences in prosodic cues such as amplitude, duration or pitch. Nonetheless, the validity of CDS, which is both a phonological and a lexical register, was undermined by the assumption that it is a privilege of a minority of Western mothers. Indeed, the frequency and quality of CDS have been found to be closely related to family socio-economic status (Schwab & Lew-Williams, 2016), which is on average higher in Western families. However, the cross-cultural research presented in Saxton (2017) suggests otherwise, attributing to CDS the pivotal role of obtaining the child’s attention.
Therefore, a child’s environment is still worthy of attention as far as language learning mechanisms are concerned. One of the most recent and prominent non-nativist accounts of syntax acquisition, known as ‘usage-based’ theory, treats factors such as the use of pointing and the emergence of collaborative engagement in a shared goal as critical to language development. By contrast, the nativist approach reduces the child’s environment to a matter of limited exposure to key linguistic forms, which then triggers language acquisition. The principle which underpins usage-based theory is that our knowledge of language is obtained from the use of language itself, as stated in Langacker (1987). In his Cognitive Grammar (2008, p. 16), Langacker affirms that
Automatization is the process observed in learning to tie a shoe or recite the alphabet: through repetition or rehearsal, a complex structure is thoroughly mastered, to the point that using it is virtually automatic and requires little conscious monitoring.
The process of automatization is thus seen as one of many basic phenomena that are evident in many facets of cognition. Therefore, phenomena such as association, schematization, categorization and automatization are seen as independent cognitive processes recruited for language usage.
Hence, the usage-based theory of language development is firmly rooted in the prominence of the social act of communication. One main reason for this position is that abilities such as reading and sharing the intentions of others are in fact uniquely human. The infant’s communication skills go through a flourishing period of development over the first year of life. The child’s interest in attending to the mother’s face and voice is immediately followed by an increase in smiling and cooing. Moreover, the adult’s role in capturing and maintaining the child’s attention is crucial for the language development process. Between 11 and 14 months of age, typically developing children usually reach a sophisticated ability to follow changes in the direction of an adult’s gaze, as first shown by Scaife and Bruner (1975). This ability is just one feature of shared attention. In addition, the importance of social cognition in general is underlined by Tomasello et al. (2005, p. 683), where the authors state that
conversation is an inherently collaborative activity in which the joint goal is to reorient the listener’s intentions and attention so that they align with those of the speaker.
Consequently, face scanning assumes a crucial role in providing important information for language learning. A study by Young et al. (2009) shows that children who attend to their mothers’ mouths during communication, as observed by means of eye tracking, tend to have a larger vocabulary in toddlerhood together with more successful final language attainment. Arguably, information around the speaker’s mouth can foster the child’s ability to associate mouth shapes with speech sounds, e.g. a round-shaped mouth with the [o] sound. Although abilities such as gaze following and attention to the mouth during face scanning have both been found to be relevant to later language attainment, they could be considered different developmental functions. According to Tenenbaum et al. (2015, p. 3)
face scanning could reflect an infant’s sensitivity to linguistically relevant information during speaking, whereas gaze following could reflect aspects of social cognition that also relate to language.
Therefore, face scanning and gaze following could either be considered as having a common underlying factor — thus being capable of predicting language outcome — or could reflect different processes — thus having unique relations to language outcomes.
In conclusion, face scanning and gaze following are useful predictors of language; however, they also give information about the child’s aptitude for scanning the communicative scene. Therefore, even if the importance of imitation is often minimised in theories of language development, the abilities which underlie imitation itself have been found to be relevant. Imitation alone does not provide a complete explanation for the acquisition of grammar. Nonetheless, imitation can be defined, as in Saxton (2017, p. 96), as “the reproduction of another person’s behaviour”, and infants have been described as “more prolific imitators than the young of every species” (Marshall and Meltzoff, 2014, p. 1). Moreover, since every instance of a child using a new word that was first uttered by an adult is a case of deferred imitation, at least a partial role for this mechanism in lexical development can be accepted. Lastly, children quite frequently incorporate parts of an adult’s utterances into their responses, thereby demonstrating a capacity for selective imitation (Snow, 1981).
Butterworth, G., & Jarrett, N. (1991). What minds have in common is space: Spatial mechanisms serving joint visual attention in infancy. British Journal of Developmental Psychology, 9(1), 55–72. https://doi.org/10.1111/j.2044-835x.1991.tb00862.x
Dominey, P. F., & Dodane, C. (2004). Indeterminacy in language acquisition: the role of child directed speech and joint attention. Journal of Neurolinguistics, 17(2–3), 121–145. https://doi.org/10.1016/s0911-6044(03)00056-3
Gredebäck, G., Fikke, L., & Melinder, A. (2010). The development of joint visual attention: a longitudinal study of gaze following during interactions with mothers and strangers. Developmental Science, 13(6), 839–848. https://doi.org/10.1111/j.1467-7687.2009.00945.x
Harrington, J., & Tabain, M. (2014). Speech Production: Models, Phonetic Processes, and Techniques (Macquarie Monographs in Cognitive Science) (1st ed.). Psychology Press.
Langacker, R. W. (1987). Foundations of Cognitive Grammar: Volume I: Theoretical Prerequisites (1st ed.). Stanford University Press.
Langacker, R. W. (2008). Cognitive Grammar. Oxford University Press.
Marshall, P. J., & Meltzoff, A. N. (2014). Neural mirroring mechanisms and imitation in human infants. Philosophical Transactions of the Royal Society B: Biological Sciences, 369(1644), 20130620. https://doi.org/10.1098/rstb.2013.0620
Saxton, M. (2017). Child Language: Acquisition and Development (2nd ed.). SAGE Publications Ltd.
Scaife, M., & Bruner, J. S. (1975). The capacity for joint visual attention in the infant. Nature, 253(5489), 265–266. https://doi.org/10.1038/253265a0
Schwab, J. F., & Lew-Williams, C. (2016). Language learning, socioeconomic status, and child-directed speech. Wiley Interdisciplinary Reviews: Cognitive Science, 7(4), 264–275. https://doi.org/10.1002/wcs.1393
Snow, C. E. (1981). The uses of imitation. Journal of Child Language, 8(1), 205–212. https://doi.org/10.1017/s0305000900003111
Tenenbaum, E. J., Sobel, D. M., Sheinkopf, S. J., Malle, B. F., & Morgan, J. L. (2015). Attention to the mouth and gaze following in infancy predict language development – CORRIGENDUM. Journal of Child Language, 42(6), 1408. https://doi.org/10.1017/s0305000915000501
Tiedemann, D. (2018). Dietrich Tiedemanns Beobachtungen Über die Entwickelung der Seelenfähigkeiten bei Kindern (German Edition). Forgotten Books.
Tomasello, M., Carpenter, M., Call, J., Behne, T., & Moll, H. (2005). Understanding and sharing intentions: The origins of cultural cognition. Behavioral and Brain Sciences, 28(5), 675–691. https://doi.org/10.1017/s0140525x05000129
Young, G. S., Merin, N., Rogers, S. J., & Ozonoff, S. (2009). Gaze behavior and affect at 6 months: predicting clinical outcomes and language development in typically developing infants and infants at risk for autism. Developmental Science, 12(5), 798–814. https://doi.org/10.1111/j.1467-7687.2009.00833.x
Figure 1. Shkraba, A. (2020). Foto di donna, figlia, giocando, giocattoli [Photograph]. Pexels. Retrieved from https://www.pexels.com/it-it/foto/donna-ragazza-seduto-giocando-6288105/
Figure 2. Hall, N. (2021). Woman in black dress sitting on chair [Photograph]. Unsplash. Retrieved from https://unsplash.com/photos/9rBmRhysF9Y
Figure 3. Du Preez, P. (2017). Boy touching page of book [Photograph]. Unsplash. Retrieved from https://unsplash.com/photos/-mCXEsLd2sU
Figure 4. Picsea. (2017). Person carrying baby while reading book [Photograph]. Unsplash. Retrieved from https://unsplash.com/photos/EQlTyDZRx7