A new beginning? A bibliometric analysis of L2 vocabulary research in 1985

This paper uses a co-citation analysis to examine the research on L2 vocabulary acquisition that was published in 1985. This year seems to mark a kind of transition in the field. Unlike the earlier years analysed in this series of papers, 1985 shows signs of a coherent L2 vocabulary research front developing. The number of papers that qualify for inclusion is much greater than in previous years, and the analysis suggests that recognisable research themes are beginning to be clearly articulated.


Introduction
This paper is the fifth in a series of studies which attempt to plot the way research in L2 vocabulary acquisition has progressed over the last fifty years. Earlier papers have analysed the research outputs published in 1982, 1983, 1984 and 2006 (Meara 2012, 2014, 2015, 2016). This paper follows on directly from my analyses of the 1983 and 1984 data, published in LingBaW, and it uses the same bibliometric techniques that were used in the earlier papers, principally the co-citation methodology developed by Small (1973) and White and Griffith (1981). This methodology is summarised in Appendix 1 for readers who are not familiar with the approach. The long-term aim of this series of papers is to provide a historical account of the way a small number of themes have come to dominate L2 vocabulary research, and to document how the thinking of a small group of researchers has become an orthodoxy in modern research.
My analyses of the 1982-1984 data drew attention to the fact that it was very difficult to identify anything like a coherent L2 vocabulary research program at this time. The strongest feature in the 1984 map was a very dense and coherent cluster of influences all dealing with psycholinguistic aspects of bilingual speakers, but this cluster was very self-referential, and its members were not generally cited alongside the research published by linguists. Work of this latter type was actually quite scarce. Although considerably more relevant research was published in the period 1982-1984 than had been published in previous years, the total volume of research remained remarkably small. Indeed, 1984 had seen something of a reduction in research outputs compared to 1982 and 1983, and the broad picture consisted principally of small-scale one-off studies.
At first sight, 1985 seems to mark the beginning of a change in the field. 1985 sees a very large increase in the number of relevant research outputs, and for the first time the bibliometric data suggests that L2 vocabulary research is beginning to establish itself as a respectable research topic with signs of clearly delineated research priorities. 1985 also sees the emergence of research clusters that address research themes that would be recognisable to modern researchers.
The VARGA database (Meara n.d., accessed June 2017) identifies a total of 111 relevant research outputs during 1985, almost three times the desultory total for 1984. There are some important qualitative differences between the 1985 data set and the data analysed in our earlier reports. Particularly noticeable is that the 1985 data set contains a number of papers published in French and German, whereas papers of this sort were conspicuously absent from the earlier analyses. More importantly, the 1985 research includes a number of works of a type that was very infrequent in the earlier data sets. In 1985 we find one issue of a journal dedicated to vocabulary (Les langues modernes), one edited collection of papers (Ilson), three books (Corson, Lyne and Schouten van Parreren), and some research-motivated teaching material (Daams-Moussault, Rudzka et al.). We also find four PhD theses (Kelly, Laufer, Locus and Mansouri, awarded respectively by the Universities of Louvain-la-Neuve, Edinburgh, Leuven and Sheffield). All this activity supports the view that there is a groundswell of change in this year's literature: a new beginning perhaps?
Table 1 lists the 92 eligible sources which were used in the bibliometric analysis that follows in Section 3. Eligible sources are research papers published in journals or as book chapters. Theses, bibliographies, and monographs are conventionally excluded from bibliometric analyses because they cite research in a way that is different from what appears in shorter, more focussed research papers. Theses and monographs tend to contain very large bibliographies that range very widely over several research areas. The size of these bibliographies makes them unmanageable in conventional bibliometric studies. Similarly, bibliographies also tend to be much larger than conventional research papers, and they often prioritise size and comprehensive coverage over other considerations, again making them unmanageable.
Conventional practice in bibliometric research is to exclude sources of this kind, and to focus on more traditional research papers which cite a fairly restrained set of bibliographical sources. This convention has been followed in the present paper. The theses and the bibliographical work have been excluded. Also excluded were two papers that I was unable to obtain copies of. The remaining 92 journal articles and book chapters together make up a comprehensive record of the research that was published in 1985. However, this should not be taken as implying that all 92 sources are homogeneous. Many of these publications are little more than think-pieces which report their authors' opinions rather than meticulously collected data. Additionally, the papers differ markedly in their citation practices. For example, about 10% of all the eligible papers do not cite any evidence for the claims that they put forward, while at the other extreme, Paradis' comprehensive review paper cites a total of 108 different authors.

Table 1 (extract): eligible sources in the 1985 data set

Laufer, B and DD Sim
Measuring and explaining the threshold needed for English for Academic Purposes texts. Foreign Language Annals 18 (1985), 405-413.

Laufer, B and DD Sim
Taking the easy way out: nonuse and misuse of contextual cues in EFL reading comprehension. English Teaching Forum 23,2 (1985), 7-10.

Paradis, M
On the representation of two languages in the brain. Language Sciences 7,1 (1985), 1-39.

Pons-Ridler, S and F McKim

A preliminary analysis of this data identifies a total of 114 unique authors who contribute to the data set, about two and a half times the number of authors identified in 1984. As in previous years, the figures show that the field continues to be dominated by one-off studies: 98 of the 114 authors (86% of the total) contributed to only a single paper in the dataset. Four authors contributed to three outputs (Alfes, Laufer, Meara and Ringbom), while twelve additional authors contributed to two outputs. These figures represent a major advance on 1984, where two papers was the highest number of outputs that any of the authors contributed to. The majority of authors who contributed to more than one output (Alfes, Ringbom, Broeder, Béjoint, Coenen, Coltheart, Extra, Masterson, Paillard, Palmberg, and Thoiron) did not publish any relevant work in 1984, and are therefore new contributors to the data set. Furthermore, of the four authors who contributed more than one paper to the 1984 collection, only Arnaud also appears as author of more than one paper in 1985. Bahrick, Bensoussan and Mägiste, who all made multiple contributions in 1984, do not appear at all in the 1985 author list. Clearly, there is a lot of churn here, another indication that the field as a whole is far from stable.
Although more authors are producing multiple papers in 1985, the actual figures are surprisingly low. The number of authors contributing N works to a research corpus generally follows a simple power law that is summarised by Lotka's Law (Lotka, 1926). Given the number of authors contributing just one paper in the data set, the law suggests that we might have expected five or six authors to be contributing to four or more outputs, but this is not the case for our data. Although the number of authors generating multiple outputs has increased a little in 1985, this increase has not kept up with the overall increase in the number of outputs. The field as a whole seems to remain short on Big Hitters with a substantial number of outputs.
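The expectation referred to above can be illustrated with a minimal sketch. It assumes the classical inverse-square form of Lotka's Law, in which the number of authors producing exactly n outputs is the number of single-output authors divided by n squared. The exponent of 2 is an assumption, since the text does not say which form of the law was applied, and the function name is hypothetical. The observed counts are the author figures reported above.

```python
def lotka_expected(single_output_authors: int, n_outputs: int, exponent: float = 2.0) -> float:
    """Expected number of authors with exactly n_outputs under Lotka's Law.

    Assumes the classical form y(n) = y(1) / n**exponent; exponent=2 is an
    assumption, not a value taken from the paper.
    """
    return single_output_authors / (n_outputs ** exponent)


# Observed 1985 author counts reported in the text (outputs -> authors).
observed = {1: 98, 2: 12, 3: 4}

for n in range(1, 6):
    expected = lotka_expected(observed[1], n)
    print(f"{n} output(s): expected {expected:5.1f}, observed {observed.get(n, 0)}")
```

On these assumptions the observed counts for two and three outputs fall well below the predicted values, and no author reaches four outputs, which is consistent with the shortage of prolific authors noted above.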

Analysis
The citations from this data set were analysed using the same methodology that we used in the earlier papers: the co-citation method described in Appendix 1. This analysis identified a total of 1215 sources who were cited in the data set, again about twice as many as the sources identified for 1984. As usual, most of these sources are cited in only a single paper, but a number of sources are cited much more frequently than this. The most cited sources are Richards (10 citations) and Cohen (9 citations), followed by Meara (8 citations) and Krashen (7 citations). Six sources were cited six times (Corder, Faerch, Mackey, Nation, Ringbom and West) and seven sources were cited five times (Kolers, Lambert, Levenston, Macnamara, Oller, Palmberg and Rivers). The complete distribution is shown in Table 2.
The large increase in the size of the data set means that we have no problem in identifying which citations should be included in the analysis for 1985. Conventional practice in co-citation analysis is to identify the 100 most cited authors in the data set. This figure is a conventional compromise. With the 1985 data, it would be theoretically possible to analyse the co-citation patterns of all 1215 sources who are cited in the data set, but the resulting maps would be too complex for us to interpret in any meaningful way. Similarly, we could work with a smaller number of sources, say the 50 most cited authors, but this risks oversimplifying the analysis. Given the distribution in Table 2, the best choice seems to be to work with the 88 sources who are cited in three or more 1985 papers. This threshold is slightly lower than we would like (included authors are cited in only about 4% of the total output), but it is a considerable improvement on what was possible in 1984, where it was necessary for us to include authors cited only twice in order to make up the numbers. Again, this is a sign that the field is maturing, though it clearly still has a long way to go.
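The selection step described above can be sketched as follows. This is an illustration only: the paper identifiers and author labels are invented toy data, the function names are hypothetical, and the sketch assumes that an author counts once per citing paper however many of their works that paper cites.

```python
from collections import Counter


def papers_citing(citations_by_paper):
    """For each cited author, count the papers that cite them at least once."""
    counts = Counter()
    for cited_authors in citations_by_paper.values():
        counts.update(set(cited_authors))  # one count per paper, not per reference
    return counts


def select_sources(counts, min_papers=3):
    """Keep the authors cited in at least min_papers papers (three for 1985)."""
    return sorted(author for author, n in counts.items() if n >= min_papers)


# Invented toy data: paper id -> authors cited in that paper.
toy_citations = {
    "paper1": ["AuthorA", "AuthorB", "AuthorB", "AuthorC"],
    "paper2": ["AuthorA", "AuthorB"],
    "paper3": ["AuthorA", "AuthorB", "AuthorD"],
    "paper4": ["AuthorC", "AuthorD"],
}

counts = papers_citing(toy_citations)
included = select_sources(counts, min_papers=3)
print(included)
```

Lowering `min_papers` to 2 reproduces the looser criterion that had to be used for the 1984 data.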
The basic citation data was recomputed to identify all the co-citations between these 88 sources, and 1032 co-citation links were identified. These co-citations were mapped using the Gephi software package (Bastian, Heymann & Jacomy, 2009), and our preliminary analysis is shown in Figure 1. In clear contrast with the maps for 1982, 1983 and 1984, a very straightforward narrative emerges from this analysis. Gephi has found five clusters in this data set, four strongly connected clusters and a small detached one consisting of two members.
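For readers unfamiliar with the recomputation step, the following sketch shows one common way of counting co-citation links: two sources are linked each time a single paper cites both of them, and the weight of the link is the number of papers in which this happens. The data and names are invented for illustration, and the sketch is not a description of the exact procedure used to prepare the input for Gephi.

```python
from collections import Counter
from itertools import combinations


def cocitation_links(citations_by_paper, included):
    """Count, for each pair of included sources, the papers that cite both."""
    links = Counter()
    for cited_authors in citations_by_paper.values():
        present = sorted(set(cited_authors) & included)
        for pair in combinations(present, 2):  # every unordered pair in this paper
            links[pair] += 1
    return links


# Invented toy data: paper id -> authors cited in that paper.
toy_citations = {
    "paper1": ["AuthorA", "AuthorB", "AuthorC"],
    "paper2": ["AuthorA", "AuthorB"],
    "paper3": ["AuthorB", "AuthorC", "AuthorD"],
}
toy_included = {"AuthorA", "AuthorB", "AuthorC"}  # AuthorD falls below the threshold

links = cocitation_links(toy_citations, toy_included)
for (first, second), weight in sorted(links.items()):
    print(first, second, weight)
```

The resulting weighted pairs are the kind of edge list that a network package can read in order to draw the map and detect clusters.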
Davoust and Bouscaren, the two members of Cluster I, detached at the top of the map, produced a number of books covering English vocabulary for L1 French speakers at the end of the 1970s. This work is cited by a number of contributions in the special issue of Les langues modernes.
The other small cluster, Cluster II, in the south-east corner of the map, comprises two sources concerned with semantics and vocabulary acquisition (Ostyn and Harvey) and two sources who deal with dictionaries (Quirk and McArthur). This cluster seems to be the successor to a cluster in the 1984 map that dealt with componential analysis and prototype theory.
Cluster III, in the north-east corner of the map, will be familiar from our earlier analyses. It is made up largely of social and developmental psychologists, and seems to capture some of the theoretical background work that the other two large clusters call on. In previous analyses, we have noted the importance of the Montreal group (here represented by Macnamara and Lambert), and a neurolinguistics strand (Albert, Obler and Paradis). We also find a child language development strand in this group (Eve Clark, Herbert Clark, Brown and Lenneberg) and a dyslexia group (Coltheart and Baron). The remaining members of this cluster are predominantly influential and frequently cited linguists. This cluster is clearly the successor to the psycholinguistic cluster that figured so strongly in the 1984 map, but it is much reduced in size, and is much less self-referential than was the case in the earlier analysis. At the same time, this cluster has many links to Cluster IV and Cluster V described below, and seems to be fairly well integrated into the L2 vocabulary research effort.
The two remaining clusters are a new phenomenon: a coherent and substantial set of sources which are all concerned specifically with L2 vocabulary acquisition. What seems to differentiate these two clusters is that Cluster IV, in the north-west of the map, is principally concerned with English as a second language, while Cluster V, at the southern edge of the map, is concerned with other languages, notably French as a second language.
Cluster IV consists mainly of Dutch and Scandinavian sources, though the two central figures in this cluster, Andrew Cohen and Eddie Levenston, are based in Israel, as is Batia Laufer. Again, as in 1984, we need to note the influence of The Interlanguage Studies Bulletin: Utrecht in this cluster. Levenston, Melka Teichroew, Bialystok and Ringbom had all published influential vocabulary research in this journal (Ringbom 1978; Levenston 1979; Bialystok 1980; Melka Teichroew 1982) in the preceding years. However, this work is not narrowly concerned with English Language Teaching: much of the work cited is interested in broader theoretical issues in vocabulary teaching, for example Melka Teichroew's work on receptive and productive vocabulary, Cohen's work on association and mnemonic methods of learning vocabulary, and particularly Corder and Selinker's theoretical work on Interlanguage.
Cluster V is also directly concerned with teaching foreign languages, but at first glance, this cluster seems to be mainly concerned with the teaching of languages other than English, particularly French. On reflection, however, I think that the main concern of this cluster is in fact the role of frequency counts in vocabulary teaching, rather than the teaching of French in particular. The 1985 special issue of Les langues modernes contained several articles addressing this question, and there was clearly some strongly expressed dissension about the importance (or not) of frequency counts among French linguists and language teachers at the time. Inevitably, most of these papers refer to the earlier seminal work of Gougenheim, Michéa, Rivenc and Sauvageot, who authored a particularly important set of studies on basic French vocabulary, Le français fondamental (Gougenheim et al., 1964). This work was heavily criticised by Galisson, while  also published a critical account of this work and its usefulness for language learners. Richards, who had also worked independently in the vocabulary area (Richards, 1974; 1976), actually emerges as the dominant influence in this cluster, but I think this is at least partly the result of an edited volume of readings dealing specifically with vocabulary teaching (Richards, 1980), and a particularly influential paper dealing with what it means to know a word (Richards, 1976; 1985).
Figure 2 shows the same data as Figure 1, but in this figure I have deleted all the weakest co-citation links (the links that appear only once in the mapping) in order to make the connections between the clusters easier to follow. This simplified map strongly emphasises the emergence of the Scandinavian influences on the research and the importance of Andrew Cohen as the lynchpin of this group. The simplified map also emphasises the strong links within the Montreal group, and the central role of word frequency in Cluster V.
The map also highlights the anomalous status of Macnamara, Meara and Blum: nominally these three sources form part of Cluster V, but in fact their strongest links are to other clusters, suggesting that they may be playing an important role as mediators between the clusters.
Clearly, 1985 has seen some significant shifts in the research, with a number of new themes emerging that were not present in the 1984 data. However, we also have a great deal of continuity at the same time. Figure 3 shows the 1984-85 "survivors": sources who are cited both in the 1984 data set and the 1985 data set. These survivors number 32. This total figure is higher than the equivalent figure for 1984, despite the fact that the 1985 map contains fewer nodes than the 1984 map. In percentage terms, about 36% of the sources survive from 1984 to 1985. I take this as an indication that the field is becoming less volatile, and more stable.
Gephi finds that these 32 survivors fall into two clusters: a small psycholinguistic cluster dominated by Macnamara, and a larger L2 vocabulary cluster dominated by Cohen, Richards and Nation. This is the first time that a coherent and easily identifiable L2 vocabulary group has appeared so clearly in the data, and it strongly suggests that a proper framework for thinking about L2 vocabulary acquisition is beginning to emerge. The psycholinguistics cluster is about the same size and has roughly the same composition as the psycholinguistic survivors from 1983-1984, though both the major influences from 1984 (Lambert and Kolers) seem to be less important in this data. Their place has been taken by another member of the Montreal group, John Macnamara, who is more of an educational psychologist than a psycholinguist. I take this to mean that this cluster is a stable group of influences as far as the L2 vocabulary research is concerned, but that the way this group is influencing the work of L2 vocabulary researchers may be shifting slightly.
It is noticeable that the new L2 vocabulary group has a few stronger co-citations with the psycholinguistics group, mainly mediated through Herbert Clark, and the very tenuous links that characterised the 1984 data now appear to be greater in number and more varied.
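The survival figure quoted above is simple set arithmetic, which the following sketch makes explicit. The two totals (32 survivors, 88 sources in the 1985 map) are taken from the text; the function names and the small author sets are hypothetical, and are included only to show how survivors and newcomers would be derived from two maps.

```python
def turnover(previous_map, current_map):
    """Split the current map into survivors (also in the previous map) and newcomers."""
    survivors = current_map & previous_map
    newcomers = current_map - previous_map
    return survivors, newcomers


def percentage(part, whole):
    """Share of the whole, as a percentage."""
    return 100.0 * part / whole


# Figures reported in the text for the 1985 map.
sources_1985 = 88
survivors_1985 = 32
print(round(percentage(survivors_1985, sources_1985)))                 # survivors
print(round(percentage(sources_1985 - survivors_1985, sources_1985)))  # everyone else

# Invented toy maps, to show the set operations themselves.
toy_1984 = {"AuthorA", "AuthorB", "AuthorC"}
toy_1985 = {"AuthorB", "AuthorC", "AuthorD", "AuthorE"}
toy_survivors, toy_newcomers = turnover(toy_1984, toy_1985)
print(sorted(toy_survivors), sorted(toy_newcomers))
```

Because every source in the 1985 map is either a survivor or a newcomer, the two percentages necessarily sum to 100.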
Alongside this firming up of the core part of the map, 1985 also saw a significant number of new sources who had not appeared in the 1984 map. A separate analysis of these new entrants is shown in Figure 4. This figure is basically the same map as Figure 2, but with sources that appear in both the 1984 and the 1985 map removed. Gephi has re-analysed the clusters that appear in this smaller map, and as a result a new cluster of influences has been identified, splitting cluster IV in two. This new cluster comprises Schreuder, Levelt, Sciarone, Bongaerts, and Hatch. With the exception of Hatch, this is clearly a Dutch group whose members are more often cited alongside the psycholinguists than is the case for the other members of cluster IV.
Overall, we have 56 new entrants to the 1985 map, which means that nearly two thirds of these sources are newcomers. This figure is slightly lower than the equivalent figure in the 1984 data. In the 1984 map, 86 nodes were newcomers who had not appeared in the 1983 map, some 78% of the total. For 1985, the new entrants represent 64% of the total sources. This is still a high level of churn, but it suggests that the overall picture might be stabilising. It is difficult to get a clear picture of the dynamics of L2 vocabulary from a detailed analysis of a single year's data, and the comparisons with the 1984 data are difficult to interpret because the 1984 data has a looser criterion for inclusion (two citations rather than three). This problem will become less important in future reports as the number of papers available for analysis gets larger: in each of the years 1986, 1987, 1989 and 1990 the number of papers published exceeds 100.
Meanwhile the trend identified here does look significantly different from the main trend that we identified in 1984. In that year, the majority of new influences were single researchers who did not form strong clusters. In 1985 what we have is the sudden emergence of whole new clusters of influence. This feels to me like a major change in the way the field is operating. Clearly, the best interpretation of this map is that two new streams of influence have emerged in the 1985 data. The new "Scandinavian" cluster does indeed represent a genuine innovation in the field, and the size of the cluster as well as the strength of the co-citation links within this cluster suggest that it is unlikely to be an insubstantial flash in the pan. On the other hand, the cluster dominated by Richards is not really a new research area. Rather, the debate that is identified by this cluster, the role of frequency counts in language teaching, is one which is ongoing in the literature. It surfaces from time to time, usually in response to changes in the language teaching curriculum imposed by an Education Ministry. The earliest discussion of the use of word frequency counts in curriculum design dates back at least as far as the 1920s (e.g. Henmon, 1924; Buchanan, 1927). In that sense, this cluster of influences appears to be a rehashing of old arguments, rather than a proper new development, although the sudden appearance of Jack Richards as the most significant influence in this cluster does perhaps suggest that something new is going on.

Discussion
So far, our discussion has been concerned with the obvious features of the co-citation maps in Figures 1-4, but it is important not to lose sight of other aspects of the 1985 data which do not appear explicitly in the maps.
Firstly, although the 1985 data set contains a number of papers in German, there is not really a German presence in the maps. Unlike the French sources, which tend to cite a common set of influences, the German outputs appear to be less focussed on a single topic, and more promiscuous in whom they cite. As a result, the sources they cite do not reach the threshold for inclusion in the larger map. Additionally, although the German outputs do occasionally cite English language research, writers in English rarely cite German language research. This means that the German research, substantial though it appears to be, is largely invisible. This trend is exacerbated by the very different citation practices that are to be found in the German literature compared with the English language papers. For example, Alfes, who authored three papers in the 1985 data set, cites a total of 43 sources in his three papers, but only two of these sources (Kruppe and Thiems) are cited more than once in his own work. Of the other 41 sources cited by Alfes, only three are also cited by other papers in the dataset (Ilson, Kastovsky and Stein), and Alfes himself is cited by only one other author (Scherfer). In general, the German research seems to be detached from the rest of the work in this data set (though Scherfer does have some shared citations with the French research that we identified in Cluster V).
Much the same comment could be made about the French language publications in 1985. More than half of these publications have no citations that contribute towards the map. The outstanding author in this French group is Pierre Arnaud, who mainly cites English language sources, and is only infrequently cited by his French language colleagues.
Secondly, it is worth noting that a number of significant influences in the 1984 map are not cited often enough for them to appear in the 1985 map. The most notable loss is, of course, the psychologists who appear in a densely connected cluster that completely dominates the 1984 map. Only a handful of these sources survive into 1985, principally the members of the Montreal group. This group remains influential, largely because of a shared interest in L1 vocabulary acquisition. Nevertheless, the co-citation links between this group and the new Scandinavian vocabulary research cluster remain weak, and it seems likely that their influence will wane in the future. Most of the active researchers in the Scandinavian group seem to look to Corder and Selinker, and other linguists, for their inspiration, rather than to the models and theories developed by psychologists. Neither memory research nor neurolinguistic research figures in the 1985 map. The only new entrants in this group are Baron and Coltheart, who are cited by authors working on L2 word recognition. This may represent a genuinely new research interest, but only the weakest co-citation links connect these two sources to the rest of the network, so that they appear as a detached subgroup in Figure 3. This does not bode well for the long-term growth of this sector of the field.
Thirdly, the 1984 cluster dealing with vocabulary attrition that we identified as a possible growth area (Bahrick, Bahrick and Wittlinger) has also failed to develop into a substantial research strand. Attrition does not figure as a major research interest in the 1985 map. Also missing is an obvious dictionary use cluster which looked as though it might emerge in the 1984 data, and most of the theoretical sources that deal with semantics no longer figure in 1985 either: Ryberg and Rosch, who both looked to be central influences in 1984, were not cited by anyone in the 1985 data set. Likewise, corpus linguistics plays no obvious role in the 1985 dataset. We also identified an incipient L2 reading cluster in 1984, and that too has failed to materialise in 1985.
Finally, it is worth commenting on the position of Krashen in the 1985 data. Krashen was a minor figure in the 1984 data, with a few weak links to a number of the clusters. By 1985, Krashen has emerged as a very significant influence in L2 vocabulary acquisition research, second only to Jack Richards in terms of the betweenness centrality measure. Oddly, Krashen is only rarely cited alongside other L2 vocabulary researchers, and his strongest co-citation links are to be found with the psycholinguistics group. Much of Krashen's later work (e.g. Krashen 1989; Dupuy and Krashen 1992) would be explicitly concerned with L2 vocabulary acquisition, but in 1985 this was not yet the case. Rather, he appears to be cited in this data set for his two books on second language acquisition, and his work on the Monitor Model (Krashen 1981 and ...). That is, Krashen appears here as a general L2 acquisition theorist, rather than as an L2 vocabulary acquisition researcher.
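The betweenness centrality measure mentioned above rewards sources that lie on the shortest paths between other sources, which is why a source can score highly even when it has few direct links to the main vocabulary cluster. The sketch below illustrates the idea on an invented toy network, using Brandes' algorithm for an unweighted, undirected graph. The node names and the network are hypothetical, and the sketch makes no claim about how the values in this paper were computed or whether link weights were taken into account.

```python
from collections import deque


def betweenness(adjacency):
    """Unnormalised betweenness centrality (Brandes' algorithm, unweighted graph)."""
    score = {node: 0.0 for node in adjacency}
    for source in adjacency:
        # Breadth-first search: count shortest paths and record predecessors.
        order = []
        predecessors = {node: [] for node in adjacency}
        paths = {node: 0 for node in adjacency}
        distance = {node: -1 for node in adjacency}
        paths[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    paths[w] += paths[v]
                    predecessors[w].append(v)
        # Accumulate dependencies, starting from the nodes farthest from source.
        dependency = {node: 0.0 for node in adjacency}
        for w in reversed(order):
            for v in predecessors[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Each unordered pair of nodes has been counted from both of its endpoints.
    return {node: value / 2 for node, value in score.items()}


# Invented toy network: two tightly knit groups joined only through "Bridge".
toy_network = {
    "A1": ["A2", "A3", "Bridge"],
    "A2": ["A1", "A3"],
    "A3": ["A1", "A2"],
    "Bridge": ["A1", "B1"],
    "B1": ["B2", "B3", "Bridge"],
    "B2": ["B1", "B3"],
    "B3": ["B1", "B2"],
}

scores = betweenness(toy_network)
for node, value in sorted(scores.items(), key=lambda item: -item[1]):
    print(node, value)
```

In this toy network the bridging node has only two links, fewer than any other well-connected node, yet it has the highest betweenness score because every shortest path between the two groups passes through it.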

Conclusion
This paper has presented a bibliometric analysis of the L2 vocabulary research that appeared in 1985. The dramatic rise in the number of research outputs for this year suggests that there was a renewed interest in L2 vocabulary research compared with previous years, and some significant shifts in the research priorities. Some of the promising research identified in 1984 seems to have petered out, or at least failed to develop in the way we might have expected, and a large number of the Significant Influences identified in 1984 no longer qualify as such in 1985. The field as a whole continues to be dominated by one-off studies, rather than on-going research programs.
Nevertheless, there is some evidence that a distinctive L2 vocabulary research program is beginning to emerge. The number of "survivors" from 1984-1985 is larger than the equivalent figure in 1984, and the proportion of new entrants into the list of most significant influences is a bit smaller in 1985 than it was in 1984. Both these trends may indicate that the L2 vocabulary field is entering a new phase. The research clusters that we identified in 1985 are fewer in number and more focussed than those we find in the 1984 map. This strongly suggests that the field is beginning to stabilise. We noted the reduced importance of the psycholinguistic influences in the 1985 map, and the emergence of a distinctive new cluster of Applied Linguistics influences centred on Andrew Cohen and Eddie Levenston, backed up by a group of Scandinavian researchers. All this suggests that an autonomous L2 vocabulary research thrust is about to become active in 1986. We will explore this development in the next paper in this series.