Linguistic Contribution to the History of Sub-Saharan Africa

Themes and actions

Team webmaster : Hadrien GELAS, Rebecca GROLLEMUND, Jean-Marie HOMBERT

    The first original feature of this project is that it proposes to focus primarily on those languages which can be rightfully suspected to be isolates, according to a majority consensus among specialists (cf. list in Task 1). We thus intend to collect and collate all published and unpublished linguistic data on those language isolates, by consulting published sources, old and recent, and by contacting scholars who have done first-hand research on them. We also intend to collect first-hand material ourselves on some of those languages, in our areas of expertise (e.g. on the Irimba language, spoken by a group of Pygmies in Southern Gabon). Our project, then, aims explicitly at achieving a new classification of African languages, which could supersede Greenberg's, now in use for the last 40 years.

     The second original aspect is the application of new methodologies to these corpora. For instance, the method of virtual reconstructions (i.e. the reconstruction of the shape of the source item for lexical items putatively derived from a proto-language) can be a powerful tool for the identification of loan-words and the stratification of the lexical material found in the language(s) under examination - this, it must be said, applies first and foremost to Bantu languages and secondarily to other well-reconstructed groups, like Eastern and Southern Nilotic, Eastern Cushitic, and a few others. Virtual reconstructions have been developed and applied by members of our laboratory working on cultural vocabularies of Gabonese languages (with the pioneering studies of Mouguiama-Daouda on ichthyonyms (1995), and, jointly with Hombert, on mammal names).

     A third novel angle will be the taking into consideration of ecological issues in the dynamics of language evolution. In conformity with Nettle's approach, we expect to find maximum diversity of languages in the wet equatorial areas. The GARP software (Genetic Algorithm for Rule-Set Prediction), originally developed for determining the ecological niches of plant and animal species, can be applied to archaeology (cf. Banks et al. 2008) by correlating geographical coordinates of known archaeological sites with their estimated ages. Their paleo-environment can thus be determined with considerable precision, which should have momentous consequences for our understanding of past population movements in sub-Saharan Africa. In particular, a precise knowledge of ancient forest-savanna ecotones will help us better to assess the dynamics of hunting-gathering groups - and later on of agriculturalists also, since it has been shown that most cultivating groups in Africa also made extensive use of wild products at their disposal, and this at the forest-savanna ecotone in Central Africa as well as on the slopes of the East African highlands.

     Consequently, the scientific contribution made by this project will be a more precise reconstruction of the history of African peoples, and a new classification of their languages in and around the equatorial forest through the use of recently developed tools for historical linguistics and language ecology.

     The possible obstacles that we have to face are obvious: the segmentation of the field between various disciplines (history, archaeology, anthropology, population genetics and of course linguistics). However, we shall benefit from the very serious progress achieved over the last five years by the Eurocores Program "Origins of Man, Language and Languages", whose very aim was to enable practitioners of the various sciences to find a common ground and learn to communicate with one another. As former participants in the program, we feel assured that considerable progress has indeed been made in this direction - witness, as far as Bantu Africa is concerned, the most interesting results of the "Gabon languages and genes" project, conducted at DDL under the direction of Professor Lolke van der Veen (mentioned above under 1.1).

  Task 1 - Classification: Towards a new classification of African languages

    As mentioned under 1.1., African linguistic diversity is much less than that exhibited by other language areas. There are several possible explanations for this. The first might be that extant classifications underestimate actual diversity.
     Ever since Greenberg (1963), the Africanist scientific community has by and large accepted his classification of African languages into four large phyla: Niger-Congo (aka Congo-Kordofan or Niger-Kordofan), Nilo-Saharan, Afro-asiatic and Khoisan. This classification was admittedly far from revolutionary in its broad outlines, since the Khoisan phylum was more or less identical with the click language group recognised at least since Bleek, and Afro-asiatic was equivalent to the old Hamito-Semitic family, to which Greenberg boldly added not only the "Chado-Hamitic" languages of Westermann (long felt to be related to Hamito-Semitic, e.g. by the French Semiticist M. Cohen), but also the Chadic ("non-Hamitic") languages that Westermann wanted to keep separate from the others on account of their lack of grammatical gender opposition.
     The other two phyla "Niger-Congo" and "Nilo-Saharan" were largely based on Westermann's West- and Ostsudansprachen, with a few modifications, like moving the Songhai group (Mali, Niger) into Nilo-Saharan, but most notably the provocative inclusion of Bantu languages as a sub-branch of the Benue-Congo family within Niger-Congo (although Greenberg himself acknowledged that Westermann tacitly supported this interpretation, while the French traditional Africanist school - Homburger, Delafosse, etc. - considered Bantu and "Sudanic" languages as the two branches of a "Negro-African" phylum).
     There does not seem currently to be any doubt left as to the inclusion of Bantu into Niger-Congo, nor indeed about the cohesion of the latter, albeit with a great many disagreements about its internal structure. The unity of the other three phyla however is still disputed, least of all Afro-asiatic, where only the place of Omotic (South-Western Ethiopia), or indeed its inclusion within the phylum, is still open to queries, at the very least for some of the component languages. But most Khoisan specialists do not now regard all click languages as being related: the inclusion of Hadza (Tanzania) is generally rejected, whereas of the other sub-groups the only one to form an accepted genealogical grouping is Khoe (which admittedly contains the largest number of languages), all other languages being potential isolates.
     But the most disputable phylum appears to be Nilo-Saharan: even setting apart the case of Songhai, whose inclusion is rejected by many specialists, it is notable that the constitution of the phylum by Greenberg took place fairly late: in his original classification, published in the early 1950s, he only recognized an "Eastern Sudanic" group, included along with "Central Sudanic" and two smaller language groups, Kunama and Berta, within a larger "Macro Sudanic" or "Chari-Nile" phylum. Only in 1963 was the decision taken to join "Macro Sudanic" with "Central Saharan" (Kanuri, Teda, Zaghawa), Songhai and three smaller groups of the Ethiopia-Sudan region (Maban, Fur and Koman) which had been left isolated in the first version.
     One of the most telling proofs of the problematic status of "Nilo-Saharan" is that two recent attempts at reconstructing "proto-Nilo-Saharan" (Bender 1996 and Ehret 2001) end up with two very different - in fact incompatible - internal classifications of the phylum. Even "Eastern Sudanic", which should presumably prove most resistant to restructuring, does not escape entirely unscathed: Greenberg's "Teuso" (nowadays more generally called Kuliak, a remnant language group in eastern Uganda) is taken by Bender outside of the "Eastern Sudanic" family altogether, whereas Ehret firmly retains it (in fact many contemporary researchers would consider Kuliak an isolate).
     If we now turn to those languages which most specialists would hesitate to classify or declare outright to be isolates, we can propose the following list (cf. Blench, Maho, etc.):

     - Hadza (Tanzania): Khoisan for Greenberg
     - Sandawe (Tanzania): Khoisan for Greenberg
     - Kwadi (Angola): Khoisan for Greenberg
     - Jalaa (Nigeria): not mentioned by Greenberg, possibly Niger-Congo
     - Laal (Chad): not mentioned by Greenberg, possibly Niger-Congo
     - Kujarge (Chad): not mentioned by Greenberg
     - Ongota (Ethiopia): not mentioned by Greenberg, possibly Afro-asiatic
     - Shabo (Ethiopia): not mentioned by Greenberg, possibly Nilo-Saharan
     - Gomba (Ethiopia): not mentioned by Greenberg, possibly Afro-asiatic
     - Bangi-me (Mali): not mentioned by Greenberg
     - Pre (Ivory Coast): not mentioned by Greenberg, probably Niger-Congo
     - Irimba (Gabon): not mentioned by Greenberg

     As mentioned above, the study of linguistic isolates should prove of particular interest in view of establishing the existence of a former diversity in the African continent. By 'linguistic isolates' we mean not only those 12-odd languages impossible to classify with any certainty - which might of course be partly due to defective information - but also the original stratum of languages which can be classified, albeit with considerable hesitation as to their exact hierarchical position. A notable example might be that of Dahalo, an undoubtedly Cushitic remnant language of coastal Kenya, which has in its lexicon almost 100 lexical items containing clicks - these being of course unknown in other Cushitic languages and unrelated to other click languages of Eastern and Southern Africa. One should also keep in mind, in this respect, the particular lexicon identified by Bahuchet in various pygmy languages.
     The first task then will consist in collecting and collating data on all known language isolates in Africa as explained in 1.3. All our international collaborators, who are specialists in several of these languages and have accumulated considerable data on them (e.g. Gueldemann for Khoisan, Mous for Afro-asiatic, etc.) will naturally be involved in this task, but we will also contact other scholars with first-hand experience of the other languages, in order to ensure as broad a coverage as possible. This joint cooperative effort will be carried out in the first year of the project, at the end of which an international conference should come up with the new proposed classification.

  Task 2 - Reconstruction: Historical inferences of linguistic reconstructions

     The second task aims at determining the specific modes of contact between food-producers and hunter-gatherers throughout Central, Eastern and Southern Africa.
     Since, as mentioned in 1.1, it appears likely that the expansion of food-producing peoples was conducive to a reduction of linguistic diversity, there remains the apparent paradox that Central African hunter-gatherers all speak languages identical or at least closely related to those of their agricultural neighbours, whereas this is not the case in Southern Africa where "San" hunter-gatherers have maintained a considerable amount of linguistic diversity (all the more so, since "Khoisan" as we saw in 1.4.a is probably not a valid genealogical grouping). Nettle's explanation, as we saw in 1.2, has to do with environment: hunter-gatherers in arid environments rely on mobility to escape stress and so are better able to retain their own linguistic and cultural identity. One should however notice that in no less arid areas of East Africa, linguistic diversity among hunter-gatherers is less, although still fairly significant - in this case, the time factor is doubtless decisive, interaction between East African hunter-gatherers and food-producers predating by at least 1,000 years - probably more - similar contacts in Southern Africa.
     Second, the concrete trajectories followed by the expanding food-producing populations need to be precisely determined. To take one very significant example, it is not known - and has been in dispute over many years - whether Bantu-speaking food-producers traversed the entire breadth of the rain forest before emerging in East Africa, or followed a path north of it, or again crossed it on a roughly north-western / south-eastern orientation to emerge in the savannas south of the forest. We favour the second interpretation, and will thus attempt to support it by selecting specific languages situated at various strategic points on the northern rim of the rain forest (e.g. the Grassfields languages, languages of the Congo Interior basin and languages of the Great Lakes area in East Africa).
     In order to achieve this aim, we will have to set up data bases for significant domains of specialised vocabularies in the languages under study: these would concern wild fauna and flora, hunting and fishing techniques, agriculture, animal husbandry, pot-making, iron-working, etc.
     A not insignificant amount of cultural vocabulary has already been identified for Proto-Bantu. Some crop names can be confidently reconstructed: cowpea (Vigna unguiculata and / or V. sinensis), yams (Dioscorea spp.); Bambara groundnut (Vigna (= Voandzeia) subterranea); and some species of pumpkin (Cucurbita spp.). It is noteworthy that no cereal names are reconstructible at this level, and indeed expanding Bantu populations do not seem to have cultivated grain crops until they reached East Africa (Ehret 1998). The case of bananas is more complex: they provide the staple crop of most groups in the equatorial forest as well as a substantial number in the moister areas of East Africa, but it is very difficult to reach a consensus on the date of their introduction; the traditional view is that they came late but some archaeological data point to their presence in Cameroon c. 500 BC, so that the actual role they played in Bantu expansion is by no means settled.
     We also have good surveys of the vocabulary of pot-making (Bostoen), fauna (Mouguiama-Daouda) etc. These mostly concern Bantu languages however. Much work remains to be done on other groups, although a fair amount of cultural vocabulary (mostly animal husbandry) has been reconstructed for Eastern and Southern Nilotic (Vossen and Rottland respectively).
     We will obviously have to complement the available data on specialised lexicon for carefully selected languages (in particular Tsogo, Mongo, Ganda and Kikuyu). This will be achieved by field trips as required.
     By applying the method of virtual reconstructions, one can propose putative ancestral forms for vocabulary items attested in present-day languages, which allows a stratification of these items in terms of their relative antiquity. As indicated in 1.3, establishing virtual reconstructions means the application of regular correspondences to items found in present-day languages so as to mimic the shape of the putative ancestral items. In some cases, these reconstructions would coincide with roots already identified (for instance in Guthrie's or Meeussen's lists); in other cases, they will reconstruct to regular proto-roots which would then have to be assigned to a regional proto-language (daughter language to Proto-Bantu); and in others yet they will point to irregular correspondences, which are probably the result of loans or other lateral influences. We intend to develop a program for the automatic generation of these virtual reconstructions, which should be fully operational during the third year of the project, the data bases themselves having been completed by the end of the second year.
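     The three outcomes just described can be illustrated by a minimal sketch. It is not the program we intend to develop: the language labels, the sound correspondences, the word forms and the list of "known roots" below are all invented toy data, not real Bantu material, and the segment-by-segment mapping is a deliberate simplification.

```python
# Toy illustration of virtual reconstruction. ALL data here are hypothetical:
# "lang_a"/"lang_b", their correspondences and the word forms are invented.

# For each language: attested segment -> putative proto-segment.
# Segments missing from the table are assumed to be unchanged.
CORRESPONDENCES = {
    "lang_a": {"h": "p", "l": "d"},
    "lang_b": {"f": "p", "r": "d"},
}

# Stand-in for a published list of already identified proto-roots.
KNOWN_ROOTS = {"pada"}


def virtual_form(word, reflex_to_proto):
    """Apply the regular correspondences in reverse, segment by segment."""
    return "".join(reflex_to_proto.get(segment, segment) for segment in word)


def classify(cognate_set):
    """Classify a set of look-alike words, given as {language: attested form}.

    Returns a (status, virtual forms) pair, where status is one of
    "known root", "regular, unlisted root" or "irregular".
    """
    forms = {
        virtual_form(word, CORRESPONDENCES[language])
        for language, word in cognate_set.items()
    }
    if len(forms) > 1:
        # The languages do not converge on one ancestral shape:
        # a possible loan or other lateral influence.
        return "irregular", sorted(forms)
    (form,) = forms
    if form in KNOWN_ROOTS:
        return "known root", [form]
    # Regular, but not yet listed: candidate for a regional proto-language.
    return "regular, unlisted root", [form]


print(classify({"lang_a": "hala", "lang_b": "fara"}))
print(classify({"lang_a": "hulo", "lang_b": "furo"}))
print(classify({"lang_a": "halo", "lang_b": "haro"}))
```

     In this toy setting the first pair converges on a listed root, the second on a regular but unlisted shape, and the third fails to converge because the lang_b form shows an h where the correspondences would predict f.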

  Task 3 - Modelling: Languages, populations and environmental constraints

     After having developed our lexical data bases and reached some idea about the temporal stratification of (regional) proto-languages, we should turn to trying to correlate these results with archaeological data. A growing number of sites have been identified in Bantu-speaking Africa, although it must be admitted that the rainforest is still deficient in this regard (but cf. work done by Clist, Oslisly et al. in Gabon). Reliable dates have been provided by the application of known techniques like C14 dating, which have shown a much greater antiquity, for instance for iron-working, than was previously recognised.
     What we would like to know is whether any correlation with our linguistic results can be achieved, and in particular whether any idea of the different densities of population exhibited by different sites can be obtained. We know that paleo-environments in Africa varied widely over the last 30 millennia or so, although more stability seems to have obtained since the end of the last glaciation.
     The GARP programme (Genetic Algorithm for Rule-Set Prediction) has been applied successfully to European Paleolithic sites, and by integrating climatic data with the dates proposed for the sites has proved able to identify different models of population density associated with them. We intend to apply the program to Central, Eastern and Southern African sites, and hope to obtain a confirmation of our linguistic results. This should constitute the final part of the project with a final conference in the fourth year.
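     To give a concrete idea of what associating sites with climatic data involves, the sketch below fits the simplest kind of rule used in niche modelling, a climatic "envelope" (the range of each variable observed at known sites), and tests whether another location falls inside it. This is not GARP itself, which searches over many competing rules with a genetic algorithm; the site names, the two variables and all values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """A dated site with two illustrative climatic variables (invented values)."""
    name: str
    rainfall_mm: float
    mean_temp_c: float


def fit_envelope(sites):
    """Return the (min, max) range of each variable over the known sites."""
    rainfall = [site.rainfall_mm for site in sites]
    temperature = [site.mean_temp_c for site in sites]
    return {
        "rainfall_mm": (min(rainfall), max(rainfall)),
        "mean_temp_c": (min(temperature), max(temperature)),
    }


def in_envelope(envelope, rainfall_mm, mean_temp_c):
    """True if a location lies within the observed range of every variable."""
    low_r, high_r = envelope["rainfall_mm"]
    low_t, high_t = envelope["mean_temp_c"]
    return low_r <= rainfall_mm <= high_r and low_t <= mean_temp_c <= high_t


# Hypothetical sites of one period; names and figures are placeholders.
sites = [
    Site("site_1", 1400.0, 24.0),
    Site("site_2", 1800.0, 25.5),
    Site("site_3", 1600.0, 23.0),
]
envelope = fit_envelope(sites)
print(envelope)
print(in_envelope(envelope, 1500.0, 24.0))  # inside both ranges
print(in_envelope(envelope, 600.0, 27.0))   # outside both ranges
```

     Fitting one such envelope per period and comparing them across periods is one simple way to visualise how the conditions associated with the sites shift over time.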

