Abstract at: http://www.blackwell-synergy.com/links/doi/10.1046/j.1469-1809.2003.00039.x/abs/

Annals of Human Genetics Volume 67 Issue 4 Page 312  - July 2003

Joining the Pillars of Hercules: mtDNA Sequences Show Multidirectional Gene Flow in the Western Mediterranean
S. Plaza, F. Calafell, A. Helal, N. Bouzerna, G. Lefranc, J. Bertranpetit and D. Comas
Summary

Phylogenetic analysis of mitochondrial DNA (mtDNA) performed in Western Mediterranean populations has shown that both shores share a common set of mtDNA haplogroups already found in Europe and the Middle East. Principal co-ordinates analysis of genetic distances and principal components analysis of haplotype frequencies show that the main genetic difference is attributable to the higher frequency of sub-Saharan L haplogroups in NW Africa, which indicates some gene flow across the Sahara desert, with a major impact in the southern populations of NW Africa. The AMOVA demonstrates that SW European populations are highly homogeneous whereas NW African populations display a more heterogeneous genetic pattern, due to an east-west differentiation as a result of gene flow coming from the East. Despite the shared haplogroups found in both areas, the European V and the NW African U6 haplogroups reveal traces of the permeability of the Mediterranean Sea to female migrations, and allow the genetic contribution of each shore to the genetic landscape of the region to be determined and quantified.

Comparison of mtDNA data with autosomal markers and Y-chromosome lineages, analysed in the same populations, shows a congruent pattern, although female-mediated gene flow seems to have been more intense than male-mediated gene flow.

Introduction

The western Mediterranean populations have experienced a long, intricate history that, too often, has been considered separately for the African and European shores, or from an exclusively European perspective. Both the African and the European shores have acted as termini of population expansions. The independent and parallel colonisation from the East of both areas by anatomically modern humans in Palaeolithic times, and the expansion of farming during the Neolithic, have shaped the genetic landscape of both areas. Moreover, other demographic events, such as the spread of Arabisation across the Maghrib, also arrived in NW Africa from the East.

Genetic diversity studies have provided a major insight into human evolution on a global scale, but they have also been useful in regional studies. Population processes such as expansions, migrations, dispersals and admixtures leave a footprint in the genetic composition of the groups that allows us to trace back population history. Several genetic markers have been analysed in the westernmost part of the Mediterranean in order to disentangle such processes. The compilation of classical genetic markers (Bosch et al. 1997; Simoni et al. 1999) has shown a clear genetic differentiation between the northern and southern coasts, attributed to independent parallel expansions along the two shores followed by little gene flow across the Mediterranean. Nevertheless, there are some contradictory data, based on HLA polymorphisms, on the degree of genetic relationship between the two coasts in West Mediterranean populations (Arnaiz-Villena et al. 1995; Comas et al. 1998). Analyses of autosomal STRs (Bosch et al. 2000) and Alu insertion polymorphisms (Comas et al. 2000) confirmed the genetic difference between the two groups of populations, and also detected some sub-Saharan gene flow into NW African populations. The high-resolution analysis of Y-chromosome biallelic and STR markers (Bosch et al. 2001) has revealed clear genetic differentiation due to a major independent Upper Palaeolithic contribution in both areas, followed by gene flow from the Near East during the Neolithic, and small bidirectional gene flow across the Mediterranean. Several mitochondrial DNA (mtDNA) analyses have focused on the structure of Iberian populations (Bertranpetit et al. 1995; Côrte-Real et al. 1996; Salas et al. 1998; Pereira et al. 2000), of NW African populations (Rando et al. 1998; Brakez et al. 2001), and their relation to the Canary Islands (Pinto et al. 1996). Nevertheless, no analysis has jointly considered the population relationships of Western Mediterranean populations using mtDNA sequences.

The analysis of mitochondrial DNA diversity has been one of the most successful tools applied to unravel regional population histories. Two different approaches have been followed in order to perform mtDNA analyses: the sequencing of the hypervariable segments of the non-coding part of the molecule, the control region, and the study of the coding region through high-resolution RFLPs. The joint analysis of both kinds of markers (control region sequences and RFLPs in the coding region) has proven to be a powerful tool in studying human diversity (Torroni et al. 1996), and has led to the construction of robust phylogenies of mtDNA sequences (Macaulay et al. 1999), which allow one to elucidate human demographic scenarios.

In the present study, we have analysed the hypervariable segment I (HVSI) of the control region in several Western Mediterranean populations, and have added the information yielded by three SNPs in the mtDNA-coding region in order to ascribe the mtDNA variation to specific branches of the gene genealogy. This analysis allows us to describe the genetic landscape of the geographic region, compare it to that obtained with other genomic regions (particularly those with a clear phylogeography, such as the Y-chromosome), and interpret it in terms of external gene flow and of exchanges between the northern and southern shores of the Mediterranean.

Discussion

The phylogeographic analysis of mtDNA in the Western Mediterranean has shown the presence of a common set of haplogroups shared with the rest of Europe and the Middle East (H, J, T, U, I, W, X), plus those of probable local origin (U6, V), and others introduced by gene flow from the south (L) and east (M). In this respect, our regional study, which has gathered published and new samples not previously analysed jointly, confirms the basic framework described by Richards et al. (2000) for Europe and by Rando et al. (1998) for NW Africa. It should be noted, though, that inferring haplogroups from HVSI sequences and three coding-region SNPs could lead to slight imprecisions in the allocation of sequences to haplogroups. For instance, although we have assigned all CRS (Cambridge Reference Sequence) sequences to haplogroup H, 1.5% of all CRS sequences in West Eurasia belong to haplogroup HV* and 3.9% to U* (Richards et al. 2000). Typing of SNP 7028 could help in resolving this ambiguity, which nonetheless affects a relatively small number of sequences.

An additional caveat that should be taken into account throughout the discussion is that, although we define our area of study as the Western Mediterranean, no HVSI sequences are available for some areas, such as southern France, Corsica, northern Italy and the Kabyle in northern Algeria. It is likely that such missing data would refine some of the conclusions we reach below.

We now discuss in detail the phylogeographic patterns for NW Africa, Iberia and Italy, and the trans-Mediterranean gene flow.

Northwest African mtDNA Landscape

The main difference found through the mtDNA analysis between the populations of the two geographical areas studied is the presence of sub-Saharan L lineages in NW Africa compared to SW Europe, to the point that, when L sequences were removed from the analyses, most NW African populations were genetically very close to SW Europeans. Since L sequences make up almost all mtDNA lineages in sub-Saharan Africa, and particularly in the areas just to the south of NW Africa, the frequency of L haplogroups in NW Africa can be read directly as a measure of gene flow. Thus, it can be estimated that 25.9±2.1% of the NW African mtDNA pool has a sub-Saharan origin, under the assumption of negligible back flow from NW to sub-Saharan Africa. A similar estimation can be performed for Y-chromosome lineages, since the E1* and E3a* haplogroups (according to the nomenclature of the Y Chromosome Consortium, 2002), found in NW Africa at a frequency of 8.0±2.0% (Bosch et al. 2001), are of sub-Saharan origin. The female- and male-mediated estimates of sub-Saharan gene flow into NW Africa are clearly different, which could be a local consequence of a global trend towards higher female than male migration (Salem et al. 1996; Seielstad et al. 1998; Pérez-Lezaun et al. 1999). Autosomal markers such as Alu insertion polymorphisms also show frequency patterns compatible with gene flow from sub-Saharan Africa into NW Africa (Comas et al. 2000), although the absence of a clear phylogeographic structure in that case prevents the estimation of gene flow without specifying a parental, non-admixed population for NW Africa.
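The way a haplogroup frequency can be read as a gene-flow estimate with an attached uncertainty may be sketched as follows. This is only an illustration: it assumes that the ± figures quoted above are binomial standard errors, which the text does not state, and the counts used are hypothetical values chosen for the example, not data from the study. The function name is likewise an assumption.

```python
import math


def haplogroup_frequency(carriers: int, sample_size: int) -> tuple[float, float]:
    """Return (frequency, binomial standard error) for a haplogroup count.

    Under the assumption of negligible back flow, the frequency of a
    haplogroup that is near-fixed in the source region can be read
    directly as the proportion of the pool derived from that region.
    """
    if sample_size <= 0 or not 0 <= carriers <= sample_size:
        raise ValueError("need 0 <= carriers <= sample_size and sample_size > 0")
    p = carriers / sample_size
    se = math.sqrt(p * (1.0 - p) / sample_size)
    return p, se


# Hypothetical counts, for illustration only (not taken from the paper).
freq, se = haplogroup_frequency(carriers=100, sample_size=400)
print(f"{100 * freq:.1f}% ± {100 * se:.1f}%")
```

The same calculation applies unchanged to the Y-chromosome lineages, so the female- and male-mediated estimates can be compared on the same footing, provided the sample sizes are known.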

Within NW Africa, L sequences are most frequent in Mauritanians and Saharawi, whereas their frequency is lowest in northern populations. Alu insertion polymorphism analysis in NW Africa (Comas et al. 2000) has also shown that gene flow from sub-Saharan Africa in the southern part of this geographical area was more pronounced. A similar genetic gradient was also observed in NE Africa along the Nile valley from analysing Egyptian and Nubian mtDNA sequences (Krings et al. 1999), where south-north migration (and vice versa) could be facilitated by the Nile.

Sequence frequency and diversity, and nucleotide diversity, point to NW Africa as the cradle of U6, with an estimated age of 47,000 ± 18,000 years. Such an ancient age contrasts with the limited spread of U6, which is found in N Africa, the Canaries and Iberia, and at very low frequencies in Italy, the Middle East, and the Sahel. This could be explained because, with the exception of the Moslem invasions of Iberia and Sicily, no large population expansion has been known to originate in NW Africa, and the gene tree structure for U6 does not seem compatible with a strong population expansion. U6 represents, thus, a local background in NW Africa. Its relatively low frequency (10% overall, although ranging from absence in Algeria to 28.2% in the Mozabites) is in stark contrast with the high frequency of Y-chromosome haplogroup E3b2* (64%; Bosch et al. 2001), which may also have originated (or expanded to such high frequency) locally in NW Africa. This discrepancy may be the result of ancient, random, locus-specific drift, and/or of a male-biased bottleneck or migration. A locus-specific effect may be evidenced by the fact that AMOVA between Iberian and NW African populations is much higher for Y chromosome haplogroups than for multiple autosomal Alu insertion polymorphisms or mtDNA. Since men contribute their autosomes as well, the fact that population differentiation as demonstrated by autosomal loci is much closer to that for mtDNA than to that for the Y chromosome may be taken as evidence for ancient, random, locus-specific drift affecting the Y chromosome.

NW African populations are relatively heterogeneous in their mtDNA sequence pools. The eastern populations (Algeria and Tunisia) may have received more gene flow from the east, as evidenced by the frequencies of M1. This haplogroup originated in East Africa (Quintana-Murci et al. 1999), where it reaches a frequency of 20% in Ethiopians (Passarino et al. 1998), and declines north-westwards (Nubians 10% and Egyptians 8%; Krings et al. 1999), whereas its frequency in the Middle East is lower (3% in Jordanians from Amman, Richards et al. 2000; 2% in Israeli Palestinians, Richards et al. 2000; 2% in Israeli Druze, Macaulay et al. 1999).

The major outliers within NW Africa are the Mozabites, a well-known isolated Berber group in Algeria, in whom drift may have altered haplogroup frequencies.

SW European mtDNA Landscape

The mtDNA homogeneity observed in Europe (Simoni et al. 2000a and 2000b; Helgason et al. 2000, see also Richards et al. 2002) is also seen in the present analysis of the West Mediterranean samples, and contrasts with the heterogeneity of NW African populations. All the European samples present the same set of haplotypes with similar frequencies, short genetic distances to each other, and no clear genetic structure, to the point that the populations from Iberia and those from Italy do not each form a neat group. It should be noted that this homogeneity is seen at the current level of phylogenetic resolution, and that a more fine-grained structure may emerge from the analysis of complete mtDNA sequences (Richards et al. 2002).

The most outstanding feature in the west Mediterranean genetic landscape is the outlier position of Sardinians and Basques shown by classical genetic markers (Cavalli-Sforza et al. 1994; Calafell & Bertranpetit 1994; Cappello et al. 1996) and Y-chromosome polymorphisms (Caglià et al. 1997; Scozzari et al. 2001; Bosch et al. 2001), although this is less pronounced in the Basques. Nevertheless, mtDNA data reveal no differences between these two populations and the rest of European populations. This has also been shown in Basques by analysis of 11 Alu insertion polymorphisms in west Mediterranean populations (Comas et al. 2000).

Genetic Exchange Through the Mediterranean

Each of the subregions analysed (NW Africa and SW Europe) shows sequences that originated on the opposite shore of the Mediterranean. This is particularly clear in the case of U6 and L in SW Europe. L sequences are found at frequencies of 3% in Iberia and 2.4% in Italy. Given the relatively high frequencies of L sequences in NW Africa, it is not clear whether they were contributed by the historical population movements from the south to the north of the Mediterranean (such as the Moslem invasions of the 7th-11th centuries), or whether their presence is associated with other processes not directly linked to NW Africa. Out of 23 different L sequences in Iberia, two were also found in NW Africa (as well as in sub-Saharan Africa), and 7 others were found in sub-Saharan Africa (in a dataset comprising 1,158 individuals from 20 populations; Graven et al. 1995; Pinto et al. 1996; Watson et al. 1996; Mateu et al. 1997; Rando et al. 1998; Krings et al. 1999; Pereira et al. 2001; Brehm et al. 2002) but not in NW Africa. Treating the set of L sequences in Iberia as if it were a population reveals genetic distances from some W African populations, such as the Senegalese and Yoruba, that are slightly smaller than those between L sequences in Iberia and NW Africa. Thus, it may be the case that gene flow from NW Africa is not entirely responsible for the presence of L sequences in Iberia.

This may be even clearer in Italy, where the frequency of U6 is much lower than in Iberia (one out of 411 individuals), and where none of the eight L sequences has been found in NW Africa. Three Italian L sequences have been described throughout Africa, and the remaining five are not found in >1,000 sub-Saharan individuals. Thus, the presence of L sequences cannot be attributed to migration from NW Africa, and may instead represent gene flow from other sources, such as the Neolithic expansion or the Roman slave trade.

In contrast to mtDNA, no sub-Saharan Y chromosomal lineages were detected in Iberia (Bosch et al. 2001), or in Italy (Rosser et al. 2000), although sample sizes in these studies (97 and 99 chromosomes respectively) may not be sufficient to rule out their presence at low frequencies.

As hinted above, the presence of haplogroup U6 in Iberia may signal gene flow from NW Africa, and that of subhaplogroup U6b1 may signal recent gene flow from the Canary Islands. Haplogroup U6 is present at frequencies ranging from 0 to 7% in the various Iberian populations, with an average of 1.8%. Given that the frequency of U6 in NW Africa is 10%, the mtDNA contribution of NW Africa to Iberia can be estimated at 18%, with a 95% confidence interval of 8%-26% (estimated by sampling with replacement 10,000 times in populations having the same sample sizes and U6 frequencies as Iberia and NW Africa). This is larger than the contribution estimated with Y-chromosomal lineages (7%, 95% confidence interval 1%-14%, Bosch et al. 2001). However, it should be noted that the variance due to genetic drift is not included in the estimates, and this may have had a larger effect on U6, which has a much lower frequency in NW Africa than its Y-chromosome counterpart, E3b2*. In the same way, we can estimate the Canarian female contribution to the Iberian Peninsula: the subhaplogroup U6b1 is present at a frequency of 13% in the Canary Islands, and at a frequency of 0.2% in the Iberian Peninsula. Thus, the mtDNA lineages of the Canary Islands contributed 1.5%, with a 95% confidence interval of 0-4.7%, to the genetic pool of Iberia. The presence of lineages belonging to the U6b1 haplogroup in the Iberian Peninsula suggests recent gene flow from the Canary Islands, due to recent migration or to the enslavement and deportation of the native Canarians (also called Guanches) at the time of conquest by the kingdom of Castile (15th century).
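The arithmetic behind these contribution estimates can be sketched as follows: the contribution is taken as the ratio of the marker frequency in the recipient population to its frequency in the source population, and the confidence interval is approximated by resampling populations with the same frequencies and sample sizes. The frequencies are those quoted above; the sample sizes, the function names and the reduced number of replicates are assumptions made for this illustration only, so the interval printed is not expected to reproduce the published one.

```python
import random


def admixture_fraction(freq_recipient: float, freq_source: float) -> float:
    """Contribution of the source pool, assuming the marker is absent elsewhere."""
    if freq_source <= 0:
        raise ValueError("source frequency must be positive")
    return freq_recipient / freq_source


def simulate_count(rng: random.Random, freq: float, sample_size: int) -> int:
    """Draw the number of carriers in a sample of the given size."""
    return sum(rng.random() < freq for _ in range(sample_size))


def bootstrap_interval(freq_recipient, n_recipient, freq_source, n_source,
                       replicates=10_000, seed=1):
    """Approximate 95% interval for the admixture fraction by resampling."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(replicates):
        k_recipient = simulate_count(rng, freq_recipient, n_recipient)
        k_source = simulate_count(rng, freq_source, n_source)
        if k_source == 0:
            continue  # ratio undefined when the marker is not resampled in the source
        estimates.append(
            admixture_fraction(k_recipient / n_recipient, k_source / n_source)
        )
    if not estimates:
        raise ValueError("no usable replicates")
    estimates.sort()
    lower = estimates[int(0.025 * len(estimates))]
    upper = estimates[min(len(estimates) - 1, int(0.975 * len(estimates)))]
    return lower, upper


# Point estimates from the frequencies quoted in the text.
m_nw_africa = admixture_fraction(0.018, 0.10)  # U6: Iberia vs NW Africa
m_canaries = admixture_fraction(0.002, 0.13)   # U6b1: Iberia vs Canary Islands
print(f"NW Africa -> Iberia: {m_nw_africa:.1%}")
print(f"Canaries  -> Iberia: {m_canaries:.1%}")

# Hypothetical sample sizes (not given in this section of the text).
HYPOTHETICAL_N_IBERIA = 500
HYPOTHETICAL_N_NW_AFRICA = 300
ci_low, ci_high = bootstrap_interval(
    0.018, HYPOTHETICAL_N_IBERIA, 0.10, HYPOTHETICAL_N_NW_AFRICA, replicates=2_000
)
print(f"Approximate 95% interval: {ci_low:.1%} to {ci_high:.1%}")
```

As noted in the text, an interval obtained in this way reflects sampling error only; the variance due to genetic drift is not included.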

With the present data, and in conjunction with other loci, we have glimpsed the palimpsest history of the Western Mediterranean; in that history, the geographical barriers imposed by the Sahara Desert and the Mediterranean Sea might not have been strong enough to prevent a certain degree of gene flow among already differentiated populations, as they were not barriers to the flow of cultures, languages, and religions.

