The origin of the Finnic peoples


Update: Please see related post.

One of the very first things I wrote about in relation to historical population genetics was the origins of the Finnic peoples. The reasons are twofold:

– first, the Finns and Estonians speak languages that are rather peculiar in a Europe dominated by Indo-European tongues (I suspect one reason that Tolkien based Quenya, the high elvish language, on Finnish is that it is so otherworldly to the Germanic ear. The Sindarin language, which was the common tongue of elves in Middle Earth, was based on Welsh). Rather, the distribution of the Uralic languages extends to the east, as far as Siberia. Even the closest affinities to Finnish and Estonian extend eastward, as there are Karelians who live deep in northwest Russia.

– second, peculiarities in the genetics of the Finns have been notable ever since they were first documented in the 20th century.

Some of the distinctiveness of the Finns clearly has to do with the demographic isolation of the recent past, and the range expansion into the north and east. I will ignore this aspect of recent drift, and focus on their deep history and phylogenetic relationships.

New molecular genetic techniques in the 1980s and 1990s, which enabled the genotyping of Y and mtDNA lineages, immediately showed that the paternal heritage of the Finns is very distinctive in comparison to that of their neighbors, and erstwhile hegemons, the Scandinavians. While Swedes tend to be haplogroup I (indigenous to Western Europe dating to the late Pleistocene) or one of the two R1 lineages (intrusive from the Eurasian steppe during the Bronze Age), Finns tend to be haplogroup N3, with a substantial minority of I. While 63 percent of Finns are N3, only 3 percent of Swedes are. Given the migration of Finns to Sweden, as well as the prevalence of Saami all across Northern Sweden until the early modern period, Swedish N3 may be due to gene flow in the last thousand years. The two R1 lineages are ~10% of the Finnish paternal gene pool, and they're strongly skewed toward R1a, while the ~40% of Swedish lineages that are R1 are balanced between the two.

In contrast, the mtDNA profiles of Finns are very similar to those of their neighbors. As in Sweden, the dominant haplogroup is a branch of H; its reduced fraction in Finland is accounted for by the fact that Finns have a higher percentage of U5, which has been associated with European hunter-gatherers. The various haplogroups (e.g., T) associated with Early European Farmers are at somewhat lower frequency in Finland than in Sweden.

A simple explanation then presents itself to us: the Finns have been subjected to male mediated admixture into a “conventional” European substrate. But there has long been controversy as to whether the Finnish N3 haplogroup was indigenous to Europe, or whether its presence in Northeast Europe was due to migration. If it was indigenous then the admixture model does not make as much sense. But as with many things, we've moved very far from where we were when I first began to look at this issue in 2002.

If you read “Human Y Chromosome Haplogroup N: A Non-trivial Time-Resolved Phylogeography that Cuts across Language Families,” the likelihood that the Y chromosomal structure of Finland is old seems low. First, Finnish N3 lineages are very young and underwent rapid expansion beginning 4 to 6 thousand years ago (this is evident in their whole genome variation pattern). Second, the greatest diversity of N seems to be in Western Siberia. Third, N exists in higher frequencies in parts of Siberia than even in Finland. Fourth, the range of N extends all the way to the Pacific Ocean. It is not implausible that it expanded from one rim of Eurasia to the other, but the most likely scenario is that it came from somewhere in the middle.

Also, it is likely that there has been admixture into Finns from an East Eurasian population. To give one example, a derived SNP at EDAR is at very high frequency in Northeast Asians. The ancestral variant is dominant outside of East Asia and the New World. Among modern Europeans the derived variant of EDAR is generally not present in indigenous populations. A quick check in the 1000 Genomes data shows that it's at ~6% in Finns (in contrast, the ancestral variant of SLC24A5 is present at frequencies of ~1%; this could be random, but I suspect in situ selection….). You can see that the derived variant is absent in a rather large sampling of other Europeans.
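
To put the ~6% figure in more concrete terms, here is a minimal sketch of what that allele frequency would imply for genotype frequencies. It assumes Hardy-Weinberg proportions (random mating, no substructure), which is my simplification and not something the data above establish; given the known structure within Finland, treat the output as a rough guide only. The function name is made up for illustration.

```python
def hardy_weinberg(derived_freq):
    """Expected genotype frequencies for a biallelic SNP.

    ASSUMPTION: Hardy-Weinberg equilibrium, which a structured
    population like the Finns may well violate.
    """
    q = derived_freq          # derived allele frequency
    p = 1.0 - q               # ancestral allele frequency
    return {
        "ancestral_hom": p * p,   # two ancestral copies
        "het": 2 * p * q,         # one copy of each
        "derived_hom": q * q,     # two derived copies
    }


# ~6% is the derived EDAR frequency in Finns quoted above.
expected = hardy_weinberg(0.06)

# Fraction of individuals expected to carry at least one derived copy.
carriers = expected["het"] + expected["derived_hom"]

print(f"heterozygotes:      {expected['het']:.4f}")
print(f"derived homozygotes: {expected['derived_hom']:.4f}")
print(f"any derived copy:    {carriers:.4f}")
```

Under that assumption a bit more than one Finn in ten would carry at least one derived copy, almost all of them as heterozygotes.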

Running ADMIXTURE unsupervised, it's immediately obvious that Finnic peoples have a minority component of East Eurasian admixture. This dark blue element is absent in most of the Swedes. Not surprisingly, the Russians exhibit structure depending on where you sample. Some Russian populations were clearly Slavicized relatively recently, and exhibit a genetic profile rather like that of Finnic peoples (these northern Russian regions also have high frequencies of haplogroup N, which is much rarer in the south or among Ukrainians).
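
For readers who want to go from a bar plot to numbers, here is a rough sketch of how one might average a single ADMIXTURE component by population. ADMIXTURE's .Q output has one row per individual and one column per ancestral component; everything else here is my assumption for illustration: the function name, the population labels, the choice of column 1 as the "East Eurasian" component (column order is arbitrary in a real run), and above all the toy values, which are invented and are not my actual results.

```python
from collections import defaultdict


def mean_component_by_population(q_rows, labels, component):
    """Average one ancestry component over the individuals of each population.

    q_rows:    per-individual ancestry fractions (rows of a .Q-style matrix)
    labels:    population label for each row, in the same order
    component: column index of the component of interest
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row, pop in zip(q_rows, labels):
        totals[pop] += row[component]
        counts[pop] += 1
    return {pop: totals[pop] / counts[pop] for pop in totals}


# TOY DATA, invented purely to show the mechanics (K=2).
q_rows = [
    [0.92, 0.08],
    [0.94, 0.06],
    [1.00, 0.00],
    [0.99, 0.01],
    [0.88, 0.12],
]
labels = ["Finn", "Finn", "Swede", "Swede", "Karelian"]

means = mean_component_by_population(q_rows, labels, component=1)
for pop, value in sorted(means.items()):
    print(f"{pop}: {value:.3f}")
```

With real output you would read the rows from the .Q file and take the labels from the matching sample file, then check which column corresponds to the component you care about before averaging.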

There’s a cline that runs east to west in relation to this component. The Finns’ neighbors immediately to the east, Karelians and Veps, have a higher fraction than the Finns proper. Additionally, some Finns in the data seem to lack it totally. One might speculate that these are people of Swedish origin who eventually assimilated to the Finnish identity. This is not impossible. In the 19th century Finnish nationalism was sparked in large part by middle class activists, many of whom were Swedish ethno-linguistically due to the connections between class and language at that time. But these individuals may instead be evidence of older structure in Finland. More on that later.

I also ran some Treemix on a subset of the data. You see there is gene flow coming into the Finns from a Siberian group. I used Nenets (a group of Samoyeds) and Yakut because the former have more linguistically in common with the Finns, while the latter are used by companies like 23andMe (Yakuts are the most northeasterly Turkic people). Strangely the Karelians and Veps get gene flow from Nenets, while the Finns get it from Yakuts (I pruned with PCA and ADMIXTURE to remove individuals with recent European ancestry).

But the model of a single pulse admixture is probably wrong anyhow. Rather, the spread of Finnic hunters and gatherers may have been gradual, and/or occurred in several pulses. On the fringe of Northern Eurasia local extinctions were probably common. The landscape of Northern Eurasia, from the Baltic to Siberia, may long have been rather dynamic, with interactions between Uralic, Indo-European and Altaic peoples.

At this point I am at a loss. The archaeology of Finland is not something I know well, and the academic literature is hard for me to track down. Some scholars believe that the Comb Ceramic Culture plays a major role in the ethnogenesis of the people we call Finns. During the Bronze Age the Corded Ware zone spread into southern Finland, bringing agriculture. The fusion between the Comb Ceramic and Corded Ware led up to the societies which are first mentioned by Classical authors.

Finland was always liminal to early agriculture, and the Corded Ware Indo-Europeans may eventually have given way to the forest Finns as the climate turned more difficult. The predominance of N3 haplogroups may be a function of the nature of patriarchal societies, where certain lineages maintain powerful long term advantages.

7 thoughts on “The origin of the Finnic peoples”

  1. On the linguistic side, the current idea is that Finnic languages in Karelia and Finland are Iron Age migrants from the southern side of Gulf of Finland, and Saamis inhabited these northern regions before that. Comb Ceramic or related hunter-gatherer cultures are now seen as too old.

    This could be reconciled with genetics rather easily: the eastern ancestry component in the Finnic groups that inhabit former Saami territory can be attributed to a Saami substrate using rare alleles, or even just ADMIXTURE or PCA, while that of groups in the Volga region is of another origin.

  2. In the supplement to the big Busby paper, Finns are modeled as 88% German, 12% North Siberian with an average mixture date of 469 A.D. Many of the admixture pairings this program comes up with are crazy, but the dating seems reliable. Norwegians have a tiny amount of the same mixture and it’s dated to the similar year of 351. Lithuanians are modeled as just 1% Siberian with a date of 155 A.D. We now know that Estonians living in the 1st millennium B.C. did not carry Y haplogroup N, so this all looks sensible to me.

  3. @Halvorson
    Are there any 2000-3000 years old samples sequenced from Estonia? The only ancient Baltic DNA from that period I’m aware of is from Kivutkalns, southern side of the Gulf of Riga.

  4. You’re right, I’m getting senile. The 1st millennium samples from Mittnik that lacked N were from Latvia and Lithuania.

  5. All Baltic samples from the Mittnik study came from Narva and Corded Ware sites. It is still possible N was present in the Baltic earlier, in the Comb Ceramic culture; however, this is just speculation.

    We have two Comb Ceramic Culture samples: one is female, the other is a male who carried a now very rare subclade of R1a. Both were autosomally EHG.

    Another thing to consider is that N1c subclades in Lithuania and Latvia are younger and less diverse than Estonian branches.

  6. I think your idea that N3 did not spread from one extremity to another makes sense. This is also the case for R1a. It must have originated in the middle zone between its two current extreme poles of Eastern Europe and South Asia, which puts it in the Asian Steppe, possibly the same region where the proto Finns and PIEs first came into contact with each other. Using DNA samples exclusively without considering archeological evidence is like reading tea leaves.
