92% of Indians are not sons of the soil but immigrants in India

What you see is that Europeans are all equally related to Indians, but Indians exhibit a gradient of relationship to Europeans. That is, there is no European group which in particular resembles Indians via the connection with ANI; the distance between all European groups and ANI seems roughly equal. The Indians vary in their relationship to Europeans because they vary in their proportion of ANI.

In the table above there is a reference to the proportion of ANI and ASI in each Indian group. One question you might ask: how do you estimate the proportions of ancestry from groups which you don’t have any information about because they no longer exist? Europeans and the Onge can serve as proxies for the ANI and ASI respectively, but how far does this get you? Well, the methods they used to determine ancestral proportions (they have three) can also be applied to populations whose ancestral groups still exist. So here is a figure which shows how their methods compare when you look at a population where we know something concrete about the ancestral populations because those ancestral populations are still extant, African Americans:


I also believe that their calculations are roughly correct because they pass the smell test. It isn’t as if this is the first study of the genetics of Indians. Though the assumptions of Structure-based analyses are somewhat different, you can discern the same rank orders.

Moving back to the nature of population structure within India, as opposed to how Indians relate to non-Indians, one of the results which pops up is that South Asian groups seem to have very high Fst values relative to European ones when compared within regions or between neighbors. Remember that Fst is a rough measure of the proportion of genetic variation which occurs between groups. The famous maxim that “85% of variance is within races, and 15% between races,” is Fst based. The Fst in that case is 0.15. Corrected for region & caste, they find that South Asian groups seem to have Fst values on the order of 3-4 times higher than equivalent European groups. This isn’t too surprising; in The History and Geography of Human Genes L. L. Cavalli-Sforza observes that Europeans are particularly homogeneous. Before the spate of 650 K SNP papers it was hard to find good stuff on the phylogeography of European populations because the techniques didn’t have the power to differentiate them. On the other hand, anthropologists have long thought that India was riddled with differentiation. After all, there’s the caste system. Indians are certainly physically diverse. Additionally, there is a line of thinking that India is the secondary Africa, insofar as most Eurasian and Australasian lineages go back to India. Like Africa, India may hold a great deal of diversity among its many populations because they’re old, the oldest in Eurasia and Australia (in concert with endogamy of course). The authors, though, have another model:

We propose that the high FST among Indian groups could be explained if many groups were founded by a few individuals, followed by limited gene flow. This hypothesis predicts that within groups, pairs of individuals will tend to have substantial stretches of the genome in which they share at least one allele at each SNP. We find signals of excess allele sharing in many groups.
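To make the Fst discussion concrete, here is a minimal Python sketch. It is not the paper's method. It computes Fst at a single biallelic SNP as (H_T - H_S) / H_T, and it uses the textbook pure-drift expectation 1 - (1 - 1/(2N))^t to show why groups founded by a few individuals and then kept isolated would be expected to show a high Fst. The function names, the equal weighting of groups, the constant group size, the absence of gene flow, and the example numbers are all assumptions made for illustration.

```python
def heterozygosity(p):
    """Expected heterozygosity 2p(1-p) at a biallelic SNP with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst(freqs):
    """Fst = (H_T - H_S) / H_T at one SNP, weighting every group equally.

    freqs: list of allele frequencies, one per group (hypothetical input).
    """
    mean_p = sum(freqs) / len(freqs)
    h_total = heterozygosity(mean_p)  # heterozygosity of the pooled population
    h_within = sum(heterozygosity(p) for p in freqs) / len(freqs)  # mean within groups
    if h_total == 0.0:
        return 0.0  # SNP is fixed everywhere, so there is no variation to partition
    return (h_total - h_within) / h_total


def drift_fst(group_size, generations):
    """Expected Fst among isolated groups after pure drift.

    Assumes a constant number of breeding individuals per group and no gene flow,
    which is a deliberate simplification of a founder event followed by endogamy.
    """
    return 1.0 - (1.0 - 1.0 / (2.0 * group_size)) ** generations


if __name__ == "__main__":
    # Identical frequencies give no differentiation; diverged frequencies do.
    print("same frequencies:    ", fst([0.5, 0.5]))
    print("diverged frequencies:", fst([0.2, 0.8]))

    # Small isolated groups drift apart much faster than large ones.
    for size in (50, 500, 5000):
        print(f"group size {size}, 30 generations:", drift_fst(size, 30))
```

An Fst of 0.15 in this framing means 15% of the heterozygosity is between groups and 85% within them. Real groups change in size and exchange some migrants, so treat the drift function as a cartoon of the founder model rather than a prediction.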

They go on:

Six Indo-European- and Dravidian speaking groups have evidence of founder events dating to more than 30 generations ago…including the Vysya at more than 100 generations ago…Strong endogamy must have applied since then (average gene flow less than 1 in 30 per generation) to prevent the genetic signatures of founder events from being erased by gene flow. Some historians have argued that ‘caste’ in modern India is an ‘invention’ of colonialism in the sense that it became more rigid under colonial rule. However, our results indicate that many current distinctions among groups are ancient and that strong endogamy must have shaped marriage patterns in India for thousands of years.
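The generation counts in the quotation only translate into calendar time once you pick a generation length, which the quoted passage does not state. A rough Python sketch, assuming 25 to 30 years per generation (an assumption made here for illustration, not a figure from the paper):

```python
def generations_to_years(generations, years_per_generation):
    """Convert a count of generations to years for an assumed generation length."""
    return generations * years_per_generation


# Assumed generation lengths in years (illustrative, not taken from the paper).
ASSUMED_LENGTHS = (25, 30)

for gens in (30, 100):
    low, high = (generations_to_years(gens, length) for length in ASSUMED_LENGTHS)
    print(f"more than {gens} generations: roughly {low} to {high} years or more")
```

Under those assumptions, 30 generations works out to 750 to 900 years and 100 generations to 2,500 to 3,000 years, both of which are lower bounds because the quotation says "more than".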

This is one of the places where you get some sense of time scales. In the rest of the paper they avoid this. They note in one of the figures: “Although the model is precise about tree topology and ordering of splits, it provides no information about population size changes or the timings of events.” But the numbers above give time scales of foundings on the order of 1,000 years, with perhaps others at 3,000 years. Elsewhere they say:

Two features of the inferred history are of special interest. First, the ANI and CEU form a clade, and further analysis shows that the Adygei, a Caucasian group, are an outgroup. Many Indian and European groups speak Indo-European languages, whereas the Adygei speak a Northwest Caucasian language. It is tempting to assume that the population ancestral to ANI and CEU spoke ‘Proto-Indo-European’, which has been reconstructed as ancestral to both Sanskrit and European languages, although we cannot be certain without a date for ANI-ASI mixture.

Despite the hedge, the allusion here suggests a date pegged on the order of 4,000 years ago. We don’t know much about how the Indo-Aryans arrived in India; the earliest extant records, the Vedas (which were transmitted orally initially), seem to be set in Northwest India. The general suspicion though is that the Indus Valley Civilization was not Indo-Aryan, and there is a Dravidian speaking population in the west of Pakistan, suggesting that that language group was at one point spoken in the region. All in all the outline being faintly sketched out in this paper sounds a lot like what Indians refer to as the Aryan invasion theory, a mass movement of populations out of the Northwest replacing and subjugating the natives. ANI values on the order of 70-80% in the Northwest seem to suggest near total replacement.

I’m skeptical. Obviously the Indo-Aryans had to arrive physically, but these sorts of nomadic populations tend to quickly dominate and culturally assimilate sedentarists. In the case of the Hungarians and Turks they even imposed their language upon the natives, with only marginal genetic impact. The paper itself points to the likelihood of a complex history of periodic, and perhaps continuous, gene flow. Two ancient populations mixing is what economists would term a “stylized fact,” good enough to get some points across, but not to be confused for reality.

What about the idea of foundings and subsequent endogamy explaining the high Fst? 2,500 years ago Herodotus already reported that India was the most populous nation in the world (he did not know of China). It isn’t as if the Indo-Aryans arrived in the New World, where the natives died off so that they could enter into a major demographic expansionary phase. That being said, India’s population did grow over time as cultures pushed east with better tools (e.g., iron axes), and cut down the local forests. To really test drive this model you need more than 132 individuals from 25 populations. You need a lot of data from many individuals to get a more granular feel for the variation. Population expansions did occur in the east down to the Mughal period as land was reclaimed for agriculture. Much of eastern Bengal was settled relatively recently, within the last 500-1000 years. In some regions we do have a sense of what the demographic history was, so we should be able to predict patterns of Fst if the model of founding + endogamy is operative. Historically this may make sense for some groups, such as Brahmins, who migrated to various regions to provide specialized services and then became indigenized, but it seems unlikely as an explanation for the majority of castes and jatis. Many of the same dynamics at work in India were probably at work in the Middle East. And also in Europe, which went through a population crash and “bounce back” after the fall of the Roman Empire. They should have just stuck with a tree without the timing….

John Hawks has a related post.

Citation: Reich D, Thangaraj K, Patterson N, Price AL, Singh L. 2009. Reconstructing Indian population history. Nature 461:489-494. doi:10.1038/nature08365
