Reich + Ancient DNA Willerslev + 1000 Genomes South Indian data

There’s a lot of genetic data out there. Many of the Reich lab data are downloadable. Additionally, Martin Sikora gave me a pedigree file with a lot of the ancient genotypes in their recent paper (much appreciated since pulling genotypes out of a lot of big sequence files of varied coverage was going to take some  time and care). I merged the two together. But for whatever reason the Reich data set did not include anything from the South Indian samples from the 1000 Genomes. Since I have those, I decided to add a bunch. These are Telegu and Tamil speakers who are neither Brahmins nor scheduled castes and tribes (for those curious, the Velama map pretty well on PCA to the “South Indians” I culled from the 1000 Genomes).

You can download it here. It’s a 200 MB tarball. It’s in plink format. I did a minor allele frequency filter out 0.05, and got it down to 385,000 SNPs. Please note: these data vary greatly in quality on the individual level. A lot of the ancient samples are missing a lot of positions, so keep that in mind when you analyze them (e.g., if you run PCA some of the dimensions are pretty obviously just ancient samples missing lots of markers in a systematic manner). Finally, there are non-human outgroups in the data. For example if you run a PC analysis without subsetting PC 1 will separate Marmoset from humans, with other primates and ancient samples in spanning the gap. If you leave in ancient populations, a lot of them are going to be of much lower quality than the run-of-the-mill population.

Below are the samples by population and size. Most of the labels are from the Haak et al. data set. Obviously they’re a little idiosyncratic, but I figure you can figure it out. Please note that the .fam file has population labels in the family ID column. I added them manually where they didn’t exist (e.g., the Willerslev data and the 1000 Genomes did not have them, so I added them where appropriate).

Group N
S_India 136
Yoruba 70
Turkish 56
Spanish 53
Druze 39
Palestinia 38
Han 33
Basque 29
Japanese 29
Sardinian 27
BedouinA 25
French 25
Ulchi 25
Burusho 23
Chukchi 23
Eskimo 22
Russian 22
Tubalar 22
Brahui 21
Mozabite 21
Balochi 20
Biaka 20
Greek 20
Hungarian 20
Makrani 20
Yakut 20
BedouinB 19
Pathan 19
Yukagir 19
Egyptian 18
Kalash 18
Mayan 18
Sindhi 18
Adygei 17
Mandenka 17
Bell_Beaker 15
Unetice 15
Yamnaya 15
Hazara 14
Papuan 14
Pima 14
HungaryGam 13
Orcadian 13
Somali 13
AA 12
Bergamo 12
Icelandic 12
Karitiana 12
LBK_EN 12
Masai 12
Corded_Ware 11
Khomani 11
Nganasan 11
Norwegian 11
Sicilian 11
SwedenSkog 11
Ami 10
Armenian 10
Balkar 10
Belarusian 10
Bougainvil 10
Bulgarian 10
Chuvash 10
Croatian 10
Czech 10
Dai 10
English 10
Estonian 10
Even 10
Georgian 10
Han_NChina 10
Kalmyk 10
Kusunda 10
Lithuanian 10
Mbuti 10
Miao 10
Mixe 10
Mixtec 10
Mordovian 10
North_Osse 10
Selkup 10
She 10
Thai 10
Tu 10
Tujia 10
Tuvinian 10
Uygur 10
Uzbek 10
Yi 10
Zapotec 10
Abkhasian 9
Atayal 9
Chechen 9
Daur 9
Iranian_Je 9
Jordanian 9
Koryak 9
Kyrgyz 9
Lezgin 9
Libyan_Jew 9
Naxi 9
Nogai 9
Oroqen 9
Ukrainian 9
BantuSA 8
Cambodian 8
Cypriot 8
Esan 8
Hezhen 8
Iranian 8
Kinh 8
Kumyk 8
Lahu 8
Lebanese 8
Luhya 8
Luo 8
Maltese 8
Mansi 8
Mende 8
Punjabi 8
Saudi 8
Surui 8
Syrian 8
Tajik_Pomi 8
Tunisian 8
Tuscan 8
Yemenite_J 8
Aleut 7
Algerian 7
Altaian 7
Ashkenazi 7
Bengali 7
Bolivian 7
Ethiopian 7
Finnish 7
French_Sou 7
Georgian_J 7
Karasuk 7
Motala_HG 7
Tunisian_J 7
Turkmen 7
Xibo 7
Albanian 6
BantuKenya 6
Gambian 6
Hungary_Vatya 6
Iraqi_Jew 6
Itelmen 6
Korean 6
Mongola 6
Moroccan_J 6
Saharawi 6
Yemen 6
Afanasievo 5
Armenia_LBA 5
Cochin_Jew 5
GujaratiA 5
GujaratiB 5
GujaratiC 5
GujaratiD 5
Hadza 5
Ju_hoan_No 5
MTurkish_J 5
Quechua 5
Spanish_No 5
Andronovo 4
Kikuyu 4
Piapoco 4
Russia_Iron_Age 4
Scottish 4
Sintashta 4
Spain_EN 4
Spain_MN 4
Tlingit 4
Armenia_MBA 3
Australian 3
Baalberge 3
Benzigerod 3
Datog 3
Dolgan 3
FTurkish_J 3
Hungary_Maros 3
Italy_Remedello 3
Mezhovskaya 3
Sweden_Nordic_BA 3
Athabascan 2
Botocudo 2
Canary_Isl 2
Denmark_Nordic_BA 2
Denmark_Nordic_LN 2
Greenland 2
MiddleDors 2
Nivkh 2
Okunevo 2
Russia_LBA 2
Sweden_Nordic_LN 2
AG2 1
Alberstedt 1
Aleutian 1
Altai 1
Ancient_De 1
Ancient_Ne 1
Birnirk 1
Chimp 1
Clovis 1
Denisovan 1
Denmark_Nordic_LBA 1
Denmark_Nordic_MN_B 1
EBA 1
Esperstedt 1
Germany_BA 1
Gorilla 1
Halberstad 1
hg19ref 1
Hungary_MBA 1
Iceman 1
Italian_So 1
Karelia_HG 1
Karsdorf_L 1
Kazakhstan_Sintashta 1
Kostenki14 1
LaBrana1 1
LateDorset 1
LBKT_EN 1
Lithuania_LBA 1
Loschbour 1
MA1 1
Macaque 1
Marmoset 1
Mezmaiskay 1
Montenegro_Iron_Age 1
Montenegro_LBA 1
Orang 1
RR 1
Saami_WGA 1
Samara_HG 1
Saqqaq 1
Spain_EN_r 1
Starcevo_E 1
Stuttgart 1
Sweden_Battle_Axe 1
Sweden_Battle_AxeNordic_LN 1
Sweden_Iron_Age 1
Thule 1
Ust_Ishim 1
Vindija 1
0
Posted in Uncategorized

Comments are closed.