
Reich + Ancient DNA Willerslev + 1000 Genomes South Indian data

There’s a lot of genetic data out there. Many of the Reich lab data are downloadable. Additionally, Martin Sikora gave me a pedigree file with a lot of the ancient genotypes from their recent paper (much appreciated, since pulling genotypes out of a lot of big sequence files of varied coverage was going to take some time and care). I merged the two together. But for whatever reason the Reich data set did not include any of the South Indian samples from the 1000 Genomes. Since I have those, I decided to add a bunch. These are Telugu and Tamil speakers who are neither Brahmins nor scheduled castes and tribes (for those curious, the Velama map pretty well on PCA to the “South Indians” I culled from the 1000 Genomes).

You can download it here. It’s a 200 MB tarball in plink format. I applied a minor allele frequency filter of 0.05, which got it down to 385,000 SNPs. Please note: these data vary greatly in quality on the individual level. A lot of the ancient samples are missing a lot of positions, so keep that in mind when you analyze them (e.g., if you run PCA, some of the dimensions are pretty obviously just ancient samples missing lots of markers in a systematic manner). Finally, there are non-human outgroups in the data. For example, if you run a PC analysis without subsetting, PC 1 will separate Marmoset from humans, with other primates and ancient samples spanning the gap. If you leave in ancient populations, a lot of them are going to be of much lower quality than the run-of-the-mill population.
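Since the primate outgroups dominate PC 1, one way to subset before running PCA is to build an exclusion list from the .fam file. What follows is a minimal sketch, not the procedure used to build the data set. It assumes the standard whitespace-delimited .fam layout with the population label in the first (family ID) column, and it uses the primate labels that appear in the population table (Chimp, Gorilla, Macaque, Marmoset, Orang). The function names, the toy individual IDs, and the output file name are all hypothetical.

```python
# Sketch: collect the non-human outgroup samples so they can be dropped before PCA.
# Assumption: population labels are stored in the family ID (first) column of the .fam file.

OUTGROUPS = {"Chimp", "Gorilla", "Macaque", "Marmoset", "Orang"}  # labels from the table


def samples_to_remove(fam_lines, labels=OUTGROUPS):
    """Return (FID, IID) pairs whose family ID is one of the given labels."""
    pairs = []
    for line in fam_lines:
        fields = line.split()
        if len(fields) < 2:  # skip blank or malformed lines
            continue
        fid, iid = fields[0], fields[1]
        if fid in labels:
            pairs.append((fid, iid))
    return pairs


def write_remove_file(pairs, path):
    """Write one 'FID IID' pair per line, the two-column layout plink's --remove reads."""
    with open(path, "w") as out:
        for fid, iid in pairs:
            out.write(f"{fid} {iid}\n")


if __name__ == "__main__":
    # Toy .fam lines with made-up individual IDs, standing in for the real file.
    demo = [
        "Marmoset marm_demo 0 0 0 -9",
        "Yoruba yri_demo 0 0 1 -9",
        "Chimp chimp_demo 0 0 0 -9",
    ]
    print(samples_to_remove(demo))
```

With the real data you would pass an open .fam file instead of the toy list, write the pairs out, and then run something like `plink --bfile merged --remove outgroups.txt --pca --out merged_pca`, where `merged` and `outgroups.txt` are placeholder names. The same function works for any other labels you want to leave out, such as the lower-quality ancient samples.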

Below are the samples by population and size. Most of the labels are from the Haak et al. data set. Obviously they’re a little idiosyncratic, but I figure you can figure it out. Please note that the .fam file has population labels in the family ID column. I added them manually where they didn’t exist (e.g., the Willerslev data and the 1000 Genomes did not have them, so I added them where appropriate).

Group N
S_India 136
Yoruba 70
Turkish 56
Spanish 53
Druze 39
Palestinia 38
Han 33
Basque 29
Japanese 29
Sardinian 27
BedouinA 25
French 25
Ulchi 25
Burusho 23
Chukchi 23
Eskimo 22
Russian 22
Tubalar 22
Brahui 21
Mozabite 21
Balochi 20
Biaka 20
Greek 20
Hungarian 20
Makrani 20
Yakut 20
BedouinB 19
Pathan 19
Yukagir 19
Egyptian 18
Kalash 18
Mayan 18
Sindhi 18
Adygei 17
Mandenka 17
Bell_Beaker 15
Unetice 15
Yamnaya 15
Hazara 14
Papuan 14
Pima 14
HungaryGam 13
Orcadian 13
Somali 13
AA 12
Bergamo 12
Icelandic 12
Karitiana 12
LBK_EN 12
Masai 12
Corded_Ware 11
Khomani 11
Nganasan 11
Norwegian 11
Sicilian 11
SwedenSkog 11
Ami 10
Armenian 10
Balkar 10
Belarusian 10
Bougainvil 10
Bulgarian 10
Chuvash 10
Croatian 10
Czech 10
Dai 10
English 10
Estonian 10
Even 10
Georgian 10
Han_NChina 10
Kalmyk 10
Kusunda 10
Lithuanian 10
Mbuti 10
Miao 10
Mixe 10
Mixtec 10
Mordovian 10
North_Osse 10
Selkup 10
She 10
Thai 10
Tu 10
Tujia 10
Tuvinian 10
Uygur 10
Uzbek 10
Yi 10
Zapotec 10
Abkhasian 9
Atayal 9
Chechen 9
Daur 9
Iranian_Je 9
Jordanian 9
Koryak 9
Kyrgyz 9
Lezgin 9
Libyan_Jew 9
Naxi 9
Nogai 9
Oroqen 9
Ukrainian 9
BantuSA 8
Cambodian 8
Cypriot 8
Esan 8
Hezhen 8
Iranian 8
Kinh 8
Kumyk 8
Lahu 8
Lebanese 8
Luhya 8
Luo 8
Maltese 8
Mansi 8
Mende 8
Punjabi 8
Saudi 8
Surui 8
Syrian 8
Tajik_Pomi 8
Tunisian 8
Tuscan 8
Yemenite_J 8
Aleut 7
Algerian 7
Altaian 7
Ashkenazi 7
Bengali 7
Bolivian 7
Ethiopian 7
Finnish 7
French_Sou 7
Georgian_J 7
Karasuk 7
Motala_HG 7
Tunisian_J 7
Turkmen 7
Xibo 7
Albanian 6
BantuKenya 6
Gambian 6
Hungary_Vatya 6
Iraqi_Jew 6
Itelmen 6
Korean 6
Mongola 6
Moroccan_J 6
Saharawi 6
Yemen 6
Afanasievo 5
Armenia_LBA 5
Cochin_Jew 5
GujaratiA 5
GujaratiB 5
GujaratiC 5
GujaratiD 5
Hadza 5
Ju_hoan_No 5
MTurkish_J 5
Quechua 5
Spanish_No 5
Andronovo 4
Kikuyu 4
Piapoco 4
Russia_Iron_Age 4
Scottish 4
Sintashta 4
Spain_EN 4
Spain_MN 4
Tlingit 4
Armenia_MBA 3
Australian 3
Baalberge 3
Benzigerod 3
Datog 3
Dolgan 3
FTurkish_J 3
Hungary_Maros 3
Italy_Remedello 3
Mezhovskaya 3
Sweden_Nordic_BA 3
Athabascan 2
Botocudo 2
Canary_Isl 2
Denmark_Nordic_BA 2
Denmark_Nordic_LN 2
Greenland 2
MiddleDors 2
Nivkh 2
Okunevo 2
Russia_LBA 2
Sweden_Nordic_LN 2
AG2 1
Alberstedt 1
Aleutian 1
Altai 1
Ancient_De 1
Ancient_Ne 1
Birnirk 1
Chimp 1
Clovis 1
Denisovan 1
Denmark_Nordic_LBA 1
Denmark_Nordic_MN_B 1
EBA 1
Esperstedt 1
Germany_BA 1
Gorilla 1
Halberstad 1
hg19ref 1
Hungary_MBA 1
Iceman 1
Italian_So 1
Karelia_HG 1
Karsdorf_L 1
Kazakhstan_Sintashta 1
Kostenki14 1
LaBrana1 1
LateDorset 1
LBKT_EN 1
Lithuania_LBA 1
Loschbour 1
MA1 1
Macaque 1
Marmoset 1
Mezmaiskay 1
Montenegro_Iron_Age 1
Montenegro_LBA 1
Orang 1
RR 1
Saami_WGA 1
Samara_HG 1
Saqqaq 1
Spain_EN_r 1
Starcevo_E 1
Stuttgart 1
Sweden_Battle_Axe 1
Sweden_Battle_AxeNordic_LN 1
Sweden_Iron_Age 1
Thule 1
Ust_Ishim 1
Vindija 1
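
Because the population label sits in the family ID column of the .fam file, a tally like the one above can be regenerated with a few lines of Python. This is a minimal sketch under that assumption, using the standard whitespace-delimited .fam layout. The function name and the toy lines are illustrative only and are not taken from the data set.

```python
from collections import Counter


def population_counts(fam_lines):
    """Count samples per population label (first column of a .fam file)."""
    counts = Counter()
    for line in fam_lines:
        fields = line.split()
        if fields:  # ignore blank lines
            counts[fields[0]] += 1
    # Largest groups first; ties are broken alphabetically, ignoring case.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0].lower()))


if __name__ == "__main__":
    # Toy .fam lines with made-up individual IDs, standing in for the real file.
    demo = [
        "Yoruba yri_demo1 0 0 1 -9",
        "S_India sind_demo1 0 0 2 -9",
        "Yoruba yri_demo2 0 0 2 -9",
    ]
    for group, n in population_counts(demo):
        print(group, n)
```

With the real data, pass an open .fam file instead of the toy list. Checking the counts this way is also a quick sanity test that the labels survived any subsetting you did.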