Geography of Genetic Variants Browser

Jonathan Novembre’s lab has released a new web 2.0 toy which you might find of interest. From an announcement on the lab’s website:

We have released a new teaching/research tool.  The Geography of Genetic Variants browser.  This is the hard work of our “programmer-at-large” Joe Marcus.  It’s also a pilot project for a grant we submitted earlier this summer to work on challenges that arise visualizing population structure in genomic-scale variation data.

This is very beta; e.g., “More features are on the way…scaled circles as alternates to pie charts, computing a bounding box for regional datasets, pdf export for publication quality figures, and search by rsID or tables of markers. Contact us with any ideas!” I’d like rsID search in particular, since rsIDs are what I know off the top of my head for variants of specific interest.

Why is this noteworthy? It seems that the group is trying to integrate disparate data sources and present the results in a comparable fashion. This is useful. The HGDP, HapMap, and 1000 Genomes browsers are great, but obviously they focused on a narrow subset of human genomic data (also, they’re from an older web era, when everything was synchronous GET/POST). The Geography of Genetic Variants Browser is also displaying results from the POPRES data set, which is kind of them, since only academic researchers who have duly inquired and obtained permissions can view it. Hopefully they won’t stop there, and the code is such that they can just integrate new data sets in a rolling fashion. And why limit it to humans? The fly people have enough geographic samples to browse pretty maps.

I found this via Pontus Skoglund, who asserts that you “get a feel for the stochasticity of human genetic variation” if you click the random option. That’s actually sort of true. It’s also evident that there’s geographic structure, and that structure starts to get consistent in a Gestalt sense if you click dozens of times. You can also see that the allele frequency difference in Africa vs. non-Africa is large, while that within Europe is modest (some sort of smoothed kernel density might be a nice complement to pinned-pie charts; there are many R packages which could handle rendering that).
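To make the kernel-smoothing idea concrete, here is a minimal sketch in Python (not R, and not anything the browser itself does as far as I know). It estimates an allele frequency at an arbitrary map point as a Gaussian-weighted average of nearby sampled populations. The function names, the 1000 km bandwidth, and the sample tuples are all hypothetical choices of mine for illustration, not real data.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def smoothed_frequency(lat, lon, samples, bandwidth_km=1000.0):
    """Gaussian-kernel weighted allele frequency at (lat, lon).

    samples: iterable of (lat, lon, allele_count, total_chromosomes).
    Each population's counts are weighted by a Gaussian kernel of its
    distance from the query point, so larger samples count for more.
    """
    weighted_alleles = 0.0
    weighted_total = 0.0
    for s_lat, s_lon, count, total in samples:
        dist = haversine_km(lat, lon, s_lat, s_lon)
        weight = math.exp(-0.5 * (dist / bandwidth_km) ** 2)
        weighted_alleles += weight * count
        weighted_total += weight * total
    # With a tiny bandwidth every weight can underflow to zero.
    if weighted_total == 0.0:
        return float("nan")
    return weighted_alleles / weighted_total


# Made-up populations: (lat, lon, allele count, chromosomes sampled).
samples = [
    (6.5, 3.4, 90, 100),    # hypothetical West African sample
    (51.5, -0.1, 20, 100),  # hypothetical sample near London
    (48.9, 2.4, 25, 100),   # hypothetical sample near Paris
]
estimate = smoothed_frequency(51.5, -0.1, samples)
```

Evaluating `smoothed_frequency` over a grid of latitude/longitude points would give the surface to draw under (or instead of) the pinned pies. The bandwidth controls how far each sample's influence reaches; a proper version would pick it more carefully and account for oceans and other barriers.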

Finally, this browser is a pilot project which might lead to a grant that is aimed at tackling visualization of population structure. That’s a big deal. Perhaps I’m wrong, but it strikes me that the 1000th PCA package that runs 10% faster than Eigensoft is probably more a CV-builder than anything else. There are lots of good analysis packages out there, though the PaintMyChromosomes project gives you an idea of where there’s still fruit to be picked. On the other hand, tools to visualize and comprehend the analysis that’s bubbling up are few and far between. Not everyone is a “command-line fiend,” and it is in the GUI that you get purchase with the public. Genomic science needs that.

Addendum: Also, there is something called the Midwest Population Genetics conference being organized now. I’m sure it’s a coincidence that they used the term genetics rather than genomics, because when looking at their program it was rather heavy on -omics. Let the battle between freshwater and saltwater genomics conferences commence!