Matrix eQTL: ultra fast eQTL analysis

I just gave a journal club presentation on this wonderful piece of software. It was very well received and I really hope more people use this. Here is a quick review.

Well, what is it? It’s a freely available R and MATLAB package written by Andrey Shabalin, available from http://www.bios.unc.edu/research/genomic_software/Matrix_eQTL/, for ultra-fast analysis of expression QTLs (eQTLs). In an eQTL analysis, one tries to find DNA variation associated with gene expression changes, which typically involves calculating the association between hundreds of thousands of SNPs and tens of thousands of gene expression profiles.

In my case, I have exon-level arrays and a 1000 Genomes imputed SNP dataset in ten different brain regions, which means about 2.4 x 10^13, or 24 trillion, tests (equivalent to running 3 million GWAS analyses). This would have taken a whole year to run using 200 computing processors, which is the capacity of our in-house server, and that would have pleased neither my system administrator nor my colleagues. That is when I serendipitously stumbled upon Matrix eQTL, and now the same task takes only about 6 hours to complete.

How? Andrey has a few extremely clever and practical tricks (read his well-written manuscript for more details):

  1. Most test statistics (e.g. the t-statistic from linear regression) can be rewritten in terms of the correlation between the gene and the SNP
  2. The correlation is unchanged when the gene or SNP is standardized to have zero mean and unit variance
  3. The correlation can be calculated using a matrix multiplication
  4. The matrix multiplication can be sped up by calculating it for manageable chunks of data (say 10,000 genes vs 10,000 SNPs) at a time
  5. This one is for R users only. The standard R installation ships with its own Basic Linear Algebra Subprograms (BLAS) library. A faster alternative is the ATLAS library. But for the fastest libraries, try the following suggestions:
    1. For Windows, you can use Revolution R. This is commercial software but freely available to academic users. See my previous post on this.
    2. For Unix/Linux, there is also a Red Hat version of Revolution R.
    3. For Unix/Linux, you can link the R installation to the Intel Math Kernel Library (MKL) or equivalent if you have a license for it.
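To make tricks 1 to 4 a bit more concrete, here is a minimal sketch in plain Python. It is my own toy illustration, not Matrix eQTL's actual code: the function names (`standardize`, `correlation_block`, `t_from_r`, `all_t_stats`) and the tiny chunk size are made up for the example. I scale each row to unit length rather than unit variance so that the dot product is the correlation directly, and I assume no covariates, no constant rows and no perfectly correlated pairs.

```python
import math

def standardize(row):
    """Center a row and scale it to unit length (sum of squares = 1)."""
    mean = sum(row) / len(row)
    centered = [v - mean for v in row]
    norm = math.sqrt(sum(v * v for v in centered))  # assumes the row is not constant
    return [v / norm for v in centered]

def correlation_block(genes, snps):
    """Correlations of every gene with every SNP: the matrix product G * S^T
    of two sets of standardized rows."""
    return [[sum(g * s for g, s in zip(gene, snp)) for snp in snps]
            for gene in genes]

def t_from_r(r, n):
    """t-statistic of a simple linear regression, from the correlation r
    and the number of samples n (assumes |r| < 1)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

def all_t_stats(genes, snps, chunk=2):
    """t-statistics for all gene-SNP pairs, computed one chunk at a time."""
    n = len(genes[0])
    G = [standardize(g) for g in genes]
    S = [standardize(s) for s in snps]
    t = [[0.0] * len(S) for _ in G]
    for i0 in range(0, len(G), chunk):
        for j0 in range(0, len(S), chunk):
            block = correlation_block(G[i0:i0 + chunk], S[j0:j0 + chunk])
            for di, row in enumerate(block):
                for dj, r in enumerate(row):
                    t[i0 + di][j0 + dj] = t_from_r(r, n)
    return t

# Toy data: 3 "genes" and 3 "SNPs" measured in 4 samples
genes = [[1.0, 2.0, 3.0, 5.0], [2.0, 1.0, 4.0, 3.0], [5.0, 3.0, 2.0, 1.0]]
snps = [[0, 1, 1, 2], [2, 1, 0, 0], [0, 0, 1, 2]]
t_stats = all_t_stats(genes, snps)
```

In a real analysis the inner sums would be a single call to an optimized matrix multiplication, which is where a fast BLAS library pays off.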

Two minor caveats which I can easily live with:

  1. The software imputes missing values with the mean before the eQTL analysis. Not a problem for me since my SNPs are imputed and there are no values missing at random in my expression data.
  2. Only p-values and t-statistics are reported (the sign of the t-statistic gives you the direction of effect). There is no beta or SE, which you have to calculate yourself for significant hits. Not a problem for me, but beta and SE are typically required for meta-analysis of GWAS.
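For what it's worth, both caveats are easy to picture in a few lines of plain Python. This is only a sketch under my own assumptions: missing values are represented as `None`, there are no covariates, and the helper names `impute_mean` and `slope_and_se` are made up for illustration (they are not functions from the package). With covariates you would refit the full model for each significant hit instead.

```python
import math

def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

def slope_and_se(expression, genotype):
    """Beta and its standard error from a simple linear regression
    of expression on genotype (no covariates)."""
    n = len(expression)
    mx = sum(genotype) / n
    my = sum(expression) / n
    sxx = sum((x - mx) ** 2 for x in genotype)
    sxy = sum((x - mx) * (y - my) for x, y in zip(genotype, expression))
    beta = sxy / sxx
    intercept = my - beta * mx
    # Residual sum of squares, with n - 2 degrees of freedom
    rss = sum((y - intercept - beta * x) ** 2
              for x, y in zip(genotype, expression))
    se = math.sqrt(rss / (n - 2) / sxx)
    return beta, se

# One significant gene-SNP pair, with one missing genotype filled in first
genotype = impute_mean([0, None, 1, 2])
expression = [1.0, 2.0, 3.0, 5.0]
beta, se = slope_and_se(expression, genotype)
t_stat = beta / se  # same sign as beta, so it carries the directionality
```

Since this only needs to be run on the hits that pass your significance threshold, the extra cost is negligible compared with the full scan.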

There are a few more advanced things the software can do: ANOVA regression, covariate correction, and handling of heteroscedastic and correlated errors (e.g. if you have pedigree/family data).

And before I finish, here are a few ways you can use this software for other purposes:

  1. Gene – gene interaction by setting SNP=gene. You don’t even have to create additional files. You can do this because the SNP data are not required to be integers or restricted to 0, 1, 2 values. Just remember to use only the off-diagonal elements (i.e. remove the interaction of a gene with itself and keep only one of the gene1-gene2 or gene2-gene1 combinations).
  2. SNP – SNP interaction. Same as above by setting gene=SNP.
  3. GWAS analysis. Why not? You simply set your “gene” to be the phenotype of interest (so you can test multiple phenotypes easily) and feed in the covariates you want to adjust for. And the answer takes a few seconds rather than minutes/hours. Do note my point above about the lack of beta or standard error if you are planning a meta-analysis of GWAS.
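The off-diagonal bookkeeping for the gene – gene (or SNP – SNP) case could look something like the sketch below. It assumes you have already collected the results into a square, symmetric table of statistics; the function name `unique_pairs` and the numbers in the toy table are invented purely for illustration.

```python
def unique_pairs(names, stats):
    """Keep each pair once (i < j): drops the diagonal (a gene with itself)
    and the mirrored gene2-gene1 duplicates."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            pairs.append((names[i], names[j], stats[i][j]))
    return pairs

# Toy symmetric table of statistics for three hypothetical genes
names = ["geneA", "geneB", "geneC"]
stats = [
    [0.0, 2.1, -0.4],
    [2.1, 0.0, 3.3],
    [-0.4, 3.3, 0.0],
]
pairs = unique_pairs(names, stats)
```

For n genes this leaves n(n-1)/2 pairs, which is also the number of tests to use when correcting for multiple testing.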

Above all, the software is easy to use, the manuscript is clearly written (first time I understood the power of matrix algebra) and the author responds quickly to queries and is very helpful. That’s a very rare combination of pros! Thank you.

