How polymorphisms in probe sequences affect cis-eQTL results

In our paper, we show that polymorphisms (SNPs and indels) in probe sequences can induce a serious technical artefact in expression QTL (eQTL) studies by reducing hybridization efficiency, especially if you are using array-based technologies. You can find the paper here:

But for all of you busy people, the main result is summarized in Figure 1 (also attached below).


Quick summary:

  • About 6.1% of the probes (25-mers) on the Affymetrix Human Exon 1.0 array contain polymorphisms, but they account for 50–90% of cis-eQTLs
  • About 11.7% of the probes (50-mers) on the Illumina HT12 array contain polymorphisms, but they account for 30–45% of cis-eQTLs
  • The binding efficiency of longer probes is less affected by polymorphisms (as expected), but more than 30% of cis-eQTLs are still false!
  • The 1000G dataset appears to be good enough to identify probes containing polymorphisms. The only way to possibly improve on this is to look at private mutations from exome-seq etc., but that may not be worth the effort.
  • Increasing p-value stringency seems to make the situation worse: cis-eQTLs due to polymorphisms are very strong, so they survive stricter thresholds. Many published cis-eQTLs may therefore suffer from this artefact, especially if the authors did not filter out probes containing polymorphisms or used an older reference panel to identify them.
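
To make the filtering step concrete, here is a minimal sketch of how probes containing known polymorphisms could be flagged before running a cis-eQTL analysis. It is not the pipeline from the paper. The function name `flag_polymorphic_probes`, the tuple layouts, and the toy coordinates are all assumptions made for illustration, and each variant is simplified to a single 1-based position (an indel would really span several bases). In practice the variant list would come from a reference panel such as 1000G, and the probe coordinates from the array annotation.

```python
from bisect import bisect_left
from collections import defaultdict


def flag_polymorphic_probes(probes, variants):
    """Return the IDs of probes whose interval contains a known variant.

    probes:   iterable of (probe_id, chrom, start, end), 1-based inclusive
    variants: iterable of (chrom, pos), 1-based (hypothetical layout)
    """
    # Group variant positions by chromosome and sort them for binary search.
    positions = defaultdict(list)
    for chrom, pos in variants:
        positions[chrom].append(pos)
    for chrom in positions:
        positions[chrom].sort()

    flagged = set()
    for probe_id, chrom, start, end in probes:
        sorted_pos = positions.get(chrom, [])
        # Find the first variant at or after the probe start ...
        i = bisect_left(sorted_pos, start)
        # ... and flag the probe if that variant lies before the probe end.
        if i < len(sorted_pos) and sorted_pos[i] <= end:
            flagged.add(probe_id)
    return flagged


# Toy data, invented for illustration only.
probes = [
    ("probe_A", "chr1", 100, 124),  # 25-mer with a variant inside
    ("probe_B", "chr1", 200, 224),  # 25-mer with no variant
    ("probe_C", "chr2", 100, 149),  # 50-mer; nearest variant is 1 bp past the end
]
variants = [("chr1", 110), ("chr1", 300), ("chr2", 150)]

flagged = flag_polymorphic_probes(probes, variants)
clean_probes = [p for p in probes if p[0] not in flagged]
```

Only the probes in `clean_probes` would then be carried into the eQTL analysis. Sorting the positions once per chromosome and using binary search keeps the lookup cheap even with a genome-wide variant list.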
