Corrections for multiple tests for trans-eQTL #146

Open
AlexandreHUBERT17 opened this issue Aug 19, 2024 · 2 comments

Comments

@AlexandreHUBERT17

Hello, my colleagues and I are using TensorQTL for our research, and we have a few questions about the correction for multiple testing when identifying trans-eQTLs.

Many papers claim to have detected many more trans-eQTLs than cis-eQTLs, a result that is probably wrong and due to an incorrect correction for multiple testing.
We would therefore like to know more about the multiple-testing correction that TensorQTL applies when calculating trans-eQTLs.
To clarify our questions, let's take our own study as an example: 300K heterozygous markers and 15K genes expressed in a specific tissue from 500 hens.

  1. We assume that TensorQTL calculates 300K x 15K = 4.5 billion p-values.
    TensorQTL then returns only the marker x gene associations whose pvalue.Nominal is ≤ a threshold set by the user (usually 10^-5), after removing the cis-type associations identified from the 'marker-TSS' distance parameter specified by the user (usually 1 Mb). In the end, we have a file with around 1M lines.
    • What is the rationale behind the selection “pvalue.Nominal ≤ 10^-5”?
  2. We then calculate permutations using the trans.map_permutations() and trans.apply_permutations() functions in order to obtain, in addition to the pvalue.Nominal:
    • A pvalue.perm:
      • Is it calculated from the N permutations set by the 'nPerm' parameter (10K by default) in the 'trans.map_permutations()' function?
      • What do the permutations correspond to?
      • Do we just shuffle the expression values of each gene?
      • Do we then repeat the test N times for each shuffled gene expression x marker pair?
      • If so, that would mean 10K permutations x 15K x 300K, i.e. far too many tests to run. Why was 10K chosen as the default number of permutations?
      • Finally, is pvalue.perm the 95th percentile of the distribution of p-values obtained by chance after permutation?
    • A pvalBeta, calculated after permutation:
      • How are the permutations taken into account in this pvalBeta?
      • What is the formula?

Thank you in advance

@francois-a
Collaborator

The 10^-5 threshold for nominal p-values is somewhat arbitrary, and was chosen to include potentially interesting (but not genome-wide significant) results. Reporting the full set of trans associations would result in prohibitively large outputs for most datasets.

For permutations, it is important to note that the approach implemented in TensorQTL only works if all phenotypes follow a standard normal distribution (e.g., after applying an inverse normal transform). Under this assumption, empirical p-values can be obtained from genome-wide permutations of a standard normal (with the chr_s=pgr.pvar_df['chrom'] argument, this is performed as 'leave one chromosome out'). A beta distribution is fitted to the permutation p-values in the same manner as for cis-QTLs (for details, see Ongen et al., Bioinformatics, 2016). pval_perm is the value computed from the permutations; pval_beta is the corresponding beta-approximated p-value, which is the one that should be used for downstream analyses (e.g., to compute q-values).
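To make the difference between the two values concrete, here is a minimal standalone sketch. It is not TensorQTL's code, and it rests on several assumptions made for illustration only: the null distribution is simulated as the minimum of independent uniform p-values rather than obtained from real genotype permutations, the beta parameters are estimated by the method of moments only (no likelihood refinement), and the helper names `empirical_pvalue`, `fit_beta_moments` and `beta_cdf` are hypothetical.

```python
import math
import random


def empirical_pvalue(observed, null_values):
    """Permutation p-value with the usual +1 correction: (r + 1) / (n + 1)."""
    r = sum(1 for v in null_values if v <= observed)
    return (r + 1) / (len(null_values) + 1)


def fit_beta_moments(null_values):
    """Method-of-moments estimates of the two beta shape parameters."""
    n = len(null_values)
    mean = sum(null_values) / n
    var = sum((v - mean) ** 2 for v in null_values) / (n - 1)
    shape1 = mean * (mean * (1 - mean) / var - 1)
    shape2 = shape1 * (1 / mean - 1)
    return shape1, shape2


def beta_cdf(x, a, b, steps=20000):
    """Beta CDF on (0, 1] by midpoint-rule integration of the density."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    h = x / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += math.exp(
            log_norm + (a - 1) * math.log(t) + (b - 1) * math.log1p(-t)
        )
    return total * h


# Assumption: each "permutation" is simulated as the smallest of n_tests
# independent uniform p-values, i.e. the best p-value seen under the null.
rng = random.Random(0)
n_tests, n_perm = 500, 2000
null_min_pvals = [
    min(rng.random() for _ in range(n_tests)) for _ in range(n_perm)
]

observed_pval = 1e-5  # best nominal p-value observed for one phenotype

pval_perm = empirical_pvalue(observed_pval, null_min_pvals)
shape1, shape2 = fit_beta_moments(null_min_pvals)
pval_beta = beta_cdf(observed_pval, shape1, shape2)

print(f"pval_perm = {pval_perm:.4g}")
print(f"beta shapes = ({shape1:.3f}, {shape2:.1f})")
print(f"pval_beta = {pval_beta:.4g}")
```

In this sketch the empirical value can never go below 1 / (n_perm + 1), whereas the fitted beta distribution can extrapolate beyond the number of permutations, which is one reason to prefer the beta-approximated value for downstream analyses.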

@AlexandreHUBERT17
Author

Thank you for your reply,

I now have a better understanding of the distinction between the two types of p-value.

However, I didn't quite understand the purpose of the chr_s parameter. Could you tell me more?
Also, in your example, what does pgr.pvar_df correspond to? Is it the dataframe containing the positions of the variants?

Thank you in advance
