Best choice parameters #78

JoseCorCab · 2018-05-10T14:05:06Z

Hello,
I'm using epic 0.2.9 to compare two mapped samples against a mapped reference, and I have some questions for you.
I'm looking for enriched regions of unknown size by comparing two experimental groups (GROUP1: sample1 and sample2; GROUP2: reference). It is essential that no reads from one of the groups map to the enriched sequence.
First, I mapped the reads of the three samples (sample1, sample2 and reference) against the same reference genome using bowtie2. Then I used epic to compare sample1_mapping/reference_mapping and sample2_mapping/reference_mapping. I also created a chromosome-size file.

When I compared the samples with default parameters, I got large enriched regions and, as expected, many reads from both groups mapped to each enriched region.

I extracted a small test sample from each sample; specifically, I extracted a known enriched region from the genome. Then I repeated the same steps.
I ran epic many times with high and low FDR, trying combinations of window size and gaps allowed in loops.
When I checked the epic results, I saw that for the sample1/reference comparison the program returns exactly the expected region.
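
The parameter loops described above could be sketched roughly as follows. This is only an illustration, not the script actually used: the file names are hypothetical, `-t` and `-c` are the flags mentioned in this thread, and the remaining flags (`-cs`, `-w`, `-g`, `-fdr`, `-o`) are assumed from epic's help text and should be checked against `epic --help` for the installed version. The commands are only built and printed, not executed.

```python
from itertools import product


def build_epic_commands(treatment, control, chromsizes, window_sizes, gaps, fdrs):
    """Return one epic command (as an argument list) per parameter combination."""
    commands = []
    for w, g, fdr in product(window_sizes, gaps, fdrs):
        # Hypothetical output name that encodes the parameters of the run.
        outfile = f"epic_w{w}_g{g}_fdr{fdr}.txt"
        commands.append([
            "epic",
            "-t", treatment,      # treatment file (flag from this thread)
            "-c", control,        # control file (flag from this thread)
            "-cs", chromsizes,    # assumed flag: chromosome-size file
            "-w", str(w),         # assumed flag: window size
            "-g", str(g),         # assumed flag: gaps allowed
            "-fdr", str(fdr),     # assumed flag: FDR cutoff
            "-o", outfile,        # assumed flag: output file
        ])
    return commands


# Hypothetical inputs and parameter grid.
commands = build_epic_commands(
    "sample2.bed", "reference.bed", "chromsizes.txt",
    window_sizes=[50, 200, 1000], gaps=[0, 3], fdrs=[0.05, 0.5],
)
for cmd in commands:
    print(" ".join(cmd))
```

Each printed line can be reviewed before running it, for example with `subprocess.run(cmd, check=True)`.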

But for the sample2/reference comparison it returns one of the following:
- One very large region with high FDR and, as expected, reads from both samples mapped in it.
- Many very short regions with no reads mapped from one of the samples and with very low FDR, but with many small gaps between them.
- No enriched regions.
When I plot the coverage of these comparisons, I can see a clear enriched region in both.
Here are the plots:

The most notable difference between the sample1 and sample2 mappings is the coverage depth, as you can see in the plots.
I could test combinations of parameters to tune the program for the test files. But I don't think that would work for the real samples, because I'm looking for sequences of unknown size, from 3 bp up to the largest possible.
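
To make the depth difference mentioned above concrete, here is a minimal sketch of scaling per-window read counts to counts per million mapped reads, so that two mappings of different depth can be compared on the same scale. The counts and read totals are made-up toy numbers, `counts_per_million` is a hypothetical helper, and this is not a description of how epic itself handles depth.

```python
def counts_per_million(window_counts, total_reads):
    """Scale raw per-window read counts to counts per million mapped reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    scale = 1_000_000 / total_reads
    return [count * scale for count in window_counts]


# Toy data: the same enrichment profile sequenced at 10x different depth.
sample1 = counts_per_million([40, 400, 40], total_reads=2_000_000)
sample2 = counts_per_million([4, 40, 4], total_reads=200_000)

print(sample1)
print(sample2)
```

After scaling, the two toy profiles coincide, which shows that the raw difference between them came from depth alone.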
FIRST QUESTION:
Which combination of parameters do you recommend for this kind of experiment?
SECOND QUESTION:
To what degree does the difference in coverage depth between the mappings affect the results?
THIRD QUESTION:
To what degree does defining the samples as control (-c) or as treatment (-t) affect the results?

I look forward to your answer.
Thank you very much,
Jose.

endrebak · 2018-05-10T16:33:28Z

The recommendations are for specific histone/protein marks.
The sample differences should not matter much as the data are pooled if analyzed together. Library differences can be due to one poor quality experiment though. Perhaps you should investigate the data with deeptools?
You should only use ChIP samples as -t. I guess using ChIP as -c can work, but it is much better to use actual input data.

Thanks for trying epic. I’d love to be of help, but I suspect you’d get much better answers at biostars.
