-
Notifications
You must be signed in to change notification settings - Fork 0
/
2.4-discussion-ideas.Rmd
77 lines (56 loc) · 5.25 KB
/
2.4-discussion-ideas.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
There is ample evidence that increased taxon sampling benefits phylogenetic inference. It breaks long branches (Hillis 1996), increases phylogenetic accuracy (Zwickl and Hillis, 2002), resolves phylogenetic conflict between gene trees (Hedtke et al. 2006), and resolves contentious phylogenetic relationships (Townsend & Lopez-Giraldez 2010; Natsidis et al. 2019), reveal hidden orthologs and paralogs (Martin-Duran et al. 2017). It has been the focus to give insight into phylogenetic relationships of many different taxa.
Applied phylogenetic analyses have largely focused on improving taxon sampling to gain better insight in a group’s evolutionary history:
in fish (Salmonidae, Crête et al. 2012; flounders, Vinnikov et al. 2018),
birds (Wang et al. 2017)
Gastropods (Oskars et al. 2015)
Insects (Hemiptera, Song et al. 2019; Diptera, Shin et al. 2018; Lepidopetra, Wilson 2011; Orthoptera, Vandergast et al. 2017),
amphibians (Peloso et al. 2016)
Mammals (Cetartiodactyla, Agnarsson & May-Collado 2008)
Squamate reptiles (Ketchum & Benson 2010)
turtles (Vitek et al. 2018)
flowering plants (Cornales, Xiang et al. 2011; Caryophyllales, Crawley & Hilu 2012; Campanula, Mansion et al. 2012, Rosids, Sun et al. 2020)
unicellular organisms (Ciliates, Tang & Zhao 2016; Xu et al. 2021; chlamydomonas, Nakada et al. 2019)
Spiders (Kallal et al. 2020)
Sponges (Alvizu et al. 2018)
fungi (Prsanna et al. 2020)
biogeography (bees, Kayaalp et al. 2017; gallus, )
Increased taxon sampling also improves time of divergence estimates (Schulte 2013; Soares & Schrago 2015)
<!-- databases designed to preserve and democratize access to biological data constitute an amazing resource for evolutionary estimation -->
*COMPARE WITH PERFORMANCE OF OTHER PIPELINES FOR SEQUENCE SCRAPING*
*WHY WE DID NOT MAKE A BENCHMARK COMPARISON?*
<!-- Usually, initial ideas about the data are changed by analyses.
Ideally, new understanding of the data can be registered on databases,
exposing newcomers to the most up to date knowledge about the data. -->
<!-- Taking advantage of public DNA data bases have been the main focus. -->
Discussion: (although there are plans to allow storage of unpublished trees).
Describe again statistics to compare phylogenies provided by physcraper via OpenTreeOfLife.
Mention statistics provided by other tools: PhyloExplorer [@ranwez2009phyloexplorer].
Compare and discuss.
How is physcraper already useful:
- to mine targeted sequences, in this way it is similar to baited analyses from PHLAWD and pyPHLAWD. Phylota does not do baited analyses, I think, only clustered analyses.
- Finding
-
<!--
Physcraper potential:
- rapid phylogenetic placing of newly discovered species, as mentioned in @webb2010biodiversity
- obtain trees for ecophylogenetic studies, as mentioned in @helmus2012phylogenetic
- one day could be used to sistematize nucleotide databases, such as Genbank [@benson2000genbank; @wheeler2000database], as mentioned in @san2010molecular, i.e., curate ncbi taxonomic assignations.
- allows to generate custom species trees for downstream analyses, as mentioned in @stoltzfus2013phylotastic -->
Things that physcraper does not do:
- analyse the whole GenBank database [@benson2000genbank; @wheeler2000database] to find homolog regions suitable to reconstruct phylogenies, as mentioned in @antonelli2017toward. There are already some very good tools that do that.
- provide basic statistics on data availability to assemble molecular datasets, as mentioned by @ranwez2009phyloexplorer. Phyloexplorer does this?
- it is not a tree repo, as phylota is, mentioned in @deepak2014evominer
***Is physcraper intended for the nonspecialist?? We have two types of nonspecialists:
the ones that do not know about phylogenetic methods and the ones that might know
about phylogenetic methods but do not know much about a certain biological group.***
\comment(hmm yeah, I think it is probably better for the latter - phylogeneticists who are tired of blast searching :P)
All these tools are key efforts for advancing towards reproducibility in phylogenetics,
a field that has relied on processes which are somewhat artisanal.
Here, we highlight the potential of taking advantage of this careful curation work in previous phylogenetic estimates.
<!-- Creating unified standards usually ends up generating an extra standard. However, the OpenTree Taxonomy is a powerful tool that is allowing us to reach decade-long dreams of "exploring and analysing biodiversity at an accelerated pace, to return systematics into the mianstream of science" [@wilson2003encyclopedia]. -->
<!-- Physcraper might be overlapping/partially solving the goal of the Encyclopedia of Life:
"the exploration and analysis of biodiversity at a vastly accelerated pace. Its momentum will return systematics from its long sojourn at the margin and back into the mainstream of science." [@wilson2003encyclopedia] -->
<!-- Most of them focus on efficient ways to mine the data -- getting the most homologs.
Some focus on accurate ways of mining the data -- getting real and clean homologs.
Others focus on refining the alignment. -->
<!-- *from Robert: I think you can sell the program more here. Why is it better than the other methods? You mentioned in lab meeting that its difficult to run other programs, talk about that here, talk about the speed and other advantages* -->