From bdee516886a68cb77cbc1e8e667004906ca5b10f Mon Sep 17 00:00:00 2001 From: Sphinx Date: Thu, 17 Oct 2024 01:10:45 +0000 Subject: [PATCH] Automated update eedf82d415de1c780ee818743144b1dc3353cc4f --- dev/Tutorial/chapter_bibliography.html | 4 ++ dev/Tutorial/chapter_pairwise.html | 54 ++++++++++++++++++------- dev/api/Bio.Align.html | 10 ++++- dev/objects.inv | Bin 46069 -> 46082 bytes dev/searchindex.js | 2 +- 5 files changed, 53 insertions(+), 17 deletions(-) diff --git a/dev/Tutorial/chapter_bibliography.html b/dev/Tutorial/chapter_bibliography.html index d2bc291f..bcb4cc62 100644 --- a/dev/Tutorial/chapter_bibliography.html +++ b/dev/Tutorial/chapter_bibliography.html @@ -124,6 +124,10 @@ [Cavener1987]

Douglas R. Cavener: Comparison of the consensus sequence flanking translational start sites in Drosophila and vertebrates. Nucleic Acids Research 15 (4): 1353–1361 (1987). https://doi.org/10.1093/nar/15.4.1353

+
+[Chakraborty2013] +

Chakraborty, A., Bandyopadhyay, S. FOGSAA: Fast Optimal Global Sequence Alignment Algorithm. Sci Rep 3, 1746 (2013). https://doi.org/10.1038/srep01746

+
[Chapman2000]

Brad Chapman and Jeff Chang: Biopython: Python tools for computational biology. ACM SIGBIO Newsletter 20 (2): 15–19 (August 2000).

diff --git a/dev/Tutorial/chapter_pairwise.html b/dev/Tutorial/chapter_pairwise.html index 3ad98ed3..941eafe3 100644 --- a/dev/Tutorial/chapter_pairwise.html +++ b/dev/Tutorial/chapter_pairwise.html @@ -149,9 +149,10 @@ Bio.Align module contains the PairwiseAligner class for global and local alignments using the Needleman-Wunsch, Smith-Waterman, Gotoh (three-state), and Waterman-Smith-Beyer global and local pairwise -alignment algorithms, with numerous options to change the alignment -parameters. We refer to Durbin et al. [Durbin1998] -for in-depth information on sequence alignment algorithms.

+alignment algorithms, and the Fast Optimal Global Sequence Alignment Algorithm (FOGSAA), +with numerous options to change the alignment parameters. We refer to Durbin +et al. [Durbin1998] for in-depth information on sequence alignment +algorithms.

Basic usage

To generate pairwise alignments, first create a PairwiseAligner @@ -384,6 +385,29 @@ alignments if segments with a score 0 can be added to the alignment. We follow the suggestion by Waterman & Eggert [Waterman1987] and disallow such extensions.

+

If aligner.mode is set to “fogsaa”, then the Fast Optimal Global Sequence Alignment +Algorithm [Chakraborty2013] with some modifications is used. This mode +calculates a global alignment, but it is not like the regular “global” mode. +It is best suited for long alignments between similar sequences. Rather than +calculating all possible alignments like other algorithms do, FOGSAA uses a +heuristic to detect steps in an alignment that cannot lead to an optimal +alignment. This can speed up alignment; however, the heuristic makes +assumptions about your match, mismatch, and gap scores. If the match score is +less than the mismatch score or any gap score, or if any gap score is greater +than the mismatch score, then a warning is raised and the algorithm may return +incorrect results. Unlike other modes that may return more than one alignment, +FOGSAA always returns only one alignment.

+
>>> aligner.mode = "fogsaa"
+>>> aligner.mismatch_score = -10
+>>> alignments = aligner.align("AAACAAA", "AAAGAAA")
+>>> len(alignments)
+1
+>>> print(alignments[0])
+target            0 AAAC-AAA 7
+                  0 |||--||| 8
+query             0 AAA-GAAA 7
+
+

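The score conditions stated above for the “fogsaa” mode can be sketched as a small standalone check. This is only an illustration of the rule as worded in the text, not Biopython code: the function name fogsaa_score_issues and the labels used for the gap scores are hypothetical, and the actual warning logic inside PairwiseAligner may differ in detail.

```python
def fogsaa_score_issues(match, mismatch, gap_scores):
    """Return descriptions of score settings that break the stated FOGSAA rule.

    Hypothetical helper for illustration only; it is not part of Biopython.
    `gap_scores` maps an arbitrary label to a gap score.
    """
    issues = []
    # Rule part 1: the match score must not be less than the mismatch score.
    if match < mismatch:
        issues.append("match score is less than the mismatch score")
    for name, gap in gap_scores.items():
        # Rule part 2: the match score must not be less than any gap score.
        if match < gap:
            issues.append(f"match score is less than {name}")
        # Rule part 3: no gap score may be greater than the mismatch score.
        if gap > mismatch:
            issues.append(f"{name} is greater than the mismatch score")
    return issues


# Gap scores below the mismatch score satisfy all three conditions.
ok = fogsaa_score_issues(1.0, -1.0, {"open gap": -2.0, "extend gap": -2.0})
# A gap score above the mismatch score violates the third condition.
bad = fogsaa_score_issues(1.0, -10.0, {"open gap": -1.0})
print(ok)
print(bad)
```

An empty list means the scores satisfy the conditions as stated; a non-empty list names each condition under which, according to the text, a warning would be expected.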
The pairwise aligner object

@@ -452,7 +476,7 @@
  • By specifying a match score for identical letters, and a mismatch score for mismatched letters. Nucleotide sequence alignments are typically based on match and mismatch scores. For example, by default -BLAST [Altschul1990] uses a match score of +BLAST [Altschul1990] uses a match score of \(+1\) and a mismatch score of \(-2\) for nucleotide alignments by megablast, with a gap penalty of 2.5 (see section Affine gap scores for more information on gap @@ -492,7 +516,7 @@ allows you to apply different scores for different pairs of matched and mismatched letters. This is typically used for amino acid sequence alignments. For example, by default BLAST -[Altschul1990] uses the BLOSUM62 substitution +[Altschul1990] uses the BLOSUM62 substitution matrix for protein alignments by blastp. This substitution matrix is available from Biopython:

    >>> from Bio.Align import substitution_matrices
    @@ -1107,7 +1131,7 @@ 

    Aligning to the reverse strand

    Substitution matrices

    -

    Substitution matrices [Durbin1998] provide the scoring +

    Substitution matrices [Durbin1998] provide the scoring terms for classifying how likely two different residues are to substitute for each other. This is essential in doing sequence comparisons. Biopython provides a ton of common substitution matrices, @@ -1551,7 +1575,7 @@

    Reading ArrayFor two-dimensional arrays, we follow the file format of substitution matrices provided by NCBI. For example, the BLOSUM62 matrix, which is the default substitution matrix for NCBI’s protein-protein BLAST -[Altschul1990] program blastp, is stored as +[Altschul1990] program blastp, is stored as follows:

    #  Matrix made by matblas from blosum62.iij
     #  * column uses minimum score
    @@ -1621,8 +1645,8 @@ 

    Reading ArrayLoading predefined substitution matrices

    Biopython contains a large set of substitution matrices defined in the literature, including BLOSUM (Blocks Substitution Matrix) -[Henikoff1992] and PAM (Point Accepted Mutation) -matrices [Dayhoff1978]. These matrices are available +[Henikoff1992] and PAM (Point Accepted Mutation) +matrices [Dayhoff1978]. These matrices are available as flat files in the Bio/Align/substitution_matrices/data directory, and can be loaded into Python using the load function in the substitution_matrices submodule. For example, the BLOSUM62 matrix @@ -1647,7 +1671,7 @@

    Loading predefined substitution matrices[Schneider2005] uses an alphabet consisting of +[Schneider2005] uses an alphabet consisting of three-nucleotide codons:

    >>> m = substitution_matrices.load("SCHNEIDER")
     >>> m.alphabet  
    @@ -1776,7 +1800,7 @@ 

    Loading predefined substitution matrices

    Generalized pairwise alignments using a substitution matrix and alphabet

    -

    Schneider et al. [Schneider2005] created a +

    Schneider et al. [Schneider2005] created a substitution matrix for aligning three-nucleotide codons (see below in section Substitution matrices for more information). This substitution matrix is associated with an @@ -2237,13 +2261,13 @@

    Calculating the number of nonsynonymous and synonymous substitutions per sit (NG86, LWL85, YN00) as well as the maximum likelihood method (ML) to estimate dN and dS:

      -
    • NG86: Nei and Gojobori (1986) [Nei1986] +

    • NG86: Nei and Gojobori (1986) [Nei1986] (default). With this method, you can also specify the ratio of the transition and transversion rates via the argument k, defaulting to 1.0.

    • -
    • LWL85: Li et al. (1985) [Li1985].

    • -
    • YN00: Yang and Nielsen (2000) [Yang2000].

    • -
    • ML: Goldman and Yang (1994) [Goldman1994]. With +

    • LWL85: Li et al. (1985) [Li1985].

    • +
    • YN00: Yang and Nielsen (2000) [Yang2000].

    • +
    • ML: Goldman and Yang (1994) [Goldman1994]. With this method, you can also specify the equilibrium codon frequency via the cfreq argument, with the following options:

        diff --git a/dev/api/Bio.Align.html b/dev/api/Bio.Align.html index 9e4e3591..e99200ee 100644 --- a/dev/api/Bio.Align.html +++ b/dev/api/Bio.Align.html @@ -2966,7 +2966,15 @@

        Submodules