Command-line tool for converting the ABIF file format to XML.
The first version of the command-line tool abi2xml was written in 2003 by Erik Sjölund, then working at Karolinska Institutet. Its purpose is to parse the binary file format produced by an ABI PRISM TM 377 DNA Sequencer and write the information as plain text into an XML file.
In February 2020 the source code was migrated from the original SourceForge repository to GitHub. The migrated source code came from the file abi2xml-1.2.tar.gz, which was released in 2006.
The ABIF file format is described in this publication:
by Clark Tibbetts, Ph.D., Professor of Microbiology, Vanderbilt University, August 1995.
Applied Biosystems later published its own specification. The PDF document Applied Biosystems Genetic Analysis Data File Format is still available in the QAbifReader GitHub repository.
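As a rough illustration of what a parser for this format starts with, the Python sketch below reads the fixed-size header at the beginning of an ABIF file. The layout it assumes (a 4-byte `ABIF` signature, a 16-bit version number, then a 28-byte directory entry, all big-endian) is my reading of the Applied Biosystems specification and should be checked against the PDF. The names `HEADER`, `AbifHeader` and `read_abif_header` are made up for this example and are not taken from the abi2xml source.

```python
import struct
from typing import NamedTuple

# Assumed big-endian layout: signature, version, then one directory entry
# (name, number, element type, element size, element count, data size,
# data offset, data handle). Verify against the specification before use.
HEADER = struct.Struct(">4sH4sihhiiii")


class AbifHeader(NamedTuple):
    version: int      # file format version, e.g. 101
    dir_name: str     # tag name of the directory entry
    num_entries: int  # number of records in the directory
    dir_offset: int   # byte offset of the directory in the file


def read_abif_header(data: bytes) -> AbifHeader:
    """Parse the leading bytes of an ABIF file (hypothetical helper)."""
    if len(data) < HEADER.size:
        raise ValueError("too short to be an ABIF file")
    (signature, version, name, _number, _elem_type, _elem_size,
     num_elements, _data_size, data_offset, _handle) = HEADER.unpack_from(data)
    if signature != b"ABIF":
        raise ValueError("missing ABIF signature")
    return AbifHeader(version, name.decode("ascii"), num_elements, data_offset)
```

A call such as `read_abif_header(open(path, "rb").read(HEADER.size))` returns the version and the location of the directory, which a full parser would then walk to find the tagged data records.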
The Qt library, version 4 (4.1 or higher), is required for building abi2xml. Qt is available for Linux, Microsoft Windows, Mac OS X and other platforms.
After Qt has been installed, unpack the abi2xml sources and run: qmake && make && make install
If you are running Windows on a 32-bit (i386) platform, you can download the file abi2xml-1.2.zip and unzip it. If your Windows computer lacks zip support, you first have to install 7-Zip to be able to unzip the file.
Usage: abi2xml -i binaryfile -o xmlfile
Other option flags are also available. To list them, type abi2xml --help.
[erik@linux]$ abi2xml --help
abi2xml 1.2
This program parses the binary file format coming from an
ABI PRISM TM 377 DNA Sequencer and writes the information out as
an xml file
Usage: abi2xml [OPTIONS]...
  -h, --help                     Print help and exit
  -V, --version                  Print version and exit
  -i, --input-file=STRING        input abi file
  -o, --output-file=STRING       output xml file
  -I, --input-dir=STRING         input dir with abi files
  -O, --output-dir=STRING        output dir
  -s, --abi-file-suffix=STRING   suffix of abi files ( used with --input-dir )
                                 (default=`abi')
  -a, --int-vector-as-attribute  write integer vectors inside attributes ( It
                                 makes file size smaller )
  -e, --input-encoding=STRING    input string encoding. Available encodings
                                 listed at:
                                 http://doc.trolltech.com/3.0/qtextcodec.html
                                 (default=`Apple Roman')
To convert a whole directory of ABI files to XML:
[erik@linux]$ abi2xml -I dir_with_abi_files -O output_dir
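If the conversion needs to be driven from a script, the directory mode can be wrapped in a few lines of Python. This is only a sketch: the helper names `abi2xml_command` and `convert_directory` are invented here, and it assumes that abi2xml is on the PATH, that it exits with a non-zero status on failure, and that the suffix given to `-s` is written without a leading dot (as the default `abi` suggests). Creating the output directory beforehand is a precaution, since the help text does not say whether abi2xml creates it.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional


def abi2xml_command(input_dir: str, output_dir: str,
                    suffix: Optional[str] = None) -> List[str]:
    """Build the argument list for a directory conversion."""
    cmd = ["abi2xml", "-I", input_dir, "-O", output_dir]
    if suffix is not None:
        # Assumption: suffix without a dot, e.g. "ab1" instead of "abi".
        cmd += ["-s", suffix]
    return cmd


def convert_directory(input_dir: str, output_dir: str,
                      suffix: Optional[str] = None) -> None:
    """Run abi2xml on every matching file in input_dir."""
    if shutil.which("abi2xml") is None:
        raise FileNotFoundError("abi2xml was not found on PATH")
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError on a non-zero exit status.
    subprocess.run(abi2xml_command(input_dir, output_dir, suffix), check=True)
```

For files that use the `.ab1` extension, `convert_directory("dir_with_abi_files", "output_dir", suffix="ab1")` would pass `-s ab1` in addition to the two directory options.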
If you want to test abi2xml but do not have any ABI files, you may use the file staden-src-1-6-0/userdata/Sample_671.ab1 found in staden-src-1-6-0.tar.gz from the Staden project.
An XSLT script can be useful when you want to retrieve information from the XML file. The xslt_examples subdirectory contains some example scripts.
You run an XSLT script like this:
xsltproc xsltscript abi2xml-generated-xmlfile
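Before writing an XSLT script of your own, it helps to know which element names occur in the generated file. The Python sketch below counts them using only the standard library. It makes no assumption about the abi2xml output schema, and the function name `count_elements` is invented for this example.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from typing import Union


def count_elements(xml_data: Union[str, bytes]) -> Counter:
    """Count how often each element tag occurs in an XML document."""
    root = ET.fromstring(xml_data)
    # root.iter() walks the root element and all of its descendants.
    return Counter(element.tag for element in root.iter())
```

Reading the file as bytes, for instance `count_elements(Path("xmlfile").read_bytes())`, lets the XML parser pick up the encoding declared in the file itself.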
QAbifReader is a Qt5 ABIF file reader for Genetic Analysis. License: GPL v2. Programming language: C++.
ABIParser.py is a Python module for parsing ABI files. License: GPL v2. Programming language: Python.
ABITrace is a Java class in BioJava. License: LGPL. Programming language: Java.
Bio::SeqIO::abi is a Perl module in BioPerl for reading ABI files. It does not parse the ABI files itself but uses the Staden package for that (see the Staden entry below). License: "You may distribute this module under the same terms as perl itself". Programming language: Perl.
EMBOSS includes abiview, an application that parses an ABI file and converts the information to vector or bitmap images or to text files. License: GPL. Programming language: C.
Staden can extract information from ABI trace files (e.g. with the program extract_seq). License: BSD. Programming language: C.
abi2xml was referenced
- in the book Plant DNA Barcoding and Phylogenetics from 2015, which is also available as a PDF.
- in the scientific paper Modification of orthogonal tRNAs: unexpected consequences for sense codon reassignment, published in Nucleic Acids Research in 2016.
- in the doctoral dissertation Expanding and evaluating sense codon reassignment for genetic code expansion by Biddle, C. William in 2017.