ERV Project Data

Robert J. Gifford edited this page Dec 4, 2024 · 3 revisions

Genome Feature Definitions

Putative ancestral open reading frames (ORFs) encoding accessory genes have been identified in many endogenous lentiviruses. In some cases these represent clear homologs of the accessory genes found in exogenous lentiviruses, but in others it remains unclear whether they are homologs of previously described accessory genes or distinct genes. We therefore extended the set of lentivirus genome features defined in our core project to include these putatively distinct genes.


Consensus Reference Sequences

For each lentiviral paleovirus species, we have defined a "master" reference sequence. Each one represents a reconstructed genome of the ancestral lentivirus that gave rise to the corresponding ERV lineage.

Reference Sequences

Click to download each sequence in FASTA format:

Supplemental Resources

Annotated Consensus Reference Sequences (PDFs)

Figures illustrating annotated paleovirus reference sequences, as collated from published manuscripts, are available in PDF format for each species:

  • Rabbit endogenous lentivirus K: RELIK
  • Prosimian immunodeficiency virus 1: pSIV-1
  • Prosimian immunodeficiency virus 2: pSIV-2
  • Mustelid endogenous lentivirus: MELV
  • Dermopteran endogenous lentivirus: DELV
  • Springhare endogenous lentivirus: SpELV

Raw Sequence Data (Individual ERV Loci)

Overview

  1. FASTA sequences: lentiviral ERVs identified in the whole genome sequences of vertebrate species. ERVs were identified by screening genomes using the Database-Integrated Genome Screening (DIGS) tool. A complete list of genome assemblies screened is available here.

  2. GenBank XML sequences: ERV sequences that were generated via PCR and submitted to GenBank.
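As a convenience for working with the downloaded FASTA files, the following is a minimal sketch of a reader that uses only the Python standard library. The function name `read_fasta` and the choice to key records by the first whitespace-delimited token of each header are assumptions made for illustration; they are not part of this project's tooling.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Collect FASTA records into a dict of {identifier: sequence}.

    Assumption: the identifier is the first whitespace-delimited token
    after '>' on each header line; the rest of the header is ignored.
    """
    records: Dict[str, str] = {}
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records[header] = "".join(chunks)
            parts = line[1:].split()
            header = parts[0] if parts else ""
            chunks = []
        elif header is not None:
            # Sequence lines are concatenated without separators.
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)
    return records
```

To read a file, pass an open file handle, for example `read_fasta(open(path))`, where `path` points at one of the downloaded FASTA files.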

Supporting Files


ERV Locus Nomenclature

We have applied a systematic approach to naming lentivirus ERV lineages and loci.

Each element was assigned a unique identifier (ID) constructed from a defined set of components.

ERV Nomenclature

The first component is the classifier ‘ERV’ (endogenous retrovirus).

The second component is a composite of two distinct subcomponents separated by a period:

(i) the name of the lentivirus from which the ERV derives;
(ii) a numeric ID that identifies a unique insertion locus. Orthologous copies in different species are given the same number.

The third component of the ID defines the set of host species in which the ortholog occurs.
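The three-part scheme above can be sketched in code. This is an illustration under stated assumptions, not the project's own implementation: the text does not say how the three components are joined, so a hyphen is assumed here, and the host label `HostA` is a hypothetical placeholder rather than a real ID from the dataset. The names `ErvLocusId`, `build_id`, `parse_id`, and `same_locus` are likewise invented for this example.

```python
import re
from typing import NamedTuple


class ErvLocusId(NamedTuple):
    classifier: str  # first component: always "ERV"
    virus: str       # second component (i): lentivirus name
    locus: int       # second component (ii): numeric insertion-locus ID
    host: str        # third component: host species set


# Assumption: components are joined by hyphens. The period inside the
# second component is the only separator stated in the text. Virus names
# may themselves contain hyphens (e.g. "pSIV-1"), so the virus group is
# matched greedily up to the period.
_ID_PATTERN = re.compile(r"ERV-(.+)\.(\d+)-([^-.]+)")


def build_id(virus: str, locus: int, host: str) -> str:
    """Assemble an ID string from its components."""
    return f"ERV-{virus}.{locus}-{host}"


def parse_id(text: str) -> ErvLocusId:
    """Split an ID string back into its components."""
    match = _ID_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a recognised ERV locus ID: {text!r}")
    virus, locus, host = match.groups()
    return ErvLocusId("ERV", virus, int(locus), host)


def same_locus(a: ErvLocusId, b: ErvLocusId) -> bool:
    """Orthologous copies share the virus name and the locus number."""
    return (a.virus, a.locus) == (b.virus, b.locus)
```

For example, `parse_id(build_id("pSIV-1", 3, "HostA"))` recovers the virus name `pSIV-1` and locus number `3`. Two IDs that differ only in the host component are treated as orthologs by `same_locus`.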

