Bioinformatics pipeline for lncRNA prediction in Arabidopsis thaliana

vivek37373/lncFETCHER

lncFETCHER

lncFETCHER is a bioinformatics pipeline for predicting long non-coding RNAs (lncRNAs) in Arabidopsis thaliana using a pseudoalignment-guided approach. It facilitates the identification of lncRNAs from large-scale RNA-seq datasets and was developed for our work titled "Pseudoalignment-guided lncRNA identification with extended multi-omics annotations in Arabidopsis thaliana."

Please note that the pipeline is still under development, so users are advised to run each step individually. Because it manipulates several files, most steps may require manual intervention, especially if you are using a genome other than Arabidopsis. In this work, we used the Arabidopsis thaliana files for genome assembly TAIR10 (https://plants.ensembl.org/Arabidopsis_thaliana/Info/Index).

Installation

  1. Clone the repository:
       git clone https://github.com/vivek37373/lncFETCHER.git
  2. Download and install dependencies:

    • Programs:
      • fastq-dl
      • trim_galore
      • RSEM
      • salmon
      • hisat2
      • stringtie
      • samtools
      • gffread
      • gffcompare
      • cgat
      • FEELnc
      • cpc
      • featureCounts
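
Before running any step, it can help to confirm that the programs listed above are reachable on your PATH. The following is a minimal sketch, not part of lncFETCHER itself. The function name check_tools and the variable LNCFETCHER_TOOLS are hypothetical, and the executable names are assumptions, since a package does not always install a command with the same name as the package (for example, RSEM is assumed here to be checked via rsem-calculate-expression). FEELnc and cpc are left out because the name of their entry script depends on how they were installed; add the names that apply to your setup.

```sh
#!/bin/sh
# Assumed executable names for the dependencies; adjust to your installation.
# FEELnc and cpc are omitted on purpose: add their entry scripts yourself.
LNCFETCHER_TOOLS="fastq-dl trim_galore rsem-calculate-expression salmon hisat2 stringtie samtools gffread gffcompare cgat featureCounts"

# Print every command that cannot be found on PATH.
# Returns 0 when all commands are found, 1 when at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

After sourcing the snippet, running check_tools $LNCFETCHER_TOOLS (unquoted, so the list splits into separate names) prints one "missing:" line per program that still needs to be installed.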

Usage

The lncFETCHER.sh script documents each step of the analysis in its comments. It guides users through mapping data to the reference genome, assembling the transcriptome, creating a combined transcriptome, and predicting lncRNAs. Before running the pipeline, ensure that you have downloaded the essential files, such as the genome FASTA, annotation files in GFF/GTF format, and transcript FASTA, into the ref folder.
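
Because the steps are run one at a time, a quick check that the reference inputs are in place can save a failed run. The sketch below is an illustration only. The function name check_ref_files and the file names genome.fa, annotation.gtf, and transcripts.fa are hypothetical placeholders, not names required by lncFETCHER.sh; replace them with the files you actually downloaded.

```sh
#!/bin/sh
# Hypothetical placeholders: the folder and file names below are assumptions.
REF_DIR="ref"
REF_FILES="genome.fa annotation.gtf transcripts.fa"

# Report every expected input that is absent or has zero size.
# Usage: check_ref_files DIR FILE...
# Returns 0 when all files exist and are non-empty, 1 otherwise.
check_ref_files() {
    dir=$1
    shift
    bad=0
    for f in "$@"; do
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f"
            bad=1
        fi
    done
    return "$bad"
}
```

After sourcing the snippet, check_ref_files "$REF_DIR" $REF_FILES lists any input that still needs to be downloaded. The test uses -s rather than -e so that an interrupted download that left an empty file is also reported.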

References

  1. Benchmark of long non-coding RNA quantification for RNA sequencing of cancer samples
  2. The long non-coding RNA landscape of Candida yeast pathogens

For any questions or requests, please contact:

Feel free to reach out if you have any inquiries or require further assistance.
