GFF-Ex: a genome feature extraction package

Springer Science and Business Media LLC - Volume 7 - Pages 1-3 - 2014
Achal Rastogi¹, Dinesh Gupta¹
¹ Bioinformatics Laboratory, Structural and Computational Biology Group, International Center for Genetic Engineering and Biotechnology, Aruna Asaf Ali Marg, New Delhi, India

Abstract

Genomic features of whole genome sequences emerging from various sequencing and annotation projects are represented and stored in several formats. Among these formats, GFF (Generic/General Feature Format) has emerged as a widely accepted, portable, and successfully used flat file format for storing genome annotations. With increasing interest in genome annotation projects and in secondary and meta-analyses, there is a need for efficient tools to extract sequences of interest from GFF files. We have developed GFF-Ex to automate feature-based extraction of sequences from a GFF file. In addition to automated sequence extraction for the features described within a feature file, GFF-Ex assigns boundaries for features that are not explicitly specified in the GFF format (introns, intergenic regions, and regions upstream of genes) and exports the corresponding primary sequence information into predefined, feature-specific output files. The GFF-Ex package consists of several UNIX shell and Perl scripts. Compared to other available GFF parsers, GFF-Ex is a simpler tool that permits sequence retrieval based on additional inferred features. GFF-Ex can also be integrated with any genome annotation or analysis pipeline. GFF-Ex is freely available at http://bioinfo.icgeb.res.in/gff .
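To illustrate what assigning boundaries for features not explicitly specified in a GFF file can involve, the following is a minimal Python sketch. It is not the GFF-Ex implementation (which the abstract describes as UNIX shell and Perl scripts). The sample records, the function names `parse_gff`, `infer_introns`, and `infer_intergenic`, and the use of GFF3-style `ID=`/`Parent=` attributes are all assumptions made for illustration. Introns are taken as the gaps between consecutive exons of the same parent, and intergenic regions as the gaps between consecutive genes on the same sequence, using 1-based inclusive coordinates.

```python
from collections import defaultdict

# Hypothetical toy annotation (not from the GFF-Ex paper). Columns:
# seqid, source, type, start, end, score, strand, phase, attributes
GFF_TEXT = """\
chr1\tdemo\tgene\t101\t500\t.\t+\t.\tID=gene1
chr1\tdemo\texon\t101\t200\t.\t+\t.\tParent=gene1
chr1\tdemo\texon\t301\t500\t.\t+\t.\tParent=gene1
chr1\tdemo\tgene\t801\t900\t.\t+\t.\tID=gene2
chr1\tdemo\texon\t801\t900\t.\t+\t.\tParent=gene2
"""


def parse_gff(text):
    """Parse tab-separated GFF lines into dictionaries, skipping comments."""
    records = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # ignore malformed lines in this sketch
        # Assumes GFF3-style "key=value;key=value" attributes.
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        records.append({
            "seqid": cols[0],
            "type": cols[2],
            "start": int(cols[3]),
            "end": int(cols[4]),
            "strand": cols[6],
            "attrs": attrs,
        })
    return records


def infer_introns(records):
    """Return {parent_id: [(start, end), ...]} for gaps between adjacent exons."""
    exons = defaultdict(list)
    for rec in records:
        if rec["type"] == "exon" and "Parent" in rec["attrs"]:
            exons[rec["attrs"]["Parent"]].append((rec["start"], rec["end"]))
    introns = {}
    for parent, spans in exons.items():
        spans.sort()
        introns[parent] = [
            (prev_end + 1, next_start - 1)
            for (_, prev_end), (next_start, _) in zip(spans, spans[1:])
            if next_start - prev_end > 1  # skip touching or overlapping exons
        ]
    return introns


def infer_intergenic(records):
    """Return {seqid: [(start, end), ...]} for gaps between genes on each sequence."""
    genes = defaultdict(list)
    for rec in records:
        if rec["type"] == "gene":
            genes[rec["seqid"]].append((rec["start"], rec["end"]))
    intergenic = {}
    for seqid, spans in genes.items():
        spans.sort()
        gaps = []
        furthest_end = spans[0][1]
        for start, end in spans[1:]:
            if start - furthest_end > 1:
                gaps.append((furthest_end + 1, start - 1))
            # Track the furthest end so overlapping genes do not create false gaps.
            furthest_end = max(furthest_end, end)
        intergenic[seqid] = gaps
    return intergenic


records = parse_gff(GFF_TEXT)
introns = infer_introns(records)
intergenic = infer_intergenic(records)
```

For the toy records above, `introns` holds one intron for `gene1` spanning positions 201 to 300 and none for the single-exon `gene2`, while `intergenic` holds one region on `chr1` spanning positions 501 to 800. Extracting the corresponding sequences would additionally require the genome FASTA file, which this sketch leaves out.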
