ANNOVAR is a popular tool for filtering and annotating genetic variants in large sequencing data sets, such as those produced by whole-genome or exome sequencing. It does not require much computational power, so it can run on an ordinary desktop computer, and it can annotate single nucleotide variants, small insertions/deletions and even larger structural variations. It outputs annotations and predictions of the functional consequences of variants, cytogenetic bands, evolutionarily conserved regions and allele frequencies (based on dbSNP). ANNOVAR can also be extended to query other annotation databases, as long as they conform to Generic Feature Format version 3 (GFF3).
HOW DOES ANNOVAR WORK?
ANNOVAR is command-line driven software that works with text-based input files in which each line corresponds to one genetic variant. Typically, a line contains the chromosome, the start and end nucleotide positions, and the reference and observed nucleotide(s).
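The input layout described above can be illustrated with a small parser. This is a minimal sketch in Python, not part of ANNOVAR itself: the `Variant` class, the function names and the coordinates in the example are hypothetical and chosen only for illustration. It assumes whitespace-separated fields in the order chromosome, start, end, reference, observed, and it assumes that `-` stands for an empty allele (for example, the reference side of an insertion).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Variant:
    """One variant line: chromosome, start, end, reference, observed."""
    chrom: str
    start: int
    end: int
    ref: str
    obs: str


def parse_variant_line(line: str) -> Variant:
    """Parse a single whitespace-separated variant line.

    Only the first five fields are used; any extra columns are ignored.
    """
    fields = line.split()
    if len(fields) < 5:
        raise ValueError(f"expected at least 5 fields, got {len(fields)}: {line!r}")
    chrom, start, end, ref, obs = fields[:5]
    return Variant(chrom, int(start), int(end), ref, obs)


def parse_variants(text: str) -> List[Variant]:
    """Parse every non-blank line of a text block into Variant records."""
    return [parse_variant_line(line) for line in text.splitlines() if line.strip()]


# Illustrative input only; the positions are made up for this example.
example = """\
1 948921 948921 T C
1 1404001 1404001 G T
16 50763778 50763778 - C
"""

variants = parse_variants(example)
for v in variants:
    print(v.chrom, v.start, v.end, v.ref, v.obs)
```

Keeping each variant as a small immutable record makes it easy to pass the parsed lines on to later filtering or annotation steps.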
To annotate variants, ANNOVAR needs to download annotation data sets from external databases such as the UCSC Genome Browser, RefSeq or Ensembl. ANNOVAR is designed to detect intronic, exonic, intergenic, 5′/3′-UTR and upstream/downstream variants. Annotation can be performed on a gene basis or on a genomic-region basis. Some genes frequently harbor nonsense mutations without any deleterious consequence; to exclude these from the final results, a list of more than 2000 “dispensable” genes has been compiled. Variants in these “dispensable” genes are therefore filtered out.
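The filtering step at the end of this paragraph can be sketched as a simple set lookup. This is an illustrative Python sketch under stated assumptions, not ANNOVAR's own code: the function `drop_dispensable`, the dictionary keys `gene` and `consequence`, and the gene names are all hypothetical. Following the wording above, it drops every variant that falls in a listed gene.

```python
from typing import Dict, Iterable, List, Set


def drop_dispensable(
    variants: Iterable[Dict[str, str]],
    dispensable_genes: Set[str],
) -> List[Dict[str, str]]:
    """Keep only variants whose gene is not in the dispensable set."""
    return [v for v in variants if v["gene"] not in dispensable_genes]


# Hypothetical gene names and annotations, for illustration only.
dispensable = {"GENE_A", "GENE_B"}
annotated = [
    {"gene": "GENE_A", "consequence": "nonsense"},
    {"gene": "GENE_C", "consequence": "nonsense"},
    {"gene": "GENE_B", "consequence": "missense"},
]

kept = drop_dispensable(annotated, dispensable)
print([v["gene"] for v in kept])  # prints ['GENE_C']
```

Using a set for the gene list keeps each lookup constant-time, which matters when the list holds a couple of thousand genes and the input holds many variants.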
Of note, ANNOVAR and VAT are the only annotation tools capable of handling structural variations (e.g. large deletions/duplications).