DCARv2 Genome Paper Variants 2016 JBrowse VCF

Resource Type: 
File
File Type: 
VCF
Download: 
File Name               Available at      Size       MD5
58.pubsnpav.vcf.gz      CarrotOmics       304.35 MB  cf4b4a193fc19cb495dc0f238891460e
58.pubsnpav.vcf.gz.tbi  CarrotOmics       216.23 KB  ef5000c025b0936f021d2af7d1584be9
carrot_79200            ftp.ncbi.nih.gov
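
The MD5 values listed for the two CarrotOmics files can be used to check that a download completed intact. The following is a minimal sketch, assuming the files have already been saved locally; the function names and the local path are illustrative and not part of the resource.

```python
import hashlib

# Checksums copied from the download listing for this record.
EXPECTED_MD5 = {
    "58.pubsnpav.vcf.gz": "cf4b4a193fc19cb495dc0f238891460e",
    "58.pubsnpav.vcf.gz.tbi": "ef5000c025b0936f021d2af7d1584be9",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, file_name):
    """Return True if the file at `path` matches the listed MD5 for `file_name`."""
    return md5_of_file(path) == EXPECTED_MD5[file_name]
```

For example, `verify_download("58.pubsnpav.vcf.gz", "58.pubsnpav.vcf.gz")` would return True only if the local copy matches the listed checksum.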
Description: 

Collection of 1,393,425 SNP variants from the genome publication. This is the VCF file used for JBrowse. See the linked analysis below for further details.
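
As a quick sanity check, the number of variant records in a downloaded copy can be compared with the SNP count stated above. This is a sketch only: it assumes a local copy of the compressed VCF, relies on the fact that bgzip output is readable as ordinary gzip, and uses a hypothetical function name.

```python
import gzip


def count_vcf_records(path):
    """Count variant records (non-header, non-blank lines) in a gzip/bgzip VCF."""
    count = 0
    with gzip.open(path, "rt") as handle:
        for line in handle:
            # Header and metadata lines in a VCF start with '#'.
            if line.strip() and not line.startswith("#"):
                count += 1
    return count
```

If the file matches this description, `count_vcf_records("58.pubsnpav.vcf.gz")` would be expected to equal the stated number of SNP variants.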

There is 1 relationship.
Relationships
The FASTA file "Carrot Genome Assembly DCARv2 Sequence Original Naming Scheme" is the reference genome for the VCF file "DCARv2 Genome Paper Variants 2016 JBrowse VCF".
References: 
The following records refer to this file:
Analysis
Analysis: 
Name    Description

We used BWA-MEM version 0.7.10 to map the resequencing reads from all carrot genotypes to the carrot reference genome with the parameters -a -M -t 42. Alignments were filtered with SAMtools version 0.1.19 to keep only primary alignments with a mapping quality of at least 30 (parameters -q 30 -F 256). Duplicate reads were marked using MarkDuplicates from Picard tools version 1.119 (https://broadinstitute.github.io/picard/). GATK version 3.3-0 was used to identify SNP variants for each genotype following the GATK best practices workflow, with RealignerTargetCreator, IndelRealigner, HaplotypeCaller, and GenotypeGVCFs. SelectVariants was then used to separate SNPs, indels, and other variants. Reads used to construct the doubled haploid reference genome were also analyzed as a control, and variants that were also present in this control were filtered out with a custom Perl program. The remaining variants were filtered using VCFtools v0.1.12a with the parameters --maf 0.1, --min-meanDP 5, and --max-missing 1.
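
To illustrate what the three site-level thresholds (--maf 0.1, --min-meanDP 5, --max-missing 1) mean for a single VCF record, the sketch below re-implements them in plain Python. It is not the VCFtools code and was not part of the analysis. It assumes biallelic sites, a FORMAT column containing GT and DP, and it averages depth over samples that report a numeric DP, which may differ from how VCFtools handles missing depths. The function name and the example record are made up.

```python
def site_passes(vcf_line, min_maf=0.1, min_mean_dp=5.0, max_missing=1.0):
    """Approximate the --maf, --min-meanDP and --max-missing site filters.

    Simplified illustration for biallelic sites; not the VCFtools implementation.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    format_keys = fields[8].split(":")
    gt_idx = format_keys.index("GT")
    dp_idx = format_keys.index("DP") if "DP" in format_keys else None
    samples = fields[9:]

    called = 0          # samples with a complete genotype call
    ref_alleles = 0     # count of reference ('0') alleles
    total_alleles = 0   # count of all called alleles
    depths = []

    for sample in samples:
        values = sample.split(":")
        alleles = values[gt_idx].replace("|", "/").split("/")
        if "." in alleles:
            continue  # missing genotype
        called += 1
        total_alleles += len(alleles)
        ref_alleles += alleles.count("0")
        if dp_idx is not None and dp_idx < len(values) and values[dp_idx].isdigit():
            depths.append(int(values[dp_idx]))

    # --max-missing is a minimum call rate: 1 means no missing genotypes allowed.
    if not samples or called / len(samples) < max_missing:
        return False

    # Minor allele frequency for a biallelic site.
    ref_freq = ref_alleles / total_alleles
    if min(ref_freq, 1.0 - ref_freq) < min_maf:
        return False

    # Mean read depth across samples that report a depth.
    mean_dp = sum(depths) / len(depths) if depths else 0.0
    return mean_dp >= min_mean_dp


# Made-up record with three samples, all called, mean depth 8, MAF 0.5.
example = "chr1\t100\t.\tA\tG\t50\t.\t.\tGT:DP\t0/0:10\t0/1:8\t1/1:6"
print(site_passes(example))
```

With --max-missing 1, any site with even one uncalled genotype is dropped, so in this sketch a record containing a `./.` genotype fails regardless of its allele frequency or depth.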

Variant detection with GATK identified 39,695,937 SNP variants; after filtering, 1,393,425 SNPs remained.

These variants were submitted to dbSNP, but that database has since limited its coverage to human variants only. The submitted files are now available only in archived form at ftp://ftp.ncbi.nih.gov/snp/organisms/archive/carrot_79200/

Post-publication, the variant file was further annotated with ANNOVAR, which assigned each variant to one of the following categories: intergenic, upstream, downstream, splicing, intronic, exonic:synonymous SNV, exonic:nonsynonymous SNV, exonic:stopgain, exonic:stoploss, or, in some cases, a combination of these categories. This file can be downloaded from the link below.

Data from this analysis can be viewed in JBrowse here.

License: 
Name: Attribution 4.0 International (CC BY 4.0)
License Summary

You are free to:

  • Share: copy and redistribute the material in any medium or format.

  • Adapt: remix, transform and build upon the material for any purpose, even commercially.

The licensor cannot revoke these freedoms as long as you follow the following license terms:

  • Attribution You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.

  • No additional restrictions: You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.

Notices:

  • You do not have to comply with the license for elements of the material in the public domain or where your use is permitted by an applicable exception or limitation.

  • No warranties are given. The license may not give you all of the permissions necessary for your intended use. For example, other rights such as publicity, privacy, or moral rights may limit how you use the material.

Full Legal Text: https://creativecommons.org/licenses/by/4.0/legalcode