TogoVar datasets (GRCh38)

Variant frequency datasets for which you can apply to the NBDC human databases∗2 for use of individual-level data∗1

Click the links in the "Included controlled-access datasets" column to apply for use of individual-level data.

| Variant dataset name | Analysis method | Liftover GRCh37 | Target population | Healthy subjects | Affected subjects | Sample size | Number of alleles | Included controlled-access datasets |
|---|---|---|---|---|---|---|---|---|
| GEM Japan Whole Genome Aggregation (GEM-J WGA) Panel | WGS | | Japanese | | | 7,609 | 95,555,740 | 6 datasets |
| JGA-SNP | SNP-Chip | | Japanese | | | 183,884 | 1,249,671 | 3 datasets |
| JGA-WES | WES | | Japanese | | | 125 | 4,668,805 | 7 datasets |
| JGA-WGS | WGS | | Japanese | | | 78 | 20,822,413 | 4 datasets |

∗1: fastq/bam/cel files and/or lists of genotype data, etc.
∗2: Japanese Genotype-phenotype Archive (JGA) / AMED Genome group sharing Database (AGD)

Other variant frequency datasets

| Variant dataset name | Analysis method | Liftover GRCh37 | Target population | Healthy subjects | Affected subjects | Sample size | Number of alleles | Author | Version/Last updated |
|---|---|---|---|---|---|---|---|---|---|
| Genome Aggregation Database (gnomAD) exomes | WES | | Mixed | | | 730,947 | 183,717,261 | Broad Institute | v4.1 |
| Genome Aggregation Database (gnomAD) genomes | WGS | | Mixed | | | 76,215 | 759,336,320 | Broad Institute | v4.1 |
| NCBN | WGS | | Mixed | | | JPN: 9,290; 1KGP: 2,504 | 215,729,032 | National Center Biobank Network (NCBN) | 2024/6/28 |
| ToMMo 54KJPN Allele Frequency Panel (54KJPN) | WGS | | Japanese | | | 54,302 | 252,777,334 | Tohoku Medical Megabank Organization | v20230626 |

Note: 54KJPN consists of SNVs (Autosome, chrX(PAR1+PAR2+XTR) and chrMT) and INDELs (Autosome and chrX(PAR1+PAR2+XTR)).

Non-variant frequency datasets

| Dataset name | Version/Last update | Description | Author |
|---|---|---|---|
| ClinVar | 2024/10/03 | Clinical significance of variants | NCBI |
| Colil | Obtained by API | Information on citation relationships in life sciences literature | DBCLS |
| GRCh38.p13 | 2019/03/01 | Human genome reference sequence | GRC |
| GWAS Catalog | 2024/10/15 | Catalog of human genome-wide association studies | NHGRI-EBI |
| HGNC symbol report | 2024/08/27 | Approved human gene nomenclature and associated gene information | HGNC |
| LitVar | Obtained by API | Information on papers in which the names of variants appear | NCBI |
| MedGen | 2024/10/15 | Information about conditions and phenotypes related to Medical Genetics | NCBI |
| MGeND | 2024/03/08 | Clinical significance of variants collected from the Japanese population | NCGM |
| PubMed | 2024/10/14 | Information on papers | NCBI |
| PubTator Central | 2024/01/04 | Information on papers in which the names of variants appear | NCBI |

Note: TogoVar obtained ClinVar variants from the VCF file that contains only variants for which GRCh38 positions were determined.
Note: Disease names in MGeND were mapped to MedGen using a proprietary method, so they may not match the disease names listed in MGeND.

Tools for data processing

| Name | Ver. | Description | Author |
|---|---|---|---|
| bcftools | 1.20 | Split multiallelic sites into biallelic variants, exclude reference mismatches, and normalize them | Genome Research Ltd. |
| BioReT | | Identify germline short variants (SNPs and Indels) in WES from JGA to produce a joint callset in VCF format | Amelieff |
| GATK Best Practice - Germline short variant discovery (SNPs and Indels) | 4 | Identify germline short variants (SNPs and Indels) in WGS from JGA to produce a joint callset in VCF format | Broad Institute |
| transanno | 0.45 | Accurate VCF LiftOver tool for converting genome assemblies from GRCh37 to GRCh38 | OKAMURA, Yasunobu |
| Variant Effect Predictor (VEP) | Ensembl release 112 | Add annotations such as gene names, consequences, or deleteriousness predictions (AlphaMissense, SIFT, and PolyPhen) to variants | EMBL-EBI |
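
The bcftools step listed above (split multiallelic sites, exclude reference mismatches, normalize) can be sketched as a single `bcftools norm` call. This is a minimal illustration, not the command TogoVar actually runs: the exact options used by TogoVar are not stated here, and the function name `build_norm_command` and all file names are hypothetical. The flags are standard `bcftools norm` options.

```python
import shlex
from typing import List


def build_norm_command(input_vcf: str, reference_fasta: str, output_vcf: str) -> List[str]:
    """Build a `bcftools norm` argument list (hypothetical helper; it does not run bcftools)."""
    return [
        "bcftools", "norm",
        "-m", "-any",            # split multiallelic sites into biallelic records
        "-c", "x",               # exclude records whose REF does not match the FASTA
        "-f", reference_fasta,   # reference used for REF checking and left-alignment
        "-O", "z",               # write bgzip-compressed VCF
        "-o", output_vcf,
        input_vcf,
    ]


if __name__ == "__main__":
    # File names are placeholders chosen for illustration only.
    cmd = build_norm_command("input.vcf.gz", "GRCh38.fa", "normalized.vcf.gz")
    print(shlex.join(cmd))
```

Building the command as an argument list keeps it easy to inspect before execution; passing the list to `subprocess.run(cmd, check=True)` would run it on a machine where bcftools is installed.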