TogoVar datasets (GRCh38)
Variant frequency datasets for which you can apply to the NBDC human databases∗2 for use of individual-level data∗1
Click the links in the "Included controlled-access datasets" column to apply for use of individual-level data.
Variant dataset name | Analysis method | Liftover GRCh37 | Target population | Healthy subjects | Affected subjects | Sample size | Number of alleles | Included controlled-access datasets |
---|---|---|---|---|---|---|---|---|
GEM Japan Whole Genome Aggregation (GEM-J WGA) Panel | WGS | ✔ | Japanese | ✔ | | 7,609 | 95,555,740 | 6 datasets |
JGA-SNP | SNP-Chip | ✔ | Japanese | ✔ | ✔ | 183,884 | 1,249,671 | 3 datasets |
JGA-WES | WES | ✔ | Japanese | ✔ | ✔ | 125 | 4,668,805 | 7 datasets |
JGA-WGS | WGS | | Japanese | ✔ | ✔ | 78 | 20,822,413 | 4 datasets |
∗1: FASTQ/BAM/CEL files and/or lists of genotype data, etc.
∗2: Japanese Genotype-phenotype Archive (JGA) / AMED Genome group sharing Database (AGD)
Other variant frequency datasets
Variant dataset name | Analysis method | Liftover GRCh37 | Target population | Healthy subjects | Affected subjects | Sample size | Number of alleles | Author | Version/Last updated |
---|---|---|---|---|---|---|---|---|---|
Genome Aggregation Database (gnomAD) exomes | WES | | Mixed | ✔ | ✔ | 730,947 | 183,717,261 | Broad Institute | v4.1 |
Genome Aggregation Database (gnomAD) genomes | WGS | | Mixed | ✔ | ✔ | 76,215 | 759,336,320 | Broad Institute | v4.1 |
NCBN | WGS | ✔ | Mixed | ✔ | ✔ | JPN:9,290 1KGP:2,504 | 215,729,032 | National Center Biobank Network (NCBN) | 2024/6/28 |
ToMMo 54KJPN Allele Frequency Panel (54KJPN) | WGS | | Japanese | ✔ | | 54,302 | 252,777,334 | Tohoku Medical Megabank Organization | v20230626 |
Note: 54KJPN consists of SNVs (Autosome, chrX(PAR1+PAR2+XTR) and chrMT) and INDELs (Autosome and chrX(PAR1+PAR2+XTR)).
Non-variant frequency datasets
Dataset name | Version/Last updated | Description | Author |
---|---|---|---|
ClinVar | 2024/10/03 | Clinical significance of variants | NCBI |
Colil | Obtained by API | Information on citation relationships in life sciences literature | DBCLS |
GRCh38.p13 | 2019/03/01 | Human genome reference sequence | GRC |
GWAS Catalog | 2024/10/15 | Catalog of human genome-wide association studies | NHGRI-EBI |
HGNC symbol report | 2024/08/27 | Approved human gene nomenclature and associated gene information | HGNC |
LitVar | Obtained by API | Information on papers in which the names of variants appear | NCBI |
MedGen | 2024/10/15 | Information about conditions and phenotypes related to Medical Genetics | NCBI |
MGeND | 2024/03/08 | Clinical significance of variants collected from the Japanese population | NCGM |
PubMed | 2024/10/14 | Information on papers | NCBI |
PubTator Central | 2024/01/04 | Information on papers in which the names of variants appear | NCBI |
Note: TogoVar obtained ClinVar variants from the VCF file that contains only variants for which GRCh38 positions were determined.
Note: Disease names in MGeND were mapped to MedGen using a proprietary method, so they may not match the disease names listed in MGeND.
Tools for data processing
Name | Ver. | Description | Author |
---|---|---|---|
bcftools | 1.20 | Split multiallelic sites into biallelic variants, exclude reference mismatches, and normalize them | Genome Research Ltd. |
BioReT | ‐ | Identify germline short variants (SNPs and Indels) in WES from JGA to produce a joint callset in VCF format | Amelieff |
GATK Best Practices - Germline short variant discovery (SNPs and Indels) | 4 | Identify germline short variants (SNPs and Indels) in WGS from JGA to produce a joint callset in VCF format | Broad Institute |
transanno | 0.45 | Accurate VCF LiftOver tool for converting genome assemblies from GRCh37 to GRCh38 | OKAMURA, Yasunobu |
Variant Effect Predictor (VEP) | Ensembl release 112 | Add annotations such as gene names, consequences, and deleteriousness predictions (AlphaMissense, SIFT, and PolyPhen) to variants | EMBL-EBI |
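
The bcftools row above names three preprocessing steps: splitting multiallelic sites into biallelic variants, excluding reference mismatches, and normalizing. The following is a minimal Python sketch of what those steps mean conceptually. It is not bcftools' implementation and not TogoVar's pipeline. The toy reference `GENOME` and the function names are hypothetical, and normalization is simplified to trimming shared bases, without the left-alignment of indels that a full normalizer would also perform.

```python
from typing import Dict, List, Tuple

# (chromosome, 1-based position, REF allele, ALT allele)
Variant = Tuple[str, int, str, str]

# Hypothetical toy reference, used only for this illustration.
GENOME: Dict[str, str] = {"chr1": "ACGTTTGA"}


def split_multiallelic(chrom: str, pos: int, ref: str, alts: List[str]) -> List[Variant]:
    """Turn one multiallelic site into one biallelic record per ALT allele."""
    return [(chrom, pos, ref, alt) for alt in alts]


def matches_reference(genome: Dict[str, str], variant: Variant) -> bool:
    """Return True if REF equals the reference sequence at the stated position."""
    chrom, pos, ref, _ = variant
    start = pos - 1  # convert 1-based position to 0-based index
    return genome[chrom][start:start + len(ref)].upper() == ref.upper()


def trim(variant: Variant) -> Variant:
    """Trim shared trailing, then shared leading bases; keep at least one base per allele."""
    chrom, pos, ref, alt = variant
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt, pos = ref[1:], alt[1:], pos + 1
    return (chrom, pos, ref, alt)


def normalize_site(genome: Dict[str, str], chrom: str, pos: int,
                   ref: str, alts: List[str]) -> List[Variant]:
    """Split a site, drop records whose REF disagrees with the reference, then trim."""
    records = split_multiallelic(chrom, pos, ref, alts)
    return [trim(v) for v in records if matches_reference(genome, v)]


if __name__ == "__main__":
    # One site with three ALT alleles: a deletion, an insertion, and an SNV.
    for record in normalize_site(GENOME, "chr1", 3, "GTT", ["G", "GTTT", "CTT"]):
        print(record)
```

Splitting first and trimming afterwards lets each ALT allele be reduced independently, so an SNV that was padded to fit alongside an indel at the same site collapses back to a single-base record.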