The OurHealth study of Cardiovascular Disease in South Asians was selected for the PRIMED Supplemental Genotyping Program to undergo Blended Genome-Exome sequencing of 1,000 study participants.
The OurHealth dataset
OurHealth data is now available on the AnVIL platform and includes self-reported basic demographics, anthropometric traits, cardiometabolic outcomes, and Blended Genome-Exome (BGE) sequencing data on 621 study participants.
Accessing OurHealth data
Researchers from the scientific community can apply for controlled access to OurHealth data stored on the AnVIL platform in one of two ways, described briefly below. In both cases, applicants should submit a Data Access Request (DAR) for dbGaP accession number phs003821.
Note: regardless of which route you choose, access is granted to the same data in AnVIL. Only the processes for application, renewal, and approval differ.
- dbGaP:
  - Follow the NIH Scientific Data Sharing instructions for How to Request and Access Datasets from dbGaP.
  - Submit a DAR for phs003821.
- Data Use Oversight System (DUOS):
  - Follow the instructions in the DUOS FAQ: How do I make a data access request in DUOS?
  - Submit a DAR for phs003821.
  - See also the DUOS FAQs and Researcher FAQs.
Working with OurHealth data
AnVIL is the primary repository for OurHealth data. AnVIL provides controlled-access data storage and a cloud-based analysis environment for researchers.
Refer to the Getting Started on AnVIL Book, which covers many topics in detail, including:
- AnVIL Account Setup for Data Analysts
- AnVIL Billing Account and Project Setup for PIs and Lab Managers
- AnVIL documentation for accessing and analyzing AnVIL data in Terra
Details on methods and QC are available on the OurHealth dbGaP study webpage. The dataset on AnVIL includes BGE data in CRAM format for 621 samples, as well as genotypes in build GRCh38 imputed to the combined 1000G+HGDP reference panel using GLIMPSE2, available in multi-sample VCF and PLINK2 files. Survey-based cardiometabolic outcomes, anthropometric traits, and detailed population descriptors are also available.
The OurHealth data aligns with the PRIMED Data Model, which is available on GitHub.
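As a quick sanity check after copying the multi-sample VCF into an analysis environment, one might list the sample IDs in its header and compare the count with the 621 samples described above. The following is a minimal sketch using only the Python standard library; the function name `read_vcf_sample_ids` and the example file name are hypothetical and are not part of the OurHealth release or any AnVIL tooling.

```python
import gzip
from pathlib import Path


def read_vcf_sample_ids(vcf_path):
    """Return the sample IDs listed on the #CHROM header line of a VCF file.

    Hypothetical helper for illustration; works on plain-text VCF and on
    gzip/bgzip-compressed VCF (.gz or .bgz extension).
    """
    path = Path(vcf_path)
    opener = gzip.open if path.suffix in {".gz", ".bgz"} else open
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith("#CHROM"):
                fields = line.rstrip("\n").split("\t")
                # The first nine columns (CHROM through FORMAT) are fixed
                # by the VCF specification; the rest are sample IDs.
                return fields[9:]
            if not line.startswith("#"):
                # Reached data records without seeing a #CHROM line.
                break
    raise ValueError(f"No #CHROM header line found in {vcf_path}")
```

For example, `len(read_vcf_sample_ids("ourhealth_imputed.vcf.gz"))` (file name assumed) would give the number of samples in the file. Because only the header is read, this stays fast even for large files.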
Future data releases
Future releases of OurHealth data will include additional phenotypes and BGE data on additional study participants. These releases will be available through the phs003821 dbGaP study accession.
Acknowledgements and attribution
Funding for BGE sequencing and analyses is provided through the NIH-funded Polygenic Risk Methods Development (PRIMED) Consortium (U01HG011719 and U01HG011697; see the OurHealth Acknowledgements and attribution statement).