Collected from 50,000 UK Biobank participants, the exome sequence data was generated by the Regeneron Genetics Center (RGC) through a collaboration among UK Biobank, Regeneron, and GlaxoSmithKline (GSK). The data is linked to detailed de-identified health records, imaging, and other health-related data.
Compiled data from the first 50,000 participants will now be incorporated back into the UK Biobank resource for the global health research community following a brief research period for Regeneron and GSK.
Sequencing was performed by the RGC, one of the world’s largest human genetics sequencing and research programs. The RGC is currently sequencing at a rate of 500,000 exomes per year.
Regeneron, together with a consortium of biopharma companies including AbbVie, AstraZeneca, Pfizer, and Takeda, will complete the exome sequencing of the remaining 450,000 UK Biobank participants by 2020. The data will be released over the next two years as it is completed.
Aris Baras, senior vice president and head of the RGC, said in a statement, “We believe this is the largest open-access resource of exome sequence data linked to robust health records in the world – and this is just the beginning.”
Baras told us that members of the pharmaceutical industry are using ‘big data’ in multiple sectors of research.
He added that while the volume of data is certainly important, so are “the diversity of these datasets in terms of which people, and their records, are included in the larger dataset and the consistency of this data across the entire dataset.”
The RGC stated that actionable information is a resource that can be used to accelerate science and improve patient care.
Data resource advances
Regeneron and GSK released a preprint of a manuscript describing findings from an examination of the first 50,000 exomes.
GSK is also incorporating daily advances in genetics and genomics into its drug research programs, forming collaborations and working closely with other world-leading organizations.
Tony Wood, senior vice president of Medicinal Science and Technology at GSK, said in a statement, “Genetics is playing an increasingly important role in research, and by generating and now integrating these exome data, UK Biobank has some of the richest health and genetics data available for use by the broader scientific community to enhance their understanding and research effort.”
New disease-related algorithms derived from this data are provided for asthma, kidney disease, dementia, and Parkinson’s disease.
The exome makes up 1-2% of the human genome. It contains the protein-coding genes and is believed to have the most relevance for discovering genetic variants that may inform the discovery and development of improved medicines.