
Genomics England Data Submission Guidance
Learn about submitting data for the 100,000 Genomes Project, data flow processes, common issues like dropouts and data validation rules, and how to address them to ensure successful submission to Mercury. Understand tools like LabKey and BuRST for managing genetic data effectively.
Presentation Transcript
Submitting Data for the 100,000 Genomes Project 25.10.16
Data Flow
[Diagram: the systems that data passes through on submission]
GMC users: data entered in OpenClinica, Genie etc., e.g. XML files (registration etc.)
UKB and Illumina: sample data (sample metadata) as CSV files; sequence
Mercury: holds the data
LabKey: tool for viewing the database (viewing data only)
BuRST: sends success/failure messages on submission (BuRST messages)
Interpretation: targeted data set, interpretation report
21 March 2025
Data Dropouts
[Diagram: the data flow annotated with the points where data can drop out]
1. Data saved locally but not submitted
2. Invalid data, e.g. email address with .ccom at end
3. Invalid data, e.g. group size issues, participants not registered, tumour & germline
4. Missing data, e.g. HPO terms, diagnosis (cancer)
Data Dropouts

1. Data stored locally but not submitted to Mercury
ISSUE: It is possible (especially in OpenClinica) to save data for future updates. This data has not yet been submitted to Mercury.
HOW TO IDENTIFY: Data is visible in your local tool but is not visible in LabKey.
RESOLUTION: Retry the submission of data from the local tool. When data is submitted in OpenClinica, the validation rules specified in the Genomics England Data Model are applied.

2. Invalid data failing Mercury validation rules
ISSUE: A small number of rules in Mercury are not applied in OpenClinica, e.g. phone numbers and email addresses must be in a valid format.
HOW TO IDENTIFY: Your data will be rejected by Mercury, generating a BuRST message which will be emailed to you, provided you are set up to receive alerts.
RESOLUTION: Correct the data in your local system and resubmit.
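Because format rules for phone numbers and email addresses are applied in Mercury but not in OpenClinica, it may help to screen these fields locally before submitting. The following is a minimal sketch only: the patterns, the function names and the list of suspect endings are assumptions for illustration and are not Mercury's actual validation rules.

```python
import re

# Assumed patterns for illustration only; Mercury's actual rules may differ.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[A-Za-z]{2,}$")
PHONE_RE = re.compile(r"^\+?[0-9]{7,15}$")
# Hypothetical list of mistyped endings, seeded with the ".ccom" example.
SUSPECT_ENDINGS = {"ccom"}


def check_email(value):
    """Return a list of problems found in an email address (empty if none)."""
    value = value.strip()
    if not EMAIL_RE.match(value):
        return ["email is not in name@domain.tld form"]
    ending = value.rsplit(".", 1)[1].lower()
    if ending in SUSPECT_ENDINGS:
        return ["email ends in suspicious '." + ending + "'"]
    return []


def check_phone(value):
    """Return a list of problems found in a phone number (empty if none)."""
    # Ignore spaces, hyphens and brackets before testing the digits.
    digits = re.sub(r"[\s\-()]", "", value)
    if not PHONE_RE.match(digits):
        return ["phone number is not 7 to 15 digits"]
    return []


if __name__ == "__main__":
    for email in ["jane.doe@example.com", "jane.doe@example.ccom", "jane.doe"]:
        print(email, check_email(email))
```

A check like this only catches obvious format errors before submission; the BuRST message from Mercury remains the authoritative result.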
Data Dropouts

3. Group size issues
ISSUE: Group size in the data does not match the number of samples submitted under that Family ID.
HOW TO IDENTIFY: BuRST message received rejecting data due to incorrect group size, or group size not provided.
RESOLUTION: Usually either updating the group size to match the family size, or ensuring all samples within the same family have the same Family ID.

4. Participant not registered
ISSUE: Sample provided for a participant not registered in Mercury.
HOW TO IDENTIFY: BuRST message indicates participant not registered.
RESOLUTION: Review data in your local system and attempt to resubmit the registration.

5. Tumour and Germline
ISSUE: Sample file does not contain both tumour and germline, and has not been previously submitted.
HOW TO IDENTIFY: BuRST message indicates missing element.
RESOLUTION: Combine tumour and germline data in one file if available. Withdraw the sample if both elements are not available.

6. Targeted data set not available
ISSUE: Insufficient data available for interpretation. Missing data will generate queries which will be directed to GMCs.
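The group size rule in item 3 can be checked locally before submission by counting samples per Family ID and comparing the count with the declared group size. This is a rough sketch under stated assumptions: the field names family_id and group_size and the function find_group_size_issues are hypothetical and do not come from the Genomics England Data Model.

```python
from collections import defaultdict


def find_group_size_issues(samples):
    """Map each problem Family ID to a short description of the issue.

    `samples` is a list of dicts with the hypothetical keys
    "family_id" and "group_size" (one dict per sample).
    """
    sample_counts = defaultdict(int)
    declared_sizes = defaultdict(set)
    for sample in samples:
        family = sample["family_id"]
        sample_counts[family] += 1
        declared_sizes[family].add(sample.get("group_size"))

    issues = {}
    for family, count in sample_counts.items():
        sizes = declared_sizes[family]
        if None in sizes:
            issues[family] = "group size not provided"
        elif len(sizes) > 1:
            issues[family] = "samples disagree on group size"
        elif count not in sizes:
            size = next(iter(sizes))
            issues[family] = f"group size {size} but {count} samples submitted"
    return issues


if __name__ == "__main__":
    example = [
        {"family_id": "FAM1", "group_size": 2},
        {"family_id": "FAM1", "group_size": 2},
        {"family_id": "FAM2", "group_size": 3},
        {"family_id": "FAM2", "group_size": 3},
        {"family_id": "FAM3"},
    ]
    print(find_group_size_issues(example))
```

A family flagged here needs one of the two resolutions above: correct the group size, or correct the Family ID on the samples that were filed under the wrong family.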