Cancer Data Service (CDS) Data Submission Form – Requiring Revision
Proteomic Data Commons (PDC) Data Submission Form
Integrated Canine Data Commons (ICDC) Data Submission Form
Non-NCI Study Information Form
The following form has been revised to include edited and new questions and to be administered as an online form embedded in the Cancer Research Data Commons (CRDC) Data Submission Portal. The CRDC team aims to make this the standardized form for all data submissions to the CRDC, ideally replacing the Proteomic Data Commons and Integrated Canine Data Commons data submission forms once the CRDC data submission process is available to all.
OMB No.: 0925-7775
Expiration Date: 06/30/2025
Collection of this information is authorized by The Public Health Service Act,
Section 411 (42 USC 285a). Rights of participants are protected by
The Privacy Act of 1974. Participation is voluntary, and there are no
penalties for not participating or withdrawing at any time. Refusal
to participate will not affect your benefits in any way. The
information collected will be kept private to the extent provided by
law. Names and other identifiers will not appear in any report.
Information provided will be combined for all participants and
reported as summaries. You are being contacted by email to complete
this form so that NCI can consider your study for submission into the
Cancer Data Service.
Public reporting burden
for this collection of information is estimated to average 60 minutes
per response, including the time for reviewing instructions,
searching existing data sources, gathering and maintaining the data
needed, and completing and reviewing the collection of information.
An agency may not conduct or sponsor, and a person is not required to
respond to, a collection of information unless it displays a
currently valid OMB control number. Send comments regarding this
burden estimate or any other aspect of this collection of
information, including suggestions for reducing this burden to: NIH,
Project Clearance Branch, 6705 Rockledge Drive, MSC 7974, Bethesda,
MD 20892-7974, ATTN: PRA (0925-7775). Do not return the completed
form to this address.
The following sets of high-level questions are intended to give CDS insight into the data storage, access, and secondary sharing needs and requirements of data submitters. Submitters are asked to answer as many questions as they can; answering every question is not required.
Data Characteristics
What are the principal types of data the program will be submitting (e.g., genomic, clinical, imaging)?
Will there be additional data types associated with the principal data types that are not being submitted to CDS? For example, proteomics or imaging.
Do you anticipate submitting other data types to CDS in the future? For example, a data type that does not fit the submission criteria of any of the present CRDC nodes.
Is the data from humans?
NOTE: CDS accepts only human data at this point.
What additional associated data will you be providing? For example, clinical/phenomics data from study subjects (participants) and/or any other study-associated metadata or searchable variables. Describe the format for each.
NOTE: CDS at this point will accept all metadata submitted.
What is the total number of samples and cases being submitted per study?
For genomics datasets, CDS accepts BAM files but prefers CRAM files. Would you be able to provide CRAM files instead of BAM files?
Data Storage and Management
Who is the PI on the study?
How much data are you planning to submit to CDS?
By data type (if known)?
What is the reason you are looking for storage with CDS? What are your challenges related to the storage of data?
Do you have a preference for AWS versus Google Cloud for storage? CDS currently provides AWS storage and plans to provide Google storage in the near future.
Data Submission
Who will submit the data, the PI (or the PI’s team) or a collaborator?
Will there be multiple uploaders (for example, by data type or working group)?
Is there a program timeline associated with the data submission?
When do you plan to start submitting data to CDS?
Who is the primary point of contact for data submission?
Do you plan one or multiple submissions to CDS? For example, multiple studies or newer versions of the data for the same study.
If yes, do you have a timeline for the successive submissions?
If this submission contains data from a newer version of a study already submitted to CDS, do you want to retain data from the older version(s) at CDS?
Do you have an Amazon/Google account for data submission? CDS submissions presently require that the data uploaders have an Amazon account.
Data Sharing
Is your data being released to the broader research community for secondary sharing?
When is the data planned to be released for secondary sharing?
Is your data sensitive, i.e., does it require controlled access?
Has the data been registered with any public sharing repository such as dbGaP?
If not, is there a reason?
If yes, please share the associated study ID, for example, the dbGaP PHS number.
Is the study RELEASED by dbGaP?
Is the data currently shared through NCBI/dbGaP or other means? What is a plausible timeline?
Has the data already been submitted to any data repository? For example, SRA.
CDS does not allow downloads. Given this, does CDS meet your data sharing needs?
Are there any data access limitations?
Is the data embargoed? If yes, would the data reside in CDS during that time? How would it affect user access?
Is any part of this data “open-access”? For example, VCFs from genomics studies.
How can you assure that the data does not contain PII, PHI, and/or identifiable data elements?
Data Users/Access
How do users access the data presently?
How well are these methods working today?
Will your data be made accessible through any other repository?
Data Analysis
For what purpose(s) do the approved users access the data?
Conduct analyses / computations?
Cite in a publication?
Do they need to link this data to other data types in other repositories/CRDC nodes for analysis?
Data Post CDS Destination
Do you know the post-CDS destination for this data? For example, other CRDC nodes such as GDC, PDC, or IDC.
Is there any plan to move data out of CDS buckets before sharing it publicly?
Is there any other information you would like to share about your data?
OMB No.: 0925-7775
Expiration Date: 06/30/2025
Collection of this information is authorized by The Public Health Service Act,
Section 411 (42 USC 285a). Rights of participants are protected by
The Privacy Act of 1974. Participation is voluntary, and there are no
penalties for not participating or withdrawing at any time. Refusal
to participate will not affect your benefits in any way. The
information collected will be kept private to the extent provided by
law. Names and other identifiers will not appear in any report.
Information provided will be combined for all participants and
reported as summaries. You are being contacted online to complete
this form so that NCI can consider your study for submission into the
Proteomic Data Commons.
Public reporting
burden for this collection of information is estimated to average 30
minutes per response, including the time for reviewing instructions,
searching existing data sources, gathering and maintaining the data
needed, and completing and reviewing the collection of information.
An agency may not conduct or sponsor, and a person is not required to
respond to, a collection of information unless it displays a
currently valid OMB control number. Send comments regarding this
burden estimate or any other aspect of this collection of
information, including suggestions for reducing this burden to: NIH,
Project Clearance Branch, 6705 Rockledge Drive, MSC 7974, Bethesda,
MD 20892-7974, ATTN: PRA (0925-7775). Do not return the completed
form to this address.
Please complete the following document and send to: PDCHelpDesk@mail.nih.gov.
Please include a narrative describing your study and its scientific benefit for inclusion in the Proteomic Data Commons (PDC).
Please include the following information:
Name/Identifier of Study with a brief description
Grant ID and funding source (if applicable)
IRB approval numbers (if applicable)
Scientific Point of Contact (Name, Phone, Email)
Data Manager Point of Contact (Name, Phone, Email)
Data access policy (choose one):
Open-access – no-embargo
Open-access – embargo
Cancer type(s) included in study
Number of cases included in study (please indicate if demographic and diagnosis data are available)
Information on the Proteomic Data Analysis Protocol
Type of acquisition – DDA, DIA
Experiment type – Label Free, iTRAQ, TMT, etc.
Analytical fractions – Proteome, phosphoproteome, etc.
Instrument make and model
Additional proteomic data analysis protocol including experimental design
Additional data types included in study and experimental strategies used (list all that apply and indicate target repository for additional data types such as the National Cancer Institute’s Genomic Data Commons):
Imaging
Genomics
Immunology
Clinical
Other (specify)
Amount of data (in TB, # of files)
Include description of treatment, relapse/recurrence, and/or outcome data available with this dataset (if applicable)
The overall scientific benefit of including this study in the PDC
Publications associated with this study, if any.
Time constraints on processing/loading/releasing the data to the public
Data standards used, if any.
Please attach (if available):
Data Dictionary
Biospecimen and experiment metadata
Data Model/Schema diagram indicating how collected data relates to subjects, visits, samples, etc.
OMB No.: 0925-7775
Expiration Date: 06/30/2025
Collection of this information is authorized by The Public Health Service Act,
Section 411 (42 USC 285a). Rights of participants are protected by
The Privacy Act of 1974. Participation is voluntary, and there are no
penalties for not participating or withdrawing at any time. Refusal
to participate will not affect your benefits in any way. The
information collected will be kept private to the extent provided by
law. Names and other identifiers will not appear in any report.
Information provided will be combined for all participants and
reported as summaries. You are being contacted online to complete
this form so that NCI can consider your study for submission into the
Integrated Canine Data Commons.
Public
reporting burden for this collection of information is estimated to
average 30 minutes per response, including the time for reviewing
instructions, searching existing data sources, gathering and
maintaining the data needed, and completing and reviewing the
collection of information. An agency may not conduct or sponsor, and
a person is not required to respond to, a collection of information
unless it displays a currently valid OMB control number. Send
comments regarding this burden estimate or any other aspect of this
collection of information, including suggestions for reducing this
burden to: NIH, Project Clearance Branch, 6705 Rockledge Drive, MSC
7974, Bethesda, MD 20892-7974, ATTN: PRA (0925-7775). Do not return
the completed form to this address.
Please complete the following document and send to: icdchelpdesk@mail.nih.gov.
Please include a narrative describing your study and its scientific benefit for inclusion in the ICDC.
Please include the following information along with the narrative:
Name/Identifier of Study
Grant ID and funding source (if applicable)
IACUC/IRB approval numbers (if applicable)
Scientific Point of Contact (Name, Phone, Email)
Data Manager Point of Contact (Name, Phone, Email)
Data access policy (choose one): Open-access – no-embargo, Controlled-access – no-embargo, Open-access – embargo, Controlled-access – embargo
Cancer type(s) included in study
Number of subjects included in study
Sample source (e.g., CCOGC, other biospecimen repository, self-collected). If other than self-collected, those identifiers will be required during submission.
If self-collected, was a replicate sample also submitted to another biospecimen repository (e.g., CCOGC)? If so, those identifiers will be required during submission.
Data types included in study (check all that apply): Imaging, genomics, proteomics, immunology, clinical, other (specify)
Amount of data (in TB)
The overall scientific benefit of including this study in the ICDC prototype
Publications associated with this study, if any
Time constraints on processing/loading/releasing the data
Data standards used, if any (e.g., SEND)
Anticipated budget needed to prepare data set for submission
Please attach (if available):
1. Data Dictionary specific to study
2. Data Model/Schema diagram indicating how collected data relates to subjects, visits, samples, etc.
Author | Mencarelli, Anna (NIH/NCI) [C] |
File Created | 2025-09-19 |