Overview
This guide explains how to document and organize your research data using metadata, the descriptive information that helps others find, understand, and reuse your work.
The process typically takes 1-2 weeks, depending on:
- Amount of data to document
- Complexity of your experimental setup
- Number of templates needed
- Any validation issues that need addressing
The process involves collaboration between you (the contributor) and MC² Center staff so your data is:
- Well-documented with standardized descriptions
- Properly organized for easy access
- Linked together in meaningful ways
- Ready for others to discover and reuse
This approach follows FAIR principles, making your data:
- Findable: Others can discover your data
- Accessible: Clear access requirements
- Interoperable: Uses standard formats
- Reusable: Well-documented for reuse
CRITICAL: Working with Templates
Throughout this process, you'll work with metadata templates provided as Google Sheets. These templates help capture important information about your data in a standardized format. To ensure your metadata can be processed correctly:
ONLY record metadata in templates linked in your Synapse Project
Why This Process Matters
Each step in this process serves an important purpose:
- Folder Organization
  - Makes data easy to find
  - Supports automated processing
- Metadata Templates
  - Capture essential information
  - Enable data discovery
- Submission Order
  - Maintains data relationships
  - Prevents missing links
  - Supports validation
- Component IDs
  - Create clear relationships
  - Support data tracking
  - Allow future updates
Understanding Metadata Types
We use two primary types of metadata:
- Record-based Metadata
  - Describes things like:
    - Study information - Patient demographics
    - Human participants or model systems - Cell line information
    - Biospecimens - Sample processing methods
    - Experimental details - Imaging parameters
- File-based Metadata
  - Describes the actual data files such as:
    - FASTQ files from sequencing
    - Microscopy images
    - Analysis results
    - Supporting documentation
These two metadata types work together to tell the complete story of your research:
Study
├── Participants/Models ─┐
├── Biospecimens ────────┼──> Data Files
└── Experimental Setup ──┘
Required Access
For Contributors:
- Access to your grant-specific Synapse project
- Access to metadata templates (provided by MC² Center)
For MC² Center Staff:
- Administrative access to Synapse
- Access to schematic CLI tools
- Access to validation scripts
Instructions
- Understand Your Data Organization
🔷 Contributor Role: Review how your Synapse project is structured.
Content should be organized in a standard folder structure:
Project/
├── data/            # Your research data files
├── studies/         # Study information
├── biospecimens/    # Sample metadata
├── models/          # Model system and cell line metadata
├── individuals/     # Human patient metadata
├── sharing_plans/   # Data sharing info
├── governance/      # Governance documentation
├── publications/    # Metadata for publications
├── datasets/        # Metadata for released datasets
├── tools/           # Metadata for released tools
└── education/       # Metadata for released educational resources
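As a quick self-check, the folder names present in a project can be compared against the standard structure above. The sketch below is only an illustration: it assumes you have already collected the top-level folder names by some means (for example, by reading them from the Synapse web interface), and `missing_folders` is a hypothetical helper, not part of any MC² Center or Synapse tooling.

```python
from typing import Iterable, List

# Top-level folders taken from the standard structure shown above.
EXPECTED_FOLDERS = [
    "data", "studies", "biospecimens", "models", "individuals",
    "sharing_plans", "governance", "publications", "datasets",
    "tools", "education",
]


def missing_folders(present: Iterable[str]) -> List[str]:
    """Return the expected folder names that are absent, in documented order."""
    # Tolerate surrounding whitespace and a trailing slash such as "data/".
    present_names = {name.strip().rstrip("/") for name in present}
    return [name for name in EXPECTED_FOLDERS if name not in present_names]


# Example: a project that so far only has three of the folders.
print(missing_folders(["data/", "studies", "biospecimens"]))
```

An empty list means every folder in the standard structure was found.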
- Access Your Templates
🔷 Contributor Role: Get your metadata templates.
To access your available metadata templates, navigate to the relevant folder in your Synapse project and select the linked sheet.
Examples:
- To access the Biospecimen template, open the biospecimens/ folder
- To access the File View template, open the data/ folder
The linked template will be named according to the format: [grant number]_[data type]_[version]
Example: CA123456_Biospecimen_v10.0.0
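The naming format can be split back into its parts when you need to confirm which template and version you have open. This is a minimal sketch, not an official parser: it assumes the grant number contains no underscores and that the version is a `v` followed by dot-separated digits, which is inferred only from the single example above. `parse_template_name` is a hypothetical helper.

```python
import re

# [grant number]_[data type]_[version], with the version assumed to look like v10.0.0.
TEMPLATE_NAME = re.compile(
    r"^(?P<grant>[^_]+)_(?P<data_type>.+)_v(?P<version>\d+(?:\.\d+)*)$"
)


def parse_template_name(name: str) -> dict:
    """Split a template name into grant number, data type, and version."""
    match = TEMPLATE_NAME.match(name)
    if match is None:
        raise ValueError(f"Unexpected template name: {name!r}")
    return match.groupdict()


print(parse_template_name("CA123456_Biospecimen_v10.0.0"))
```

A name that does not follow the format raises `ValueError`, which is a cue to double-check that the sheet was opened from your Synapse project.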
Your Data Sharing Plan documents which metadata templates to complete for your datasets.
- For metadata types that apply to more than one dataset (e.g., Biospecimen, File View, Individual, Model), additional rows will be added to your Data Sharing Plan
- For all Data Sharing Plan entries, the applicable metadata templates will be linked in column Y, “DSP Dataset Metadata”
- Follow the Completion Order
🔷 Contributor Role: Complete templates in this order:
- Model/Individual information (if applicable)
  - Describes your experimental system
  - Must come before Biospecimen data
- Biospecimen information (if applicable)
  - Links samples to models/individuals
  - Must come before file metadata
- File View metadata
  - Describes your actual data files
  - Links files to samples and study
- Assay-specific metadata
  - Additional details about specific methods
  - Example: Imaging or sequencing parameters
  - Please contact the MC² Center for guidance on preparing assay-specific metadata
- Resource metadata (can be submitted independently of metadata listed above)
  - Information about publications, datasets, computational tools, and educational resources associated with your grant
Why this order matters:
- Assay-specific metadata typically includes “Key”-type attributes that link files and information.
- Preparing and submitting metadata in this order helps ensure that records can be linked correctly.
Study ID
├── Model/Individual ID
│   └── Biospecimen ID
│       └── File ID
└── Dataset ID
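The completion order can be expressed as a simple ranking, which may help when you have several templates open and want to know which to fill out first. This is a sketch under stated assumptions: the template names are illustrative, the resource template names are hypothetical, and any name not listed is treated as assay-specific. Resource metadata can be submitted independently, so placing it last here is an arbitrary choice.

```python
from typing import List

# Rank of each template family in the completion order described above.
EARLY_RANKS = {"Model": 0, "Individual": 0, "Biospecimen": 1, "File View": 2}
ASSAY_RANK = 3      # Assay-specific templates come after File View.
RESOURCE_RANK = 4   # Independent of the others; sorted last for convenience.

# Hypothetical names for resource templates.
RESOURCE_TEMPLATES = {"Publication", "Dataset", "Tool", "Educational Resource"}


def completion_rank(template: str) -> int:
    """Map a template name to its position in the completion order."""
    if template in EARLY_RANKS:
        return EARLY_RANKS[template]
    if template in RESOURCE_TEMPLATES:
        return RESOURCE_RANK
    return ASSAY_RANK


def sort_for_completion(templates: List[str]) -> List[str]:
    """Return templates in the order they should be completed (stable sort)."""
    return sorted(templates, key=completion_rank)


print(sort_for_completion(["File View", "Imaging Channel", "Biospecimen", "Individual"]))
```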
- Record Your Metadata
🔷 Contributor Role: Fill out your templates.
For each template:
- Look for field descriptions in column headers
  - Hover over column names for detailed descriptions
  - Required fields are highlighted in blue
  - Optional fields provide additional context
- Check “Sheet 2” for valid values:
  - Click the “Sheet 2” tab at the bottom (unhide it if needed)
  - Find your column of interest
  - Use exact values from the “Valid Values” column
- Separate multiple values with commas
  Example: "RNA-seq, ATAC-seq, ChIP-seq"
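Before submitting, a cell holding a comma-separated list can be checked against the valid values for its column. The sketch below assumes you have copied the relevant "Valid Values" entries from Sheet 2 into a Python set; `split_values` and `invalid_values` are hypothetical helpers, and the comparison is deliberately exact (case-sensitive) because the templates ask for exact values.

```python
from typing import List, Set


def split_values(cell: str) -> List[str]:
    """Split a comma-separated cell into trimmed, non-empty values."""
    return [value.strip() for value in cell.split(",") if value.strip()]


def invalid_values(cell: str, valid: Set[str]) -> List[str]:
    """Return the entries in a cell that are not exact matches for a valid value."""
    return [value for value in split_values(cell) if value not in valid]


# Valid values here are made up for illustration; use the ones from Sheet 2.
valid_assays = {"RNA-seq", "ATAC-seq", "ChIP-seq"}
print(split_values("RNA-seq, ATAC-seq, ChIP-seq"))
print(invalid_values("RNA-seq, rna-seq", valid_assays))
```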
Common information sources:
- Data files themselves
- Quality control reports
- Lab notebooks
- Protocol documents
- Analysis outputs
- Publications
Upload any reference documents you used to the documentation folder
- Understanding Component IDs
🔷 Contributor Role: Each entry needs a unique ID.
IDs follow these patterns:
- Study: [Grant number]-[Journal/Type]-[Date]
- Model: [Grant number]-M[Number]
- Individual: [Grant number]-IND[Number]
- Biospecimen: [Parent ID]-B[Number]
Example flow:
Study: GRANT123-CELL-2024
└── Model: GRANT123-M1
    └── Biospecimen: GRANT123-M1-B1
        └── File: syn789012 (Synapse ID)
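The ID patterns above lend themselves to a quick format check. The regular expressions below are assumptions drawn only from the examples in this guide: grant numbers are taken to be alphanumeric with no hyphens, and the study date is taken to be a four-digit year. `classify_id` and `biospecimen_parent` are hypothetical helpers and do not reproduce the MC² Center's validation.

```python
import re
from typing import Optional

GRANT = r"[A-Za-z0-9]+"  # Assumed grant number shape (no hyphens).

# Checked in this order; Biospecimen first because it embeds a parent ID.
PATTERNS = {
    "Biospecimen": re.compile(rf"^(?P<parent>{GRANT}-(?:M|IND)\d+)-B\d+$"),
    "Model": re.compile(rf"^{GRANT}-M\d+$"),
    "Individual": re.compile(rf"^{GRANT}-IND\d+$"),
    "Study": re.compile(rf"^{GRANT}-[A-Za-z0-9]+-\d{{4}}$"),
}


def classify_id(component_id: str) -> Optional[str]:
    """Return the component type an ID matches, or None if it matches none."""
    for kind, pattern in PATTERNS.items():
        if pattern.match(component_id):
            return kind
    return None


def biospecimen_parent(biospecimen_id: str) -> Optional[str]:
    """Return the Model or Individual ID embedded in a Biospecimen ID."""
    match = PATTERNS["Biospecimen"].match(biospecimen_id)
    return match.group("parent") if match else None


for example in ["GRANT123-CELL-2024", "GRANT123-M1", "GRANT123-M1-B1"]:
    print(example, "->", classify_id(example))
```

File IDs are Synapse IDs (such as syn789012) and are intentionally not covered by these patterns.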
- Submit for Validation
🔷 Contributor Role: Let the MC² Center know when you're done.
If you are a contributor, notify the MC² Center and STOP here.
🔶 MC² Center Role: We will:
- Download your completed templates
- Run validation checks to ensure:
  - All required fields are complete
  - Values match expected formats
  - IDs and keys are properly linked
  - Relationships between records are valid
- Provide feedback if updates are needed
- Upload validated metadata to Synapse
If validation fails:
- We'll provide detailed feedback about:
  - Which fields need attention
  - What the specific issues are
  - How to correct the problems
- You can then:
  - Make the requested updates
  - Ask questions if anything is unclear
  - Resubmit for validation
Common validation issues to watch for:
- Missing required fields
- Incorrect date formats
- Invalid ID patterns
- Missing relationships between records
- Incorrect terms or spellings
- Values not from approved list
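Two of the issues above, missing required fields and missing relationships between records, are easy to pre-check yourself before submitting. The following is a rough sketch and not the MC² Center's validation: it assumes template rows have been exported to a list of dictionaries, and the column names (`biospecimen_id`, `parent_id`, `tissue`) are hypothetical.

```python
from typing import Dict, List, Set


def find_issues(rows: List[Dict[str, str]], parent_ids: Set[str],
                required: List[str]) -> List[str]:
    """List missing required fields and parent IDs that match no known record."""
    issues = []
    for row_number, row in enumerate(rows, start=1):
        for field in required:
            # Treat absent, None, and whitespace-only values as missing.
            if not str(row.get(field) or "").strip():
                issues.append(f"row {row_number}: missing required field '{field}'")
        parent = str(row.get("parent_id") or "").strip()
        if parent and parent not in parent_ids:
            issues.append(f"row {row_number}: parent '{parent}' not found")
    return issues


biospecimens = [
    {"biospecimen_id": "GRANT123-M1-B1", "parent_id": "GRANT123-M1", "tissue": "lung"},
    {"biospecimen_id": "GRANT123-M2-B1", "parent_id": "GRANT123-M2", "tissue": ""},
]
for issue in find_issues(biospecimens, {"GRANT123-M1"},
                         ["biospecimen_id", "parent_id", "tissue"]):
    print(issue)
```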
- After Validation
🔶 MC² Center Role: Once validation is successful, we:
- Convert metadata to proper format
- Upload to Synapse project
- Apply metadata to relevant files
- Update portal database
- Prepare for eventual release
This process ensures your data is:
- Properly documented
- Correctly linked
- Ready for discovery
- Prepared for sharing
Example Workflows
Example 1: Imaging Dataset
A researcher wants to share microscopy data with some participant information:
- Complete templates in order:
  - Individual template (participant info)
  - Biospecimen template (sample info)
  - File View template (file info)
  - Imaging Channel template
  - Imaging Level 2 template
- Link everything together:
Study (GRANT123-IMG-2024)
├── Individual (GRANT123-IND1)
│   └── Biospecimen (GRANT123-IND1-B1)
│       └── Image Files (syn789012)
└── Dataset (syn456789)
Example 2: GeoMx Dataset
A researcher wants to share spatial genomics data:
- Upload supporting files first:
  - Experimental config file
  - Probe config file
  - Lab worksheet
  - ROI coordinate files, if applicable
- Complete templates in order:
  - Individual template (participant info)
  - Biospecimen template (sample info)
  - File View template
  - Imaging Channel template
  - ROI/segment template
  - GeoMx Auxiliary files template
  - GeoMx Level 1 template
  - GeoMx Level 2 template
  - GeoMx Level 3 template
  - GeoMx Imaging template
- Example organization:
Project
├── biospecimens
│   └── Biospecimen metadata template
├── individuals
│   └── Individual metadata template
├── imaging_channel
│   └── Imaging channel metadata template
└── [study_id]/
    ├── File View metadata template
    ├── Auxiliary files/
    │   └── Auxiliary files metadata template
    ├── ROI Data/
    │   └── ROI metadata template
    ├── GeoMx Level 1/
    │   └── GeoMx Level 1 metadata template
    ├── GeoMx Level 2/
    │   └── GeoMx Level 2 metadata template
    ├── GeoMx Level 3/
    │   └── GeoMx Level 3 metadata template
    └── GeoMx Imaging/
        └── GeoMx Imaging metadata template
Validation Process Timeline
The validation process typically takes:
- Initial review: 1-2 business days
- Each revision cycle: 1-2 business days
- Final validation: 1-2 business days
Factors that can affect timing:
- Number of templates to validate
- Complexity of relationships
- Number of validation issues
- Response time for revisions
Example Templates
Resource metadata example templates:
Need Help?
We're here to support you through this process. Don't hesitate to Contact Us if you have questions or need guidance at any step.
Additional Resources