What Does "Good" Look Like In RWE Research?
Establishing the Fundamentals for Data Integrity and Governance
A standardised research framework that incorporates current best practices for documenting the assessment of real-world data quality and its fitness for purpose in regulatory and HTA decision-making. The recommendations are organised into three key elements (data extraction, data curation and data characterisation), with the prioritisation and rationale given for each level of recommendation.
This page lists all subrecommendations linked to the overarching recommendation 2, "Establishing the Fundamentals for Data Integrity and Governance".
Subrecommendation 2.1: Data Characterisation
Systematic and standardised approaches to evaluating and documenting the properties, quality, and limitations of data sources will enable assessment of how well the data represent the target population, period, and outcomes of interest, while identifying potential biases and constraints.
Rationale
Data sources for RWE can be heterogeneous and may not always be fit for purpose. Adequate data characterisation enables evaluation of how well the dataset represents the target population, the time period and any outcomes of interest or relevance. Systematic and structured reporting of the dataset's strengths, limitations and potential biases enhances its transparency and reliability.
Details
The study development should include, as a minimum:
Essential
|
Important
- Representative Population Analysis
- Document population characteristics, including:
- Demographic comparison to target population.
- Selection factors influencing data collection.
- Coverage of relevant subpopulations.
- Generalisability assessment.
- Variable Validity Assessment
- For key study variables, document:
- Comparison to gold standard measures (if applicable).
- Validation studies (if available).
- Known measurement errors or biases.
- Construct validity assessment (e.g. observed vs expected distributions).
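As an illustration of the comparison to a gold standard measure, the sketch below computes sensitivity, specificity and predictive values for a binary study variable. It is a minimal example under stated assumptions: the function name `validate_against_reference`, the `ValidationMetrics` class and the sample data are hypothetical and do not come from the report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidationMetrics:
    sensitivity: float
    specificity: float
    ppv: float
    npv: float


def validate_against_reference(recorded, reference):
    """Compare a binary study variable with a reference ("gold standard") measure.

    Both arguments are equal-length sequences of booleans, one entry per patient.
    """
    if len(recorded) != len(reference):
        raise ValueError("recorded and reference must have the same length")

    pairs = list(zip(recorded, reference))
    tp = sum(1 for r, g in pairs if r and g)
    fp = sum(1 for r, g in pairs if r and not g)
    fn = sum(1 for r, g in pairs if not r and g)
    tn = sum(1 for r, g in pairs if not r and not g)

    def ratio(num, den):
        # Return NaN rather than raising when a denominator is empty.
        return num / den if den else float("nan")

    return ValidationMetrics(
        sensitivity=ratio(tp, tp + fn),
        specificity=ratio(tn, tn + fp),
        ppv=ratio(tp, tp + fp),
        npv=ratio(tn, tn + fn),
    )


# Hypothetical example: a diagnosis flag in the data source vs chart review.
recorded = [True, True, False, False, True, False, False, False]
reference = [True, False, False, False, True, True, False, False]
metrics = validate_against_reference(recorded, reference)
print(metrics)
```

Reporting these figures alongside the validation sample size gives reviewers a direct view of known measurement error in each key variable.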
Optional
- Data Lineage Visualisation
- Provide visual representation of:
- Data flows from source to analysis.
- Transformation and linkage processes.
- Quality check points.
- Decision points in data processing.
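One possible way to produce such a lineage diagram is to describe each processing step as an edge and emit Graphviz DOT text. The sketch below assumes a hypothetical pipeline; the function `lineage_to_dot` and all node and edge names are illustrative only.

```python
def lineage_to_dot(steps):
    """Render (source, target, label) triples as a Graphviz DOT digraph string."""
    lines = ["digraph lineage {", "  rankdir=LR;"]
    for source, target, label in steps:
        lines.append(f'  "{source}" -> "{target}" [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical pipeline; node and edge names are illustrative only.
steps = [
    ("Source EHR", "Extracted cohort", "extraction"),
    ("Registry", "Extracted cohort", "linkage"),
    ("Extracted cohort", "Curated dataset", "quality checks"),
    ("Curated dataset", "Analysis dataset", "exclusion decisions"),
]
dot = lineage_to_dot(steps)
print(dot)
```

If the output is saved to a file such as `lineage.dot`, the Graphviz command `dot -Tsvg lineage.dot -o lineage.svg` renders it as an image.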
Subrecommendation 2.2: Data Extraction
Transparent and standardised approaches to retrieving relevant information and preparing data for analysis will demonstrate alignment with study objectives, reduce risks of bias, and ensure consistency and reliability.
Rationale
In RWE studies, transparent and well-documented extraction is essential to show how the data align with study objectives, ensure reproducibility, and minimise risks of bias. Inconsistent documentation of data extraction practices can undermine the credibility of findings and limit their use in decision-making.
Details
The study development should include, as a minimum:
Essential
|
Important
- Implement and document Quality Controls in the study documents, including:
- Validation procedures for extraction tools.
- Procedures for handling missing or inconsistent data.
- Audit trails of extraction processes.
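An audit trail of extraction runs could, for example, be kept as a hash-chained log, so that any later change to an entry is detectable. The following is a sketch under stated assumptions: the field names, the functions `audit_record` and `verify_chain`, the query text and the version strings are all hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


def audit_record(query_text, row_count, tool_version, previous_hash=""):
    """Build one audit-trail entry for an extraction run.

    Each entry stores the hash of the previous entry, so later edits to the
    log can be detected by recomputing the chain.
    """
    entry = {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "query_sha256": hashlib.sha256(query_text.encode("utf-8")).hexdigest(),
        "row_count": row_count,
        "tool_version": tool_version,
        "previous_hash": previous_hash,
    }
    payload = json.dumps(entry, sort_keys=True).encode("utf-8")
    entry["entry_hash"] = hashlib.sha256(payload).hexdigest()
    return entry


def verify_chain(entries):
    """Return True if every entry hash and every link to its predecessor is intact."""
    previous_hash = ""
    for entry in entries:
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        if entry["previous_hash"] != previous_hash:
            return False
        if hashlib.sha256(payload).hexdigest() != entry["entry_hash"]:
            return False
        previous_hash = entry["entry_hash"]
    return True


# Hypothetical extraction runs; the query text and version strings are made up.
first = audit_record("SELECT * FROM visits WHERE year = 2023", 1200, "extractor 1.0")
second = audit_record(
    "SELECT * FROM visits WHERE year >= 2022",
    2450,
    "extractor 1.1",
    previous_hash=first["entry_hash"],
)
log = [first, second]
print(verify_chain(log))
```

Storing the hash of the query rather than only its description ties each log entry to the exact extraction logic that was run.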
Optional
- Alternative Extraction Strategies, including:
- Document consideration of alternative extraction approaches.
- Justify selection of final extraction methodology.
- Report sensitivity analyses using alternative extraction methods.
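A first step in a sensitivity analysis across extraction methods might be to quantify how far the resulting cohorts overlap. The sketch below is illustrative only: `compare_extractions` and the patient identifiers are hypothetical, and the Jaccard index is one of several overlap measures that could be reported.

```python
def compare_extractions(primary_ids, alternative_ids):
    """Summarise how two extraction strategies differ in the patients they select."""
    primary, alternative = set(primary_ids), set(alternative_ids)
    union = primary | alternative
    return {
        "n_primary": len(primary),
        "n_alternative": len(alternative),
        "only_primary": len(primary - alternative),
        "only_alternative": len(alternative - primary),
        # Jaccard index: shared patients divided by all patients in either cohort.
        "jaccard": len(primary & alternative) / len(union) if union else 1.0,
    }


# Hypothetical patient identifiers returned by two extraction definitions,
# e.g. diagnosis codes only vs diagnosis codes plus prescriptions.
primary_ids = [101, 102, 103, 104, 105]
alternative_ids = [103, 104, 105, 106]
summary = compare_extractions(primary_ids, alternative_ids)
print(summary)
```

Where overlap is low, re-running the main analysis on each cohort shows whether the choice of extraction method changes the study conclusions.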
Subrecommendation 2.3: Data Curation
Cleaning, standardising, and organising data to meet quality standards for analysis will strengthen the validity and reproducibility of results.
Rationale
Data curation steps are essential for maintaining data integrity and ensuring that study results are reproducible and reliable. Without systematic curation practices, issues such as missing values, coding inconsistencies, or undocumented transformations can compromise the validity of findings.
Details
The study development should include, as a minimum:
Essential
|
Important
- Data Quality Metric Reporting
- Implement and report standardised quality metrics:
- Completeness (% of missing values by variable).
- Consistency (internal validation checks passed).
- Plausibility (% of values within expected ranges).
- Timeliness (lag between data collection and availability).
- Issue Resolution Documentation
- Maintain structured documentation of:
- Data quality issues identified.
- Methods used to address issues.
- Impact assessment of data quality problems.
- Unresolved limitations remaining after curation.
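The four quality metrics listed above (completeness, consistency, plausibility and timeliness) can be computed with very little code. The sketch below uses hypothetical records: the field names, the expected range for systolic blood pressure (`sbp`) and the single consistency rule are assumptions chosen for illustration, not values taken from the report.

```python
from datetime import date
from statistics import median

# Hypothetical records; field names, ranges and dates are illustrative only.
records = [
    {"age": 54, "sbp": 128, "admitted": date(2023, 1, 3), "discharged": date(2023, 1, 9),
     "collected": date(2023, 1, 9), "available": date(2023, 2, 8)},
    {"age": None, "sbp": 410, "admitted": date(2023, 2, 1), "discharged": date(2023, 1, 30),
     "collected": date(2023, 2, 1), "available": date(2023, 2, 15)},
    {"age": 71, "sbp": None, "admitted": date(2023, 3, 5), "discharged": date(2023, 3, 6),
     "collected": date(2023, 3, 6), "available": date(2023, 3, 16)},
    {"age": 38, "sbp": 117, "admitted": date(2023, 4, 2), "discharged": date(2023, 4, 2),
     "collected": date(2023, 4, 2), "available": date(2023, 4, 8)},
]


def missing_pct(rows, variable):
    """Completeness: percentage of rows with a missing value for the variable."""
    missing = sum(1 for row in rows if row[variable] is None)
    return 100.0 * missing / len(rows)


def in_range_pct(rows, variable, low, high):
    """Plausibility: percentage of non-missing values inside [low, high]."""
    values = [row[variable] for row in rows if row[variable] is not None]
    if not values:
        return float("nan")
    return 100.0 * sum(1 for v in values if low <= v <= high) / len(values)


def consistency_pct(rows):
    """Consistency: percentage of rows where discharge is not before admission."""
    passed = sum(1 for row in rows if row["discharged"] >= row["admitted"])
    return 100.0 * passed / len(rows)


def median_lag_days(rows):
    """Timeliness: median days between data collection and availability."""
    return median((row["available"] - row["collected"]).days for row in rows)


report = {
    "completeness_missing_pct": {v: missing_pct(records, v) for v in ("age", "sbp")},
    "consistency_pass_pct": consistency_pct(records),
    "plausibility_sbp_pct": in_range_pct(records, "sbp", 60, 260),
    "timeliness_median_lag_days": median_lag_days(records),
}
print(report)
```

In practice each metric would be reported per variable and per data source, with the expected ranges and validation rules documented alongside the results.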
Optional
These elements represent best practices that further enhance data curation transparency and trust. They may be useful for complex studies, novel data sources, or submissions where curation methods may significantly impact results.
- Curation Code Transparency
- Share with decision-makers the curation code or detailed workflows.
- Document software and versions used.
- Provide change logs for iterative curation processes.
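Documenting software and versions can be automated at the start of a curation run. The sketch below records the Python version, the operating system and the installed version of each named package; the function `environment_snapshot` is hypothetical and the package names are examples only.

```python
import json
import platform
from importlib import metadata


def environment_snapshot(packages):
    """Collect interpreter, OS and package versions for the curation record."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record the absence explicitly rather than failing the run.
            versions[name] = "not installed"
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": versions,
    }


# The package names are examples; list whatever the curation code imports.
snapshot = environment_snapshot(["pandas", "numpy"])
print(json.dumps(snapshot, indent=2))
```

Saving this snapshot next to each change-log entry links every curated dataset version to the software environment that produced it.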
This page belongs to a series of pages about the IDERHA report "Recommendations on policies to support the acceptance of heterogeneous health data research in regulatory and HTA decision-making", published in November 2025. The full report is available as a PDF, or you can visit the page with an executive summary.