Details & API

The API comprises a hierarchical set of types that mirror the structure of risk-of-bias assessment frameworks like RoB2. This design enables systematic evaluation of research manuscripts through structured questioning and evidence-based responses.

  • Frameworks define the top-level assessment structure containing multiple evaluation domains. In the RoB2 context, a Framework represents the complete "Risk of Bias 2" tool for randomized trials, organizing five primary assessment domains and an overall judgement domain into a cohesive evaluation instrument.

  • Domains represent specific categories of bias within a framework. Each domain focuses on a particular aspect of study methodology that could introduce bias. For RoB2, the five primary domains are: (1) bias from randomization process, (2) bias from deviations from intended interventions, (3) bias from missing outcome data, (4) bias in outcome measurement, and (5) bias in selection of reported results. An additional "Overall" domain records the final risk-of-bias judgement and predicted direction of bias.

  • Questions are the specific signaling questions within each domain that guide the assessment process. These questions are designed to systematically evaluate potential sources of bias, with predefined allowed answers (typically "Yes", "Probably Yes", "Probably No", "No", "No Information", "Not Applicable"). If allowed_answers is set to None, the model instead provides a free-form string answer. Questions can be marked as required and are indexed for systematic processing.

  • Responses capture the AI model's assessment for each question, structured as ReasonedResponseWithEvidence objects that include:

      • evidence: List of text excerpts from the manuscript that support the assessment
      • reasoning: The model's explanation of how the evidence leads to the conclusion
      • response: The selected answer from the allowed options, or a free-form string when allowed_answers is None

This hierarchical structure ensures that bias assessments are systematic, traceable, and evidence-based, following established methodological guidelines while leveraging AI capabilities for efficient manuscript analysis.
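
To make the hierarchy concrete, the sketch below builds a one-domain framework by hand. This is a minimal sketch only; it assumes Framework, Domain, and Question can all be imported from risk_of_bias.types and that each type accepts its documented attributes as keyword arguments:

>>> from risk_of_bias.types import Framework, Domain, Question
>>> question = Question(
...     question="Was the allocation sequence random?",
...     allowed_answers=["Yes", "Probably Yes", "Probably No", "No"],
...     index=1.1,
...     is_required=True,
... )
>>> domain = Domain(name="Randomization Process", index=1, questions=[question])
>>> framework = Framework(name="Minimal Example Framework", domains=[domain])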

Type Definitions

Framework

risk_of_bias.types._framework_types.Framework

Bases: BaseModel

The top-level container for risk-of-bias assessment frameworks.

A Framework represents a complete methodological assessment tool (e.g., RoB2) that systematically evaluates potential sources of bias in research studies. Frameworks organize the assessment process through a hierarchical structure:

Framework → Domains → Questions → Responses

The Framework serves as both a template (defining the assessment structure) and a container for results (storing responses after manuscript analysis). When a manuscript is analyzed, the AI model works through each question in sequence, populating the response fields with evidence-based assessments.

Attributes:

Name Type Description
domains list[Domain]

The assessment domains that comprise this framework. Each domain focuses on a specific category of potential bias.

name str

A descriptive name for the framework (e.g., "RoB2 Framework for Randomized Trials").

manuscript (str, optional)

The filename (without path) of the manuscript being assessed.

assessor str | None

The name or identifier of the person or system performing the assessment. This can be useful for tracking who completed the assessment, especially in collaborative environments.

__str__()

Provide a comprehensive human-readable representation of the Framework.

This method creates a structured text representation that displays the complete assessment framework hierarchy, making it easy to review the framework structure and any completed assessments. The output format is designed for readability and debugging purposes.

The string representation includes:

  • Framework name and overview
  • Each domain with its index and name
  • All questions within each domain with their indices and text
  • Allowed answer options for each question (if defined)
  • Response details for answered questions, including:
      • The selected response
      • The reasoning behind the assessment
      • Supporting evidence excerpts from the manuscript
  • Clear indication of unanswered questions

Returns:

Type Description
str

A multi-line string representation of the framework showing:

  • Framework structure (domains and questions)
  • Assessment progress (which questions have been answered)
  • Complete response details for answered questions

Examples:

>>> framework = Framework(name="RoB2 Framework")
>>> print(framework)
Framework: RoB2 Framework
Domain 1: Randomization Process
  Question 1.1: Was the allocation sequence random? (['Yes', 'Probably Yes',
  ...])
    Response: Yes
      Reasoning: The study clearly describes using computer-generated
      randomization
        Evidence: "Participants were randomized using a computer-generated
        sequence"

Notes

This method is particularly useful for:

  • Reviewing framework structure during development
  • Debugging assessment workflows
  • Generating human-readable reports of completed assessments
  • Understanding the hierarchical organization of bias assessment questions

export_to_html(path)

Export the framework as an HTML document.

Parameters:

Name Type Description Default
path Path

Destination file for the HTML representation.

required
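
A usage sketch, assuming a completed framework instance is in scope and using an illustrative file name:

>>> from pathlib import Path
>>> framework.export_to_html(Path("assessment_report.html"))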

export_to_markdown(path)

Export the framework as a Markdown document.

This method creates a structured Markdown representation of the framework and its assessment results, suitable for documentation, reporting, or sharing with stakeholders.

The generated Markdown includes:

  • Framework name as the main heading
  • Each domain as a section with its index and name
  • Questions within each domain with their indices and text
  • Allowed answer options for each question
  • Response details for answered questions, including:
      • The selected response
      • The reasoning behind the assessment
      • Supporting evidence excerpts from the manuscript

Parameters:

Name Type Description Default
path Path

Destination file for the Markdown representation.

required

Examples:

>>> framework = Framework(name="RoB2 Framework")
>>> framework.export_to_markdown(Path("assessment_report.md"))

load(path) classmethod

Load a framework from a JSON file at path.

Parameters:

Name Type Description Default
path Path

The file to read from.

required

Returns:

Type Description
Framework

An instance populated with the data from the file.

save(path)

Save the framework as formatted JSON to path.

This method excludes raw_data fields from the JSON output to keep the saved files clean and focused on assessment results.

Parameters:

Name Type Description Default
path Path

Location to write the JSON representation.

required
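
A save-and-reload round trip, as a sketch with an illustrative file name:

>>> from pathlib import Path
>>> framework.save(Path("assessment.json"))
>>> restored = Framework.load(Path("assessment.json"))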

Pre-built Frameworks

The package provides ready-to-use frameworks that implement established risk-of-bias assessment methodologies. These frameworks come pre-configured with all necessary domains, questions, and answer options, allowing you to immediately begin manuscript analysis without manual setup.

RoB2 Framework
risk_of_bias.frameworks.get_rob2_framework()

Get the complete RoB2 (Risk of Bias 2) Framework for Randomized Trials.

This function returns a fully configured RoB2 framework that implements the Cochrane Risk of Bias tool version 2.0 guidelines for systematic evaluation of bias in randomized controlled trials.

The RoB2 framework is the gold standard for assessing risk of bias in randomized trials and is widely used in systematic reviews and meta-analyses. It provides a structured approach to evaluate five key domains where bias commonly occurs in clinical research and includes an additional domain for the overall risk of bias judgement.

Framework Structure

The framework contains five primary assessment domains, each with specific signaling questions designed to systematically evaluate potential sources of bias. A sixth domain captures the overall risk-of-bias judgement:

Domain 1: Bias arising from the randomization process Evaluates the adequacy of the randomization sequence generation and allocation concealment mechanisms.

Domain 2: Bias due to deviations from intended interventions Assesses whether there were deviations from intended interventions and whether the analysis was appropriate.

Domain 3: Bias due to missing outcome data Examines whether outcome data was available for all participants and whether missingness could depend on the true value.

Domain 4: Bias in measurement of the outcome Evaluates whether the outcome measurement was appropriate and whether measurement differed between intervention groups.

Domain 5: Bias in selection of the reported result Assesses whether the reported result was selected from multiple measurements or analyses of the data.

Domain 6: Overall risk of bias Provides the overall judgement for the outcome, including the predicted direction of bias.

Returns:

Type Description
Framework

A configured Framework instance containing the five RoB2 bias domains plus the overall judgement domain, with their respective signaling questions and answer options. The framework is ready for immediate use with run_framework().

Examples:

>>> from risk_of_bias.frameworks.rob2 import get_rob2_framework
>>> from risk_of_bias import run_framework
>>> from pathlib import Path
>>>
>>> # Get the pre-configured framework
>>> framework = get_rob2_framework()
>>> print(f"Framework: {framework.name}")
>>> print(f"Number of domains: {len(framework.domains)}")
>>>
>>> # Use with manuscript analysis
>>> manuscript = Path("manuscript.pdf")
>>> results = run_framework(manuscript=manuscript, framework=framework)

Notes

This framework follows the official RoB2 guidance and includes all standard signaling questions with the appropriate answer options: "Yes", "Probably Yes", "Probably No", "No", "No Information", "Not Applicable".

The framework structure mirrors the official RoB2 tool to ensure consistency with established assessment practices in systematic review methodology.

References

Sterne JAC, Savović J, Page MJ, et al. RoB 2: a revised tool for assessing risk of bias in randomised trials. BMJ 2019; 366: l4898.

Executing a Framework

risk_of_bias.run_framework

run_framework(manuscript, framework=get_rob2_framework(), model=settings.fast_ai_model, guidance_document=None, verbose=False, temperature=settings.temperature, api_key=None)

Perform systematic risk-of-bias assessment on a research manuscript using AI.

This function automates the process of evaluating potential sources of bias in research studies by systematically working through a structured assessment framework. It combines established methodological frameworks (like RoB2) with AI capabilities to provide evidence-based bias assessments.

The Assessment Process

The function implements a comprehensive workflow:

  1. Framework Setup: Uses a pre-defined assessment framework containing organized domains and signaling questions
  2. Context Establishment: Sends system instructions to guide the AI model's assessment approach
  3. Guidance Integration: If provided, incorporates a domain-specific guidance document to calibrate AI responses and provide specialized assessment criteria
  4. Document Processing: Converts the manuscript PDF to a format the AI can analyze
  5. Systematic Questioning: Sends all questions within a domain in a single request, reducing the number of API calls while maintaining conversation context
  6. Evidence-Based Responses: For each domain the AI returns a list of structured answers corresponding to each question. Each item includes:
      • The chosen response from predefined options
      • Detailed reasoning explaining the assessment
      • Specific evidence excerpts from the manuscript
  7. Result Integration: Stores all parsed responses back into the framework structure for easy access and analysis

Parameters:

Name Type Description Default
manuscript Path

Path to the research manuscript PDF file to analyze. The file must exist and be readable. Supported formats include standard academic PDFs.

required
framework Framework

The assessment framework defining the structure of the bias evaluation. Defaults to the complete RoB2 framework for randomized controlled trials. Custom frameworks can be provided for specialized assessments.

get_rob2_framework()
model str

The OpenAI model identifier to use for assessment. Different models may provide varying levels of analysis depth and accuracy. The default is optimized for speed while maintaining quality.

settings.fast_ai_model
guidance_document Optional[Path]

Optional path to a PDF guidance document that provides domain-specific assessment criteria and AI calibration instructions. This feature enables:

  • Domain-specific expertise: Specialized interpretation criteria for fields like pediatric studies, surgical interventions, or rare diseases
  • AI bias correction: Systematic adjustments when the AI consistently misinterprets methodological aspects or shows patterns of being overly lenient or conservative in specific assessment domains
  • Standardization: Consistent application of journal-specific guidelines or institutional assessment standards across multiple manuscripts
  • Contextual clarification: Detailed explanations for ambiguous scenarios that frequently arise in specialized research contexts

The guidance document is presented to the AI before manuscript analysis, ensuring that your specified criteria and calibrations are consistently applied throughout the entire assessment process.

None
verbose bool

Whether to print detailed progress information during assessment. When True, displays each question, response, reasoning, and evidence in real-time. Useful for debugging, monitoring progress, or understanding the assessment process in detail.

False
temperature float

Sampling temperature passed to the OpenAI model. Higher values yield more diverse answers while lower values make outputs more deterministic. If a negative value is provided, the temperature parameter is omitted and the server default is used.

settings.temperature
api_key Optional[str]

API key to use for OpenAI calls. If None, OPENAI_API_KEY from the environment will be used.

None

Returns:

Type Description
Framework

The original framework structure populated with AI-generated responses. Each question in the framework will contain a ReasonedResponseWithEvidenceAndRawData object with:

  • response: The selected answer from the allowed options
  • reasoning: Detailed explanation of the assessment logic
  • evidence: List of relevant text excerpts from the manuscript
  • raw_data: Complete raw response data from the AI model

The populated framework maintains the hierarchical structure (Framework → Domains → Questions → Responses) for easy navigation and analysis of results. This complete data structure can be serialized to JSON format for persistence, caching, and data sharing workflows.
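
As a usage sketch combining the parameters above (all file names, including the guidance document, are illustrative):

>>> from pathlib import Path
>>> from risk_of_bias import run_framework
>>> from risk_of_bias.frameworks import get_rob2_framework
>>>
>>> completed = run_framework(
...     manuscript=Path("manuscript.pdf"),
...     framework=get_rob2_framework(),
...     guidance_document=Path("guidance.pdf"),
...     verbose=True,
... )
>>> completed.save(Path("manuscript_assessment.json"))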

Manual Entry

risk_of_bias.human.run_human_framework(manuscript, framework=get_rob2_framework(), console=None)

Interactively complete a risk-of-bias framework.

This function guides the user through each question in the framework using a Rich console. Users select or type responses and may provide optional reasoning and evidence. Required questions must be answered; optional questions can be skipped by pressing Enter. Reasoning and evidence prompts can also be skipped with Enter.

Parameters:

Name Type Description Default
manuscript Path

Path to the manuscript being assessed. Only the file name is stored in the resulting framework.

required
framework Framework

Framework to populate with user responses.

get_rob2_framework()
console Console

Console instance for input and output. A new one is created if None.

None

Returns:

Type Description
Framework

The provided framework populated with user responses.
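
A minimal interactive session, as a sketch (the manuscript path is illustrative):

>>> from pathlib import Path
>>> from risk_of_bias.human import run_human_framework
>>>
>>> completed = run_human_framework(Path("manuscript.pdf"))
>>> completed.save(Path("manual_assessment.json"))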

Domain

risk_of_bias.types.Domain

Bases: BaseModel

A thematic grouping of related bias assessment questions within a framework.

Domains represent the conceptual organization of bias assessment, where each domain focuses on a specific methodological aspect that could introduce bias into research findings. This organizational structure reflects how bias assessment experts think about and categorize different types of threats to study validity.

Conceptual Foundation

The domain concept stems from decades of methodological research showing that bias in research studies tends to cluster around specific aspects of study design and conduct. Rather than having an unstructured list of questions, domains provide logical groupings that:

  • Guide Systematic Thinking: Help assessors consider all major categories of potential bias systematically
  • Enable Targeted Assessment: Allow focus on specific methodological concerns relevant to different study types
  • Support Hierarchical Analysis: Enable both domain-level and overall framework-level bias judgments
  • Facilitate Communication: Provide a shared vocabulary for discussing specific types of bias concerns

Assessment Workflow

During assessment, domains are typically evaluated sequentially, with each domain's questions answered before moving to the next. This approach:

  • Maintains focus on one type of bias at a time
  • Allows for domain-specific reasoning and evidence gathering
  • Enables partial assessments when time or information is limited
  • Supports quality control by domain-expert reviewers

Attributes:

Name Type Description
questions list[Question]

The signaling questions that comprise this domain's assessment. Questions are typically ordered from fundamental to more detailed aspects of the bias type being evaluated.

name str

A descriptive name for the domain that clearly indicates the type of bias being assessed (e.g., "Bias arising from the randomization process").

index int

The sequential position of this domain within the overall framework. Used for organizing assessment workflow and reporting results in a consistent order.

Question

risk_of_bias.types.Question

Bases: BaseModel

An individual signaling question that guides bias assessment within a domain.

Questions are the fundamental building blocks of systematic bias assessment, designed to probe specific methodological aspects that could introduce bias into research findings. Each question represents a focused inquiry that helps assessors systematically evaluate study quality and potential threats to validity.

Question Types and Response Modes

Questions can be configured for different types of assessment needs:

Structured Assessment (default): When allowed_answers contains predefined options (like "Yes", "Probably Yes", "No", etc.), the AI must select from these specific choices. This approach:

  • Ensures consistency across assessments
  • Enables quantitative analysis and meta-analysis
  • Follows established assessment frameworks like RoB2
  • Facilitates automated processing and reporting

Free-Form Assessment: When allowed_answers = None, the AI can provide any string response of arbitrary length (see the sketch after this list). This mode is valuable for:

  • Exploratory questions requiring detailed explanations
  • Capturing nuanced methodological details
  • Gathering qualitative insights about study design
  • Custom assessment criteria not covered by standard frameworks
  • Collecting recommendations for study improvement
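
For instance, a minimal sketch of a free-form question using the attributes documented below:

>>> from risk_of_bias.types import Question
>>> open_question = Question(
...     question="Summarise any limitations acknowledged by the authors.",
...     allowed_answers=None,
... )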

Assessment Context and Evidence

Regardless of response mode, each question generates comprehensive assessment data including:

  • Response: The selected answer (structured) or free-form text
  • Reasoning: Detailed explanation of the assessment logic
  • Evidence: Specific text excerpts from the manuscript supporting the conclusion

This evidence-based approach ensures that assessments are transparent, auditable, and grounded in the actual study documentation.

Attributes:

Name Type Description
question str

The text of the signaling question presented to the AI assessor. Should be clear, specific, and answerable based on typical manuscript content. Well-designed questions avoid ambiguity and focus on observable methodological features.

allowed_answers list[str] | None

Defines the response mode for this question:

  • List of strings: Restricts responses to predefined options, ensuring standardized assessment (e.g., ["Yes", "No", "Unclear"])
  • None: Enables free-form text responses of any length, allowing detailed explanations and custom insights

The default provides standard bias assessment options commonly used in systematic review methodologies.

index float, default=0.0

The position of this question within its domain, determining assessment order. Float values allow for flexible question insertion (e.g., 1.5 between questions 1 and 2) without renumbering entire sequences.

is_required bool, default=False

Whether this question must be answered for a complete assessment. Required questions typically address fundamental methodological features essential for bias evaluation, while optional questions may provide additional insights or apply only to specific study types.

response ReasonedResponseWithEvidenceAndRawData | None, default=None

The AI-generated assessment response, populated during framework execution. Contains the structured response, reasoning, supporting evidence, and raw model output data.

Response

risk_of_bias.types.ReasonedResponseWithEvidence

Bases: BaseModel

A structured response container that captures comprehensive AI assessment data.

This class represents the core output of the AI bias assessment process, combining three essential components that make automated assessment both reliable and transparent: the actual response, the reasoning behind it, and the supporting evidence from the manuscript.

The Transparency Imperative

Traditional bias assessment often involves subjective expert judgment that can be difficult to audit or reproduce. This structured response format addresses these limitations by making the assessment process transparent:

  • Explicit Reasoning: Every assessment includes detailed explanation of the logic and criteria used to reach the conclusion
  • Evidence-Based: All conclusions are anchored to specific text from the manuscript, enabling verification and quality control
  • Reproducible: The structured format allows for consistent review, comparison, and potential re-evaluation of assessments

Multi-Modal Assessment Support

The evidence component is designed to work with various types of supporting information:

  • Direct Quotes: Exact text excerpts from methodology sections
  • Paraphrased Content: Summarized information when direct quotes would be too lengthy or fragmented
  • Multi-Source Evidence: Citations from different parts of the manuscript that collectively support the assessment
  • Contextual Information: Background details that inform the interpretation of methodological choices

Attributes:

Name Type Description
evidence list[str]

A collection of text excerpts from the manuscript that support the assessment conclusion. Each item should be a meaningful piece of evidence that directly relates to the question being assessed. Evidence items are typically:

  • Direct quotations from relevant manuscript sections
  • Specific methodological details described by the authors
  • Quantitative information (sample sizes, response rates, etc.)
  • Procedural descriptions that inform bias assessment

Multiple evidence items allow for comprehensive support of complex assessments that may depend on information scattered throughout the manuscript.

reasoning str

A detailed explanation of the assessment logic connecting the evidence to the conclusion. This should include:

  • Interpretation of the evidence in methodological context
  • Application of relevant bias assessment criteria
  • Consideration of alternative interpretations
  • Explanation of how the evidence leads to the specific response

High-quality reasoning demonstrates methodological sophistication and provides the rationale needed for assessment validation.

response str

The actual assessment answer, either selected from predefined options (for structured questions) or provided as free-form text (for open-ended questions). This represents the final conclusion of the assessment process based on the evidence and reasoning.
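
Constructed by hand, a response pairs these three components. A sketch with invented illustrative values:

>>> from risk_of_bias.types import ReasonedResponseWithEvidence
>>> response = ReasonedResponseWithEvidence(
...     response="Yes",
...     reasoning="Computer-generated randomization indicates an adequate allocation sequence.",
...     evidence=[
...         "Participants were randomized using a computer-generated sequence."
...     ],
... )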

Summary and Analysis Functions

After completing individual risk-of-bias assessments using frameworks, researchers typically need to analyze results across multiple studies for systematic reviews, meta-analyses, or research synthesis. The summary functions provide essential tools for aggregating, visualizing, and exporting assessment results in formats compatible with established research workflows.

These functions address three critical needs in evidence synthesis:

  1. Batch Processing: Loading and processing multiple completed assessments from saved framework files
  2. Data Aggregation: Extracting domain-level judgements across studies for comparative analysis
  3. Standardized Export: Creating outputs compatible with specialized visualization tools like RobVis

This workflow supports the transition from individual study assessment to systematic evidence synthesis, enabling researchers to identify patterns of bias across study collections and generate publication-ready visualizations for systematic reviews.
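
The three needs map onto three functions that chain together. A sketch of the full pipeline (the directory and output paths are illustrative):

>>> from risk_of_bias.summary import (
...     load_frameworks_from_directory,
...     summarise_frameworks,
...     export_summary,
... )
>>>
>>> frameworks = load_frameworks_from_directory("./completed_assessments/")
>>> summary = summarise_frameworks(frameworks)
>>> export_summary(summary, "rob2_summary.csv")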

Loading Multiple Assessments

risk_of_bias.summary.load_frameworks_from_directory(directory)

Load multiple completed risk-of-bias assessments for batch analysis.

This function enables systematic reviews and meta-analyses by batch-loading previously completed framework assessments from a directory. After conducting individual risk-of-bias assessments on multiple studies, researchers typically need to analyze patterns across their entire study collection. This function streamlines that process by automatically discovering and loading all completed assessments.

The function is fault-tolerant, continuing to load valid frameworks even if some files are corrupted or incompatible. This robustness is essential when working with large collections of assessments that may have been created over time or by different researchers.

Common use cases include:

  • Preparing data for systematic review summary tables
  • Generating cross-study bias pattern visualizations
  • Creating inputs for meta-analysis software
  • Quality assurance checks across assessment batches

Parameters:

Name Type Description Default
directory Path | str

Directory containing previously saved framework JSON files from completed risk-of-bias assessments.

required

Returns:

Type Description
list[Framework]

List of successfully loaded frameworks. Files that cannot be parsed (due to corruption, format changes, etc.) are silently ignored to ensure batch processing continues.

Examples:

>>> frameworks = load_frameworks_from_directory("./completed_assessments/")
>>> print(f"Loaded {len(frameworks)} completed assessments")

Creating Assessment Summaries

risk_of_bias.summary.summarise_frameworks(frameworks)

Extract domain-level risk judgements for comparative analysis across studies.

This function transforms detailed framework assessments into a simplified summary format suitable for systematic review tables, meta-analysis inputs, and cross-study comparisons. By extracting only the final risk-of-bias judgements for each domain, it creates a standardized view that facilitates pattern recognition and evidence synthesis across multiple studies.

The function specifically looks for "Risk-of-bias judgement" questions within each domain, which represent the final assessment conclusions after considering all signaling questions and evidence. This approach aligns with established risk-of-bias assessment methodologies where detailed questioning leads to domain-level judgements.

Key applications include:

  • Creating summary tables for systematic review publications
  • Identifying studies with consistent bias patterns across domains
  • Preparing data for risk-of-bias visualization tools (e.g., RobVis)
  • Supporting meta-analysis decisions about study inclusion/weighting

Parameters:

Name Type Description Default
frameworks list[Framework]

Completed framework assessments to summarise. These should contain domain-level "Risk-of-bias judgement" responses.

required

Returns:

Type Description
dict[str, dict[str, str | None]]

Nested mapping structure where:

  • Outer keys: manuscript/study identifiers
  • Inner keys: domain names
  • Values: risk-of-bias judgements ("low", "some concerns", "high") or None if no judgement was recorded

Examples:

>>> summary = summarise_frameworks(loaded_frameworks)
>>> print(summary["Smith2023"]["Randomization Process"])
'low'

Exporting for Visualization

risk_of_bias.summary.export_summary(summary, path)

Export risk-of-bias summary to a CSV file for analysis and visualization.

This function saves the risk-of-bias assessment summary as a CSV (Comma-Separated Values) file, a widely supported standard format that can be opened in spreadsheet applications like Excel, imported into statistical software like R or Python, or used with specialized risk-of-bias visualization tools.

The exported CSV is specifically formatted to be compatible with RobVis, a popular R package and web application for creating publication-ready risk-of-bias visualizations. This ensures seamless interoperability between this tool and the broader risk-of-bias assessment ecosystem. The RobVis tool can generate traffic light plots and summary plots that are commonly used in systematic reviews and meta-analyses.

The CSV structure includes:

  • A Study column containing manuscript/study identifiers
  • Domain columns (D1, D2, ..., Dn) for each risk-of-bias domain
  • An Overall column. If a domain named Overall exists in the summary it will be used directly; otherwise the column represents the highest (worst) risk rating across all domains

Risk judgements are formatted according to RobVis conventions:

  • "Low" for low risk of bias
  • "Some concerns" for moderate risk of bias
  • "High" for high risk of bias

Parameters:

Name Type Description Default
summary Mapping[str, Mapping[str, str | None]]

Output from summarise_frameworks containing the risk-of-bias assessments.

required
path Path | str

Destination file path for the CSV export. The file will be created or overwritten.

required

Notes

The exported CSV can be directly uploaded to the RobVis web interface at https://www.riskofbias.info/welcome/robvis-visualization-tool or used with the RobVis R package for programmatic visualization generation.

Comparing Assessors

risk_of_bias.compare.compare_frameworks(fw1, fw2)

Compare two completed risk-of-bias frameworks.

Parameters:

Name Type Description Default
fw1 Framework

Completed risk-of-bias frameworks to compare. The domain and question structure must be identical between the two frameworks.

required
fw2 Framework

Completed risk-of-bias frameworks to compare. The domain and question structure must be identical between the two frameworks.

required

Returns:

Type Description
DataFrame

Long-form table with columns domain_short, question_short, domain, and question, plus one column per assessor containing their responses. If a question was unanswered, the value will be None. The final column, agreement, indicates whether the assessors provided the same response.

Assessor Agreement Plot

risk_of_bias.visualisation.plot_assessor_agreement(df)

Visualise agreement between two assessors.

Parameters:

Name Type Description Default
df DataFrame

Table returned by compare_frameworks.

required

Returns:

Type Description
Figure

Figure with scatter points for each assessor.
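
Together with compare_frameworks, this enables a simple agreement check between, say, an AI and a human assessor. A sketch, assuming Framework is importable from risk_of_bias.types as in the earlier sketches and using illustrative file names:

>>> from pathlib import Path
>>> from risk_of_bias.types import Framework
>>> from risk_of_bias.compare import compare_frameworks
>>> from risk_of_bias.visualisation import plot_assessor_agreement
>>>
>>> fw_ai = Framework.load(Path("ai_assessment.json"))
>>> fw_human = Framework.load(Path("human_assessment.json"))
>>> df = compare_frameworks(fw_ai, fw_human)
>>> fig = plot_assessor_agreement(df)
>>> fig.savefig("assessor_agreement.png")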