Submission Guide

Overview

We welcome high-quality benchmark dataset submissions from the pharmacometrics community. Each submission undergoes rigorous peer review and, upon acceptance, becomes a citable publication with a DOI.

Benchmark Requirements

All benchmark datasets must meet the following criteria:

1. Realism

  • Irregular sampling: Reflect realistic clinical trial sampling schedules
  • Confounding dropouts: Include realistic patient dropout patterns
  • Realistic relationships: Capture true dose-exposure-response relationships

2. Longitudinal Structure

  • Data must include repeated measurements over time
  • The structure must be appropriate for pharmacometric modeling approaches

3. Comprehensive Documentation

Your submission must include:

  • Generative process description: For synthetic data, document how the data was generated
  • Realistic scenario description: Explain what real-world situation the dataset represents
  • Clear data dictionary: Describe all variables and their units

4. Associated Tasks

Define specific tasks that reflect real-world decision-making in drug development:

  • Model selection challenges
  • Dose optimization
  • Clinical trial simulation validation
  • Exposure-response characterization

5. Train/Test Split

  • Provide a pre-specified train/test split
  • Evaluation metrics should be computed on the test set
  • Document the rationale for the split strategy
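
One common split strategy for longitudinal data is to assign whole subjects to either the training or the test set, so that no subject's measurements leak across the split. The sketch below illustrates that idea only; this guide does not mandate a subject-level split, and the function name, the 20% test fraction, and the seed are illustrative assumptions.

```python
import random


def split_by_subject(rows, id_col="ID", test_fraction=0.2, seed=2025):
    """Assign whole subjects to train or test so no ID appears in both."""
    subject_ids = sorted({row[id_col] for row in rows})
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(subject_ids)
    n_test = max(1, round(len(subject_ids) * test_fraction))
    test_ids = set(subject_ids[:n_test])
    train = [row for row in rows if row[id_col] not in test_ids]
    test = [row for row in rows if row[id_col] in test_ids]
    return train, test


# Toy longitudinal data: 10 subjects with 3 observations each (values are placeholders)
rows = [{"ID": i, "TIME": t, "DV": 1.0} for i in range(1, 11) for t in (0.5, 2.0, 8.0)]
train, test = split_by_subject(rows)
print(len(train), len(test))
```

Recording the seed and the split unit in the Train/Test Split section of index.qmd makes the rationale easy for reviewers to verify.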

Submission Structure

Each benchmark must follow this directory structure:

benchmarks/<dataset-name>/
├── index.qmd              # Main description and documentation
├── data/
│   ├── train.csv          # Training dataset
│   ├── test.csv           # Test dataset
│   └── data-dictionary.yml # Column descriptions (yspec-style YAML)
├── metadata.yml           # Machine-readable metadata
└── README.md              # Quick reference (generated from index.qmd)

Required Files

1. index.qmd

Your main documentation file must include the following sections:

  • Title and Authors
  • Abstract: Brief overview of the benchmark
  • Background: Context and motivation
  • Data Generation (for synthetic data): Detailed methodology
  • Dataset Description: Variables, sample size, study design
  • Tasks: Specific modeling challenges with evaluation criteria
  • Train/Test Split: Description and rationale
  • References: Relevant citations

2. Data Files

  • train.csv: Training dataset
  • test.csv: Test dataset
  • Both files must use the same column structure
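
A quick way to confirm that both files share a column structure is to compare their header rows. The following is a minimal sketch, not the repository's validator; the helper names are hypothetical, and treating column order as part of the "structure" is an assumption.

```python
import csv
import io


def read_header(handle):
    """Return the first row of an open CSV file as a list of column names."""
    return next(csv.reader(handle))


def compare_headers(train_header, test_header):
    """Return a list of problems; an empty list means the headers match."""
    problems = []
    missing = [c for c in train_header if c not in test_header]
    extra = [c for c in test_header if c not in train_header]
    if missing:
        problems.append(f"missing from test.csv: {missing}")
    if extra:
        problems.append(f"only in test.csv: {extra}")
    if not problems and train_header != test_header:
        problems.append("same columns, different order")
    return problems


# In-memory stand-ins for train.csv and test.csv
train_file = io.StringIO("ID,TIME,DV,DROPOUT\n1,0,,0\n")
test_file = io.StringIO("ID,TIME,DV,DROPOUT\n7,0,,0\n")
print(compare_headers(read_header(train_file), read_header(test_file)))
```

For real files, replace the io.StringIO objects with open("data/train.csv", newline="") and the matching call for test.csv.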

3. data-dictionary.yml

A yspec-style YAML schema documenting each column. Top-level keys are column names (one key per column in train.csv/test.csv); reserved keys ending in __ (e.g. SETUP__) are not treated as columns.

SETUP__:
  description: Brief description of this data dictionary
  glue:
    - "{{ short }}"

ID:
  short: Subject identifier
  type: integer
  values: 1 to N

TIME:
  short: Time since first dose
  type: numeric
  unit: hours
  range: [0, ]

DV:
  short: Dependent variable (plasma concentration)
  type: numeric
  unit: mg/L
  comment: Observation rows only; dose-event rows have DV missing.

DROPOUT:
  short: Dropout indicator
  type: integer
  values:
    0: completed
    1: dropped out

Legacy data-dictionary.csv files (with columns column_name, description, units, type, coding) are still accepted by the validator for backwards compatibility with earlier submissions, but new submissions should use the YAML form.
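
The rule that every non-reserved top-level key corresponds to a column can be checked mechanically. The sketch below works on an already-parsed dictionary (for example, the result of yaml.safe_load if PyYAML is available, which is an assumption); the function names and the extra DOSE column are hypothetical.

```python
def dictionary_columns(spec):
    """Column names declared in a parsed data dictionary, skipping reserved keys ending in '__'."""
    return [key for key in spec if not key.endswith("__")]


def check_dictionary(spec, csv_columns):
    """Report columns that lack documentation and documented columns absent from the data."""
    declared = set(dictionary_columns(spec))
    present = set(csv_columns)
    return {
        "undocumented": sorted(present - declared),
        "not_in_data": sorted(declared - present),
    }


# Parsed form of the example dictionary above, abbreviated to the 'short' field
spec = {
    "SETUP__": {"description": "Brief description of this data dictionary"},
    "ID": {"short": "Subject identifier"},
    "TIME": {"short": "Time since first dose"},
    "DV": {"short": "Dependent variable (plasma concentration)"},
    "DROPOUT": {"short": "Dropout indicator"},
}
# DOSE is a hypothetical CSV column that the dictionary forgot to describe
result = check_dictionary(spec, ["ID", "TIME", "DV", "DROPOUT", "DOSE"])
print(result)
```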

4. metadata.yml

Machine-readable metadata in YAML format:

name: dataset-name
title: Full Dataset Title
version: 1.0.0
date: 2025-10-16
authors:
  - name: Jane Doe
    affiliation: University Example
    email: jane.doe@example.com
  - name: John Smith
    affiliation: Pharma Corp
description: Brief description of the benchmark
keywords:
  - pharmacokinetics
  - dose-response
  - longitudinal
data_type: synthetic
therapeutic_area: oncology
n_subjects: 250
n_observations: 2500
tasks:
  - name: prediction-accuracy
    type: regression
    description: Predict plasma concentrations in the test set
    target: DV
    output_format: {type: individual_predictions, columns: [ID, TIME, PRED]}
    metric: rmse
license: CC-BY-4.0
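
The n_subjects and n_observations fields are easy to get out of sync with the data, so it can help to derive them from the CSV rows. This is a sketch under stated assumptions: the guide does not define how observations are counted, so the code assumes an observation is any row with a non-missing DV, and that missing values are encoded as an empty string, ".", or "NA". The function name is hypothetical.

```python
MISSING_MARKERS = ("", ".", "NA")  # assumed encodings of a missing DV


def summarize_counts(rows, id_col="ID", dv_col="DV"):
    """Count distinct subjects and rows that carry an observed DV value."""
    subjects = {row[id_col] for row in rows}
    n_observations = sum(
        1 for row in rows
        if row[dv_col] is not None and row[dv_col].strip() not in MISSING_MARKERS
    )
    return {"n_subjects": len(subjects), "n_observations": n_observations}


# Rows as csv.DictReader would yield them (all values are strings)
rows = [
    {"ID": "1", "TIME": "0", "DV": ""},     # dose-event row, DV missing
    {"ID": "1", "TIME": "1", "DV": "4.2"},
    {"ID": "2", "TIME": "0", "DV": "."},    # dose-event row, DV missing
    {"ID": "2", "TIME": "1", "DV": "3.9"},
    {"ID": "2", "TIME": "4", "DV": "2.1"},
]
counts = summarize_counts(rows)
print(counts)
```

Whether the counts should cover train.csv alone or train.csv and test.csv combined is not stated here; document the choice in the description field or in index.qmd.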

Step-by-Step Guide

This guide walks through a complete submission from scratch. If you are comfortable with GitHub and git, skip to Submission Process below.

Prerequisites

Install the following before starting:

  1. A GitHub account: sign up at github.com
  2. Git: git-scm.com/downloads
    • macOS: brew install git
    • Windows: use the Git for Windows installer
    • Linux: sudo apt install git or sudo yum install git
  3. Git LFS: git-lfs.github.com
    • macOS: brew install git-lfs
    • Windows: included in Git for Windows
    • Linux: sudo apt install git-lfs
  4. Quarto: quarto.org/docs/get-started

1. Fork the Repository

  1. Go to github.com/PMxBenchmarks/pmx_benchmarks
  2. Click Fork (top-right corner)
  3. Select your GitHub account as the destination

You now have your own copy at github.com/<your-username>/pmx_benchmarks.

2. Clone Your Fork

git clone https://github.com/<your-username>/pmx_benchmarks.git
cd pmx_benchmarks

3. Set Up Git LFS

Run once after cloning:

git lfs install

Verify tracking is active:

git lfs track

You should see *.csv, *.parquet, *.RData, and other data formats listed.

4. Create a Branch

git checkout -b benchmark/<your-dataset-name>

Example: git checkout -b benchmark/idr-pkpd-covariate

5. Add Your Benchmark

Create the directory structure:

mkdir -p benchmarks/<your-dataset-name>/data

Use BENCHMARK_TEMPLATE.md and benchmarks/example-pk-model-selection/ as references. Required files:

benchmarks/<your-dataset-name>/
├── index.qmd
├── metadata.yml
└── data/
    ├── train.csv
    ├── test.csv
    └── data-dictionary.yml
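
Before running the repository's validators, a short pre-flight check can confirm that the required files listed above are in place. The sketch below is an illustration, not part of the repository; the function name is hypothetical, and it assumes a legacy data-dictionary.csv would sit in the same data/ directory as the YAML form.

```python
import tempfile
from pathlib import Path

REQUIRED_FILES = ["index.qmd", "metadata.yml", "data/train.csv", "data/test.csv"]
# Either dictionary form is accepted; the legacy CSV location is an assumption
DICTIONARY_FILES = ["data/data-dictionary.yml", "data/data-dictionary.csv"]


def missing_files(benchmark_dir):
    """Return the required paths (relative to benchmark_dir) that do not exist."""
    root = Path(benchmark_dir)
    missing = [rel for rel in REQUIRED_FILES if not (root / rel).is_file()]
    if not any((root / rel).is_file() for rel in DICTIONARY_FILES):
        missing.append(DICTIONARY_FILES[0])
    return missing


# Demo on a throwaway directory that lacks metadata.yml and a data dictionary
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "benchmarks" / "my-dataset"
    (root / "data").mkdir(parents=True)
    for rel in ["index.qmd", "data/train.csv", "data/test.csv"]:
        (root / rel).touch()
    demo_missing = missing_files(root)

print(demo_missing)
```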

6. Validate Locally

# Check Quarto renders without errors
quarto render benchmarks/<your-dataset-name>/index.qmd

# Run validation scripts
python .github/scripts/validate_benchmark.py
python .github/scripts/validate_data.py

Fix any errors before continuing.

7. Commit and Push

git add benchmarks/<your-dataset-name>/
git commit -m "Add <your-dataset-name> benchmark"
git push origin benchmark/<your-dataset-name>

8. Open a Pull Request

  1. Go to your fork: github.com/<your-username>/pmx_benchmarks
  2. Click Compare & pull request (appears after pushing)
  3. Set base repository to PMxBenchmarks/pmx_benchmarks, base branch main
  4. Fill in the PR template completely
  5. Click Create pull request

9. Respond to Review

Reviewers will comment on your PR. Push updated commits to the same branch — do not open a new PR. The existing PR updates automatically.


Large Data Files

All benchmark data files are stored with Git LFS. Run git lfs install once after cloning, and Git will automatically handle the tracked file types (*.csv, *.parquet, *.RData, *.rds, *.sas7bdat, *.xpt).
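
Git LFS records its tracked patterns in a .gitattributes file, one pattern per line with a filter=lfs attribute. If you want to confirm programmatically that your data format is covered, a sketch like the one below can list those patterns. The sample text is an assumption about what the repository's .gitattributes looks like, and the function name is hypothetical.

```python
def lfs_patterns(gitattributes_text):
    """Return the path patterns that are routed through Git LFS."""
    patterns = []
    for line in gitattributes_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        pattern, *attributes = line.split()
        if "filter=lfs" in attributes:
            patterns.append(pattern)
    return patterns


# Assumed contents, in the form written by `git lfs track "<pattern>"`
sample = """\
# Data files
*.csv filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.RData filter=lfs diff=lfs merge=lfs -text
*.rds filter=lfs diff=lfs merge=lfs -text
*.sas7bdat filter=lfs diff=lfs merge=lfs -text
*.xpt filter=lfs diff=lfs merge=lfs -text
*.md text
"""
tracked = lfs_patterns(sample)
print(tracked)
```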


Task Schema

Each task in metadata.yml should declare a type, output_format, and metric. The validator warns when type is absent and enforces output_format and metric only when type is provided. Three task types are supported:

Regression

Predict a continuous outcome for each row in the test set.

- name: prediction-accuracy
  type: regression
  target: DV                    # column in test.csv with ground truth values
  output_format: {type: individual_predictions, columns: [ID, TIME, PRED]}
  metric: rmse                  # rmse | mae | nrmse
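
To make the regression output format concrete, the sketch below pairs predictions with test-set observations on (ID, TIME) and computes the three listed metrics. It is not the benchmark's scoring code. The helper names are hypothetical, and because this guide does not define how nrmse is normalized, the sketch assumes normalization by the range of the observed values.

```python
import math


def align(test_rows, pred_rows, target="DV"):
    """Pair each observed value with its prediction, matching on (ID, TIME)."""
    preds = {(r["ID"], r["TIME"]): float(r["PRED"]) for r in pred_rows}
    observed, predicted = [], []
    for row in test_rows:
        if row[target] in ("", None):
            continue  # dose-event rows carry no DV
        observed.append(float(row[target]))
        predicted.append(preds[(row["ID"], row["TIME"])])  # KeyError if a prediction is missing
    return observed, predicted


def rmse(obs, pred):
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def nrmse(obs, pred):
    # Assumption: normalize by the observed range; other conventions use the mean or SD
    return rmse(obs, pred) / (max(obs) - min(obs))


test_rows = [
    {"ID": "1", "TIME": "0", "DV": ""},
    {"ID": "1", "TIME": "1", "DV": "1.0"},
    {"ID": "1", "TIME": "2", "DV": "2.0"},
    {"ID": "2", "TIME": "1", "DV": "3.0"},
    {"ID": "2", "TIME": "2", "DV": "4.0"},
]
pred_rows = [
    {"ID": "1", "TIME": "1", "PRED": "1.0"},
    {"ID": "1", "TIME": "2", "PRED": "2.0"},
    {"ID": "2", "TIME": "1", "PRED": "3.0"},
    {"ID": "2", "TIME": "2", "PRED": "6.0"},
]
obs, pred = align(test_rows, pred_rows)
print(rmse(obs, pred), mae(obs, pred), nrmse(obs, pred))
```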

Classification

Predict a binary or multi-class outcome.

# Soft predictions
- name: dropout-prediction
  type: classification
  output_format: {type: probabilities, columns: [ID, P_DROPOUT]}
  metric: auroc

# Hard predictions
- name: model-selection
  type: classification
  output_format: {type: class_predictions, columns: [MODEL_SELECTED]}
  metric: accuracy
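
For intuition about the two classification metrics named above, here is a dependency-free sketch. accuracy is the fraction of exact matches, and auroc is computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function names and the model labels are illustrative, and the benchmark's own scoring code may differ.

```python
def accuracy(truth, predicted):
    """Fraction of hard predictions that match the true class."""
    return sum(t == p for t, p in zip(truth, predicted)) / len(truth)


def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Soft predictions: DROPOUT labels with P_DROPOUT scores
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))

# Hard predictions: hypothetical MODEL_SELECTED labels
print(accuracy(["1cmt", "2cmt", "2cmt"], ["1cmt", "2cmt", "1cmt"]))
```

The pairwise form is quadratic in the number of rows, which is fine for illustration but slow for large test sets.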

Counterfactual

Report aggregate statistics under a hypothetical intervention. Output must be aggregates — not individual-level predictions.

- name: cmax-exceedance-2x-dose
  type: counterfactual
  scenario: "2x observed dose, same schedule"
  output_format: {type: summary_stats, stats: [q10, q25, q50, q75, q90]}
  truth_file: tasks/cmax_truth.yml
  metric: quantile_coverage

The truth_file must be deposited in tasks/ and contain pre-computed ground truth from your generative model. Its estimates keys must exactly match the names declared in output_format.stats:

# tasks/cmax_truth.yml
scenario: "2x observed dose, same schedule"
n_sim: 10000
population: all
estimates:
  q10: 45.2
  q25: 58.7
  q50: 74.3
  q75: 95.1
  q90: 118.4
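
Because the estimates keys have to match output_format.stats exactly, it is worth checking the two lists against each other before submitting. The sketch below operates on already-parsed dictionaries that mirror the task and truth file above; the function name is hypothetical and this is not the repository's validator.

```python
def check_truth_keys(task, truth):
    """Compare the stats a task declares with the keys its truth file provides."""
    declared = list(task["output_format"]["stats"])
    provided = list(truth["estimates"])
    return {
        "missing_in_truth": [k for k in declared if k not in provided],
        "undeclared_in_truth": [k for k in provided if k not in declared],
    }


task = {
    "name": "cmax-exceedance-2x-dose",
    "type": "counterfactual",
    "output_format": {"type": "summary_stats", "stats": ["q10", "q25", "q50", "q75", "q90"]},
}
truth = {
    "scenario": "2x observed dose, same schedule",
    "estimates": {"q10": 45.2, "q25": 58.7, "q50": 74.3, "q75": 95.1, "q90": 118.4},
}
print(check_truth_keys(task, truth))

# A truth file that uses 'median' instead of 'q50' is flagged on both sides
renamed = {"estimates": {"q10": 45.2, "q25": 58.7, "median": 74.3, "q75": 95.1, "q90": 118.4}}
print(check_truth_keys(task, renamed))
```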

The other supported counterfactual output type is probability. It takes a threshold and a direction (above or below), and its truth file needs a single p key.

One metric per task — create separate tasks if you want multiple metrics evaluated.
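
The validation behavior described at the start of this section (a warning when type is absent, and enforcement of output_format and metric only when type is present) can be sketched as follows. This illustrates the stated rule and is not the code in validate_benchmark.py; the function name, the messages, and the list-based check for multiple metrics are assumptions.

```python
SUPPORTED_TYPES = {"regression", "classification", "counterfactual"}


def check_task(task):
    """Return (warnings, errors) for one parsed task entry from metadata.yml."""
    warnings, errors = [], []
    name = task.get("name", "<unnamed>")
    task_type = task.get("type")
    if task_type is None:
        # Without a type, the remaining fields are not enforced
        warnings.append(f"{name}: no type declared; output_format and metric not checked")
        return warnings, errors
    if task_type not in SUPPORTED_TYPES:
        errors.append(f"{name}: unsupported type '{task_type}'")
    for field in ("output_format", "metric"):
        if field not in task:
            errors.append(f"{name}: '{field}' is required when type is set")
    if isinstance(task.get("metric"), (list, tuple)):
        errors.append(f"{name}: declare one metric per task")
    return warnings, errors


complete = {
    "name": "prediction-accuracy",
    "type": "regression",
    "target": "DV",
    "output_format": {"type": "individual_predictions", "columns": ["ID", "TIME", "PRED"]},
    "metric": "rmse",
}
print(check_task(complete))
print(check_task({"name": "untyped-task"}))
print(check_task({"name": "partial-task", "type": "regression"}))
```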


Submission Process

Step 1: Prepare Your Benchmark

  1. Create your benchmark following the structure above
  2. Test that your documentation builds correctly with Quarto
  3. Validate that your data files are properly formatted

Step 2: Submit a Pull Request

  1. Fork this repository
  2. Create a new branch: git checkout -b benchmark/<your-dataset-name>
  3. Add your benchmark to benchmarks/<your-dataset-name>/
  4. Commit your changes with a clear message
  5. Push to your fork and submit a Pull Request

Use our pull request template, which guides you through the submission checklist.

Step 3: Peer Review

Your submission will undergo peer review:

  • Technical validation (automated checks)
  • Scientific review (expert evaluation)
  • Documentation quality assessment

Reviewers will provide feedback via PR comments. Please address all comments before final acceptance.

Step 4: Acceptance and Publication

Upon acceptance:

  • Your benchmark will be merged into the main repository
  • A DOI will be assigned
  • Your benchmark will appear on the website
  • You can cite it in publications

Tips for a Successful Submission

  • Start early: The review process may take several iterations
  • Be thorough: Complete documentation speeds up review
  • Test your data: Ensure files load correctly and contain expected information
  • Engage with reviewers: Respond promptly to feedback
  • Follow examples: Look at existing benchmarks for guidance

Questions?

If you have questions about the submission process, please contact us or open a discussion on GitHub.