Submission Guide

Overview

We welcome high-quality benchmark dataset submissions from the pharmacometrics community. Each submission undergoes rigorous peer review and, upon acceptance, becomes a citable publication with a DOI.

Benchmark Requirements

All benchmark datasets must meet the following criteria:

1. Realism

Irregular sampling: Reflect realistic clinical trial sampling schedules
Confounding dropouts: Include realistic patient dropout patterns
Realistic relationships: Capture true dose-exposure-response relationships

2. Longitudinal Structure

Data must include repeated measurements over time
Data must be appropriate for pharmacometric modeling approaches
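
A quick way to confirm the repeated-measurements requirement is to count distinct observation times per subject. The sketch below is only illustrative: it assumes the `ID` and `TIME` column names from the example data dictionary, and `subjects_without_repeats` is a hypothetical helper, not part of any repository tooling.

```python
import csv
import io
from collections import defaultdict


def subjects_without_repeats(csv_text, id_col="ID", time_col="TIME"):
    """Return IDs of subjects with fewer than two distinct observation times."""
    times = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        times[row[id_col]].add(float(row[time_col]))
    return sorted(sid for sid, seen in times.items() if len(seen) < 2)


# Toy data (made up): subject 1 has three samples, subject 2 only one.
example = "ID,TIME,DV\n1,0,0.0\n1,1.5,4.2\n1,6,2.1\n2,0,0.0\n"
print(subjects_without_repeats(example))  # ['2']
```

For a real file, pass the file contents (for example `Path("data/train.csv").read_text()`) instead of the toy string.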

3. Comprehensive Documentation

Your submission must include:

Generative process description: For synthetic data, document how the data were generated
Realistic scenario description: Explain what real-world situation the dataset represents
Clear data dictionary: Describe all variables and their units

4. Associated Tasks

Define specific tasks that reflect real-world decision-making in drug development:

Model selection challenges
Dose optimization
Clinical trial simulation validation
Exposure-response characterization

5. Train/Test Split

Provide a pre-specified train/test split
Evaluation metrics should be computed on the test set
Document the rationale for the split strategy
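
One possible split strategy for longitudinal data is to assign whole subjects to either set, so that no subject contributes observations to both. This guide does not mandate that strategy; the sketch below simply assumes it, and the function name, the 20% test fraction, and the seed are all illustrative choices.

```python
import random


def split_by_subject(subject_ids, test_fraction=0.2, seed=2025):
    """Assign whole subjects to train or test so no subject appears in both."""
    unique_ids = sorted(set(subject_ids))
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(unique_ids)
    n_test = max(1, round(len(unique_ids) * test_fraction))
    test_ids = set(unique_ids[:n_test])
    train_ids = set(unique_ids[n_test:])
    return train_ids, test_ids


# One ID per observation row, as in a long-format dataset (made-up values).
row_ids = [1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 5, 5]
train_ids, test_ids = split_by_subject(row_ids)
print(len(train_ids), len(test_ids))  # 4 1
```

Whatever strategy you choose, recording the seed and the fraction alongside the rationale makes the split easier for reviewers to reproduce.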

Submission Structure

Each benchmark must follow this directory structure:

benchmarks/<dataset-name>/
├── index.qmd              # Main description and documentation
├── data/
│   ├── train.csv          # Training dataset
│   ├── test.csv           # Test dataset
│   └── data-dictionary.csv # Column descriptions
├── metadata.yml           # Machine-readable metadata
└── README.md              # Quick reference (generated from index.qmd)
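
Before opening a pull request, you may want to check locally that every file in the layout above is present. The following is a minimal sketch; `missing_files` is a hypothetical helper and `example-dataset` is a placeholder name, neither of which comes from this repository.

```python
import tempfile
from pathlib import Path

# Relative paths taken from the directory structure shown above.
REQUIRED_FILES = [
    "index.qmd",
    "data/train.csv",
    "data/test.csv",
    "data/data-dictionary.csv",
    "metadata.yml",
    "README.md",
]


def missing_files(benchmark_dir):
    """Return the required relative paths that are absent from benchmark_dir."""
    root = Path(benchmark_dir)
    return [rel for rel in REQUIRED_FILES if not (root / rel).is_file()]


# Demo on a throwaway directory that contains only three of the six files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "benchmarks" / "example-dataset"
    (root / "data").mkdir(parents=True)
    for rel in ["index.qmd", "data/train.csv", "metadata.yml"]:
        (root / rel).touch()
    demo_missing = missing_files(root)

print(demo_missing)
# ['data/test.csv', 'data/data-dictionary.csv', 'README.md']
```

An empty list means the structure is complete.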

Required Files

1. `index.qmd`

Your main documentation file must include the following sections:

Title and Authors
Abstract: Brief overview of the benchmark
Background: Context and motivation
Data Generation (for synthetic data): Detailed methodology
Dataset Description: Variables, sample size, study design
Tasks: Specific modeling challenges with evaluation criteria
Train/Test Split: Description and rationale
References: Relevant citations

2. Data Files

train.csv: Training dataset
test.csv: Test dataset
Both files must use the same column structure
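
The shared-column requirement can be checked by comparing the header rows of the two files. The sketch below works on CSV text for brevity and uses made-up rows; the helper names are illustrative, and it assumes "same column structure" means the same names in the same order.

```python
import csv
import io


def header_of(csv_text):
    """Return the header row of CSV content as a list of column names."""
    return next(csv.reader(io.StringIO(csv_text)))


def columns_match(train_text, test_text):
    """True when both files have the same columns in the same order."""
    return header_of(train_text) == header_of(test_text)


train_text = "ID,TIME,DV\n1,0,0.0\n1,2,3.1\n"
test_text = "ID,TIME,DV\n9,0,0.0\n"
print(columns_match(train_text, test_text))  # True
```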

3. `data-dictionary.csv`

A CSV file describing each column with the following structure:

column_name	description	units	type	coding
ID	Subject identifier	-	integer	-
TIME	Time since first dose	hours	numeric	-
DV	Dependent variable (concentration)	mg/L	numeric	-
…	…	…	…	…
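
It can help to cross-check the dictionary against the actual data columns so that nothing is left undocumented. This is a sketch under stated assumptions: the five field names follow the table above, `dictionary_problems` is a hypothetical helper, and the `DOSE` column in the demo is invented to trigger a warning.

```python
import csv
import io

EXPECTED_FIELDS = ["column_name", "description", "units", "type", "coding"]


def dictionary_problems(dictionary_text, data_columns):
    """Compare a data dictionary against the columns of a data file."""
    reader = csv.DictReader(io.StringIO(dictionary_text))
    problems = []
    if reader.fieldnames != EXPECTED_FIELDS:
        problems.append(f"unexpected dictionary header: {reader.fieldnames}")
    documented = [row.get("column_name") for row in reader]
    for col in data_columns:
        if col not in documented:
            problems.append(f"column not documented: {col}")
    for name in documented:
        if name not in data_columns:
            problems.append(f"documented but absent from data: {name}")
    return problems


dictionary_text = (
    "column_name,description,units,type,coding\n"
    "ID,Subject identifier,-,integer,-\n"
    "TIME,Time since first dose,hours,numeric,-\n"
    "DV,Dependent variable (concentration),mg/L,numeric,-\n"
)
print(dictionary_problems(dictionary_text, ["ID", "TIME", "DV", "DOSE"]))
# ['column not documented: DOSE']
```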

4. `metadata.yml`

Machine-readable metadata in YAML format:

name: dataset-name
title: Full Dataset Title
version: 1.0.0
date: 2025-10-16
authors:
  - name: Jane Doe
    affiliation: University Example
    email: jane.doe@example.com
  - name: John Smith
    affiliation: Pharma Corp
description: Brief description of the benchmark
keywords:
  - pharmacokinetics
  - dose-response
  - longitudinal
data_type: synthetic
therapeutic_area: oncology
n_subjects: 250
n_observations: 2500
tasks:
  - name: task1
    description: Model selection challenge
    metric: AIC
  - name: task2
    description: Prediction accuracy
    metric: RMSE
license: CC-BY-4.0
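
A small sanity check on the parsed metadata can catch omissions before review. The sketch below operates on a plain dictionary (for instance the result of loading the file with PyYAML's `yaml.safe_load`). It assumes that every top-level key in the example above is required and that each task needs `name`, `description`, and `metric`; this guide does not state those rules explicitly, so treat them as placeholders.

```python
# Assumption: all top-level keys from the example metadata are required.
REQUIRED_KEYS = {
    "name", "title", "version", "date", "authors", "description", "keywords",
    "data_type", "therapeutic_area", "n_subjects", "n_observations", "tasks",
    "license",
}
TASK_KEYS = {"name", "description", "metric"}


def is_positive_int(value):
    """True for integers greater than zero (booleans are rejected)."""
    return isinstance(value, int) and not isinstance(value, bool) and value > 0


def metadata_problems(meta):
    """Return a list of human-readable problems found in a metadata dict."""
    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - set(meta))]
    for key in ("n_subjects", "n_observations"):
        if key in meta and not is_positive_int(meta[key]):
            problems.append(f"{key} must be a positive integer")
    for i, task in enumerate(meta.get("tasks", [])):
        for key in sorted(TASK_KEYS - set(task)):
            problems.append(f"tasks[{i}] missing key: {key}")
    return problems


# Demo: the example metadata with `license` and one task metric removed.
example = {
    "name": "dataset-name",
    "title": "Full Dataset Title",
    "version": "1.0.0",
    "date": "2025-10-16",
    "authors": [{"name": "Jane Doe", "affiliation": "University Example"}],
    "description": "Brief description of the benchmark",
    "keywords": ["pharmacokinetics", "dose-response", "longitudinal"],
    "data_type": "synthetic",
    "therapeutic_area": "oncology",
    "n_subjects": 250,
    "n_observations": 2500,
    "tasks": [
        {"name": "task1", "description": "Model selection challenge", "metric": "AIC"},
        {"name": "task2", "description": "Prediction accuracy"},
    ],
}
print(metadata_problems(example))
# ['missing key: license', 'tasks[1] missing key: metric']
```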

Submission Process

Step 1: Prepare Your Benchmark

Create your benchmark following the structure above
Test that your documentation builds correctly with Quarto
Validate that your data files are properly formatted

Step 2: Submit a Pull Request

Fork this repository
Create a new branch: git checkout -b benchmark/<your-dataset-name>
Add your benchmark to benchmarks/<your-dataset-name>/
Commit your changes with a clear message
Push to your fork and submit a Pull Request

Use our Pull Request template, which guides you through the submission checklist.

Step 3: Peer Review

Your submission will undergo peer review:

Technical validation (automated checks)
Scientific review (expert evaluation)
Documentation quality assessment

Reviewers will provide feedback via PR comments. Please address all comments before final acceptance.

Step 4: Acceptance and Publication

Upon acceptance:

Your benchmark will be merged into the main repository
A DOI will be assigned
Your benchmark will appear on the website
You can cite it in publications

Tips for a Successful Submission

Start early: The review process may take several iterations
Be thorough: Complete documentation speeds up review
Test your data: Ensure files load correctly and contain expected information
Engage with reviewers: Respond promptly to feedback
Follow examples: Look at existing benchmarks for guidance

Questions?

If you have questions about the submission process, please contact us or open a discussion on GitHub.