
HIPAA-Compliant AI: Prostate Cancer Radiation Dose Optimisation

Participants: 7
End Date: 12.11.27
Dataset: dmhk0x21
Resources: 2 CPU (8.59 GB) | 1 GPU (22.49 GB)
Compute: 0 / 100.00 PF
Submits: 0 / 5


Overview

About this use case: Twelve cancer centres, five different EHR systems, HIPAA on one side of the Atlantic and GDPR on the other — two years of collaboration meetings have produced two papers on data harmonisation and no shared model. tracebloc gives every centre a shared workspace where models train on 9,600 prostate cancer records and results land on one leaderboard, with no protected health information leaving any centre's infrastructure. Explore the data, submit your own model, and see how your approach compares.

Problem

Twelve cancer centres are collaborating on outcome prediction for prostate cancer radiation therapy. Every centre has different EHR systems, different imaging archives, different compute infrastructure, and different governance requirements — two are subject to HIPAA, five operate under GDPR, and the others have institution-specific data protection policies that prohibit raw PHI transfer to external parties. Half of every collaboration meeting goes to infrastructure alignment rather than science. HIPAA-compliant AI collaboration requires a shared environment where models run on each centre's own data — not a data pipeline project.

Solution

Dr. Marcus Webb, Head of Radiation Oncology Research at the coordinating centre, deploys a tracebloc workspace loaded with 9,600 prostate cancer patient records — structured EHR data, tumour staging, imaging summaries, dosimetry planning metrics, and document-derived embeddings from clinical notes and radiation reports, all within a single 319-feature multimodal dataset. Partner centres submit their radiation dose prediction models to the workspace. Inside tracebloc's containerised training environment, models train on the patient cohort — fine-tuning weights across demographics, staging, imaging, and DVH planning features — without protected health information leaving any centre's controlled environment. tracebloc orchestrates training, scores each model against the held-out evaluation cohort, and publishes results to a live leaderboard accessible to all twelve centres. This is federated learning applied to multi-centre scientific collaboration: the collaboration happens, the science advances, and no centre runs an IT integration project.

Outcome

In this example collaboration, models trained in the shared workspace show consistent performance across centres with different EHR structures — because the workspace standardises the execution environment regardless of each centre's underlying infrastructure. The best-performing model reduces MSE on the radiation dose prediction target by 31% relative to the internal single-centre baseline. The tracebloc workspace stays in place as the consortium grows and new centres join.

The Operational Challenge

Dr. Webb's consortium has been formally constituted for two years. Twelve leading cancer centres across the US and Europe signed a scientific collaboration agreement to develop AI-assisted radiation therapy optimisation for prostate cancer. The clinical rationale was compelling: prostate cancer is the most commonly diagnosed cancer in men, radiation therapy is the primary or adjuvant treatment for a majority of patients, and dose optimisation — matching the prescribed radiation dose to each patient's specific tumour profile, anatomy, and risk category — directly affects both tumour control rates and long-term toxicity outcomes. A model trained on tens of thousands of patients across multiple centres and treatment protocols would substantially outperform any single-centre approach.

Two years later, the consortium has produced two peer-reviewed papers on data harmonisation challenges and no shared predictive model.

The obstacle is not scientific disagreement. It is infrastructure heterogeneity. Five centres run Epic EHR. Three run Cerner. Two use proprietary oncology information systems. The radiation therapy planning data — DVH metrics, organ-at-risk dose constraints, planning technique — is stored in different formats across six commercial treatment planning systems. The imaging data spans three PACS vendors. The clinical documentation — surgery notes, radiation reports, pathology summaries — exists in incompatible formats across all twelve sites. Building a centralised healthcare data sharing pipeline that harmonises all twelve centres' data into a single structure would require a multi-year IT project, custom ETL development, and a governance framework that none of the centres' legal teams have been able to agree on.

HIPAA adds a specific layer of constraint for the five US-based centres. Protected health information — which includes any data that could identify a patient, including indirect identifiers — cannot be transferred across organisational boundaries without a formal Business Associate Agreement and explicit patient consent for research use. Three of the five US centres have research consent frameworks that cover institutional use but not inter-institutional transfer. Under the current governance framework, these centres cannot contribute patient data to a shared pool at all.

The scientific data involved is genuinely multimodal. A complete prostate cancer patient record for dose prediction includes structured clinical data (age, PSA, Gleason grade, T/N/M staging, risk group), imaging summaries (MRI PIRADS score, CT tumour volume, PSMA-PET SUVmax where available), treatment context (prior prostatectomy, androgen deprivation therapy timing, salvage vs. definitive RT), radiation planning metrics (prescribed total dose, dose per fraction, technique, organ-at-risk constraints), and unstructured documentation (treatment notes, radiation reports, surgical summaries) from which signal can be extracted via embeddings. The tracebloc evaluation dataset captures this complexity: 319 features across demographics, staging, imaging, planning metrics, DVH measurements, and 128-dimensional image embeddings, alongside text embeddings from clinical documentation.
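To make the feature groups concrete, one such patient record can be pictured as a single flat structure. The sketch below is purely illustrative: the field names and values are invented for this example and are not the dataset's actual column names.

```python
# Hypothetical sketch of one multimodal patient record, grouped the way the
# 319-feature vector is described above. Field names and values are
# illustrative only -- NOT the dataset's real schema.

def build_record():
    return {
        # structured clinical data
        "age": 67, "psa_baseline": 10.2, "gleason_grade": 7,
        "t_stage": "T2c", "risk_group": "intermediate",
        # imaging summaries (PSMA-PET may be absent -> 0.0)
        "mri_pirads": 4, "ct_tumour_volume_cc": 38.5, "psma_pet_suvmax": 0.0,
        # treatment context
        "prior_prostatectomy": False, "adt": True, "salvage_rt": False,
        # radiation planning metrics
        "prescribed_dose_gy": 76.0, "dose_per_fraction_gy": 2.0,
        # DVH organ-at-risk metrics
        "bladder_mean_gy": 34.9, "rectum_mean_gy": 40.1,
        # fixed-length embedding vectors derived from images and documents
        "image_embedding": [0.0] * 128,
        "text_embedding": [0.0] * 64,
    }

record = build_record()
```

In the actual dataset these groups are flattened into one 319-column feature vector plus the `radiation_dose_gy` target.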

A model that actually helps oncologists determine the most effective radiation dosage for individual patients — accounting for tumour biology, anatomical constraints, treatment history, and real-world evidence from comparable patient cohorts — has to have seen data from across that feature space. No single centre generates enough patient volume and feature diversity to reach that performance ceiling independently. The twelve centres together do.

The collaboration is stalled not because the science is wrong but because the infrastructure problem has consumed the collaboration's capacity. Secure health data infrastructure that removes the IT barrier is the prerequisite for the science to happen.

Stakeholders

  • Dr. Marcus Webb, Head of Radiation Oncology Research (Coordinating Centre): Consortium scientific lead. KPIs: MSE on dose prediction vs. single-centre baseline, time from data access to trained model, publication output per year. Accountable to the consortium steering committee and the funding agencies supporting the collaboration.
  • Chief Information Officers (Each Centre): Responsible for data security and infrastructure compliance. Key concern: no PHI leaves the centre's controlled environment under any scenario. A workspace where models come to the data — rather than data moving to a central server — satisfies this requirement without a new data transfer agreement.
  • HIPAA Compliance Officers (US Centres): US-based centres must operate within HIPAA safe harbour or expert determination frameworks. Any AI collaboration that processes PHI must ensure PHI does not leave the covered entity's environment. Containerised model execution with no data export path is within scope.
  • Head of Radiation Oncology (Each Centre): Clinical champion. KPIs: whether the resulting dose prediction model is clinically credible — calibrated on enough patients to support treatment decisions, validated across EHR types, and explainable to treating physicians and statisticians.
  • Research IT Lead (Each Centre): Responsible for compute infrastructure and workspace deployment. Needs the collaboration to work within existing compute environments — GPU allocation, security policies, network constraints — without a custom integration build.

The Underlying Dataset

The evaluation dataset comprises a training set of 9,600 prostate cancer patient records used for model fine-tuning and a separate held-out evaluation set used for scoring. Full dataset statistics, feature distributions, and modality breakdown are available in the Exploratory Data Analysis tab.

This dataset is augmented. It was constructed to reflect the statistical structure of real-world prostate cancer radiation therapy records — the PSA distribution, Gleason grade composition, dose ranges, organ-at-risk metric distributions, and document embedding structure — without containing any protected health information or identifiable patient data.
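As a rough illustration of what "augmented" means here, one can draw synthetic values from distributions matching the published summary statistics. This is only a sketch using the age and prescribed-dose figures reported in the clinical statistics table; the actual augmentation pipeline is not described in this document.

```python
import random

# Illustrative only: sample synthetic patient values whose distribution
# mirrors the published summary statistics (age mean 67.4 / std 8.0,
# clipped to 45-90; prescribed dose mean 76.1 Gy / std 3.7, clipped to
# 66-81 Gy). This is NOT the real augmentation procedure.

def synth_patient(rng):
    age = min(max(rng.gauss(67.4, 8.0), 45.0), 90.0)
    dose_gy = min(max(rng.gauss(76.1, 3.7), 66.0), 81.0)
    return {"age_at_rt_start": round(age, 1), "prescribed_dose_gy": round(dose_gy, 1)}

rng = random.Random(42)          # fixed seed for reproducibility
cohort = [synth_patient(rng) for _ in range(1000)]
```

Real augmentation also has to preserve joint structure (e.g. the correlation between risk group and dose), which independent sampling like this does not capture.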

Property | Value
Training samples | 9,600
Features | 319 (+ radiation_dose_gy target)
Demographics / baseline | 8 features (age, etc.)
Tumour risk and staging | 14 features (PSA, Gleason, T/N/M staging, risk group)
Imaging summaries | 6 features (MRI PIRADS, CT volume, PSMA-PET SUVmax)
Treatment context | 7 features (prior prostatectomy, ADT, boost, pelvic nodes, salvage RT)
Radiation planning metrics | 12 features (prescribed dose, fractionation, technique)
DVH organ-at-risk metrics | 7 features (bladder mean, rectum mean, femoral head mean)
Document signal features | 8 features (document count, note token count, derived metrics)
Image embeddings | 128 dimensions
Text embeddings | Remaining features
Target | radiation_dose_gy — continuous (mean 76.13 Gy, range 66–81 Gy)
Evaluation metric | MSE

Key clinical statistics from the dataset:

Feature | Mean | Std | Range
Age at RT start | 67.4 years | 8.0 | 45–90
PSA baseline | 10.2 ng/mL | 9.6 | 0.2–149.8
Prescribed total dose | 76.1 Gy | 3.7 | 66–81
Dose per fraction | 2.1 Gy | 0.3 | 1.5–2.9
Bladder mean dose | 34.9 Gy | — | 9.6–60.3
Rectum mean dose | 40.1 Gy | — | 14.3–65.5

A note on PSMA-PET data: 5,813 patients (60.5%) show zero SUVmax, indicating absence of PSMA-PET imaging — which is typical in real-world prostate cancer datasets where PSMA-PET availability varies by centre, patient risk group, and year of treatment. Models that handle missing imaging modalities gracefully will outperform those that assume complete multimodal coverage.
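One common way to handle a missing imaging modality gracefully is to pair the optional feature with an explicit missingness indicator, so a zero cannot masquerade as a measurement. A minimal sketch, assuming zero SUVmax encodes "no PSMA-PET performed" (the helper name and the imputation value are illustrative, not part of the dataset):

```python
# Replace a zero SUVmax (meaning "no PSMA-PET available") with an explicit
# missingness flag plus an imputed value, so a model can distinguish
# "not imaged" from "imaged, low uptake". The cohort median used for
# imputation here (6.2) is an invented placeholder.

def encode_suvmax(suvmax_raw, cohort_median=6.2):
    """Return (value, has_psma_pet): imputed value + indicator when missing."""
    if suvmax_raw == 0.0:          # 0 encodes "no PSMA-PET performed"
        return cohort_median, 0.0  # imputed value, indicator off
    return suvmax_raw, 1.0         # observed value, indicator on

print(encode_suvmax(0.0))   # -> (6.2, 0.0)
print(encode_suvmax(11.4))  # -> (11.4, 1.0)
```

Tree-based models can often exploit the indicator directly; linear models additionally benefit from the median imputation.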

How Evaluation Works

Each partner centre submitted their radiation dose prediction model to the tracebloc workspace. The evaluation ran in two phases.

Phase 1 — Single-centre baseline. Each centre's model was benchmarked as originally trained, using only its own patient cohort, with no access to the cross-centre dataset. This establishes the honest single-centre baseline: what each centre can achieve with its own patient volume and feature coverage — and how much improvement federated training on the 9,600-patient cross-centre dataset adds.

Phase 2 — Fine-tuning on shared workspace. Partner centres were given access to the training environment inside the tracebloc workspace. Each centre transferred their model into the workspace and ran training on the 9,600-patient dataset. This fine-tuned model weights across the full multimodal feature space — demographics, staging, imaging, planning metrics, DVH constraints, and document embeddings — adapting from a model calibrated on one centre's patient mix to one trained on the complete cross-centre distribution. After training, the adapted model was evaluated automatically against the held-out cohort. Protected health information never left any centre's controlled environment. Each centre received only their own results; no centre had visibility into another's training runs or scores before the leaderboard published.
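Mechanically, Phase 2 amounts to: load pre-trained weights, continue gradient updates on the workspace dataset, then score only on the holdout. The toy sketch below illustrates that pattern with a one-feature linear model and synthetic numbers; it is not the workspace's actual training API, which this document does not show.

```python
# Toy illustration of "fine-tune, then evaluate on a held-out cohort":
# a linear model pre-trained elsewhere continues gradient-descent updates
# on new (synthetic) data, and only the holdout MSE is reported.

def mse(w, b, data):
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def fine_tune(w, b, train, lr=0.01, epochs=500):
    n = len(train)
    for _ in range(epochs):
        gw = sum(2 * (w * x + b - y) * x for x, y in train) / n
        gb = sum(2 * (w * x + b - y) for x, y in train) / n
        w, b = w - lr * gw, b - lr * gb
    return w, b

# "pre-trained" single-centre weights; the new cohort follows y = 2x + 1
w0, b0 = 1.0, 0.0
train   = [(x / 10, 2 * (x / 10) + 1) for x in range(20)]
holdout = [(x / 10 + 0.05, 2 * (x / 10 + 0.05) + 1) for x in range(20)]

w1, b1 = fine_tune(w0, b0, train)
assert mse(w1, b1, holdout) < mse(w0, b0, holdout)  # fine-tuning helped
```

The key property mirrored here is that the holdout is never touched during training, which is what makes the leaderboard score an honest estimate.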

Each contributor received:

  • Training access: 9,600 prostate cancer patient records (319 features, multimodal) for model fine-tuning inside the workspace
  • Evaluation environment: Sandboxed execution — adapted models run against the evaluation set, no PHI export path available
  • Metrics tracked: MSE on radiation_dose_gy, performance breakdown by risk group (low, intermediate, high, very high), and feature importance outputs for clinical review and statistician validation
  • Modality ablation: Separate evaluation tracks for structured-only models (no embeddings) vs. full multimodal models (structured + image + text embeddings) — enabling fair comparison across architecturally different approaches and centres with different document data availability
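The headline metric and the per-risk-group breakdown are straightforward to compute. A minimal sketch with made-up predictions and targets, using the four risk-group labels listed above:

```python
from collections import defaultdict

# Overall MSE on radiation_dose_gy plus a per-risk-group breakdown,
# mirroring the tracked metrics. Predictions and targets are made up.

def grouped_mse(preds, targets, groups):
    overall = sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)
    by_group = defaultdict(list)
    for p, t, g in zip(preds, targets, groups):
        by_group[g].append((p - t) ** 2)
    return overall, {g: sum(v) / len(v) for g, v in by_group.items()}

preds   = [74.0, 78.5, 70.2, 80.1]
targets = [76.0, 78.0, 72.0, 78.0]
groups  = ["low", "high", "intermediate", "high"]

overall, per_group = grouped_mse(preds, targets, groups)
```

With this tiny example the overall MSE is about 2.98, with the "high" group contributing roughly 2.33.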

Results

→ View the full model leaderboard — complete centre rankings, MSE by risk group, and modality contribution analysis across all submissions.

Centre | Single-Centre MSE | After Workspace Fine-tuning | High-Risk MSE | Modality
Centre A | 8.41 | 6.12 | 9.84 | Structured only
Centre B | 7.93 | 5.78 | 8.62 | Structured + image
Centre C ✅ | 7.64 | 5.31 | 7.18 | Full multimodal
Centre D | 8.87 | 6.34 | 10.21 | Structured only
Centre E | 9.12 | 6.71 | 11.03 | Structured + text
Centre F–L | 8.4–10.6 | 5.9–7.8 | 8.5–13.4 | Various

What the numbers reveal:

Every centre improved through workspace fine-tuning on the 9,600-patient cross-centre dataset. The gains are consistent and large: the weakest improvement (Centre A) reduced MSE by 27%, from 8.41 to 6.12. The strongest improvement (Centre C) reduced MSE by 31%, from 7.64 to 5.31. These are not marginal refinements — they are the direct quantification of how much signal was being left unreachable at single-centre patient volumes.
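The quoted improvements follow directly from the leaderboard figures, with relative MSE reduction defined as (before - after) / before:

```python
# Relative MSE reduction from the leaderboard figures:
# Centre A: 8.41 -> 6.12, Centre C: 7.64 -> 5.31.

def reduction(before, after):
    return (before - after) / before

r_a = reduction(8.41, 6.12)  # ~0.27, i.e. roughly 27%
r_c = reduction(7.64, 5.31)  # ~0.30, i.e. roughly 31%
```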

Centre C's full multimodal architecture achieves the strongest results overall, and its high-risk patient MSE of 7.18 is particularly important clinically: high-risk prostate cancer patients are the cohort where dose optimisation most directly affects tumour control. The combination of structured EHR staging data, imaging summary features, and document-derived embeddings from radiation reports and surgical notes captures signal that structured data alone misses. The 128-dimensional image embeddings and text embeddings contribute measurable predictive information beyond the structured clinical features.

Centres using structured data only (Centre A, Centre D) show consistent underperformance relative to multimodal approaches, particularly on high-risk patients where imaging and document context is most informative. This finding has a direct implication for the consortium's companion development programme: investing in imaging and document embedding infrastructure at all twelve centres is scientifically justified by the MSE improvement.

Business Impact

Illustrative assumptions:

  • 12 consortium centres
  • 800 new prostate cancer patients per centre per year (9,600 total consortium volume)
  • €4,200 cost per patient episode where dose sub-optimisation leads to avoidable toxicity or retreatment
  • IT integration project cost for centralised data pooling approach: estimated €2.8M over 3 years across all centres

Approach | Infrastructure Cost | PHI Compliance | MSE (High Risk) | Avoidable Toxicity Events (Year 1)
No collaboration (single-centre) | €0 | Compliant | 9.12 avg | Baseline
Centralised data pooling | €2.8M (IT project) | Risk — requires all centres' DPA | 5.31 target | TBD (project delay)
tracebloc shared workspace ✅ | €180K/year (workspace) | Compliant — PHI stays local | 5.31 achieved | Estimated 340 fewer events/year

The tracebloc workspace delivers the same predictive performance as centralised data pooling — the 9,600-patient training set and the resulting MSE are identical — at a fraction of the infrastructure cost and without requiring a data transfer agreement that five of the twelve centres cannot legally sign. The €2.8M IT project cost is replaced by €180K/year in workspace access fees. The two-year governance delay is replaced by a deployment timeline measured in weeks.
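Under the stated illustrative assumptions, the cost comparison works out as follows (the 340-events figure and the €4,200 per-episode cost are taken from the assumptions above, not derived here):

```python
# Three-year cost comparison under the illustrative assumptions:
# centralised pooling is a EUR 2.8M IT project over 3 years; the workspace
# costs EUR 180K/year. Avoided toxicity cost uses the stated estimate of
# 340 fewer events/year at EUR 4,200 per episode.

pooling_3yr = 2_800_000
workspace_3yr = 180_000 * 3          # 540,000 over three years

avoided_per_year = 340 * 4_200       # 1,428,000 per year

savings_vs_pooling = pooling_3yr - workspace_3yr  # 2,260,000 over 3 years
```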

Decision

Dr. Webb's consortium standardises on Centre C's full multimodal architecture as the shared radiation dose prediction model. Clinical review by the consortium's oncology leads confirms that the model's output on high-risk patients — MSE 7.18 on a dose range of 66–81 Gy — is within the range that supports decision support use: the model's predictions are presented alongside treating physician review rather than replacing clinical judgement.

A formal model governance protocol is established: predictions above a confidence threshold are flagged for statistician review before any treatment planning integration. The tracebloc workspace logs every submission, every metric, and every training run — creating the audit trail that both HIPAA compliance and clinical governance require. Oncologists at each centre can review model outputs without accessing another centre's patient records.

The workspace stays active after the initial evaluation. As each centre's annual patient cohort grows, the shared training dataset expands and models are retrained quarterly. New centres joining the consortium submit their models to the same workspace under the same terms — no new IT integration, no new data transfer agreement negotiation. The leaderboard becomes a live record of how dose prediction performance evolves as the consortium scales.

Explore this use case further:

  • View the model leaderboard — full centre rankings, MSE by risk group, modality contribution analysis
  • Explore the dataset — patient cohort statistics, feature distributions, imaging modality coverage
  • Start training — submit your own radiation dose prediction model to this evaluation

Related use cases: See how the same HIPAA-compliant collaboration approach applies to breast cancer screening and image classification, and to retinal disease classification. For a broader view of federated learning applications across healthcare and clinical research, see our federated learning applications guide.

Deploy your workspace or schedule a call.

Disclaimer

Disclaimer: The dataset used in this use case is augmented — designed to reflect the statistical structure of real-world prostate cancer radiation therapy records, including PSA distributions, Gleason grade composition, dose ranges, organ-at-risk metric distributions, and document embedding structure, without containing any protected health information or identifiable patient data. The persona, centre labels, performance figures, business impact assumptions, and consortium scenario are illustrative and based on patterns observed across multi-centre oncology research environments. They do not represent any specific institution, centre, clinical trial, or collaboration agreement.