
HIPAA compliant AI collaboration for Cancer Research
Participants
5
End Date
03.12.26
Dataset
dmhk0x21
Resources
2 CPU (8.59 GB) | 1 GPU (22.49 GB)
Compute
0 / 0 F
Submits
0/5
Overview

Organizations increasingly collaborate with specialized AI/ML vendors who possess deep expertise in specific model architectures or deliver the best embeddings. These partnerships become essential when tackling complex, multimodal problems that require specialized capabilities not available in-house.
This poses a particular challenge for international organizations headquartered in the US, because HIPAA restricts how protected health information (PHI) can be shared with outside parties, including across borders. Yet external partners still need a way to run, test, and improve models on real data; otherwise evaluation is not meaningful.
Build a model that helps oncologists determine the most effective radiation dosage for individual cancer patients based on their specific patient journey and clinical profile. The goal is to leverage real-world evidence across comparable patient cohorts to support treatment decisions with meaningful clinical impact.
Basic algorithmic approaches using single-source data are already widely established in clinical practice. However, modern healthcare generates multimodal data across entire patient journeys: imaging studies, demographics, structured EHR records, treatment plans, radiation therapy reports, surgical notes, and unstructured clinician documentation.
By reframing this as a big data problem and taking a more holistic view, one can unlock significantly better outcomes. The approach involves extracting valuable signals from large document stores (surgery documents, radiation reports, patient characteristics, and so on) and using embeddings to build a unified semantic view of each patient's complete journey across all data sources. From this comprehensive view, a model can be trained to predict the optimal radiation dosage for a cohort and a specific patient profile. The model's predictions can then be checked by statisticians and reviewed by clinicians.
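As a rough illustration of the "unified view" idea, the sketch below concatenates normalized per-modality embeddings into one patient vector and estimates a dose by averaging the doses of the most similar patients in a cohort. This is a deliberately simple nearest-neighbor stand-in, not the trained model described above. The function names (`patient_vector`, `predict_dose`), the modality keys, and all embeddings and dose values are hypothetical and made up for the example.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length so no single modality dominates."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def patient_vector(modality_embeddings):
    """Concatenate normalized per-modality embeddings in a fixed (sorted) order."""
    out = []
    for name in sorted(modality_embeddings):
        out.extend(l2_normalize(modality_embeddings[name]))
    return out

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0

def predict_dose(query, cohort, k=2):
    """Average the dose of the k cohort patients most similar to the query."""
    ranked = sorted(cohort, key=lambda rec: cosine(query, rec[0]), reverse=True)
    top = ranked[:k]
    return sum(dose for _, dose in top) / len(top)

# Toy cohort: (patient vector, delivered dose). All values are invented.
cohort = [
    (patient_vector({"notes": [1.0, 0.0], "report": [1.0, 0.0]}), 70.0),
    (patient_vector({"notes": [0.9, 0.1], "report": [1.0, 0.1]}), 72.0),
    (patient_vector({"notes": [0.0, 1.0], "report": [0.0, 1.0]}), 60.0),
]
query = patient_vector({"notes": [1.0, 0.05], "report": [0.95, 0.0]})
estimate = predict_dose(query, cohort, k=2)
print(estimate)
```

In practice the embeddings would come from modality-specific encoders and the averaging step would be replaced by a trained regressor; the sketch only shows how separate sources can be merged into a single comparable representation.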
Using tracebloc, one can keep patient data within the organization's secure environment and instead bring vendor models to the data. No PHI leaves the infrastructure at any point.
Vendors access the secure compute environment through the tracebloc platform. They can submit model code, run training and evaluation on the full multimodal dataset, fine-tune their approaches, and iterate rapidly, all without seeing or accessing patient records directly. Model weights and outputs remain under the data owner's control, inside the data owner's infrastructure, throughout the entire development process.
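The general "bring the model to the data" pattern can be sketched as follows. This is not tracebloc's API; `evaluate_locally`, `vendor_model`, and the records are hypothetical names and made-up values, used only to show that the vendor's code runs next to the data while only aggregate metrics are returned.

```python
def evaluate_locally(predict_fn, records):
    """Run a vendor-supplied predictor on local records; return aggregates only.

    Individual records and per-record errors never leave this function.
    """
    errors = [abs(predict_fn(features) - dose) for features, dose in records]
    return {"n": len(errors), "mae": sum(errors) / len(errors)}

# Records held by the data owner: (features, delivered dose). Values are invented.
local_records = [([1.0, 2.0], 70.0), ([2.0, 1.0], 66.0), ([0.0, 0.0], 60.0)]

def vendor_model(features):
    """Stand-in for model code submitted by an external vendor."""
    return 60.0 + 2.0 * features[0] + 4.0 * features[1]

report = evaluate_locally(vendor_model, local_records)
print(report)
```

The vendor sees only the returned summary (sample count and mean absolute error), which is enough to compare model iterations without exposing patient-level data.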
An example collaboration was conducted on a multimodal prostate cancer dataset to predict optimal radiation doses based on comprehensive patient characteristics and tumor profiles across multiple data sources.
Get specifics about how to:
- use tracebloc for research collaborations on sensitive EHR data
- get model benchmarks from external partners while staying compliant