Research

Ongoing Projects

Explore our current research initiatives advancing the intersection of data science and digital health.

CTRAI2
Planning
Clinical trials, Risk assessment, Artificial intelligence, Large language models, Multi-task, Multimodal
AI-based Risk Assessment for Clinical Trials on Medicinal Products: A Large-Scale and Integrative Approach

This project proposes a unified AI approach to clinical trial risk assessment by jointly modeling safety, efficacy, and operational risks using clinical trial protocols and results. It will develop large-scale benchmark datasets and multi-task deep learning models based on transformers and graph neural networks, and evaluate robustness through retrospective analyses and prospective testing on ongoing trials.

April 2026 – April 2030

Swiss National Science Foundation (SNSF)

Key Objectives

  • Create and label a collection of clinical trial protocols and results to support multi-dimensional clinical trial risk assessment.

  • Develop and validate predictive approaches to estimate multiple dimensions of clinical trial risk.

  • Run a prospective evaluation on selected ongoing clinical trials.

WHO - SFM
Active
Clinical Corpora, Semi-structured datasets, NLP, Machine Learning

Automating the triage of incident reports to the WHO’s GSMS platform using Machine Learning

The WHO operates the Global Surveillance and Monitoring System (GSMS), a platform through which member states report incidents related to Substandard or Falsified Medications (SFM). To respond to these incident reports in a timely manner, including in emergency situations, the WHO needs to prioritize and analyze them. This project uses Machine Learning to help WHO specialists analyze those incidents.

October 2025 – September 2027

In collaboration with WHO’s ISF team

Key Objectives

  • Provide an ML model (prototyping, development, testing, deployment, integration) capable of classifying incoming incidents into 3 priority categories.

  • Provide a natural language summary of incidents.

GESICA
Active
Patient trajectory prediction, Longitudinal data, Infection risk prediction, Transplant patients
Gestion des situations de crises sanitaires

GESICA addresses the growing complexity and simultaneity of exceptional health situations (EHS), such as epidemics, climate-related crises, and other major threats, which increasingly challenge the resilience of healthcare systems. The project aims to design an intelligent decision-support system capable of aggregating heterogeneous data sources to detect weak signals of emerging EHS, provide early warnings, and support anticipation and preparedness. Beyond detection, the system will propose and evaluate management scenarios and optimize healthcare resource allocation at local, regional, and cross-border scales within the Franco-Swiss context. By fostering collaboration between research institutions and public health authorities, GESICA seeks to improve crisis response, enhance care quality, and reduce system strain. Ultimately, the project aims to strengthen cross-border public health coordination and position the region as a center of excellence for managing health crises linked to epidemics and climate change.

June 2025 – August 2027

Interreg Europe

Key Objectives

  • Design and deploy an intelligent, cross-border decision-support system for exceptional health situations (EHS), capable of detecting weak signals of emerging crises through the aggregation and analysis of heterogeneous data sources.

Secondary Objectives

  • Enable early warning and situational awareness for public health authorities by integrating epidemiological, clinical, organizational, and contextual data.

  • Support anticipation and preparedness by modeling the evolution of EHS and assessing their potential impact on healthcare systems.

  • Propose and compare management scenarios at local, regional, and cross-border scales, with a focus on optimizing healthcare resources and continuity of care.

AIIDKIT
Active
Patient trajectory prediction, Longitudinal data, Infection risk prediction, Transplant patients
Artificial Intelligence for Improved Infectious Diseases Outcomes in Kidney Transplant Recipients

Kidney transplant recipients face a persistent and dynamic risk of serious infections due to lifelong immunosuppression. This challenges traditional fixed-timeline risk assessments and highlights the urgent need for personalized infection prevention tools. The AIIDKIT project aims to meet this need by developing AI-based clinical decision-support tools that use Large Language Models and Graph Neural Networks, trained on complex, longitudinal patient data from the Swiss Transplant Cohort Study (STCS), to create numerical embeddings for dynamic, patient-specific infection risk prediction.

May 2025 – April 2029

Swiss National Science Foundation (SNSF)

Key Objectives

  • Develop an AI-based clinical decision-support tool for dynamic, patient-specific infection risk prediction in kidney transplant recipients.

Secondary Objectives

  • Expand to finer-grained outcomes (e.g., graft loss, death, infection types such as bacterial, viral, and fungal, and resistance profiles) and to other organs.

  • Provide model interpretability to identify key clinical risk factors for future randomized controlled trials.

AIDosE
Active
Clinical Trials, Longitudinal data, Dosing Errors, Medication Errors, Natural Language Processing, Machine Learning, Clinical Corpora, Semi-structured datasets
Artificial intelligence methods to estimate and predict dosing errors in interventional clinical research

Dosing errors have a significant negative impact on pharmaceutical R&D, increasing costs and delaying the launch of new drugs. We will leverage multimodal dosing-related information to identify and predict dosing errors for investigational drugs using machine learning.

February 2025 – January 2027

Innosuisse

Key Objectives

  • Establish a high-quality dataset (clean, reproducible, repeatable, open source, content-rich, multi-modal, multi-source) related to dosing/medication error events in clinical research.

  • Organize a public open challenge aimed at ML, NLP, and health NLP researchers and practitioners, with international contributions.

  • Develop ML/NLP methods to predict dosing errors from multi-modal clinical trial data.

DHSM
Active
Digital Health, Literature review Trials, LLMs, NLP
Digital Health Science Map

The Digital Health Science Map is an automated knowledge platform that continuously collects, curates, and structures the global scientific literature on digital health. By retrieving publications from major public databases, filtering out irrelevant content, and applying advanced post-processing, the Science Map creates a high-quality, searchable corpus dedicated to digital health. The platform enables researchers, policymakers, and practitioners to explore trends, conduct efficient literature reviews, and ask targeted questions through an AI-powered chatbot using retrieval-augmented generation (RAG), turning the digital health evidence base into an accessible and actionable resource.

January 2023 – December 2026

Swiss Agency for Development and Cooperation (SDC)

Key Objectives

  • Automate and centralize digital health evidence by continuously collecting, filtering, and structuring scientific literature from multiple public sources into a single, up‑to‑date database.

  • Improve access to and use of knowledge by enabling advanced search, trend analysis, and efficient literature reviews tailored to the digital health domain.

  • Support evidence‑informed research and decision‑making through AI‑assisted question answering and exploration of the literature using retrieval‑augmented generation (RAG).

CTxAI
Completed
Clinical trials, Protocol design, Risk assessment
CTxAI: Quality by design of clinical studies using explainable AI

This project develops data-driven tools to identify and assess risk within key clinical trial components, including eligibility criteria, outcomes, and intervention safety. It advances methods that use clustering to structure eligibility criteria in support of protocol design, and produces benchmark datasets and models for clinical research claim verification and adverse drug event prediction.

September 2022 – September 2024

Innosuisse

Key Objectives

  • Develop data-driven methods to assess risk in eligibility criteria, outcomes, and intervention safety.

  • Develop and validate predictive approaches to estimate multiple dimensions of clinical trial risk.

  • Create benchmark resources for clinical research claim verification and adverse drug event prediction.

NLU4EHR
Completed
Biomedical concept normalization, Electronic health records, UMLS, Semantic interoperability, Multilingual NLP, Clinical text mining, Entity linking
NLU4EHR: Natural Language Understanding for Electronic Health Records Analytics

NLU4EHR addresses semantic interoperability by turning free-text clinical documentation into standardized, machine-readable concepts for downstream analytics. We detect clinically relevant passages in clinical text and map them to UMLS concepts within a chosen target ontology (e.g., SNOMED CT, ATC, ICD). Our multilingual retrieve-then-re-rank pipeline combines BM25 with dense embeddings from discriminative and generative LLMs, plus fusion and re-ranking, to robustly normalize terms across five languages.

January 2022 – March 2025

Innosuisse

Key Objectives

  • Normalize clinical mentions to standard concepts: Map extracted passages from EHR text to UMLS CUIs while targeting a specified ontology (e.g., SNOMED CT, ATC, ICD, LOINC).

  • Support multilingual clinical data: Deliver robust concept normalization across English, French, German, Spanish, and Turkish.

  • Build and evaluate a modular pipeline: Combine sparse retrieval (BM25), dense retrieval (LLM embeddings), re-ranking, and fusion to maximize accuracy and robustness.
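
The fusion step of such a retrieve-then-re-rank pipeline can be illustrated with a small sketch. The project description does not say which fusion method is used, so reciprocal rank fusion (RRF) is assumed here purely for illustration. The function name, the constant `k=60`, and the candidate identifiers are hypothetical placeholders rather than real UMLS CUIs or project code.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked candidate lists into one ranking.

    Each candidate receives 1 / (k + rank) from every list it appears in,
    so concepts ranked highly by several retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, concept_id in enumerate(ranking, start=1):
            scores[concept_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by identifier for determinism.
    return sorted(scores, key=lambda c: (-scores[c], c))


# Hypothetical candidates for one clinical mention (placeholder identifiers).
bm25_candidates = ["cui_a", "cui_b", "cui_c"]   # sparse retrieval (BM25)
dense_candidates = ["cui_b", "cui_d", "cui_a"]  # dense retrieval (embeddings)

fused = reciprocal_rank_fusion([bm25_candidates, dense_candidates])
print(fused)
```

In this sketch, candidates returned by both retrievers outrank those returned by only one. A re-ranking model would then typically rescore the top of the fused list.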

CHEM::AI
Completed
Artificial Intelligence, Deep Learning, Cheminformatics, Machine Learning, Drug Discovery, Chemical Space Exploration
CHEM::AI - Predicting and Exploring Novel Chemical Spaces Using Artificial Intelligence

We built on and extended recent advances in deep learning and cheminformatics to provide effective solutions for the virtual synthesis of new molecules. We addressed research challenges related to molecule representation, active learning for reaction models, and exploration of chemical spaces. In collaboration with SpiroChem, the project developed an AI platform to increase chemistry innovation. The project enables more efficient drug discovery, reducing costs and time to market for new pharmaceuticals.

October 2020 – April 2023

Innosuisse

Key Objectives

  • Develop predictive and generative models for chemical reactions demonstrating high accuracy, at least 85% for the top-1 products.

  • Create a molecule and reaction database integrating public and private sources with more than 100 million molecules and more than 1 million chemical reactions.

  • Engineer an active learning system that incrementally absorbs new knowledge, adding more than 1000 new reactions per year to a collective knowledge database.

Secondary Objectives

  • Increase the productivity and efficiency of SpiroChem, reducing the number of experimental synthetic steps required.

  • Promote the transfer of knowledge to the industrial partner (SpiroChem) and disseminate research findings.

Risklick
Completed
Clinical trials, semi-structured data, graph representations, clinical language models, information retrieval
Risklick: Maximizing Likelihood of Success for Clinical Trials

To improve the success rate of clinical trials, we automate the understanding and extraction of risk components from semi-structured, enriched clinical protocols. We then predict the success of clinical trials from the graph structures uncovered in protocol models.

January 2020 – October 2022

Innosuisse

Key Objectives

  • Ingest clinical trial (CT) data and enrich it with other relevant artefacts, such as historical modifications to the CT protocols or the associated publications.

  • Establish information retrieval indices for similarity search within CT-related data.

  • Develop ML models that leverage Graph Neural Networks (GNNs) to model the hierarchical structure of CTs, together with transformer language models pre-trained on clinical corpora for knowledge transfer and fine-tuning on CT-specific resources.
