Our group's overarching research theme is trustworthy generative AI systems, with the following focus areas:
- Structured synthetic data generation
We study generative models for structured data such as tables and time series. Our work spans Generative Adversarial Networks (GANs), Large Language Models (LLMs) and Diffusion models, focusing on synthetic data quality, downstream utility, and privacy. We also explore federated and decentralized settings, which commonly arise in real-world collaborative scenarios.
- Watermarking generative models
Generative models are increasingly used to create synthetic data across domains, from text and images to tabular data and time series. However, the proliferation of AI-generated content raises critical concerns about model ownership, data provenance, and intellectual property protection. Watermarking techniques offer a solution by embedding verifiable signatures into generated content, enabling creators to prove authenticity and trace misuse while preserving its utility and quality.
- Privacy and security in AI
Federated Learning (FL) allows training to be performed in a distributed manner. However, it is vulnerable to model poisoning, inversion, and inference attacks under different adversarial assumptions. We study novel attacks, defenses, and mitigations.
- AI in Science
The semiconductor industry faces complex problems, such as root-cause analysis and predictive maintenance, that can benefit from collaboration between multiple parties. However, strict confidentiality requirements prevent such cooperation. We enable collaborative machine learning while protecting proprietary information.