Thesis
Synthetic Generation of Ham Emails for Improved Spam Classification Training
Supervisor
Thesis: Master’s Thesis
This thesis explores the use of synthetic data generation techniques (either generative AI or ‘classical’ techniques) to create legitimate (“ham”) emails while ensuring privacy compliance. The goal is to improve the training of spam detection models without exposing sensitive user data.
Prerequisites
Required
- Basic understanding of machine learning and artificial intelligence, e.g., autoencoders or GANs (having completed the course Foundations of Artificial Intelligence)
- Familiarity with Natural Language Processing techniques
- Proficiency in at least one programming language (preferably Python)
- Proficiency in using LaTeX
Optional
- Completion of the following courses:
  - Internettechnologies & Web Engineering
  - Advanced Methods of Machine Learning
  - Security in Communication Networks
- Familiarity with evaluation metrics for AI models
- Basic knowledge of privacy-preserving techniques
- Basic knowledge of principles related to spam and phishing