Thesis
Evaluating the Effectiveness of Single Words, Chargrams, N-Grams, and Sentence Embeddings in Spam and Phishing Classification
Supervisor
Thesis: Bachelor’s/Master’s Thesis
This thesis compares the effectiveness of four text representation techniques for spam and phishing detection: single words, character n-grams (chargrams), word n-grams, and sentence embeddings. The goal is to identify the strengths and weaknesses of each method and to determine how they differ in classification performance.
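To make the difference between the representations concrete, the following is a minimal sketch in plain Python of how the first three feature types could be extracted from a message. The function names and the sample strings are illustrative assumptions, not part of the thesis specification. Sentence embeddings are omitted because they require a pretrained model.

```python
def word_unigrams(text):
    """Single-word features: lowercased, whitespace-separated tokens."""
    return text.lower().split()


def word_ngrams(text, n=2):
    """Word n-grams: every run of n consecutive tokens, joined by a space."""
    tokens = word_unigrams(text)
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def char_ngrams(text, n=3):
    """Character n-grams (chargrams): every run of n consecutive characters."""
    s = text.lower()
    return [s[i:i + n] for i in range(len(s) - n + 1)]


# Hypothetical sample message, chosen only for illustration.
sample = "Verify your account now"

print(word_unigrams(sample))   # single words
print(word_ngrams(sample, 2))  # word bigrams
print(char_ngrams(sample, 3))  # character trigrams

# Character n-grams can still overlap when a word is spelled in an
# obfuscated way, whereas the single-word features differ entirely.
shared = set(char_ngrams("free", 2)) & set(char_ngrams("fr3e", 2))
print(shared)
```

In an actual study, these feature lists would typically be turned into count or TF-IDF vectors before being passed to a classifier; that step is left out here to keep the sketch short.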
Prerequisites
Required
- Basic understanding of machine learning and artificial intelligence (completed the course Foundations of Artificial Intelligence)
- Familiarity with Natural Language Processing techniques, e.g., text classification and feature extraction techniques
- Proficiency in at least one programming language (preferably Python)
Optional
- You took the following courses:
  - Internettechnologies & Web Engineering
  - Advanced Methods of Machine Learning
  - Security in Communication Networks
- Familiarity with evaluation metrics for AI models
- Proficiency in using LaTeX