
Advanced Machine Learning Thesis Topics 2019-2020
Explore cutting-edge thesis topics in applied Data Mining/Machine Learning, such as word embeddings for source code, chatbots for social good, and analyzing social media influencers. Engage with Python, Deep Learning frameworks, and NLP for groundbreaking research in these areas.
Presentation Transcript
Survival Guide
You will need to read and understand many (I mean many) research papers.
You are expected to be independent and self-motivated, but also to report, update and teach me new things as well.
I will push you to do good work and provide the conditions needed for it. I won't pressure you with deadlines, etc.
You need to know how to program and to love (unconditionally) programming. You will program (and debug). Quite a lot.
You will need to be able to get things to work.
It will be mostly hard work, but you will be rewarded by the quality of the results (most of the time).
My research interests are in applied Data Mining/Machine Learning in the fields highlighted in the topics. I am open to working on other Data Mining/Machine Learning areas, but you have to convince me they are interesting.
Thesis Topics 2019-20
General info
o Each thesis has a short description with several different tasks (and perhaps different datasets)
o In the comments below the slide you can find some useful references/starting points
o Different goals might lead to different topics
General requirements (see slide #1 again)
o Unconditional love for programming
o A good level of machine learning/data mining understanding
o For language-related theses, assume extra overhead if you are not familiar with text mining
For an updated version of these slides and/or new topics check: https://dke.maastrichtuniversity.nl/jerry.spanakis/theses/
Word embeddings for source code
Word embeddings have been successfully applied to natural language (see word2vec) and also to other domains (e.g. item2vec for recommender systems). Your goal in this topic is to explore how techniques like word2vec can be applied to source code, and more specifically to how different students/users write code. The ultimate goal is to be able to provide more constructive feedback to students.
Competences that you need: Python and Deep Learning frameworks (PyTorch, TensorFlow/Keras or equivalent); Machine (Deep) Learning; Text Mining and Natural Language Processing.
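To make the source-code embedding idea a bit more concrete, here is a minimal sketch of the data-preparation step only: it lexes a Python snippet into tokens and builds the (target, context) pairs that a skip-gram model such as word2vec is trained on. It does not train any embeddings. The function names `code_tokens` and `skipgram_pairs`, the window size and the example submission are all assumptions made up for illustration; a real thesis would have to decide how to tokenise student code (raw tokens, AST nodes, normalised identifiers, etc.).

```python
import io
import tokenize

# Token types that carry layout rather than content; dropping them is an
# illustrative choice, not a requirement.
LAYOUT = {tokenize.NEWLINE, tokenize.NL, tokenize.INDENT, tokenize.DEDENT,
          tokenize.COMMENT, tokenize.ENDMARKER}

def code_tokens(source):
    """Lex Python source text into a flat list of token strings."""
    reader = io.StringIO(source).readline
    return [tok.string
            for tok in tokenize.generate_tokens(reader)
            if tok.type not in LAYOUT]

def skipgram_pairs(tokens, window=2):
    """Build (target, context) pairs: each token paired with its neighbours
    up to `window` positions away, as used for skip-gram training."""
    pairs = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs

# Hypothetical student submission, invented for this example.
submission = "def add(a, b):\n    return a + b\n"
tokens = code_tokens(submission)
pairs = skipgram_pairs(tokens, window=1)
print(tokens)
print(pairs[:3])
```

The resulting pairs could then be fed to any skip-gram implementation; whether raw lexer tokens are the right unit for comparing how students write code is exactly the kind of question the topic leaves open.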
Chatbots for Social Good
DKE is running an interdisciplinary project with the ultimate goal of building a chatbot that assists survivors of sexual harassment (#MeTooMaastricht). There is already a working prototype, but the goal now is to make it more interactive and more conversational, based on newly collected data.
Competences that you need: Python and Deep Learning frameworks (PyTorch, TensorFlow/Keras or equivalent); Machine (Deep) Learning; Text Mining and Natural Language Processing.
Social Media Influencers
Social media content (tweets, Facebook posts, Instagram posts) is unregulated, and from a legal perspective questions arise about when content needs to be regulated (e.g. when it contains an advertisement). Our goal is to build new datasets (e.g. by including influencer and non-influencer content) and then use ML models (e.g. classifiers) to identify which characteristics (images, text, etc.) indicate that a social media post should be regulated.
Competences that you need: Python and Deep Learning frameworks (PyTorch, TensorFlow/Keras or equivalent); Machine (Deep) Learning; Text Mining and Natural Language Processing.
This thesis is in collaboration with Maastricht Law & Tech Lab.
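As a rough illustration of the "classifier" part of the influencer topic, the sketch below trains a tiny multinomial Naive Bayes model (with add-one smoothing) on the text of posts and predicts whether a new post looks like an advertisement. Everything specific here is an assumption: the four example posts are invented, not real data, the labels `advertisement` and `other` are placeholders, and Naive Bayes is just a simple baseline chosen so the example fits in a few lines. The topic itself also mentions images and other characteristics, which this sketch ignores.

```python
import math
from collections import Counter

def words(text):
    """Very crude tokeniser: lowercase and split on whitespace."""
    return text.lower().split()

def train_nb(examples):
    """examples: list of (text, label). Returns (label_counts, word_counts, vocab)."""
    label_counts = Counter()
    word_counts = {}
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for w in words(text):
            counts[w] += 1
            vocab.add(w)
    return label_counts, word_counts, vocab

def predict_nb(model, text):
    """Return the label with the highest log-posterior; unseen words are skipped."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)  # add-one smoothing
        for w in words(text):
            if w in vocab:
                score += math.log((word_counts[label][w] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented toy posts, for illustration only.
posts = [
    ("use my code for a discount #ad", "advertisement"),
    ("thanks to the brand for this gift #sponsored", "advertisement"),
    ("lovely walk in the park today", "other"),
    ("coffee with friends this morning", "other"),
]
model = train_nb(posts)
print(predict_nb(model, "discount code from the brand"))
print(predict_nb(model, "walk in the park with friends"))
```

With a real dataset, the per-word counts in `word_counts` would also give a first, very crude view of which textual characteristics separate the two classes.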
Topic detection & tracking over time
Identifying trends and tracking their evolution in different forms of temporal data is a difficult task (both to perform and to evaluate), albeit a useful one (e.g. news tracking, publication trends, etc.). There is already work done on identifying storylines from large news corpora. The goal now would be to apply these or similar techniques to legal datasets and explore how topics have evolved over time.
Competences that you need: Python and Deep Learning frameworks (PyTorch, TensorFlow/Keras or equivalent); Machine (Deep) Learning; Text Mining and Natural Language Processing.
This thesis is in collaboration with Maastricht Law & Tech Lab.
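A very simple way to get a feel for "tracking over time" is to follow how large a share of the text a single term takes up in each time period. The sketch below does only that; it is a much cruder stand-in than the storyline or topic-model techniques the topic refers to. The helper names `term_share_by_period` and `trajectory`, and the three one-line "documents", are invented for illustration.

```python
from collections import Counter, defaultdict

def term_share_by_period(docs):
    """docs: iterable of (period, text).
    Returns {period: {term: share of that period's tokens}}."""
    counts = defaultdict(Counter)
    for period, text in docs:
        counts[period].update(text.lower().split())
    shares = {}
    for period, counter in counts.items():
        total = sum(counter.values())
        shares[period] = {term: n / total for term, n in counter.items()}
    return shares

def trajectory(shares, term):
    """Share of `term` in each period, in chronological order."""
    return [(period, shares[period].get(term, 0.0)) for period in sorted(shares)]

# Invented toy corpus, for illustration only.
docs = [
    (2017, "court ruling on contract dispute"),
    (2018, "privacy ruling on data dispute"),
    (2019, "privacy regulation and privacy ruling"),
]
shares = term_share_by_period(docs)
print(trajectory(shares, "privacy"))
# [(2017, 0.0), (2018, 0.2), (2019, 0.4)]
```

Replacing single terms with topics (e.g. clusters of related terms) and deciding how to evaluate the resulting trajectories is where the actual difficulty of the task lies.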
Detecting the biases of datasets
Recent advances in Machine Learning and Adversarial Training provide successful results in many promising fields. However, even if the datasets used seem to be unbiased (i.e. demographic information has been removed), models might still capture human biases in their internal structure, even when solving a completely different task (e.g. topic classification). In this project, you will explore if and how this bias can be detected (using e.g. the hidden states of neural networks) and explore techniques to remove it.
There is a variety of tasks you could work on:
o Sentiment analysis of tweets (and check whether there is a gender or age bias)
o Chatbot hidden state classification (and check whether there is a specific topic bias)
o Lyrics data (from a website with song lyrics), to see how gender (or something else) is represented
o Jodel data (from Maastricht compared to another city), to see how gender (or something else) is represented
Competences that you need: Python and Deep Learning frameworks (PyTorch, TensorFlow/Keras or equivalent); Machine (Deep) Learning; Text Mining and Natural Language Processing.
This thesis is in collaboration with Maastricht Law & Tech Lab.
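One common way to check whether hidden states encode a protected attribute is a "probe": train a deliberately simple classifier to predict the attribute from the hidden states and see whether it beats chance. The sketch below illustrates the idea with a nearest-centroid probe on synthetic vectors. No real model or dataset is involved: `make_states` fabricates Gaussian "hidden states" in which dimension 0 either leaks the group (shift 3.0) or does not (shift 0.0), and all names and numbers are assumptions chosen for the example.

```python
import random

def make_states(n, shift, rng, dim=8):
    """Synthetic stand-in for hidden states: Gaussian noise, with dimension 0
    moved by +shift for group 1 and -shift for group 0."""
    data = []
    for i in range(n):
        group = i % 2
        vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        vec[0] += shift if group == 1 else -shift
        data.append((vec, group))
    return data

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def probe_accuracy(train, test):
    """Fit one centroid per group on `train`, then classify each `test`
    vector by its nearest centroid and return the accuracy."""
    centroids = {g: centroid([v for v, lab in train if lab == g]) for g in (0, 1)}
    correct = sum(
        1 for v, lab in test
        if min(centroids, key=lambda g: sq_dist(v, centroids[g])) == lab
    )
    return correct / len(test)

rng = random.Random(0)
# Representation that leaks the group through dimension 0.
leaky = probe_accuracy(make_states(400, 3.0, rng), make_states(200, 3.0, rng))
# Control: no group signal at all, so the probe should be near chance (0.5).
clean = probe_accuracy(make_states(400, 0.0, rng), make_states(200, 0.0, rng))
print(f"probe accuracy, leaky states: {leaky:.2f}")
print(f"probe accuracy, clean states: {clean:.2f}")
```

In a thesis, the synthetic vectors would be replaced by hidden states extracted from a trained network, and a probe accuracy well above the control would suggest that the attribute is still recoverable even though it was removed from the input data.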