
Advanced Machine Learning Research Opportunities in Data Science
Explore cutting-edge research opportunities in applied Data Science and Machine Learning. Dive into topics like ML in Arts and Graphics, longitudinal data analysis, and more. Develop deep learning models, understand research papers, and program extensively to achieve quality results. Embrace challenges with unconditional love for programming and an independent, self-motivated approach.
Presentation Transcript
Survival Guide
- You will need to read and understand many (I mean many) research papers.
- You are expected to be independent and self-motivated, but also to report, update and teach me new things as well.
- I will push you to do good work and provide the conditions needed for it. I won't pressure you with deadlines, etc.
- You need to know how to program and love (unconditionally) to program. You will program (and debug). Quite a lot.
- You will need to be able to get things to work.
- It will mostly be hard work, but you will be rewarded by the quality of the results (most of the time).
- My research interests are in applied Data Mining/Machine Learning in the fields highlighted in the topics. I am open to working on other Data Mining/Machine Learning areas, but you have to convince me they are interesting.
Thesis Topics 2018-19
General info
- Each thesis has a short description with several different tasks (and perhaps different datasets)
- In the comments below the slide you can find some useful references/starting points
- Different goals might lead to different topics
General requirements (see slide #1 again)
- Unconditional love for programming
- A good level of understanding of machine learning/data mining
- For language-related theses, assume extra overhead if you are not familiar with text mining
For an updated version of these slides and/or new topics check: https://dke.maastrichtuniversity.nl/jerry.spanakis/theses/
ML in Arts and Graphics
DKE is participating in an interdisciplinary project on applying Artificial Intelligence techniques to the arts. More specifically, we are working on three sub-tasks:
- Classifying visual compositions (black & white shapes) as harmonic or not
- Studying the relation between visual and audio compositions, and how a computer can generate and convert between the two
- Automatically generating logos based on their color (this has already been attempted) and some keywords
Competences that you need:
- Python and Deep Learning frameworks (PyTorch, TensorFlow/Keras or equivalent)
- Machine (Deep) learning
- Artistic nature and interdisciplinary qualities
ML in Longitudinal Data
Mobile applications offer rich datasets for analysis. However, these datasets are hierarchical and longitudinal (i.e. multiple variables, different people, different timestamps, etc.). So far, deep learning has not been as successful on this kind of data as in other fields. Your task(s) would be to explore current models that are successful (e.g. mixed effects models) and then investigate how you could apply "shallow" neural models (such as convolutional or recurrent neural networks) to such problems. You will use publicly available data and data available to DKE.
Competences that you need:
- Python or R
- Machine (Deep) learning
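As a rough illustration of what "hierarchical and longitudinal" means in practice, the sketch below groups long-format rows into one time-ordered, padded sequence per person, which is the kind of input a recurrent model typically expects. The `(person_id, timestamp, value)` schema, the function name `to_sequences` and the toy records are all assumptions made for this example; they are not taken from the project data.

```python
from collections import defaultdict

def to_sequences(records, max_len):
    """Group long-format rows into one time-ordered, padded sequence per person.

    records: iterable of (person_id, timestamp, value) tuples (hypothetical schema).
    Returns {person_id: (values, mask)}; both lists have length max_len and
    mask is 1 for real observations, 0 for padding.
    """
    by_person = defaultdict(list)
    for person_id, timestamp, value in records:
        by_person[person_id].append((timestamp, value))

    sequences = {}
    for person_id, rows in by_person.items():
        rows.sort(key=lambda r: r[0])             # order by timestamp
        values = [v for _, v in rows][-max_len:]  # keep the most recent max_len
        pad = max_len - len(values)
        sequences[person_id] = (values + [0.0] * pad,
                                [1] * len(values) + [0] * pad)
    return sequences

# Toy data (made up): two people observed at different, irregular times.
records = [
    ("p1", 3, 0.7), ("p1", 1, 0.2), ("p2", 2, 0.9),
    ("p1", 2, 0.5), ("p2", 5, 0.4),
]
sequences = to_sequences(records, max_len=4)
```

The mask lets a downstream model ignore padded steps. Note that this layout discards the actual time gaps between observations, which is one of the things a mixed effects model can still account for.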
Sentiment Analysis on Text Data
There is plenty of existing work on sentiment analysis of short text data (tweets, etc.) and online reviews (IMDb reviews, patient opinions, etc.). Possible tasks in this direction are:
- Transform the sentiment of a sentence (from positive to negative and vice versa)
- Target the aspects that drive the sentiment (e.g. price, service, etc.) and do it with only a few labeled examples
Competences that you need:
- Python and Deep Learning frameworks (PyTorch, TensorFlow/Keras or equivalent)
- Machine (Deep) learning
- Text Mining and Natural Language Processing
Opinion summarization / Storyline extraction
Language generation is blooming, although the techniques applied are not really delivering promising results. There are two possible tasks to explore here:
- Summarizing reviews is a crucial task for quickly extracting accurate information. You will apply state-of-the-art multi-task learning techniques to the reviews in the careopinion.co.uk dataset (available).
- Extend previous work on identifying storylines from large news corpora. The goal now would be to actually generate coherent stories.
Competences that you need:
- Python and Deep Learning frameworks (PyTorch, TensorFlow/Keras or equivalent)
- Machine (Deep) learning
- Text Mining and Natural Language Processing
Publication topic detection and tracking
Identifying trends and tracking their evolution in different forms of temporal data is a difficult task (both to perform and to evaluate), albeit a useful one (e.g. news tracking, publication trends, etc.). There is already work on identifying storylines from large news corpora. The goal now would be to apply these or similar techniques to the UM publications archive and explore how the topics that researchers publish on have evolved.
Competences that you need:
- Python and Deep Learning frameworks (PyTorch, TensorFlow/Keras or equivalent)
- Machine (Deep) learning
- Text Mining and Natural Language Processing
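To make "tracking the evolution of a topic" concrete, here is a deliberately crude baseline: the share of titles per year that mention a given term. It is only a keyword proxy, not the storyline-identification technique referred to above, and the `(year, title)` records and the function name `term_share_by_year` are made up for illustration.

```python
import re
from collections import Counter

def term_share_by_year(papers, term):
    """Return {year: fraction of that year's titles containing `term`}."""
    totals = Counter()
    hits = Counter()
    for year, title in papers:
        tokens = set(re.findall(r"[a-z]+", title.lower()))
        totals[year] += 1
        if term in tokens:
            hits[year] += 1
    return {year: hits[year] / totals[year] for year in sorted(totals)}

# Hypothetical records; a real archive would supply these.
papers = [
    (2015, "Kernel methods for graph classification"),
    (2015, "Deep networks for image retrieval"),
    (2017, "Deep learning for text summarization"),
    (2017, "Deep reinforcement learning in games"),
]
trend = term_share_by_year(papers, "deep")
```

A baseline like this is mainly useful as a sanity check: any learned topic model should at least recover trends that are visible at the keyword level.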
Abusive language of #Maastricht
Sexual harassment is becoming more of an issue for the city of Maastricht. Many people turn to social media to tell their stories, but never report them officially to the police. Your goal in this project would be to scrape data from publicly available sources (Facebook sharing is caring, Jodel, etc.) in order to accurately map these events in space and time and understand different forms of harassment. There are already datasets that you could use (e.g. SafeCity), but we are interested in getting data about Maastricht.
Competences that you need:
- Scraping skills (to acquire the data)
- Python and Deep Learning frameworks (PyTorch, TensorFlow/Keras or equivalent)
- Machine (Deep) learning
- Text Mining and Natural Language Processing
Detecting the biases of datasets
Recent advances in Machine Learning and Adversarial Training provide successful results in many promising fields. However, even if the datasets used seem to be unbiased in many cases (i.e. demographic information has been removed), these models might still capture human biases in their internal structure, even when solving a completely different task (e.g. topic classification). In this project, you will explore if and how this bias can be detected (using e.g. the hidden states of neural networks) and explore techniques to remove it. There is a variety of tasks you could work on:
- Sentiment analysis of tweets (and check whether there is a gender or age bias)
- Chatbot hidden state classification (and check whether there is a specific topic bias)
- ...
Competences that you need:
- Python and Deep Learning frameworks (PyTorch, TensorFlow/Keras or equivalent)
- Machine (Deep) learning
- Text Mining and Natural Language Processing
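One common way to check whether hidden states encode an attribute is to train a small "probe" classifier on them: if the probe predicts the attribute well above chance on held-out data, the information is present in the representation. The sketch below uses a nearest-centroid probe on synthetic vectors, purely as an assumption-laden illustration. The function `centroid_probe_accuracy`, the 8-dimensional "hidden states" and the binary attribute are all invented here; a real study would use states extracted from a trained network and probably a stronger probe.

```python
import random

def centroid_probe_accuracy(states, labels, train_fraction=0.5):
    """Fit a nearest-centroid probe on part of the data; return held-out accuracy."""
    split = int(len(states) * train_fraction)
    dim = len(states[0])

    # Mean hidden state per attribute value, estimated on the training part.
    sums = {0: [0.0] * dim, 1: [0.0] * dim}
    counts = {0: 0, 1: 0}
    for vec, y in zip(states[:split], labels[:split]):
        counts[y] += 1
        for i, x in enumerate(vec):
            sums[y][i] += x
    centroids = {y: [s / counts[y] for s in sums[y]] for y in (0, 1)}

    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    # Predict the attribute of each held-out state from its closest centroid.
    correct = 0
    for vec, y in zip(states[split:], labels[split:]):
        predicted = min((0, 1), key=lambda c: sq_dist(vec, centroids[c]))
        correct += predicted == y
    return correct / (len(states) - split)

rng = random.Random(0)
labels = [rng.randint(0, 1) for _ in range(600)]  # synthetic binary attribute
# Synthetic "hidden states": in `leaky`, dimension 0 is shifted by the attribute;
# in `control`, the states are pure noise and carry no attribute information.
leaky = [[3.0 * y + rng.gauss(0, 1)] + [rng.gauss(0, 1) for _ in range(7)]
         for y in labels]
control = [[rng.gauss(0, 1) for _ in range(8)] for _ in labels]

leaky_acc = centroid_probe_accuracy(leaky, labels)
control_acc = centroid_probe_accuracy(control, labels)
```

Comparing `leaky_acc` against `control_acc` is the point of the exercise: the control shows what chance-level probing looks like, so a clearly higher score on real hidden states would suggest that the attribute has leaked into the representation.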
Glass ceiling in CS
The glass ceiling is a powerful metaphor for the unethical, invisible, and yet virtually impenetrable barrier that prevents high-achieving women and minorities from obtaining equal access to senior career opportunities. In this project, you will develop empirical methods to confirm or disprove the existence of such a phenomenon in Computer Science. To do so, you are going to use the DBLP dataset (a bibliography of Computer Science publications).
Competences that you need:
- Good skills in parsing files (e.g. XML)
- Python or R
- Machine learning
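Since file parsing is listed as a required skill, here is a minimal sketch of streaming over DBLP-style XML records and counting publications per author. It assumes the records follow the publicly documented DBLP layout (`article` and `inproceedings` elements with `author` children); the sample records, keys and author names are invented. The real dump is very large and relies on a DTD for character entities, which this sketch does not handle.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter

# Record types to count; the real dataset contains further types.
PUBLICATION_TAGS = {"article", "inproceedings"}

def papers_per_author(xml_source):
    """Stream over DBLP-style records and count publications per author."""
    counts = Counter()
    for _, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag in PUBLICATION_TAGS:
            for author in elem.findall("author"):
                counts[author.text] += 1
            elem.clear()  # drop the processed record to limit memory use
    return counts

# Invented sample in the assumed layout.
SAMPLE = b"""<dblp>
  <article key="journals/x/1"><author>A. Author</author><author>B. Writer</author>
    <title>First paper.</title><year>2016</year></article>
  <inproceedings key="conf/y/2"><author>A. Author</author>
    <title>Second paper.</title><year>2017</year></inproceedings>
</dblp>"""

counts = papers_per_author(io.BytesIO(SAMPLE))
```

Streaming with `iterparse` rather than loading the whole tree is a design choice driven by file size. Per-author counts like these would only be a starting point, since the dataset itself does not state an author's gender or seniority.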