Are you a bachelor's or master's student in Innsbruck or abroad, and would you like to write a thesis involving machine learning, reinforcement learning, and/or robotics? Here are the topics I can offer. I am also open to other topics; if you have something in mind, let me know at samuele.tosatto@uibk.ac.at.
Symbolic Computations with Policy Gradients
Large language models (LLMs), such as transformer-based models, are revolutionising the way we interact with machines. However, LLMs have limited reasoning and computational abilities (math, logic, etc.). There is a wide debate in the research community about whether continuous systems (like neural networks) are really suited to solving such problems.
In the early days of AI, symbolic systems seemed a natural way to solve “logic” or arithmetic problems. Neural networks took the lead thanks to their differentiability, which allows gradient-based training.
In this thesis, the student will explore an alternative system for solving math and logic problems based on symbolic computation that is trainable via policy gradient algorithms.
Training these symbolic machines is, however, nontrivial. Several theses are possible on this topic, ranging from mainly software development to pure research.
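To give a rough feeling for what "trainable via policy gradient" can mean for a discrete, non-differentiable choice, here is a minimal sketch. It is not the system studied in the thesis: the toy task (picking the arithmetic operator that explains a few examples), the names `OPS`, `train`, and `examples`, and all hyperparameters are assumptions made up for illustration. The update is plain REINFORCE with a running-average baseline.

```python
import math
import operator
import random

# Hypothetical toy "symbolic machine": a single discrete choice of operator.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
NAMES = list(OPS)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def train(examples, steps=2000, lr=0.1, seed=0):
    """REINFORCE on the operator choice; reward is 1 if the result matches."""
    rng = random.Random(seed)
    logits = [0.0] * len(NAMES)
    baseline = 0.0  # running average of the reward, reduces variance
    for _ in range(steps):
        a, b, target = rng.choice(examples)
        probs = softmax(logits)
        k = rng.choices(range(len(NAMES)), weights=probs)[0]  # sample an action
        reward = 1.0 if OPS[NAMES[k]](a, b) == target else 0.0
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        # Gradient of log softmax w.r.t. logit i is 1[i == k] - probs[i].
        for i in range(len(logits)):
            grad_log_pi = (1.0 if i == k else 0.0) - probs[i]
            logits[i] += lr * advantage * grad_log_pi
    return dict(zip(NAMES, softmax(logits)))


# Each example is (a, b, target); only multiplication explains all of them.
examples = [(3, 4, 12), (2, 5, 10), (6, 7, 42)]
probs = train(examples)
best = max(probs, key=probs.get)
print(best, probs)
```

The reward signal never needs to be differentiated, only the log-probability of the sampled choice, which is why this family of methods is a candidate for training symbolic programs.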
Tags: theoretical-computer-science, reinforcement-learning, machine-learning.
Deep Movement Primitives in Reinforcement Learning
The application of reinforcement learning to robotics is hard.
In this thesis, the student will explore which action representations are most suitable for robot learning, using deep movement primitives. The student will work both in simulation and with a real robot manipulator!
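As a loose illustration of the idea of a movement primitive as an action representation, the sketch below generates a whole one-dimensional trajectory from a small weight vector, using normalized radial basis functions over a phase variable. This is a generic, classical (non-deep) formulation chosen for brevity, not the deep movement primitives of the thesis; the function names, the number of basis functions, and the width are all assumptions.

```python
import math


def rbf_features(phase, n_basis=5, width=0.05):
    """Normalized Gaussian basis activations at a phase in [0, 1]."""
    centers = [i / (n_basis - 1) for i in range(n_basis)]
    acts = [math.exp(-((phase - c) ** 2) / (2 * width)) for c in centers]
    total = sum(acts)
    return [a / total for a in acts]


def rollout(weights, n_steps=50):
    """Turn a weight vector into a smooth trajectory of n_steps positions."""
    trajectory = []
    for t in range(n_steps):
        phase = t / (n_steps - 1)
        phi = rbf_features(phase, n_basis=len(weights))
        trajectory.append(sum(w * f for w, f in zip(weights, phi)))
    return trajectory


# Hypothetical weights: in an RL setting, the policy would output this vector
# once per episode instead of one low-level command per time step.
weights = [0.0, 0.2, 0.8, 1.0, 1.0]
trajectory = rollout(weights)
print(trajectory[0], trajectory[-1])
```

With this kind of representation, exploration happens in the low-dimensional space of weights, and every sampled action yields a smooth motion.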
Tags: reinforcement-learning, imitation-learning, representation-learning, robot-learning.
A Temporal-Difference Approach to Policy Gradient
Temporal difference is widely used in reinforcement learning for policy evaluation and value-based improvement. In recent work, we show that temporal difference is also very useful for estimating the policy gradient!
In this thesis, the student will implement a deep version of this novel reinforcement learning algorithm.
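For readers new to temporal difference, the following is a minimal sketch of its classical use for policy evaluation: tabular TD(0) on a small random-walk chain. It does not implement the policy-gradient algorithm mentioned above; the environment, the function name `td0_random_walk`, and the step size are assumptions picked for illustration.

```python
import random


def td0_random_walk(n_states=5, episodes=5000, alpha=0.02, gamma=1.0, seed=0):
    """Tabular TD(0) for a uniform random policy on a chain.

    Stepping off the left end gives reward 0, off the right end reward 1.
    """
    rng = random.Random(seed)
    values = [0.0] * n_states
    for _ in range(episodes):
        s = n_states // 2  # every episode starts in the middle state
        while True:
            s_next = s + rng.choice((-1, 1))
            if s_next < 0:
                reward, v_next, done = 0.0, 0.0, True
            elif s_next >= n_states:
                reward, v_next, done = 1.0, 0.0, True
            else:
                reward, v_next, done = 0.0, values[s_next], False
            # TD error: bootstrapped target minus current estimate.
            td_error = reward + gamma * v_next - values[s]
            values[s] += alpha * td_error
            if done:
                break
            s = s_next
    return values


values = td0_random_walk()
print([round(v, 2) for v in values])
```

For this chain the true values are 1/6, 2/6, ..., 5/6, so the printed estimates should increase from left to right and sit near those fractions, up to the noise of a constant step size.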
Tags: reinforcement-learning.