General Deep Learning
After switching from High Energy Physics to Deep Learning, I started working in Reinforcement Learning before pivoting towards Associative Memories and modern Transformer networks. Recent years have shown that scalable ideas, better datasets, and clever engineering are the ingredients for ever better Deep Learning models. This matches my own experience, and – needless to say – I will continue working on general large-scale Deep Learning directions.
In Reinforcement Learning, we introduced RUDDER, a novel model-free RL approach to overcome delayed-reward problems. RUDDER directly and efficiently assigns credit to reward-causing state-action pairs and thereby dramatically speeds up learning in model-free reinforcement learning with delayed rewards. I have written a lengthy blogpost on the RUDDER idea. With Align-RUDDER, we extended the RUDDER framework by assuming that episodes with high rewards are given as demonstrations. Finally, we proved convergence for actor-critic methods like RUDDER or PPO.
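To give a flavour of the credit-assignment idea, here is a minimal sketch (not the original RUDDER implementation): a hypothetical `ReturnPredictor` LSTM predicts the episode return at every timestep, and the differences of consecutive return predictions are used as redistributed, immediate rewards. All names and the toy data below are illustrative assumptions.

```python
# Minimal sketch of RUDDER-style reward redistribution (illustrative only).
# The real RUDDER uses additional training tricks and auxiliary losses.
import torch
import torch.nn as nn

class ReturnPredictor(nn.Module):
    """LSTM that predicts the episode return at every timestep."""
    def __init__(self, input_dim, hidden_dim=64):
        super().__init__()
        self.lstm = nn.LSTM(input_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)

    def forward(self, seq):                 # seq: (batch, time, input_dim)
        h, _ = self.lstm(seq)
        return self.head(h).squeeze(-1)     # (batch, time) return predictions

def redistribute_reward(predictor, seq):
    """Differences of consecutive return predictions become immediate rewards."""
    g = predictor(seq)                      # predicted return at each timestep
    g0 = torch.zeros_like(g[:, :1])         # prediction "before" the episode
    return torch.diff(torch.cat([g0, g], dim=1), dim=1)

# Toy usage: state-action features of a 50-step episode with a delayed reward.
predictor = ReturnPredictor(input_dim=8)
episode = torch.randn(1, 50, 8)
redistributed = redistribute_reward(predictor, episode)   # shape (1, 50)
```

The predictor would be trained to regress the observed episode return, so that steps which change the predicted return receive the corresponding share of the credit.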
In Hopfield Networks is All You Need, we introduced a new energy function and a corresponding new update rule which is guaranteed to converge to a local minimum of the energy function. The new modern Hopfield Network with continuous states keeps the characteristics of its discrete counterparts, i.e., exponential storage capacity and fast convergence. Due to its continuous states, this new modern Hopfield Network is differentiable and can be integrated into deep learning architectures. Typically, patterns are retrieved after one update, which is compatible with activating the layers of deep networks. Surprisingly, the new update rule is the attention mechanism of transformer networks introduced in Attention Is All You Need. I have written a lengthy blogpost on modern Hopfield Networks. One SOTA application of modern Hopfield Networks can be found in our paper Modern Hopfield Networks and Attention for Immune Repertoire Classification. Here, the high storage capacity of modern Hopfield Networks is exploited to solve a challenging multiple instance learning (MIL) problem in computational biology called immune repertoire classification. Finally, we found that linearized attention models, such as the Performer, resemble the update rule of classical Hopfield Networks. Our blogpost was published in the ICLR 2022 blogpost track.
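To make the connection to attention concrete, the sketch below shows the retrieval update for a single query: the new state is a softmax-weighted average of the stored patterns, i.e., attention with the stored patterns acting as keys and values. The function name and the toy data are illustrative assumptions, not the reference implementation.

```python
# Minimal sketch of the modern Hopfield update rule (toy example).
# One update step for stored patterns X (d x N) and state/query xi (d,):
#   xi_new = X softmax(beta * X^T xi)
import torch

def hopfield_retrieve(X, xi, beta=1.0, n_steps=1):
    """Retrieve a pattern from stored patterns X given a (possibly noisy) query xi."""
    for _ in range(n_steps):
        xi = X @ torch.softmax(beta * (X.T @ xi), dim=0)
    return xi

# Toy usage: store 5 random 16-dimensional patterns and retrieve from a
# noisy version of the first one; typically a single update suffices.
torch.manual_seed(0)
X = torch.randn(16, 5)
query = X[:, 0] + 0.1 * torch.randn(16)
retrieved = hopfield_retrieve(X, query, beta=4.0)
```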
I will keep working on understanding large-scale architectures. There are quite a few interesting projects I am involved in.