School of Physics Faculty Search Colloquium Series - Dr. Blake Bordelon

Speaker: Dr. Blake Bordelon

Title: The Physics of Neural Networks: Mean-field Theory of Deep Learning Dynamics

Abstract: Recent machine learning research has produced models with remarkable capabilities in a variety of domains, including computational biology, visual object recognition, image and video generation, strategy game playing, and language modeling. Despite these empirical successes, these systems still face fundamental challenges, including limited interpretability, large compute and data costs, and uncertain trustworthiness during deployment. In addition, many key theoretical questions about their generalization, training dynamics, and scaling behavior remain open. In this talk, I will discuss how physics can help improve our understanding of deep learning systems and guide improvements to their scaling strategies. I will first present mathematical results based on mean-field techniques from statistical physics for analyzing the learning dynamics of neural networks. This theory provides insights for developing initialization and optimization schemes that admit well-defined infinite-width and infinite-depth limits and behave consistently across model scales, offering practical advantages. These limits also enable a theoretical characterization of the types of learned solutions reached by deep networks, making it possible to predict their generalization and scaling laws in certain cases. To conclude, I will discuss a few future directions, including predicting regimes of memorization and generalization in diffusion models, simple models of reasoning in language models, and the theory of reinforcement learning.
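
For readers unfamiliar with the mean-field scaling mentioned in the abstract, below is a minimal illustrative sketch (not taken from the talk, and all names and the toy setup are assumptions): in a two-layer network with a 1/width readout prefactor and a readout learning rate scaled proportionally to the width, the change in the network output after one gradient step stays order one as the width grows, which is one sense in which the training dynamics admit a well-defined infinite-width limit.

```python
import numpy as np

def demo_mean_field_scaling(width, d=10, eta0=1.0, seed=0):
    """Toy two-layer network in a mean-field-style parameterization:
    f(x) = (1/width) * sum_i a_i * tanh(w_i . x), with O(1) weights
    and a readout learning rate scaled as eta0 * width (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(d) / np.sqrt(d)   # input with O(1) norm
    y = 1.0                                   # scalar target
    W = rng.standard_normal((width, d))       # first-layer weights, O(1) entries
    a = rng.standard_normal(width)            # readout weights, O(1) entries

    h = np.tanh(W @ x)
    f_before = a @ h / width                  # 1/width readout prefactor

    # One gradient step on the readout for squared loss 0.5 * (f - y)^2.
    grad_a = (f_before - y) * h / width       # each entry is O(1/width)
    a = a - eta0 * width * grad_a             # learning rate ~ width restores O(1) updates

    f_after = a @ h / width
    return f_before, f_after

# The output change |f_after - f_before| is roughly width-independent,
# i.e. the training step survives the infinite-width limit.
for width in [100, 1000, 10000]:
    f0, f1 = demo_mean_field_scaling(width)
    print(f"width={width:6d}  |df| = {abs(f1 - f0):.3f}")
```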

Bio: Blake Bordelon is a PhD student in Applied Math at Harvard University, where he researches the theory of natural and artificial neural networks. Specific research topics in this area include neural network learning dynamics, generalization, memorization capacity, and the large-width and large-depth limits of neural networks. He is broadly interested in the theory of intelligence in neural networks, both in models from neuroscience and in artificial deep learning systems. His work often employs ideas and methods from statistical physics, dynamical systems, and random matrix theory, which are helpful for modeling learning and memory in large neural networks. As an undergraduate, he studied Physics and Electrical and Systems Engineering at Washington University in St. Louis, where he also did research in nuclear physics and spiking neural networks.

Event Details

Date/Time: Monday, January 27, 2025, 3:30pm to 4:30pm

Location: Marcus Nanotechnology 1116-1118