Speaker: Christos Thrampoulidis
Speaker Affiliation: UBC
Speaker Link: https://sites.google.com/view/cthrampo

March 21, 2025

ESB 2012 and Zoom
Canada

To be held at ESB 2012 and on Zoom: https://ubc.zoom.us/j/68285564037?pwd=R2ZpLy9uc2pUYldHT3laK3orakg0dz09
Meeting ID: 682 8556 4037
Passcode: 636252

Reception and refreshments at 14:30 in the PIMS Lounge (ESB 4th floor).

Abstract: 

Deep learning models are often seen as black boxes, their complexity stemming from integrating numerous architectural components across multiple layers while being trained on high-dimensional datasets with carefully tuned hyperparameters. In this talk, I will present recent work uncovering structural invariants in the geometry of deep neural representations, providing insights into how these models learn from data.

I will focus on language models trained with the next-token prediction objective as a case study. Despite the apparent conceptual simplicity of this training paradigm, large models trained on vast text corpora demonstrate an extraordinary ability to capture linguistic structure. I will show that in well-trained language models, representations of words and contexts (i.e., sequences of words) organize themselves into geometries characterized by sparse and low-rank matrix decompositions of training statistics.

I will also discuss how these findings establish connections with the theoretical frameworks of implicit regularization and neural collapse, contributing to the development of a more principled understanding of how deep-learning models extract and encode information from data.

Event Details

March 21, 2025

3:00pm to 4:00pm

ESB 2012 and Zoom

Categories

  • Department Colloquium