Compositional Interpretability
CompInterp uncovers the structure of neural representations, showing how simple features compose into complex behaviours. By unifying the tensor and neural network paradigms, it treats model weights and data as a single modality. This compositional lens on design, analysis and control paves the way for inherently interpretable AI without compromising performance.
Compositional architectures capture rich non-linear (polynomial) relationships between representation spaces. Instead of masking these relationships with linear approximations, CompInterp methods expose their inherent hierarchical structure across levels of abstraction. This enables weight-based subcircuit analysis, grounding interpretability in formal (de)compositions rather than post-hoc activation-based heuristics.
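To make this concrete, here is a minimal sketch (the choice of a bilinear layer and all names and shapes are illustrative assumptions, not CompInterp's actual code): a bilinear layer is a degree-2 polynomial map, and a truncated SVD of its unfolded weight tensor reads candidate subcircuits straight off the weights, with no activations required.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, top_k = 8, 4, 3

# A bilinear layer: y_k = sum_ij W[k, i, j] * x_i * x_j,
# i.e. a degree-2 polynomial map between representation spaces.
W = rng.normal(size=(d_out, d_in, d_in))

def bilinear(x: np.ndarray, W: np.ndarray) -> np.ndarray:
    return np.einsum("kij,i,j->k", W, x, x)

# Weight-based subcircuit analysis: unfold the weight tensor and take a
# truncated SVD. Each retained component pairs an output direction with a
# rank-1 pattern of interacting input features -- a candidate subcircuit --
# obtained without running the model on any data.
unfolded = W.reshape(d_out, d_in * d_in)
U, S, Vt = np.linalg.svd(unfolded, full_matrices=False)
subcircuits = [(U[:, r], Vt[r].reshape(d_in, d_in)) for r in range(top_k)]

x = rng.normal(size=d_in)
print(bilinear(x, W))         # layer output for one input
print(S[:top_k] / S.sum())    # share of weight structure per component
```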
We’re now scaling compositional interpretability to transformers and CNNs by leveraging their low-rank structure through tensor decomposition and information theory. Learn more in our latest talk!
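As a rough sketch of what leveraging low-rank structure can look like (the synthetic weight tensor and the use of the tensorly library are assumptions for illustration, not the actual pipeline), a stack of related weight matrices can be fit with a low-rank CP decomposition whose shared rank-1 factors expose structure common across heads:

```python
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

rng = np.random.default_rng(0)
n_heads, d_model, true_rank = 8, 64, 4

# Synthetic stand-in for a stack of per-head weight matrices that share a
# few components: a 3-way tensor (head, input dim, output dim) with a
# genuinely low CP rank.
head_f = rng.normal(size=(n_heads, true_rank))
in_f = rng.normal(size=(d_model, true_rank))
out_f = rng.normal(size=(d_model, true_rank))
weights_tensor = np.einsum("hr,ir,or->hio", head_f, in_f, out_f)

# CP decomposition rewrites the whole stack as a small set of shared
# rank-1 components; each factor column says which heads use a component
# and which input/output directions it couples.
cp = parafac(tl.tensor(weights_tensor), rank=true_rank, n_iter_max=200)
reconstruction = tl.cp_to_tensor(cp)

rel_error = tl.norm(tl.tensor(weights_tensor) - reconstruction) / tl.norm(
    tl.tensor(weights_tensor)
)
print(f"relative reconstruction error: {rel_error:.4f}")  # small for this exactly low-rank tensor
print([f.shape for f in cp.factors])  # [(8, 4), (64, 4), (64, 4)]
```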