Hello there — welcome to my homepage.

I am Yuxuan Wu, a PhD candidate in Machine Learning at MBZUAI, advised by Professor Gus Xia and Professor Bhiksha Raj. My work focuses on hierarchical representation learning and self-supervised learning, with applications in structured perceptual domains such as vision, audio and music.

My research asks a simple yet fundamental question:

How is intelligence structured across levels of abstraction?

A hierarchical organization of intelligence

Humans perceive, describe, and predict the same underlying reality through multi-modal representations that range from low-level sensory patterns (e.g., video and audio) to high-level symbolic systems (e.g., language, mathematics, and music notation). These heterogeneous representations serve complementary functional roles. I aim to build AI systems that reflect this hierarchical organization of intelligence. In particular, I investigate:

  • How hierarchical and structured representations can emerge from raw perceptual signals under principled inductive biases; and
  • How heterogeneous representations can be aligned at the function/model level for coordinated understanding, reasoning and generation.

Outside of research, I am also a music artist. I create, experiment, and evolve through sound, treating music as both a language of expression and a landscape for exploration. I started playing the keyboard at age 6 and began making music at 13. Over the years, I have published most of my work under the name BowOfAtlas.

Updated 2026-03-12