Exploring Novel Audio Interaction Paradigms


The world has changed over the last few years, to put it lightly. For the first time ever, a computer can listen to you hum a melody and turn it into a full arrangement. It can seamlessly shift your voice into different timbres, create fully fleshed-out songs from a string of text, and generate endless variations of a beat with the click of a button.

If the audio generation problem is essentially solved, maybe it was never the hard part.

The magic of making music has always been in the interaction: a human control input, a result, a response; a continuous cycle. It's the interface between artistic intention and sonic result.

Though we now possess audio models that a decade ago would have seemed a fever dream, we're still figuring out what it means to play them.

Here are a few important questions we're asking:

How do you collaborate with an AI that can generate everything, but not the specific thing you want?

How do you learn to play an AI system as an instrument, and how should we design systems so they can be expressively performed?

Do we want real-time response, or should we embrace the inherent latency of machine “thought” in exchange for more magical experiences?

Can these systems really generate impactful music, or just “good output”? And who decides what “good” is? Can you? Should you?

If everyone speaks their own musical language, how do we build systems that understand them all, especially when these languages contradict each other?

These are just some of the questions Smule Labs exists to answer.

We're not just building better audio generative models (though we might). We're designing the experiences and interaction paradigms that allow human beings to use models to make something that matters to them and makes them feel something. We believe putting humans in the loop isn't just ethically necessary but necessary, period.

Smule emerged from the fertile creative environment of Stanford's CCRMA, the Center for Computer Research in Music and Acoustics, and we've been exploring how people make music together since 2008. Now we're bringing that same human-first approach to the AI era. Through collaborations with researchers, engineers, and artists, we're building the interfaces, experiences, and frameworks that will define how people create with AI audio.

We are committed to the following principles:

We will design neural networks with human controllability built in from the start, not as an add-on or afterthought.

We will build public creation experiences that anyone can try. It's impossible to know whether an interaction paradigm works until real people use, misuse, or abuse it.

We will share our research and engineering findings with the academic, developer, and arts communities to push forward what's possible.

This is early territory. Most of the paradigms we try will fail. But somewhere in these experiments is the answer to what it means to collaboratively make music with AI.

We can't do this alone.

Come join us in Smule's long-established ethos:

Connecting the world through music