
Boson AI


Making communication with AI as easy, natural and fun as talking to a human

About us

We are transforming how stories are told, knowledge is learned, and insights are gathered.

Website
https://boson.ai/
Industry
Research Services
Company size
11-50 employees
Headquarters
Santa Clara, CA
Type
Privately Held
Founded
2023
Specialties
Artificial Intelligence and Machine Learning

Updates

  • Kudos to my team at Boson AI for delivering our first audio model: Mu Li, Xingjian Shi, Yizhi Liu, Shuai Zheng, Ruskin Raj Manku, Dongming Shen, Yi Zhu, Silin Meng, Ke Bai, Yuyang (Rand) Xie, Jielin Qiu, Sergii Tiugaiev, Jaewon Lee, Alex Tay, Martin Ma, Zhangcheng (Zach) Zheng

  • At Boson AI, we work on making communication with AI as easy, natural, and fun as talking to a human. Today, we are excited to introduce Higgs Audio Understanding and Higgs Audio Generation: two powerful tools for building customized AI agents tailored to diverse audio understanding and generation needs.

    Higgs Audio Generation: Realistic, Emotionally Intelligent Speech

    Traditional text-to-speech (TTS) systems can sound robotic, miss emotional nuance, and struggle with names, accents, or multiple voices. Higgs Audio Generation changes the game by offering emotionally rich speech and realistic multi-voice conversations. Our model understands the implied tone, urgency, hesitation, and nuance in the text and renders it the way a real human would. It pronounces foreign names and places correctly and with the right accent, which makes it ideal for games, audiobooks, and screenplays. Higgs Audio Generation accomplishes this through its backing Large Language Model, which ensures that it doesn't just speak words but understands them in context. The model is trained on massive text-audio datasets for stunning realism. But don't just take our word for it: in our benchmark comparisons, Higgs Audio Generation beats OpenAI, Gemini, and ElevenLabs. Or try it out on our site.

    Complementing the suite is Higgs Audio Understanding, a model that understands voice and other audio inputs. This makes it ideally suited for tasks such as transcription (speech recognition), including meetings with multiple, sometimes slightly overlapping speakers. It also lets us offer a model that can directly answer questions about the received audio, i.e. perform audio understanding, without handing the signal off to a separate dedicated language model. As a result, it can reason about sounds (how many times did I clap my hands? where was the recording made?) and music (what chord is this?) at a high level of accuracy. Check out our magic broom shop demo to see how voice generation and audio understanding can work in harmony, e.g. for a retail application. Just like Audio Generation, this model is trained on massive text-audio datasets and uses an underlying LLM so that it understands rather than merely transcribes speech. In particular, this lets it benefit from Chain-of-Thought reasoning on complex understanding tasks.

    For more information, see https://lnkd.in/gYp_uBRk
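
Purely as an illustration of the two capabilities described in the post above, here is a minimal sketch of how a speech-generation request and an audio question-answering request might be wired together. It assumes a generic HTTP API: the base URL, endpoint paths, model names, request fields, and response format are all hypothetical placeholders, not Boson AI's documented interface.

import requests

# All endpoints, model names, and fields below are illustrative assumptions,
# not Boson AI's published API.
API_BASE = "https://api.example.invalid/v1"          # hypothetical base URL
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}   # placeholder credential

# 1) Generation: request emotionally rich, multi-voice speech from a short script.
gen_payload = {
    "model": "higgs-audio-generation",               # assumed model identifier
    "input": (
        "NARRATOR: The broom shop was quiet.\n"
        "CUSTOMER (hesitant): Do you... repair older models?"
    ),
    "voices": {"NARRATOR": "warm_low", "CUSTOMER": "bright_young"},  # assumed field
    "format": "wav",
}
gen_resp = requests.post(f"{API_BASE}/audio/generation",
                         json=gen_payload, headers=HEADERS, timeout=60)
gen_resp.raise_for_status()
with open("scene.wav", "wb") as f:
    f.write(gen_resp.content)                        # assumes raw audio bytes are returned

# 2) Understanding: send audio back with a question about its content.
with open("scene.wav", "rb") as f:
    und_resp = requests.post(
        f"{API_BASE}/audio/understanding",
        files={"audio": f},
        data={
            "model": "higgs-audio-understanding",    # assumed model identifier
            "question": "How many speakers are in this clip, and what is the mood?",
        },
        headers=HEADERS,
        timeout=60,
    )
und_resp.raise_for_status()
print(und_resp.json().get("answer", ""))             # assumed response field

The "magic broom shop" demo mentioned in the post presumably chains these two steps in a loop: understand the customer's audio, decide on a reply, then generate speech for it.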
