Fireworks AI now supports Supervised Fine-Tuning (SFT) on gpt-oss models, and we're the first to support it! It's enabled for both the 20b and 120b versions, and you can run it via the UI or CLI. Note: SFT currently supports a maximum batch size of 8192. When running a fine-tuning job, click "Advanced Options" and set "Batch Size" to 8192. PS: Looks like some of you already found it before the announcement 😄 Let us know if you have any feedback.
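For readers preparing a fine-tuning dataset: Fireworks SFT consumes JSONL training files. A minimal sketch of building one, assuming the OpenAI-compatible chat format (one JSON object per line, each with a `messages` list) — check the Fireworks fine-tuning docs for the exact schema your job expects:

```python
import json

# Hypothetical toy dataset: each example is one full chat turn sequence.
examples = [
    {
        "messages": [
            {"role": "system", "content": "You are a concise support assistant."},
            {"role": "user", "content": "How do I reset my password?"},
            {"role": "assistant", "content": "Go to Settings > Security and click 'Reset password'."},
        ]
    },
]

# Write one JSON object per line (the JSONL convention).
with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```

The resulting `train.jsonl` is what you would upload as the dataset for an SFT job.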
Fireworks AI
Software Development
Redwood City, CA · 21,837 followers
Generative AI platform empowering developers and businesses to scale at high speeds
About us
Fireworks.ai offers a generative AI platform as a service. We optimize for rapid product iteration on top of gen AI, as well as minimizing cost to serve. https://xmrrwallet.com/cmx.pfireworks.ai/careers
- Website: http://xmrrwallet.com/cmx.pfireworks.ai
- Industry: Software Development
- Company size: 51-200 employees
- Headquarters: Redwood City, CA
- Type: Privately Held
- Founded: 2022
- Specialties: LLMs and Generative AI
Locations
- Primary: Redwood City, CA 94063, US
Updates
-
Join Roberto Barroso-Luque & Aishwarya Srinivasan for a 30-minute deep dive into OpenAI's new gpt-oss models on Aug 12! We'll break down the architecture, explore high-impact use cases, and show you how these open-weight reasoning models perform against other OSS models using chat.fireworks.ai. Plus, we'll walk through a live demo of deploying gpt-oss in a real-world application. Whether you're an AI engineer, developer, or builder, this session will show you how to get the most out of gpt-oss in production.
OpenAI gpt-oss Model Deep Dive with Fireworks AI
-
Have you ever wanted a no-friction way to compare large language models side by side? Well, we hear you! Fireworks AI just launched a beta version of chat.fireworks.ai, and we'd love for you to give it a spin. Here's how it works:
→ Choose any two models to chat with, whether it's gpt-oss-120B or any of our serverless models (Qwen, DeepSeek, Llama, Kimi, etc.)
→ Experiment with prompts in real time and see both responses side by side
→ Enable function calling to test which models perform best for your agentic use cases
→ Manually benchmark performance on your own use cases before you write a single line of code
→ For increased usage and advanced options, add an API key in the left-hand sidebar
We built this tool so you can explore model behaviors, spot strengths and weaknesses, and make informed decisions, all in one place. We will be improving the app in the coming weeks, so please let us know any feedback! 👉 Take it for a test drive at https://xmrrwallet.com/cmx.pchat.fireworks.ai/
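The same side-by-side comparison can be scripted against Fireworks' OpenAI-compatible chat completions API. A minimal sketch — the model IDs and endpoint path here are assumptions for illustration; substitute the ones listed in your Fireworks account:

```python
import json
import urllib.request

# Assumed OpenAI-compatible endpoint; verify against the Fireworks docs.
API_URL = "https://api.fireworks.ai/inference/v1/chat/completions"

def build_request(model: str, prompt: str) -> dict:
    """Build one chat-completion payload for a given model."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }

def compare(models: list[str], prompt: str, api_key: str) -> dict:
    """Send the same prompt to each model and collect the replies."""
    replies = {}
    for m in models:
        req = urllib.request.Request(
            API_URL,
            data=json.dumps(build_request(m, prompt)).encode(),
            headers={
                "Authorization": f"Bearer {api_key}",
                "Content-Type": "application/json",
            },
        )
        with urllib.request.urlopen(req) as resp:
            body = json.load(resp)
        replies[m] = body["choices"][0]["message"]["content"]
    return replies

if __name__ == "__main__":
    # Hypothetical model IDs; replace with real ones from your account.
    out = compare(
        ["accounts/fireworks/models/gpt-oss-120b",
         "accounts/fireworks/models/qwen3-235b-a22b"],
        "Summarize MoE routing in one sentence.",
        api_key="YOUR_API_KEY",
    )
    for model, reply in out.items():
        print(model, "→", reply)
```

Printing both replies next to each other gives the same manual benchmark the chat UI provides, but scriptable over your own prompt set.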
-
We’re excited to announce a joint effort between Fireworks AI and AMD to bring OpenAI models to AMD’s latest MI355 GPUs. This collaboration will make powerful AI models more accessible and cost-efficient, coming soon to the Fireworks AI platform. OpenAI’s latest open-weight models, gpt-oss-20b and gpt-oss-120b, are available on Fireworks AI. Go try them out! gpt-oss-120b: https://xmrrwallet.com/cmx.plnkd.in/d2i5-Bib gpt-oss-20b: https://xmrrwallet.com/cmx.plnkd.in/dXxqYzA9
-
Today, OpenAI released their first truly open-weight models since GPT-2, and they’re solid. gpt-oss-20b and gpt-oss-120b are reasoning-first models built for real-world use. They support long context, multi-step tool use, and offer adjustable reasoning levels (low/medium/high). Think o3/o4-mini performance, but open. They’re built on a MoE architecture, but what really makes them shine is the quality of training data and reinforcement learning tuning. ✅ Both gpt-oss-20b and gpt-oss-120b are now live on Fireworks AI. Read the detailed blog here: https://xmrrwallet.com/cmx.plnkd.in/g3vXSNaR
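The adjustable reasoning level mentioned above is set per request. A minimal sketch of a request body, assuming the OpenAI-style `reasoning_effort` parameter and a hypothetical Fireworks model ID — confirm both names against the Fireworks API reference before relying on them:

```python
import json

def chat_payload(model: str, prompt: str, reasoning_effort: str = "medium") -> str:
    """Build a JSON chat-completion body with an adjustable reasoning level.

    The `reasoning_effort` field follows the OpenAI-style convention
    (low/medium/high); the exact parameter name on Fireworks may differ.
    """
    assert reasoning_effort in {"low", "medium", "high"}
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": reasoning_effort,
    })

# Example: a quick factual question where low effort should suffice.
body = chat_payload(
    "accounts/fireworks/models/gpt-oss-120b",  # hypothetical model ID
    "What is the capital of France?",
    reasoning_effort="low",
)
```

Raising the level to `high` trades latency and tokens for deeper multi-step reasoning on harder prompts.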
-
Fireworks AI reposted this
🚀 OpenAI’s new open models on Fireworks AI 🚀 OpenAI just released its first truly open-weight LLMs since GPT-2 in 2019, and the models rock! gpt-oss-20b and gpt-oss-120b are reasoning-first models built for real-world use. They support long context, multi-step tool use, and offer adjustable reasoning levels (low/medium/high). Think o3/o4-mini performance, but open. They’re built on a MoE architecture, but what really makes them shine is the quality of training data and reinforcement learning tuning. We are excited to launch with a few partners, including Hugging Face, Ollama, and OpenRouter. You can deploy both models on Fireworks AI or through them. You can use Fireworks supervised fine-tuning on these models today, and reinforcement fine-tuning is coming shortly! 👉 Here are the model links to get started: https://xmrrwallet.com/cmx.plnkd.in/e9S-5Y-C https://xmrrwallet.com/cmx.plnkd.in/eKHYFwMS
-
As the number of available AI models increases and their capabilities grow along multiple dimensions, it is more important than ever to evaluate those models rapidly, specifically, and repeatably. Today, we launched evalprotocol.io to assist with this crucial element of AI model selection. It aims to make evaluation simple for teams to implement and to help standardize findings, so you can select what’s right for your use case.
-
Fireworks AI reposted this
Excited to announce our Batch Inference launch! 🚀 We're making large-scale AI workflows 50% more cost-effective with our new batch processing capability: ✅ 50% off serverless pricing ✅ Access to 1000+ models (including your fine-tuned models) ✅ Fully managed infrastructure - no rate limits, no complexity ✅ OpenAI-compatible batch file format for easy migration Perfect for teams running: - Model evaluation pipelines - Large-scale data processing - Synthetic data generation for training - Bulk analytics and classification tasks Process thousands of requests with simple JSONL uploads. Focus on your AI applications while we cut costs and handle the infrastructure. Read the Blog here: https://xmrrwallet.com/cmx.plnkd.in/dj6vzu46 Read the Docs here: https://xmrrwallet.com/cmx.plnkd.in/d4Sk2gme
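The "OpenAI-compatible batch file format" is a JSONL file with one request object per line. A minimal sketch of building one — the model ID is a hypothetical placeholder, and the field shape follows the OpenAI batch-file convention, so verify it against the Fireworks batch docs linked above:

```python
import json

# Hypothetical bulk-classification workload: one prompt per request.
prompts = [
    "Classify the sentiment: 'great product!'",
    "Classify the sentiment: 'terrible support.'",
]

# One JSON object per line: a custom_id to match results back to inputs,
# plus the method, URL, and body of an ordinary chat-completion call.
with open("batch.jsonl", "w") as f:
    for i, prompt in enumerate(prompts):
        request = {
            "custom_id": f"req-{i}",
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": "accounts/fireworks/models/gpt-oss-20b",  # placeholder
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        f.write(json.dumps(request) + "\n")
```

Upload the resulting `batch.jsonl`, and each completed result comes back tagged with its `custom_id` so you can join outputs to inputs.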
-
Fireworks AI reposted this
🔥 Fireworks AI reaches 149 t/s on Qwen3-coder 480B A35B 🔥 What a month for open-weight models! We launched APIs for 5 SOTA models: Kimi K2, Qwen3-coder 480B, Qwen3 235B 2507, Qwen3 235B Thinking, and GLM 4.5. We know both latency and cost are important to application developers. Today, we are consistently #1, providing the best speed and price/performance, with production scaling to 6,000 requests per minute on serverless. As we continue to optimize our new model-serving stack, we are excited to see a wide range of applications onboarded: coding assistants, general search and research, agentic workflows, avatars, and domain-specific research across legal, finance, marketing, and more. Can't wait to see the next explosion of innovative applications! It's easy to get started with all the new models on Fireworks: https://xmrrwallet.com/cmx.plnkd.in/exkNPYit
-