GPT-4o mini Audio Preview is a compact multimodal model for building conversational audio applications. It supports speech input and output alongside text, enabling speech recognition, speech synthesis, and mixed text-audio dialogs with tool/function calling for structured actions. Typical uses include voice assistants, streaming transcription with summarization, IVR and call-bot workflows, and audio-enabled in-app helpers. Technical highlights include audio I/O, streaming responses, instruction following, and integration via chat and tools APIs.
Commercial Use
Features
Pricing
API
Versions
Pricing for GPT-4o mini Audio Preview
Explore competitive pricing for GPT-4o mini Audio Preview, designed to fit various budgets and usage needs. Our flexible plans ensure you only pay for what you use, making it easy to scale as your requirements grow. Discover how GPT-4o mini Audio Preview can enhance your projects while keeping costs manageable.
Comet Price (USD / M Tokens)
Official Price (USD / M Tokens)
Discount
Input:$75/M
Output:$300/M
Input:$93.75/M
Output:$375/M
-20%
Sample code and API for GPT-4o mini Audio Preview
Access comprehensive sample code and API resources for GPT-4o mini Audio Preview to streamline your integration process. Our detailed documentation provides step-by-step guidance, helping you leverage the full potential of GPT-4o mini Audio Preview in your projects.
Versions of GPT-4o mini Audio Preview
The reason GPT-4o mini Audio Preview has multiple snapshots may include potential factors such as variations in output after updates requiring older snapshots for consistency, providing developers a transition period for adaptation and migration, and different snapshots corresponding to global or regional endpoints to optimize user experience. For detailed differences between versions, please refer to the official documentation.