GPT-4o mini Audio Preview

Input:$75/M

Output:$300/M

GPT-4o mini Audio Preview is a compact multimodal model for building conversational audio applications. It supports speech input and output alongside text, enabling speech recognition, speech synthesis, and mixed text-audio dialogs with tool/function calling for structured actions. Typical uses include voice assistants, streaming transcription with summarization, IVR and call-bot workflows, and audio-enabled in-app helpers. Technical highlights include audio I/O, streaming responses, instruction following, and integration via chat and tools APIs.

Commercial Use

Features

Pricing

API

Versions

Pricing for GPT-4o mini Audio Preview

Explore competitive pricing for GPT-4o mini Audio Preview, designed to fit various budgets and usage needs. Our flexible plans ensure you only pay for what you use, making it easy to scale as your requirements grow. Discover how GPT-4o mini Audio Preview can enhance your projects while keeping costs manageable.

Comet Price (USD / M Tokens)	Official Price (USD / M Tokens)	Discount
Input:$75/M Output:$300/M	Input:$93.75/M Output:$375/M	-20%

Versions of GPT-4o mini Audio Preview

The reason GPT-4o mini Audio Preview has multiple snapshots may include potential factors such as variations in output after updates requiring older snapshots for consistency, providing developers a transition period for adaptation and migration, and different snapshots corresponding to global or regional endpoints to optimize user experience. For detailed differences between versions, please refer to the official documentation.

version
gpt-4o-mini-audio-preview-2024-12-17
gpt-4o-mini-audio-preview