Model Details
- Released: Oct 2024
- Status: Active
- Context Length: 128K
- Max Output: 4K
- Provider: OpenAI
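The two limits above interact when sizing a request. The sketch below shows one way to compute the largest output budget for a given prompt size. It assumes the context window is shared by input and output tokens, and it reads "128K" and "4K" as 128,000 and 4,000 tokens; the exact 4K figure and the helper name `max_allowed_output` are illustrative assumptions, not values confirmed by the provider.

```python
# Limits taken from the model details; exact values are assumptions.
CONTEXT_WINDOW = 128_000  # "128K" context length
MAX_OUTPUT = 4_000        # "4K" max output (could be 4,096 in practice)


def max_allowed_output(prompt_tokens: int) -> int:
    """Return the largest output budget that respects both limits.

    Assumes prompt and output tokens share the same context window.
    Returns 0 when the prompt already fills (or exceeds) the window.
    """
    remaining = CONTEXT_WINDOW - prompt_tokens
    return max(0, min(MAX_OUTPUT, remaining))


if __name__ == "__main__":
    for prompt in (1_000, 126_000, 130_000):
        print(prompt, max_allowed_output(prompt))
```

For short prompts the output cap is the binding limit; only prompts within a few thousand tokens of the window size reduce the available output further.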
Where is gpt-4o-audio-preview a perfect fit?
gpt-4o-audio-preview is a multimodal audio model built for speech comprehension and generation. It supports live transcription, language translation, and human-like voice output, and is optimized for natural, low-latency conversational experiences.
Perfect Fit For:
- Real-time transcription and translation systems
- Conversational AI with natural voice responses
- Audio-based accessibility and assistive tools
- Content narration, dubbing, and media automation
Quick Model Estimate
Your gpt-4o-audio-preview Cost Estimate (audio rates)
Total Cost: USD 0.12 for 1,000 input + 1,000 output tokens
- Input (1,000 tokens × $40.00 per 1M tokens): USD 0.04
- Output (1,000 tokens × $80.00 per 1M tokens): USD 0.08
Cost Breakdown
- Input: 33.3%
- Output: 66.7%
Prices updated daily from official provider data.
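A per-request estimate follows directly from the per-1M-token rates: cost = tokens × rate / 1,000,000. The sketch below applies that formula using the audio rates listed for this model ($40 input, $80 output per 1M tokens). The function name `estimate_cost` and its return shape are illustrative assumptions, and the rates should be re-checked against current provider pricing before relying on the result.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)

# Audio rates in USD per 1M tokens, as listed on this page.
AUDIO_INPUT_RATE = Decimal("40")
AUDIO_OUTPUT_RATE = Decimal("80")


def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_rate: Decimal = AUDIO_INPUT_RATE,
    output_rate: Decimal = AUDIO_OUTPUT_RATE,
) -> dict:
    """Return input, output, and total cost in USD for one request."""
    input_cost = Decimal(input_tokens) * input_rate / TOKENS_PER_MILLION
    output_cost = Decimal(output_tokens) * output_rate / TOKENS_PER_MILLION
    return {
        "input": input_cost,
        "output": output_cost,
        "total": input_cost + output_cost,
    }


if __name__ == "__main__":
    estimate = estimate_cost(1_000, 1_000)
    for part, usd in estimate.items():
        print(f"{part}: USD {usd}")
```

`Decimal` is used instead of `float` so that small per-request amounts add up without binary rounding drift. Passing the text rates (`Decimal("2.5")` and `Decimal("10")`) as `input_rate` and `output_rate` gives the corresponding text-modality estimate.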
Pricing
| Provider | Modality | Input Price (per 1M tokens) | Output Price (per 1M tokens) | Context Window | Last Updated |
|---|---|---|---|---|---|
| OpenAI | Audio | $40.00 | $80.00 | 128,000 tokens | 2025-11-22 |
| OpenAI | Text | $2.50 | $10.00 | 128,000 tokens | 2025-11-18 |
