Alibaba Qwen models: Qwen2.5-72B-Instruct & Qwen2.5-Coder-32B-Instruct
under review
endu
Merged in a post:
Add Qwen2.5-Max
Alibaba's new model, Qwen2.5-Max, shows strong ability in coding and mathematics. I believe it can make us more efficient.
endu
under review
Samuel Jackson
Qwen2.5-Coder-32B-Instruct
Price:
Input tokens: $0.80 per 1M tokens.
Output tokens: $0.80 per 1M tokens.
Blended price (3:1 input-to-output ratio): $0.80 per 1M tokens.
Context Window: Up to 131,072 tokens.
Output Speed: Approximately 72 tokens per second.
Latency (Time to First Token): Around 0.37–0.49 seconds depending on the provider.
Rate Limits:
Specific rate limits depend on the API provider (e.g., Fireworks, DeepInfra, etc.), but no explicit hard limit is mentioned in the sources.
Qwen2.5-72B-Instruct
Price:
Input tokens: $0.40 per 1M tokens.
Output tokens: $0.75 per 1M tokens.
Blended price (3:1 input-to-output ratio): $0.40–$0.90 per 1M tokens, depending on the provider.
Context Window: Up to 131,072 tokens.
Output Speed: Approximately 67.9–77 tokens per second depending on the provider.
Latency (Time to First Token): Around 0.31–0.82 seconds depending on the provider.
Rate Limits:
As with Qwen2.5-Coder-32B-Instruct, rate limits depend on the API provider, and the sources give no specific hard limits.
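
For anyone comparing providers, here is a rough sketch of how a blended price at a 3:1 input-to-output ratio can be worked out. It assumes "blended" means a token-weighted average (3 parts input price, 1 part output price); the sources don't spell out the formula, so treat that as my assumption. The function name blended_price is just made up for illustration, and the prices are the ones listed above.

```python
def blended_price(input_price: float, output_price: float,
                  input_parts: int = 3, output_parts: int = 1) -> float:
    """Token-weighted average price per 1M tokens.

    Assumption: a 3:1 blend means 3 input tokens for every 1 output token.
    """
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts


# Prices per 1M tokens (USD), taken from the figures listed in this post.
coder_32b = blended_price(0.80, 0.80)
instruct_72b = blended_price(0.40, 0.75)

print(f"Qwen2.5-Coder-32B-Instruct blended: ${coder_32b:.4f} per 1M tokens")
print(f"Qwen2.5-72B-Instruct blended:       ${instruct_72b:.4f} per 1M tokens")
```

With the listed prices this gives about $0.80 for Qwen2.5-Coder-32B-Instruct, matching the figure above, and about $0.49 for Qwen2.5-72B-Instruct, which sits inside the quoted $0.40–$0.90 provider range.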