AI Trends

This page summarizes qualitative trends across model families – how pricing, capabilities, and use cases are evolving – rather than ranking individual models.

1. Text → multimodal default

Most “text” LLMs are now multimodal by default: they accept images and sometimes audio or video, usually alongside long context windows. Pricing pressure is pushing providers to offer cheap “mini” tiers for bulk workloads, while premium models focus on reasoning and tool use.

2. Speech models commoditizing

Speech‑to‑text quality is converging at the top end; differentiation increasingly comes from latency, diarization, language coverage, and pricing per minute rather than raw accuracy alone.

3. Image & video racing forward

Image and video models are moving from pure generation to editing, in‑painting, and controllable outputs. Expect rapid iteration here, with frequent model version bumps and changing pricing.