We Benchmarked MiniCPM-o 4.5 in Korean. Here's What Actually Happens.
We benchmarked MiniCPM-o 4.5's Korean performance side by side with English. Image descriptions, OCR, document extraction — what works, what breaks, and why the root cause is architecture, not prompts.

MiniCPM-o 4.5 is an omni model optimized for English and Chinese. How well does it handle Korean?
We tested with the same images, same questions — one in Korean, one in English, side by side. Image description, OCR, document extraction, and fine-tuning, all tested hands-on.
The short answer: Korean works. But there are fascinating failure modes, and the root cause isn't what you'd expect.
Test Setup
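The core of the setup is simple: every image gets the same question twice, once in Korean and once in English, and the two answers are compared side by side. A minimal sketch of that loop is below. The `ask_model` wrapper, the test images, and the questions are illustrative placeholders, not the actual harness or the real MiniCPM-o 4.5 API.

```python
# Sketch of the side-by-side Korean/English test loop.
# `ask_model` is a hypothetical stand-in for MiniCPM-o 4.5 inference.

TEST_CASES = [
    # (image, Korean question, English question) -- same content, two languages
    ("street_sign.jpg", "이 이미지를 설명해 주세요.", "Please describe this image."),
    ("receipt.jpg", "영수증의 총액은 얼마인가요?", "What is the total on this receipt?"),
]

def ask_model(image: str, question: str) -> str:
    # Placeholder: the real harness would run model inference here.
    return f"[answer for {image!r} / {question!r}]"

def run_side_by_side(cases):
    """Ask each question in both languages and pair up the answers."""
    results = []
    for image, ko_question, en_question in cases:
        results.append({
            "image": image,
            "korean": ask_model(image, ko_question),
            "english": ask_model(image, en_question),
        })
    return results

for row in run_side_by_side(TEST_CASES):
    print(row["image"], "->", row["korean"], "|", row["english"])
```

Keeping the image and question content identical across the two runs isolates the language variable, so any quality gap between the paired answers can be attributed to Korean handling rather than the task itself.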
Related Posts

Why GPT-4o Is So Fast: The Critical Difference Between Multimodal and Omni Models
A token-level analysis comparing the text bottleneck of the pipeline approach (STT→LLM→TTS) with the native token fusion of omni models. Explains why GPT-4o and MiniCPM-o are fundamentally faster.

On-Device GPT-4o Has Arrived? A Deep Dive into MiniCPM-o 4.5
OpenBMB's MiniCPM-o 4.5 achieves GPT-4o-level vision performance with just 9B parameters, running on only 11GB VRAM with Int4 quantization. A deep analysis of the architecture, benchmarks, and practical deployment guide.

PaperBanana: AI Now Generates Publication-Quality Academic Illustrations
PaperBanana from Google and Peking University is an agentic system that automatically generates publication-ready academic illustrations from paper text.