#benchmark | nghia-pham.dev

Does Vietnamese cost more than 2x the tokens? The data says otherwise Apr 21, 2026 ~14 min read
Does Vietnamese really cost 2x+ tokens in LLM prompts? Data from 5626 real messages Apr 21, 2026 ~13 min read
Evaluation: MMLU, GSM8K, HumanEval, custom benchmark May 17, 2026 ~10 min read
Local LLM 2026, part 2: Apple Silicon vs CUDA vs CPU benchmark May 21, 2026 ~14 min read
Nearly half of AI-generated code has security flaws: don't panic, put a gate in place May 29, 2026 ~3 min read