~ nghia-pham.dev _

> ls ./blog/

62 posts

  • How LLMs work: a mental model for developers

    You type a question into ChatGPT and get an answer three seconds later. What happens in between? This post opens the black box: tokenize, embed, attention, sample. No math formulas, just a mental model for developers who are comfortable with code but are reading about LLMs in depth for the first time.

    Apr 22, 2026 · ~10 min read
    llm ai machine-learning transformer tutorial
  • Calculus for LLMs: gradient, chain rule, backprop intuition

    Derivatives sound scary, but at their core they just measure slope. The gradient is the derivative of a function of several variables. The chain rule is how gradients are passed backward through many layers. Backprop is the chain rule applied systematically. This post builds intuition for developers rather than working through math exercises.

    Apr 22, 2026 · ~10 min read
    llm ai machine-learning math calculus
  • LLM from zero: Series Plan

    A 30-post roadmap for learning LLMs, from foundational math to production deployment, for senior developers who want to pivot to AI: mental model, tokenization, attention, training, fine-tuning, inference, advanced topics. Hybrid approach: 70% hands-on code + 30% blog.

    Apr 22, 2026 · ~6 min read
    llm ai machine-learning series learning-path
  • Linear algebra for LLMs: vector, matrix, dot product

    Post 1 said everything inside an LLM is a vector or a matrix. What is a vector? What is a matrix? Why is the dot product the backbone of attention and RAG? This post breaks the ice on the math foundation for developers: just 4 concepts, no complicated formulas.

    Apr 22, 2026 · ~13 min read
    llm ai machine-learning math linear-algebra
  • Minimal neural network: perceptron, MLP from zero

    Combine linear algebra, calculus, and probability into a first neural network. From the 1957 perceptron to a multi-layer MLP, with 60 lines of NumPy that train XOR without PyTorch. After this post, you understand the basic building block of every modern LLM.

    Apr 22, 2026 · ~12 min read
    llm ai machine-learning neural-network perceptron
  • Probability for LLMs: softmax, cross-entropy, perplexity

    LLM output is a probability distribution, not a hard choice. Softmax turns logits into a distribution. Cross-entropy is the standard loss function. Perplexity is a metric for evaluating models. This post explains why these concepts are at the heart of training and evaluation, with illustrative NumPy code.

    Apr 22, 2026 · ~11 min read
    llm ai machine-learning math probability
  • Build a BPE tokenizer from scratch (following Karpathy's minbpe)

    Post 6 introduced BPE. This post codes it from zero: 150 lines of pure Python with no dependencies. Train a tokenizer on Shakespeare, encode/decode, visualize merges. After this post you understand BPE 100%, instead of finding it abstract from reading the paper alone.

    Apr 22, 2026 · ~12 min read
    llm ai tokenization bpe python
  • Attention mechanism: Query, Key, Value intuition

    The paper 'Attention is All You Need' (2017) was the breakout moment for the Transformer. But where do Q/K/V come from, what do they mean, and why three instead of one? This post explains with a library analogy, no formulas, building intuition before the code in post 10.

    Apr 22, 2026 · ~11 min read
    llm ai attention transformer qkv
  • Embeddings: word2vec, contextual, and positional encoding (RoPE)

    A token ID becomes a vector: that is an embedding. But where does that vector come from? word2vec (2013) taught models to capture semantics. How do contextual embeddings (BERT/GPT) differ from word2vec? Why is positional encoding needed on top, and how does RoPE do it?

    Apr 22, 2026 · ~11 min read
    llm ai embeddings word2vec rope
  • Multi-head attention: why split into multiple heads

    Post 10 coded single-head attention. GPT/Llama have 32-128 heads. Why split? What does each head do differently? How much extra does it cost? This post: intuition + multi-head code in NumPy, visualizing head specialization (syntax, coreference, long-range).

    Apr 22, 2026 · ~13 min read
    llm ai attention multi-head transformer
  • nanoGPT: 300 lines of PyTorch to rebuild GPT from scratch

    Capstone for Part 3. Karpathy's nanoGPT is a complete GPT-2 implementation in ~300 lines of PyTorch. This post walks through the code, trains a small GPT on Shakespeare in 15 minutes on CPU, and generates text. After this post you can code a small GPT-2 without HuggingFace.

    Apr 22, 2026 · ~12 min read
    llm ai gpt pytorch nanogpt
  • Self-attention: code from scratch in NumPy

    Post 9 built the QKV intuition. This post codes a complete self-attention layer from zero in pure NumPy: 80 lines, handling batching, causal mask, and scaling. Verify that the output matches the PyTorch implementation. After this post, attention is no longer a black box.

    Apr 22, 2026 · ~10 min read
    llm ai attention self-attention numpy
  • Transformer block: attention + MLP + layer norm + residual

    Multi-head attention is half of the Transformer. The other half: MLP (feed-forward), layer normalization, residual connection. This post assembles the 4 components into one complete block, stacks 12 blocks into GPT-2, and explains the ordering (pre-norm vs post-norm) and why residuals matter.

    Apr 22, 2026 · ~13 min read
    llm ai transformer mlp layer-norm
  • Tokenization: BPE, WordPiece, SentencePiece

    Post 1 said input text becomes tokens. But how? How do BPE, WordPiece, and SentencePiece differ? Why does the tokenizer decide more than you think, from API cost to how well a model handles Vietnamese? A deep dive for developers.

    Apr 22, 2026 · ~14 min read
    llm ai machine-learning tokenization bpe
  • AI Coding Providers Series: Choose the right plan for your workload

    A series researching and comparing AI coding plans (subscription + API pay-per-token) from Anthropic, Alibaba, GLM, Moonshot, and OpenAI. Helps developers choose the right provider for their budget and real-world workflow.

    Apr 21, 2026 · ~1 min read
    ai llm coding pricing comparison
  • Which AI coding plan should you buy? Research on 5 major providers (2026-04)

    A detailed comparison of subscription plans and API pay-per-token pricing from Anthropic, Alibaba, GLM, Moonshot, and OpenAI as of April 2026. Includes a decision framework and warnings about billing pitfalls.

    Apr 21, 2026 · ~11 min read
    ai llm coding pricing comparison
  • Does Vietnamese cost 2x more tokens? The data says otherwise

    A benchmark on 5626 real prompts from 555 Claude Code sessions. The claim that 'Vietnamese costs 2x more tokens' holds for only 2.9% of use cases. Most of the time, mixed Vietnamese-English prompts are even cheaper than pure English, and the data shows why.

    Apr 21, 2026 · ~14 min read
    llm prompt-engineering token-optimization benchmark
  • Does Vietnamese really cost 2x+ tokens in LLM prompts? Data from 5626 real messages

    Benchmark across 5626 real prompts from 555 Claude Code sessions shows the '>2x token' claim for Vietnamese only applies to 2.9% of actual usage. Mixed Vietnamese-English prompts are more token-efficient than pure English on longer messages, and the data shows why.

    Apr 21, 2026 · ~13 min read
    llm prompt-engineering token-optimization benchmark
  • Canvas: building branded reports for stakeholders

    Use Kibana's Canvas to build pixel-precise, company-branded infographics: how it differs from Dashboard, the expression language pipeline, the ESSQL data source, dynamic images and colors driven by values, and multi-page PDF export for the CEO/CFO. For backend developers and platform teams.

    Apr 16, 2026 · ~8 min read
    kibana canvas reporting essql visualization
  • Advanced Discover: Runtime fields, complex filters, highlighting

    Take Discover from the basics to power-user level: create Runtime fields without reindexing, filter nested objects and regex, enable highlighting to scan logs quickly, distinguish Saved Query from Saved Search, and inspect requests to debug queries and tune performance.

    Apr 16, 2026 · ~8 min read
    kibana discover runtime-fields painless elasticsearch
  • KQL and ES|QL: Comparing Kibana's two query languages

    Distinguish KQL and ES|QL in Kibana 8.x: their differing philosophies, side-by-side syntax, common pitfalls, and rules of thumb for choosing a language for filters, aggregations, alerts, and dashboards. For backend developers and DevOps.

    Apr 16, 2026 · ~10 min read
    kibana elasticsearch kql esql query-language
  • Lens: from drag-and-drop to complex formulas

    Build visualizations in Kibana 8.x with Lens: basic drag-and-drop charts, Formula mode with functions and time shift, annotation layers for deploy markers, reference lines for SLOs, and pitfalls around cardinality and time intervals. For backend developers who want to build production-grade dashboards themselves.

    Apr 16, 2026 · ~8 min read
    kibana lens visualization dashboard formula
  • Kibana for Developers: Log filtering, Saved Search, Dashboard, and REST API

    A comprehensive guide to using Kibana for backend developers: filter error logs with KQL, avoid pitfalls with ES|QL, create Saved Searches and Dashboards through the GUI, interact with Kibana via the REST API, and manage API keys safely.

    Apr 15, 2026 · ~11 min read
    kibana elasticsearch logging elk observability
  • Kibana from A to Z: Series Plan

    Roadmap for a 28-post series on Kibana, from basics to production, covering Discover, KQL/ES|QL, Lens, Dashboard, Alerts, RBAC, ILM, automation, and troubleshooting for backend developers.

    Apr 15, 2026 · ~5 min read
    kibana series learning-path elk observability
  • Backstage on Kubernetes: Practical Platform Engineering Guide

    Implement a practical Internal Developer Platform with Backstage on Kubernetes, software templates, service catalog, and golden paths for engineering teams.

    ~2 min read
    backstage kubernetes platform-engineering idp developer-experience
  • ArgoCD Advanced Patterns: App of Apps and Promotion Flows

    Implement advanced ArgoCD patterns for scalable GitOps: App of Apps, environment promotion, sync waves, and safe progressive delivery workflows.

    ~2 min read
    argocd gitops kubernetes progressive-delivery cicd
  • [24/24] E is for Etcd: Understanding the Brain of Kubernetes

    A deep dive into etcd, the distributed key-value store that powers Kubernetes. Learn about consistency, high availability, and backup strategies.

    ~2 min read
    kubernetes a-to-z-series etcd database distributed-systems
  • [23/24] B is for Best Practices: Building Secure and Reliable Apps

    A post in our Kubernetes A-to-Z series covering essential best practices for security, reliability, and resource management.

    ~3 min read
    kubernetes a-to-z-series best-practices security reliability
  • [19/24] A is for Authentication and RBAC: Securing Your Cluster

    A post in our Kubernetes A-to-Z series covering authentication mechanisms, Role-Based Access Control, security contexts, and cluster security best practices.

    ~6 min read
    kubernetes a-to-z-series authentication rbac security
  • [4/24] D is for Deployments: Managing Application Lifecycle

    The fourth post in our Kubernetes A-to-Z series covering Deployments, rolling updates, rollbacks, and application lifecycle management strategies.

    ~7 min read
    kubernetes a-to-z-series deployments rolling-updates rollbacks
  • [2/24] C is for Containers: Docker Fundamentals Before Kubernetes

    The second post in our Kubernetes A-to-Z series covering container fundamentals, Docker basics, and essential concepts needed before learning Kubernetes.

    ~8 min read
    docker containers a-to-z-series kubernetes fundamentals
  • [20/24] F is for Federation: Multi-Cluster Management

    A post in our Kubernetes A-to-Z series covering multi-cluster architectures, federation patterns, service mesh, disaster recovery, and cross-cluster communication.

    ~6 min read
    kubernetes a-to-z-series federation multi-cluster service-mesh
  • [22/24] G is for GitOps: Modern Deployment Workflows

    A comprehensive guide to GitOps principles and practices, comparing ArgoCD and FluxCD with practical examples, deployment strategies, and production best practices.

    ~10 min read
    gitops argocd fluxcd kubernetes ci-cd
  • Building Internal Developer Platforms on Kubernetes: A Comprehensive Guide

    Learn how to build an Internal Developer Platform (IDP) on Kubernetes with Backstage, self-service capabilities, golden paths, and platform engineering best practices.

    ~12 min read
    platform-engineering kubernetes backstage developer-experience devops
  • [11/24] I is for Ingress: Managing External Access

    A post in our Kubernetes A-to-Z series covering Ingress controllers, routing rules, TLS termination, and advanced traffic management patterns.

    ~6 min read
    kubernetes a-to-z-series ingress networking tls
  • [1/24] K is for Kubernetes: Understanding the Basics and Architecture

    The first post in our Kubernetes A-to-Z series covering Kubernetes fundamentals, architecture, components, and basic cluster setup.

    ~7 min read
    kubernetes a-to-z-series architecture basics tutorial
  • [7/24] J is for Jobs and CronJobs: Batch Processing in Kubernetes

    Learn how to run one-off tasks and scheduled batch jobs in Kubernetes using Jobs and CronJobs resources.

    ~2 min read
    kubernetes a-to-z-series jobs cronjobs batch-processing
  • Kafka Partition Design for IoT: Throughput and Ordering

    Design Kafka topic and partition strategy for IoT workloads with practical guidance on throughput, ordering, consumer scaling, and operational limits.

    ~3 min read
    kafka iot streaming partitions architecture
  • Kubernetes Backup and Disaster Recovery: Velero and etcd

    Design a practical backup and disaster recovery strategy for Kubernetes with etcd snapshots, Velero, restore drills, and RTO/RPO planning.

    ~2 min read
    kubernetes disaster-recovery backup velero etcd
  • [12/24] H is for Helm: Package Management for Kubernetes

    A post in our Kubernetes A-to-Z series covering Helm charts, repositories, templating, values, and application lifecycle management.

    ~7 min read
    kubernetes a-to-z-series helm package-management charts
  • Kubernetes Multi-Tenancy: Namespace, RBAC, and Quota Design

    Design a practical multi-tenant Kubernetes model with namespace boundaries, RBAC, network isolation, quotas, and operational guardrails.

    ~2 min read
    kubernetes multi-tenancy rbac namespace resourcequota
  • Kubernetes Cost Optimization in Production

    A practical guide to reducing Kubernetes infrastructure spend with right-sizing, autoscaling, scheduling strategy, and workload-level optimization.

    ~3 min read
    kubernetes finops cost-optimization autoscaling performance
  • Kubernetes Security Hardening Checklist for Production

    A practical security hardening checklist for production Kubernetes clusters, covering identity, network, workloads, supply chain, and runtime controls.

    ~3 min read
    kubernetes security hardening rbac networkpolicy
  • Kubernetes A-to-Z Series: Complete Learning Path

    A comprehensive 24-part blog series covering Kubernetes from beginner to advanced level, with practical examples and real-world scenarios.

    ~5 min read
    kubernetes series learning-path devops containers
  • Kubernetes vs Docker Swarm: Complete Comparison Guide with Command Cheatsheets

    A comprehensive comparison of Kubernetes and Docker Swarm container orchestration platforms, including detailed command cheatsheets, architecture differences, and practical examples.

    ~8 min read
    kubernetes docker-swarm container-orchestration devops comparison
  • [10/24] M is for ConfigMaps and Secrets: Managing Configuration

    A post in our Kubernetes A-to-Z series covering ConfigMaps, Secrets, configuration management patterns, and environment-specific deployments.

    ~7 min read
    kubernetes a-to-z-series configmaps secrets configuration
  • [15/24] L is for Logging and Monitoring: Observability in Kubernetes

    A post in our Kubernetes A-to-Z series covering logging architectures, Prometheus metrics, distributed tracing, and observability best practices.

    ~7 min read
    kubernetes a-to-z-series logging monitoring observability
  • [13/24] O is for Operators: Extending Kubernetes Functionality

    A post in our Kubernetes A-to-Z series covering Operators, Custom Resource Definitions (CRDs), controller patterns, and extending Kubernetes.

    ~6 min read
    kubernetes a-to-z-series operators crd custom-resources
  • [8/24] N is for Namespaces: Organizing Your Cluster

    A post in our Kubernetes A-to-Z series covering Namespaces, multi-tenancy, resource quotas, and cluster organization strategies.

    ~8 min read
    kubernetes a-to-z-series namespaces multi-tenancy resource-quotas
  • [3/24] P is for Pods: The Basic Building Blocks of Kubernetes

    The third post in our Kubernetes A-to-Z series covering pods, their lifecycle, networking, storage, and multi-container patterns.

    ~10 min read
    kubernetes a-to-z-series pods containers multi-container
  • Kubernetes Observability Stack: Prometheus, OpenTelemetry, and Loki

    Build a practical Kubernetes observability stack using metrics, logs, and traces with Prometheus, OpenTelemetry, Loki, and actionable SLO-driven alerting.

    ~2 min read
    kubernetes observability prometheus opentelemetry loki
  • PostgreSQL Index Size Deep Dive: Why Indexes Grow Fast

    Understand why PostgreSQL indexes can grow quickly in production and how to control index bloat with better schema design, maintenance, and query patterns.

    ~2 min read
    postgresql database index performance storage
  • [17/24] Q is for Quality Assurance: Testing in Kubernetes

    A post in our Kubernetes A-to-Z series covering testing strategies, chaos engineering, CI/CD integration, and quality assurance best practices.

    ~6 min read
    kubernetes a-to-z-series testing quality-assurance chaos-engineering
  • [6/24] R is for ReplicaSets: Ensuring High Availability

    The sixth post in our Kubernetes A-to-Z series covering ReplicaSets, scaling strategies, pod disruption budgets, and high availability patterns.

    ~7 min read
    kubernetes a-to-z-series replicasets high-availability scaling
  • Stateful Workloads on Kubernetes: PostgreSQL and Kafka Operators

    Run stateful workloads safely on Kubernetes with operator-based patterns for PostgreSQL and Kafka, including storage, scaling, backup, and failure recovery.

    ~2 min read
    kubernetes stateful postgresql kafka operators
  • Service Mesh Deep Dive: Istio vs Linkerd vs Consul Connect

    A comprehensive comparison of service mesh platforms including architecture, features, performance benchmarks, and practical implementation guides for Istio, Linkerd, and Consul Connect.

    ~11 min read
    service-mesh istio linkerd consul kubernetes
  • [5/24] S is for Services: Networking and Service Discovery

    The fifth post in our Kubernetes A-to-Z series covering Services, networking patterns, service discovery, and load balancing in Kubernetes.

    ~7 min read
    kubernetes a-to-z-series services networking service-discovery
  • [16/24] T is for Troubleshooting: Common Issues and Solutions

    A post in our Kubernetes A-to-Z series covering debugging techniques, common issues, diagnostic commands, and systematic troubleshooting approaches.

    ~8 min read
    kubernetes a-to-z-series troubleshooting debugging diagnostics
  • [18/24] U is for Upgrades: Managing Cluster Lifecycle

    Master the art of Kubernetes upgrades. Learn about version skew policies, node draining, and strategies for zero-downtime cluster maintenance.

    ~2 min read
    kubernetes a-to-z-series upgrades maintenance lifecycle
  • [9/24] V is for Volumes: Persistent Storage in Kubernetes

    A post in our Kubernetes A-to-Z series covering Volumes, PersistentVolumes, PersistentVolumeClaims, storage classes, and stateful application patterns.

    ~8 min read
    kubernetes a-to-z-series volumes persistent-storage pv
  • [14/24] Y is for YAML: Mastering the Language of Kubernetes

    Love it or hate it, YAML is the language of Kubernetes. Learn syntax tips, common pitfalls, and tools to validate your manifests.

    ~2 min read
    kubernetes a-to-z-series yaml configuration tools
  • [21/24] Z is for Zero-Downtime Deployments: Advanced Deployment Strategies

    The final post in our Kubernetes A-to-Z series covering advanced deployment strategies, GitOps, progressive delivery, canary deployments, and production-ready patterns.

    ~6 min read
    kubernetes a-to-z-series zero-downtime deployment-strategies gitops
$ echo "built with Astro"
© 2026 Nghia Pham | RSS | GitHub | nghia-pham.com