Blog — Shailesh Mishra

June 11, 2026 · 13 min read

How AI Actually Helps You Fix PostgreSQL Performance Problems (and Where It Lies)

AI won’t replace your EXPLAIN ANALYZE instincts — but grounded in real stats, it compresses the diagnose-to-fix loop from an hour to minutes. A DBA’s field guide to where it helps and where it lies.

postgresql ai-database-tuning explain-analyze pgvector

June 8, 2026 · 5 min read

AI Agent Evals: Production Readiness Guide

Benchmarks tell you whether an agent can solve a task. Production evals tell you whether it will behave safely when the task gets messy.

ai-agent-evals production-readiness SWE-bench behavior-contracts

July 17, 2025 · 25 min read

Build the Eval System: Three Graders, 38 Tasks, and the $3-8 Safety Net (Part 2 of 2)

The complete practitioner’s guide: three grader types ($0/$0/$$), four task patterns, CI architecture, three real regressions caught, and a 4-week playbook — all for $3-8 per eval run.

ai-agent-evaluation agent-grading ci-evals series-part-2

July 17, 2025 · 13 min read

AI Agent Evals: Why SWE-bench Isn't Enough Before Production (Part 1 of 2)

Your AI agent scores 78% on SWE-bench. It also just told a developer it deployed infrastructure — without calling a single tool. Here’s what benchmarks miss, and the $0 eval that catches it.

ai-agent-evaluation github-copilot agentic-ai series-part-1

May 7, 2026 · 12 min read

Spend Fewer Tokens, Get Better Code: A Context Engineering Guide for AI Code Assistants (Part 1 of 3)

Anthropic cut tool context by 85%. Accuracy improved from 49% to 74%. Five context engineering practices that make your AI code assistant produce better output — while spending fewer tokens.

context-engineering github-copilot ai-code-assistant series-part-1

May 11, 2026 · 10 min read

Invisible Compound Savings: Caching, Workflow Discipline, and the Habits That Add Up (Part 2 of 3)

90% of your AI prompt context repeats across every request. Prompt caching gives you 90% off. The retry tax costs you 1.4x. Here is how structural habits compound into invisible savings.

prompt-caching github-copilot ai-code-assistant series-part-2

May 11, 2026 · 10 min read

The 120x Spread: Understanding What You Pay For and When It Matters (Part 3 of 3)

The cheapest AI model costs 0.25x. The most expensive costs 30x. A three-tier task taxonomy for matching model capability to task complexity, plus the complete three-layer optimization playbook.

model-selection github-copilot ai-code-assistant series-part-3

April 26, 2026 · 12 min read

PostgreSQL EXPLAIN BUFFERS: How We Cut Checkout Latency 96%

A real-world e-commerce case study: one word added to EXPLAIN ANALYZE diagnosed a checkout regression from 50ms to 1.2s that three days of network debugging missed.

postgresql performance explain-buffers case-study