Agent Skills
Portable, on-demand workflow packages for coding agents -- what they are, how they differ from rules and project memory, how to use them across tools, and how to author your own with examples.
What makes an LLM system an agent, how tool use works, the canonical multi-agent patterns, and the MCP and A2A protocols that connect agents to tools and to each other.
Short, plain-English definitions of the core AI, LLM, agent, and RAG terms used across this section, with links to the deep-dive pages.
Product and UX patterns for shipping LLM features to users -- interaction models, streaming, trust, when not to use AI, and graceful degradation.
What guardrails are and how they work, their documented limitations, the attack surface (prompt injection and jailbreaking), red-teaming as an evaluation method, and the layered, defense-in-depth approach to deploying LLMs responsibly.
Using LLM coding agents inside an engineering workflow -- the alignment-before-generation methodologies (SPDD, architect-as-orchestrator), architecture patterns that suit AI (deep modules, vertical slices), and the economics driving adoption.
A hands-on guide to running a model locally with Ollama or LM Studio and talking to it from your own simple app -- with copy-paste code for a browser app, a Node.js CLI, and the OpenAI SDK.
When to use enterprise cloud AI platforms (Bedrock, SageMaker, Azure / Microsoft Foundry, Vertex AI) versus running open-weights models locally with Ollama, LM Studio, llama.cpp, or vLLM -- with a decision framework and trade-off tables.
The context window as a finite budget, why prompt engineering grew into context engineering, context rot, and the long-horizon techniques -- compaction, structured note-taking, sub-agents, and just-in-time retrieval.
Token economics, latency drivers, and practical patterns for choosing model tiers, caching, and fallback chains -- without treating cost as an afterthought.
Production troubleshooting for LLM features -- classify the failure; inspect prompts, retrieval, tools, and logs; and fix the right layer without guessing.
How embedding models work, the 2026 model landscape, how to choose one, dimensions and Matryoshka, vector quantization, domain adaptation, hybrid retrieval, and the production pitfalls that quietly halve recall.
How to test non-deterministic LLM systems with datasets, scorers, and LLM-as-judge; eval-driven development and harness engineering; and the LLMOps discipline of operating prompts, models, and agents in production.
Approval gates, escalation, and accountability patterns for agents and LLM features that act in the real world -- maker-checker, confidence thresholds, and audit trails.
The design axis behind RAG, just-in-time retrieval, structured note-taking, the LLM-wiki pattern, and llms.txt -- where synthesized knowledge lives, who maintains it, and when to use each.
What a large language model is, how it produces text token by token, how it is trained, and what it is and is not good at.
Practical guidance for builders on PII in prompts, retention and residency, logging risks, and minimization patterns when shipping LLM features.
Always-on context for coding agents -- AGENTS.md, CLAUDE.md, Cursor rules, and how to split conventions from on-demand skills.
How retrieval-augmented generation grounds an LLM in external knowledge using embeddings and vector databases, how it compares to fine-tuning, and the production levers that make it work.
Getting reliable JSON and schema-bound responses from LLMs -- native structured output modes, validation and repair loops, and when structure beats free-form prose.
A map of the AI application tooling landscape -- orchestration frameworks, connectivity protocols, vector databases, evaluation and observability, and the LLMOps discipline that ties them together.
A decision guide for the AI section -- pick the right combination of LLM, RAG, agents, skills, structured outputs, and human review for common goals without rereading every page.