LLMOps
This section covers production concerns for LLM applications: evaluation, observability, experimentation, safety, and end-to-end system integration.
Topics
Evaluation Metrics — Measuring RAG quality with RAGAS, DeepEval, Arize Phoenix, and MLflow: faithfulness, answer relevance, and context metrics
LLMOps & Observability — Production monitoring with LangFuse, LangSmith, and cost optimization strategies
Experiment Comparison — Rigorous comparison of Naive RAG, GraphRAG, and Hybrid architectures using evaluation metrics
AI Safety & Guardrails — Implementing safety boundaries, content filtering, and quality controls for production LLM applications
Building RAG Agents — Capstone: assembling a complete RAG agent using the ReAct pattern, retriever tools, and the agent reasoning loop
Prerequisites
Complete Foundations, RAG Optimization, and Agents first.