LLMOps

Production concerns for LLM applications — evaluation, observability, experimentation, safety, and end-to-end system integration.

Topics

  • Evaluation Metrics — Measuring RAG quality with RAGAS, DeepEval, Arize Phoenix, and MLflow: faithfulness, answer relevance, and context metrics

  • LLMOps & Observability — Production monitoring with LangFuse, LangSmith, and cost optimization strategies

  • Experiment Comparison — Rigorous comparison of Naive RAG, GraphRAG, and Hybrid architectures using evaluation metrics

  • AI Safety & Guardrails — Implementing safety boundaries, content filtering, and quality controls for production LLM applications

  • Building RAG Agents — Capstone: assembling a complete RAG agent using the ReAct pattern, retriever tools, and the agent reasoning loop

Prerequisites

Complete Foundations, RAG Optimization, and Agents first.