langfuse · jannikmaierhoefer · Apr 20, 2026 · Apr 21, 2026 · Apr 21, 2026 · Apr 22, 2026
diff --git a/.../cookbook/evaluation_of_rag_with_ragas.md → ...cookbook/evaluation_of_rag_with_ragas.mdx b/.../cookbook/evaluation_of_rag_with_ragas.md → ...cookbook/evaluation_of_rag_with_ragas.mdx
@@ -1,7 +1,8 @@
 ---
-title: Evaluation of Rag with Ragas
-description: Use RAGAS to evaluate your RAG pipelines traced with Langfuse to measure the quality of your retrieval and sythesis.
-category: Evaluation
+source: "⚠️ This file is auto-generated from cookbook/evaluation_of_rag_with_ragas.ipynb. Do not edit this file directly — update the .ipynb file and regenerate with `bash scripts/update_cookbook_docs.sh`."
+title: "Evaluation of RAG pipelines with Ragas"
+description: "Use RAGAS to evaluate your RAG pipelines traced with Langfuse to measure the quality of your retrieval and synthesis."
+category: "Evaluation"
 ---
 
 # Evaluation of RAG pipelines with Ragas
@@ -52,16 +53,6 @@ fiqa_eval = load_dataset("explodinggradients/fiqa", "ragas_eval")['baseline']
 fiqa_eval
 ```
 
-
-
-
-    Dataset({
-        features: ['question', 'ground_truths', 'answer', 'contexts'],
-        num_rows: 30
-    })
-
-
-
 ## The Metrics
 We are going to measure the following aspects of a RAG system. These metrics are from the Ragas library:
 
@@ -143,14 +134,6 @@ row = fiqa_eval[0]
 row['question'], row['answer']
 ```
 
-
-
-
-    ('How to deposit a cheque issued to an associate in my business into my business account?',
-     '\nThe best way to deposit a cheque issued to an associate in your business into your business account is to open a business account with the bank. You will need a state-issued "dba" certificate from the county clerk\'s office as well as an Employer ID Number (EIN) issued by the IRS. Once you have opened the business account, you can have the associate sign the back of the cheque and deposit it into the business account.')
-
-
-
 Now let's initialize the Langfuse client SDK to instrument your app.
 
 
@@ -169,9 +152,6 @@ else:
     print("Authentication failed. Please check your credentials and host.")
 ```
 
-    Langfuse client is authenticated and ready!
-
-
 Here we are defining a utility function to score your trace with the metrics you chose.
 
 
@@ -237,21 +217,6 @@ print("RAGAS Scores:", ragas_scores)
 ragas_scores
 ```
 
-    calculating faithfulness
-    calculating answer_relevancy
-    calculating llm_context_precision_without_reference
-    RAGAS Scores: {'faithfulness': 0.8, 'answer_relevancy': np.float64(0.9825100521118072), 'llm_context_precision_without_reference': 0.9999999999}
-
-
-
-
-
-    {'faithfulness': 0.8,
-     'answer_relevancy': np.float64(0.9825100521118072),
-     'llm_context_precision_without_reference': 0.9999999999}
-
-
-
 Once the scores are computed you can add them to the trace in Langfuse:
 
 
@@ -340,13 +305,6 @@ traces_sample = sample(traces, NUM_TRACES_TO_SAMPLE)
 len(traces_sample)
 ```
 
-
-
-
-    3
-
-
-
 Now let's build a batch and score it. Ragas uses a Hugging Face `Dataset` object to build the dataset and run the evaluation. If you run this on your own production data, use the right keys to extract the question, contexts, and answer from the trace.
 
 
@@ -386,24 +344,13 @@ ds = Dataset.from_dict(evaluation_batch)
 r = evaluate(ds, metrics=[Faithfulness(), ResponseRelevancy()])
 ```
 
-
-    Evaluating:   0%|          | 0/6 [00:00<?, ?it/s]
-
-
 And that is it! You can see the scores over a time period.
 
 
 ```python
 r
 ```
 
-
-
-
-    {'faithfulness': 0.5516, 'answer_relevancy': 0.9294}
-
-
-
 You can also push the scores back into Langfuse or use the exported pandas dataframe to run further analysis.
 
 
@@ -419,66 +366,6 @@ df.head()
 
 
 
-<div>
-<style scoped>
-    .dataframe tbody tr th:only-of-type {
-        vertical-align: middle;
-    }
-
-    .dataframe tbody tr th {
-        vertical-align: top;
-    }
-
-    .dataframe thead th {
-        text-align: right;
-    }
-</style>
-<table border="1" class="dataframe">
-  <thead>
-    <tr style="text-align: right;">
-      <th></th>
-      <th>user_input</th>
-      <th>retrieved_contexts</th>
-      <th>response</th>
-      <th>faithfulness</th>
-      <th>answer_relevancy</th>
-      <th>trace_id</th>
-    </tr>
-  </thead>
-  <tbody>
-    <tr>
-      <th>0</th>
-      <td>Do I need a new EIN since I am hiring employee...</td>
-      <td>[You don't need to notify the IRS of new membe...</td>
-      <td>\nNo, you do not need a new EIN since you are ...</td>
-      <td>0.750000</td>
-      <td>0.992491</td>
-      <td>9a96d48d96d45b1bb6d28d48b7cc93d4</td>
-    </tr>
-    <tr>
-      <th>1</th>
-      <td>Privacy preferences on creditworthiness data</td>
-      <td>[See the first item in the list: For our every...</td>
-      <td>\nThe best answer to this question is that you...</td>
-      <td>0.571429</td>
-      <td>0.875799</td>
-      <td>18e23692aa5b2b245c176574e247a236</td>
-    </tr>
-    <tr>
-      <th>2</th>
-      <td>Have plenty of cash flow but bad credit</td>
-      <td>[This is probably a good time to note that cre...</td>
-      <td>\nIf you have plenty of cash flow but bad cred...</td>
-      <td>0.333333</td>
-      <td>0.919893</td>
-      <td>877d64dc4355743e2d2f1b2607d9ec14</td>
-    </tr>
-  </tbody>
-</table>
-</div>
-
-
-
 
 ```python
 for _, row in df.iterrows():

diff --git a/...des/cookbook/evaluation_with_langchain.md → ...es/cookbook/evaluation_with_langchain.mdx b/...des/cookbook/evaluation_with_langchain.md → ...es/cookbook/evaluation_with_langchain.mdx
@@ -1,7 +1,8 @@
 ---
-title: Evaluation with Langchain
-description: Cookbook that demonstrates how to run Langchain evaluations on data in Langfuse.
-category: Evaluation
+source: "⚠️ This file is auto-generated from cookbook/evaluation_with_langchain.ipynb. Do not edit this file directly — update the .ipynb file and regenerate with `bash scripts/update_cookbook_docs.sh`."
+title: "Run Langchain Evaluations on data in Langfuse"
+description: "Cookbook that demonstrates how to run Langchain evaluations on data in Langfuse."
+category: "Evaluation"
 ---
 
 # Run Langchain Evaluations on data in Langfuse
@@ -75,7 +76,9 @@ else:
     print("Authentication failed. Please check your credentials and host.")
 ```
 
-    Langfuse client is authenticated and ready!
+```
+Langfuse client is authenticated and ready!
+```
 
 
 ### Fetching data
@@ -114,7 +117,9 @@ generations[0].id
 
 
 
-    'adb5ba6beab14984ab89006ee09e9cd6'
+```
+'adb5ba6beab14984ab89006ee09e9cd6'
+```