Interpret a Classification Report to Improve Your Machine Learning Model
Get an expert interpretation of your classification report with per-class analysis, error diagnosis, and actionable improvement steps.
📋 The Prompt
You are a machine learning evaluation expert. I have a classification report from my model and I need help interpreting the results and identifying actionable improvements.
**Project Context:**
- Problem description: [DESCRIBE_WHAT_YOU_ARE_CLASSIFYING]
- Number of classes: [NUMBER_OF_CLASSES]
- Class names and their real-world meaning: [LIST_CLASS_NAMES_AND_MEANINGS]
- Business cost of misclassification: [DESCRIBE_WHICH_ERRORS_ARE_MOST_COSTLY, e.g., false negatives in fraud detection are very expensive]
**My Classification Report:**
```
[PASTE_YOUR_FULL_CLASSIFICATION_REPORT_HERE]
```
**Additional Context:**
- Training set size per class: [LIST_TRAINING_SAMPLES_PER_CLASS]
- Model used: [MODEL_NAME]
- Features used: [BRIEF_DESCRIPTION_OF_FEATURES]
Please analyze this report by addressing the following:
1. **Metric-by-Metric Breakdown:** Explain what precision, recall, F1-score, and support mean specifically in the context of MY classes (not generic definitions).
2. **Per-Class Performance Analysis:** Identify which classes the model handles well and which it struggles with. Explain why certain classes might be underperforming.
3. **Macro vs. Weighted vs. Micro Averages:** Explain the differences between these averages in my report and which one I should prioritize given my class distribution.
4. **Error Pattern Diagnosis:** Based on the precision-recall patterns, hypothesize what types of confusion errors the model is likely making. Suggest generating a confusion matrix to confirm.
5. **Actionable Improvement Plan:** Provide 5 specific, prioritized recommendations to improve the weakest-performing classes, including data-level, feature-level, and algorithm-level strategies.
6. **Threshold Tuning Guidance:** Explain how adjusting the classification threshold could shift the precision-recall balance for my most critical class.
Use plain language alongside technical terms so that both technical and non-technical stakeholders can understand.
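If you still need to produce the report and confusion matrix the prompt asks for, a minimal scikit-learn sketch follows. The `y_true`/`y_pred` arrays and the "fraud"/"legit" labels are toy placeholders, not part of any particular project:

```python
# Minimal sketch: generating the classification report (with the support
# column) and the confusion matrix referenced in step 4. Replace the toy
# labels below with your own ground truth and predictions.
from sklearn.metrics import classification_report, confusion_matrix

y_true = ["fraud", "legit", "legit", "fraud", "legit", "legit"]
y_pred = ["fraud", "legit", "fraud", "legit", "legit", "legit"]

# Per-class precision, recall, F1, and support, plus the macro/weighted rows.
print(classification_report(y_true, y_pred, zero_division=0))

# Rows are true classes, columns are predicted classes; off-diagonal
# counts show exactly which classes are being confused.
print(confusion_matrix(y_true, y_pred, labels=["fraud", "legit"]))
```

Pasting both outputs into the prompt gives the analysis in steps 2 and 4 concrete numbers to work from.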
💡 Tips for Better Results
Always paste the complete classification report including the support column, as sample sizes per class are critical for proper interpretation. Clearly describe which misclassification types are most costly to your business; this fundamentally changes which metrics matter most. Include your model type so recommendations can be model-specific.
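To experiment with the threshold tuning discussed in step 6 before asking for guidance, this sketch shows the basic mechanic: probabilistic classifiers expose `predict_proba`, and the default 0.5 cutoff can be moved. The logistic regression model and synthetic data are purely illustrative:

```python
# Sketch of classification-threshold tuning: lowering the threshold
# flags more samples as positive (raising recall, usually lowering
# precision); raising it does the opposite. Synthetic data for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)

model = LogisticRegression().fit(X, y)
proba = model.predict_proba(X)[:, 1]  # probability of the positive class

for threshold in (0.3, 0.5, 0.7):
    preds = (proba >= threshold).astype(int)
    print(f"threshold={threshold}: {preds.sum()} predicted positive")
```

For a costly-false-negative setting such as fraud, a lower threshold on the fraud class is the usual starting point; `sklearn.metrics.precision_recall_curve` can map the full trade-off.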
🎯 Use Cases
Data scientists and analysts use this after training a classification model to understand performance gaps, communicate results to stakeholders, and prioritize next steps for model improvement.