Accuracy Score: 79.35%

AUC-ROC Score: 0.831

Classification Report

              Precision  Recall  F1-score  Support
Died               0.79    0.81      0.80       80
Lived              0.79    0.77      0.78       75
Accuracy                             0.79      155
Macro Avg          0.79    0.79      0.79      155
Weighted Avg       0.79    0.79      0.79      155

Purpose of the Accuracy Score

The accuracy score is a measure used to evaluate how well a machine learning model performs. It tells us the percentage of predictions the model got right out of all the predictions it made.

How It Works

We have a dataset containing information on whether a horse with colic will live or die based on certain symptoms and treatments, and it also records the actual outcome (whether each horse survived or not) for every case. The accuracy score compares the model's predictions to these actual outcomes: it counts how many predictions were correct and divides that count by the total number of predictions, using this formula:

Accuracy = (Total Number of Correct Predictions) / (Total Number of Predictions)
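
As a minimal sketch, assuming small illustrative y_test (actual outcomes) and y_pred (predictions) arrays rather than the actual study data, the same number can be computed by hand or with scikit-learn's accuracy_score:

    from sklearn.metrics import accuracy_score

    # Illustrative values only, not the actual study data (1 = lived, 0 = died).
    y_test = [1, 0, 1, 1, 0, 1, 0, 0]   # actual outcomes
    y_pred = [1, 0, 0, 1, 0, 1, 1, 0]   # model's predictions

    # Accuracy = correct predictions / total predictions
    correct = sum(a == p for a, p in zip(y_test, y_pred))
    print(correct / len(y_test))            # 0.75
    print(accuracy_score(y_test, y_pred))   # 0.75, same result via scikit-learn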

The accuracy score gives us a straightforward way to understand the effectiveness of the model: a higher score means the model is making more correct predictions, which is desirable in most scenarios. As a rough rule of thumb, accuracy between 70% and 90% is consistent with what is generally considered acceptable in practice, though the right threshold depends on the application.

In this case, predicting whether horses with colic will survive, an accuracy score of 79.35% suggests the model is reliable enough to help horse owners and veterinarians make informed decisions.

Purpose of the AUC-ROC (Area Under the Receiver Operating Characteristic) Score

The AUC-ROC Score measures how well a model can distinguish between two classes (in our case, lived and died). The score ranges from 0 to 1, where 0.5 is no better than random guessing and 1.0 means the two classes are separated perfectly.

What Our AUC-ROC Score Tells Us

An AUC-ROC Score of 0.831 means that our model is good at distinguishing between horses that will survive and those that won't. Concretely, if we pick one horse that lived and one that died at random, there is an 83.1% chance the model ranks the survivor as more likely to live.
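
For reference, a score like this is typically computed from the model's predicted probabilities rather than its hard live/die labels. A minimal sketch with scikit-learn's roc_auc_score, again using small illustrative arrays rather than the actual study data:

    from sklearn.metrics import roc_auc_score

    # Illustrative values only: actual outcomes (1 = lived, 0 = died) and
    # the model's predicted probability that each horse lived.
    y_test  = [1, 0, 1, 1, 0, 1, 0, 0]
    y_proba = [0.9, 0.2, 0.4, 0.8, 0.3, 0.7, 0.6, 0.1]

    # Probabilities are used because AUC-ROC measures how well the model
    # ranks survivors above non-survivors.
    print(roc_auc_score(y_test, y_proba))   # 0.9375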

Purpose of the Classification Report

The classification report is a summary of how well our machine learning model performs when making predictions. It provides important metrics that help us understand the accuracy and reliability of the model's predictions. Think of it as a report card for our model, showing us how well it did in predicting whether horses with colic would live or die. It helps us see the strengths and weaknesses of the model's predictions.
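
A report like the one above can be generated with scikit-learn's classification_report; a minimal sketch, reusing the illustrative arrays from earlier:

    from sklearn.metrics import classification_report

    # Same illustrative arrays as before (1 = lived, 0 = died).
    y_test = [1, 0, 1, 1, 0, 1, 0, 0]
    y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

    # target_names maps label 0 to "Died" and label 1 to "Lived".
    print(classification_report(y_test, y_pred, target_names=["Died", "Lived"]))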

What Our Classification Report Tells Us

  • Precision: Out of all the predictions for each class (died or lived), how many were correct?
    • Died: 0.79 - This means that when the model predicted a horse would die, it was correct 79% of the time.
    • Lived: 0.79 - This means that when the model predicted a horse would live, it was correct 79% of the time.
  • Recall: Out of all the actual instances of each class, how many did the model correctly identify?
    • Died: 0.81 - This means the model correctly identified 81% of the horses that actually died.
    • Lived: 0.77 - This means the model correctly identified 77% of the horses that actually lived.
  • F1-Score: The harmonic mean of precision and recall, balancing the two.
    • Died: 0.80 - This is a good balance between precision and recall for predicting death.
    • Lived: 0.78 - This is a good balance between precision and recall for predicting survival.
  • Support: The number of actual instances of each class in the test data.
    • Died: 80
    • Lived: 75
  • Accuracy: The overall percentage of correct predictions.
    • Accuracy: 0.79 - This means the model correctly predicted the outcome for 79% of the cases.
  • Macro Average: The average of precision, recall, and F1-score across all classes.
    • Macro avg: 0.79 - This is the unweighted average, treating each class equally.
  • Weighted Average: The average of precision, recall, and F1-score across all classes, weighted by the number of true instances (support) for each class.
    • Weighted avg: 0.79 - This takes the number of instances in each class into account; the short sketch after this list shows how the two averages differ.
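
To make the difference between the two averages concrete, here is a minimal sketch that computes both by hand from the precision values in the report; they coincide here only because both classes happen to score 0.79:

    # Per-class precision and support taken from the report above.
    precision = {"Died": 0.79, "Lived": 0.79}
    support   = {"Died": 80,   "Lived": 75}

    # Macro average: unweighted mean, treating each class equally.
    macro = sum(precision.values()) / len(precision)

    # Weighted average: each class's score weighted by its share of cases.
    total = sum(support.values())
    weighted = sum(precision[c] * support[c] for c in precision) / total

    print(round(macro, 2), round(weighted, 2))   # 0.79 0.79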

Evaluation

  • Precision and Recall: Both classes (died and lived) have fairly balanced precision and recall, indicating the model performs consistently across different outcomes.
  • F1-Score: The F1-scores are close to each other and fairly high, showing a good balance between precision and recall.
  • Overall Accuracy: An accuracy of 79% is considered good, especially if the problem is complex and the data is "noisy".

Conclusion

This classification report is considered good for our model. The metrics are well-balanced and show that the model is performing well in predicting both classes.