written by
Paul Richardson

Predicting Obesity


January 17, 2021

This is a Jupyter notebook I created and published on Kaggle that uses a dataset and machine learning to predict obesity. It covers exploratory data analysis, data preparation, machine learning, and hyperparameter tuning.

https://www.kaggle.com/pmrich/obesitydataset-eda-data-prep-ml-hypertuning

The data comes from the UCI Machine Learning Repository. The dataset includes data for estimating obesity levels in individuals from Mexico, Peru, and Colombia, based on their eating habits and physical condition.

The notebook explores several popular machine learning classification models to predict the weight classification of patients. It provides a walkthrough of the data treatment and evaluates various parameters for each model to optimize performance. The analysis identifies the most accurate model, and its associated set of parameters, for classifying a patient’s weight on a scale from underweight to obese.

In this analysis, we specifically explore and evaluate the accuracy of K-Nearest Neighbors (KNN), Decision Trees, Random Forest, and Support Vector Machines (SVM) in classifying weight categories with the UCI Machine Learning Repository's estimation of obesity levels dataset.
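To give a feel for this kind of comparison, here is a minimal sketch that tunes the four model families with scikit-learn's `GridSearchCV` and ranks them by cross-validated accuracy. It runs on synthetic data from `make_classification` rather than the obesity dataset, and the parameter grids are small illustrative assumptions, not the grids used in the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic multi-class data as a stand-in for the real features and labels.
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=6, n_classes=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Each entry: (estimator, parameter grid). Grids are illustrative assumptions.
candidates = {
    "KNN": (
        make_pipeline(StandardScaler(), KNeighborsClassifier()),
        {"kneighborsclassifier__n_neighbors": [3, 5, 9]},
    ),
    "Decision Tree": (
        DecisionTreeClassifier(random_state=0),
        {"max_depth": [3, 6, None]},
    ),
    "Random Forest": (
        RandomForestClassifier(random_state=0),
        {"n_estimators": [50, 100], "max_depth": [6, None]},
    ),
    "SVM": (
        make_pipeline(StandardScaler(), SVC()),
        {"svc__C": [0.1, 1, 10], "svc__kernel": ["rbf", "linear"]},
    ),
}

results = {}
for name, (model, grid) in candidates.items():
    # 5-fold cross-validated grid search, scored on accuracy.
    search = GridSearchCV(model, grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    results[name] = {
        "best_params": search.best_params_,
        "cv_accuracy": search.best_score_,
        "test_accuracy": search.score(X_test, y_test),
    }

# Pick the model family with the highest cross-validated accuracy.
best_model = max(results, key=lambda n: results[n]["cv_accuracy"])
for name, info in results.items():
    print(name, info)
print("Best by CV accuracy:", best_model)
```

Selecting on cross-validated accuracy and reporting the held-out test accuracy separately keeps the test set from influencing the choice of model or parameters.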