Prediction Model of Surgical Site Infection in Patients Undergoing Hip Arthroplasty
Undergraduate final thesis using real medical data, in collaboration with Ramón y Cajal University Hospital
Project Information
- University: Rey Juan Carlos University of Madrid
- Project date: Sep. 2021 - Jun. 2022
- Data: Ramón y Cajal University Hospital
- Techniques: Data preprocessing and feature engineering. Machine learning model training. Hyperparameter tuning with cross-validation. Evaluation of model performance.
Summary
Surgical site infections (SSIs) are a major public health concern, particularly in hip arthroplasty (HA), where they significantly affect patient health and healthcare costs because the prosthesis must be replaced. This study uses a database from Hospital Universitario Ramón y Cajal (2010-2020) to develop machine learning (ML) models that predict the likelihood of infection from surgical parameters. After preprocessing and feature engineering, visual analyses were conducted to understand the significance of each variable. Undersampling, SMOTE, one-hot encoding, and mutual information were applied before training supervised ML models: logistic regression, decision trees, random forest, XGBoost, and a multilayer perceptron. Logistic regression gave the best overall results, with 94.2% accuracy and 95.5% specificity, while random forest achieved the highest sensitivity (78%). These results show the potential of ML for SSI prediction in HA and could support improved prevention strategies in clinical practice.
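The workflow described in the summary can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis code: the data are synthetic, the three features (`duration`, `age`, `risk_class`) and the hyperparameter grid are hypothetical, and only scikit-learn and NumPy are used. Class rebalancing is shown with a hand-written random undersampling helper; SMOTE, which is provided by the separate imbalanced-learn package, is not shown here.

```python
from functools import partial

import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the hospital database); all columns are hypothetical.
n = 2000
duration = rng.normal(90, 20, n)       # numeric feature
age = rng.normal(70, 10, n)            # numeric feature
risk_class = rng.integers(1, 5, n)     # categorical feature with values 1..4
logit = -3.0 + 0.03 * (duration - 90) + 0.5 * (risk_class - 2)
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))  # rare positive class (infection)
X = np.column_stack([duration, age, risk_class])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def undersample(X, y, rng):
    """Randomly drop majority-class rows until both classes have equal size."""
    pos = np.flatnonzero(y == 1)
    neg = np.flatnonzero(y == 0)
    keep_neg = rng.choice(neg, size=len(pos), replace=False)
    idx = rng.permutation(np.concatenate([pos, keep_neg]))
    return X[idx], y[idx]


# Rebalance the training split only; the test split keeps its original class ratio.
X_bal, y_bal = undersample(X_train, y_train, rng)

preprocess = ColumnTransformer(
    [
        ("num", StandardScaler(), [0, 1]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), [2]),  # one-hot encoding
    ],
    sparse_threshold=0,  # always return a dense array
)

pipe = Pipeline(
    [
        ("prep", preprocess),
        # Keep the 4 features with the highest mutual information with the label.
        ("select", SelectKBest(partial(mutual_info_classif, random_state=0), k=4)),
        ("clf", LogisticRegression(max_iter=1000)),
    ]
)

# Hyperparameter tuning with 5-fold cross-validation, optimising recall (sensitivity).
search = GridSearchCV(pipe, {"clf__C": [0.01, 0.1, 1.0, 10.0]}, cv=5, scoring="recall")
search.fit(X_bal, y_bal)

pred = search.predict(X_test)
tn, fp, fn, tp = confusion_matrix(y_test, pred, labels=[0, 1]).ravel()
sensitivity = tp / (tp + fn)  # share of true infections that are detected
specificity = tn / (tn + fp)  # share of non-infections correctly ruled out
print(f"best C={search.best_params_['clf__C']}, "
      f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```

Rebalancing only the training split keeps the test metrics representative of the original class ratio. Because the data here are synthetic, the printed numbers say nothing about the results reported above.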