Brain Tumor Regression Pipeline and Analysis
Project type
Python: Notebook
Date
2025-07-16
Location
Calgary, AB
This project presents a comprehensive machine learning pipeline for the regression analysis of a brain tumor dataset, with the objective of understanding factors influencing patient survival and predicting clinical outcomes.
The analysis predicts patient survival rate from a set of clinical and demographic variables. Eight regression algorithms (Linear Regression, Ridge, Lasso, Decision Tree, Random Forest, Gradient Boosting, Support Vector Regression (SVR), and K-Nearest Neighbors) are used to quantify the impact of features such as age, tumor size, tumor growth rate, stage, treatment modalities, and patient history on survival outcomes.
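A comparison across these model families can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it assumes scikit-learn, uses synthetic data from `make_regression` in place of the brain tumor dataset, and the `compare_models` helper and all hyperparameters are hypothetical choices.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the clinical features and survival target (assumption).
X, y = make_regression(n_samples=300, n_features=6, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# The eight model families named in the description; settings are illustrative.
# SVR and K-Nearest Neighbors are scale-sensitive, so they get a scaler.
models = {
    "Linear Regression": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "Lasso": Lasso(alpha=0.1),
    "Decision Tree": DecisionTreeRegressor(random_state=0),
    "Random Forest": RandomForestRegressor(n_estimators=50, random_state=0),
    "Gradient Boosting": GradientBoostingRegressor(random_state=0),
    "SVR": make_pipeline(StandardScaler(), SVR(C=10.0)),
    "K-Nearest Neighbors": make_pipeline(StandardScaler(), KNeighborsRegressor()),
}


def compare_models(models, X_train, X_test, y_train, y_test):
    """Fit each model and return its R-squared on the held-out split."""
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = r2_score(y_test, model.predict(X_test))
    return scores


scores = compare_models(models, X_train, X_test, y_train, y_test)
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:>20}: R2 = {score:.3f}")
```

Scoring on a held-out split rather than on the training data keeps the comparison between flexible models (trees, ensembles) and linear models on equal footing.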
Advanced preprocessing techniques, including polynomial feature expansion, interaction terms, and binning of continuous variables, were integrated into a streamlined modeling pipeline. Each model was evaluated using standard performance metrics: Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and R-squared (R²) to assess predictive accuracy and generalizability.
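The four evaluation metrics are related to one another, which a short sketch makes concrete. The function below is an illustrative plain-Python version (the name `regression_metrics` and the sample values are made up for this example; the notebook most likely relies on a library implementation instead).

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE, and R-squared for paired observations."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(r * r for r in residuals) / n      # mean of squared errors
    rmse = math.sqrt(mse)                        # same units as the target
    mae = sum(abs(r) for r in residuals) / n     # mean of absolute errors

    # R-squared: 1 minus the ratio of residual variation to total variation.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = sum(r * r for r in residuals)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"mse": mse, "rmse": rmse, "mae": mae, "r2": r2}


# Made-up values purely for illustration.
metrics = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.5, 5.0, 7.5, 9.0])
print(metrics)
```

RMSE and MAE are expressed in the units of the target, so they are easier to interpret clinically than MSE, while R² is unitless and describes the share of variance explained.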
The regression results showed that ensemble models such as Random Forest and Gradient Boosting provided near-perfect fits (R² ≈ 1.00), suggesting strong predictive capability when capturing complex nonlinear relationships in the data. In addition, visualization techniques such as KDE plots and distribution heatmaps were used to interpret the relationship between tumor progression and survival outcomes.
Overall, the regression analysis offers valuable insights into prognosis and can potentially support personalized treatment planning by identifying key predictors of long-term survival in brain tumor patients.
































