Welcome to the ClimateWins Machine Learning Project Case Study
Context
ClimateWins is a European nonprofit tackling extreme weather through advanced analytics. The project explored how machine learning can improve disaster preparedness by classifying weather conditions and exploring future-facing climate modeling approaches.
Objective
Test whether machine learning can classify daily weather as pleasant or unpleasant, identify the minimum features needed for strong performance, and explore extensions to image-based classification and climate forecasting.
Methods & Tools
- Python libraries: pandas, NumPy, scikit-learn, matplotlib, seaborn, tensorflow/keras
- Models tested: KNN, Decision Tree, ANN, Random Forest, CNN, EfficientNetB0 (transfer learning)
- Techniques: Gradient descent optimization, PCA, clustering, feature selection
Data Summary
- Source: European Climate Assessment & Dataset Project (ECA&D)
- Daily observations, 1960–2022, from 15 European weather stations
- Features: temperature, precipitation, humidity, cloud cover, etc.
- Challenge: Class imbalance → evaluated models with weighted F1 score instead of accuracy
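To illustrate why the metric choice matters, here is a minimal sketch that computes weighted F1 from its definition: per-class F1 averaged by class support. The label counts are hypothetical, not taken from the ECA&D data. A model that never predicts the rare class still reaches high accuracy, while weighted F1 drops because the rare class contributes an F1 of zero. Macro F1 would penalise this even more sharply.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1  # weight each class by its share of samples
    return score


# Hypothetical labels: 90 unpleasant days (0) and 10 pleasant days (1).
y_true = [0] * 90 + [1] * 10
# A degenerate model that always predicts "unpleasant".
y_pred = [0] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy    = {accuracy:.3f}")
print(f"weighted F1 = {weighted_f1(y_true, y_pred):.3f}")
```

On this toy input the weighted F1 is lower than the accuracy, even though the model looks strong by accuracy alone.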
Key Findings
Classical models:
- KNN, Decision Tree, ANN → moderate performance, KNN strongest baseline.
- Random Forest → achieved near-perfect weighted F1 (~0.99–1.0).
- Identified three most influential stations: Madrid, Belgrade, Budapest.
- Only three features (max temperature, precipitation, sunshine) were required for robust classification → lower computational cost and greater resilience to missing data.
Figure 1. Feature importance ranking from the Random Forest model (Madrid dataset).
Deep learning models:
- CNN classified pleasant vs. unpleasant weather with weighted F1 ~0.99.
- Extended CNN to image classification (Cloudy, Rainy, Sunny, Sunrise).
- Using EfficientNetB0 transfer learning, the model reached a weighted F1 score of ~0.92.
- Proof of concept: indicates potential application to radar and satellite imagery, where rapid image classification could provide timely insights for disaster preparedness and prevention.
Figure 2. Confusion matrix of the EfficientNetB0-based CNN used for image classification.
Challenges and Solutions
- Unbalanced data: Some weather types were rare, so I used a fairer performance metric (weighted F1 score) instead of plain accuracy.
- Overfitting: Decision trees were too tailored to the training set; I reduced this with pruning and ensemble methods (Random Forests).
- Too many variables: With many weather features, I simplified the data using Principal Component Analysis (PCA) and feature selection.
- Missing data: I chose models that remain reliable even when some inputs are unavailable.
- Image classification performance: Standard Convolutional Neural Networks (CNNs) did not perform optimally on images, so I applied a more advanced architecture (EfficientNetB0), which improved accuracy and efficiency.
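The PCA step mentioned above can be sketched roughly as follows. This is an illustration on synthetic data, not the project's pipeline: the 12 correlated columns, the three hidden factors, and the 95% variance threshold are all assumptions chosen for the example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for correlated weather features: 12 observed columns
# driven by 3 hidden factors plus a little noise (all hypothetical).
latent = rng.normal(size=(500, 3))
mixing = rng.normal(size=(3, 12))
X = latent @ mixing + 0.05 * rng.normal(size=(500, 12))

# Standardise first so that no feature dominates because of its units.
X_scaled = StandardScaler().fit_transform(X)

# A float n_components keeps the fewest components whose cumulative
# explained variance reaches that fraction (here 95%).
pca = PCA(n_components=0.95, svd_solver="full")
X_reduced = pca.fit_transform(X_scaled)

print(X.shape, "->", X_reduced.shape)
print("cumulative explained variance:",
      pca.explained_variance_ratio_.cumsum().round(3))
```

Because the synthetic columns are strongly correlated, far fewer than 12 components are retained. How much real weather data compresses depends on how correlated the station measurements actually are.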

Future Directions
Why It Matters
This project demonstrates how machine learning bridges climate science and decision-making:
- Simple models (Random Forests) → scalable, practical, resource-efficient.
- Deep models (CNNs, transfer learning) → unlock new potential in image-based prediction.
- Future directions → offer actionable insights for policymakers, urban planners, and resilience organizations.
Machine learning can help identify safer regions, improve early-warning systems, and support long-term adaptation planning.
📂 Project Resources
- Final Presentation → Download slides (PDF)
- GitHub Repository → View full project scripts
- Other Case Studies → Explore more projects
Code Examples
Code 1. Generating the feature importance chart for the Madrid Random Forest model (Figure 1).
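A minimal sketch of this step, assuming synthetic data in place of the Madrid observations. The column names, the labelling rule, and the output file name are hypothetical, and this is not the project's original script. It fits a Random Forest, reads `feature_importances_`, and draws a horizontal bar chart.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)
n_days = 1000

# Synthetic daily observations (hypothetical column names and ranges).
X = pd.DataFrame({
    "temp_max": rng.normal(18, 8, n_days),
    "precipitation": rng.exponential(0.3, n_days),
    "sunshine": rng.uniform(0, 12, n_days),
    "humidity": rng.uniform(0.3, 1.0, n_days),
    "cloud_cover": rng.integers(0, 9, n_days),
})
# Hypothetical rule for a "pleasant" day, used only to create labels.
y = ((X["temp_max"] > 18)
     & (X["precipitation"] < 0.2)
     & (X["sunshine"] > 4)).astype(int)

model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X, y)

# Impurity-based importances, sorted so the largest bar is drawn on top.
importances = pd.Series(model.feature_importances_,
                        index=X.columns).sort_values()

fig, ax = plt.subplots(figsize=(6, 3.5))
importances.plot.barh(ax=ax)
ax.set_xlabel("Mean decrease in impurity")
ax.set_title("Random Forest feature importance (synthetic data)")
fig.tight_layout()
fig.savefig("feature_importance.png", dpi=150)
```

Impurity-based importances can favour high-cardinality features, so scikit-learn's `permutation_importance` is a common cross-check.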
Code 2. Randomized search optimization of the Random Forest model for Madrid weather data.
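A sketch of what a randomized hyperparameter search for a Random Forest can look like with scikit-learn's `RandomizedSearchCV`. The data comes from `make_classification`, and the search ranges, `n_iter`, and fold count are illustrative assumptions rather than the settings used in the project. Scoring uses weighted F1 to match the evaluation metric described earlier.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Synthetic, imbalanced stand-in for one station's daily data.
X, y = make_classification(n_samples=600, n_features=9, n_informative=3,
                           weights=[0.8, 0.2], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# Distributions to sample from (illustrative ranges, not tuned values).
param_distributions = {
    "n_estimators": randint(50, 200),
    "max_depth": [None, 5, 10, 20],
    "min_samples_split": randint(2, 11),
    "min_samples_leaf": randint(1, 5),
    "max_features": ["sqrt", "log2"],
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=8,               # number of sampled configurations
    scoring="f1_weighted",  # consistent with the weighted F1 metric
    cv=3,
    random_state=0,
)
search.fit(X_train, y_train)

test_f1 = f1_score(y_test, search.predict(X_test), average="weighted")
print("best params:", search.best_params_)
print(f"cross-validated weighted F1: {search.best_score_:.3f}")
print(f"held-out weighted F1:        {test_f1:.3f}")
```

Randomized search samples a fixed number of configurations instead of trying every combination, which keeps the cost predictable as the search space grows.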
Code 3. Building and compiling the EfficientNetB0-based CNN for 4-class weather image classification.
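One common way to set up EfficientNetB0 transfer learning in Keras is sketched below. The function name, input size, dropout rate, learning rate, and classification head are assumptions for illustration; the project's actual architecture may differ. The pretrained base is frozen and only a small softmax head for the four classes is trained.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_weather_classifier(num_classes=4, input_shape=(224, 224, 3),
                             weights="imagenet"):
    """Frozen EfficientNetB0 base with a small trainable softmax head."""
    base = tf.keras.applications.EfficientNetB0(
        include_top=False, weights=weights, input_shape=input_shape)
    base.trainable = False  # keep the pretrained features fixed

    # Keras EfficientNet models include their own input preprocessing,
    # so raw pixel values in [0, 255] are expected here.
    inputs = tf.keras.Input(shape=input_shape)
    x = base(inputs, training=False)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dropout(0.2)(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
        loss="sparse_categorical_crossentropy",  # integer class labels
        metrics=["accuracy"],
    )
    return model
```

Calling `build_weather_classifier()` downloads the ImageNet weights on first use. Passing `weights=None` builds the same architecture with random initialisation, which is useful for offline checks but is no longer transfer learning.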
Code 4. Evaluating the EfficientNetB0 CNN with weighted F1-score, classification report, and confusion matrix (Figure 2).
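A minimal sketch of the evaluation step using scikit-learn metrics. The class names come from the case study, but the label arrays are small hypothetical examples, not the project's predictions. In a real pipeline `y_pred` would typically be the arg-max of the model's softmax output.

```python
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix, f1_score

class_names = ["Cloudy", "Rainy", "Sunny", "Sunrise"]

# Hypothetical integer labels (index into class_names), for illustration.
y_true = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3])
# In practice: y_pred = model.predict(images).argmax(axis=1)
y_pred = np.array([0, 0, 1, 1, 1, 1, 2, 2, 3, 3, 3, 3])

weighted = f1_score(y_true, y_pred, average="weighted")
cm = confusion_matrix(y_true, y_pred)  # rows = true class, columns = predicted

print(f"weighted F1: {weighted:.3f}")
print(classification_report(y_true, y_pred, target_names=class_names))
print(cm)
```

To draw the matrix as a figure, `sklearn.metrics.ConfusionMatrixDisplay.from_predictions` accepts the same label arrays and a `display_labels` argument.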