Welcome to the ClimateWins Machine Learning Project Case Study

Context
ClimateWins is a European nonprofit tackling extreme weather through advanced analytics. The project explored how machine learning can improve disaster preparedness by classifying weather conditions and exploring future-facing climate modeling approaches.

Objective
Test whether machine learning can classify daily weather as pleasant or unpleasant, identify the minimum features needed for strong performance, and explore extensions to image-based classification and climate forecasting.

Methods & Tools

Python libraries: pandas, NumPy, scikit-learn, matplotlib, seaborn, tensorflow/keras
Models tested: KNN, Decision Tree, ANN, Random Forest, CNN, EfficientNetB0 (transfer learning)
Techniques: Gradient descent optimization, PCA, clustering, feature selection

Data Summary

Source: European Climate Assessment & Dataset Project (ECA&D)
Daily observations, 1960–2022, from 15 European weather stations
Features: temperature, precipitation, humidity, cloud cover, etc.
Challenge: Class imbalance → evaluated models with weighted F1 score instead of accuracy
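
To make the metric choice concrete, here is a minimal sketch of how a weighted F1 score can be computed by hand and how it compares with accuracy on an imbalanced label set. The 90/10 class split and the always-majority "model" are invented for illustration and are not taken from the ECA&D data; the helper names `weighted_f1` and `accuracy` are hypothetical.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Average the per-class F1 scores, weighting each class by its support."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, count in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = count - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (count / total) * f1
    return score


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical labels: 90 unpleasant days (0) and 10 pleasant days (1).
y_true = [0] * 90 + [1] * 10
# A degenerate model that always predicts the majority class.
y_pred = [0] * 100

print(f"accuracy:    {accuracy(y_true, y_pred):.3f}")
print(f"weighted F1: {weighted_f1(y_true, y_pred):.3f}")
```

On this toy input the accuracy is 0.900 while the weighted F1 is about 0.853, because the minority class contributes an F1 of zero. In practice, scikit-learn's `f1_score(..., average="weighted")` performs the same support-weighted average.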

Key Findings

Classical models:

KNN, Decision Tree, ANN → moderate performance, with KNN the strongest baseline.
Random Forest → achieved near-perfect weighted F1 (~0.99–1.0).
- Identified three most influential stations: Madrid, Belgrade, Budapest.
- Only three features (max temperature, precipitation, sunshine) were required for robust classification, which lowers computational cost and improves resilience to missing data.

Figure 1. Feature importance ranking from the Random Forest model (Madrid dataset).
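
A ranking like the one in Figure 1 can be obtained from a fitted Random Forest's impurity-based importances. The sketch below is not the project's script: it uses synthetic data, hypothetical feature names, and an assumed labelling rule (warm, dry, sunny days count as pleasant) purely to show the mechanics.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)
n_days = 1000

# Synthetic stand-in for one station's daily observations (not ECA&D data).
feature_names = ["temp_max", "precipitation", "sunshine", "humidity", "cloud_cover"]
X = np.column_stack([
    rng.normal(18, 8, n_days),       # temp_max
    rng.exponential(2.0, n_days),    # precipitation
    rng.uniform(0, 14, n_days),      # sunshine
    rng.uniform(0.3, 1.0, n_days),   # humidity (noise for this toy label)
    rng.integers(0, 9, n_days),      # cloud_cover (noise for this toy label)
])

# Assumed labelling rule, for illustration only.
y = ((X[:, 0] > 18) & (X[:, 1] < 1.0) & (X[:, 2] > 6)).astype(int)

model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X, y)

# Pair each feature with its importance and sort from most to least influential.
ranking = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranking:
    print(f"{name:15s} {score:.3f}")
```

Features that rank consistently low are candidates for removal, which is the kind of evidence that can support cutting a model down to a handful of inputs. Passing the sorted pairs to a horizontal bar chart (for example with matplotlib) would give a plot in the style of Figure 1.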

Deep learning models:

CNN classified pleasant vs. unpleasant weather with weighted F1 ~0.99.
Extended CNN to image classification (Cloudy, Rainy, Sunny, Sunrise).
- Using EfficientNetB0 transfer learning, the model reached a weighted F1-score of ~0.92.
- Proof of concept: indicates potential application to radar and satellite imagery, where rapid image classification could provide timely insights for disaster preparedness and prevention.

Figure 2. Confusion matrix of the EfficientNetB0-based CNN used for image classification.
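
For readers unfamiliar with transfer learning, the following is a minimal sketch of how an EfficientNetB0 backbone can be reused for a 4-class image classifier in Keras. It is not the project's exact architecture: the function name `build_weather_classifier`, the 224x224 input size, the dropout rate, and the optimizer are assumptions chosen for illustration.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_weather_classifier(num_classes=4, input_shape=(224, 224, 3), weights="imagenet"):
    """Frozen EfficientNetB0 backbone with a small classification head."""
    base = tf.keras.applications.EfficientNetB0(
        include_top=False,       # drop the original 1000-class ImageNet head
        weights=weights,         # "imagenet" downloads pretrained weights
        input_shape=input_shape,
    )
    base.trainable = False       # keep the pretrained features fixed

    inputs = tf.keras.Input(shape=input_shape)
    # Keras' EfficientNet includes its own input rescaling, so raw 0-255 pixels are expected.
    x = base(inputs, training=False)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dropout(0.2)(x)   # assumed regularisation strength
    outputs = layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # assumes integer class labels
        metrics=["accuracy"],
    )
    return model
```

Calling `build_weather_classifier()` and then `model.fit(...)` on a labelled image dataset trains only the new head. A common follow-up is to unfreeze some of the backbone layers and fine-tune them with a low learning rate.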

Challenges and Solutions

Unbalanced data:
Some weather types were rare, so I evaluated models with the weighted F1 score, which combines per-class precision and recall, instead of plain accuracy.
Overfitting:
Decision trees were too tailored to the training set; I reduced this with pruning and ensemble methods (Random Forests).
Too many variables:
With many weather features, I simplified the data using Principal Component Analysis (PCA) and feature selection.
Missing data:
I chose models that remain reliable even when some inputs are unavailable, such as the Random Forest, which needed only three features.
Image classification performance:
Standard Convolutional Neural Networks (CNNs) did not perform optimally on images, so I applied a more advanced architecture (EfficientNetB0), which improved accuracy and efficiency.
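
The overfitting remedy described above (pruning and ensembling) can be sketched with scikit-learn as follows. This is an illustration on synthetic, deliberately noisy data rather than the project's weather data, and the pruning strength `ccp_alpha=0.01` is an arbitrary assumption.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with 20% label noise so an unpruned tree can memorise it.
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=4, flip_y=0.2, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

models = {
    "full tree": DecisionTreeClassifier(random_state=0),
    # Cost-complexity pruning removes branches that add little impurity reduction.
    "pruned tree": DecisionTreeClassifier(ccp_alpha=0.01, random_state=0),
    # An ensemble of randomised trees averages out individual trees' variance.
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    train_f1 = f1_score(y_train, model.predict(X_train), average="weighted")
    test_f1 = f1_score(y_test, model.predict(X_test), average="weighted")
    print(f"{name:14s} train F1 = {train_f1:.3f}   test F1 = {test_f1:.3f}")

print("leaves (full):  ", models["full tree"].get_n_leaves())
print("leaves (pruned):", models["pruned tree"].get_n_leaves())
```

A large gap between training and test scores for the full tree is the usual symptom of overfitting. The pruned tree trades some training fit for a much smaller structure, and the forest typically narrows the gap further.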

Future Directions

1. Radar & Satellite Imagery:

Use deep learning for images (Convolutional Neural Networks, or CNNs) to classify weather patterns.
Apply sequence-aware models (Convolutional LSTMs) to track how storms develop over time.
Explore advanced AI models (Generative Adversarial Networks and Transformers) for “nowcasting” — short-term forecasts — and for detecting longer-range patterns.

2. Climate Twin Cities:

Apply dimensionality reduction techniques (such as Principal Component Analysis) combined with clustering to group cities that share similar weather conditions.
Use Graph Neural Networks (GNNs) to map how these similarities evolve as climate changes.
Develop tools to track which cities are diverging most quickly, offering insight into where adaptation may be most urgent.
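
As a rough sketch of the PCA-plus-clustering step in the "climate twin" idea, the example below groups cities by a few summary climate features. Everything here is hypothetical: the city names, the four features, the two synthetic climate regimes, and the choice of two clusters.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
cities = [f"city_{i}" for i in range(12)]  # placeholder names

# Synthetic per-city profiles: mean temperature (°C), annual precipitation (mm),
# annual sunshine (hours), mean relative humidity (%).
warm_dry = rng.normal([22, 400, 2800, 55], [1.5, 40, 150, 4], size=(6, 4))
cool_wet = rng.normal([10, 800, 1600, 78], [1.5, 60, 150, 4], size=(6, 4))
profiles = np.vstack([warm_dry, cool_wet])

# Standardise first so no single unit (mm vs °C) dominates the components.
scaled = StandardScaler().fit_transform(profiles)
components = PCA(n_components=2).fit_transform(scaled)

# Cluster in the reduced space; cities sharing a label are candidate "twins".
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(components)

for city, label in zip(cities, labels):
    print(f"{city}: cluster {label}")
```

Repeating the same pipeline on profiles from different decades and comparing cluster membership over time is one simple way to see which cities are drifting away from their former twins.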

3. Hybrid Models with Climate Projections:

Combine historical weather station data (from the European Climate Assessment & Dataset, ECA&D) with future climate scenarios from the Intergovernmental Panel on Climate Change (IPCC).
Use these models to deliver early warnings of unusual weather events and provide evidence-based guidance for long-term adaptation policies.

Why It Matters

This project demonstrates how machine learning bridges climate science and decision-making:

Simple models (Random Forests) → scalable, practical, resource-efficient.
Deep models (CNNs, transfer learning) → unlock new potential in image-based prediction.
Future directions → offer actionable insights for policymakers, urban planners, and resilience organizations.

Machine learning can help identify safer regions, improve early-warning systems, and support long-term adaptation planning.

📂 Project Resources

Final Presentation → Download slides (PDF)
GitHub Repository → View full project scripts
Other Case Studies → Explore more projects

Code Examples

Code 1. Generating the feature importance chart for the Madrid Random Forest model (Figure 1).

Code 2. Randomized search optimization of the Random Forest model for Madrid weather data.

Code 3. Building and compiling the EfficientNetB0-based CNN for 4-class weather image classification.

Code 4. Evaluating the EfficientNetB0 CNN with weighted F1-score, classification report, and confusion matrix (Figure 2).
