Students develop a mini-project implemented through four 1-hour challenges.
The project focuses on intelligent energy consumption and fairness-aware analytics
for smart cities, combining data cleaning, clustering, fairness evaluation, SQL-based
sampling, and machine learning modelling.
The four challenges are designed to be done sequentially and build on each other:
1. Challenge 1 – Data Cleaning & Outlier Detection for Smart-City Energy
2. Challenge 2 – DEI Awareness and Fairness in Data Preparation (introduction)
2. Challenge 2 – Data Quality, Fairness & SQL Sampling with AIF360 (application)
3. Challenge 3 – Modelling Smart-City Energy Consumption & Fairness of Errors
At the end of the project, you will produce a final infographic that summarises
your workflow, results, and main insights. The description of its content is here.
1. Project Learning Goals
By completing this project, students should be able to:
- Work with a real-world open dataset on smart-city energy consumption.
- Apply data-quality checks and statistical data cleaning techniques.
- Use k-means clustering to identify atypical days in energy consumption.
- Use the IBM AI Fairness 360 (AIF360) library to measure bias in a socio-economic dataset.
- Implement fairness-aware sampling strategies using SQL.
- Build and evaluate machine learning models for daily energy consumption prediction.
- Analyse the fairness of prediction errors across building groups.
- Communicate results and insights through a concise, visually engaging infographic.
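The SQL sampling goal above can be illustrated with a minimal sketch, assuming that "fairness-aware" here means drawing the same number of rows from each group. The table name `people`, the columns `grp` and `income_high`, and the toy rows are all hypothetical and are not taken from the project datasets. The sketch uses Python's built-in sqlite3 module and a SQL window function.

```python
import sqlite3


def balanced_sample(rows, per_group):
    """Return up to `per_group` rows from each group using a SQL window function."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per person, a group label, and a binary outcome.
    conn.execute(
        "CREATE TABLE people (id INTEGER PRIMARY KEY, grp TEXT, income_high INTEGER)"
    )
    conn.executemany("INSERT INTO people VALUES (?, ?, ?)", rows)
    query = """
        SELECT id, grp, income_high
        FROM (
            SELECT *,
                   ROW_NUMBER() OVER (PARTITION BY grp ORDER BY id) AS rn
            FROM people
        )
        WHERE rn <= ?
        ORDER BY id
    """
    sample = conn.execute(query, (per_group,)).fetchall()
    conn.close()
    return sample


# Made-up data: group "A" is over-represented (6 rows) compared with "B" (2 rows).
toy_rows = [
    (1, "A", 1), (2, "A", 0), (3, "A", 1), (4, "A", 1),
    (5, "B", 0), (6, "A", 0), (7, "B", 1), (8, "A", 1),
]
sample = balanced_sample(toy_rows, per_group=2)
for row in sample:
    print(row)
```

Ordering by `id` inside the window keeps the result reproducible for this illustration. For a random draw, `ORDER BY RANDOM()` could be used in the `OVER` clause instead, at the cost of a different sample on each run.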
2. Datasets Used
2.1 Smart-City Energy Data (Open Power System Data – Household Data)
- Source: Open Power System Data (OPSD) – Household Data package (60-minute resolution).
- Example link: https://data.open-power-system-data.org/household_data/2020-04-15/
- Main file used: household_data_60min_singleindex.csv (hourly smart-meter readings).
- Role in project:
  - Used in Challenge 1 to build a cleaned daily energy dataset.
  - The resulting aggregated dataset (energy_daily_features.csv) is reused in Challenge 3.
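As a rough sketch of how hourly readings might be turned into a daily dataset, the example below aggregates a small synthetic series with pandas. It makes several assumptions: the timestamp column is called `utc_timestamp`, the value column `meter_kwh` is a hypothetical name, and its values are already per-hour consumption in kWh. If the readings in the real file are cumulative meter totals, they would need to be differenced first. The feature names are illustrative and are not the actual columns of energy_daily_features.csv.

```python
import pandas as pd


def hourly_to_daily(hourly, value_col):
    """Aggregate hourly consumption (kWh per hour) into one row per calendar day."""
    series = hourly.set_index("utc_timestamp")[value_col]
    daily = series.resample("D").agg(["sum", "mean", "max", "count"])
    daily.columns = ["daily_kwh", "mean_hourly_kwh", "peak_hourly_kwh", "n_readings"]
    # Keep only complete days so that partial days do not look like
    # unusually low-consumption days in later outlier detection.
    return daily[daily["n_readings"] == 24]


# Synthetic data: two days of hourly readings, 0.5 kWh/h then 1.0 kWh/h.
timestamps = pd.date_range(
    "2020-01-01", periods=48, freq=pd.Timedelta(hours=1), tz="UTC"
)
toy = pd.DataFrame({"utc_timestamp": timestamps, "meter_kwh": [0.5] * 24 + [1.0] * 24})
toy = toy.drop(index=47)  # simulate one missing reading on the second day

daily = hourly_to_daily(toy, "meter_kwh")
print(daily)
```

Dropping incomplete days is only one possible design choice. Interpolating short gaps before aggregating would be an alternative if too many days are lost.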
2.2 Socio-Economic & Fairness Data (Adult Census Income Dataset)
- Source: UCI Machine Learning Repository – Adult dataset.
- Accessed via: IBM AI Fairness 360 (AIF360) library (AdultDataset class).
- Role in project:
  - Used in Challenge 2 to measure bias and experiment with fairness-aware methods.
  - Provides the classification-based fairness concepts that inspire the regression
    fairness metrics in Challenge 3.
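To make the link between classification fairness and regression fairness more concrete, the sketch below computes two quantities by hand on made-up data. The first is disparate impact, the ratio of favourable-outcome rates between an unprivileged and a privileged group (AIF360 exposes a metric of the same name through `BinaryLabelDatasetMetric`). The second is the mean absolute error per group, which is one plausible regression analogue and not necessarily the metric used in Challenge 3. The group labels and all values are hypothetical.

```python
def disparate_impact(labels, groups, unprivileged, privileged):
    """Ratio P(y = 1 | unprivileged) / P(y = 1 | privileged) for binary labels."""

    def favourable_rate(group):
        selected = [y for y, g in zip(labels, groups) if g == group]
        return sum(selected) / len(selected)

    return favourable_rate(unprivileged) / favourable_rate(privileged)


def mae_by_group(y_true, y_pred, groups):
    """Mean absolute prediction error computed separately for each group."""
    errors = {}
    for truth, pred, group in zip(y_true, y_pred, groups):
        errors.setdefault(group, []).append(abs(truth - pred))
    return {group: sum(vals) / len(vals) for group, vals in errors.items()}


# Classification toy data: 1 = favourable outcome (e.g. high income).
labels = [1, 0, 0, 0, 1, 1, 1, 0]
person_groups = ["F", "F", "F", "F", "M", "M", "M", "M"]
di = disparate_impact(labels, person_groups, unprivileged="F", privileged="M")
print(f"disparate impact: {di:.3f}")  # a value of 1.0 would mean equal rates

# Regression toy data: daily energy in kWh for two hypothetical building groups.
y_true = [10.0, 12.0, 20.0, 22.0]
y_pred = [11.0, 13.0, 24.0, 18.0]
building_groups = ["residential", "residential", "industrial", "industrial"]
group_mae = mae_by_group(y_true, y_pred, building_groups)
print(group_mae)
```

A large gap between the per-group errors plays a role similar to a disparate impact far from 1: it suggests the model serves one group noticeably worse than another.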