One province. Eleven Microclimates.

How can predictive modeling help us understand climate change in Alberta, Canada?

Rebecca Duke Wiesenberg
3 min read · Sep 25, 2020

Alberta, Canada, has a continental climate, with balmy summers and freezing winters. Scattered around the province are 11 weather stations, which collect and report meteorological data daily.

Over time, meteorologists can use this data to see trends and make predictions about how climate will change (although not too far into the future!).

If a model can accurately predict where in the province a set of readings was collected, researchers can pinpoint which communities are experiencing particular effects of climate change, and hopefully help policymakers in Alberta better serve their constituents.

Methodology

The dataset, logged by the Meteorological Service of Canada, Environment and Climate Change Canada, contains weather information from January 2000 to the present, and is updated daily. My models use data until September 1.

To prevent leakage — when a specific aspect of the dataset gives away the answer before making a prediction — I removed information about the coordinates of each weather station. Without latitude and longitude, my models must depend on other parts of the data, such as temperature and precipitation.
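As a rough illustration of that step, here is a minimal sketch of dropping the coordinate fields before modeling. The column names (`latitude`, `longitude`, `mean_temp_c`, `precip_mm`) and the sample values are made up for the example; the real dataset's headers and my actual preprocessing code may look different.

```python
# Hypothetical column names; the real dataset's headers may differ.
LEAKY_COLUMNS = {"latitude", "longitude"}


def drop_leaky_columns(rows, leaky=LEAKY_COLUMNS):
    """Return copies of the rows without the columns that give away the station."""
    return [
        {key: value for key, value in row.items() if key not in leaky}
        for row in rows
    ]


# Two made-up readings, for illustration only (not real measurements).
rows = [
    {"station": "Blatchford", "latitude": 1.0, "longitude": 2.0,
     "mean_temp_c": 4.1, "precip_mm": 0.0},
    {"station": "Thornsby", "latitude": 3.0, "longitude": 4.0,
     "mean_temp_c": 3.2, "precip_mm": 1.5},
]

features = drop_leaky_columns(rows)
print(features[0])
```

The original rows are left untouched, so the coordinates are still available later for mapping or reporting.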

After splitting the data and before building the model, I first established a baseline accuracy. Without any fitted model, a baseline prediction on the split data was correct 13 percent of the time.

Like a signpost, the baseline accuracy score gives a reference point for judging how effective a model is. As the model is fitted and iterated on, each new accuracy score is compared to the baseline.
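One common way to get a baseline for a classification problem is to always guess the most frequent class and see how often that guess is right. The sketch below assumes that approach; it may not be exactly how the 13 percent figure was calculated, and the labels are toy data rather than the real station counts.

```python
from collections import Counter


def majority_baseline(labels):
    """Return the most common label and the accuracy of always guessing it."""
    counts = Counter(labels)
    label, count = counts.most_common(1)[0]
    return label, count / len(labels)


# Toy labels for illustration only; "Other station" stands in for the rest.
labels = (
    ["Blatchford"] * 3
    + ["Thornsby"] * 2
    + ["Other station"] * 2
    + ["Another station"]
)

baseline_label, baseline_accuracy = majority_baseline(labels)
print(baseline_label, baseline_accuracy)
```

Any fitted model then has to beat `baseline_accuracy` to show it has learned something from the weather data.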

To make predictions, I used logistic regression and XGBoost models. A logistic regression model used for classification predicts which category, such as a name, a piece of data belongs to. For my dataset, the logistic model takes weather data and decides whether it was recorded at Blatchford Station, Thornsby Station or any of the other nine weather stations in Alberta.
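To make the idea concrete: in one common formulation of multiclass logistic regression, the model computes a score for each category and a softmax function turns those scores into probabilities that sum to one. The sketch below shows only that last step, with made-up scores for three stations; it is an illustration of the technique, not the code behind my model.

```python
import math


def softmax(scores):
    """Turn per-station scores into probabilities that sum to 1."""
    highest = max(scores.values())  # subtract the max for numerical stability
    exps = {name: math.exp(value - highest) for name, value in scores.items()}
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}


# Hypothetical scores a fitted model might produce for one day's readings.
scores = {"Blatchford": 2.0, "Thornsby": 1.0, "Other station": 0.1}

probabilities = softmax(scores)
predicted_station = max(probabilities, key=probabilities.get)
print(predicted_station, probabilities)
```

The station with the highest probability becomes the prediction, and the probabilities themselves show how confident the model is in that choice.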

While logistic regression and other types of linear models combine the features into a weighted sum to decide which category a row belongs to, XGBoost models build a sequence of decision trees, each one correcting the errors of the trees before it.

Despite how differently the two types of model make predictions, both my logistic regression and XGBoost models improved on the baseline accuracy by almost the same amount (20%).

Conclusion

While both models were able to improve the accuracy, they will still need to be tweaked. As climates shift, and seasons become more extreme, it would be interesting to see how the model’s accuracy changes — or even if the model is still relevant.

However, this is a start toward taking a deeper look at climate change’s effects, and it could allow local governments with limited funding to take more preventative environmental measures and waste less.

Rebecca Duke Wiesenberg

Data scientist with a focus on advocacy and public records. Combining data and language to increase public access to information.