Publication: Pardon My French: Assessing the Potential for Data Centers in Northern Quebec With Machine Learning Models
Abstract
This thesis explores the feasibility of data centers in Northern Quebec, using machine learning models to determine the importance of site selection factors. Specifically, Random Forests are used to learn feature importance on a large, multisource dataset of hyperscale data centers and corresponding data points captured at national and regional levels from 2006 to 2024. SVMs, LASSO regression, and XGBoost models are used to corroborate the feature importance results of the Random Forest. Installed Solar PV Power Capacity and Internet Adoption, representing the feature categories of Renewable Electricity Supply and Closeness to Customers, respectively, are determined to be robust predictors of the existence of a data center at a given location in a given year. Canada boasts a healthy supply of renewable electricity, with abundant hydro energy and rapid growth in nuclear energy, as well as comparable closeness to customers, with large cities of its own and proximity to American cities across the border. Correspondingly, hyperscale data centers have begun to appear in Montreal, Toronto, and Vancouver. Northern Quebec, specifically, has high potential for renewable electricity supply, with favorable geographical factors for nuclear plants and cheap hydro energy. However, its rural geography lacks closeness to customers, making it currently viable only for latency-tolerant use cases such as model training. Although this is unlikely in the near future, advancements in technology and corresponding reductions in latency may unlock the potential for data centers in Northern Quebec.
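The feature-importance approach summarized in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic rather than the thesis dataset, the feature names and the helpers `make_synthetic` and `rank_features` are hypothetical, and only scikit-learn's `RandomForestClassifier` and `Lasso` are used (the SVM and XGBoost checks are omitted). The sketch ranks features by Random Forest importance and then checks whether a LASSO fit on the binary label selects the same features.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Lasso

# Hypothetical feature names; the data below are synthetic, NOT the thesis dataset.
FEATURES = ["solar_pv_capacity", "internet_adoption", "noise_a", "noise_b"]


def make_synthetic(n=2000, seed=0):
    """Build a toy dataset where only the first two features drive the label."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, len(FEATURES)))
    # Assumption for illustration: data center presence depends on two features.
    signal = 2.0 * X[:, 0] + 1.5 * X[:, 1]
    y = (signal + rng.normal(scale=0.5, size=n) > 0).astype(int)
    return X, y


def rank_features(X, y, seed=0):
    """Return (random forest ranking, LASSO ranking), each sorted high to low."""
    forest = RandomForestClassifier(n_estimators=100, random_state=seed)
    forest.fit(X, y)
    rf_rank = sorted(
        zip(FEATURES, forest.feature_importances_),
        key=lambda pair: pair[1],
        reverse=True,
    )

    # LASSO on the 0/1 label: the L1 penalty shrinks weak coefficients to zero.
    lasso = Lasso(alpha=0.05)
    lasso.fit(X, y)
    lasso_rank = sorted(
        zip(FEATURES, np.abs(lasso.coef_)),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return rf_rank, lasso_rank


X, y = make_synthetic()
rf_rank, lasso_rank = rank_features(X, y)

for name, score in rf_rank:
    print(f"random forest  {name:20s} {score:.3f}")
for name, score in lasso_rank:
    print(f"lasso |coef|   {name:20s} {score:.3f}")
```

When both rankings place the same features at the top, the importance result is less likely to be an artifact of one model family, which is the role the corroborating models play in the abstract.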