Princeton University users: To view a senior thesis while you are away from campus, you will need to connect to the campus network remotely through the GlobalProtect virtual private network (VPN). If you are not affiliated with the University and are requesting a copy of a thesis, please note that all requests are processed manually by staff and will take additional time.
 

Publication:

Pardon My French: Assessing the Potential for Data Centers in Northern Quebec With Machine Learning Models

Files

ORFE_Senior_Thesis_Tyler_Berretta.pdf (2.3 MB)

Date

2025-05

Abstract

This thesis explores the feasibility of data centers in Northern Quebec, using machine learning models to determine the importance of site selection factors. Specifically, Random Forests are used to learn feature importance on a large, multisource dataset of hyperscale data centers and corresponding data points captured at national and regional levels from 2006 to 2024. SVMs, LASSO regression, and XGBoost models are used to corroborate the Random Forest's feature importance results. Installed Solar PV Power Capacity and Internet Adoption, which represent the feature categories Renewable Electricity Supply and Closeness to Customers, respectively, are found to be robust predictors of the existence of a data center at a given location in a given year. Canada has a healthy supply of renewable electricity, with abundant hydro energy and rapid growth in nuclear energy, as well as comparable closeness to customers, thanks to its large cities and their proximity to American cities across the border. Correspondingly, Canadian hyperscale data centers have begun to appear in Montreal, Toronto, and Vancouver. Northern Quebec, specifically, has high potential for renewable electricity supply, with geography favorable to nuclear plants and cheap hydro energy. However, its rural geography lacks closeness to customers, making it currently viable only for latency-tolerant use cases such as model training. While this is unlikely in the near future, advancements in technology and corresponding reductions in latency may unlock the potential for data centers in Northern Quebec.
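
The general approach described in the abstract, ranking features with a Random Forest and checking the ranking against a second model, can be sketched as follows. This is a minimal illustration only, not the thesis's code or data: it assumes scikit-learn and NumPy are available, the dataset is synthetic, and the names `FEATURES`, `make_synthetic_data`, `rank_features`, `placeholder_a`, and `placeholder_b` are hypothetical. Only LASSO is shown as the corroborating model; the SVM and XGBoost checks are omitted.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

# Hypothetical column names. The first two echo the abstract's predictors;
# the last two are uninformative placeholders.
FEATURES = ["solar_pv_capacity", "internet_adoption", "placeholder_a", "placeholder_b"]


def make_synthetic_data(n_rows=600, seed=0):
    """Build a toy dataset in which only the first two columns drive the label."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_rows, len(FEATURES)))
    signal = X[:, 0] + X[:, 1] + 0.3 * rng.normal(size=n_rows)
    y = (signal > 0).astype(int)  # 1 = "data center present" (synthetic label)
    return X, y


def rank_features(scores):
    """Return feature names sorted by absolute score, largest first."""
    order = np.argsort(-np.abs(scores))
    return [FEATURES[i] for i in order]


X, y = make_synthetic_data()

# Impurity-based feature importance from a Random Forest classifier.
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
forest_ranking = rank_features(forest.feature_importances_)

# Corroboration: LASSO on standardized features, ranked by coefficient size.
X_std = StandardScaler().fit_transform(X)
lasso = Lasso(alpha=0.05).fit(X_std, y)
lasso_ranking = rank_features(lasso.coef_)

print("Random Forest ranking:", forest_ranking)
print("LASSO ranking:", lasso_ranking)
```

When both rankings place the same features at the top, that agreement is the kind of corroboration the abstract describes; on real data, the rankings could disagree and would need closer inspection.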