Operations Research and Financial Engineering, 2000-2024
Permanent URI for this collection: https://theses-dissertations.princeton.edu/handle/88435/dsp011r66j119j
A Hybrid GARCH and LSTM Model for Forecasting Volatility and Investment Horizons
(2025-04-10) Le, Jason; Scheinerman, Daniel
Accurately forecasting financial volatility is a critical component of modern finance, underpinning tasks such as risk management, asset pricing, and portfolio optimization. However, the stochastic and dynamic nature of financial markets poses significant challenges for existing models. Econometric approaches like Generalized Autoregressive Conditional Heteroskedasticity (GARCH) models are effective at capturing short-term volatility clustering but are limited in addressing nonlinearities and long-term dependencies in financial time series. Machine learning models such as Long Short-Term Memory (LSTM) networks can model complex patterns and sequential dependencies but often lack the interpretability and theoretical grounding of traditional econometric methods.
This thesis develops a hybrid GARCH-LSTM model designed to improve the precision of volatility forecasts by combining the strengths of both methodologies. The hybrid model uses GARCH to estimate conditional volatilities and feeds these estimates, along with historical price data, into an LSTM network for further refinement. A central application of this hybrid approach lies in solving a practical investment problem: determining the maximum time horizon an investor can remain invested without exceeding a predefined loss tolerance, given a specific confidence level.
The time horizon is estimated by combining the hybrid model's volatility forecasts with Monte Carlo simulations, which generate potential price paths based on predicted volatilities. These simulations provide a probabilistic framework for quantifying the likelihood of maintaining an investment within acceptable loss thresholds.
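To make the horizon estimation concrete, here is a minimal sketch (not from the thesis) of the Monte Carlo search it describes: given a daily volatility forecast from the hybrid model, simulate log-price paths and return the longest horizon whose breach probability stays within the confidence level. All names and parameter values are illustrative.

```python
import numpy as np

def max_horizon(sigma_daily, loss_tol=0.10, confidence=0.95,
                mu_daily=0.0, n_paths=10_000, max_days=252, seed=0):
    """Longest horizon (in days) such that the probability of losing more
    than loss_tol of the initial investment stays below 1 - confidence."""
    rng = np.random.default_rng(seed)
    shocks = rng.normal(mu_daily - 0.5 * sigma_daily**2, sigma_daily,
                        size=(n_paths, max_days))
    log_paths = np.cumsum(shocks, axis=1)
    # A path counts as breached at day t if it has dropped below the
    # loss tolerance at any day up to and including t.
    breached_by_t = np.cumsum(np.exp(log_paths) < 1.0 - loss_tol, axis=1) > 0
    prob_breach = breached_by_t.mean(axis=0)
    # prob_breach is nondecreasing in t, so the admissible days form a prefix.
    return int((prob_breach <= 1.0 - confidence).sum())

print(max_horizon(sigma_daily=0.015))  # e.g., a forecast daily vol of 1.5%
```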
By focusing on optimizing investment time horizons, this thesis contributes a model for integrating advanced forecasting techniques into practical financial decision-making. Additionally, the results aim to equip investors and risk managers with tools to make informed decisions in the face of uncertainty.
A Look Into Risk and Returns: The Predictive Value of Risk Indicators in Emerging Market Equities
(2025-04-08) Sukha, Deven P.; Rigobon, Daniel
Emerging Markets (EMs) are known to exhibit greater volatility in risk, meaning the indicators used to track risk fluctuate more than those in developed markets. This raises an important question: does the movement of risk indicators contain information that can aid in predicting returns in EM equity markets? To address this, we focused on five types of risk (credit, financial, political, economic, and composite), using values from several financial services firms. However, we found that changes in risk scores were not correlated across providers during the period studied, leading us to rely on a single provider for each: S&P Global for sovereign credit and the International Country Risk Guide (ICRG) for the remaining risk indicators. We also modeled equity returns using pooled and country-specific Random Forest models, incorporating a range of macroeconomic variables that are relevant for return prediction. Predictive performance was evaluated using R2 and root mean squared error (RMSE), and the contribution of risk indicators was assessed through feature importance. We trained baseline models excluding risk indicators to test whether macroeconomic factors alone could compensate. Our results highlight the inherent difficulty of predicting equity returns: model performance was poor across the board, with R2 values near or below zero. While models that included risk indicators performed slightly better, the improvement was marginal. These findings suggest that changes in the selected risk indicators provide limited additional predictive value under our modeling approach. However, this does not necessarily mean that such indicators have no predictive usefulness in general. One plausible explanation is that equity indices may already reflect or even precede changes in these risk metrics, making any subsequent shifts in the risk indicators appear to have little effect. Further research could investigate different lags and modeling strategies to understand whether, and under what conditions, these risk indicators might enhance equity return predictions.
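The modeling setup lends itself to a short sketch (illustrative only, with placeholder data names): fit pooled Random Forests with and without the risk-indicator changes and compare R2, RMSE, and feature importances.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score, mean_squared_error

def evaluate(X_train, y_train, X_test, y_test):
    """Fit a pooled Random Forest and report R2, RMSE, and importances."""
    rf = RandomForestRegressor(n_estimators=500, random_state=0)
    rf.fit(X_train, y_train)
    pred = rf.predict(X_test)
    return (r2_score(y_test, pred),
            mean_squared_error(y_test, pred) ** 0.5,
            rf.feature_importances_)

# Baseline (macro variables only) vs. full model (macro + risk changes);
# X_macro_*, X_risk_*, y_* are placeholder arrays, not the thesis data.
# r2_b, rmse_b, _   = evaluate(X_macro_tr, y_tr, X_macro_te, y_te)
# r2_f, rmse_f, imp = evaluate(np.hstack([X_macro_tr, X_risk_tr]), y_tr,
#                              np.hstack([X_macro_te, X_risk_te]), y_te)
```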
A Supervised Learning Framework for Generating DJ Transitions
(2025-04-10) Hein, Michael; Hubert, Emma
A disc jockey (DJ) curates a seamless auditory experience by skillfully transitioning between tracks. While these transitions can sometimes involve complex loops and sound effects, their most fundamental components often involve manipulating volume and adjusting frequency ranges to blend two songs. Prior work on automating DJ transitions has largely relied on heuristics or unsupervised learning approaches such as generative adversarial networks (GANs). In this paper, we present a novel supervised learning framework for generating DJ transitions between two tracks, providing an interpretable, data-driven alternative to previous methods. Using a dataset from 1001Tracklists containing real DJ mixes and their source tracks, we extract mel-spectrograms of the audio and train a convolutional neural network (CNN) to predict control signals that specify how volume and equalizer (EQ) bands should change over time. These predicted control signals are then applied to the source tracks to produce a transition, which is compared to the original transition from the DJ mix. To generate labeled input-output training pairs, we developed a full preprocessing pipeline that includes track-to-mix alignment using dynamic time warping (DTW), supported by both theoretical and empirical analyses of feature selection. While inspired by differentiable digital signal processing (DDSP), our learning phase operates entirely in the mel-spectrogram domain for simplicity and interpretability. We trained the model on a single example and found that it was able to replicate the corresponding ground truth transition with reasonable accuracy, offering early evidence that the task is learnable and that our framework has the capacity to produce non-trivial transitions. This work demonstrates the potential of supervised learning in generating realistic DJ transitions and lays the foundation for future research training on larger datasets.
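The alignment step can be sketched with standard audio tooling; this is a simplified stand-in for the thesis pipeline, with paths and parameters illustrative.

```python
import librosa

def align_track_to_mix(track_path, mix_path, sr=22050, n_mels=128):
    """Align a source track to the DJ mix via DTW over log-mel features."""
    y_track, _ = librosa.load(track_path, sr=sr)
    y_mix, _ = librosa.load(mix_path, sr=sr)
    S_track = librosa.power_to_db(
        librosa.feature.melspectrogram(y=y_track, sr=sr, n_mels=n_mels))
    S_mix = librosa.power_to_db(
        librosa.feature.melspectrogram(y=y_mix, sr=sr, n_mels=n_mels))
    # dtw returns the cumulative cost matrix and the optimal warping path,
    # a list of (track_frame, mix_frame) pairs ordered end-to-start.
    D, wp = librosa.sequence.dtw(X=S_track, Y=S_mix, metric="cosine")
    return wp[::-1]
```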
A Sustainable Extension of the Fama-French Factor Models: The Role of Carbon Emissions-Based Factors in Describing U.S. Stock Returns
(2025-04-10) Huang, Elaine L.; Cattaneo, Matias Damian
Amidst climate change concerns, many investors are incorporating climate-related considerations, such as a company's carbon dioxide (CO2) emissions, into their investment decisions. Unfortunately, CO2 data is often missing or estimated. Therefore, we aim to understand how companies' carbon emissions can describe, and how sector membership and carbon disclosure can impact, excess stock returns. We extend the Fama-French (FF) three-factor and five-factor models, which describe stock returns using financial metrics, to also include our constructed "Green-Minus-Brown" (GMB) factors: GMB_U (based on Log(CO2) emissions) and GMB_S (based on CO2 intensity). Our results show that (1) both GMB_U and GMB_S are statistically significant and have negative associations with excess stock returns; (2) stocks in greener sectors have more positive interactions with the GMB factors, stocks in browner sectors have more negative interactions, and sectors with less polarizing CO2 emissions tend to have statistically insignificant interactions; and (3) the returns of companies with reported CO2 data are more sensitive to changes in the GMB factors than those with estimated CO2 data. Our research supports existing literature that carbon emissions can be used to describe stock returns while being the first to build factors based on both unscaled and scaled carbon emissions and to analyze performance across sectors and CO2 data sources (i.e., estimated vs. reported). In addition, our GMB factors can be used by companies and investors alike to track the monthly spreads between the excess returns of green stocks and the excess returns of brown stocks.
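For readers who want the factor-model mechanics, a minimal sketch of the extended regression, assuming a monthly factor DataFrame holding the constructed GMB_U column; this is not the authors' exact code.

```python
import statsmodels.api as sm

def gmb_loading(excess_ret, factors):
    """Regress excess returns on the FF three factors plus GMB_U.
    `factors` must hold columns Mkt-RF, SMB, HML, GMB_U (assumed layout)."""
    X = sm.add_constant(factors[["Mkt-RF", "SMB", "HML", "GMB_U"]])
    fit = sm.OLS(excess_ret, X).fit()
    return fit.params["GMB_U"], fit.tvalues["GMB_U"]  # sign and significance
```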
A Temporal Network Approach to Modeling Quantitative Success in Venture Capital Ecosystems
(2025-04-10) Tziampazis, George E.; Akrotirianakis, Ioannis
This thesis investigates how temporal network structures can predict financial success in early-stage startups. Using investment data from Pitchbook, it constructs a dynamic graph of the North American Venture Capital (NAVC) ecosystem, capturing evolving relationships between investors and startups over time. From this network, node-level features such as temporal centrality and community embedding are computed to represent each startup's structural identity. These features are used as inputs to train an Extreme Gradient Boosting (XGBoost) supervised ML model to predict a binary classification target: successful exit (IPO or acquisition) or failure within a fixed time window. Results show that models incorporating temporal network features consistently outperform baselines and results from similar problems, particularly on Precision@K metrics, which are practically relevant to VC decision-making. The findings demonstrate that interpretable, time-aware network metrics can meaningfully enhance startup evaluation frameworks. This work contributes to the intersection of finance, network science, and predictive modeling, offering new tools for data-driven early-stage investment.
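The evaluation metric central to the thesis, Precision@K, is easy to state in code; the snippet below pairs it with an XGBoost classifier as described. Feature and label arrays are placeholders.

```python
import numpy as np
import xgboost as xgb

def precision_at_k(y_true, scores, k=50):
    """Fraction of true successful exits among the top-k ranked startups."""
    top_k = np.argsort(scores)[::-1][:k]
    return float(y_true[top_k].mean())

# X_*: temporal-centrality and community-embedding features per startup;
# y_*: 1 if IPO/acquisition within the window, else 0 (placeholders).
# clf = xgb.XGBClassifier(n_estimators=300, max_depth=4, eval_metric="logloss")
# clf.fit(X_train, y_train)
# print(precision_at_k(y_test, clf.predict_proba(X_test)[:, 1], k=50))
```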
AI-Enhanced Adaptive Portfolio Optimization: Beyond the Markowitz Model
(2025) Jimenez, Julian C.; Almgren, Robert
This thesis examines the progression of portfolio optimization techniques from traditional approaches (Markowitz and CAPM) to computationally advanced ones such as machine learning and LLMs. Using a 15-year dataset of daily S&P 500 returns, we show that Long Short-Term Memory (LSTM) networks excel at short-term return forecasting, while Deep Neural Networks (DNNs) excel at discerning complex, otherwise invisible long-term patterns and map directly from input data to asset weights. Both approaches surpass classical benchmarks in risk-adjusted performance. Lastly, we introduce a Large Language Model (LLM)-based simulator, demonstrating how ChatGPT can effectively synthesize textual signals (e.g., news headline sentiment, policy announcements) into allocation decisions. Our findings highlight the promising future of prompt engineering as well as LLMs' promising ability to combine numerical and textual insight into potentially better-understood portfolio strategies.
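As a toy illustration of the end-to-end mapping described, not the thesis's architecture, a minimal PyTorch sketch of a network that outputs long-only portfolio weights directly:

```python
import torch
import torch.nn as nn

class WeightNet(nn.Module):
    """Toy end-to-end allocator: maps a window of asset features straight
    to long-only portfolio weights via a softmax output layer."""
    def __init__(self, n_features, n_assets, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_features, hidden), nn.ReLU(),
                                 nn.Linear(hidden, n_assets))

    def forward(self, x):
        return torch.softmax(self.net(x), dim=-1)  # weights sum to 1

# Training would maximize a risk-adjusted objective over batches, e.g.
# minimize -mean(portfolio return) + lam * std(portfolio return).
```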
An Analysis of MOVES Style Transportation in New York City
(2025-05-10) Ginder, Koby; Kornhauser, Alain Lucien
Today, we stand at a critical moment in the evolution of automotive technology. Driverless technology has made tremendous progress over the past decade, and driverless vehicles have begun to permeate our society. The growth of this technology and the path it takes is sure to redefine how we think about mobility. This exploration aims to introduce, simulate, and test an innovative transportation style that has only recently been made possible by the strides in automotive driverless technology. This network, known as MOVES style transportation, will be analyzed in America's most populous city: New York City. This paper will first analyze the current patterns of transportation systems in the city; by inspecting public transportation data, it will show and visualize the current movement patterns of New Yorkers. It will introduce and describe the MOVES style autonomous driving network as it would be implemented in this specific use case. It will then model and simulate the performance of this system using specialized software developed by the Princeton Department of Operations Research and Financial Engineering. Financial performance will also be discussed based on the simulated results.
Arbitrage-Free and Simulation-Based Election Forecasting
(2025-04-10) O'Keefe, Edward P.; Tangpi, Ludovic
After mainstream electoral forecasts inaccurately predicted the outcome of the 2016 U.S. Presidential Election, alternative approaches to election forecasting became more prominent. Among these alternatives, prediction markets and arbitrage-free forecasting models have gained attention for offering more disciplined forecasts that can be interpreted as the price one would pay to wager on an election outcome. This thesis extends and enhances a popular-vote forecasting model developed by Fry & Burke. Specifically, our model addresses the inherent measurement errors in polling data, explicitly incorporating them into the forecasting methodology. Evaluations conducted across U.S. presidential elections from 1972 to 2024 demonstrate that this explicit consideration of polling errors significantly improves forecast accuracy. Additionally, comparisons with popular vote prediction market data from 2016 to 2024 show that prediction markets consistently underperform in forecasting outcomes for non-contested states, indicating systematic biases. To forecast electoral vote outcomes – a more complicated problem – we introduce a simulation-based approach that integrates Fry & Burke's popular-vote forecasting techniques with Monte Carlo simulation. Our electoral forecasting method outperforms forecasts provided by Nate Silver's FiveThirtyEight and prediction markets from 2016 to 2024. These findings underscore the effectiveness of our electoral vote forecasting model and highlight the potential biases present in electoral vote prediction markets.
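A stripped-down version of the simulation layer, not Fry & Burke's model itself: perturb each state's polled margin by a polling-error scale and tally electoral votes across draws.

```python
import numpy as np

def simulate_electoral_votes(poll_margin, poll_sigma, ev,
                             n_sims=100_000, seed=0):
    """Monte Carlo electoral-vote distribution. poll_margin, poll_sigma, ev
    are per-state arrays: polled margin, polling-error scale, electoral votes."""
    rng = np.random.default_rng(seed)
    draws = rng.normal(poll_margin, poll_sigma, size=(n_sims, len(ev)))
    return (draws > 0) @ ev  # electoral votes won in each simulation

# Example: np.mean(simulate_electoral_votes(m, s, ev) >= 270) estimates the
# probability of winning the Electoral College (inputs are illustrative).
```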
Assessing the Effectiveness and Disruption Resilience of the Grand Paris Express
(2025-04-10) Bangalore, Sheetal; Rebrova, Elizaveta
This thesis examines the Metro system in the Greater Paris region, a key case study for understanding urban transportation dynamics. Using 2021 data, we model the Metro as a network, simulate passenger movement, and apply several analysis metrics. With the ongoing Grand Paris Express (GPE) project set to expand the Metro by 2030, this thesis compares the 2021 and projected 2030 systems to predict, quantify, and assess the effectiveness of these expansions. Furthermore, we predict how various populations of commuters will be affected by this project. We also visualize the impact of various service interruptions across the 2030 Metro network to offer recommendations for how authorities can mitigate disruptions from strikes and other unforeseen incidents through proactive planning.
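One way to phrase the disruption analysis, as a sketch with networkx; the thesis's actual simulation is richer, and names here are illustrative.

```python
import networkx as nx

def disruption_impact(G, closed_station, od_pairs):
    """Compare shortest travel times before and after removing a station,
    over a set of origin-destination pairs."""
    base = {od: nx.shortest_path_length(G, *od, weight="time")
            for od in od_pairs}
    H = G.copy()
    H.remove_node(closed_station)
    impact = {}
    for od, t0 in base.items():
        try:
            impact[od] = nx.shortest_path_length(H, *od, weight="time") - t0
        except nx.NetworkXNoPath:
            impact[od] = float("inf")  # trip becomes impossible
    return impact
```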
Bayesian Adaptive Clinical Trials: A Soft Actor-Critic Reinforcement Learning Approach
(2025-04-13) Willer, Matt; Rigobon, Daniel
Adaptive clinical trial designs aim to improve efficiency and enhance ethical considerations by dynamically allocating patients to treatments based on accruing evidence. In this thesis, we formulate an adaptive clinical trial as a finite-horizon Markov Decision Process (MDP). The trial state comprises patient outcomes and Bayesian-updated treatment success probabilities, and is sequentially updated at each decision point. To solve the resulting treatment allocation decision-making problem, we implement a Soft Actor-Critic (SAC) framework that leverages maximum entropy reinforcement learning to balance exploration and exploitation effectively. To further capture this balance, we add a weight-adjusted Total Variation Distance (TVD) component to the reward function, enabling us to quantify the value of information gathered between decision points. We conducted numerical simulations under two training schemes: one in which outcomes were generated using the true treatment success probabilities, and another where outcomes were based on the agent's estimated probabilities. Across diverse hypothetical scenarios varying in cohort size, trial length, and prior knowledge, our SAC-based policy consistently approximated the ideal (oracle) policy in the true-probability setting. The agent was able to achieve success proportions close to those of the optimal policy while judiciously allocating more patients to the superior treatment. When the model was trained on estimated probabilities, performance degraded under high uncertainty or poorly specified priors, sometimes favoring a fixed, non-adaptive approach. Our results underscore the potential and limitations of employing SAC in adaptive trial design. Our proposed model provides a foundation for utilizing reinforcement learning in a clinical trial setting, highlighting the need for accurate prior information to fully realize its benefits. Our framework establishes a rigorous testbed for adaptive patient allocation, providing both theoretical insights and practical guidelines for future clinical trial designs.
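The Bayesian update and the TVD reward component admit a compact numerical sketch; the mixing form of the reward below is an assumption for illustration, not the thesis's exact specification.

```python
import numpy as np
from scipy.stats import beta

def tvd_beta(a1, b1, a2, b2, grid=2001):
    """Total variation distance between two Beta densities (numerical)."""
    x = np.linspace(0.0, 1.0, grid)
    return 0.5 * np.trapz(np.abs(beta.pdf(x, a1, b1) - beta.pdf(x, a2, b2)), x)

# Conjugate update after observing s successes among n patients on an arm,
# and an assumed reward mixing success rate with information gained:
a0, b0, s, n, w = 1, 1, 7, 10, 0.8
a1, b1 = a0 + s, b0 + (n - s)
reward = w * (s / n) + (1 - w) * tvd_beta(a1, b1, a0, b0)
```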
Beyond the Binder: A Machine Learning Exploration of Decision-Making in College Football
(2025-04-10) Matheson, Thomas J.; Kornhauser, Alain Lucien
This thesis investigates the decision-making behavior of college football coaches through the lens of data-driven optimal strategy. Literature has shown that despite the growing influence of analytics in sports, football is lagging behind, and many coaching decisions, such as fourth-down and two-point conversion attempts, continue to reflect conservative tendencies that may reduce a team's chances of winning. To analyze this phenomenon, we develop a machine learning win probability model using XGBoost, trained on play-by-play data from the four most recent NCAA FBS football seasons, beginning in 2021. Uniquely, we develop three models to estimate the likelihood of winning: all are based on game-state variables including score differential, time remaining, field position, and possession, with the second and third models additionally incorporating the point spread and advanced statistics, giving each successive model more information to learn from. Taking another novel approach, we then use an expected win probability framework to determine optimal decisions in high-leverage fourth-down situations and compare them against the actual coaching choices, quantifying "coaching aggression" by measuring deviations from optimality. The results refute the consensus that college coaches often exhibit excessive risk aversion, finding instead that excessive aggression leads to the loss of about a quarter of a game above expectation every year. These findings dissent from the growing body of sports analytics literature and offer insight into how traditional intuition may still override data-driven logic on the sidelines, though additional data validation and confirmation is still necessary.
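Schematically, the expected-win-probability decision rule compares options by outcome-weighted win probability; `enumerate_outcomes` below is a hypothetical helper standing in for the thesis's game-state transitions, and the feature encoding is schematic.

```python
import numpy as np

def fourth_down_decision(wp_model, state,
                         options=("go", "punt", "field_goal")):
    """Pick the fourth-down option with the highest expected win probability."""
    expected_wp = {}
    for opt in options:
        # enumerate success/failure game states for the option and weight
        # the model's win probabilities by the outcome probabilities
        outcomes, probs = enumerate_outcomes(state, opt)  # hypothetical helper
        wps = wp_model.predict_proba(outcomes)[:, 1]
        expected_wp[opt] = float(np.dot(probs, wps))
    return max(expected_wp, key=expected_wp.get), expected_wp
```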
Beyond the Stats: Quantifying Intangible Qualities in NFL Draft Prospects
(2025-04-10) Beyene, Jonathan; Akrotirianakis, Ioannis
NFL teams invest heavily in the scouting and evaluation of college players before the draft; however, many high draft picks underperform while later-round picks emerge as stars. This thesis investigates whether intangible traits, such as leadership, competitiveness, and work ethic, can be quantified from scouting reports and used to better predict a player's success in the NFL. Using a combination of zero-shot classification and sentiment analysis, trait-specific sentiment scores are computed across multiple positions. These scores are then incorporated alongside quantitative combine data and college career statistics for K-Means clustering to group players with similar profiles. For each position where the incorporation of intangible traits was more explanatory than using strictly quantitative statistics, six regression models were trained to predict a custom-defined career success metric based on positional performance and Approximate Value (AV). Cluster assignments were one-hot encoded to determine their predictive impact on career success. Clusters with high cluster coefficients and late average draft positions were identified as "undervalued," while clusters with early average draft positions and lower coefficients were considered "overvalued." Results indicated that qualitative clustering often yielded higher explanatory power than models based purely on quantitative features. In several cases, we observed higher cluster coefficients with a later average draft pick, suggesting that NFL teams may be systematically overlooking certain high-potential players. This thesis demonstrates the potential of utilizing Natural Language Processing and qualitative data in the evaluation of professional football scouting, and how it can prove more effective than the traditional quantitative approach.
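The trait-scoring step can be sketched with off-the-shelf Hugging Face pipelines; the way relevance and sentiment are combined below is an assumed illustration, not the thesis's exact formula.

```python
from transformers import pipeline

zero_shot = pipeline("zero-shot-classification",
                     model="facebook/bart-large-mnli")
sentiment = pipeline("sentiment-analysis")

traits = ["leadership", "competitiveness", "work ethic"]
report = "A vocal locker-room presence who refuses to take a rep off."

relevance = zero_shot(report, candidate_labels=traits, multi_label=True)
tone = sentiment(report)[0]
sign = 1 if tone["label"] == "POSITIVE" else -1
# Assumed combination: trait relevance in the report, signed and scaled
# by the report's overall sentiment.
scores = {lab: sign * rel * tone["score"]
          for lab, rel in zip(relevance["labels"], relevance["scores"])}
```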
Blood Glucose Prediction and Control for Type I Diabetes Management: A Machine Learning Approach
(2025-04-09) Dantzler, Aaron; Akrotirianakis, Ioannis
Type I Diabetes is a chronic disease in which patients cannot make insulin, or make very little insulin, to regulate their blood glucose. It affects over 1.7 million adults in the United States. People with Type I Diabetes are reliant on taking insulin every day, and recently insulin pumps, and specifically Automated Insulin Delivery (AID) systems, have revolutionized diabetes care, making treatment easier and more effective. There are three components needed for an AID system: a Continuous Glucose Monitor, which reports patient blood glucose; an Insulin Pump, which infuses insulin into the body; and an algorithm, which translates information from the first two components into an amount of insulin necessary to keep blood glucose in the target range. Our focus will be on the last component. First, this thesis will provide an overview of machine learning techniques for blood glucose prediction on the novel DiaTrend dataset (2023), which has not been extensively studied before (although research on machine learning models has been applied to previous datasets). Our work finds that adding complexity to our model only barely improves performance and does not justify longer run times and less interpretable results. Rather, we recommend a simple Autoregressive time series model which reaches similarly impressive performance to the rest of our models while being simpler for healthcare providers to interpret. In the second part of the thesis, we propose two new AID algorithms which utilize our Autoregressive model: the Threshold Controller and the IOB Controller. Rather than a PID or MPC approach, these algorithms rely on a set of simple heuristics similar to what an actual patient would use. We find that in a stressful scenario, these controllers are able to improve time in Target Range by up to 12% more than the leading open-source OpenAPS oref0 algorithm, while providing safety by mitigating low blood glucose. This work lays the foundation for researchers and healthcare providers to implement new AID algorithms which utilize a combination of machine learning models and patient-based heuristics.
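The recommended model is deliberately simple; a sketch with statsmodels on a synthetic stand-in for a CGM series:

```python
import numpy as np
from statsmodels.tsa.ar_model import AutoReg

# Synthetic stand-in for a regularly sampled CGM series (mg/dL, 5-minute
# sampling); the thesis uses the DiaTrend dataset instead.
rng = np.random.default_rng(0)
glucose = 120 + np.cumsum(rng.normal(0, 2, size=500))

res = AutoReg(glucose, lags=12).fit()
# Forecast the next 6 readings, a 30-minute horizon at 5-minute sampling.
forecast = res.predict(start=len(glucose), end=len(glucose) + 5)
```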
Branching Out: An Alternative Approach to Variational Inference Based Clonal Tree Reconstruction Using Wilson’s Algorithm
(2025-04-10) Tsai, Kyle; Raphael, Ben
In this thesis, we explore the application of variational inference in reconstructing tumor phylogenies, or clone trees, from copy number aberrations measured in single-cell DNA sequencing data. As a first step, we identify a key computational bottleneck in existing variational inference algorithms for clone tree inference [10], and propose a computationally attractive alternative. Specifically, we analyze and test the weighted spanning tree sampling algorithm LARS used in the clone tree inference pipeline VicTree [10]. Through comprehensive testing, we discover that LARS is not robust and fails to properly sample from its target sampling distribution. As an alternative, we propose applying Wilson's sampling algorithm [13], and find that it significantly outperforms LARS at sampling from the target distribution. Furthermore, Wilson's algorithm provides substantial computational benefits over LARS, and scales much better with problem size. Having demonstrated the superior performance of Wilson's sampling algorithm over LARS, we attempt to incorporate it into the VicTree variational inference pipeline. Preliminary results show that the clone tree reconstruction with the modified VicTree algorithm is promising, as it is more accurate and significantly faster than before, though our analysis also identifies several issues with the modified VicTree pipeline.
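Wilson's algorithm itself is short enough to show in full; below is a generic weighted implementation via loop-erased random walks, illustrative rather than the VicTree integration.

```python
import random

def wilson_spanning_tree(weights, root):
    """Sample a spanning tree via Wilson's loop-erased random walks.
    `weights[u]` maps each neighbor v to a positive edge weight; the graph
    is assumed connected, and trees are drawn with probability proportional
    to the product of their edge weights."""
    in_tree = {root}
    parent = {}
    for start in weights:
        u, nxt = start, {}
        while u not in in_tree:
            nbrs, w = zip(*weights[u].items())
            # re-assigning nxt[u] on revisits erases loops implicitly
            nxt[u] = random.choices(nbrs, weights=w)[0]
            u = nxt[u]
        u = start
        while u not in in_tree:  # retrace the loop-erased walk into the tree
            parent[u] = nxt[u]
            in_tree.add(u)
            u = nxt[u]
    return parent  # child -> parent edges of the sampled tree
```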
Egregiously Expensive Electricity: Bringing Your Bill Back to Earth Using Weather Data in a Deterministic Weighted Linear Program to Forecast Day-Ahead Prices
(2025-04-10) Witt, Harrison Matthew; Fan, Jianqing
Renewable energy is beginning to power the world. In 2024, 23% of the United States' electricity came from wind, hydropower, solar, and geothermal sources (EIA). This transition away from coal and natural gas has introduced challenges, particularly the unreliability of weather-dependent generation. Wind and solar rely on stochastic natural processes to produce power, making their output difficult to predict in advance. When these assets underperform their forecasted production, local grid operators must purchase emergency electricity in the spot market from fossil-fuel-powered peaker plants—at significant financial and environmental cost. These costs are passed on to consumers, raising electricity bills and introducing instability into the system.
A solution to this problem would be to have grid operators probabilistically account for the uncertainty of renewable energy production. There exists a robust body of academic research offering stochastic alternatives to the current deterministic mixed-integer linear program (MILP) used by Independent System Operators (ISOs). Unfortunately, federal regulatory agencies have rejected these proposals, citing concerns that overhauling the grid's optimization algorithm could cause unacceptable blackouts to essential public infrastructure like hospitals, water treatment facilities, and emergency response networks. This disconnect between academia, industry, and grid operators has created a critical gap in public-domain research: the need for an interpretable forecasting model that accounts for stochasticity while retaining the deterministic structure required for real-time system deployment. My thesis bridges this gap by enhancing the predictive power of deterministic day-ahead electricity pricing models through the incorporation of relevant weather features. My ultimate goal is to reduce consumer electricity prices and stabilize the renewable energy transition.
To address this challenge, I created three deterministic optimization models that map day-ahead electricity prices to their realizations. I began by constructing a baseline model that minimizes absolute error between day-ahead and real-time prices. I then extended this model to include weather-based features such as temperature, dewpoint, and relative humidity. These variables were selected based on domain knowledge of their influence on renewable generation variability. Using data from PJM's PSEG node with the greatest price volatility, I fit a linear program that minimizes mean absolute error (MAE) between forecasted and realized prices. My final reduced model includes only the most relevant weather predictors and demonstrates improved predictive accuracy without sacrificing interpretability.
My results show that incorporating weather features into a deterministic framework improves forecasting accuracy, reducing the MAE from 9.16 to 9.04. While this reduction is modest, it validates my hypothesis that deterministic models can be enhanced without requiring probabilistic and stochastic components. More importantly, my approach lays the groundwork for real-world integration, because I maintain compatibility with the current deterministic MILP structure used by ISOs.
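The core estimation step, least-absolute-error regression, linearizes neatly as an LP; a minimal sketch with scipy, with the feature layout illustrative:

```python
import numpy as np
from scipy.optimize import linprog

def fit_mae_linear(X, y):
    """Least-absolute-error regression as an LP: minimize mean(e) subject to
    -e <= y - X @ beta <= e. Decision variables are [beta (p), e (n)]."""
    n, p = X.shape
    c = np.concatenate([np.zeros(p), np.ones(n) / n])  # mean absolute error
    # y - X beta <= e  ->  -X beta - e <= -y
    # X beta - y <= e  ->   X beta - e <=  y
    A_ub = np.block([[-X, -np.eye(n)], [X, -np.eye(n)]])
    b_ub = np.concatenate([-y, y])
    bounds = [(None, None)] * p + [(0, None)] * n
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[:p]  # fitted coefficients

# Columns of X: day-ahead price plus weather features (temperature,
# dewpoint, relative humidity); y: realized real-time prices.
```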
Equitable Staffing in Heterogeneous Queueing Systems: A Framework for Delay-Minimizing Resource Allocation
(2025-04-10) Njuguna, Moses; Massey, William AlfredPublic resources are fundamentally limited in capacity, and delays in access are often unavoidable. Traditional approaches to resource planning typically ask: how many resources are needed to reduce delay? However, this question becomes significantly more complex when a single resource system must be partitioned into multiple, structurally distinct subsystems—such as by gender, geographic region, or urgency of need. Each of these subsystems exhibits its own demand profile and delay behavior, yet they all draw from a common, shared resource pool.
In such environments, we cannot make staffing decisions for each group in isolation. The overall system functions as an interconnected whole, and performance in one subsystem is inherently tied to the behaviors of the others. This thesis presents a queueing-theoretic framework for analyzing and staffing such group-structured public resource systems. Rather than minimizing delay within each group independently, we propose a method for balancing delay across heterogeneous subsystems in a way that reflects their joint use of limited capacity.
We term this approach equitable staffing—a strategy that allocates resources not merely in proportion to demand, but with sensitivity to delay patterns across groups. Applications to gender-partitioned restrooms and regionally deployed ambulance fleets demonstrate how this framework can guide system-aware staffing decisions, ensuring both fairness and efficiency in public resource design.
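To ground the framework, here is an illustrative sketch, not from the thesis: Erlang-C delay probabilities per group, with a brute-force search for the split of a shared server pool that balances delays. The min-max-spread criterion is an assumed stand-in for the thesis's balancing rule.

```python
import itertools
from math import factorial

def erlang_c_wait_prob(lam, mu, c):
    """P(wait > 0) in an M/M/c queue with arrival rate lam, service rate mu,
    and c servers (requires lam < c * mu for stability)."""
    a = lam / mu  # offered load in Erlangs
    rho = a / c
    tail = a**c / (factorial(c) * (1 - rho))
    p0_inv = sum(a**k / factorial(k) for k in range(c)) + tail
    return tail / p0_inv

def equitable_split(lams, mu, c_total):
    """Allocate c_total servers across groups to minimize the spread
    between the groups' delay probabilities (brute-force illustration)."""
    best = None
    for split in itertools.product(range(1, c_total), repeat=len(lams)):
        if sum(split) != c_total:
            continue
        if any(l >= c * mu for l, c in zip(lams, split)):
            continue  # skip unstable allocations
        delays = [erlang_c_wait_prob(l, mu, c) for l, c in zip(lams, split)]
        spread = max(delays) - min(delays)
        if best is None or spread < best[0]:
            best = (spread, split, delays)
    return best

print(equitable_split(lams=[4.0, 2.5], mu=1.0, c_total=10))
```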
Evaluating Domain-Specific Topic Reduction for Sparse Vector Document Retrieval
(2025-05-10) Irons, Carson P.; Hanin, Boris
This thesis investigates the limitations of current document retrieval systems and introduces an alternative architecture leveraging topic-level sparse indexing of contextual embeddings. This theoretical retrieval system seeks to achieve high computational efficiency through low latency and indexing overhead, while also achieving high semantic understanding and respecting local meaning and document cohesion. Additionally, the system supports scalable and context-aware document matching without reliance on user interaction data.
In pursuit of these objectives, the system makes two key assumptions about the structure and content of documents within a chosen application domain. The first assumption is that documents can be broken into self-contained semantic components; the second assumes an ability to represent the application domain's distinct meanings as a finite, discrete set of topics.
At a high level, the proposed system aims to represent a document as a bag of topics, then apply sparse vector ranking algorithms at retrieval time. Topics are inferred by clustering the contextualized embeddings of semantic components within a learned embedding space.
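At its simplest, that pipeline clusters component embeddings into a finite topic set and counts topics per document; a toy sketch with SBERT follows, in which the segmentation helper and corpus are illustrative stand-ins.

```python
from collections import Counter
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

model = SentenceTransformer("all-MiniLM-L6-v2")

def split_components(doc):
    # naive stand-in for the thesis's semantic-component segmentation
    return [s.strip() for s in doc.split(".") if s.strip()]

documents = ["Toy document one. It has two components.",
             "Toy document two. Also two components."]
components = [s for d in documents for s in split_components(d)]

# Learn a finite topic set by clustering contextual embeddings.
topics = KMeans(n_clusters=2, random_state=0).fit(model.encode(components))

def bag_of_topics(doc):
    """Sparse topic-count vector: the document as a bag of topics."""
    return Counter(topics.predict(model.encode(split_components(doc))))
```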
The contributions of this thesis involve a review of existing retrieval methods, an outline of the proposed system's intuition and architecture, and an explorative implementation against a strategically chosen application domain. The thesis finds that standard embedding models (SBERT in this case) are insufficient for identifying application specific topics. Future work will focus on fine-tuning embedding models to better capture domain-specific semantics and fully evaluate the potential of this topic-based retrieval framework.
The thesis also provides the necessary tooling for extending and modifying the retrieval pipeline: it supports the training and querying of the proposed retrieval system while accepting custom implementations at each step.
Evaluating Individual Player Value and Positional Spending Efficiency in the National Football League
(2025-04-10) Jasti, Rahul; Scheinerman, Daniel
This thesis introduces a data-driven framework for evaluating player value in the National Football League (NFL) by linking advanced performance metrics to player salaries. Despite the proliferation of advanced metrics in professional football, translating measures such as Wins Above Replacement, Expected Points Added, and Pro Football Focus grades into fair salary valuations remains challenging. The proposed framework addresses this gap by combining unsupervised learning with predictive modeling. Specifically, we use k-means to group players into performance-based archetypes. Then we train XGBoost regression models for each archetype to predict players' expected average per-year salary. Finally, we design a constrained roster optimization model to maximize expected team wins under the salary cap. This segmented modeling approach enables a fine-grained evaluation of cost-efficiency across player roles and reveals systematic market inefficiencies. Results indicate that certain roles are consistently undervalued, whereas others are overvalued relative to their on-field contributions. We acknowledge that our findings are limited when considering an entire NFL roster due to the scarcity of advanced tracking data. We further acknowledge that our results are subject to uncertainty due to insufficient robustness checks and validation. Nevertheless, our results are intriguing and immediately provide a practical application for researchers or general managers who want to improve their spending efficiency.
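The roster-optimization step is, at heart, a knapsack-style integer program; a minimal sketch with scipy, where the single cap and roster-size constraints are a simplification of the thesis's model and all numbers are illustrative.

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

def pick_roster(wins, salary, cap, roster_size):
    """Binary selection: maximize predicted wins added subject to the
    salary cap and a fixed roster size."""
    n = len(wins)
    constraints = [
        LinearConstraint(np.atleast_2d(salary), ub=cap),
        LinearConstraint(np.ones((1, n)), lb=roster_size, ub=roster_size),
    ]
    res = milp(c=-np.asarray(wins, dtype=float), constraints=constraints,
               integrality=np.ones(n), bounds=Bounds(0, 1))
    return np.flatnonzero(res.x > 0.5)

# wins: per-player expected wins added (from the archetype XGBoost models);
# salary: predicted fair average-per-year salary (illustrative values).
print(pick_roster(wins=[2.1, 1.4, 0.9, 0.7], salary=[30, 12, 8, 4],
                  cap=40, roster_size=2))
```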
Evaluating the Geographically Weighted Regression for Modeling Fertility Rates in South Korea
(2025-05-01) Cho, Sung; Cerenzia, Mark
South Korea's total fertility rate (TFR) has steadily declined to unprecedented levels, reaching 0.72 in 2023, which is well below the replacement level of 2.1. As this decline continues, the trend poses severe economic and demographic challenges, including rapid population aging, labor force contraction, and increasing strain on welfare systems. This thesis evaluates the effectiveness of using Geographically Weighted Regression (GWR) to model South Korea's TFR at the local level. In particular, we revisit the work done by Jung et al. (2019), which fitted the model on data from 2019. One aspect of the model not addressed in their paper is its use of "pseudo-t statistics," which is a result of the model's violation of classical OLS assumptions. To address this gap, we re-estimate both an Ordinary Least Squares (OLS) model and a GWR model using updated 2023 data across 190 administrative regions. The models' fit is assessed using test statistics including AICc, Moran's I, and Koenker (BP). We then implement a 5,000-iteration nonparametric bootstrap procedure to evaluate the stability of the GWR coefficient estimates, computing empirical confidence intervals and percent-opposite-sign metrics for each coefficient. The results suggest that GWR improves model fit relative to OLS, capturing meaningful spatial heterogeneity in the data which OLS does not take into account. However, the bootstrap analysis reveals instability in the coefficient estimates, casting doubt on the reliability of inference drawn from the GWR pseudo-t statistics. These findings ultimately support the use of GWR as an exploratory rather than an immediately inferential tool and underscore the spatial and statistical complexity of TFR modeling in Korea.
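A minimal, self-contained version of the estimator and the bootstrap loop follows, using Gaussian kernel weights; this is not the paper's exact specification or bandwidth selection, and inputs are assumed to be numpy arrays.

```python
import numpy as np

def gwr_coefficients(coords, X, y, bw):
    """Minimal GWR: at each location, weighted least squares with a
    Gaussian kernel over inter-region distances (bandwidth bw)."""
    n = len(y)
    betas = np.empty((n, X.shape[1]))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (d / bw) ** 2)
        Xw = X * w[:, None]  # X' W X and X' W y via row-weighted X
        betas[i] = np.linalg.solve(Xw.T @ X, Xw.T @ y)
    return betas

def bootstrap_betas(coords, X, y, bw, n_boot=5000, seed=0):
    """Nonparametric bootstrap over regions for coefficient stability;
    percentile CIs and sign-flip rates can be read off the draws."""
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)
        draws.append(gwr_coefficients(coords[idx], X[idx], y[idx], bw))
    return np.array(draws)
```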
Exact and Heuristic Optimization Methods for the Transportation of Radiopharmaceuticals
(2025-04-10) Desai, Jashvi; Akrotirianakis, Ioannis
This thesis addresses the optimization of radiopharmaceutical transportation for distribution by developing a comprehensive variant of the Vehicle Routing Problem (VRP). Radiopharmaceuticals are highly time-sensitive due to their short half-lives, making timely delivery crucial for maintaining clinical efficacy in PET scan imaging. The proposed model incorporates several realistic extensions to the classical (capacitated) VRP: heterogeneous fleets, time windows, pickup and delivery, and split deliveries. The combination of these features within a single, healthcare-specific VRP model tailored to radiopharmaceutical delivery represents a meaningful and novel advancement not covered by existing literature.
An exact mixed-integer programming model is implemented using Gurobi to explore the scalability limits of exact methods. A series of computational experiments on randomly generated networks of increasing size reveals that exact methods quickly become infeasible beyond 17–18 nodes due to exponential runtime growth. To address this limitation, an Adaptive Large Neighborhood Search (ALNS) heuristic is developed and tested. A real-world case study involving 44 medical imaging facilities in the Metro-Detroit area is then used to evaluate heuristic performance at scale. Results show that the ALNS consistently produces high-quality solutions in an efficient manner, achieving up to a 21.5% improvement over an initial feasible solution. The most effective operator combination emerged as random removal followed by savings insertion, with the "adaptive" portion of the algorithm quickly learning to prioritize these heuristics. These findings underscore the potential of algorithmic approaches in improving delivery reliability for time-sensitive healthcare supply chains.
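The ALNS loop reduces to a compact skeleton; in the sketch below the destroy and repair operators are user-supplied callables, and the greedy acceptance and additive weight-update rules are simplified stand-ins for the thesis's adaptive scheme.

```python
import random

def alns(initial_routes, cost, destroy_ops, repair_ops, iters=10_000):
    """Skeleton of Adaptive Large Neighborhood Search: pick destroy/repair
    operators with adaptive weights, accept improvements, and reinforce
    the operator pair that produced them."""
    best = current = initial_routes
    w_d = {op: 1.0 for op in destroy_ops}
    w_r = {op: 1.0 for op in repair_ops}
    for _ in range(iters):
        d = random.choices(list(w_d), weights=list(w_d.values()))[0]
        r = random.choices(list(w_r), weights=list(w_r.values()))[0]
        candidate = r(d(current))  # e.g., random removal, savings insertion
        if cost(candidate) < cost(current):
            current = candidate
            w_d[d] += 0.1  # simplified adaptive reinforcement
            w_r[r] += 0.1
        if cost(current) < cost(best):
            best = current
    return best
```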