Seattle Mariners Front Office Analytics

01 / The Problem

Three questions every front office faces every offseason.

Payroll-constrained teams can't afford bad contracts. The Mariners operate in a mid-market window where every dollar of AAV must justify itself in WAR — yet traditional point-estimate stats obscure how confident you should actually be in a player's true talent level, especially for bench players and early-season samples.

Beyond current contracts, two forward-looking questions shape roster construction: is the development pipeline producing MLB-caliber talent at the positions the team needs, and which free agents represent genuine surplus value relative to the market?

This project builds three interconnected analytical tools to answer all three, each feeding into the next.

02 / The Data

Four sources, one DuckDB warehouse.

FanGraphs (via pybaseball)

Player-season batting and pitching stats including WAR, PA, and contract data. 2015–2025. Ingested via curl_cffi to bypass Cloudflare; idempotent upserts into DuckDB.
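
A minimal sketch of what an idempotent upsert can look like. It uses Python's built-in sqlite3 module as a stand-in for DuckDB so it runs with no extra dependencies. The table name batting_seasons, its columns, and the (player_id, season) key are assumptions for illustration, not the project's actual schema, and the rows are made-up values.

```python
import sqlite3

# Hypothetical schema: one row per (player_id, season).
DDL = """
CREATE TABLE IF NOT EXISTS batting_seasons (
    player_id INTEGER NOT NULL,
    season    INTEGER NOT NULL,
    pa        INTEGER,
    war       REAL,
    PRIMARY KEY (player_id, season)
)
"""

# On a key collision, overwrite the stored stats instead of adding a row.
UPSERT = """
INSERT INTO batting_seasons (player_id, season, pa, war)
VALUES (?, ?, ?, ?)
ON CONFLICT (player_id, season)
DO UPDATE SET pa = excluded.pa, war = excluded.war
"""

def upsert_rows(conn, rows):
    """Insert new player-seasons and overwrite existing ones in place."""
    conn.execute(DDL)
    conn.executemany(UPSERT, rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
rows = [(1, 2024, 600, 4.1), (2, 2024, 90, 0.3)]  # made-up values
upsert_rows(conn, rows)
upsert_rows(conn, rows)                   # re-running adds no duplicates
upsert_rows(conn, [(2, 2024, 120, 0.5)])  # a refreshed row replaces the old one

row_count = conn.execute("SELECT COUNT(*) FROM batting_seasons").fetchone()[0]
refreshed = conn.execute(
    "SELECT pa, war FROM batting_seasons WHERE player_id = 2 AND season = 2024"
).fetchone()
```

Because the primary key decides whether a row is new or a refresh, the ingest can be re-run after a partial failure without first clearing the table.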

Baseball Savant (Statcast)

Exit velocity, barrel%, and xwOBA per player-season from 2015 onward. Contact quality features used as inputs to the FA projection model.

Baseball Reference (Draft)

Amateur draft picks for classes 2013–2025, rounds 1–20, matched to mlbam_id for career WAR joins. Pick-slot baseline computed across all 30 teams.

Spotrac (Contracts)

Current Mariners contract AAV scraped via BeautifulSoup. Joined to player-seasons by normalized name. Full reload on each run — Spotrac is the authoritative source.
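
The source does not spell out its normalization rules, so the following is one plausible sketch of a normalized-name join: strip accents, punctuation, and generational suffixes, then match on the cleaned key. The function normalize_name, the suffix list, and the player names and AAV figures are all hypothetical.

```python
import re
import unicodedata

# Assumed list of generational suffixes to ignore when matching.
_SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}

def normalize_name(name: str) -> str:
    """Lowercase, strip accents and punctuation, drop generational suffixes."""
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    lowered = ascii_only.lower()
    # Delete periods and apostrophes so "T.J." and "TJ" produce the same key.
    lowered = re.sub(r"[.'’]", "", lowered)
    # Turn any remaining non-letter (hyphens, digits) into a space.
    cleaned = re.sub(r"[^a-z ]", " ", lowered)
    tokens = [t for t in cleaned.split() if t not in _SUFFIXES]
    return " ".join(tokens)

# Fictional players with placeholder AAV values ($M), not real contract data.
contracts = {"Mateo Ibáñez-Cruz Jr.": 5.0, "T.J. O'Neil": 2.5}
stats_names = ["Mateo Ibanez Cruz", "TJ ONeil", "Sam Doe"]

aav_by_key = {normalize_name(n): aav for n, aav in contracts.items()}
joined = {n: aav_by_key.get(normalize_name(n)) for n in stats_names}
```

Players with no contract match come back as None, which makes unmatched names easy to audit. A name-based key can still collide when two players share a name, so an ID-based join is safer wherever an ID is available.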

03 / Methodology

01

Hierarchical Bayesian WAR Model

Each player's latent true-talent WAR is drawn from a position-group distribution (C, 1B, 2B, 3B, SS, OF, UT), whose parameters are themselves drawn from a league-wide hyperprior. Playing time is encoded as observation uncertainty: a player with 90 PA has a wide likelihood; one with 650 PA has a narrow one. This partial pooling prevents overreacting to small samples while still updating meaningfully on full-season evidence. Sampled with NUTS at target_accept=0.98 (4 chains × 1,000 draws, 0 divergences). Each player's posterior is summarized as a mean with a 94% HDI.
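
The shrinkage behavior described above can be illustrated with a closed-form normal-normal update. This is a simplification, not the project's model: the group mean and spread are fixed by hand rather than learned by NUTS, the observation noise is assumed to scale with sqrt(650 / PA), and every number is made up. The function shrink and its parameters are hypothetical names.

```python
import math

def shrink(observed_rate, pa, group_mean, group_sd, sd_at_650=1.0):
    """Posterior mean and sd of true talent under a normal-normal model.

    observed_rate: observed WAR scaled to 650 PA.
    sd_at_650: assumed observation noise for a full 650-PA season.
    """
    # Fewer plate appearances -> wider likelihood (assumed sqrt scaling).
    obs_sd = sd_at_650 * math.sqrt(650 / pa)
    prior_precision = 1 / group_sd ** 2
    obs_precision = 1 / obs_sd ** 2
    post_var = 1 / (prior_precision + obs_precision)
    # Precision-weighted average of the group mean and the observation.
    post_mean = post_var * (prior_precision * group_mean
                            + obs_precision * observed_rate)
    return post_mean, math.sqrt(post_var)

# Two hypothetical players with the same observed rate but different samples.
bench_mean, bench_sd = shrink(5.0, pa=90, group_mean=2.0, group_sd=1.5)
regular_mean, regular_sd = shrink(5.0, pa=650, group_mean=2.0, group_sd=1.5)
```

With identical observed rates, the 90-PA estimate stays close to the position-group mean and keeps a wide posterior, while the 650-PA estimate moves most of the way toward the observation.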

02

Draft Cohort ROI — Pure SQL in DuckDB

Draft classes 2013–2025 analyzed entirely with DuckDB window functions and CTEs. Pick-slot WAR baselines are computed as the league-average career WAR at each overall pick number across all 30 teams; any Mariners pick above that curve represents developmental surplus. MLB reach rate, time-to-debut, and a new development curve (cumulative WAR per draftee by years since draft) are all exported as CSVs for Tableau. Positional pipeline gaps identified here feed directly into Pillar 3 FA targeting.
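
A minimal sketch of the pick-slot baseline and surplus calculation, using one CTE and one window function. It runs the SQL through Python's built-in sqlite3 module as a stand-in for DuckDB; the table draft_picks, its columns, the placeholder team codes AAA and BBB, and all WAR values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE draft_picks ("
    "team TEXT, draft_year INTEGER, overall_pick INTEGER, career_war REAL)"
)
# Made-up rows: three teams at pick 12, two teams at pick 40.
conn.executemany(
    "INSERT INTO draft_picks VALUES (?, ?, ?, ?)",
    [
        ("SEA", 2013, 12, 6.0),
        ("AAA", 2014, 12, 2.0),
        ("BBB", 2015, 12, 1.0),
        ("SEA", 2014, 40, 0.0),
        ("AAA", 2013, 40, 3.0),
    ],
)

SURPLUS_SQL = """
WITH slot_baseline AS (
    SELECT
        team,
        draft_year,
        overall_pick,
        career_war,
        -- League-average career WAR at this pick number, all teams included.
        AVG(career_war) OVER (PARTITION BY overall_pick) AS baseline_war
    FROM draft_picks
)
SELECT draft_year, overall_pick, career_war - baseline_war AS surplus_war
FROM slot_baseline
WHERE team = 'SEA'
ORDER BY draft_year
"""

surplus = conn.execute(SURPLUS_SQL).fetchall()
```

The team filter sits in the outer query on purpose: applying it inside the CTE would compute the baseline from Mariners picks only and defeat the league-wide comparison.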

03

XGBoost FA Projection Model

Gradient-boosted regressor trained on player-seasons 2017–2023, predicting next-season WAR from trailing 3-year WAR, age, plate appearances, wOBA, and Statcast contact quality metrics (xwOBA, exit velocity, barrel%). Validation on 2024 → 2025. Point predictions from a standard XGBoost regressor; 80% prediction intervals from separate quantile models at p10 and p90, giving an honest uncertainty range per player. FA universe filtered to positional gaps from Pillar 2, then ranked by projected WAR surplus over the implied market rate.
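
Quantile models such as the p10 and p90 regressors are typically fit by minimizing the pinball loss, and an 80% interval can be sanity-checked by how often it contains the actual outcome on held-out data. The sketch below shows both ideas in plain Python. The source does not say which objective or library settings were used, so the helper names pinball_loss and interval_coverage are hypothetical and the numbers are made up.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball (quantile) loss at quantile level q in (0, 1)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-predictions cost q per unit; over-predictions cost (1 - q).
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(y_true)

def interval_coverage(y_true, lower, upper):
    """Share of actual values that fall inside [lower, upper]."""
    hits = sum(lo <= y <= hi for y, lo, hi in zip(y_true, lower, upper))
    return hits / len(y_true)

# Made-up next-season WAR outcomes and interval bounds for ten players.
actual = [0.5, 1.2, 2.0, 3.1, 4.0, 0.1, 2.5, 1.8, 5.2, 2.2]
p10 = [0.0, 0.5, 1.0, 1.5, 2.0, 0.4, 1.0, 1.0, 2.5, 1.0]
p90 = [2.0, 2.5, 3.0, 4.0, 5.0, 2.0, 3.5, 3.0, 5.0, 3.5]

coverage = interval_coverage(actual, p10, p90)
loss_p10 = pinball_loss(actual, p10, q=0.10)
loss_p90 = pinball_loss(actual, p90, q=0.90)
```

At q = 0.90 an under-prediction costs nine times as much as an over-prediction of the same size, which is what pushes the fitted p90 model toward the upper end of plausible outcomes. Coverage far from 0.80 on validation data would suggest the intervals are too narrow or too wide.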

05 / Live Dashboard

Explore the full analysis interactively.

All three pillars are published to Tableau Public. The dashboard includes the Bayesian payroll efficiency scatter, draft cohort reach rates, surplus picks, and the FA target shortlist.
