DIAMONDS ON THE LINE: Profits through Investment Gaming
Clay Graham
DePaul University

The boys (and girls) are back in town!

Let’s Look at Why We’re Here
Build an analytical model for investing in a baseball game’s outcome, resulting in:
• Picking a team (or teams)
• Quantifying the level(s) of investment
In order to:
• Maximize the expected value of profits
• Subject to: economic constraints and risk tolerance

“It’s Not Gambling!” Jeff Ma

Sniff and Kick (in academic parlance: research)

Input from Our Crack Research Team

Access to Ultra High Speed Computers (in use at Wharton?)

Did you know?

Percentage Inequities Between Game and Line
[Bar chart, Favored % vs. Winning %]
• Home: favored 70%, winning 54% (overrated)
• Road: favored 30%, winning 46% (undervalued)
Source: http://oddswarehouse.com/

“It’s flat-out scary, BABY.” Dick Vitale
Sources:
(1) http://www.prnewswire.com/news-releases/sports-betting-tops-one-trillion-us-dollars-says-bookmaker-to-the-billionaires-273768381.html, accessed February 11, 2015
(2) ESPN the Magazine, February 16, 2015

Scoring Linked to K/BB
Growing Strike Zone?
[Chart: runs/game vs. K/BB, time line; r² = .80]
Source: http://espn.go.com/mlb/stats/team/_/stat/, accessed 2/8/2015

How it Works: Mapping Path to Profits
[Flow diagram. Node labels: Data; Information Reservoir; What's in a Line; Production Function; Matchups (Batter vs Pitcher, Road vs Home); LINE; Implied P(W); Normalized IP(W); GAME; μ; σ; Park Factor; αβ parameters; Gamma Functions; EVRuns Road; EVRuns Home; P(W); EDGE; Filters: Governing Constraints; Decision & Feedback. Key: DECISION, Road path, Home path, Joint path]

Objective
Profit by Capitalizing on the Market Inequities Between the Game and the Line

What is the Money Line?
Money Line is the Price of an Investment (bet)
• Road -113: Favorite, risk 113 to win 100
• Home +105: Underdog, risk 100 to win 105

Juice, “Vig”, and other Mysteries
Even bet defined: Home -110, Road -110
• Bet 110 to win 100
• Dime line
• House:
  Receives 220 for bets (2 @ 110)
  Pays out 210 to winner (original bet + winnings)
  Keeps 10 as profit: 4.5% (10/220); this is the Vig

Winning Lines 2007-2014 (median: -115) (mean: -105)
[Chart: Favorite wins 60%, Underdog wins 40%]
Source: http://oddswarehouse.com/

Implied Probability of Winning (line)
Normalized Implied P(W)
Given Lines: Road -113, Home +105
Implied probability of winning, calculation:
• Road: 113 / (113 + 100) = 53.05%
• Home: 100 / (105 + 100) = 48.78%
• Total: 101.83%
Normalize (sum of probabilities = 100%):
• Road: 53.05% / 101.83% = 52.1%
• Home: 48.78% / 101.83% = 47.9%

In the Words of the Great Western Philosopher
“It's tough to make predictions, especially about the future.” -Yogi Berra

Baseball’s Pythagorean Theorem
Probability of the Home Team Winning:
P(W_home) = (Runs_home)² / [(Runs_home)² + (Runs_road)²]

Building the Production Function: Runs / Out
Fundamental Elements of Productivity
• Singles
• Doubles
• Triples
• Home runs
• Base on balls
runs/out = f(%1, %2, %3, %HR, %BB)
expected value runs / 9 innings = 27 * runs/out

Production Function: Runs / Out (note: forced zero intercept)
Multiple Regression for Runs/Out

Summary
Multiple R           0.744
R-Square             0.553
Adjusted R-Square    0.553
StErr of Estimate    0.054

ANOVA Table
              Degrees of Freedom   Sum of Squares   Mean of Squares   F-Ratio    p-Value
Explained     4                    26.640           6.660             2253.897   < 0.0001
Unexplained   7286                 21.529           0.003

Regression Table
            Coefficient   Standard Error   t-Value   p-Value    95% CI Lower   95% CI Upper
Constant    0             NA               NA        NA         NA             NA
%1          0.3928        0.0094           41.6966   < 0.0001   0.3743         0.4113
%2+3        0.9053        0.0221           40.9909   < 0.0001   0.8620         0.9486
%HR         1.7361        0.0307           56.5698   < 0.0001   1.6760         1.7963
%BB         0.1716        0.0155           11.0396   < 0.0001   0.1411         0.2021
Source: www.retrosheet.org/gamelogs/, for years 2011, 2012 and 2013

Production Distribution: Runs/Out
[Chart: distribution of runs/out]
Source: www.retrosheet.org/gamelogs/, for years 2011, 2012 and 2013

Runs per 9-Inning Game Follow a Gamma Distribution
[Chart: distribution of runs per 9-inning game with gamma fit]
Source: www.retrosheet.org/gamelogs/, for years 2011, 2012 and 2013

The Gamma Distribution Also Has Mathematically Desirable Characteristics
The gamma distribution’s two parameters (domain lower boundary = 0):
• α = shape, β = scale
• 1st moment (mean): μ = α * β
• 2nd moment (variance): σ² = β² * α
Solving for α and β in terms of μ and σ²:
• α = μ² / σ²
• β = σ² / μ
Used to calculate each team’s expected run production.

Matchups
Dynamic Club Statistics: Continuously Calculate Key Variables
[Diagram labels: Time, Variance, Ranking]

Calculating Game’s Matchup Variances
(σ²_Road Runs Scored + σ²_Home Runs Allowed) = σ²_Road vs Home Scoring
(σ²_Road Runs Allowed + σ²_Home Runs Scored) = σ²_Home vs Road Scoring

Scoring Prediction Tabulation
Inputs to Gamma Distribution, for each team:
• Matchups: Batter-Pitcher (micro) → μ
• Road-Home performance (macro) → σ²

Imposing Matchup Constraints (filters)
Separate the Wheat from the Chaff
[Diagram labels: Time; OB+S; K/BB; PA; Effective Outcomes]

Filters: Governing Constraints
Variable        Road      Home
Net Rank        <-10      >9
% Δ EVR         <-25%     >15%
% Δ K/BB        <-50%     >75%
% Δ OB+S        <25%      >10%
NIP(W)          >47%      >45%
Edge min        >0%       >0
Edge max (1)    <21%      <17%
PA rd           >71       >10
PA hm           >11       >52
Database age ≈ 65 days
Note: (1) Too good to be true

“Rankmeister” and Time Period Impact
Filters: Governing Constraints
Rank Advantage Varies with Database Tabulation Period
Tabulation periods shown: 64 days, 108 days
• Rank Advantage Home = 13: (2 - 4 = -2) + (26 - 11 = 15) = 13
• Rank Advantage Road = -6: (1 - 8 = -7) + (11 - 10 = 1) = -6

Multiple Selection Criteria
• From regression equation
• Batter-pitcher matchup calculations

Road Team Products Should be Adjusted for Park Characteristics
Park Factor (1)
Source: http://espn.go.com/mlb/stats/parkfactor
Note: (1) Inverse Fibonacci weights

Monte Carlo Simulation: Winning Margin in Runs
Density function: Home Team Winning Margin
Δ runs = Γ(α_hm, β_hm)_home - Γ(α_rd, β_rd)_road

Monte Carlo: Run Differential Between Teams
[Chart: simulated run differential]
P(W_home) = 57%

Probability of Winning Through Monte Carlo Simulation
Density functions are incorporated rather than point estimates (averages):
P(win)_home = Γ(α_hm, β_hm)_home² / (Γ(α_hm, β_hm)_home² + Γ(α_rd, β_rd)_road²)

Monte Carlo: Each Team’s Density Function
[Chart: each team's simulated run density]
P(W_home) = 58%

Monte Carlo Simulation: Probability of Winning
Monte Carlo vs. Monolithic Results
• P(W_home) = f(run differential) = 56.5% (≈ 57%)
• P(W_home) = f(density function) = 57.9% (≈ 58%)
• Monolithic: P(W_home) = Γ(α, β)_home² / (Γ(α, β)_home² + Γ(α, β)_road²) = 63%

It All Leads to Gaining an EDGE
Just what is the EDGE? Simply stated:
EDGE = P(Win) (game) - Normalized Implied P(Win) (line)

Basic Investment Function: % Bankroll
(expected winning percentage ≈ 55%-58%)
Generalized “S” Curve (used to fit variable EDGE function):
% Bankroll (staking) = Ab + (At - Ab) / (1 + exp(-(EDGE - X0) / W))
Where:
• Ab = minimum proportion of bankroll (base)
• At = maximum proportion of bankroll (top)
• W = transition slope (width)
• X0 = shifting factor
• EDGE = P(W) - NIP(W)

Dynamic Investment Function Changes with Probability of Winning
[Chart: staking curves by probability of winning; annotation: Too good to be true]

Does it Work?

Bankroll More Than Doubled in Just Two Months
[Chart: bankroll over time]
Source: tabulated over 2014 season

Original Bankroll Up Over Tenfold
[Chart: bankroll over time; annotation: previous slide time period]
Source: tabulated over 2014 season

Feeling So Good! Post Season
[Chart: post-season bankroll]
Source: tabulated over 2014 season

Summary of Some Operational Results
• Winning percentage: 68%
• Average daily return on at-risk capital: 35%
• Overall return on original bankroll: 1,425%
• Average bets per day: 1.91
• Average bet size: 3.5% of available bankroll
• Percent of games invested: 23%
• EDGE-based investment results in a doubling of profits

Modeling Contributions
• Runs/out enhances accuracy of modeling
• Game-day batter vs. starting pitcher matchups effectively feed the production function
• Road-Home variance matchup generates scoring σ²
• Dynamic time variable drives the algorithm
• Monte Carlo used to more effectively determine probability of winning
• Genetic programming and filters are a powerful profit optimizer
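
The money-line arithmetic from the "Vig" and "Normalized Implied P(W)" slides can be checked with a few lines of Python. This is only a sketch of the arithmetic shown on those slides. The function names are hypothetical and do not come from the presentation.

```python
from typing import Tuple


def implied_probability(line: int) -> float:
    """Win probability implied by an American money line (vig still included)."""
    if line < 0:
        # Favorite, e.g. -113: risk 113 to win 100
        return -line / (-line + 100)
    # Underdog, e.g. +105: risk 100 to win 105
    return 100 / (line + 100)


def normalized_implied(road_line: int, home_line: int) -> Tuple[float, float]:
    """Rescale both implied probabilities so that they sum to 100%."""
    road = implied_probability(road_line)
    home = implied_probability(home_line)
    total = road + home  # exceeds 1.0 because of the vig
    return road / total, home / total


def even_line_vig(risk: float, win: float = 100.0) -> float:
    """House share when both sides risk `risk` to win `win` (e.g. -110 / -110)."""
    received = 2 * risk    # 220 in the slide example
    paid_out = risk + win  # 210: the winner's stake plus winnings
    return (received - paid_out) / received


if __name__ == "__main__":
    road, home = normalized_implied(-113, 105)
    print(f"Road NIP(W) = {road:.1%}, Home NIP(W) = {home:.1%}")
    print(f"Vig on a -110 / -110 line = {even_line_vig(110):.1%}")
```

Run as a script, this reproduces the slide figures: 52.1% for the road team, 47.9% for the home team, and a 4.5% vig.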
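
The gamma parameter formulas (α = μ² / σ², β = σ² / μ) and the Monte Carlo comparison might be sketched as follows. Everything beyond those formulas is an assumption: the team means and variances are made up, the function names are hypothetical, and averaging the Pythagorean ratio over simulated draws is only one reading of the "density function" slide. The standard library's `random.gammavariate` stands in for whatever simulation tool the presentation used.

```python
import random
from typing import Tuple


def gamma_params(mean: float, variance: float) -> Tuple[float, float]:
    """Method of moments: alpha = mu^2 / sigma^2 (shape), beta = sigma^2 / mu (scale)."""
    return mean ** 2 / variance, variance / mean


def pythagorean(home_runs: float, road_runs: float) -> float:
    """Baseball's Pythagorean formula with exponent 2."""
    return home_runs ** 2 / (home_runs ** 2 + road_runs ** 2)


def simulate(home: Tuple[float, float], road: Tuple[float, float],
             n_games: int = 50_000, seed: int = 2015) -> Tuple[float, float]:
    """Return (share of draws with a positive home margin, mean Pythagorean P(W_home))."""
    rng = random.Random(seed)
    a_hm, b_hm = gamma_params(*home)
    a_rd, b_rd = gamma_params(*road)
    home_wins = 0
    pyth_total = 0.0
    for _ in range(n_games):
        runs_hm = rng.gammavariate(a_hm, b_hm)  # stdlib signature is (shape, scale)
        runs_rd = rng.gammavariate(a_rd, b_rd)
        if runs_hm > runs_rd:                   # delta runs > 0
            home_wins += 1
        pyth_total += pythagorean(runs_hm, runs_rd)
    return home_wins / n_games, pyth_total / n_games


if __name__ == "__main__":
    # Hypothetical (mean, variance) of runs per 9 innings; not taken from the slides.
    home_team, road_team = (4.6, 9.5), (4.1, 9.0)
    by_margin, by_density = simulate(home_team, road_team)
    point_estimate = pythagorean(home_team[0], road_team[0])
    print(f"run differential: {by_margin:.3f}")
    print(f"density function: {by_density:.3f}")
    print(f"point estimate:   {point_estimate:.3f}")
```

The last printed value applies the Pythagorean formula to the two means only. That is presumably what the "monolithic" result refers to, although the slide does not spell it out.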
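
The EDGE definition and the generalized "S" curve staking formula translate directly into code. The curve parameters below are hypothetical, since the slides do not list fitted values for Ab, At, X0, or W. Pairing a 0.479 normalized implied probability with a range of P(W) values is likewise only for illustration.

```python
import math


def edge(p_win: float, normalized_implied_p: float) -> float:
    """EDGE = P(W) - NIP(W)."""
    return p_win - normalized_implied_p


def stake_fraction(edge_value: float, a_b: float, a_t: float,
                   x0: float, w: float) -> float:
    """Generalized S curve: Ab + (At - Ab) / (1 + exp(-(EDGE - X0) / W))."""
    return a_b + (a_t - a_b) / (1.0 + math.exp(-(edge_value - x0) / w))


if __name__ == "__main__":
    # Hypothetical curve parameters; the slides do not give fitted values.
    params = dict(a_b=0.01, a_t=0.06, x0=0.08, w=0.02)
    for p_win in (0.50, 0.55, 0.57, 0.60):
        e = edge(p_win, 0.479)  # 0.479 = normalized implied P(W) for Home +105
        stake = stake_fraction(e, **params)
        print(f"P(W)={p_win:.2f}  EDGE={e:+.3f}  stake={stake:.2%}")
```

The stake stays between Ab and At and rises fastest near EDGE = X0, where it equals the midpoint of the two. A production version would presumably also apply the "Edge max" filter from the constraints table before staking.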
“Go where the numbers take you!”™

Only 37 Days until the Season Opener!
Our Year!

Time for Your Questions!

Epilog: What Happened Since SSAC15?
• Nate Silver challenge
• Assault of the hedge funds