Track: High School and Middle School STEM Competition
Abstract
This paper builds an empirical model to predict NBA team winning percentage from team offensive, defensive, and differential statistics, using historical data collected over 2003-2016. The raw data were standardized through Z transformation to remove the bias effects of differing means and large variances. A multiple linear regression model, refined by stepwise regression, was derived to predict team winning record. After trimming the insignificant regression terms, the derived model predicts team winning percentage with R-Square > 0.95. Multicollinearity concerns were addressed by checking for terms with a Variance Inflation Factor (VIF) > 10, and redundant terms were removed to reduce over-fitting risk. The regression model identified 3-point percentage, turnovers, and points per game as most critical to team offensive efficiency, an observation consistent with modern basketball. On defense, defending the rebound and limiting the opponent’s field goal percentage are most critical. The Warriors’ 2015-2016 team record was identified as an extreme outlier, since their winning formula and team statistics differ significantly from those of the remaining 29 teams. Second-order and interaction terms were added to enhance prediction accuracy. These nonlinear terms indicate the complexity of basketball team behavior. Defense Field Goal% * Defense Point per Game was identified as the most significant interaction term, which may reflect the adage that the best defense is the start of a good offense. The model built on 2003-2016 data was further validated against the new 2016-2017 season. The model can guide NBA coaches and general managers on how to draft, recruit, trade, or sign particular players to build a championship team based on the winning-percentage formula. The methodology can also be applied to the NBA playoffs and to other major professional sports such as baseball, football, hockey, and soccer.
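As an illustration of the workflow summarized above, the following minimal Python sketch shows Z standardization, a Variance Inflation Factor screen with a threshold of 10, and an ordinary least-squares fit reporting R-Square. It is not the paper's actual code or data: the column names, the synthetic numbers, and the choice to drop the single highest-VIF term one at a time are all assumptions made for the example.

```python
import numpy as np

def zscore(X):
    """Standardize each column to mean 0 and sample standard deviation 1."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def fit_ols(X, y):
    """Least-squares fit with an intercept; returns (coefficients, R-squared)."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return beta, 1.0 - ss_res / ss_tot

def vif(X):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    column j on all of the other columns."""
    X = np.asarray(X, dtype=float)
    values = []
    for j in range(X.shape[1]):
        _, r2 = fit_ols(np.delete(X, j, axis=1), X[:, j])
        values.append(1.0 / (1.0 - r2))
    return np.array(values)

def drop_high_vif(X, names, threshold=10.0):
    """Assumed pruning rule: repeatedly drop the term with the largest VIF
    until every remaining term has VIF <= threshold."""
    X = np.asarray(X, dtype=float)
    names = list(names)
    while X.shape[1] > 1:
        v = vif(X)
        worst = int(np.argmax(v))
        if v[worst] <= threshold:
            break
        X = np.delete(X, worst, axis=1)
        names.pop(worst)
    return X, names

# Synthetic placeholder data, NOT real NBA statistics.
rng = np.random.default_rng(0)
n = 200
three_pt_pct = rng.normal(35.0, 2.0, n)
turnovers = rng.normal(14.0, 1.5, n)
ppg = rng.normal(100.0, 5.0, n)
ppg_copy = ppg + rng.normal(0.0, 0.1, n)  # deliberately redundant term

names = ["three_pt_pct", "turnovers", "ppg", "ppg_copy"]
X = zscore(np.column_stack([three_pt_pct, turnovers, ppg, ppg_copy]))

# Hypothetical response built from the standardized predictors plus noise.
win_pct = (0.5 + 0.05 * X[:, 0] - 0.03 * X[:, 1] + 0.08 * X[:, 2]
           + rng.normal(0.0, 0.01, n))

X_kept, kept_names = drop_high_vif(X, names, threshold=10.0)
beta, r2 = fit_ols(X_kept, win_pct)

print("kept terms:", kept_names)
print("R-squared:", round(r2, 3))
```

Because the predictors are standardized, the fitted coefficients are directly comparable in magnitude, which is what allows the relative importance of terms to be ranked in this kind of model.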