Track: High School STEM Competition
Abstract
This STEM project uses regression modeling to predict housing prices from real estate data for King County (Seattle) in 2014-2015. Clustering and partition algorithms were used to screen the factors and identified the top two parameters: (1) the square footage of the house's living space and (2) the house grade level. Multiple linear and polynomial regression models were fitted to search for the optimal predictive model of housing price. Based on the residual analysis, the extreme outlier was removed to improve the model's goodness of fit. The authors wrote both Python and JMP JSL scripts to benchmark the two coding languages. R-square and residual analysis were used as the model selection criteria. The results indicate that the polynomial regression model outperformed the linear regression model on both R-square and residual analysis. The Python script is similar to the JMP script for regression programming, and the JMP Fit Model regression platform can simplify the coding process for JMP users who do not program. Through this STEM project, the authors also learned regression statistics, which could in turn facilitate learning the same statistical regression functions in Python. Learning statistics, the JMP platform, JMP JSL, and Python together through a fun STEM project is an effective learning methodology.
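As a rough illustration of the linear versus polynomial comparison described in the abstract, the following is a minimal Python sketch and not the authors' script. It assumes NumPy only, and it uses synthetic data in place of the King County records. The variable names `sqft_living`, `grade`, and `price`, the helper `fit_r2`, and the choice of a second-degree polynomial are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the King County dataset); all values are invented.
n = 500
sqft_living = rng.uniform(500, 5000, n)        # hypothetical living-space area
grade = rng.integers(4, 13, n).astype(float)   # hypothetical grade level, 4 to 12
price = (50_000 + 40 * sqft_living + 0.03 * sqft_living**2
         + 15_000 * grade + rng.normal(0, 40_000, n))

# Rescale area to thousands of square feet to keep the fit well conditioned.
sqft_k = sqft_living / 1000.0


def fit_r2(X, y):
    """Ordinary least squares with an intercept; returns (R-square, residuals)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coef
    ss_res = np.sum(residuals**2)
    ss_tot = np.sum((y - y.mean())**2)
    return 1.0 - ss_res / ss_tot, residuals


# Linear model: price ~ sqft + grade
X_linear = np.column_stack([sqft_k, grade])

# Second-degree polynomial model: adds squared and interaction terms
X_poly = np.column_stack([sqft_k, grade, sqft_k**2, grade**2, sqft_k * grade])

r2_linear, res_linear = fit_r2(X_linear, price)
r2_poly, res_poly = fit_r2(X_poly, price)

print(f"linear R-square:     {r2_linear:.4f}")
print(f"polynomial R-square: {r2_poly:.4f}")
print(f"largest |residual| (linear):     {np.abs(res_linear).max():.0f}")
print(f"largest |residual| (polynomial): {np.abs(res_poly).max():.0f}")
```

Because the polynomial model contains every term of the linear model, its R-square on the fitting data can never be lower, so a higher R-square alone does not show that it predicts better. This is one reason to pair R-square with a residual check, as the abstract does.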