Track: High School STEM Competition
Abstract
This STEM paper studies the time series data file Raleigh Temps.jmp from the JMP sample library, which contains monthly maximum temperatures, measured in degrees Fahrenheit, from 1980 to 1990. The objective is to forecast the Raleigh temperature for 1990 to 2010. Among the four STEM components, the Science is climatology, the study of the Earth's weather patterns; the Technology is used to predict the Raleigh monthly maximum temperature for the next twenty years; the Engineering focuses on comparing different statistical models on the time series data; and the Mathematics applies statistical tools. Most traditional and modern data mining platforms can visualize the data distribution, normal quantile plot, normal mixture, and outliers very well; assess process capability and process stability; build a multiple regression model and identify month as the main factor; detect clusters or principal components through eigenanalysis; and use a neural network or partition trees to build a transfer function profiler. The major finding from these non-time-series platforms is that there is month-to-month cyclic behavior within each year, but no year-to-year trend is detected. However, these platforms have several limitations: they cannot separate the seasonal component (the month-to-month cycle) from the trend component (the year-to-year change), cannot determine the relative strength of the seasonal and trend components, cannot determine the optimal smoothing setting if the curve is highly modulated, and cannot forecast future points. The JMP Time Series and Forecast platform was therefore applied to the same data file. Time series decomposition was used to separate the seasonal component from the trend component, and the smoothing technique reduced the forecasting error. A seasonal lag was detected and confirmed by the autocorrelation function (ACF) plot and the variogram plot.
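To make the decomposition and seasonal-lag ideas concrete, the following is a minimal Python sketch, not the JMP procedure used in this paper. It assumes a simple additive model (linear trend plus a per-month seasonal offset) and runs on a synthetic ten-year monthly series; the function names `autocorrelation` and `decompose` and the synthetic `temps` values are illustrative assumptions, not the Raleigh data.

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    numer = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return numer / denom


def decompose(series, period=12):
    """Additive split into a linear trend, a seasonal offset, and a residual."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    # Trend: ordinary least-squares straight line through the series.
    slope = sum(
        (t - t_mean) * (y - y_mean) for t, y in enumerate(series)
    ) / sum((t - t_mean) ** 2 for t in range(n))
    trend = [y_mean + slope * (t - t_mean) for t in range(n)]
    detrended = [y - tr for y, tr in zip(series, trend)]
    # Seasonal component: average detrended value for each calendar month.
    month_means = [
        sum(detrended[m::period]) / len(detrended[m::period])
        for m in range(period)
    ]
    seasonal = [month_means[t % period] for t in range(n)]
    residual = [d - s for d, s in zip(detrended, seasonal)]
    return trend, seasonal, residual


# Synthetic stand-in for ten years of monthly maxima (NOT the Raleigh data):
# a 12-month cosine cycle peaking in July plus a very slight upward drift.
temps = [
    70 + 18 * math.cos(2 * math.pi * (t % 12 - 6) / 12) + 0.01 * t
    for t in range(120)
]

trend, seasonal, residual = decompose(temps)
print("ACF at lag 6: ", round(autocorrelation(temps, 6), 3))
print("ACF at lag 12:", round(autocorrelation(temps, 12), 3))
```

On a series like this, a strongly positive autocorrelation at lag 12 together with a strongly negative one at lag 6 is the signature of a yearly seasonal cycle, which is the pattern an ACF plot makes visible.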
The Time Series Forecast platform can help predict the Raleigh temperature for 1990 to 2010 based on the 1980 to 1990 data. The authors are also continuing this time series STEM project in the following areas: decomposition and smoothing statistics, non-seasonal and seasonal ARIMA models, and forecasting and prediction interval statistics. Time series analysis is not only widely used in financial forecasting but is also a powerful tool for predicting uncertain future values from any time series data.
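As a rough illustration of forecasting with a prediction interval, the sketch below uses a seasonal-mean forecast, a much simpler stand-in for the ARIMA models named above. It assumes no year-to-year trend and approximately normal residuals; the function name `seasonal_mean_forecast`, the interval multiplier `z`, and the synthetic `history` series are all hypothetical and are not taken from the Raleigh data or from JMP.

```python
import math
import random
import statistics


def seasonal_mean_forecast(history, horizon, period=12, z=1.96):
    """Forecast each future month as the historical mean of that calendar month.

    Returns a list of (point, lower, upper) tuples. The interval is a rough
    normal approximation built from the in-sample residual spread; it does
    not widen with the forecast horizon as a full model's interval would.
    """
    month_means = [statistics.mean(history[m::period]) for m in range(period)]
    residuals = [y - month_means[t % period] for t, y in enumerate(history)]
    spread = statistics.pstdev(residuals)
    start = len(history)
    forecast = []
    for h in range(horizon):
        point = month_means[(start + h) % period]
        forecast.append((point, point - z * spread, point + z * spread))
    return forecast


# Synthetic ten-year monthly history (NOT the Raleigh data): a yearly cosine
# cycle peaking in July plus random noise with a fixed seed.
random.seed(0)
history = [
    70 + 18 * math.cos(2 * math.pi * (t % 12 - 6) / 12) + random.gauss(0, 2)
    for t in range(120)
]

# Twenty years ahead, i.e. 240 monthly forecasts.
forecast = seasonal_mean_forecast(history, horizon=240)
for point, lower, upper in forecast[:3]:
    print(f"{point:.1f} F  (approx. 95% interval {lower:.1f} to {upper:.1f})")
```

Because this stand-in repeats the same twelve monthly means every year, it can only reproduce the seasonal cycle; capturing a trend or horizon-dependent uncertainty requires a fuller model such as seasonal ARIMA.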