Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.
SUMMARY: The purpose of this project is to construct a time series prediction model and document the end-to-end steps using a template. The Minimum Daily Temperatures dataset poses a time series forecasting problem in which we try to predict future values from past observations.
INTRODUCTION: The problem is to forecast the minimum weekly temperatures in Melbourne, Australia. The dataset describes a time series of temperatures over ten years (1981-1990) in the city of Melbourne, Australia, and contains 3650 daily observations. The source of the data is credited as the Australian Bureau of Meteorology. We used the first 70% of the observations for training and testing various models, while holding back the remaining 30% for validating the final model.
For this iteration of the modeling, we aggregated the daily temperature measurements into weekly values before fitting any models.
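The weekly aggregation and the 70/30 chronological split described above might look roughly like the sketch below. It runs on a synthetic stand-in series rather than the real data, and it assumes the weekly value is the mean of the daily readings; the text does not say which aggregate was actually used. The names `daily`, `weekly`, and `split_series` are illustrative only.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real data: 3650 daily values starting in 1981.
# The sine wave and noise are illustrative only, not the actual readings.
rng = np.random.default_rng(0)
days = pd.date_range("1981-01-01", periods=3650, freq="D")
daily = pd.Series(
    11.0
    + 4.0 * np.sin(2 * np.pi * np.arange(3650) / 365.25)
    + rng.normal(0.0, 2.0, 3650),
    index=days,
    name="min_temp",
)

# Assumption: each weekly number is the mean of that week's daily values.
weekly = daily.resample("W").mean()


def split_series(series, train_fraction=0.7):
    """Split chronologically: the first part for modeling, the rest held back."""
    cut = int(len(series) * train_fraction)
    return series.iloc[:cut], series.iloc[cut:]


modeling, validation = split_series(weekly)
print(len(weekly), len(modeling), len(validation))
```

The split is by position rather than by random sampling, so every held-back observation comes after every observation used for modeling.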
ANALYSIS: The baseline (persistence) prediction for the dataset resulted in an RMSE of 1.940. After a grid search over ARIMA parameters, the final model used a non-seasonal order of (1, 0, 2) and a seasonal order of (0, 1, 1, 52). The chosen model achieved an RMSE of 1.611 on the validation data, which was better than the baseline, as expected.
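As a rough illustration of the two figures compared above, the sketch below computes a persistence RMSE (each prediction is simply the previous observation) and shows one way the reported orders could be passed to `SARIMAX` in statsmodels. The helper names are hypothetical, the grid search itself is not shown, and the single multi-step forecast used here is an assumption that may differ from how the original evaluation was run.

```python
import numpy as np


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))


def persistence_rmse(series):
    """Persistence baseline: predict each value with the one before it."""
    values = np.asarray(series, dtype=float)
    return rmse(values[1:], values[:-1])


# Orders reported in the analysis above.
ORDER = (1, 0, 2)
SEASONAL_ORDER = (0, 1, 1, 52)


def sarima_validation_rmse(train, validation):
    """Fit on `train`, forecast len(validation) steps, and score with RMSE.

    Assumption: one multi-step forecast over the whole validation period.
    statsmodels is imported here so the baseline helpers run without it.
    """
    from statsmodels.tsa.statespace.sarimax import SARIMAX

    model = SARIMAX(train, order=ORDER, seasonal_order=SEASONAL_ORDER)
    fitted = model.fit(disp=False)
    forecast = fitted.forecast(steps=len(validation))
    return rmse(validation, forecast)


# Toy series, for illustration only.
toy = [1.0, 2.0, 4.0, 7.0]
print(round(persistence_rmse(toy), 3))
```

With a seasonal period of 52, the seasonal differencing consumes a full year of weekly data, so the training series needs to span several years for the fit to be meaningful.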
CONCLUSION: For this dataset, the chosen ARIMA model achieved a satisfactory result and should be considered for further modeling.
Dataset Used: Minimum Daily Temperatures
Dataset ML Model: Time series forecast with numerical attributes
Dataset Reference: https://machinelearningmastery.com/time-series-datasets-for-machine-learning/
The HTML-formatted report can be found here on GitHub.