Tag: ARIMA

Univariate Time Series Modeling Template Using ARIMA Version 8


As I work on practicing and solving machine learning (ML) problems, I find myself repeating the same set of steps and activities.

Thanks to Dr. Jason Brownlee’s suggestions on creating a machine learning template, I have pulled together a project template that supports time series analysis using ARIMA modeling in Python.

Version 8 of the time series template contains minor adjustments and corrections to the previous version of the template.

You will find the Python time series template on the Analytics Project Templates page.

Time Series Model for Live Births in the USA Using Python and ARIMA

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: This project aims to construct a time series prediction model and document the end-to-end steps using a template. The Live Births in the United States dataset is a time series situation where we attempt to forecast future outcomes based on past data points.

INTRODUCTION: The United Nations Statistics Division collects, compiles, and disseminates official demographic and social statistics on various topics. The Demographic Yearbook provides annual statistics on population size and composition, births, deaths, and marriage and divorce rates. The problem is to forecast the monthly number of live births in the United States. The dataset describes a monthly time series of live births over 47 years (1969-2015), for a total of 564 observations. We used the first 90% of the observations for training various models while holding back the remaining observations for validating the final model.
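The 90/10 split described above has to preserve time order, so the validation block is simply the tail of the series. Here is a minimal sketch, assuming the observations are already sorted chronologically in a Python list. The function name `chronological_split` and the truncating cut-off are illustrative choices, not taken from the template.

```python
def chronological_split(series, train_fraction=0.9):
    """Split an ordered series into a leading training block and a trailing validation block."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    cut = int(len(series) * train_fraction)  # index of the first validation point
    return series[:cut], series[cut:]


# Placeholder values standing in for the 564 monthly observations.
observations = list(range(564))
train, validation = chronological_split(observations, train_fraction=0.9)
print(len(train), len(validation))
```

With 564 observations and a truncated cut-off, this yields 507 training points and 57 validation points. The template's own rounding may differ by one observation.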

ANALYSIS: The baseline prediction (or persistence) for the dataset resulted in an RMSE of 16735. After performing a grid search for the best ARIMA parameters, the final ARIMA non-seasonal order was (3, 1, 4) with the seasonal order (2, 0, 2, 12). The chosen model scored an RMSE of 7177 on the validation data, which was better than the baseline model, as expected.
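A persistence baseline predicts each point with the most recently observed value. The sketch below assumes one-step walk-forward evaluation, where each true value is revealed before the next forecast. The source does not spell out the evaluation loop, so treat that as an assumption. The function names and the toy numbers are illustrative.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def persistence_rmse(train, validation):
    """Walk forward through validation, predicting each point with the last observed value."""
    history = list(train)
    predictions = []
    for observed in validation:
        predictions.append(history[-1])  # forecast = most recent known value
        history.append(observed)         # reveal the true value before the next step
    return rmse(validation, predictions)


# Toy numbers for illustration only, not the live-births data.
toy_train = [10, 12, 11, 13]
toy_validation = [14, 13, 15]
print(round(persistence_rmse(toy_train, toy_validation), 3))
```

Any candidate model is then judged by whether its validation RMSE falls below this figure.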

CONCLUSION: For this dataset, the chosen ARIMA model achieved a satisfactory result and should be considered for further modeling.

Dataset Used: Live births by month of birth | Demographic Statistics Database | United Nations Statistics Division

Dataset ML Model: Time series forecast with numerical attribute

Dataset Reference: https://data.un.org/Data.aspx?d=POP&f=tableCode:55

The HTML formatted report can be found here on GitHub.

Time Series Model for Chicago Fed Hiring Expectations Survey Using Python and ARIMA

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: This project aims to construct a time series prediction model and document the end-to-end steps using a template. The Chicago Fed Hiring Expectations Survey dataset is a time series situation where we attempt to forecast future outcomes based on past data points.

INTRODUCTION: The problem is to forecast the monthly value of the Chicago Federal Reserve's business conditions survey for hiring expectations in the next 12 months. The dataset describes a time series of survey values (between plus 40 and minus 40) over seven years (January 2013 to September 2020), for a total of 93 observations. We used the first 80% of the observations for training various models while holding back the remaining observations for validating the final model.

ANALYSIS: The baseline prediction (or persistence) for the dataset resulted in an RMSE of 12.132. After performing a grid search for the best ARIMA parameters, the final ARIMA non-seasonal order was (1, 0, 1). The chosen model scored an RMSE of 8.694 on the validation data, which was better than the baseline model, as expected.
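The grid search over ARIMA orders can be sketched as an exhaustive loop over (p, d, q) combinations that keeps the order with the lowest validation RMSE. The sketch below is library-agnostic: `evaluate` is a hypothetical caller-supplied function that fits a model for one order and returns its RMSE. In practice it might wrap an ARIMA fit from a library such as statsmodels, which is an assumption because the source does not name the library. The toy scorer exists only so the example runs on its own.

```python
import itertools
import math


def grid_search_orders(p_values, d_values, q_values, evaluate):
    """Return (best_order, best_score) over all (p, d, q) combinations.

    `evaluate` maps an order tuple to a validation RMSE; orders whose
    evaluation raises an exception are skipped.
    """
    best_order, best_score = None, math.inf
    for order in itertools.product(p_values, d_values, q_values):
        try:
            score = evaluate(order)
        except Exception:
            continue  # some orders fail to converge or cannot be estimated
        if score < best_score:
            best_order, best_score = order, score
    return best_order, best_score


# Hypothetical stand-in scorer: a real one would fit an ARIMA model for `order`
# on the training data and return its RMSE on the validation data.
def toy_evaluate(order):
    p, d, q = order
    if d == 2:
        raise ValueError("pretend this order failed to fit")
    return (p - 1) ** 2 + d ** 2 + (q - 1) ** 2 + 1.0


best = grid_search_orders(range(0, 3), range(0, 3), range(0, 3), toy_evaluate)
print(best)
```

Skipping failed orders rather than aborting keeps one non-convergent combination from ending the whole search.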

CONCLUSION: For this dataset, the chosen ARIMA model achieved a satisfactory result and should be considered for further modeling.

Dataset Used: Chicago Fed Survey of Business Conditions: Hiring Expectations in the next 12 Months, January 2013 to September 2020

Dataset ML Model: Time series forecast with numerical attribute

Dataset Reference: Federal Reserve Bank of Chicago, Chicago Fed Survey of Business Conditions: Hiring Expectations in the next 12 Months [CFSBCHIRINGEXP], retrieved from FRED, Federal Reserve Bank of St. Louis; https://fred.stlouisfed.org/series/CFSBCHIRINGEXP, November 4, 2020.

The HTML formatted report can be found here on GitHub.

Time Series Model for University of Michigan Inflation Expectation Using Python and ARIMA

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: This project aims to construct a time series prediction model and document the end-to-end steps using a template. The Inflation Expectation dataset from the University of Michigan is a time series situation where we are trying to forecast future outcomes based on past data points.

INTRODUCTION: The problem is to forecast the monthly median expected price change over the next 12 months, based on consumer surveys. The dataset describes a time series of percentages over 42 years (1978-2020), for a total of 512 observations. We used the first 80% of the observations for training various models while holding back the remaining observations for validating the final model.

ANALYSIS: The baseline prediction (or persistence) for the dataset resulted in an RMSE of 0.221. After performing a grid search for the best ARIMA parameters, the final ARIMA non-seasonal order was (4, 1, 2). The chosen model scored an RMSE of 0.206 on the validation data, which was better than the baseline model, as expected.
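The middle term of the order (4, 1, 2) is the differencing degree d = 1, meaning the autoregressive and moving-average terms are fitted to month-over-month changes rather than to the levels themselves. A minimal sketch of first-order differencing and its inverse follows. The helper names and the toy values are illustrative.

```python
def difference(series, lag=1):
    """Return the lag-differenced series: series[t] - series[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def undifference(first_values, diffs, lag=1):
    """Rebuild the original levels from the first `lag` values and the differences."""
    levels = list(first_values)
    for change in diffs:
        levels.append(levels[-lag] + change)
    return levels


# Toy percentages for illustration only, not the MICH series.
levels = [3.0, 3.25, 3.0, 3.5]
changes = difference(levels)
restored = undifference(levels[:1], changes)
print(changes)
print(restored)
```

Forecasts produced on the differenced scale are converted back to levels in the same cumulative way.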

CONCLUSION: For this dataset, the chosen ARIMA model achieved a satisfactory result and should be considered for further modeling.

Dataset Used: University of Michigan: Inflation Expectation

Dataset ML Model: Time series forecast with numerical attribute

Dataset Reference: University of Michigan, University of Michigan: Inflation Expectation [MICH], retrieved from FRED, Federal Reserve Bank of St. Louis; https://fred.stlouisfed.org/series/MICH, October 24, 2020.

The HTML formatted report can be found here on GitHub.

Time Series Model for Birmingham Parking Occupancy Using Python and ARIMA Part 2

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: This project aims to construct a time series prediction model and document the end-to-end steps using a template. The Birmingham Parking Occupancy dataset is a time series situation where we are trying to forecast future outcomes based on past data points.

INTRODUCTION: The problem is to forecast the hourly parking occupancy for a parking facility in Birmingham. The dataset describes a time series of parking occupancy over three months between October 2016 and December 2016, for a total of 1834 hourly observations. We used the first 90% of the observations for training various models while holding back the remaining observations for validating the final model.

In the Part 1 iteration, we trained and validated an ARIMA model using just one facility, BHMBCCMKT01, within the dataset.

In this Part 2 iteration, we train and validate an ARIMA model for each of the facilities within the dataset.
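Fitting one model per facility starts with separating the combined data into one time-ordered series per facility code. The sketch below assumes each record is a (facility, timestamp, occupancy) tuple. That layout is an assumption for illustration, not the dataset's actual column layout, and the readings are made up.

```python
from collections import defaultdict


def series_by_facility(records):
    """Group (facility, timestamp, occupancy) records into one time-ordered series per facility."""
    grouped = defaultdict(list)
    for facility, timestamp, occupancy in records:
        grouped[facility].append((timestamp, occupancy))
    # Sort each facility's readings by timestamp and keep only the occupancy values.
    return {
        facility: [value for _, value in sorted(readings)]
        for facility, readings in grouped.items()
    }


# Made-up readings; the timestamps are ISO strings, so they sort chronologically.
records = [
    ("BHMBCCMKT01", "2016-10-04T09:00", 120),
    ("BHMBCCPST01", "2016-10-04T08:00", 40),
    ("BHMBCCMKT01", "2016-10-04T08:00", 61),
    ("BHMBCCPST01", "2016-10-04T09:00", 75),
]
per_facility = series_by_facility(records)
for facility, series in per_facility.items():
    cut = int(len(series) * 0.9)  # same 90% chronological split, applied per facility
    train_part, validation_part = series[:cut], series[cut:]
    print(facility, len(train_part), len(validation_part))
```

Each per-facility series can then go through the same baseline, grid search, and validation steps independently.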

ANALYSIS: The baseline prediction (or persistence) for the parking facility BHMBCCMKT01 resulted in an RMSE of 46. After performing a grid search for the best ARIMA parameters, the final ARIMA non-seasonal order was (2, 0, 1) with the seasonal order (2, 0, 0, 24). The chosen model scored an RMSE of 22 on the validation data, which was better than the baseline model, as expected.

Parking structure: BHMBCCPST01

  • RMSE for the persistence model is: 38
  • Final non-seasonal order: (0, 0, 1); final seasonal order: (1, 0, 1, 24)
  • RMSE from the validation data is: 20

Parking structure: BHMBCCSNH01

  • RMSE for the persistence model is: 157
  • Final non-seasonal order: (2, 0, 1); final seasonal order: (0, 0, 2, 24)
  • RMSE from the validation data is: 75

Parking structure: BHMBCCTHL01

  • RMSE for the persistence model is: 84
  • Final non-seasonal order: (0, 0, 0); final seasonal order: (1, 0, 1, 24)
  • RMSE from the validation data is: 24

Parking structure: BHMNCPPLS01

  • RMSE for the persistence model is: 32
  • Final non-seasonal order: (4, 0, 0); final seasonal order: (1, 0, 0, 24)
  • RMSE from the validation data is: 16

Parking structure: BHMBRCBRG02

  • RMSE for the persistence model is: 189
  • Final non-seasonal order: (0, 1, 3); final seasonal order: (0, 0, 2, 24)
  • RMSE from the validation data is: 95

Parking structure: BHMBRCBRG03

  • RMSE for the persistence model is: 78
  • Final non-seasonal order: (2, 1, 0); final seasonal order: (0, 0, 2, 24)
  • RMSE from the validation data is: 41

Parking structure: BHMBRTARC01

  • RMSE for the persistence model is: 109
  • Final non-seasonal order: (1, 0, 0); final seasonal order: (1, 0, 0, 24)
  • RMSE from the validation data is: 120

Parking structure: BHMEURBRD01

  • RMSE for the persistence model is: 77
  • Final non-seasonal order: (1, 0, 4); final seasonal order: (2, 0, 1, 24)
  • RMSE from the validation data is: 24
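One way to read the per-facility figures above is to express each validation RMSE as a relative reduction against its persistence baseline. The numbers in the sketch are copied from this post as listed. The helper name `relative_reduction` is illustrative.

```python
# (persistence RMSE, validation RMSE) per facility, copied from the figures in this post.
rmse_by_facility = {
    "BHMBCCMKT01": (46, 22),
    "BHMBCCPST01": (38, 20),
    "BHMBCCSNH01": (157, 75),
    "BHMBCCTHL01": (84, 24),
    "BHMNCPPLS01": (32, 16),
    "BHMBRCBRG02": (189, 95),
    "BHMBRCBRG03": (78, 41),
    "BHMBRTARC01": (109, 120),
    "BHMEURBRD01": (77, 24),
}


def relative_reduction(baseline, validation):
    """Fraction by which the validation RMSE undercuts the baseline (negative if worse)."""
    return (baseline - validation) / baseline


reductions = {
    name: relative_reduction(baseline, validation)
    for name, (baseline, validation) in rmse_by_facility.items()
}
not_beating_baseline = [name for name, r in reductions.items() if r <= 0]

for name, r in reductions.items():
    print(f"{name}: {r:+.1%}")
print("At or above baseline:", not_beating_baseline)
```

On the figures as listed, every facility except BHMBRTARC01 shows a validation RMSE below its persistence baseline.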

CONCLUSION: For this dataset, the chosen ARIMA model achieved a satisfactory result, and we should consider using ARIMA for further modeling.

Dataset Used: Parking Birmingham Data Set

Dataset ML Model: Time series forecast with numerical attribute

Dataset Reference: https://archive.ics.uci.edu/ml/datasets/Parking+Birmingham

The HTML formatted report can be found here on GitHub.