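(From a writer I respect, Seth Godin)

Ruts don't dig themselves.

Most of the time, we are in a rut because that is precisely where we put ourselves.

Actions become habits, and habits get repeated because doing them makes us feel safe.

The easiest way to make things more interesting is simply to stop repeating habitual behavior.

This often comes from a response to triggers. Remove the trigger and you can change the habit.

Make tiny changes. Find different ways to keep score.

Tomorrow comes every day. But we do not have to take the same route as yesterday.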

# Again and Again and Again

# Deep Learning Regression Model for Ames Iowa Housing Prices Using TensorFlow Take 6

# Deep Learning Regression Model for Ames Iowa Housing Prices Using TensorFlow Take 5

# Time Series Model for Weekly Births in Quebec Using Python

# Deep Learning Regression Model for Allstate Claims Severity Using Python Take 7

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.


Month: February 2020

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: The purpose of this project is to construct a prediction model using various machine learning algorithms and to document the end-to-end steps using a template. The Ames Iowa Housing Prices dataset is a regression situation where we are trying to predict the value of a continuous variable.

INTRODUCTION: Many factors can influence a home’s purchase price. This Ames Housing dataset contains 79 explanatory variables describing every aspect of residential homes in Ames, Iowa. The goal is to predict the final price of each home.

In iteration Take1, we established the baseline root mean squared error (RMSE) for further takes of modeling.
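Every iteration in this series is compared on RMSE, so it may help to see the metric written out. The following is a minimal sketch in plain Python; the function name and the sample prices are illustrative and are not taken from the original notebook.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical sale prices and predictions, in dollars.
actual = [200000, 150000, 310000]
predicted = [210000, 140000, 300000]
print(rmse(actual, predicted))
```

Because the errors are squared before averaging, a few badly mispriced homes raise RMSE more than many small misses, and the result stays in the same units as the sale price.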

In iteration Take2, we converted some of the categorical variables from nominal to ordinal and observed the effects of the change.
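As a rough illustration of a nominal-to-ordinal conversion, the sketch below maps quality codes to integer ranks. The post does not say which variables were converted, so the code list (Po, Fa, TA, Gd, Ex), the rank values, and the treatment of missing entries are all assumptions made for the example.

```python
# Assumed ordering for an Ames-style quality rating, worst to best.
QUALITY_ORDER = ["Po", "Fa", "TA", "Gd", "Ex"]
QUALITY_TO_RANK = {level: rank for rank, level in enumerate(QUALITY_ORDER, start=1)}

def encode_ordinal(values, mapping, missing=0):
    """Map category labels to integer ranks; unknown or missing labels get `missing`."""
    return [mapping.get(v, missing) for v in values]

# Hypothetical column values, including one missing entry.
exter_qual = ["TA", "Gd", "Ex", None, "Fa"]
print(encode_ordinal(exter_qual, QUALITY_TO_RANK))  # [3, 4, 5, 0, 2]
```

An ordinal encoding keeps one column per variable and preserves the ordering, whereas one-hot encoding the same variable would create one column per level.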

In iteration Take3, we examined the feature selection technique of attribute importance ranking by using the Gradient Boosting algorithm. By selecting only the most important attributes, we decreased the processing time and maintained a similar level of RMSE compared to the baseline.

In iteration Take4, we examined the feature selection technique of recursive feature elimination (RFE) by using the Gradient Boosting algorithm. By selecting up to 100 attributes, we decreased the processing time and maintained a similar level of RMSE compared to the baseline.

In iteration Take5, we constructed several Multilayer Perceptron (MLP) models with one, two, and three hidden layers. We also observed how the different model architectures affect the RMSE metric.

In this Take6 iteration, we will add Dropout layers to our Multilayer Perceptron (MLP) models. We will observe how the Dropout layers affect the RMSE metric.

ANALYSIS: In iteration Take1, the baseline performance of the machine learning algorithms achieved an average RMSE of 31,172. Two algorithms (Ridge Regression and Gradient Boosting) achieved the top RMSE metrics after the first round of modeling. After a series of tuning trials, Gradient Boosting turned in the best overall result and achieved an RMSE metric of 24,165. By using the optimized parameters, the Gradient Boosting algorithm processed the test dataset with an RMSE of 21,067, which was even better than the prediction from the training data.

In iteration Take2, Gradient Boosting achieved an RMSE metric of 23,612 with the training dataset and processed the test dataset with an RMSE of 21,130. Converting the nominal variables to ordinal did not have a material impact on the prediction accuracy in either direction.

In iteration Take3, Gradient Boosting achieved an RMSE metric of 24,045 with the training dataset and processed the test dataset with an RMSE of 21,994. At the importance level of 99%, the attribute importance technique eliminated 222 of 258 total attributes. The remaining 36 attributes produced a model that achieved a comparable RMSE to the baseline model. The processing time for Take3 also fell by 67.90% compared to the Take1 iteration.
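One plausible reading of "importance level of 99%" is a cumulative cutoff: rank attributes by importance and keep the smallest set whose importance adds up to the threshold. The sketch below follows that reading, which is an assumption about the method rather than a description of the original notebook; the scores are made up for illustration.

```python
def select_by_cumulative_importance(importances, threshold=0.99):
    """Return feature names, most important first, whose cumulative
    normalized importance first reaches `threshold`."""
    total = sum(importances.values())
    if total <= 0:
        raise ValueError("importances must sum to a positive value")
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    selected, running = [], 0.0
    for name, score in ranked:
        selected.append(name)
        running += score / total
        if running >= threshold:
            break
    return selected

# Hypothetical importance scores, not taken from the original model.
scores = {"OverallQual": 0.55, "GrLivArea": 0.30, "GarageCars": 0.10,
          "PoolQC": 0.04, "Street": 0.01}
print(select_by_cumulative_importance(scores, threshold=0.90))
```

In practice the scores would come from a fitted model, for example the `feature_importances_` attribute of a scikit-learn gradient boosting estimator.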

In iteration Take4, Gradient Boosting achieved an RMSE metric of 23,825 with the training dataset and processed the test dataset with an RMSE of 21,898. The RFE technique eliminated 208 of 258 total attributes. The remaining 50 attributes produced a model that achieved a comparable RMSE to the baseline model. The processing time for Take4 also fell by 1.8% compared to the Take1 iteration.

In iteration Take5, all models processed the test dataset and produced an RMSE near the 23,000 level. The two-layer model with 128 and 64 nodes (Model 2C) achieved the best RMSE of 22,708 using the test dataset. All models eventually overfit, and the models with more layers overfit much faster than the simpler models.

In this Take6 iteration, all models again processed the test dataset and produced an RMSE near the 23,000 level. All models eventually overfit, but the Dropout layers helped by slowing the onset of overfitting.
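To make the Dropout mechanism concrete, here is a small plain-Python sketch of inverted dropout, the scheme the Keras `Dropout` layer documents: units are zeroed at random during training and the survivors are scaled up so the expected sum is unchanged. The function name, the 0.25 rate, and the toy activations are assumptions for illustration; the original models used the framework's layer, not this code.

```python
import random

def dropout(activations, rate, training=True, rng=random):
    """Inverted dropout: zero each unit with probability `rate` during
    training and scale survivors by 1 / (1 - rate); identity at inference."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(activations)
    keep_scale = 1.0 / (1.0 - rate)
    return [a * keep_scale if rng.random() >= rate else 0.0 for a in activations]

rng = random.Random(42)   # seeded only to make the demo repeatable
hidden = [1.0] * 8        # toy hidden-layer activations
print(dropout(hidden, rate=0.25, rng=rng))         # training: a random mask is applied
print(dropout(hidden, rate=0.25, training=False))  # inference: values pass through
```

Because a different random subset of units is silenced on every batch, the network cannot rely on any single unit, which is why the technique tends to delay overfitting rather than eliminate it.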

CONCLUSION: For this iteration, the addition of Dropout layers produced similar RMSEs for all models. For this dataset, we should consider experimenting with more regularization techniques.

Dataset Used: Kaggle Competition – House Prices: Advanced Regression Techniques

Dataset ML Model: Regression with numerical and categorical attributes

Dataset Reference: https://ww2.amstat.org/publications/jse/v19n3/decock.pdf

One potential source of performance benchmarks: https://www.kaggle.com/c/house-prices-advanced-regression-techniques

The HTML formatted report can be found here on GitHub.

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: The purpose of this project is to construct a prediction model using various machine learning algorithms and to document the end-to-end steps using a template. The Ames Iowa Housing Prices dataset is a regression situation where we are trying to predict the value of a continuous variable.

INTRODUCTION: Many factors can influence a home’s purchase price. This Ames Housing dataset contains 79 explanatory variables describing every aspect of residential homes in Ames, Iowa. The goal is to predict the final price of each home.

In iteration Take1, we established the baseline root mean squared error (RMSE) for further takes of modeling.

In iteration Take2, we converted some of the categorical variables from nominal to ordinal and observed the effects of the change.

In iteration Take3, we examined the feature selection technique of attribute importance ranking by using the Gradient Boosting algorithm. By selecting only the most important attributes, we decreased the processing time and maintained a similar level of RMSE compared to the baseline.

In iteration Take4, we examined the feature selection technique of recursive feature elimination (RFE) by using the Gradient Boosting algorithm. By selecting up to 100 attributes, we decreased the processing time and maintained a similar level of RMSE compared to the baseline.

In this Take5 iteration, we will construct several Multilayer Perceptron (MLP) models with one, two, and three hidden layers. We will observe how the different model architectures affect the RMSE metric.

ANALYSIS: In iteration Take1, the baseline performance of the machine learning algorithms achieved an average RMSE of 31,172. Two algorithms (Ridge Regression and Gradient Boosting) achieved the top RMSE metrics after the first round of modeling. After a series of tuning trials, Gradient Boosting turned in the best overall result and achieved an RMSE metric of 24,165. By using the optimized parameters, the Gradient Boosting algorithm processed the test dataset with an RMSE of 21,067, which was even better than the prediction from the training data.

In iteration Take2, Gradient Boosting achieved an RMSE metric of 23,612 with the training dataset and processed the test dataset with an RMSE of 21,130. Converting the nominal variables to ordinal did not have a material impact on the prediction accuracy in either direction.

In iteration Take3, Gradient Boosting achieved an RMSE metric of 24,045 with the training dataset and processed the test dataset with an RMSE of 21,994. At the importance level of 99%, the attribute importance technique eliminated 222 of 258 total attributes. The remaining 36 attributes produced a model that achieved a comparable RMSE to the baseline model. The processing time for Take3 also fell by 67.90% compared to the Take1 iteration.

In iteration Take4, Gradient Boosting achieved an RMSE metric of 23,825 with the training dataset and processed the test dataset with an RMSE of 21,898. The RFE technique eliminated 208 of 258 total attributes. The remaining 50 attributes produced a model that achieved a comparable RMSE to the baseline model. The processing time for Take4 also fell by 1.8% compared to the Take1 iteration.

In this Take5 iteration, all models processed the test dataset and produced an RMSE near the 23,000 level. The two-layer model with 128 and 64 nodes (Model 2C) achieved the best RMSE of 22,708 using the test dataset. All models eventually overfit, and the models with more layers overfit much faster than the simpler models.
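One rough way to compare the capacity of the different architectures is to count their trainable parameters. The sketch below assumes the network receives all 258 attributes mentioned earlier and ends in a single output unit; only the 128/64 configuration comes from the text, and the other layer sizes are illustrative.

```python
def dense_param_count(n_inputs, layer_sizes, n_outputs=1):
    """Count weights and biases in a fully connected network."""
    total, fan_in = 0, n_inputs
    for units in list(layer_sizes) + [n_outputs]:
        total += (fan_in + 1) * units  # +1 accounts for each unit's bias
        fan_in = units
    return total

# 258 inputs is an assumption based on the attribute count quoted in the text.
for name, sizes in [("one layer", [128]), ("two layers", [128, 64]),
                    ("three layers", [128, 64, 32])]:
    print(name, dense_param_count(258, sizes))
```

Under these assumptions every variant has tens of thousands of parameters. If the training set is the Kaggle competition file of roughly 1,460 rows, that imbalance is consistent with the overfitting reported above.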

CONCLUSION: For this iteration, the different model architectures produced similar RMSEs. For this dataset, we should consider experimenting with more regularization techniques.

Dataset Used: Kaggle Competition – House Prices: Advanced Regression Techniques

Dataset ML Model: Regression with numerical and categorical attributes

Dataset Reference: https://ww2.amstat.org/publications/jse/v19n3/decock.pdf

One potential source of performance benchmarks: https://www.kaggle.com/c/house-prices-advanced-regression-techniques

The HTML formatted report can be found here on GitHub.

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: The purpose of this project is to construct a time series prediction model and document the end-to-end steps using a template. The Daily Births in Quebec dataset is a time series situation where we are trying to forecast future outcomes based on past data points.

INTRODUCTION: The problem is to forecast the weekly number of births in the province of Quebec, Canada. The dataset describes a time-series of baby births for 14 years (1977-1990), and there are 5113 daily observations. To avoid out-of-memory issues during the processing, we first summarized the daily data into 730 weekly sums. We subsequently used the first 80% of the observations for training and testing various models while holding back the remaining observations for validating the final model.
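The aggregation and split described above can be sketched in a few lines of plain Python. The sketch assumes consecutive seven-day blocks with the three leftover days dropped, which is one way to arrive at 730 weeks from 5113 days; the post does not say how the remainder was handled, and the daily values here are placeholders.

```python
def weekly_sums(daily_counts, days_per_week=7):
    """Sum consecutive 7-day blocks, dropping any incomplete trailing block."""
    n_weeks = len(daily_counts) // days_per_week
    return [sum(daily_counts[i * days_per_week:(i + 1) * days_per_week])
            for i in range(n_weeks)]

def chronological_split(series, train_fraction=0.8):
    """Split without shuffling so the hold-out set is strictly later in time."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]

daily = [250] * 5113  # placeholder values; only the length matches the text
weekly = weekly_sums(daily)
train, holdout = chronological_split(weekly)
print(len(weekly), len(train), len(holdout))
```

Keeping the split chronological matters for time series: shuffling before splitting would let the model see observations from the future of the period it is validated on.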

ANALYSIS: The baseline persistence forecast for the dataset resulted in an RMSE of 70. After a grid search over the ARIMA parameters, the final non-seasonal order was (2, 1, 2) and the seasonal order was (1, 0, 2, 52). The chosen model processed the validation data with an RMSE of 59, which was better than the baseline, as expected.
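A persistence baseline simply predicts that each week will repeat the previous week's value, and any candidate model has to beat it to be useful. A minimal sketch follows, assuming one-step-ahead forecasts; the weekly totals are hypothetical and are not drawn from the Quebec data.

```python
import math

def persistence_rmse(series):
    """RMSE of predicting each value with the one immediately before it."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    errors = [series[t] - series[t - 1] for t in range(1, len(series))]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

weekly_births = [1700, 1760, 1690, 1750, 1720]  # hypothetical weekly totals
print(round(persistence_rmse(weekly_births), 2))
```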

Dataset Used: Daily Births in Quebec, 1977 through 1990

Dataset ML Model: Time series forecast with numerical attributes

Dataset Reference: Rob Hyndman and Yangzhuoran Yang (2018). tsdl: Time Series Data Library. v0.1.0. https://pkg.yangzhuoranyang./tsdl/.

The HTML formatted report can be found here on GitHub.

SUMMARY: The purpose of this project is to construct a predictive model using various machine learning algorithms and to document the end-to-end steps using a template. The Allstate Claims Severity dataset is a regression situation where we are trying to predict the value of a continuous variable.

INTRODUCTION: Allstate is interested in developing automated methods of predicting the cost, and hence severity, of claims. In this Kaggle challenge, the contestants were asked to create an algorithm that could accurately predict claims severity. Each row in this dataset represents an insurance claim. The task is to predict the value for the ‘loss’ column. Variables prefaced with ‘cat’ are categorical, while those prefaced with ‘cont’ are continuous.

In iteration Take1, we constructed machine learning models using the original dataset with minimal data preparation and no feature engineering. The XGBoost model serves as the baseline for future iterations of modeling.

In iteration Take2, we tuned additional parameters of the XGBoost model and improved the MAE metric further.

In iteration Take3, we constructed several basic Multilayer Perceptron (MLP) models with one hidden layer. The basic MLP model serves as the baseline model as we build more complex MLP models in future iterations.

In iteration Take4, we constructed several Multilayer Perceptron (MLP) models with two hidden layers. We also observed whether the additional hidden layer has a positive effect on MAE when compared to models that have just one hidden layer.

In iteration Take5, we constructed several Multilayer Perceptron (MLP) models with three hidden layers. We also observed whether the additional hidden layer has a positive effect on MAE when compared to models that have just one or two hidden layers.

In iteration Take6, we constructed several three-layer Multilayer Perceptron (MLP) models with batch normalization. We also observed whether the batch normalization technique has a positive effect on MAE when compared to models without the batch normalization.

In this Take7 iteration, we will tune the MLP model that has 512/128/64 nodes and 0.25/0.25/0.25 Dropout ratios. We will perform a grid search for the best-performing model using different learning rates, kernel initializers, and batch sizes.
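A grid search of this kind enumerates every combination of the candidate settings and trains one model per combination. The sketch below only builds the list of configurations; apart from the learning rate of 0.0005, which appears later in this post, the candidate values are placeholders and not the ones used in the original run.

```python
from itertools import product

# Candidate values are placeholders, except 0.0005, which the text mentions.
param_grid = {
    "learning_rate": [0.001, 0.0005, 0.0001],
    "kernel_initializer": ["glorot_uniform", "he_normal"],
    "batch_size": [32, 64],
}

def expand_grid(grid):
    """Yield one dict per combination of the grid's values."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

configs = list(expand_grid(param_grid))
print(len(configs))   # 12 combinations for this grid
print(configs[0])
```

The number of runs is the product of the list lengths, so adding a single extra value to any parameter can noticeably lengthen the search.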

ANALYSIS: In iteration Take1, the baseline performance of the machine learning algorithms achieved an average MAE of 1301. eXtreme Gradient Boosting (XGBoost) achieved the top MAE metric after the first round of modeling. After a series of tuning trials, XGBoost achieved an MAE metric of 1199. By using the optimized parameters, the XGBoost algorithm processed the test dataset with an MAE of 1204, which was in line with the MAE prediction from the training data.

In iteration Take2, the further-tuned eXtreme Gradient Boosting (XGBoost) model achieved an improved MAE metric of 1191 using the training data. By using the same optimized parameters, the XGBoost algorithm processed the test dataset with an MAE of 1195, which was in line with the MAE prediction from the training data.

In iteration Take3, the simple MLP model with 128 nodes achieved an MAE metric of 1193 on the test dataset after 50 epochs. The MLP model with 1024 nodes processed the same test dataset with an MAE of 1170 after the same number of epochs but with much more overfitting.

In iteration Take4, the MLP model with 128/64 nodes and 0.25/0.25 Dropout ratios achieved an MAE metric of 1169 on the test dataset after 31 epochs. The MLP model with 256/128 nodes and 0.25/0.50 Dropout ratios also processed the same test dataset with an MAE of 1169 after 25 epochs.

In iteration Take5, the MLP model with 512/128/64 nodes and 0.25/0.50/0.50 Dropout ratios achieved an MAE metric of 1164 on the test dataset after 16 epochs. The MLP model with 1024/512/256 nodes and 0.25/0.50/0.50 Dropout ratios also processed the same test dataset with an MAE of 1164 after nine epochs.

In iteration Take6, the MLP model with 512/128/64 nodes and 0.25/0.25/0.25 Dropout ratios achieved an MAE metric of 1157 on the test dataset after 22 epochs. The MLP model with 1024/512/256 nodes and 0.50/0.50/0.50 Dropout ratios also processed the same test dataset with an MAE of 1159 after 48 epochs.

In this Take7 iteration, the models with a learning rate of 0.0005 seemed to produce the most stable training and testing loss curves. Those models also achieved MAEs between 1158 and 1161 on the test dataset at around 20 epochs, before they started to overfit.
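Finding the epoch just before overfitting sets in amounts to locating the minimum of the validation curve. The sketch below does this with a patience rule similar in spirit to early stopping; the function name, the patience value, and the per-epoch MAE figures are all hypothetical.

```python
def best_epoch(val_losses, patience=5):
    """Return (epoch, loss) of the lowest validation loss, scanning until the
    loss has failed to improve for `patience` consecutive epochs.
    Epochs are numbered from 1."""
    if not val_losses:
        raise ValueError("val_losses must not be empty")
    best_idx, best_loss, waited = 0, float("inf"), 0
    for idx, loss in enumerate(val_losses):
        if loss < best_loss:
            best_idx, best_loss, waited = idx, loss, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_idx + 1, best_loss

# Hypothetical validation MAE per epoch.
history = [1250, 1200, 1180, 1165, 1160, 1163, 1170, 1175, 1181]
print(best_epoch(history, patience=3))
```

A smoother loss curve, such as the one reported for the 0.0005 learning rate, makes this kind of rule more reliable because a single noisy epoch is less likely to trigger a premature stop.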

CONCLUSION: For this iteration, the 512/128/64 nodes and 0.25/0.25/0.25 Dropout MLP model achieved good overall results using the learning rate of 0.0005. For this dataset, we should consider using this model for further modeling activities or production uses.

Dataset Used: Allstate Claims Severity Data Set

Dataset ML Model: Regression with numerical and categorical attributes

Dataset Reference: https://www.kaggle.com/c/allstate-claims-severity/data

One potential source of performance benchmarks: https://www.kaggle.com/c/allstate-claims-severity/leaderboard

The HTML formatted report can be found here on GitHub.
