Month: April 2022

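How should we celebrate your holiday?

(From a writer I respect, Seth Godin)

If today were a holiday in your honor, what would the theme be?

If we had to examine everything about you, your track record, your influence, your reputation, what kind of positive picture would we paint? What slogans, banners, and greetings would we use to celebrate you and your work?

Boiling the work of an organization or a person down to a sentence or two is not entirely accurate, but we do it anyway.

So what would yours be?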

Multi-Class Tabular Classification Model for Durum Wheat Identification Using Python and TensorFlow

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: The project aims to construct a predictive model using various machine learning algorithms and document the end-to-end steps using a template. The Durum Wheat Identification dataset is a multi-class modeling situation where we attempt to predict one of several (more than two) possible outcomes.

INTRODUCTION: Wheat is the main ingredient of most common food products in many people’s daily lives. Obtaining good-quality wheat kernels is essential for food supplies. In this study, the research team attempted to examine and classify type-1252 durum wheat kernels to obtain top-quality crops based on their vitreousness. The researchers used a total of 236 morphological, color, wavelet, and gaborlet features to classify durum wheat kernels and foreign objects by training several Artificial Neural Networks (ANNs) with different numbers of input features, selected from the feature rank list obtained with the ANOVA test.
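The ANOVA-based feature ranking mentioned above can be illustrated with a minimal sketch. This is not the researchers' implementation: it computes a plain one-way ANOVA F statistic per feature and sorts features by that score. The feature names, class labels, and values below are hypothetical toy data, not taken from the Durum Wheat dataset.

```python
from collections import defaultdict


def anova_f_score(values, labels):
    """One-way ANOVA F statistic for one feature across class labels."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)

    n = len(values)
    k = len(groups)
    grand_mean = sum(values) / n

    # Variation of the class means around the overall mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    # Variation of the samples around their own class mean.
    ss_within = sum(
        (v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    if ms_within == 0:
        return float("inf")
    return ms_between / ms_within


def rank_features(columns, labels):
    """Return feature names sorted from highest to lowest F score."""
    scores = {name: anova_f_score(vals, labels) for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: three classes, three samples each.
labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]
columns = {
    "feat_strong": [1.0, 1.1, 0.9, 5.0, 5.1, 4.9, 9.0, 9.1, 8.9],
    "feat_medium": [1.0, 2.0, 3.0, 3.0, 4.0, 5.0, 5.0, 6.0, 7.0],
    "feat_weak": [1.0, 5.0, 9.0, 1.1, 5.1, 9.1, 0.9, 4.9, 8.9],
}

ranking = rank_features(columns, labels)
print(ranking)
```

A model could then be trained on the top N features from such a ranking, with N varied across experiments.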

ANALYSIS: The preliminary TensorFlow models achieved an average accuracy of 99.23%, which served as the benchmark. When we processed the test dataset with the final model, it achieved an accuracy score of 99.55%.

CONCLUSION: In this iteration, the TensorFlow model appeared to be suitable for modeling this dataset.

Dataset Used: Durum Wheat Dataset

Dataset ML Model: Multi-Class classification with numerical features

Dataset Reference: https://www.muratkoklu.com/datasets/

One source of potential performance benchmarks: https://doi.org/10.1016/j.compag.2019.105016

The HTML formatted report can be found here on GitHub.

Multi-Class Tabular Classification Model for Durum Wheat Identification Using Python and XGBoost

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: The project aims to construct a predictive model using various machine learning algorithms and document the end-to-end steps using a template. The Durum Wheat Identification dataset is a multi-class modeling situation where we attempt to predict one of several (more than two) possible outcomes.

INTRODUCTION: Wheat is the main ingredient of most common food products in many people’s daily lives. Obtaining good-quality wheat kernels is essential for food supplies. In this study, the research team attempted to examine and classify type-1252 durum wheat kernels to obtain top-quality crops based on their vitreousness. The researchers used a total of 236 morphological, color, wavelet, and gaborlet features to classify durum wheat kernels and foreign objects by training several Artificial Neural Networks (ANNs) with different numbers of input features, selected from the feature rank list obtained with the ANOVA test.

ANALYSIS: The preliminary XGBoost model achieved a benchmark accuracy of 99.30%. After a series of tuning trials, the final model processed the training dataset with an accuracy score of 99.60%. When we processed the test dataset with the final model, it achieved an accuracy score of 99.55%.

CONCLUSION: In this iteration, the XGBoost model appeared to be a suitable algorithm for modeling this dataset.

Dataset Used: Durum Wheat Dataset

Dataset ML Model: Multi-Class classification with numerical features

Dataset Reference: https://www.muratkoklu.com/datasets/

One source of potential performance benchmarks: https://doi.org/10.1016/j.compag.2019.105016

The HTML formatted report can be found here on GitHub.

Univariate Time Series Model for Annual Immigration into USA Using TensorFlow

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: The project aims to construct a time series prediction model and document the end-to-end steps using a template. The Annual Immigration into USA dataset is a univariate time series situation where we attempt to forecast future outcomes based on past data points.

INTRODUCTION: The problem is to forecast the annual number of people immigrating to the United States. The dataset describes a time-series of people (in thousands) over 143 years (1820-1962), and there are 143 observations. We used the first 80% of the observations for training and testing various models while holding back the remaining observations for validating the final model.
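The 80% hold-out described above can be sketched as a chronological split with no shuffling. This is a minimal illustration, not the project's code: the function name is hypothetical, the year labels stand in for the actual observations, and rounding the split point down is an assumption.

```python
def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered sequence without shuffling.

    The earliest observations go to model building; the most recent ones
    are held back for validating the final model.
    """
    split_index = int(len(series) * train_fraction)  # assumption: round down
    return series[:split_index], series[split_index:]


# Placeholder series: one entry per year from 1820 through 1962.
years = list(range(1820, 1963))
modeling_years, validation_years = chronological_split(years)

print(len(modeling_years), modeling_years[0], modeling_years[-1])
print(len(validation_years), validation_years[0], validation_years[-1])
```

Keeping the split chronological means the validation years all come after the years used to build the model.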

ANALYSIS: The baseline persistence model yielded an RMSE of 52,116. The LSTM model processed the same test data with an RMSE of 38,031, which was better than the baseline model as expected. In an earlier ARIMA modeling experiment, the best ARIMA model with non-seasonal order of (0, 1, 2) processed the validation data with an RMSE of 61,789.
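As a rough illustration of how a persistence baseline and its RMSE can be computed, the sketch below forecasts each test point as the most recently observed value. The function names and toy numbers are hypothetical and unrelated to the RMSE figures reported above.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def persistence_forecast(history, test):
    """Walk-forward persistence: predict each point as the previous observation."""
    predictions = []
    last_observed = history[-1]
    for observation in test:
        predictions.append(last_observed)
        last_observed = observation  # the true value becomes known before the next step
    return predictions


# Hypothetical toy series, not the immigration data.
train = [10.0, 12.0, 15.0]
test = [14.0, 18.0, 17.0]

predictions = persistence_forecast(train, test)
baseline_rmse = rmse(test, predictions)
print(predictions, round(baseline_rmse, 3))
```

Any candidate model, such as an LSTM or ARIMA model, is then worth keeping only if its RMSE on the same data is lower than this baseline.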

CONCLUSION: For this dataset, the TensorFlow LSTM model achieved an acceptable result, and we should consider using TensorFlow for further modeling.

Dataset Used: Annual immigration into the United States, 1820-1962.

Dataset ML Model: Time series forecast with numerical attribute.

Dataset Reference: Rob Hyndman and Yangzhuoran Yang (2018). tsdl: Time Series Data Library. v0.1.0. https://pkg.yangzhuoranyang.com/tsdl/.

The HTML formatted report can be found here on GitHub.

Multi-Class Tabular Classification Model for Durum Wheat Identification Using Python and TensorFlow Decision Forests

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: The project aims to construct a predictive model using various machine learning algorithms and document the end-to-end steps using a template. The Durum Wheat Identification dataset is a multi-class modeling situation where we attempt to predict one of several (more than two) possible outcomes.

INTRODUCTION: Wheat is the main ingredient of most common food products in many people’s daily lives. Obtaining good-quality wheat kernels is essential for food supplies. In this study, the research team attempted to examine and classify type-1252 durum wheat kernels to obtain top-quality crops based on their vitreousness. The researchers used a total of 236 morphological, color, wavelet, and gaborlet features to classify durum wheat kernels and foreign objects by training several Artificial Neural Networks (ANNs) with different numbers of input features, selected from the feature rank list obtained with the ANOVA test.

ANALYSIS: The preliminary Gradient Boosted Trees model achieved a benchmark accuracy of 99.97% on the training dataset. When we applied the finalized model to Kaggle’s test dataset, it achieved an accuracy score of 99.88%.

CONCLUSION: In this iteration, the TensorFlow Decision Forests model appeared to be a suitable algorithm for modeling this dataset.

Dataset Used: Durum Wheat Dataset

Dataset ML Model: Multi-Class classification with numerical features

Dataset Reference: https://www.muratkoklu.com/datasets/

One source of potential performance benchmarks: https://doi.org/10.1016/j.compag.2019.105016

The HTML formatted report can be found here on GitHub.