Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.
SUMMARY: The purpose of this project is to construct a predictive model using various machine learning algorithms and to document the end-to-end steps using a template. The Superconductor Critical Temperature dataset presents a regression problem, where we are trying to predict the value of a continuous variable.
INTRODUCTION: The research team wishes to create a statistical model for predicting the superconducting critical temperature based on the features extracted from the superconductor’s chemical formula. The model also seeks to identify the features that contribute the most to its predictive accuracy.
In iteration Take1, we constructed and tuned machine learning models for this dataset using a TensorFlow network with five layers. We also observed the best result that we could obtain using the tuned models with the validation and test datasets.
In this Take2 iteration, we will construct and tune machine learning models for this dataset using TensorFlow with dropout layers. We will observe the best result that we can obtain using the tuned models with the validation and test datasets.
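For readers unfamiliar with dropout, the idea can be illustrated with a minimal sketch. This is not the TensorFlow implementation and not the code used in this project. It is a plain-Python illustration of "inverted" dropout, in which each activation is zeroed with probability `rate` during training and the survivors are scaled up so the expected value is unchanged. The function name `dropout` and its parameters are assumptions made for this example.

```python
import random


def dropout(activations, rate, training=True, rng=None):
    """Illustrative inverted dropout on a list of floats (hypothetical helper).

    During training, each value is dropped (set to 0.0) with probability
    `rate`; kept values are divided by (1 - rate) so the expected sum
    stays the same. At inference time the input is returned unchanged.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in the range [0, 1)")
    if not training or rate == 0.0:
        return list(activations)

    rng = rng or random.Random()
    keep_prob = 1.0 - rate
    return [
        a / keep_prob if rng.random() < keep_prob else 0.0
        for a in activations
    ]


if __name__ == "__main__":
    hidden = [0.5, 1.2, -0.3, 0.8]
    print(dropout(hidden, rate=0.25, rng=random.Random(42)))  # some values zeroed
    print(dropout(hidden, rate=0.25, training=False))         # unchanged
```

Because units are dropped at random only during training, the network cannot rely too heavily on any single unit, which is why dropout is commonly tried as a regularizer.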
ANALYSIS: In iteration Take1, the baseline TensorFlow model achieved an RMSE benchmark of 11.109. After a series of tuning trials, the TensorFlow model processed the validation dataset with an RMSE score of 10.564. When we applied the TensorFlow model to the previously unseen test dataset, we obtained an RMSE score of 10.540.
In this Take2 iteration, the baseline TensorFlow model achieved an RMSE benchmark of 10.580. After a series of tuning trials, the TensorFlow model processed the validation dataset with an RMSE score of 10.905. When we applied the TensorFlow model to the previously unseen test dataset, we obtained an RMSE score of 10.885.
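The RMSE (root mean squared error) scores quoted above follow the standard definition: the square root of the mean of the squared differences between observed and predicted values. The sketch below shows that calculation with the Python standard library only. The function name `rmse` and the sample values are assumptions for illustration and are not taken from the project's code or data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    # Made-up critical temperatures, for illustration only.
    observed = [29.0, 26.0, 19.0, 22.0]
    predicted = [31.5, 24.0, 20.0, 18.5]
    print(f"RMSE: {rmse(observed, predicted):.3f}")
```

Because the errors are squared before averaging, RMSE is expressed in the same units as the target variable and penalizes large misses more heavily than small ones.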
CONCLUSION: In this iteration, adding dropout layers to the TensorFlow model did not improve the modeling of this dataset: the test RMSE was 10.885, compared with 10.540 in Take1. However, we should still consider using the algorithm for further modeling.
Dataset Used: Superconductivity Data Set
Dataset ML Model: Regression with numerical attributes
Dataset Reference: https://archive.ics.uci.edu/ml/datasets/Superconductivty+Data
One potential source of performance benchmarks: https://doi.org/10.1016/j.commatsci.2018.07.052
The HTML-formatted report can be found here on GitHub.