In this project, several machine learning methods, including SVM, decision tree, random forest, neural network, and XGBoost, are used to predict PM2.5. XGBoost shows the highest prediction accuracy and SVM the lowest. Subset sample deletion was used to improve SVM performance and increased its accuracy by ~12%. Bagged decision tree, random forest, and extra trees were compared, and all three achieved higher accuracy than a single decision tree for PM2.5 prediction. Stacking was also used to reduce bias and the effect of irrelevant features by combining the predictions of other models. SVM, decision tree, random forest, and XGBoost were used as the first-layer models. As the second-layer model, XGBoost performed best compared with the other three classification models.
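The two-layer stacking setup described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes scikit-learn is available, uses synthetic three-class data from `make_classification` in place of the real PM2.5 dataset, and substitutes scikit-learn's `GradientBoostingClassifier` for XGBoost so the example has no extra dependency. All variable names and hyperparameters are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic data standing in for PM2.5 features and class labels.
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# First layer: the four model families named in the description.
# GradientBoostingClassifier is a stand-in for XGBoost (assumption).
first_layer = [
    ("svm", make_pipeline(StandardScaler(), SVC(random_state=0))),
    ("tree", DecisionTreeClassifier(random_state=0)),
    ("forest", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("boost", GradientBoostingClassifier(n_estimators=50, random_state=0)),
]

# Second layer: a boosted-tree model trained on the first layer's
# cross-validated predictions.
stack = StackingClassifier(
    estimators=first_layer,
    final_estimator=GradientBoostingClassifier(n_estimators=50, random_state=0),
    cv=5,
)
stack.fit(X_train, y_train)

stacked_accuracy = stack.score(X_test, y_test)
print(f"Stacked test accuracy: {stacked_accuracy:.3f}")
```

To compare second-layer candidates, as the description does, one option is to swap `final_estimator` for each candidate model in turn and compare the resulting test accuracies.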
stefanie17 / pm2.5-prediction-based-on-interactive-multiple-machine-learning-models