LEARNING

XGBoost Classification based Network Intrusion Detection System for Big Data using PySparkling Water

Tadepalli Anish Deepak

Publication year
2020
Citations
4
Access
Open access

Abstract

Machine learning is a data analysis technique that automates model building. It is a branch of artificial intelligence based on the idea that systems can learn from data, recognize patterns, and make data-driven decisions with minimal human intervention. Training machine learning algorithms on large volumes of data (also known as big data) gives better results. Cloud Computing (CC) removes the computation and storage barriers to handling big data. In this paper we propose a cloud-based Intrusion Detection System (IDS) that uses a tree-based ensemble classification algorithm, the XGBoost classifier, trained on the CICIDS-2017 dataset, a realistic cyber dataset containing benign traffic and seven types of the most up-to-date common network attacks. Sparkling Water enables users to combine the fast, scalable machine learning functionality of H2O with the capabilities of Spark. The proposed IDS, using the XGBoost classifier from H2O.ai, produced good results compared with other algorithms such as Random Forest (RF), artificial neural network (ANN), gradient boosting machine (GBM), and the stacked ensemble method. Of all the algorithms, XGBoost gave 99.8% accuracy on the validation set and nearly 99.1% accuracy on the test set from k-fold cross-validation.
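
The k-fold cross-validation mentioned in the abstract can be illustrated with a minimal, standard-library-only sketch. This is not the authors' pipeline: it does not use H2O, Sparkling Water, or XGBoost, and it substitutes a majority-class baseline for the classifier so the example stays self-contained. The function names and the toy labels are hypothetical and are not drawn from CICIDS-2017.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th index starting at i, so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def majority_class(labels):
    """Stand-in 'model' (assumption): always predict the most frequent training label."""
    return Counter(labels).most_common(1)[0][0]


def cross_validated_accuracy(labels, k=5, seed=0):
    """Return the mean accuracy over k folds, holding each fold out once."""
    folds = k_fold_indices(len(labels), k, seed)
    scores = []
    for held_out in folds:
        held_out_set = set(held_out)
        # Train on every sample that is not in the held-out fold.
        train_labels = [y for i, y in enumerate(labels) if i not in held_out_set]
        prediction = majority_class(train_labels)
        correct = sum(1 for i in held_out if labels[i] == prediction)
        scores.append(correct / len(held_out))
    return sum(scores) / len(scores)


# Toy labels (hypothetical, not CICIDS-2017): 90 benign flows and 10 attack flows.
toy_labels = ["benign"] * 90 + ["attack"] * 10
folds = k_fold_indices(len(toy_labels), 5)
mean_accuracy = cross_validated_accuracy(toy_labels, k=5)
print(f"mean accuracy over 5 folds: {mean_accuracy:.3f}")
```

In a real evaluation the `majority_class` stand-in would be replaced by a trained classifier, and the per-fold scores would be reported alongside their mean.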

Keywords

Intrusion detection system; Big data; Computer science; Data mining; Artificial intelligence
