A review of KDD99 dataset usage in intrusion detection and machine learning between 2010 and 2015

Although the KDD99 dataset is more than 15 years old, it is still widely used in academic research. This study reviews 149 research articles from 65 journals indexed in Science Citation Index Expanded and Emerging Sources Citation Index. The number of published studies shows that KDD99 is the most used dataset in the IDS and machine learning areas.

Intrusion Detection System (IDS) studies can be framed as a classification task that separates normal network behavior from attacks. Machine learning and data mining algorithms are widely used in IDS. Most of these algorithms assume that the problem space does not change very fast, whereas attackers continuously change and improve their capabilities.
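The classification framing above can be illustrated with a minimal sketch. It is not taken from the reviewed studies: the two features, their values, and the nearest-centroid rule are all assumptions chosen for brevity, and the records are invented rather than drawn from KDD99. The classifier is fixed once its centroids are computed, which mirrors the assumption that the problem space does not change quickly.

```python
from statistics import mean

# Hypothetical toy records: (duration_seconds, bytes_sent, label).
# The values are invented for illustration; they are not KDD99 data.
RECORDS = [
    (0.2, 300.0, "normal"),
    (0.3, 280.0, "normal"),
    (0.1, 320.0, "normal"),
    (5.0, 9000.0, "attack"),
    (6.0, 8500.0, "attack"),
    (5.5, 9500.0, "attack"),
]


def centroids(records):
    """Return the mean feature vector of each class."""
    by_class = {}
    for duration, sent, label in records:
        by_class.setdefault(label, []).append((duration, sent))
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in by_class.items()
    }


def classify(features, class_centroids):
    """Assign the class whose centroid is nearest (squared Euclidean distance)."""
    def squared_distance(label):
        centroid = class_centroids[label]
        return sum((f - c) ** 2 for f, c in zip(features, centroid))

    return min(class_centroids, key=squared_distance)


# "Training" happens once; the centroids never adapt to new traffic.
CENTROIDS = centroids(RECORDS)

if __name__ == "__main__":
    print(classify((0.25, 310.0), CENTROIDS))
    print(classify((5.2, 9100.0), CENTROIDS))
```

Because the centroids are computed once from historical records, an attacker whose traffic drifts toward the "normal" centroid would be misclassified until the model is rebuilt, which is the tension the paragraph describes. The features are left unscaled here, so the byte count dominates the distance; a real study would normalize them.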

KDD99 is the most used dataset in the IDS domain. Between 2010 and 2015, 149 research articles that used KDD99 were published in journals indexed in Science Citation Index Expanded (SCIE) and Emerging Sources Citation Index (ESCI). The KDD99 dataset is the main intersection of machine learning research, IDS, and information security.


https://www.researchgate.net/publication/309038723_A_review_of_KDD99_dataset_usage_in_intrusion_detection_and_machine_learning_between_2010_and_2015