README

Australian Defence Force Academy - Intrusion Detection Datasets

Data hosted and ReadMe file provided by the University of Arizona Artificial Intelligence Lab. Citation information below.

ADFA-IDS.zip (15 MB) contains:
NOTE: n and X are used as repeating variables in file names to indicate a series of files

ADFA-LD.zip (2.3 MB compressed, 8.7 MB uncompressed)
    ADFA-LD
        Attack_Data_Master
            ._Adduser_n
            ._Hydra_FTP_n
            ._Hydra_SSH_n
            ._Java_Meterpreter_n
            ._Meterpreter_n
            ._Web_Shell_n
        Training_Data_Master
            UTD-n (where n = 0001 through 0833)
        Validation_Data_Master
            UTD-n (where n = 0001 through 4372)

Full_Process_Traces2.zip (30 MB compressed, 4 GB uncompressed)
    Full_Process_Traces
        Full_Trace_Attack_Data (examples of each folder type listed, each contains GHC files)
            V1-CesarFTP-N1-1
            V2-WebDAV-P1-1
            V3-Icecast-N1-1
            V4-Tomcat-N1-1
            V5-OS-SMB-N1-1
            V6-OS-Print_Spool-N1-1
            V7-PMWiki-P1-1
            V8-Wireless-Karma-N1-1
            V9-PEF-N1-1
            V10-Backroored-Executable-S1-1
            V11-Browser-Attack-N1-1
            V12-Infectious-Media-S1-1
        Full_Trace_Training_Data (contains GHC files, examples of file names given)
            Training-Backdoored-Executable_n.GHC
            Training-Background_n.GHC
            Training-Browser-Attack_n.GHC
            Training-CesarFTP_n.GHC
            Training-Icecast_n.GHC
            Training-Infectious-Media-Part1_n.GHC
            Training-Infectious-Media-Part2_n.GHC
            Training-OS_Print_Spoot_n.GHC
            Training-OS_SMB_n.GHC
            Training-PDF_n.GHC
            Training-PMWiki_n.GHC
            Training-Tomcat_n.GHC
            Training-WebDAV_n.GHC
            Training-Wireless-Karma_n.GHC
        Full_Trace_Validation_Data
            Training-Backdoored-Executable_n.GHC
            Training-Background_n.GHC
            Training-Browser-Attack-PartX_n.GHC
            Training-CesarFTP_n.GHC
            Training-Icecast_n.GHC
            Training-Infectious-Media-PartX_n.GHC
            Training-OS_Print_Spoot_n.GHC
            Training-OS_SMB_n.GHC
            Training-PDF-PartX_n.GHC
            Training-PMWiki_n.GHC
            Training-Tomcat_n.GHC (part 1 and 2)
            Training-WebDAV_n.GHC
            Training-Wireless-Karma-PartX_n.GHC

chamaeleon_scan.PDF
chamaeleon_scan.webarchive (opens with Safari)
Licence.txt
readme.txt

DESCRIPTION

ADFA IDS is an intrusion
detection system dataset made publicly available in 2013. It is intended to be representative of modern attack structure and methodology and to replace the older KDD and UNM datasets. ADFA IDS includes independent datasets for Linux and Windows environments.

ADFA-LD (Linux dataset) was generated on an Ubuntu Linux 11.04 host OS with Apache 2.2.17 running PHP 5.3.5. FTP, SSH, MySQL 14.14, and TikiWiki were started. The following table shows the payloads and vectors used to attack the Ubuntu OS and generate the dataset.

PAYLOAD/EFFECT               VECTOR
password bruteforce          FTP by Hydra
password bruteforce          SSH by Hydra
add new superuser            Client side poisoned executable
Java based meterpreter       Tiki Wiki vulnerability exploit
Linux meterpreter payload    Client side poisoned executable
C100 Webshell                PHP remote file inclusion vulnerability

See G. Creech, Developing a high-accuracy cross platform Host-Based Intrusion Detection System capable of reliably detecting zero-day attacks, 2014, Section 3.5.2 for detailed information on the methodology, collection, and organization of this dataset.

ADFA-WD (Windows dataset) was generated on a Windows XP Service Pack 2 host OS with the XP default firewall enabled for all attacks, file sharing enabled, a network printer configured, and both wireless and Ethernet networking. Norton AV 2013 was used to scan certain payloads. An FTP server, a web server and management tool, and a streaming audio digital radio package were activated. A target ratio of 1:10:1 (normal data : validation data : attack data) was used to guide collection and structuring activities.

Vectors: TCP ports, web based vectors, browser attacks, and malware attachments
Effects: Bind shell, reverse shell, exploitation payload, remote operation, staging, system manipulation, privilege escalation, data exfiltration, and back-door insertion.

See G.
Creech, Developing a high-accuracy cross platform Host-Based Intrusion Detection System capable of reliably detecting zero-day attacks, 2014, Chapter 4 for detailed information on the methodology, collection, and organization of this dataset.

HOW TO CITE THIS DATASET

Author(s): Gideon Creech and Jiankun Hu
Title: ADFA IDS Dataset
Publisher: University of Arizona Artificial Intelligence Lab, AZSecure-data, Director Hsinchun Chen
Location: [AZSecure-data has not yet implemented Digital Object Identifiers or Persistent URLs, please copy and paste the location where you retrieve this file from within http://www.azsecure-data.org/]
Publication date: November 2016

IEEE formatted citation:
G. Creech and J. Hu. ADFA IDS Dataset, University of Arizona Artificial Intelligence Lab, AZSecure-data, Director Hsinchun Chen. Available: http://www.azsecure-data.org/ [November 2016]

ALSO CITE the following related publications:
G. Creech and J. Hu. A Semantic Approach to Host-based Intrusion Detection Systems Using Contiguous and Discontiguous System Call Patterns. Computers, IEEE Transactions on, PP(99):1-1, 2013.
G. Creech and J. Hu. Generation of a new IDS test dataset: Time to retire the KDD collection. In Wireless Communications and Networking Conference (WCNC), 2013 IEEE, pages 4487-4492, 2013.
G. Creech. Developing a high-accuracy cross platform Host-Based Intrusion Detection System capable of reliably detecting zero-day attacks, 2014.

Original data host and associated information:
https://www.unsw.adfa.edu.au/australian-centre-for-cyber-security/cybersecurity/ADFA-IDS-Datasets/
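
EXAMPLE: CHECKING AN EXTRACTED ADFA-LD FOLDER

The ADFA-LD listing at the top of this file can be checked against an extracted copy with a short script. The following is a minimal sketch, not part of the dataset. It assumes ADFA-LD.zip has been extracted so that an ADFA-LD folder contains the three *_Master folders named above. The function names attack_category and summarize are hypothetical, and treating the leading "._" on attack entries as an optional prefix is an assumption.

```python
import re
from pathlib import Path


def attack_category(entry_name):
    """Map an entry such as '._Hydra_FTP_3' or 'Hydra_FTP_3' to 'Hydra_FTP'.

    Returns None when the name does not end in '_<number>'.
    """
    # Assumption: a leading '._' is an optional prefix and can be ignored.
    name = entry_name[2:] if entry_name.startswith("._") else entry_name
    match = re.fullmatch(r"(.+)_(\d+)", name)
    return match.group(1) if match else None


def summarize(adfa_ld_root):
    """Count entries under the three ADFA-LD folders named in this README."""
    root = Path(adfa_ld_root)
    summary = {}

    # Training and validation folders: count the trace files directly.
    for split in ("Training_Data_Master", "Validation_Data_Master"):
        folder = root / split
        if folder.is_dir():
            summary[split] = sum(1 for p in folder.iterdir() if p.is_file())
        else:
            summary[split] = 0

    # Attack folder: group entries by attack category, counting each
    # normalized name once so '._Adduser_1' and 'Adduser_1' are not doubled.
    per_attack = {}
    attack_dir = root / "Attack_Data_Master"
    if attack_dir.is_dir():
        seen = set()
        for p in attack_dir.iterdir():
            name = p.name[2:] if p.name.startswith("._") else p.name
            category = attack_category(name)
            if category is not None and name not in seen:
                seen.add(name)
                per_attack[category] = per_attack.get(category, 0) + 1
    summary["Attack_Data_Master"] = per_attack
    return summary


if __name__ == "__main__":
    # Hypothetical path; point this at the extracted ADFA-LD folder.
    print(summarize("ADFA-LD"))
```

If the extracted copy matches the listing above, the training and validation counts should correspond to the UTD-n ranges given there (0001 through 0833 and 0001 through 4372), and the attack categories should be the six names listed under Attack_Data_Master.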