An Application of the Proactive Forest Algorithm to the Detection of Malicious Bots

Daniel Pardo Echevarría; Nayma Cepero Pérez; Humberto Díaz Pando

Authors

Daniel Pardo Echevarría, Technological University of Havana "José Antonio Echeverría"
Nayma Cepero Pérez, Technological University of Havana "José Antonio Echeverría"
Humberto Díaz Pando, Technological University of Havana "José Antonio Echeverría"

Keywords:

Bot detection, classification, decision forest, decision tree

Abstract

Malicious bots are computer programs that simulate human activity and are used to carry out cyber-attacks. They affect many web services, and multiple approaches have been developed to detect them. Machine learning algorithms, especially those that build classifier models through supervised learning, have had a great impact on this task. This work proposes applying the Proactive Forest (PF) algorithm to the detection of malicious bots. Its performance is evaluated as the percentage of instances correctly classified as malicious bot or human user. It is also compared with the Random Forest (RF) algorithm, which likewise generates a decision forest and was used for malicious bot detection in an article from the state of the art. The results show a maximum performance for the Proactive Forest algorithm of 63.14% of correctly classified instances.
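The evaluation measure named in the abstract, the percentage of correctly classified instances, can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `percent_correct`, the label strings, and the two prediction lists are hypothetical, and the toy labels are invented for the example rather than taken from the paper's data.

```python
def percent_correct(y_true, y_pred):
    """Return the percentage of instances whose predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot score an empty set of instances")
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return 100.0 * hits / len(y_true)


# Toy ground truth and predictions, invented for illustration only.
y_true = ["bot", "human", "bot", "human", "bot", "human", "human", "bot"]
# Hypothetical outputs of two classifiers being compared on the same instances.
pred_a = ["bot", "human", "human", "human", "bot", "bot", "human", "bot"]
pred_b = ["bot", "bot", "human", "human", "bot", "bot", "human", "human"]

for name, pred in [("classifier A", pred_a), ("classifier B", pred_b)]:
    print(f"{name}: {percent_correct(y_true, pred):.2f} % correctly classified")
```

Scoring both classifiers against the same labeled instances is what makes the two percentages directly comparable.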
