Homogeneous multiclassifier for bot detection in e-commerce

Hélder João Chissingui; Nayma Cepero Pérez; Humberto Díaz Pando; Mailyn Moreno Espino

Authors

Hélder João Chissingui Military Technical Higher Institute – ISTM
Nayma Cepero Pérez Technological University of Havana "José Antonio Echeverría"
Humberto Díaz Pando Technological University of Havana "José Antonio Echeverría"
Mailyn Moreno Espino Technological University of Havana "José Antonio Echeverría"

Keywords:

Bot detection, meta learning, multiclassifiers, e-commerce

Abstract

Mitigating bot threats is an important task in electronic commerce. Malicious activities carried out by bots, and through them by the people who operate them, damage IT infrastructure, cause economic losses, and increase dissatisfaction among human users. The problem is becoming more complex: human users sometimes access business services through mobile applications tied to their user accounts, and bots are increasingly sophisticated, so under certain circumstances the activity patterns of humans share the same characteristics as those of bots. As a result, detection is becoming both harder and more critical. This study proposes a detection approach based on supervised learning that uses homogeneous classifier ensembles, Bagging and Boosting. The models, built on the ExtraTree, CART, and k-nearest neighbors estimators, achieved a maximum F1 score of 100% in certain scenarios in which the minority class accounts for no more than 9% of the data set. The results are compared with other state-of-the-art approaches.
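
To make the idea of a homogeneous Bagging ensemble concrete, the following is a minimal sketch and not the authors' implementation. It assumes a k-nearest neighbors base estimator written in plain Python, synthetic two-feature "sessions" in which roughly 9% are bots, and hypothetical helper names (`make_sessions`, `build_ensemble`, `ensemble_predict`). It trains every ensemble member on a bootstrap sample, combines them by majority vote, and reports the F1 score on the minority (bot) class.

```python
import random
from collections import Counter

def knn_predict(train, x, k=3):
    """Classify x by majority vote among its k nearest training rows."""
    nearest = sorted(
        train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x))
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def build_ensemble(train, n_estimators=15, seed=0):
    """Bagging step: one bootstrap sample (drawn with replacement) per member."""
    rng = random.Random(seed)
    return [[rng.choice(train) for _ in train] for _ in range(n_estimators)]

def ensemble_predict(ensemble, x, k=3):
    """Homogeneous ensemble: every member is a k-NN; members vote on the label."""
    votes = [knn_predict(sample, x, k) for sample in ensemble]
    return Counter(votes).most_common(1)[0][0]

def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def make_sessions(n, bot_fraction, rng):
    """Synthetic (features, label) rows; label 1 = bot. Purely illustrative data."""
    rows = []
    for _ in range(n):
        is_bot = rng.random() < bot_fraction
        center = 3.0 if is_bot else 0.0
        features = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
        rows.append((features, int(is_bot)))
    return rows

rng = random.Random(42)
train = make_sessions(300, 0.09, rng)   # about 9% minority (bot) class
test = make_sessions(100, 0.09, rng)

ensemble = build_ensemble(train)
y_true = [label for _, label in test]
y_pred = [ensemble_predict(ensemble, x) for x, _ in test]
print(f"F1 on the bot class: {f1_score(y_true, y_pred):.2f}")
```

F1 is computed on the bot class only because, with a minority class this small, plain accuracy would look high even for a model that never flags a bot. A Boosting variant would instead train members sequentially and reweight misclassified examples, which this sketch does not show.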
