Comparative analysis of rule learning algorithms to identify indicators that influence low industrial yield

Yohan Gil Rodriguez; Raisa Socorro Llanes; Alejandro Rosete Suárez; Lisandra Bravo Ilisástigui

Authors

Yohan Gil Rodriguez, Empresa de Soluciones Informáticas DATAZUCAR
Raisa Socorro Llanes, Universidad Tecnológica de la Habana "José Antonio Echeverría"
Alejandro Rosete Suárez, Universidad Tecnológica de la Habana "José Antonio Echeverría"
Lisandra Bravo Ilisástigui, Universidad Tecnológica de la Habana "José Antonio Echeverría"

Keywords

Data Mining, CRISP-DM, Industrial Yield, Rule Learning

Abstract

The computerization of sugar industry processes generates abundant data. At present, the programs of the Agro-Industrial Platform in use at AZCUBA have ensured fast, high-quality harvest information and the benefits derived from it. The Cuban sugar industry needs scientific tools and methods that allow the influence of the technological variables of the industrial process on the efficiency of cane sugar manufacturing to be analyzed and quantified with greater precision. It is therefore necessary to discover, from historical sugar harvest data, the main causes of low industrial yields in the cane sugar manufacturing process in Cuba. The CRISP-DM methodology is used to model the data mining process. As a starting point for deeper analysis, rule learning algorithms are compared, and patterns that influence low industrial yields are obtained.
