Trends in linguistic data summarization

LINGUISTIC DATA SUMMARIZATION AND OVERVIEW

Authors

  • Iliana Pérez Pupo, Universidad de Ciencias Informáticas
  • Pedro Yobanis Piñeiro Pérez, Universidad de Ciencias Informáticas
  • Nayma Martín Amaro, Universidad de Ciencias Informáticas
  • Rafael Esteban Bello Pérez, Universidad Central de Las Villas "Marta Abreu"

Keywords:

knowledge discovery; linguistic summaries; linguistic data summarization.

Abstract

Linguistic data summarization techniques have emerged to help discover complex relationships between variables and to present information in natural language. Their development combines artificial intelligence, statistics, and machine learning, among other areas of human knowledge. This research aims to survey the state of the art on the topic so that researchers can analyze trends in summary generation methods, the validation strategies used in the research, and the main application areas. The conclusions identify the need to improve validation methods, emerging techniques for summary generation, and the possibility of using summaries in prediction problems, among others.

Published

2021-03-07

How to cite

Pérez Pupo, I., Piñeiro Pérez, P. Y., Martín Amaro, N., & Bello Pérez, R. E. (2021). Tendencias en la sumarización lingüistica de datos: LINGUISTIC DATA SUMMARIZATION AND OVERVIEW. Revista Cubana De Transformación Digital, 2(1), 79–101. Retrieved from https://rctd.uic.cu/rctd/article/view/105