Trends in Linguistic Data Summarization
LINGUISTIC DATA SUMMARIZATION: AN OVERVIEW
Keywords:
knowledge discovery; linguistic data summarization; linguistic summaries
Abstract
Linguistic data summarization techniques have emerged to help discover complex relationships between variables and to present information in natural language. These techniques combine artificial intelligence, statistics, and machine learning, among other fields. This research reviews the state of the art on the subject so that researchers can analyze its evolution and development trends. It analyzes trends in summary generation methods, the validation strategies used in research, and the main areas of application. The conclusions identify, among other points, the need to improve validation methods, emerging techniques for generating summaries, and the possibility of using summaries in prediction problems.
License
Copyright (c) 2021 Iliana Pérez Pupo, Pedro Yobanis Piñeiro Pérez, Nayma Martín Amaro, Rafael Esteban Bello Pérez
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.