Method of automatic extraction of software requirements from unstructured textual information

Authors

  • Amanda Hernández Carreras Technological University of Havana "José Antonio Echeverría"
  • Alfredo Simón Cuevas Technological University of Havana "José Antonio Echeverría"
  • Anaisa Hernández González Technological University of Havana "José Antonio Echeverría"

Keywords:

requirements capture; automatic requirements extraction; natural language processing

Abstract

Requirements elicitation is one of the most important and critical phases of software development, because its results strongly influence project success. Document analysis is one of the most widely used techniques in this process. Performed manually, this analysis is time-consuming and error-prone, which has motivated research on its automation. Natural Language Processing for Requirements Engineering (NLP4RE) is an area of research and development that seeks to apply Natural Language Processing (NLP) techniques, tools, and resources to the Requirements Engineering (RE) process, to help human analysts carry out various linguistic tasks. This work presents a method for the automatic extraction of software requirements from unstructured textual information. The proposed method covers three approaches: syntactic analysis based on lexical-syntactic patterns, dependency analysis, and a combination of both extraction techniques. Precision, Recall, and F-measure were computed by comparing each extracted requirement with the one written manually by an expert. This comparison used the Levenshtein distance, with 60% as the acceptance threshold. In the results, the pattern-based extraction technique stands out in precision, while the solution that integrates both information extraction techniques stands out in recall and F-measure.
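
The evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the Levenshtein distance is normalized or how extracted requirements are paired with the expert's, so the code below assumes that the 60% threshold is a minimum similarity, that similarity is one minus the distance divided by the length of the longer string, and that pairing is greedy and one-to-one. The function names (levenshtein, similarity, evaluate) are hypothetical.

```python
from typing import List, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) by dynamic programming."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def evaluate(extracted: List[str], reference: List[str],
             threshold: float = 0.6) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) under the assumptions stated above."""
    unmatched = list(reference)
    true_positives = 0
    for candidate in extracted:
        # Assumed pairing: greedily take the closest reference not yet matched.
        best = max(unmatched, key=lambda r: similarity(candidate, r), default=None)
        if best is not None and similarity(candidate, best) >= threshold:
            true_positives += 1
            unmatched.remove(best)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure
```

Calling evaluate(extracted, reference) with two lists of requirement sentences returns the three metrics as a tuple. A different normalization or pairing strategy would change which extractions count as accepted, so the numbers from this sketch are not comparable with those reported in the article.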

Published

2023-03-07

How to Cite

Hernández Carreras, A., Simón Cuevas, A., & Hernández González, A. (2023). Method of automatic extraction of software requirements from unstructured textual information. Revista Cubana De Transformación Digital, 4(1), e203. Retrieved from https://rctd.uic.cu/rctd/article/view/203

Issue

Section

Original articles - Part I