Towards the democratization of machine learning

Ernesto Luis Estevanell-Valladares; Suilan Estevez-Velarde; Alejandro Piad-Morffis; Yoan Gutierrez; Andres Montoyo; Yudivian Almeida-Cruz

Authors

Ernesto Luis Estevanell-Valladares, University of Havana, Faculty of Mathematics and Computer Science
Suilan Estevez-Velarde, University of Havana
Alejandro Piad-Morffis, University of Havana, Faculty of Mathematics and Computer Science
Yoan Gutierrez, University of Alicante, University Institute of Computer Science Research
Andres Montoyo, University of Alicante, Department of Languages and Computer Systems
Yudivian Almeida-Cruz, University of Havana, Faculty of Mathematics and Computer Science

Keywords:

Artificial intelligence, AutoML, automated learning, machine learning

Abstract

Machine learning is a field of artificial intelligence that has recently attracted interest across industry, driven primarily by the rapid growth of computing power and data availability. However, one of the main obstacles to applying it is the need for experts who understand the internal details of the many models that can be used. In this context, a new field of study has emerged, AutoML (Automated Machine Learning), which makes these techniques easier to use for experts from other domains. This paper presents an introduction to the field of AutoML, a brief comparison of existing tools, and a concrete proposal, AutoGOAL, a tool developed by the authors and designed to solve machine learning problems of various kinds. Our proposal is competitive with state-of-the-art tools on classic machine learning problems, and it can be seamlessly deployed in more complex domains, such as natural language processing.
