A Tool Based on Reinforcement Learning and Multi-Agent Systems for Scheduling Problems

Jessica Coto Palacio; Yailen Martínez Jiménez; Ann Nowé

Authors

Jessica Coto Palacio, UEB Hotel Los Caneyes
Yailen Martínez Jiménez, Universidad Central de Las Villas "Marta Abreu"
Ann Nowé, Vrije Universiteit Brussel

Keywords:

Scheduling, Multi-Agent Systems, Industry 4.0, Reinforcement Learning

Abstract

The emergence of Industry 4.0 allows new approaches to solve industrial problems such as the Job Shop scheduling problem. Multi-agent Reinforcement Learning approaches have been shown to be highly promising for handling complex scheduling scenarios. This work proposes a tool based on Reinforcement Learning and Multi-Agent Systems that is easy to use and more attractive to industry. It lets users interact with the learning algorithms so that all constraints of the production floor are carefully included and the objectives can be adapted to real-world scenarios. The user can keep the best solution obtained by a Q-Learning algorithm or adjust it by fixing some operations to meet certain constraints; the tool then optimizes the modified solution, respecting the user's preferences, using one of two possible alternatives. These alternatives are validated on datasets from the operations research problem library (OR-Library), and the experiments show that the modified Q-Learning algorithm obtains the best results.
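For readers unfamiliar with the two ingredients named in the abstract, the following is a minimal sketch, not the authors' implementation. It shows the standard tabular Q-Learning update and a simple way to compute the makespan of a Job Shop schedule from a dispatch order. The function names (`q_update`, `makespan`), the state and action encoding, and the parameter values are illustrative assumptions; the abstract does not specify how the tool represents states, rewards, or user-fixed operations.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.8):
    """Standard tabular Q-Learning update (hypothetical helper).

    q maps (state, action) pairs to values; alpha is the learning rate
    and gamma the discount factor (both values are assumptions).
    """
    # Best estimated value reachable from the next state (0 if terminal).
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    # Move the current estimate toward the bootstrapped target.
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


def makespan(jobs, sequence):
    """Makespan of a Job Shop schedule built from a dispatch order.

    jobs[j] is the ordered list of (machine, duration) operations of job j.
    sequence lists job ids; each occurrence dispatches that job's next
    operation as early as its machine and its predecessor allow.
    """
    next_op = [0] * len(jobs)      # index of the next operation per job
    job_ready = [0] * len(jobs)    # time at which each job is free again
    machine_ready = {}             # time at which each machine is free again
    for j in sequence:
        machine, duration = jobs[j][next_op[j]]
        start = max(job_ready[j], machine_ready.get(machine, 0))
        job_ready[j] = machine_ready[machine] = start + duration
        next_op[j] += 1
    return max(job_ready)


# Toy instance (made up for illustration): 2 jobs, 2 machines.
toy_jobs = [[(0, 3), (1, 2)], [(1, 2), (0, 4)]]
toy_makespan = makespan(toy_jobs, [0, 1, 0, 1])

q_table = defaultdict(float)
first_value = q_update(q_table, "s", "a", 1.0, "s2", ["x"], alpha=0.5, gamma=0.9)
```

In a scheduling setting, a learning agent would typically use a quantity such as the makespan to derive its reward; how the tool described here does so is not stated in the abstract.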
