Self-improving Q-learning based controller for a class of dynamical processes

Musial, Jakub; Stebel, Krzysztof; Czeczot, Jacek

Journal title

Archives of Control Sciences

Year

2021

Volume

vol. 31

Issue

No 3

Affiliations

Musial, Jakub: Silesian University of Technology, Faculty of Automatic Control, Electronics and Computer Science, Department of Automatic Control and Robotics, 44-100 Gliwice, ul. Akademicka 16, Poland
Stebel, Krzysztof: Silesian University of Technology, Faculty of Automatic Control, Electronics and Computer Science, Department of Automatic Control and Robotics, 44-100 Gliwice, ul. Akademicka 16, Poland
Czeczot, Jacek: Silesian University of Technology, Faculty of Automatic Control, Electronics and Computer Science, Department of Automatic Control and Robotics, 44-100 Gliwice, ul. Akademicka 16, Poland

Keywords

process control ; Q-learning algorithm ; reinforcement learning ; intelligent control ; on-line learning

PAS Division

Technical Sciences

Pages

527-551

Publisher

Committee of Automatic Control and Robotics PAS

References

[1] H. Boubertakh, S. Labiod, M. Tadjine and P.Y. Glorennec: Optimization of fuzzy PID controllers using Q-learning algorithm. Archives of Control Sciences, 18(4), (2008), 415–435.
[2] I. Carlucho, M. De Paula, S.A. Villar and G.G. Acosta: Incremental Q-learning strategy for adaptive PID control of mobile robots. Expert Systems With Applications, 80, (2017), 183–199, DOI: 10.1016/j.eswa.2017.03.002.
[3] K. Delchev: Simulation-based design of monotonically convergent iterative learning control for nonlinear systems. Archives of Control Sciences, 22(4), (2012), 467–480.
[4] M. Jelali: An overview of control performance assessment technology and industrial applications. Control Eng. Pract., 14(5), (2006), 441–466, DOI: 10.1016/j.conengprac.2005.11.005.
[5] M. Jelali: Control Performance Management in Industrial Automation: Assessment, Diagnosis and Improvement of Control Loop Performance. Springer-Verlag London, (2013).
[6] H.-K. Lam, Q. Shi, B. Xiao, and S.-H. Tsai: Adaptive PID Controller Based on Q-learning Algorithm. CAAI Transactions on Intelligence Technology, 3(4), (2018), 235–244, DOI: 10.1049/trit.2018.1007.
[7] D. Li, L. Qian, Q. Jin, and T. Tan: Reinforcement learning control with adaptive gain for a Saccharomyces cerevisiae fermentation process. Applied Soft Computing, 11, (2011), 4488–4495, DOI: 10.1016/j.asoc.2011.08.022.
[8] M.M. Noel and B.J. Pandian: Control of a nonlinear liquid level system using a new artificial neural network based reinforcement learning approach. Applied Soft Computing, 23, (2014), 444–451, DOI: 10.1016/j.asoc.2014.06.037.
[9] T. Praczyk: Concepts of learning in assembler encoding. Archives of Control Sciences, 18(3), (2008), 323–337.
[10] M.B. Radac and R.E. Precup: Data-driven model-free slip control of antilock braking systems using reinforcement Q-learning. Neurocomputing, 275, (2017), 317–327, DOI: 10.1016/j.neucom.2017.08.036.
[11] A.K. Sadhu and A. Konar: Improving the speed of convergence of multi-agent Q-learning for cooperative task-planning by a robot-team. Robotics and Autonomous Systems, 92, (2017), 66–80, DOI: 10.1016/j.robot.2017.03.003.
[12] N. Sahebjamnia, R. Tavakkoli-Moghaddam, and N. Ghorbani: Designing a fuzzy Q-learning multi-agent quality control system for a continuous chemical production line – A case study. Computers & Industrial Engineering, 93, (2016), 215–226, DOI: 10.1016/j.cie.2016.01.004.
[13] K. Stebel: Practical aspects for the model-free learning control initialization. In Proc. of 2015 20th International Conference on Methods and Models in Automation and Robotics (MMAR), Poland, (2015), DOI: 10.1109/MMAR.2015.7283918.
[14] R.S. Sutton and A.G. Barto: Reinforcement Learning: An Introduction. MIT Press, (1998).
[15] S. Syafiie, F. Tadeo, and E. Martinez: Softmax and ε-greedy policies applied to process control. IFAC Proceedings, 37, (2004), 729–734, DOI: 10.1016/S1474-6670(16)31556-2.
[16] S. Syafiie, F. Tadeo, and E. Martinez: Model-free learning control of neutralization process using reinforcement learning. Engineering Applications of Artificial Intelligence, 20, (2007), 767–782, DOI: 10.1016/j.engappai.2006.10.009.
[17] S. Syafiie, F. Tadeo, and E. Martinez: Learning to control pH processes at multiple time scales: performance assessment in a laboratory plant. Chemical Product and Process Modeling, 2(1), (2007), DOI: 10.2202/1934-2659.1024.
[18] S. Syafiie, F. Tadeo, E. Martinez, and T. Alvarez: Model-free control based on reinforcement learning for a wastewater treatment problem. Applied Soft Computing, 11, (2011), 73–82, DOI: 10.1016/j.asoc.2009.10.018.
[19] P. Van Overschee and B. De Moor: RAPID: The End of Heuristic PID Tuning. IFAC Proceedings, 33(4), (2000), 595–600, DOI: 10.1016/S1474-6670(16)38308-8.
[20] M. Wang, G. Bian, and H. Li: A new fuzzy iterative learning control algorithm for single joint manipulator. Archives of Control Sciences, 26(3), (2016), 297–310, DOI: 10.1515/acsc-2016-0017.
[21] Ch.J.C.H. Watkins and P. Dayan: Technical Note: Q-learning. Machine Learning, 8, (1992), 279–292, DOI: 10.1023/A:1022676722315.

Date

2021.09.27

Type

Article

Identifier

DOI: 10.24425/acs.2021.138691 ; ISSN 1230-2384