Dr. Bin Liu (刘斌): Markov decision process and reinforcement learning for intelligent operation and maintenance ---- Academy of Mathematics and Systems Science, Chinese Academy of Sciences

Academy of Mathematics and Systems Science, CAS
Colloquia & Seminars

Speaker:	Dr. Bin Liu (刘斌), Department of Management Science, University of Strathclyde
Inviter:	Qingpei Hu (胡庆培)
Title:	Markov decision process and reinforcement learning for intelligent operation and maintenance
Language:	Chinese
Time & Venue:	2022.12.09 10:00-11:00, Tencent Meeting: 951-760-750
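Abstract:	This talk addresses modeling and optimization problems in intelligent operation and maintenance. Building on earlier research, I will first discuss a finite-horizon condition-based maintenance policy formulated as a Markov decision process. We consider a two-component system whose components degrade with stochastic dependence, and describe the system degradation with a bivariate gamma process. The components are inspected periodically, and a component is replaced when its degradation level exceeds the preventive maintenance threshold. The maintenance problem can be formulated as a Markov decision process and solved by dynamic programming. Unlike the infinite-horizon maintenance policy, the optimal finite-horizon policy is dynamic: it changes at each inspection. We derive properties of the optimal policy theoretically and give the boundaries between the various maintenance actions. I will then briefly discuss the application of online reinforcement learning and deep reinforcement learning to intelligent operation and maintenance, with the ultimate aim of automating operation and maintenance.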
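The finite-horizon formulation in the abstract can be illustrated with a small backward-induction sketch. This is not the speaker's model: it assumes a single component rather than the two-component system with a bivariate gamma process, a gamma increment with integer shape (so the CDF has a closed form), a discretized degradation grid, and a terminal cost charged for a unit found failed at the end of the horizon. All function names, cost values, and parameters are hypothetical and chosen only for illustration.

```python
import math

def erlang_cdf(x, shape, scale):
    """CDF of a gamma distribution with integer shape (Erlang case)."""
    if x <= 0:
        return 0.0
    z = x / scale
    return 1.0 - math.exp(-z) * sum(z**n / math.factorial(n) for n in range(shape))

def increment_pmf(fail_level, step, shape, scale):
    """Probability that one inspection interval adds k grid levels of degradation.

    Entries 0..fail_level-1 cover the bins [k*step, (k+1)*step); the last entry
    lumps all larger increments together (an approximation due to the grid).
    """
    pmf = [erlang_cdf((k + 1) * step, shape, scale) - erlang_cdf(k * step, shape, scale)
           for k in range(fail_level)]
    pmf.append(1.0 - sum(pmf))
    return pmf

def solve_finite_horizon(horizon, fail_level, pmf, c_prev, c_corr):
    """Backward induction over inspections t = horizon-1, ..., 0.

    Actions: 0 = do nothing, 1 = preventive replacement, 2 = corrective
    replacement (forced when the component is found failed).
    """
    value = [[0.0] * (fail_level + 1) for _ in range(horizon + 1)]
    policy = [[0] * (fail_level + 1) for _ in range(horizon)]
    value[horizon][fail_level] = c_corr  # assumed terminal cost for a failed unit
    for t in range(horizon - 1, -1, -1):
        nxt = value[t + 1]

        def expected_next(s):
            # Expected cost-to-go after one more interval of degradation from level s.
            return sum(p * nxt[min(s + k, fail_level)] for k, p in enumerate(pmf))

        after_renewal = expected_next(0)  # a replaced component restarts at level 0
        for s in range(fail_level + 1):
            if s == fail_level:
                value[t][s], policy[t][s] = c_corr + after_renewal, 2
                continue
            keep = expected_next(s)
            replace = c_prev + after_renewal
            if replace < keep:
                value[t][s], policy[t][s] = replace, 1
            else:
                value[t][s], policy[t][s] = keep, 0
    return value, policy

def replacement_thresholds(policy, fail_level):
    """Lowest level at which preventive replacement is chosen, per inspection."""
    return [next((s for s in range(fail_level) if row[s] == 1), fail_level)
            for row in policy]

# Hypothetical parameters: 10 inspections, failure at grid level 10.
pmf = increment_pmf(fail_level=10, step=1.0, shape=2, scale=1.0)
value, policy = solve_finite_horizon(horizon=10, fail_level=10, pmf=pmf,
                                     c_prev=1.0, c_corr=5.0)
print(replacement_thresholds(policy, fail_level=10))
```

Printing the thresholds per inspection shows how the preventive-replacement level may depend on the remaining horizon, which is the sense in which a finite-horizon policy is dynamic. Extending the sketch to two dependent components would require a two-dimensional state and a joint increment distribution, which is beyond this illustration.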