Feedforward beta control in the KSTAR tokamak by deep reinforcement learning

Cited by: 47

Authors:

Seo, Jaemin [1]

Na, Y. S. [1]

Kim, B. [1]

Lee, C. Y. [1]

Park, M. S. [1]

Park, S. J. [1]

Lee, Y. H. [2]

Affiliations:

[1] Seoul Natl Univ, Dept Nucl Engn, Seoul, South Korea

[2] Korea Inst Fus Energy, Daejeon, South Korea

Source:

NUCLEAR FUSION | 2021 / Vol. 61 / Issue 10

Funding:

National Research Foundation of Singapore;

Keywords:

machine learning; reinforcement learning; beta control; data-driven simulation; KSTAR; tokamak; GENERAL AXISYMMETRICAL EQUILIBRIA; RECONSTRUCTION; PARAMETERS; PLASMAS;

DOI:

10.1088/1741-4326/ac121b

Chinese Library Classification (CLC) number:

O35 [Fluid mechanics]; O53 [Plasma physics];

Discipline classification codes:

070204 ; 080103 ; 080704 ;

Abstract:

In this work, we address a new feedforward control scheme for the normalized beta (β_N) in tokamak plasmas, using the deep reinforcement learning (RL) technique. The deep RL algorithm optimizes an artificial decision-making agent that adjusts the discharge scenario to obtain a given target β_N from the state-action-reward sets explored by its own trial and error in a virtual tokamak environment. The virtual environment for the RL training is constructed using a long short-term memory (LSTM) network that imitates the plasma responses to external actuator controls, which is trained using five years' worth of KSTAR experimental data. The RL agent then experiences numerous discharges with different actuator controls in the LSTM simulator, and its internal parameters are optimized in the direction of maximizing the reward. We analyze a series of KSTAR experiments conducted with the RL-determined scenarios to validate the feasibility of the beta control scheme in a real device. We discuss the successes and limitations of feedforward beta control by RL, and suggest a future research path for this area of study.
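The loop described in the abstract (propose a scenario, run it in a simulator, score it against the target β_N, keep what scores better) can be illustrated with a deliberately simplified sketch. Everything in it is an assumption made for illustration and is not taken from the paper: a first-order lag model stands in for the LSTM simulator, a single "heating power" actuator stands in for the real actuator set, and plain hill-climbing random search stands in for the deep RL algorithm. The names `simulate`, `expand`, `reward`, and `optimize_scenario`, and all numerical constants, are hypothetical.

```python
import random

def expand(knots, steps_per_knot=10):
    """Turn a few scenario knots into a piecewise-constant power waveform."""
    return [p for p in knots for _ in range(steps_per_knot)]

def simulate(powers, gain=0.5, tau=5.0, beta0=0.0):
    """Toy surrogate (NOT an LSTM): beta_N relaxes toward gain * power."""
    beta, trace = beta0, []
    for p in powers:
        beta += (gain * p - beta) / tau
        trace.append(beta)
    return trace

def reward(knots, target):
    """Negative mean squared tracking error over the second half of the run."""
    trace = simulate(expand(knots))
    tail = trace[len(trace) // 2:]
    return -sum((b - target) ** 2 for b in tail) / len(tail)

def optimize_scenario(target, n_knots=4, iters=2000, p_max=8.0, seed=0):
    """Hill-climbing random search (a stand-in for deep RL training)."""
    rng = random.Random(seed)
    best = [0.0] * n_knots
    best_r = reward(best, target)
    for _ in range(iters):
        # Perturb every knot and clip to the allowed actuator range.
        cand = [min(p_max, max(0.0, p + rng.gauss(0.0, 0.3))) for p in best]
        r = reward(cand, target)
        if r > best_r:  # keep the candidate only if it scores better
            best, best_r = cand, r
    return best, best_r

if __name__ == "__main__":
    knots, r = optimize_scenario(target=2.0)
    print("scenario knots:", [round(p, 2) for p in knots])
    print("final beta_N:", round(simulate(expand(knots))[-1], 3), "reward:", round(r, 5))
```

The resulting scenario is feedforward in the sense used in the abstract: the whole waveform is fixed before the run, and the simulated plasma state is never fed back into the actuator during the discharge.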

Pages: 14
