Magnetic control of tokamak plasmas through deep reinforcement learning

Nature - Volume 602, Issue 7897 - Pages 414-419 - 2022

Jonas Degrave¹, F. Felici², Jonas Buchli¹, Michael Neunert¹, Brendan Tracey¹, F. Carpanese¹, Timo Ewalds¹, Roland Hafner¹, Abbas Abdolmaleki¹, Diego de Las Casas¹, Craig Donner¹, Leslie Fritz¹, C. Galperti², Andrea Huber¹, James Keeling¹, Maria Tsimpoukelli¹, Jackie Kay¹, A. Merle², J.M. Moret², Seb Noury¹, Federico Pesamosca², David Pfau¹, O. Sauter², C. Sommariva², S. Coda², B.P. Duval², A. Fasoli², Pushmeet Kohli¹, Koray Kavukcuoglu¹, Demis Hassabis¹, Martin Riedmiller¹

¹DeepMind, London, UK

²Swiss Plasma Center - EPFL, Lausanne, Switzerland

Abstract

Nuclear fusion using magnetic confinement, in particular in the tokamak configuration, is a promising path towards sustainable energy. A core challenge is to shape and maintain a high-temperature plasma within the tokamak vessel. This requires high-dimensional, high-frequency, closed-loop control using magnetic actuator coils, further complicated by the diverse requirements across a wide range of plasma configurations. In this work, we introduce a previously undescribed architecture for tokamak magnetic controller design that autonomously learns to command the full set of control coils. This architecture meets control objectives specified at a high level, at the same time satisfying physical and operational constraints. This approach has unprecedented flexibility and generality in problem specification and yields a notable reduction in design effort to produce new plasma configurations. We successfully produce and control a diverse set of plasma configurations on the Tokamak à Configuration Variable [1,2], including elongated, conventional shapes, as well as advanced configurations, such as negative triangularity and ‘snowflake’ configurations. Our approach achieves accurate tracking of the location, current and shape for these configurations. We also demonstrate sustained ‘droplets’ on TCV, in which two separate plasmas are maintained simultaneously within the vessel. This represents a notable advance for tokamak feedback control, showing the potential of reinforcement learning to accelerate research in the fusion domain, and is one of the most challenging real-world systems to which reinforcement learning has been applied.

References

Hofmann, F. et al. Creation and control of variably shaped plasmas in TCV. Plasma Phys. Control. Fusion 36, B277 (1994).

Coda, S. et al. Physics research on the TCV tokamak facility: from conventional to alternative scenarios and beyond. Nucl. Fusion 59, 112023 (2019).

Anand, H., Coda, S., Felici, F., Galperti, C. & Moret, J.-M. A novel plasma position and shape controller for advanced configuration development on the TCV tokamak. Nucl. Fusion 57, 126026 (2017).

Mele, A. et al. MIMO shape control at the EAST tokamak: simulations and experiments. Fusion Eng. Des. 146, 1282–1285 (2019).

Anand, H. et al. Plasma flux expansion control on the DIII-D tokamak. Plasma Phys. Control. Fusion 63, 015006 (2020).

De Tommasi, G. Plasma magnetic control in tokamak devices. J. Fusion Energy 38, 406–436 (2019).

Walker, M. L. & Humphreys, D. A. Valid coordinate systems for linearized plasma shape response models in tokamaks. Fusion Sci. Technol. 50, 473–489 (2006).

Blum, J., Heumann, H., Nardon, E. & Song, X. Automating the design of tokamak experiment scenarios. J. Comput. Phys. 394, 594–614 (2019).

Ferron, J. R. et al. Real time equilibrium reconstruction for tokamak discharge control. Nucl. Fusion 38, 1055 (1998).

Moret, J.-M. et al. Tokamak equilibrium reconstruction code LIUQE and its real time implementation. Fusion Eng. Des. 91, 1–15 (2015).

Xie, Z., Berseth, G., Clary, P., Hurst, J. & van de Panne, M. Feedback control for Cassie with deep reinforcement learning. In 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 1241–1246 (IEEE, 2018).

Akkaya, I. et al. Solving Rubik’s cube with a robot hand. Preprint at https://arxiv.org/abs/1910.07113 (2019).

Bellemare, M. G. et al. Autonomous navigation of stratospheric balloons using reinforcement learning. Nature 588, 77–82 (2020).

Humphreys, D. et al. Advancing fusion with machine learning research needs workshop report. J. Fusion Energy 39, 123–155 (2020).

Bishop, C. M., Haynes, P. S., Smith, M. E., Todd, T. N. & Trotman, D. L. Real time control of a tokamak plasma using neural networks. Neural Comput. 7, 206–217 (1995).

Joung, S. et al. Deep neural network Grad-Shafranov solver constrained with measured magnetic signals. Nucl. Fusion 60, 16034 (2019).

van de Plassche, K. L. et al. Fast modeling of turbulent transport in fusion plasmas using neural networks. Phys. Plasmas 27, 022310 (2020).

Abbate, J., Conlin, R. & Kolemen, E. Data-driven profile prediction for DIII-D. Nucl. Fusion 61, 046027 (2021).

Kates-Harbeck, J., Svyatkovskiy, A. & Tang, W. Predicting disruptive instabilities in controlled fusion plasmas through deep learning. Nature 568, 526–531 (2019).

Jardin, S. Computational Methods in Plasma Physics (CRC Press, 2010).

Grad, H. & Rubin, H. Hydromagnetic equilibria and force-free fields. J. Nucl. Energy (1954) 7, 284–285 (1958).

Carpanese, F. Development of Free-boundary Equilibrium and Transport Solvers for Simulation and Real-time Interpretation of Tokamak Experiments. PhD thesis, EPFL (2021).

Abdolmaleki, A. et al. Relative entropy regularized policy iteration. Preprint at https://arxiv.org/abs/1812.02256 (2018).

Paley, J. I., Coda, S., Duval, B., Felici, F. & Moret, J.-M. Architecture and commissioning of the TCV distributed feedback control system. In 2010 17th IEEE-NPSS Real Time Conference 1–6 (IEEE, 2010).

Freidberg, J. P. Plasma Physics and Fusion Energy (Cambridge Univ. Press, 2008).

Hommen, G. D. et al. Real-time optical plasma boundary reconstruction for plasma position control at the TCV Tokamak. Nucl. Fusion 54, 073018 (2014).

Austin, M. E. et al. Achievement of reactor-relevant performance in negative triangularity shape in the DIII-D tokamak. Phys. Rev. Lett. 122, 115001 (2019).

Kolemen, E. et al. Initial development of the DIII–D snowflake divertor control. Nucl. Fusion 58, 066007 (2018).

Anand, H. et al. Real time magnetic control of the snowflake plasma configuration in the TCV tokamak. Nucl. Fusion 59, 126032 (2019).

Wigbers, M. & Riedmiller, M. A new method for the analysis of neural reference model control. In Proc. International Conference on Neural Networks (ICNN’97) Vol. 2, 739–743 (IEEE, 1997).

Berkenkamp, F., Turchetta, M., Schoellig, A. & Krause, A. Safe model-based reinforcement learning with stability guarantees. In 2017 Advances in Neural Information Processing Systems 908–919 (ACM, 2017).

Wabersich, K. P., Hewing, L., Carron, A. & Zeilinger, M. N. Probabilistic model predictive safety certification for learning-based control. IEEE Trans. Automat. Control 67, 176–188 (2021).

Abdolmaleki, A. et al. On multi-objective policy optimization as a tool for reinforcement learning. Preprint at https://arxiv.org/abs/2106.08199 (2021).

Coda, S. et al. Overview of the TCV tokamak program: scientific progress and facility upgrades. Nucl. Fusion 57, 102011 (2017).

Karpushov, A. N. et al. Neutral beam heating on the TCV tokamak. Fusion Eng. Des. 123, 468–472 (2017).

Lister, J. B. et al. Plasma equilibrium response modelling and validation on JT-60U. Nucl. Fusion 42, 708 (2002).

Lister, J. B. et al. The control of tokamak configuration variable plasmas. Fusion Technol. 32, 321–373 (1997).

Ulyanov, D., Vedaldi, A. & Lempitsky, V. Instance normalization: the missing ingredient for fast stylization. Preprint at https://arxiv.org/abs/1607.08022 (2016).

Andrychowicz, M. et al. What matters in on-policy reinforcement learning? A large-scale empirical study. In ICLR 2021 Ninth International Conference on Learning Representations (2021).

Cassirer, A. et al. Reverb: a framework for experience replay. Preprint at https://arxiv.org/abs/2102.04736 (2021).

Hoffman, M. et al. Acme: a research framework for distributed reinforcement learning. Preprint at https://arxiv.org/abs/2006.00979 (2020).

Hofmann, F. FBT-a free-boundary tokamak equilibrium code for highly elongated and shaped plasmas. Comput. Phys. Commun. 48, 207–221 (1988).

Abadi, M. et al. TensorFlow: a system for large-scale machine learning. In Proc. 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI ’16) 265–283 (2016).

De Tommasi, G. et al. Model-based plasma vertical stabilization and position control at EAST. Fusion Eng. Des. 129, 152–157 (2018).

Gerkšič, S. & De Tommasi, G. ITER plasma current and shape control using MPC. In 2016 IEEE Conference on Control Applications (CCA) 599–604 (IEEE, 2016).

Boncagni, L. et al. Performance-based controller switching: an application to plasma current control at FTU. In 2015 54th IEEE Conference on Decision and Control (CDC) 2319–2324 (IEEE, 2015).

Wakatsuki, T., Suzuki, T., Hayashi, N., Oyama, N. & Ide, S. Safety factor profile control with reduced central solenoid flux consumption during plasma current ramp-up phase using a reinforcement learning technique. Nucl. Fusion 59, 066022 (2019).

Wakatsuki, T., Suzuki, T., Oyama, N. & Hayashi, N. Ion temperature gradient control using reinforcement learning technique. Nucl. Fusion 61, 046036 (2021).

Seo, J. et al. Feedforward beta control in the KSTAR tokamak by deep reinforcement learning. Nucl. Fusion 61, 106010 (2021).

Yang, F. et al. Launchpad: a programming model for distributed machine learning research. Preprint at https://arxiv.org/abs/2106.04516 (2021).

Muldal, A. et al. dm_env: a Python interface for reinforcement learning environments. http://github.com/deepmind/dm_env (2019).

Reynolds, M. et al. Sonnet: TensorFlow-based neural network library. http://github.com/deepmind/sonnet (2017).

Abadi, M. et al. TensorFlow: large-scale machine learning on heterogeneous systems. Software available from https://www.tensorflow.org/ (2015).

Hender, T. C. et al. Chapter 3: MHD stability, operational limits and disruptions. Nucl. Fusion 47, S128–S202 (2007).
