Reward-based Monte Carlo-Bayesian reinforcement learning for cyber preventive maintenance

Computers & Industrial Engineering - Volume 126 - Pages 578-594 - 2018
Theodore T. Allen1, Sayak Roychowdhury2, Enhao Liu1
1The Ohio State University, Integrated Systems Engineering, 1971 Neil Avenue – 210 Baker Systems, Columbus, OH 43221, United States
2Indian Institute of Technology, Kharagpur, Industrial and Systems Engineering, Kharagpur 721302, India

References

Afful-Dadzie, 2016, Control charting methods for autocorrelated cyber vulnerability data, Quality Engineering, 28, 313, 10.1080/08982112.2015.1125926

Afful-Dadzie, 2014, Data-driven cyber-vulnerability maintenance policies, Journal of Quality Technology, 46, 234, 10.1080/00224065.2014.11917967

Allen, 2011

Allen, 2017, Timely decision analysis enabled by efficient social media modeling, Decision Analysis, 14, 250, 10.1287/deca.2017.0360

Bakker, 2003, Task clustering and gating for Bayesian multitask learning, Journal of Machine Learning Research, 4, 83

Bellman, 1957

Cheng, 2017, Joint optimization of lot sizing and condition-based maintenance for multi-component production systems, Computers & Industrial Engineering, 110, 538, 10.1016/j.cie.2017.06.033

Cockburn, 2009, Websites here, websites there, websites everywhere..., But are they secure?, The Quaestor Quarterly, 4, 1

Clarke, 2016, Conclusion: Key Themes for the Next President, The ANNALS of the American Academy of Political and Social Science, 668, 212, 10.1177/0002716216675825

Chen, 1997, Statistical applications of the Poisson-binomial and conditional Bernoulli distributions, Statistica Sinica, 875

Delage, 2010, Percentile optimization for Markov decision processes with parameter uncertainty, Operations Research, 58, 203, 10.1287/opre.1080.0685

Duff, 2002

Ghavamzadeh, 2015, Bayesian reinforcement learning: A survey, Foundations and Trends® in Machine Learning, 8, 359, 10.1561/2200000049

Harrell, 1982, A new distribution-free quantile estimator, Biometrika, 69, 635, 10.1093/biomet/69.3.635

Hou, 2015

Kato, 2010, Conic programming for multitask learning, IEEE Transactions on Knowledge and Data Engineering, 22, 957, 10.1109/TKDE.2009.142

Ponemon Institute (2017). Cost of cybercrime study: United States. https://www.accenture.com/us-en/insight-cost-of-cybercrime-2017.

Poupart, 2006, An analytic solution to discrete Bayesian reinforcement learning, Proceedings of the 23rd International Conference on Machine Learning, 697, 10.1145/1143844.1143932

Puterman, 2014

Roychowdhury, S. (2017). Data-Driven Policies for Manufacturing Systems and Cyber Vulnerability Maintenance (Doctoral dissertation, The Ohio State University).

Smallwood, 1973, The optimal control of partially observable Markov processes over a finite horizon, Operations Research, 21, 1071, 10.1287/opre.21.5.1071

Spaan, 2005, PERSEUS: Randomized point-based value iteration for POMDPs, Journal of Artificial Intelligence Research, 24, 195, 10.1613/jair.1659

Srinivasan, 2013, Value of condition monitoring in infrastructure maintenance, Computers & Industrial Engineering, 66, 233, 10.1016/j.cie.2013.05.022

Sutton, 1998, Vol. 135

Walraven, E. (2017), https://github.com/AlgTUDelft/SolvePOMDP (accessed 8-24-2018).

Wang, Y., Won, K. S., Hsu, D. and Lee, W. S. (2012). Monte Carlo Bayesian Reinforcement Learning. arXiv preprint arXiv:1206.6449.

Wiering, M., & Van Otterlo, M. (2012). Reinforcement learning. Adaptation, Learning, and Optimization, 12.

Wilson, 2007, Multi-task reinforcement learning: a hierarchical Bayesian approach, Proceedings of the 24th international conference on Machine learning, 1015, 10.1145/1273496.1273624