HYEONG SOO CHANG

Professor,
Department of Computer Science and Engineering,
Sogang University, Seoul, Korea.

RESEARCH INTERESTS

      Markov Decision Processes, Stochastic Game Theory, Stochastic Optimization,
      Computational Intelligence & Learning Theory

PUBLICATIONS

Books
  1. H. S. Chang, J. Hu, M. Fu, and S. I. Marcus, "Simulation-based Algorithms for Markov Decision Processes,"
    Springer, 2nd Edition, ISBN: 978-1-4471-5021-3, 2013. (1st Edition: ISBN: 978-1-84628-689-6, 2007).
Journals
  1. R. L. Givan, E. K. P. Chong, and H. S. Chang, "Scheduling Multiclass Packet Streams to Minimize Weighted Loss,"
    Queueing Systems, Vol. 41, No. 3, 2002, pp. 241-270.
  2. H. S. Chang, P. Fard, S. I. Marcus, and M. Shayman, "Multi-time Scale Markov Decision Processes,"
    IEEE Trans. on Automatic Control, Vol. 48, No. 6, 2003, pp. 976-987.
  3. H. S. Chang and S. I. Marcus, "Approximate Receding Horizon Approach for Markov Decision Processes: Average Reward Case,"
    J. of Mathematical Analysis and Applications, Vol. 286, No. 2, 2003, pp. 636-651.
  4. H. S. Chang, "Localization and a Distributed Local Optimal Solution Algorithm for a Class of Multi-agent Markov Decision Processes,"
    Int. J. of Control, Automation, and Systems, Vol. 1, No. 3, 2003, pp. 358-367.
  5. H. S. Chang and S. I. Marcus, "Two-Person Zero-Sum Markov Games: Receding Horizon Approach,"
    IEEE Trans. on Automatic Control, Vol. 48, No. 11, 2003, pp. 1951-1961.
  6. H. S. Chang, R. L. Givan, and E. K. P. Chong, "Parallel Rollout for Online Solution of Partially Observable Markov Decision Processes,"
    Discrete Event Dynamic Systems, Vol. 14, No. 3, 2004, pp. 309-341.
  7. H. S. Chang, "On Ordinal Comparison of Policies in Markov Reward Processes,"
    J. of Optimization Theory and Applications, Vol. 122, No. 1, 2004, pp. 207-217.
  8. H. S. Chang, "Multi-policy Iteration with a Distributed Voting,"
    Mathematical Methods of Operations Research, Vol. 60, No. 2, 2004.11, pp. 299-310.
  9. H. S. Chang, "A Model for Multi-time Scaled Sequential Decision Making Processes with Adversary,"
    Mathematical and Computer Modelling of Dynamical Systems, Vol. 10, No. 3-4, 2004.12, pp. 287-302.
  10. H. S. Chang, M. Fu, J. Hu, and S. I. Marcus, "An Adaptive Sampling Algorithm for Solving Markov Decision Processes,"
    Operations Research, Vol. 53, No. 1, 2005.1, pp. 126-139.
    • ORMS Today, Vol. 43, No. 5, 2016.10, pp. 24-29: "Google DeepMind's AlphaGo: O.R.'s unheralded role in the path-breaking achievement"
    • UMD News: Contribution to Google's AlphaGo AI system | Researchers write about Google's AlphaGo for OR/MS Today
    • WIKIPEDIA: Monte Carlo Tree Search (MCTS) (cf. the history of MCTS and the exploration-exploitation trade-off)
    • The tutorial paper "Markov Decision Processes, AlphaGo, and Monte Carlo Tree Search: Back to the Future" by Michael Fu,
      in Leading Developments from INFORMS Communities: INFORMS Tutorials in Operations Research, 2017.10, pp. 68-88. Abstract:

      In 2016, a computer Go-playing program called AlphaGo stunned the (human) world by winning a match (4 games to 1) against the reigning human world champion, a feat more impressive than previous victories by computer programs in chess (Deep Blue) and the TV game show Jeopardy! (Watson). The main engine behind AlphaGo combines machine learning approaches in the form of deep neural networks with a technique called Monte Carlo tree search, whose roots can be traced back to an adaptive multistage sampling simulation-based algorithm for Markov decision processes (MDPs) published in Operations Research in 2005 [H. S. Chang, M. C. Fu, J. Hu, and S. I. Marcus. An adaptive sampling algorithm for solving Markov decision processes. Operations Research 53(1):126-139, 2005] (and introduced even earlier in 2002). This tutorial describes AlphaGo and the simulation-based MDP algorithm, as well as providing contextual and historical background material for both, and uses simple examples to illustrate the main ideas behind Monte Carlo tree search.
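
      (A minimal code sketch illustrating the adaptive multistage sampling idea behind this paper is given after the journal list below.)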


  11. H. S. Chang, "Multi-policy Improvement in Stochastic Optimization with Forward Recursive Function Criteria,"
    J. of Mathematical Analysis and Applications, Vol. 305, No. 1, 2005.5, pp. 130-139.
  12. H. S. Chang, "On the Probability of Correct Selection by Distributed Voting in Stochastic Optimization,"
    J. of Optimization Theory and Applications, Vol. 125, No. 1, 2005.4, pp. 231-240.
  13. H. S. Chang, "Error Bounds for Finite Step Approximations for Solving Infinite Horizon Controlled Markov Set-Chains,"
    IEEE Trans. on Automatic Control, Vol. 50, No. 9, 2005.9, pp. 1413-1418.
  14. H. S. Chang, H.-G. Lee, M. Fu, and S. I. Marcus, "Evolutionary Policy Iteration for Solving Markov Decision Processes,"
    IEEE Trans. on Automatic Control, Vol. 50, No. 11, 2005.11, pp. 1804-1808.
  15. H. S. Chang, "Converging Marriage in Honey-Bees Optimization and Application to Stochastic Dynamic Programming,"
    J. of Global Optimization, Vol. 35, No. 3, 2006.7, pp. 423-441.
  16. H. S. Chang, "On Convergence Rate of the Shannon Entropy Rate of Ergodic Markov Chains via Sample-path Simulation,"
    Statistics & Probability Letters, Vol. 76, No. 12, 2006.7, pp. 1261-1264.
  17. H. S. Chang, "Perfect Information Two-Person Zero-Sum Markov Games with Imprecise Transition Probabilities,"
    Mathematical Methods of Operations Research, Vol. 64, No. 2, 2006.10, pp. 335-351.
  18. H. S. Chang, "A Policy Improvement Method in Constrained Stochastic Dynamic Programming,"
    IEEE Trans. on Automatic Control, Vol. 51, No. 9, 2006.9, pp. 1523-1526.
  19. H. S. Chang, M. Fu, J. Hu, and S. I. Marcus, "A Survey of Some Simulation-Based Algorithms for Markov Decision Processes,"
    Communications in Information and Systems, Vol. 7, No. 1, 2007.7, pp. 59-92.
  20. H. S. Chang, "A Policy Improvement Method for Constrained Average Markov Decision Processes,"
    Operations Research Letters, Vol. 35, No. 4, 2007.7, pp. 434-438.
  21. H. S. Chang, M. Fu, J. Hu, and S. I. Marcus, "An Asymptotically Efficient Simulation-Based Algorithm for Finite Horizon Stochastic Dynamic Programming,"
    IEEE Trans. on Automatic Control, Vol. 52, No. 1, 2007.1, pp. 89-94.
  22. H. S. Chang and E. K. P. Chong, "Solving Controlled Markov Set-Chains with Discounting via Multi-policy Improvement,"
    IEEE Trans. on Automatic Control, Vol. 52, No. 3, 2007.3, pp. 564-569.
  23. H. S. Chang, M. Fu, J. Hu, and S. I. Marcus, "Recursive Learning Automata Approach to Markov Decision Processes,"
    IEEE Trans. on Automatic Control, Vol. 52, No. 7, 2007.7, pp. 1349-1355.
  24. H. S. Chang, "Finite Step Approximation Error Bounds for Solving Average Reward Controlled Markov Set-Chains,"
    IEEE Trans. on Automatic Control, Vol. 53, No. 1, 2008.2, pp. 350-355.
  25. H. S. Chang, "Converging Co-Evolutionary Algorithm for Two-Person Zero-Sum Discounted Markov Games with Perfect Information,"
    IEEE Trans. on Automatic Control, Vol. 53, No. 2, 2008.3, pp. 596-601.
  26. H. S. Chang, "Decentralized Learning in Finite Markov Chains: Revisited,"
    IEEE Trans. on Automatic Control, Vol. 54, No. 7, 2009.7, pp. 1648-1653.
  27. H. S. Chang, J. Hu, M. Fu, and S. I. Marcus, "Adaptive Adversarial Multi-Armed Bandit Approach to Two-Person Zero-Sum Markov Games,"
    IEEE Trans. on Automatic Control, Vol. 55, No. 2, 2010.2, pp. 463-468.
  28. J. Hu, H. S. Chang, M. Fu, and S. I. Marcus, "Dynamic Sample Budget Allocation in Model-Based Optimization,"
    J. of Global Optimization, Vol. 50, No. 4, 2011.8, pp. 575-596.
  29. J. Hu, P. Hu, and H. S. Chang, "A Stochastic Approximation Framework for a Class of Randomized Optimization Algorithms,"
    IEEE Trans. on Automatic Control, Vol. 57, No. 1, 2012.1, pp. 165-178.
  30. H. S. Chang, "A Policy Iteration Heuristic for Constrained Discounted Controlled Markov Chains,"
    Optimization Letters, Vol. 6, No. 7, 2012.10, pp. 1573-1577.
  31. J. Hu and H. S. Chang, "Approximate Stochastic Annealing for Online Control of Infinite Horizon Markov Decision Processes,"
    Automatica, Vol. 48, No. 9, 2012.9, pp. 2182-2188.
  32. H. S. Chang and J. Hu, "On the Probability of Correct Selection in Ordinal Comparison over Dynamic Networks,"
    J. of Optimization Theory and Applications, Vol. 155, No. 2, 2012.11, pp. 594-604.
  33. H. S. Chang, "On Functional Equations for Kth Best Policies in Markov Decision Processes,"
    Automatica, Vol. 49, No. 1, 2013.1, pp. 297-300.
  34. H. S. Chang, "Policy Set Iteration for Markov Decision Processes,"
    Automatica, Vol. 49, No. 12, 2013.12, pp. 3687-3689.
  35. H. S. Chang, "A Necessary Condition for Nash Equilibrium in Two-Person Zero-Sum Constrained Stochastic Games,"
    Game Theory, Vol. 2013, Article ID 290427, 5 pages, 2013.12.
  36. H. S. Chang, "On Modification of Population-based Search Algorithms for Convergence in Stochastic Combinatorial Optimization,"
    Optimization, Vol. 64, No. 7, 2015.7, pp. 1647-1655.
  37. H. S. Chang, "An Exact Iterative Search Algorithm for Constrained Markov Decision Processes,"
    Automatica, Vol. 50, No. 5, 2014.5, pp. 1531-1534.
  38. H. S. Chang, "Value Set Iteration for Markov Decision Processes,"
    Automatica, Vol. 50, No. 7, 2014.7, pp. 1940-1943 (see Corrigendum).
  39. H. S. Chang, "Sleeping Experts and Bandits Approach to Constrained Markov Decision Processes,"
    Automatica, Vol. 63, No. 1, 2016.1, pp. 182-186.
  40. H. S. Chang and S. Choe, "Combining Multiple Strategies for Multi-armed Bandit Problems and Asymptotic Optimality,"
    Journal of Control Science and Engineering, Vol. 2015, Article ID 264953, 7 pages, 2015.3.
  41. H. S. Chang, "Random Search for Constrained Markov Decision Processes with Multi-policy Improvement,"
    Automatica, Vol. 58, No. 8, 2015.8, pp. 127-130.
  42. H. S. Chang, "Value Set Iteration for Two-Person Zero-Sum Markov Games,"
    Automatica, Vol. 76, No. 2, 2017.2, pp. 61-64.
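
  A note on journal publication 10: the tutorial abstract quoted above traces Monte Carlo tree search back to the adaptive
  multistage sampling (AMS) algorithm of that paper. The Python sketch below is only a rough illustration of the core idea,
  not code from the paper; it omits the original algorithm's discount factor and nonzero terminal rewards, and the helper
  functions actions(state), sample_next(state, a), and reward(state, a) are hypothetical stand-ins that a user would supply.

      import math

      def ams_value(state, stage, horizon, actions, sample_next, reward, budget=32):
          """Adaptive multistage-sampling estimate of the optimal value of
          `state` at `stage` in a finite-horizon MDP (illustrative sketch).
          Caller-supplied stand-ins:
              actions(state)        -> list of feasible actions
              sample_next(state, a) -> one sampled successor state
              reward(state, a)      -> one sampled one-step reward
          """
          if stage == horizon:
              return 0.0  # terminal value taken to be zero in this sketch
          acts = actions(state)
          assert budget >= len(acts), "need at least one sample per action"
          counts = {a: 0 for a in acts}
          sums = {a: 0.0 for a in acts}

          def q_sample(a):
              # one sampled Q-value: one-step reward plus a recursive
              # estimate of the optimal value from the sampled next state
              return reward(state, a) + ams_value(
                  sample_next(state, a), stage + 1, horizon,
                  actions, sample_next, reward, budget)

          for a in acts:  # initialization: sample each action once
              sums[a] += q_sample(a)
              counts[a] += 1
          for n in range(len(acts), budget):  # a UCB1-style rule allocates the rest
              ucb = {a: sums[a] / counts[a]
                        + math.sqrt(2.0 * math.log(n) / counts[a])
                     for a in acts}
              a_star = max(ucb, key=ucb.get)
              sums[a_star] += q_sample(a_star)
              counts[a_star] += 1

          # value estimate: per-action mean Q-values weighted by the fraction of
          # the budget each action received (equivalently, total sum / budget)
          return sum(sums.values()) / budget

  Each call spends `budget` samples at every visited state, so the cost of the plain recursion grows exponentially in the
  horizon; the tree-structured reuse of these statistics is what Monte Carlo tree search adds on top of this sampling rule.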
