Approximation of stationary processes by hidden Markov models

Mathematics of Control, Signals and Systems - Tập 22 - Trang 1-22 - 2010
Lorenzo Finesso1, Angela Grassi1, Peter Spreij2
1ISIB-CNR, Padova, Italy
2Korteweg-de Vries Institute for Mathematics, Universiteit van Amsterdam, Amsterdam, The Netherlands

Tóm tắt

Stochastic realization is still an open problem for the class of hidden Markov models (HMM): given the law Q of an HMM find a finite parametric description of it. Fifty years after the introduction of HMMs, no computationally effective realization algorithm has been proposed. In this paper we direct our attention to an approximate version of the stochastic realization problem for HMMs. We aim at the realization of an HMM of assigned complexity (number of states of the underlying Markov chain) which best approximates, in Kullback Leibler divergence rate, a given stationary law Q. In the special case of Q being the law of an HMM this corresponds to solving the approximate realization problem for HMMs. In general there is no closed form expression of the Kullback Leibler divergence rate, therefore we replace it, as approximation criterion, with the informational divergence between the Hankel matrices of the processes. This not only has the advantage of being easy to compute, while providing a good approximation of the divergence rate, but also makes the problem amenable to the use of nonnegative matrix factorization (NMF) techniques. We propose a three step algorithm, based on the NMF, which realizes an optimal HMM. The viability of the algorithm as a practical tool is tested on a few examples of HMM order reduction.

Tài liệu tham khảo

Anderson BDO (1999) The realization problem for hidden Markov models. Math Control Signals Syst 12: 80–120 Baum LE, Petrie T (1966) Statistical inference for probabilistic functions of finite Markov chains. Ann Math Stat 37: 1554–1563 Blackwell D (1957) The entropy of functions of finite-state Markov chains Trans. of the first Prague conference on information theory, statistical decision functions, Random Processes, pp 13–20 Carlyle JW (1969) Stochastic finite-state system theory. In: Zadeh L, Polak L (eds) Systems theory, Chapter 10. McGraw-Hill, New York Csiszár I (1975) I-divergence geometry of probability distributions and minimization problems. Ann Probab 3: 146–158 Csiszár I, Tusnády G (1984) Information geometry and alternating minimization procedures. Stat Decis supplement issue 1: 205–237 Finesso L (1990) Consistent estimation of the order for Markov and hidden Markov chains, PhD Thesis Report 91-1, Institute of Systems Research, University of Maryland College Park Finesso L, Spreij PJC (2002) Approximate realization of finite hidden Markov chains. In: Proceedings of the 2002 IEEE information theory workshop, Bangalore, India, pp 90–93 Finesso L, Spreij PJC (2006) Nonnegative matrix factorization and I-divergence alternating minimization. Linear Algebra Appl 416: 270–287 Gray RM (1990) Entropy and information theory. Springer, New York Han G, Marcus B (2006) Analyticity of entropy rate of hidden Markov chains. IEEE Trans Inf Theory 52(12): 5251–5266 Heller A (1965) On stochastic processes derived from Markov chains. Ann Math Stat 36: 1286–1291 Juang BH, Rabiner LR (1985) A probabilistic distance measure for hidden Markov models. AT&T Tech J 64(20): 391–408 Karan M, Anderson BDO, Williamson RC (1993) A note on the calculation of a probabilistic distance between hidden Markov models. In: Proc. ISPACS 93, 2nd international workshop on intelligent signal Proc. Comm. Systems, Sendai, 93–98 Lee DD, Seung HS (1999) Learning the parts of objects by non-negative matrix factorization. Nature 401: 788–791 LeGland F, Mevel L (2000) Exponential forgetting and geometric ergodicity in HMMs. Math Control Signals Syst 13(1): 63–93 Leroux BG (1992) Maximum-likelihood estimation for hidden Markov models. Stoch Process Appl 40: 127–143 Mevel L, Finesso L (2004) Asymptotical statistics of misspecified hidden Markov models. IEEE Trans Autom Control 49(7): 1123–1132 Norris JR (1998) Markov chains. Cambridge University Press, Cambridge Picci G (1978) On the internal structure of finite state stochastic processes. In: Mohler RR, Ruberti A (eds) Recent developments in variable structure systems, economics and biology, Lecture notes in Economics and Mathematical Systems, vol 162, Springer, Berlin, pp 288–304 Picci G, van Schuppen JH (1984) On the weak finite stochastic realization problem. In: Korezlioglu H, Mazziotto G, Szpirglas J (eds) Filtering and control of random processes, Lecture Notes in Control and Information Sciences, vol 61, Springer, New York, pp 237–242 Rabiner LR, Juang BH (1986) An introduction to hidden Markov models. IEEE ASSP Mag 3(1): 4–16 Vanluyten B, Willems JC, De Moor B (2006) Matrix factorization and stochastic state representations. In: Proceedings of the 45th IEEE conference on decision and control, San Diego, pp 4188–4193 Vidyasagar M (2005) The realization problem for hidden Markov models: the complete realization problem. In: Proceedings of the 44th conference on decision and control, Seville, pp 6632–6637 Vidyasagar M (2007) Stochastic modelling over a finite alphabet: approximation using the Kullback-Leibler divergence rate. In: Proceedings of the European control conference 2007, Kos, Paper ThA06.1 Wu CFJ (1983) On the convergence properties of the EM algorithm. Ann Stat 11: 95–103 Software code in R for the numerical implementation of the three step algorithm, http://www.isib.cnr.it/~grassi/HMMs