Automata for learning sequential tasks

New Generation Computing - Volume 16 - Pages 23-53 - 1998
C. H. Ben Choi1
1Department of Electrical Engineering, The Ohio State University, Columbus, USA

Abstract

This paper describes a system that is capable of learning both combinational and sequential tasks. The system learns from sequences of input/output examples in which each pair of input and output represents a step in a task. The system uses finite state machines as its internal models. This paper proposes a method for inferring finite state machines from examples. New algorithms are developed for modifying the finite state machines to allow the system to adapt to changes. In addition, new algorithms are developed to allow the system to handle inconsistent information that may result from noise in the training examples. The system can handle sequential tasks involving long-term dependencies, for which recurrent neural networks have been shown to be inadequate. Moreover, the learned finite state machines are easy to implement in VLSI. The system has a wide range of applications, including but not limited to (a) sequence detection, prediction, and production, (b) intelligent macro systems that can learn rather than simply record sequences of steps performed by a computer user, and (c) design automation systems for designing finite state machines or sequential circuits.
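To make the learning setting concrete, the sketch below shows one elementary way to infer a deterministic transducer (a Mealy-style machine) from input/output example sequences: each training sequence is folded into a prefix tree whose edges record both the input symbol and the observed output. This is only a minimal illustration of the problem setup, not the paper's algorithm; all function and variable names here are hypothetical.

```python
def build_prefix_tree(examples):
    """Infer a prefix-tree transducer from training examples.

    examples: list of step sequences, each a list of (input, output) pairs.
    Returns (transitions, outputs), where transitions[state][inp] -> state
    and outputs[(state, inp)] -> out. Raises ValueError if two examples
    give conflicting outputs for the same state/input (i.e., noisy data,
    which the paper handles with dedicated algorithms not shown here).
    """
    transitions = {0: {}}   # state 0 is the initial state
    outputs = {}
    next_state = 1
    for seq in examples:
        state = 0
        for inp, out in seq:
            if inp in transitions[state]:
                # Input already seen in this state: outputs must agree.
                if outputs[(state, inp)] != out:
                    raise ValueError("inconsistent training examples")
            else:
                # Unseen input: create a fresh state for the new prefix.
                transitions[state][inp] = next_state
                outputs[(state, inp)] = out
                transitions[next_state] = {}
                next_state += 1
            state = transitions[state][inp]
    return transitions, outputs


def run(transitions, outputs, inputs):
    """Replay an input sequence through the inferred machine."""
    state, result = 0, []
    for inp in inputs:
        result.append(outputs[(state, inp)])
        state = transitions[state][inp]
    return result


examples = [[("a", 0), ("b", 1)], [("a", 0), ("a", 0)]]
trans, outs = build_prefix_tree(examples)
print(run(trans, outs, ["a", "b"]))  # [0, 1]
```

A prefix tree of this kind is typically only the starting point: merging compatible states afterwards yields a compact machine that generalizes beyond the training sequences.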
