Memory referencing characteristics and caching performance of AND-Parallel Prolog on shared-memory multiprocessors

New Generation Computing - Tập 7 - Trang 37-58 - 1989
M. Hermenegildo, E. Tick1
1University of Oregon USA

Tóm tắt

This paper presents the performance analysis results for the RAP-WAM AND-Parallel Prolog architecture on shared-memory multiprocessor organizations. The goal of this parallel model is to provide inference speeds beyond those attainable in sequential systems, while supporting conventional logic programming semantics. Special emphasis is placed on sequential performance, storage efficiency, and low control overhead. First, the concepts and techniques used in the parallel execution model are described, along with the general methodology, benchmarks, and simulation tools used for its evaluation. Results are given both at the memory reference level and at the memory organization level. A two-level shared-memory architecture model is presented together with an analysis of various solutions to the cache coherency problem. Finally, RAP-WAM shared-memory simulation results are presented. It is argued that the RAP-WAM model can exploit coherent caches and attain speeds in excess of 2 MLIPS with current shared-memory multiprocessing technology for real applications that exhibit medium degrees of parallelism.

Tài liệu tham khảo

Archibald, J., High Performance Cache Coherence Protocols For Shared-Bus Multiprocessors. Technical Report 86-06-02, University of Washington, Seattle, WA 98195, June 1986. Asakawa, Y., Komatsu, H., Kurokawa, T. and Tamura, N., A Very Fast Prolog Compiler on Multiple Architectures. InFall Joint Computer Conference. ACM and IEEE Computer Society, November 1986. Bitar, P. and Despain, M., Multiprocessor Cache Synchronization. In13th Annual International Symposium on Computer Architecture, pp. 424–433. IEEE Computer Society, June 1986. Borgwardt, P. and Rea, D., Distributed Semi-Intelligent Backtracking for a Stack-Based AND-Parallel Prolog. In1986 International Symposium on Logic Programming, pp. 211–222. IEEE Computer Society, 1986. Butler, R., Lusk, E. L., Olson, R. and Overbeek, R. A., ANLWAM: A Parallel Implementation of the Warren Abstract Machine. Internal report, Argonne National Laboratory, Argonne, IL 60439, 1986. Chang, J. H., Despain, A. M. and DeGroot, D., AND-Parallelism of Logic Programs Based on Static Data Dependency Analysis. InDigest of Papers of COMPCON Spring ’85, pp. 218–225, 1985. Ciepielewski, A. and Haridi, S., Control of Activities in the OR-Parallel Token Machine. In1984 International Symposium on Logic Programming, Atlantic City, pp. 49–58, Silver Spring, MD, February 1984. IEEE Computer Society Press. Clocksin, W. F. and Mellish, C. S.,Programming in Prolog, Springer-Verlag, Berlin, New York, 1981. Codish, M. and Shapiro, E., Compiling OR-Parallelism into AND-Parallelism. InProceedings of the Third International Conference on Logic Programming, number 225 in Lecture Notes in Computer Science, pp. 283–298. Springer-Verlag, July 1986. Conery, J. S.,The AND/OR Process Model for Parallel Interpretation of Logic Programs. PhD thesis, The University of California at Irvine, 1983. Technical Report 204. Conery, J. S.,Parallel Execution of Logic Programs. Kluwer Academic Publishers, Norwell, MA 02061, 1987. Fairchild Corp. Introduction to the CLIPPER Architecture. Fairchild Camera and Instrument Corp., Palo Alto CA 94304, 1986. Degroot, D., Restricted AND-Parallelism. InInternational Conference on Fifth Generation Computer Systems, pp. 471–478, November 1984. DeGroot, D., Restricted AND-Parallelism and Side-Effects. InSymposium on Logic Programming, pp. 80–89. IEEE Computer Society, August 1987. Fagin, B. and Despain, A., Performance Studies of a Parallel Prolog Architecture. In14th Annual International Symposium on Computer Architecture, pp. 108–116. IEEE Computer Society, June 1987. Gabriel, R. P.,Performance and Evaluation of Lisp Systems. MIT Press, Cambridge MA, 1985. Also available from Stanford University Computer Science Dept. as Research Paper 111. Gibson, D. H., Considerations in Block-Oriented Systems Design. InAFIPS Conference Proceedings, Vol. 30, pp. 75–80. Spring Joint Computer Conference, Academic Press, April 1967. Goodman, J. R., Using Cache Memory to Reduce Processor-Memory Traffic. In10th Annual International Symposium on Computer Architecture, pp. 124–131. IEEE Computer Society, 1983. Goto, A., Parallel Inference Machine Research in FGCS Project. InProceedings of the First Japan-U.S. AI Symposium, pp. 21–36, December 1987. Gregory, S.,Parallel Logic Programming in PARLOG: The Language and its Implementation. Addison-Wesley Ltd., Wokingham, England, 1987. Hermenegildo, M. V.,An Abstract Machine Based Execution Model for Computer Architecture Design and Efficient Implementation of Logic Programs in Parallel. PhD thesis, Dept. of Electrical and Computer Engineering (Dept. of Computer Science TR-86-20), University of Texas at Austin, Austin, Texas 78712, August 1986. Hermenegildo, M. V., An Abstract Machine for Restricted AND-parallel Execution of Logic Programs. InProceedings of the Third International Conference on Logic Programming, number 225 in Lecture Notes in Computer Science, pp. 25–40. Springer-Verlag, 1986. Hermenegildo, M. V., Relating Goal Scheduling, Precedence, and Memory Management in AND-Parallel Execution of Logic Programs. InFourth International Conference on Logic Programming. University of Melbourne, MIT Press, Cambridge MA, May 1987. Hermenegildo, M. V. and Nasr, R. I., Efficient Management of Backtracking in AND-parallelism. InProceedings of the Third International Conference on Logic Programming, number 225 in Lecture Notes in Computer Science, pp. 40–55. Springer-Verlag, 1986. Kowalski, R. A., Predicate Logic as a Programming Language. InProceedings IFIPS, pp. 569–574, 1974. Lin, Y.-J.,A Parallel Implementation of Logic Programs. PhD thesis, Dept. of Computer Science, University of Texas at Austin, Austin, Texas 78712, August 1988. Lin, Y.-J., Kumar, V. and Leung, C., An Intelligent Backtracking Algorithm for Parallel Execution of Logic Programs. InProceedings of the Third International Conference on Logic Programming, number 225 in Lecture Notes in Computer Science, pp. 55–69. Springer-Verlag, 1986. Mellish, C. S., Some Global Optimizations for a Prolog Compiler.Journal of Logic Programming, Vol. 2, No. 1, April 1985. Nakashima, H. and Nakajima, K., Hardware Architecture of the Sequential Inference Machine: PSI-II. In1987 International Symposium on Logic Programming, pp. 104–113. IEEE Computer Society, August 1987. Pereira, L. M. and Nasr, R. I., Delta-Prolog: A Distributed Logic Programming Language. InProceedings of the Intl. Conf. on 5th. Gen. Computer Systems, 1984. Japan. Van Roy, P., A Prolog Compiler for the PLM. Master’s thesis, University of California at Berkeley, August 1984. Also available as Technical Report UCB/CSD 84/203. Van Roy, P. and Demoen, B., Improving the Execution Speed of Compiled Prolog with Modes, Clause Selection, and Determinism. Technical report, University of Leuven, August 1986. Shapiro, E. Y., A Subset of Concurrent Prolog and Its Interpreter. Technical Report TR-003, ICOT, 1-4-28 Mita, Minato-ku Tokyo 108, JAPAN, January 1983. Shen, K., An Investigation of the Argonne Model of OR-Parallel Prolog. Master’s thesis, The University of Manchester, November 1986. Tick, E.,Studies In Prolog Architectures. PhD thesis. Stanford University, Stanford, CA 94305, June 1987. Turk, A. K., Compiler Optimizations for the WAM. InProceedings of the Third International Conference on Logic Programming, number 225 in Lecture Notes in Computer Science, pp. 657–662. Springer-Verlag, 1986. Ueda, K.,Guarded Horn Clauses. PhD thesis, University of Tokyo, March 1986. Ueda, K., Making Exhaustive Search Programs Determinstic. InProceedings of the Third International Conference on Logic Programming, number 225 in Lecture Notes in Computer Science, pp. 270–283. Springer-Verlag, 1986. Ueda, K., Making Exhaustive Search Programs Deterministic: Part II. InFourth International Conference on Logic Programming, pp. 356–375. University of Melbourne, MIT Press, Cambridge MA, 1987. Warren, D. H. D.,Applied Logic—Its Use and Implementation as Programming Tool. PhD thesis, University of Edinburgh, 1977. Also available as SRI Technical Note 290. Warren, D. H. D., An Abstract Prolog Instruction Set. Technical Report 309, Artificial Intelligence Center, SRI International, 333 Ravenswood Ave, Menlo Park CA 94025, 1983. Warren, D. H. D., OR-Parallel Execution Models of Prolog. InProceedings of TAPSOFT ’87, Lecture Notes in Computer Science. Springer-Verlag, March 1987. Warren, D. H. D., The SRI Model for OR-Parallel Execution of Prolog—Abstract Design and Implementation. InSymposium on Logic Programming, pp. 92–102. IEEE Computer Society, August 1987. Warren, R., Hermenegildo, M. and Debray, S., On the Practicality of Global Flow Analysis of Logic Programs. InProceedings of the Fifth International Conference and Symposium on Logic Programming, August 1988. Westphal, H. and Robert, P., The PEPSys Model: Combining Backtracking, AND- and OR-Parallelism. InSymposium of Logic Programming, pp. 436–448. IEEE Computer Society, August 1987.