Wilson AW Jr (1987) Hierarchical cache/bus architecture for shared memory multiprocessors. In: Proceedings of the 14th international symposium on computer architecture, pp 244–252
Acacio M, Gonzalez J, Garcia J, Duato J (2005) A two-level directory architecture for highly scalable cc-NUMA multiprocessors. IEEE Trans Parallel Distrib Syst 16(1):67–79
Acacio ME, Gonzalez J, Garcia JM, Duato J (2005) A two-level directory architecture for highly-scalable cc-NUMA multiprocessors. IEEE Trans Parallel Distrib Syst 16(1):67–79
Anderson C, Baer JL (1993) A multi-level hierarchical cache coherence protocol for multiprocessors. In: Proceedings of the 7th international parallel processing symposium, pp 142–148
Angiolini F, Meloni P, Carta S, Benini L, Raffo L (2006) Contrasting a NoC and a traditional interconnect fabric with layout awareness. In: Proc of the design, automation and test in Europe (DATE), pp 124–129
Benini L, Micheli D (2002) Networks on chips: a new SoC paradigm. IEEE Comput 35(1):70–78
Bolotin E, Guz Z, Cidon I, Ginosar R, Kolodny A (2007) The power of priority: NoC based distributed cache coherence. In: Proc of 1st international symposium on networks-on-chip, pp 117–126
Chaike D, Field C, Kurihara K, Agarwal A (1990) Directory-based cache coherence in large-scale multiprocessors. IEEE Comput 23:49–58
Cheng L, Muralimanohar N, Ramani K, Balasubramonian R, Carter JB (2006) Interconnect-aware coherence protocols for chip multiprocessors. ACM SIGARCH Comput Archit News 34(2):339–351
Dally W, Towles B (2001) Route packets, not wires: on-chip interconnection networks. In: Proc of design automation conference, pp 684–689
DeHon A (2000) Compact, multilayer layout for butterfly fat-tree. In: Proceedings of the twelfth annual ACM symposium on parallel algorithms and architectures, SPAA’00, pp 206–215
DeHon A (2004) Unifying mesh- and tree-based programmable interconnect. IEEE Trans Very Large Scale Integr Syst 12(10):1051–1065
Dill DL, Drexler AJ, Hu AJ, Yang CH (1992) Protocol verification as a hardware design aid. In: Proc of international conference on computer design, pp 522–525
Eisley N, Peh LS, Shang L (2006) In-network cache coherence. Comput Archit Lett 5:34–37
Feero B, Pande P (2009) Networks-on-chip in a three-dimensional environment: a performance evaluation. IEEE Trans Comput 58:32–45
Gratz P, Kim C, Sankaralingam K, Hanson H, Shivakumar P, Keckler S, Burger D (2007) On-chip interconnection networks of the trips chip. IEEE MICRO 27(5):41–50
Hennessy JL, Patterson DA (2006) Computer architecture, fourth edition: a quantitative approach. Morgan Kaufmann, San Francisco
Hoskote Y, Vangal S, Singh A, Borkar N, Borkar S (2007) A 5-GHz mesh interconnect for a teraflops processor. IEEE MICRO 27(5):51–61
Kim C, Burger D, Keckler S (2002) An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches. In: Proc of the 10th intl conf on architectural support for programming languages and operating systems, pp 211–222
Leiserson CE (1985) Fat-trees: universal networks for hardware-efficient supercomputing. IEEE Trans Comput 34(10):892–901
Lenoski D, Laudon J, Gharachorloo K, Gupta A, Hennessy J (1990) The directory-based cache coherence protocol for the DASH multiprocessor. In: Proceedings of the 17th annual international symposium on computer architecture, pp 148–159
Li F, Nicopoulos C, Richardson T, Xie Y, Narayanan V, Kandemir M (2006) Design and management of 3D chip multiprocessors using network-in-memory. In: Proc intl symposium on computer architecture, pp 131–140
Ludovici D, Villamon FG, Medardoni S, Requena CG, Gomez ME, Lopez P, Gaydadjiev GN, Bertozzi D (2009) Assessing fat-tree topologies for regular network-on-chip design under nanoscale technology constraints. In: Proc of design, automation and test in Europe (DATE), pp 562–565
Martin MMK, Hill MD, Wood DA (2003) Token coherence: a new framework for shared-memory multiprocessors. IEEE MICRO 23(6):108–116
Martin MMK, Hill MD, Wood DA (2003) Token coherence: decoupling performance and correctness. In: Proc of international symposium on computer architecture (ISCA), pp 182–193
Matsutani H, Koibuchi M, Amano H (2007) Performance, cost and energy evaluation of fat H-tree: a cost efficient tree-based on-chip network. In: Proc of parallel and distributed processing symposium (IPDPS), pp 1–10
Matsutani H, Koibuchi M, Yamada Y, Hsu DF, Amano H (2009) Fat H-tree: a cost efficient tree-based on-chip network. IEEE Trans Parallel Distrib Syst 20(8):1126–1141
Stern U, Dill DL (1995) Automatic verification of the SCI cache coherence protocol. In: Correct hardware design and verification methods. LNCS, vol 987, pp 21–34
Tomasevic M, Milutinovic V (1993) A survey of hardware solutions for maintenance of cache coherence in shared memory multiprocessors. In: Proceeding of the twenty-sixth Hawaii international conference on system sciences, 1993, vol 1, pp 863–872.
Tsui J, Aboelaze M (1996) Single copy vs. multiple copies cache coherence protocols for hierarchical bus multiprocessors. In: Proceedings of the international conference on computers and communications, pp 151–157
Wallach DA (1992) A hierarchical cache coherent protocol. PhD thesis, MIT
Wentzlaff D, Griffin P, Hoffmann H, Bao L, Edwards B, Ramey C, Mattina M, Miao CC, Brown JF III, Agarwal A (2007) On-chip interconnection architecture of the tile processor. IEEE MICRO 27(5):15–31
Yang Q, Thangadurai G, Bhuyan LM (1992) Design of an adaptive cache coherence protocol for large scale multiprocessors. IEEE Trans Parallel Distrib Syst 3(3):281–293
Yousif MS, Das CR, Thazhuthaveetil MJ (1993) A cache coherence protocol for MIN-based multiprocessors with limited inclusion. In: International conference on parallel processing, pp 254–257
Zhang Y, Lu Z, Jantsch A, Li L, Gao M (2009) Towards hierarchical cluster based cache coherence for large-scale network-on-chip. In: 4th intl conference on design and technology of integrated systems in nanoscale era (DTIS), pp 119–122