Welcome to P K Kelkar Library, Online Public Access Catalogue (OPAC)

A primer on memory consistency and cache coherence

By: Sorin, Daniel J.
Contributor(s): Hill, Mark D | Wood, David A.
Material type: Book
Series: Synthesis digital library of engineering and computer science; Synthesis lectures on computer architecture, # 16
Publisher: San Rafael, Calif. (1537 Fourth Street, San Rafael, CA 94901 USA) : Morgan & Claypool, c2011
Description: 1 electronic text (xiii, 197 p.) : ill., digital file
ISBN: 9781608455652 (electronic bk.)
Subject(s): Memory management (Computer science) | Cache memory | Distributed shared memory | Computer architecture | Memory consistency | Cache coherence | Shared memory | Memory systems | Multicore processor | Multiprocessor
DDC classification: 005.43
Online resources: Abstract with links to resource
Also available in print.
Contents:
Preface -- 1. Introduction to consistency and coherence -- Consistency (a.k.a., memory consistency, memory consistency model or memory model) -- Coherence (a.k.a., cache coherence) -- A consistency and coherence quiz -- What this primer does not do --
2. Coherence basics -- Baseline system model -- The problem: how incoherence could possibly occur -- Defining coherence -- Maintaining the coherence invariants -- The granularity of coherence -- The scope of coherence -- References --
3. Memory consistency motivation and sequential consistency -- Problems with shared memory behavior -- What is a memory consistency model -- Consistency vs. coherence -- Basic idea of sequential consistency (SC) -- A little SC formalism -- Naive SC implementations -- A basic SC implementation with cache coherence -- Optimized SC implementations with cache coherence -- Atomic operations with SC -- Putting it all together: MIPS R10000 -- Further reading regarding SC -- References --
4. Total store order and the x86 memory model -- Motivation for TSO/x86 -- Basic idea of TSO/x86 -- A little TSO formalism and an x86 conjecture -- Implementing TSO/x86 -- Atomic instructions and fences with TSO -- Atomic instructions -- Fences -- Further reading regarding TSO -- Comparing SC and TSO -- References --
5. Relaxed memory consistency -- Motivation -- Opportunities to reorder memory operations -- Opportunities to exploit reordering -- An example relaxed consistency model (XC) -- The basic idea of the XC model -- Examples using fences under XC -- Formalizing XC -- Examples showing XC operating correctly -- Implementing XC -- Atomic instructions with XC -- Fences with XC -- A caveat -- Sequential consistency for data-race-free programs -- Some relaxed model concepts -- Release consistency -- Causality and write atomicity -- A relaxed memory model case study: IBM Power -- Further reading and commercial relaxed memory models -- Academic literature -- Commercial models -- Comparing memory models -- How do relaxed memory models relate to each other and TSO and SC -- How good are relaxed models -- High-level language models -- References --
6. Coherence protocols -- The big picture -- Specifying coherence protocols -- Example of a simple coherence protocol -- Overview of coherence protocol design space -- States -- Transactions -- Major protocol design options -- References --
7. Snooping coherence protocols -- Introduction to snooping -- Baseline snooping protocol -- High-level protocol specification -- Simple snooping system model: atomic requests -- Atomic transactions -- Baseline snooping system model: non-atomic requests, atomic transactions -- Running example -- Protocol simplifications -- Adding the exclusive state -- Motivation -- Getting to the exclusive state -- High-level specification of protocol -- Detailed specification -- Running example -- Adding the owned state -- Motivation -- High-level protocol specification -- Detailed protocol specification -- Running example -- Non-atomic bus -- Motivation -- In-order vs. out-of-order responses -- Non-atomic system model -- An MSI protocol with a split-transaction bus -- An optimized, non-stalling MSI protocol with a split-transaction bus -- Optimizations to the bus interconnection network -- Separate non-bus network for data responses -- Logical bus for coherence requests -- Case studies -- Sun Starfire E10000 -- IBM Power5 -- Discussion and the future of snooping -- References --
8. Directory coherence protocols -- Introduction to directory protocols -- Baseline directory system -- Directory system model -- High-level protocol specification -- Avoiding deadlock -- Detailed protocol specification -- Protocol operation -- Protocol simplifications -- Adding the exclusive state -- High-level protocol specification -- Detailed protocol specification -- Adding the owned state -- High-level protocol specification -- Detailed protocol specification -- Representing directory state -- Coarse directory -- Limited pointer directory -- Directory organization -- Directory cache backed by DRAM -- Inclusive directory caches -- Null directory cache (with no backing store) -- Performance and scalability optimizations -- Distributed directories -- Non-stalling directory protocols -- Interconnection networks without point-to-point ordering -- Silent vs. non-silent evictions of blocks in state S -- Case studies -- SGI Origin 2000 -- Coherent HyperTransport -- HyperTransport assist -- Intel QPI -- Discussion and the future of directory protocols -- References --
9. Advanced topics in coherence -- System models -- Instruction caches -- Translation lookaside buffers (TLBs) -- Virtual caches -- Write-through caches -- Coherent direct memory access (DMA) -- Multi-level caches and hierarchical coherence protocols -- Performance optimizations -- Migratory sharing optimization -- False sharing optimizations -- Maintaining liveness -- Deadlock -- Livelock -- Starvation -- Token coherence -- The future of coherence -- References -- Author biographies.
Abstract: Many modern computer systems and most multicore chips (chip multiprocessors) support shared memory in hardware. In a shared memory system, each of the processor cores may read and write to a single shared address space. For a shared memory machine, the memory consistency model defines the architecturally visible behavior of its memory system. Consistency definitions provide rules about loads and stores (or memory reads and writes) and how they act upon memory. As part of supporting a memory consistency model, many machines also provide cache coherence protocols that ensure that multiple cached copies of data are kept up-to-date. The goal of this primer is to provide readers with a basic understanding of consistency and coherence. This understanding includes both the issues that must be solved as well as a variety of solutions. We present both high level concepts as well as specific, concrete examples from real-world systems.
Item type: E books
Current location: PK Kelkar Library, IIT Kanpur
Status: Available
Barcode: EBKE335
Total holds: 0

Mode of access: World Wide Web.

System requirements: Adobe Acrobat Reader.

Part of: Synthesis digital library of engineering and computer science.

Series from website.

Includes bibliographical references.

Abstract freely available; full-text restricted to subscribers or individual document purchasers.

Compendex

INSPEC

Google scholar

Google book search

Title from PDF t.p. (viewed on May 20, 2011).
