000 | 06649nam a22006491i 4500 | ||
---|---|---|---|
001 | 8363085 | ||
003 | IEEE | ||
005 | 20200413152930.0 | ||
006 | m eo d | ||
007 | cr cn |||m|||a | ||
008 | 180525s2018 caua foab 000 0 eng d | ||
020 | _a9781627056182 _qebook | ||
020 | _z9781627059237 _qpaperback | ||
020 | _z9781681733586 _qhardcover | ||
024 | 7 | _a10.2200/S00848ED1V01Y201804CAC044 _2doi | |
035 | _a(CaBNVSL)swl408345 | ||
035 | _a(OCoLC)1037800609 | ||
040 | _aCaBNVSL _beng _erda _cCaBNVSL _dCaBNVSL | ||
050 | 4 | _aT385 _b.A243 2018 | |
082 | 0 | 4 | _a006.6869 _223 |
100 | 1 | _aAamodt, Tor M., _eauthor. | |
245 | 1 | 0 | _aGeneral-purpose graphics processor architectures / _cTor M. Aamodt, Wilson Wai Lun Fung, Timothy G. Rogers. |
264 | 1 | _a[San Rafael, California] : _bMorgan & Claypool, _c2018. | |
300 | _a1 PDF (xvii, 122 pages) : _billustrations. | ||
336 | _atext _2rdacontent | ||
337 | _aelectronic _2isbdmedia | ||
338 | _aonline resource _2rdacarrier | ||
490 | 1 | _aSynthesis lectures on computer architecture, _x1935-3243 ; _v# 44 | |
538 | _aMode of access: World Wide Web. | ||
500 | _aPart of: Synthesis digital library of engineering and computer science. | ||
504 | _aIncludes bibliographical references (pages 103-119). | ||
505 | 0 | _a1. Introduction -- 1.1 The landscape of computation accelerators -- 1.2 GPU hardware basics -- 1.3 A brief history of GPUs -- 1.4 Book outline -- | |
505 | 8 | _a2. Programming model -- 2.1 Execution model -- 2.2 GPU instruction set architectures -- 2.2.1 NVIDIA GPU instruction set architectures -- 2.2.2 AMD graphics core next instruction set architecture -- | |
505 | 8 | _a3. The SIMT core: instruction and register data flow -- 3.1 One-loop approximation -- 3.1.1 SIMT execution masking -- 3.1.2 SIMT deadlock and stackless SIMT architectures -- 3.1.3 Warp scheduling -- 3.2 Two-loop approximation -- 3.3 Three-loop approximation -- 3.3.1 Operand collector -- 3.3.2 Instruction replay: handling structural hazards -- 3.4 Research directions on branch divergence -- 3.4.1 Warp compaction -- 3.4.2 Intra-warp divergent path management -- 3.4.3 Adding MIMD capability -- 3.4.4 Complexity-effective divergence management -- 3.5 Research directions on scalarization and affine execution -- 3.5.1 Detection of uniform or affine variables -- 3.5.2 Exploiting uniform or affine variables in GPU -- 3.6 Research directions on register file architecture -- 3.6.1 Hierarchical register file -- 3.6.2 Drowsy state register file -- 3.6.3 Register file virtualization -- 3.6.4 Partitioned register file -- 3.6.5 RegLess -- | |
505 | 8 | _a4. Memory system -- 4.1 First-level memory structures -- 4.1.1 Scratchpad memory and L1 data cache -- 4.1.2 L1 texture cache -- 4.1.3 Unified texture and data cache -- 4.2 On-chip interconnection network -- 4.3 Memory partition unit -- 4.3.1 L2 cache -- 4.3.2 Atomic operations -- 4.3.3 Memory access scheduler -- 4.4 Research directions for GPU memory systems -- 4.4.1 Memory access scheduling and interconnection network design -- 4.4.2 Caching effectiveness -- 4.4.3 Memory request prioritization and cache bypassing -- 4.4.4 Exploiting inter-warp heterogeneity -- 4.4.5 Coordinated cache bypassing -- 4.4.6 Adaptive cache management -- 4.4.7 Cache prioritization -- 4.4.8 Virtual memory page placement -- 4.4.9 Data placement -- 4.4.10 Multi-chip-module GPUs -- | |
505 | 8 | _a5. Crosscutting research on GPU computing architectures -- 5.1 Thread scheduling -- 5.1.1 Research on assignment of threadblocks to cores -- 5.1.2 Research on cycle-by-cycle scheduling decisions -- 5.1.3 Research on scheduling multiple kernels -- 5.1.4 Fine-grain synchronization aware scheduling -- 5.2 Alternative ways of expressing parallelism -- 5.3 Support for transactional memory -- 5.3.1 Kilo TM -- 5.3.2 Warp TM and temporal conflict detection -- 5.4 Heterogeneous systems -- | |
505 | 8 | _aBibliography -- Authors' biographies. | |
506 | _aAbstract freely available; full-text restricted to subscribers or individual document purchasers. | ||
510 | 0 | _aCompendex | |
510 | 0 | _aINSPEC | |
510 | 0 | _aGoogle scholar | |
510 | 0 | _aGoogle book search | |
520 | 3 | _aOriginally developed to support video games, graphics processor units (GPUs) are now increasingly used for general-purpose (non-graphics) applications ranging from machine learning to mining of cryptographic currencies. GPUs can achieve improved performance and efficiency versus central processing units (CPUs) by dedicating a larger fraction of hardware resources to computation. In addition, their general-purpose programmability makes contemporary GPUs appealing to software developers in comparison to domain-specific accelerators. This book provides an introduction to those interested in studying the architecture of GPUs that support general-purpose computing. It collects together information currently only found among a wide range of disparate sources. The authors led development of the GPGPU-Sim simulator widely used in academic research on GPU architectures. The first chapter of this book describes the basic hardware structure of GPUs and provides a brief overview of their history. Chapter 2 provides a summary of GPU programming models relevant to the rest of the book. Chapter 3 explores the architecture of GPU compute cores. Chapter 4 explores the architecture of the GPU memory system. After describing the architecture of existing systems, Chapters 3 and 4 provide an overview of related research. Chapter 5 summarizes cross-cutting research impacting both the compute core and memory system. This book should provide a valuable resource for those wishing to understand the architecture of graphics processor units (GPUs) used for acceleration of general-purpose applications and to those who want to obtain an introduction to the rapidly growing body of research exploring how to improve the architecture of these GPUs. | |
530 | _aAlso available in print. | ||
588 | _aTitle from PDF title page (viewed on May 25, 2018). | ||
650 | 0 | _aGraphics processing units. | |
650 | 0 | _aComputer architecture. | |
653 | _aGPGPU | ||
653 | _aComputer architecture | ||
655 | 0 | _aElectronic books. | |
700 | 1 | _aFung, Wilson Wai Lun, _eauthor. | |
700 | 1 | _aRogers, Timothy G., _eauthor. | |
776 | 0 | 8 | _iPrint version: _z9781627059237 _z9781681733586 |
830 | 0 | _aSynthesis digital library of engineering and computer science. | |
830 | 0 | _aSynthesis lectures on computer architecture ; _v# 44. _x1935-3243 | |
856 | 4 | 2 | _3Abstract with links to resource _uhttps://ieeexplore.ieee.org/servlet/opac?bknumber=8363085 |
999 | _c562377 _d562377 | ||