000 | 06034nam a2200733 i 4500 | ||
---|---|---|---|
001 | 6828870 | ||
003 | IEEE | ||
005 | 20200413152914.0 | ||
006 | m eo d | ||
007 | cr cn |||m|||a | ||
008 | 140620s2014 caua foab 000 0 eng d | ||
020 | _a9781608459537 _qebook | ||
020 | _z9781608459520 _qpaperback | ||
024 | 7 | _a10.2200/S00581ED1V01Y201405CAC028 _2doi | |
035 | _a(CaBNVSL)swl00403529 | ||
035 | _a(OCoLC)881524638 | ||
040 | _aCaBNVSL _beng _erda _cCaBNVSL _dCaBNVSL | ||
050 | 4 | _aQA76.9.M45 _bF256 2014 | |
082 | 0 | 4 | _a005.43 _223
090 | _a _bMoCl _e201405CAC028 | ||
100 | 1 | _aFalsafi, Babak, _eauthor. | |
245 | 1 | 2 | _aA primer on hardware prefetching / _cBabak Falsafi, Thomas F. Wenisch.
264 | 1 | _aSan Rafael, California (1537 Fourth Street, San Rafael, CA 94901 USA) : _bMorgan & Claypool, _c2014. | |
300 | _a1 PDF (xiv, 53 pages) : _billustrations. | ||
336 | _atext _2rdacontent | ||
337 | _aelectronic _2isbdmedia | ||
338 | _aonline resource _2rdacarrier | ||
490 | 1 | _aSynthesis lectures on computer architecture, _x1935-3243 ; _v# 28 | |
538 | _aMode of access: World Wide Web. | ||
538 | _aSystem requirements: Adobe Acrobat Reader. | ||
500 | _aPart of: Synthesis digital library of engineering and computer science. | ||
500 | _aSeries from website. | ||
504 | _aIncludes bibliographical references (pages 41-52). | ||
505 | 0 | _a1. Introduction -- 1.1 The memory wall -- 1.2 Prefetching -- 1.2.1 Predicting addresses -- 1.2.2 Prefetch lookahead -- 1.2.3 Placing prefetched values -- | |
505 | 8 | _a2. Instruction prefetching -- 2.1 Next-line prefetching -- 2.2 Fetch-directed prefetching -- 2.3 Discontinuity prefetching -- 2.4 Prescient fetch -- 2.5 Temporal instruction fetch streaming -- 2.6 Return-address stack-directed instruction prefetching -- 2.7 Proactive instruction fetch -- | |
505 | 8 | _a3. Data prefetching -- 3.1 Stride and stream prefetchers for data -- 3.2 Address-correlating prefetchers -- 3.2.1 Jump pointers -- 3.2.2 Pair-wise correlation -- 3.2.3 Markov prefetcher -- 3.2.4 Improving lookahead via prefetch depth -- 3.2.5 Improving lookahead via dead block prediction -- 3.2.6 Addressing on-chip storage limitations -- 3.2.7 Global history buffer -- 3.2.8 Stream chaining -- 3.2.9 Temporal memory streaming -- 3.2.10 Irregular stream buffer -- 3.3 Spatially correlated prefetching -- 3.3.1 Delta-correlated lookup -- 3.3.2 Global history buffer PC-localized/delta-correlating (GHB PC/DC) -- 3.3.3 Code-correlated lookup -- 3.3.4 Spatial footprint prediction -- 3.3.5 Spatial pattern prediction -- 3.3.6 Stealth prefetching -- 3.3.7 Spatial memory streaming -- 3.3.8 Spatio-temporal memory streaming -- 3.4 Execution-based prefetching -- 3.4.1 Algorithm summarization -- 3.4.2 Helper-thread and helper-core approaches -- 3.4.3 Run-ahead execution -- 3.4.4 Context restoration -- 3.4.5 Computation spreading -- 3.5 Prefetch modulation and control -- 3.6 Software approaches -- | |
505 | 8 | _a4. Concluding remarks-- Bibliography -- Author biographies. | |
506 | 1 | _aAbstract freely available; full-text restricted to subscribers or individual document purchasers. | |
510 | 0 | _aCompendex | |
510 | 0 | _aINSPEC | |
510 | 0 | _aGoogle Scholar | |
510 | 0 | _aGoogle Book Search | |
520 | 3 | _aSince the 1970s, microprocessor-based digital platforms have been riding Moore's law, with transistor density roughly doubling every two years. However, whereas microprocessor fabrication has focused on increasing instruction execution rate, memory fabrication technologies have focused primarily on increasing capacity, with negligible gains in speed. This divergent performance trend between processors and memory has led to a phenomenon referred to as the "Memory Wall." To overcome the memory wall, designers have resorted to a hierarchy of cache memory levels, which rely on the principle of memory access locality to reduce the observed memory access time and the performance gap between processors and memory. Unfortunately, important workload classes exhibit adverse memory access patterns that baffle the simple policies built into modern cache hierarchies for moving instructions and data across cache levels. As such, processors often spend much time idling on demand fetches of memory blocks that miss in higher cache levels. Prefetching--predicting future memory accesses and issuing requests for the corresponding memory blocks in advance of explicit accesses--is an effective approach to hiding memory access latency. A myriad of prefetching techniques have been proposed, and nearly every modern processor includes some hardware prefetching mechanism targeting simple and regular memory access patterns. This primer offers an overview of the various classes of hardware prefetchers for instructions and data proposed in the research literature, and presents examples of techniques incorporated into modern microprocessors. | |
530 | _aAlso available in print. | ||
588 | _aTitle from PDF title page (viewed on June 20, 2014). | ||
650 | 0 | _aMemory management (Computer science) | |
653 | _ahardware prefetching | ||
653 | _anext-line prefetching | ||
653 | _abranch-directed prefetching | ||
653 | _adiscontinuity prefetching | ||
653 | _astride prefetching | ||
653 | _aaddress-correlated prefetching | ||
653 | _aMarkov prefetcher | ||
653 | _aglobal history buffer | ||
653 | _atemporal memory streaming | ||
653 | _aspatial memory streaming | ||
653 | _aexecution-based prefetching | ||
700 | 1 | _aWenisch, Thomas F., _eauthor. | |
776 | 0 | 8 | _iPrint version: _z9781608459520
830 | 0 | _aSynthesis digital library of engineering and computer science. | |
830 | 0 | _aSynthesis lectures on computer architecture ; _v# 28. _x1935-3243 | |
856 | 4 | 2 | _3Abstract with links to resource _uhttp://ieeexplore.ieee.org/servlet/opac?bknumber=6828870
856 | 4 | 0 | _3Abstract with links to full text _uhttp://dx.doi.org/10.2200/S00581ED1V01Y201405CAC028
999 | _c562076 _d562076 | ||
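The 520 abstract and the 505 contents note above describe hardware prefetchers that predict future memory accesses and issue requests ahead of demand. As a minimal sketch of the idea (not taken from the book itself), the following Python model implements a PC-localized stride prefetcher in the spirit of the "stride and stream prefetchers" topic in section 3.1; the table layout, confidence rule, and prefetch degree of one are simplifying assumptions.

```python
class StridePrefetcher:
    """Toy model of a PC-localized stride prefetcher.

    Per load PC it remembers the last address and last stride seen.
    Once the same nonzero stride is observed twice in a row, it
    predicts the next access and issues a prefetch for addr + stride.
    """

    def __init__(self):
        # pc -> (last_addr, last_stride, confident)
        self.table = {}

    def access(self, pc, addr):
        """Record a demand access; return the list of prefetch addresses issued."""
        prefetches = []
        if pc in self.table:
            last_addr, last_stride, confident = self.table[pc]
            stride = addr - last_addr
            if stride == last_stride and stride != 0:
                if confident:
                    # Stride confirmed repeatedly: prefetch one block ahead.
                    prefetches.append(addr + stride)
                confident = True
            else:
                # Stride changed: retrain before prefetching again.
                confident = False
            self.table[pc] = (addr, stride, confident)
        else:
            self.table[pc] = (addr, 0, False)
        return prefetches


# A load at (hypothetical) PC 0x40 streaming through an array with stride 8:
pf = StridePrefetcher()
issued = []
for addr in range(0x1000, 0x1000 + 8 * 6, 8):
    issued += pf.access(0x40, addr)
print([hex(a) for a in issued])  # prefetches begin once the stride is confirmed
```

After the first two accesses train the entry and the third confirms the stride, each subsequent access triggers a prefetch one stride ahead, which is how such prefetchers hide latency on regular access patterns.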