000 06034nam a2200733 i 4500
001 6828870
003 IEEE
005 20200413152914.0
006 m eo d
007 cr cn |||m|||a
008 140620s2014 caua foab 000 0 eng d
020 _a9781608459537
_qebook
020 _z9781608459520
_qpaperback
024 7 _a10.2200/S00581ED1V01Y201405CAC028
_2doi
035 _a(CaBNVSL)swl00403529
035 _a(OCoLC)881524638
040 _aCaBNVSL
_beng
_erda
_cCaBNVSL
_dCaBNVSL
050 4 _aQA76.9.M45
_bF256 2014
082 0 4 _a005.43
_223
090 _a
_bMoCl
_e201405CAC028
100 1 _aFalsafi, Babak,
_eauthor.
245 1 2 _aA primer on hardware prefetching /
_cBabak Falsafi, Thomas F. Wenisch.
264 1 _aSan Rafael, California (1537 Fourth Street, San Rafael, CA 94901 USA) :
_bMorgan & Claypool,
_c2014.
300 _a1 PDF (xiv, 53 pages) :
_billustrations.
336 _atext
_2rdacontent
337 _aelectronic
_2isbdmedia
338 _aonline resource
_2rdacarrier
490 1 _aSynthesis lectures on computer architecture,
_x1935-3243 ;
_v# 28
538 _aMode of access: World Wide Web.
538 _aSystem requirements: Adobe Acrobat Reader.
500 _aPart of: Synthesis digital library of engineering and computer science.
500 _aSeries from website.
504 _aIncludes bibliographical references (pages 41-52).
505 0 _a1. Introduction -- 1.1 The memory wall -- 1.2 Prefetching -- 1.2.1 Predicting addresses -- 1.2.2 Prefetch lookahead -- 1.2.3 Placing prefetched values --
505 8 _a2. Instruction prefetching -- 2.1 Next-line prefetching -- 2.2 Fetch-directed prefetching -- 2.3 Discontinuity prefetching -- 2.4 Prescient fetch -- 2.5 Temporal instruction fetch streaming -- 2.6 Return-address stack-directed instruction prefetching -- 2.7 Proactive instruction fetch --
505 8 _a3. Data prefetching -- 3.1 Stride and stream prefetchers for data -- 3.2 Address-correlating prefetchers -- 3.2.1 Jump pointers -- 3.2.2 Pair-wise correlation -- 3.2.3 Markov prefetcher -- 3.2.4 Improving lookahead via prefetch depth -- 3.2.5 Improving lookahead via dead block prediction -- 3.2.6 Addressing on-chip storage limitations -- 3.2.7 Global history buffer -- 3.2.8 Stream chaining -- 3.2.9 Temporal memory streaming -- 3.2.10 Irregular stream buffer -- 3.3 Spatially correlated prefetching -- 3.3.1 Delta-correlated lookup -- 3.3.2 Global history buffer PC-localized/delta-correlating (GHB PC/DC) -- 3.3.3 Code-correlated lookup -- 3.3.4 Spatial footprint prediction -- 3.3.5 Spatial pattern prediction -- 3.3.6 Stealth prefetching -- 3.3.7 Spatial memory streaming -- 3.3.8 Spatio-temporal memory streaming -- 3.4 Execution-based prefetching -- 3.4.1 Algorithm summarization -- 3.4.2 Helper-thread and helper-core approaches -- 3.4.3 Run-ahead execution -- 3.4.4 Context restoration -- 3.4.5 Computation spreading -- 3.5 Prefetch modulation and control -- 3.6 Software approaches --
505 8 _a4. Concluding remarks -- Bibliography -- Author biographies.
506 1 _aAbstract freely available; full-text restricted to subscribers or individual document purchasers.
510 0 _aCompendex
510 0 _aINSPEC
510 0 _aGoogle scholar
510 0 _aGoogle book search
520 3 _aSince the 1970s, microprocessor-based digital platforms have been riding Moore's law, allowing for doubling of density for the same area roughly every two years. However, whereas microprocessor fabrication has focused on increasing instruction execution rate, memory fabrication technologies have focused primarily on an increase in capacity with negligible increase in speed. This divergent trend in performance between the processors and memory has led to a phenomenon referred to as the "Memory Wall." To overcome the memory wall, designers have resorted to a hierarchy of cache memory levels, which rely on the principle of memory access locality to reduce the observed memory access time and the performance gap between processors and memory. Unfortunately, important workload classes exhibit adverse memory access patterns that baffle the simple policies built into modern cache hierarchies to move instructions and data across cache levels. As such, processors often spend much time idling upon a demand fetch of memory blocks that miss in higher cache levels. Prefetching--predicting future memory accesses and issuing requests for the corresponding memory blocks in advance of explicit accesses--is an effective approach to hide memory access latency. There have been a myriad of proposed prefetching techniques, and nearly every modern processor includes some hardware prefetching mechanisms targeting simple and regular memory access patterns. This primer offers an overview of the various classes of hardware prefetchers for instructions and data proposed in the research literature, and presents examples of techniques incorporated into modern microprocessors.
530 _aAlso available in print.
588 _aTitle from PDF title page (viewed on June 20, 2014).
650 0 _aMemory management (Computer science)
653 _ahardware prefetching
653 _anext-line prefetching
653 _abranch-directed prefetching
653 _adiscontinuity prefetching
653 _astride prefetching
653 _aaddress-correlated prefetching
653 _aMarkov prefetcher
653 _aglobal history buffer
653 _atemporal memory streaming
653 _aspatial memory streaming
653 _aexecution-based prefetching
700 1 _aWenisch, Thomas F.,
_eauthor.
776 0 8 _iPrint version:
_z9781608459520
830 0 _aSynthesis digital library of engineering and computer science.
830 0 _aSynthesis lectures on computer architecture ;
_v# 28.
_x1935-3243
856 4 2 _3Abstract with links to resource
_uhttp://ieeexplore.ieee.org/servlet/opac?bknumber=6828870
856 4 0 _3Abstract with links to full text
_uhttp://dx.doi.org/10.2200/S00581ED1V01Y201405CAC028
999 _c562076
_d562076