Turkish Journal of Electrical Engineering and Computer Sciences
DOI
10.55730/1300-0632.3903
Abstract
GPUs employ simple coherence mechanisms and require explicit use of costly synchronization operations to preserve data integrity. Local-scoped synchronization can lower the performance penalty of synchronization when sharing is confined to a subgroup of threads. Unfortunately, in asymmetric sharing (an important dynamic sharing pattern), global-scoped synchronization is necessary because of possible accesses by remote sharers. Remote Scope Promotion (RSP) was introduced to take advantage of local-scoped synchronization for regular accesses while using scope promotion for occasional remote accesses. The first implementation of RSP takes a simple approach that performs costly cache operations on all L1 data caches when implementing scope promotion, and it therefore performs poorly on large-scale GPU systems. We present nRSP, which uses a static naming mechanism to identify the regularly accessing agent in asymmetric sharing and avoids applying costly coherence actions to every L1 data cache when implementing scope promotion. We evaluate nRSP using the timing-detailed Gem5-APU simulator, modeling a GPU system with 128 Compute Units, and show that nRSP greatly lowers remote synchronization overhead and considerably improves performance. On average, nRSP provides around 28% speedup on a 128 Compute Unit GPU device.
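The contrast the abstract draws between broadcast promotion and named promotion can be illustrated with a toy cost model. This is only a sketch under stated assumptions, not the paper's protocol: the function names, the `owner_of` table, and the rule that named promotion acts on just the owner's and the requester's L1 caches are all hypothetical and chosen for illustration.

```python
from typing import Dict, List

NUM_CUS = 128  # matches the 128 Compute Unit system mentioned in the abstract


def l1s_touched_broadcast(num_cus: int) -> List[int]:
    """Broadcast promotion: cache operations are applied to every L1 data cache."""
    return list(range(num_cus))


def l1s_touched_named(owner_of: Dict[str, int], sync_var: str, requester_cu: int) -> List[int]:
    """Named promotion (assumption): act only on the statically named owner's L1
    and on the requester's own L1. The real nRSP protocol may differ."""
    owner_cu = owner_of[sync_var]
    return sorted({owner_cu, requester_cu})


# Hypothetical static naming table: each work queue is named after the CU
# that regularly accesses it (the local sharer in asymmetric sharing).
owner_of = {f"queue{cu}": cu for cu in range(NUM_CUS)}

# Example: CU 5 occasionally steals work from the queue regularly used by CU 42.
broadcast = l1s_touched_broadcast(NUM_CUS)
named = l1s_touched_named(owner_of, "queue42", requester_cu=5)

print(len(broadcast), len(named))
```

In this model the number of caches affected by a remote access grows with the device size under broadcast promotion but stays constant under named promotion, which is the scaling argument the abstract makes for large GPU systems.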
Keywords
Asymmetric synchronization, GPUs, remote scope promotion, work-stealing
First Page
1758
Last Page
1772
Recommended Citation
YILMAZER, AYŞE (2022) "Using a static naming approach to implement remote scope promotion," Turkish Journal of Electrical Engineering and Computer Sciences: Vol. 30: No. 5, Article 6.
https://doi.org/10.55730/1300-0632.3903
Available at:
https://journals.tubitak.gov.tr/elektrik/vol30/iss5/6
Included in
Computer Engineering Commons, Computer Sciences Commons, Electrical and Computer Engineering Commons