Biblio
Community detection is an advanced graph operation that is used to reveal tightly-knit groups of vertices (aka. communities) in real-world networks. Given the intractability of the problem, efficient heuristics are used in practice. Yet, even the best of these state-of-the-art heuristics can become computationally demanding over large inputs and can generate workloads that exhibit inherent irregularity in data movement on manycore platforms. In this paper, we posit that effective acceleration of the graph community detection operation can be achieved by reducing the cost of data movement through a combined innovation at both software and hardware levels. More specifically, we first propose an efficient software-level parallelization of community detection that uses approximate updates to cleverly exploit a diminishing returns property of the algorithm. Secondly, as a way to augment this innovation at the software layer, we design an efficient Wireless Network on Chip (WiNoC) architecture that is suited to handle the irregular on-chip data movements exhibited by the community detection algorithm under both unicast- and broadcast-heavy cache coherence protocols. Experimental results show that our resulting WiNoC-enabled manycore platform achieves on average 52% savings in execution time, without compromising on the quality of the outputs, when compared to a traditional manycore platform designed with a wireline mesh NoC and running community detection without employing approximate updates.