Optimization of data allocation in hierarchical memory for blocked shortest paths algorithms
https://doi.org/10.21122/2309-4923-2021-3-40-50
Abstract
This paper is devoted to the reduction of data transfer between the main memory and direct mapped cache for blocked shortest paths algorithms (BSPA), which represent data by a D[M×M] matrix of blocks. For large graphs, the cache size S = δ×M2, δ < 1 is smaller than the matrix size. The cache assigns a group of main memory blocks to a single cache block. BSPA performs multiple recalculations of a block over one or two other blocks and may access up to three blocks simultaneously. If the blocks are assigned to the same cache block, conflicts occur among the blocks, which imply active transfer of data between memory levels. The distribution of blocks on groups and the block conflict count strongly depends on the allocation and ordering of the matrix blocks in main memory. To solve the problem of optimal block allocation, the paper introduces a block conflict weighted graph and recognizes two cases of block mapping: non-conflict and minimum-conflict. In first case, it formulates an equitable color-class-size constrained coloring problem on the conflict graph and solves it by developing deterministic and random algorithms. In second case, the paper formulates a problem of weighted defective color-count constrained coloring of the conflict graph and solves it by developing a random algorithm. Experimental results show that the equitable random algorithm provides an upper bound of the cache size that is very close to the lower bound estimated over the size of a complete subgraph, and show that a non-conflict matrix allocation is possible at δ = 0.5 for M = 4 and at δ = 0.1 for M = 20. For a low cache size, the weighted defective algorithm gives the number of remaining conflicts that is up to 8.8 times less than the original BSPA gives. The proposed model and algorithms are applicable to set-associative cache as well.
About the Author
A. A. PrihozhyBelarus
Anatoly Prihozhy, doctor of science, professor, Computer and system software department
Minsk
References
1. R. W. Floyd “Algorithm 97: Shortest path”, Communications of the ACM, 1962, 5(6), p.345.
2. Hofner, P. Dijkstra, Floyd and Warshall Meet Kleene / P. Hofner and B. Moller // Formal Aspect of Computing, Vol. 24, No. 4, 2012, № 2, pp. 459–476.
3. G. Venkataraman, S. Sahni, S. Mukhopadhyaya “A Blocked All-Pairs Shortest Paths Algorithm”, Journal of Experimental Algorithmics (JEA), Vol. 8, 2003, pp. 857–874
4. Prihozhy A.A., Karasik O. N. “Heterogeneous blocked all-pairs shortest paths algorithm”. «System analysis and applied information science». 2017; (3): 68–75. (In Russ.) https://doi.org/10.21122/2309–4923–2017–3–68–75.
5. C. Kozyrakis. “Computer Systems Architecture. Advanced Caching Techniques”, Stanford University, pp. 1–35, 2012.
6. Smith, A. J., “Cache Memories”, Computing Surveys. 1982, 14 (3): 473–530.
7. J. S. Park, M. Penner, and V. K. Prasanna. “Optimizing graph algorithms for improved cache performance” / J. S. Park, // IEEE Trans. on Parallel and Distributed Systems, 2004, 15(9), pp. 769–782.
8. Prihozhy A.A. Simulation of direct mapped, k-way and fully associative cache on all pairs shortest paths algorithms. «System analysis and applied information science». 2019; (4):10–18.
9. Solomonik, E. Minimizing Communication in All Pairs Shortest Paths / E. Solomonik, A. Buluc, and J. Demmel // IEEE 27th International Symposium on Parallel & Distributed Processing, 2013, pp. 548–559.
10. Tang, P. Rapid Development of Parallel Blocked All-Pairs Shortest Paths Code for Multi-Core Computers / P. Tang // IEEE SOUTHEASTCON 2014, pp. 1–7.
11. Prihozhy, A.A. Adaptive memory management. Automation and computer technology, 1988, № 3, с. 58–65.
12. Prihozhy, A.A. Asynchronous scheduling and allocation / A.A. Prihozhy / Proceedings Design, Automation and Test in Europe. Paris, France. – IEEE, 1998, pp. 963–964.
13. Prihozhy A.A., Karasik O. N. Investigation of methods for implementing multithreaded applications on multicore systems. Informatization of education, 2014, № 1, с. 43–62.
14. Prihozhy A.A., Karasik O. N. Cooperative model for optimization of execution of threads on multi-core system. «System analysis and applied information science». 2014;(4):13–20. (In Russ.)
15. Chaitin, G. J. “Register allocation & spilling via graph colouring”, Proc. 1982 SIGPLAN Symposium on Computer Construction, 1982, pp. 98–105.
16. Bodlaender, H. L., Fomin, F. V. “Equitable colorings of bounded treewidth graphs”, Theoretical Computer Science, 2005, 349 (1): 22–30.
17. Hajnal, A., Szemeredi E. “Proof of a conjecture of P. Erdős”, Combinatorial theory and its applications, II (Proc. Colloq., Balatonfüred, 1969), North-Holland, 1970, pp. 601–623
18. Cowen, L. J., Cowen, R. H., Woodall, D. R. “Defective colorings of graphs in surfaces: Partitions into subgraphs of bounded valency”. Journal of Graph Theory, 2006, 10 (2): 187–195.
Review
For citations:
Prihozhy A.A. Optimization of data allocation in hierarchical memory for blocked shortest paths algorithms. «System analysis and applied information science». 2021;(3):40-50. https://doi.org/10.21122/2309-4923-2021-3-40-50