000149208 001__ 149208
000149208 005__ 20250127135740.0
000149208 037__ $$aTAZ-TFM-2024-770
000149208 041__ $$aeng
000149208 1001_ $$aGabirondo López, Iñigo
000149208 24200 $$aTowards autonomous resource management: Deep learning prediction of CPU-GPU load balancing.
000149208 24500 $$aTowards autonomous resource management: Deep learning prediction of CPU-GPU load balancing.
000149208 260__ $$aZaragoza$$bUniversidad de Zaragoza$$c2024
000149208 506__ $$aby-nc-sa$$bCreative Commons$$c3.0$$uhttp://creativecommons.org/licenses/by-nc-sa/3.0/
000149208 520__ $$aThe demand for data centers has increased due to recent advances in Artificial Intelligence. These data centers comprise thousands of servers, with cooling systems that consume large amounts of energy. The servers usually contain several processing units that can cooperate to solve computational tasks. Properly partitioning the workload among the processing units of a machine significantly reduces the server's total execution time and power consumption. Hence, load balancing algorithms that produce proper workload partitions can improve energy efficiency. This work presents a deep-learning-based load balancer designed for CPU-GPU heterogeneous systems. The load balancer takes as input an OpenCL kernel, the work-group size of the kernel, and the input size in bytes, and it outputs the amount of work to assign to the CPU. The load balancer leverages the heterogeneous device mapping model presented in ProGraML (device mapping aims to select the single best processing unit among several). ProGraML exhibited very poor performance in our setup, which made it impossible to run any experiments. To solve this performance issue, we migrated the full ProGraML project to the pytorch-geometric library. We then adapted the heterogeneous device mapping model to the load balancing task. Experimental results show that the load balancer accurately predicts workload partitions, even when the testing setup differs from the setup used for labelling the datasets. For 6 of the 8 tested kernels, the final model predicted a partition that differs by less than 20% from the theoretical work partition; for 3 of those 6 kernels, the error was below 10%.
In addition, our new ProGraML implementation removes the performance issue of the original work, is publicly available, and is based on a library that makes new experiments easier to perform.
000149208 521__ $$aMáster Universitario en Robótica, Gráficos y Visión por Computador
000149208 540__ $$aRights governed by a Creative Commons license
000149208 691__ $$a7 9
000149208 692__ $$aThe model produces accurate load balancing schedules for heterogeneous systems. This improves system performance by lowering the execution time of computational tasks, and therefore reduces the energy consumption of these systems.
000149208 700__ $$aSuárez Gracia, Darío$$edir.
000149208 700__ $$aGran Tejero, Rubén$$edir.
000149208 7102_ $$aUniversidad de Zaragoza$$bInformática e Ingeniería de Sistemas$$cArquitectura y Tecnología de Computadores
000149208 8560_ $$f876333@unizar.es
000149208 8564_ $$s1538526$$uhttps://zaguan.unizar.es/record/149208/files/TAZ-TFM-2024-770.pdf$$yReport (eng)
000149208 909CO $$ooai:zaguan.unizar.es:149208$$pdriver$$ptrabajos-fin-master
000149208 950__ $$a
000149208 951__ $$adeposita:2025-01-27
000149208 980__ $$aTAZ$$bTFM$$cEINA
000149208 999__ $$a20240625125212.CREATION_DATE