AI Systems Performance Engineering: Optimizing Model Training and Inference Workloads with GPUs, CUDA, and PyTorch By: Chris Fregly
Programming Your GPU with OpenMP: Performance Portability for GPUs By: Timothy G Mattson, Tom Deakin
XeHE: an Intel GPU Accelerated Fully Homomorphic Encryption Library: A SYCL Sparkler: Making the Most of C++ and SYCL (SYCL Sparklers: Making the Most of C++ and SYCL) By: Alexander Lyashevsky, Alexey Titov, Yiqin Qiu, and 1 more
Learn CUDA Programming: A beginner’s guide to GPU programming and parallel computing with CUDA 10.x and C/C++ By: Jaegeun Han