AI Systems Performance Engineering: Optimizing Model Training and Inference Workloads with GPUs, CUDA, and PyTorch By: Chris Fregly
Programming Your GPU with OpenMP: Performance Portability for GPUs By: Timothy G Mattson, Tom Deakin