AMD has launched ROCm version 6.3, introducing significant updates to its open-source software stack for GPU computing. The release adds the SGLang runtime, which improves latency and throughput for generative AI models on Instinct GPUs, with reported gains of up to 6x in large language model inference.
FlashAttention-2 improves upon its predecessor by reducing memory usage and offering up to a 3x speedup for AI model training. Additionally, a new Fortran compiler allows users to run legacy Fortran applications on modern GPUs. It supports GPU offloading through OpenMP, maintains backward compatibility with existing applications, and simplifies integration with HIP kernels and ROCm libraries.