Conveners
WestAI: Boosting ML Trainings with HPC Resources
- John Arnold
Description
Abstract:
The capabilities of machine learning (ML) models usually scale with model size, but larger models also require more computational resources. The AI service center WestAI addresses this by providing high-performance computing (HPC) hardware for training large models in academia and industry. WestAI’s primary offerings include 10,000 hours on NVIDIA H100 GPUs and specialized ML consulting.
This session will cover:
- An overview of WestAI and how its services can support your work.
- Instructions on how to apply for computing time.
- An in-depth look at the HPC systems at RWTH Aachen University and the Jülich Supercomputing Centre.
- How to use the HPC system at RWTH with Jupyter Notebooks for small-scale testing and training.
- How to use the batch system on RWTH’s HPC system for training large models.
The session will include ample time for questions and discussion.