About The Workshop
Training and deploying huge machine learning models, such as GPT, Llama, or large GNNs, requires vast amounts of compute, power, storage, and memory. The size of such models is growing exponentially, as are the training time and resources required. The cost of training large foundation models has become prohibitive for all but a few large players. While the challenges are most visible in training, similar considerations apply to deploying and serving large foundation models to a large user base.
The workshop aims to bring together AI/ML researchers, computer architects, and engineers working on a range of topics focused on training and serving large ML models. It will provide a forum for presenting and exchanging new ideas and experiences in this area, and for discussing and exploring hardware/software techniques and tools that lower the significant barrier to entry posed by the computational requirements of AI foundation models.
Motivation
We seek innovative ideas, both evolutionary and revolutionary, on software and hardware architectures for training such challenging models, and we strive to present and discuss new approaches that may lead to alternative solutions.
Location
The workshop will be held in Tokyo, Japan, co-located with ISCA 2025.
Date: 22 June 2025
Call for papers
The workshop will present original work in areas such as (but not limited to):
- Workload Characterization
- Inference Serving at Scale
- Distributed Training
- Novel Networking and Interconnect Approaches for Large AI/ML Workloads
- Addressing Resilience of Large Training Runs
- Data Reduction Techniques
- Better Model Partitioning
- Data Formats and Precision
- Efficient Hardware and Competitive Accelerators
Scope of Papers
Authors can submit either full papers (up to 8 pages) or short papers (up to 4 pages).
For short papers, out-of-the-box ideas and position papers are especially encouraged.
Important Deadlines
All deadlines are at 11:59 PM AoE (Anywhere on Earth).
Paper Submission: 22 April 2025 (extended from 15 April 2025)
Acceptance Notification: 10 May 2025
Workshop Date: 22 June 2025
Event Schedule
LG-ARC 2025
New Approaches to Addressing the Computing Requirements of LLMs and GNNs
Welcome
Dejan Milojicic, HPE Labs
Keynote Talk
Masaaki Kondo, Keio University
F-BFQ: Flexible Block-Floating Point Quantization Accelerator for LLMs
Jude Haris and José Cano, University of Glasgow
Break
Compressing Large Language Models with ZFP: Lessons Learned
Maximilian Sand, TUD Dresden, and Jens Domke, RIKEN
Fine-Grained Low-Latency GPU Sharing for HPC and Cloud
Aditya Dhakal, Gourav Rattihalli, Pavana Prakash and Dejan Milojicic, HPE Labs
Invited Talk: NeuraChip – Accelerating GNN Computations with a Hash-based Decoupled Spatial Accelerator
Kaustubh Shivdikar, AMD
Keynote Talk
Norm Jouppi, Google
ORGANIZATION
Program Co-Chairs
Avi Mendelson, Technion
David Kaeli, Northeastern University
Dejan S. Milojicic, Hewlett Packard Labs
Program Committee
Jose Luis Abellan, University of Murcia
Rosa M. Badia, Barcelona Supercomputing Center
Chaim Baskin, Technion
Jose Cano, University of Glasgow
Freddy Gabbay, Ruppin Academic Center
John Kim, KAIST
Paolo Faraboschi, Hewlett Packard Labs
Alexandra Posoldova, Sigma
Qiong Chang, Institute of Science Tokyo
Bin Ren, William & Mary
Carole-Jean Wu, Meta
Zhibin Yu, Shenzhen Institute of Advanced Technology
Kaustubh Shivdikar, AMD
Zlatan Feric, Northeastern University
Publicity Chair
Pavana Prakash, Hewlett Packard Labs
Web Chair
Zlatan Feric, Northeastern University
Contact Us
For queries regarding submissions, please contact the organizers.