About The Workshop
Training and deploying huge machine learning models, such as GPT, Llama, or large GNNs, requires vast amounts of compute, power, storage, and memory. The size of such models is growing exponentially, as are the training time and resources required. The cost of training large foundation models has become prohibitive for all but a few large players. While the challenges are most visible in training, similar considerations apply to deploying and serving large foundation models to a large user base.
The workshop aims to bring together AI/ML researchers, computer architects, and engineers working on a range of topics focused on training and serving large ML models. It will provide a forum for presenting and exchanging new ideas and experiences in this area, and for discussing and exploring hardware/software techniques and tools that lower the significant barrier to entry posed by the computational requirements of AI foundation models.
Motivation
We seek innovative ideas, both evolutionary and revolutionary, on software and hardware architectures for training such challenging models, and we strive to present and discuss new approaches that may lead to alternative solutions.
Location
The workshop will be held in Tokyo, Japan, co-located with ISCA 2025.
Date: 22 June 2025
Call for papers
The workshop will present original work in areas such as (but not limited to):
- Workload Characterization
- Inference Serving at Scale
- Distributed Training
- Novel Networking and Interconnect Approaches for Large AI/ML Workloads
- Addressing Resilience of Large Training Runs
- Data Reduction Techniques
- Better Model Partitioning
- Data Formats and Precision
- Efficient Hardware and Competitive Accelerators
Scope of Papers
Authors can submit either full papers (up to 8 pages) or short papers (up to 4 pages).
For short papers, out-of-the-box ideas and position papers are especially encouraged.
Important Deadlines
All deadlines are at 11:59 PM AoE (Anywhere on Earth).
Paper Submission: 22 April 2025 (extended from 15 April 2025)
Acceptance Notification: 10 May 2025
Workshop Date: 22 June 2025
Event Schedule
LG-ARC 2025
New Approaches to Addressing the Computing Requirements of LLMs and GNNs
Welcome
Dejan Milojicic, HPE Labs
Keynote Talk
Masaaki Kondo, Keio University
F-BFQ: Flexible Block-Floating Point Quantization Accelerator for LLMs
Jude Haris and José Cano, University of Glasgow
Break
Compressing Large Language Models with ZFP: Lessons Learned
Maximilian Sand, TUD Dresden, and Jens Domke, RIKEN
Fine-Grained Low-Latency GPU Sharing for HPC and Cloud
Aditya Dhakal, Gourav Rattihalli, Pavana Prakash and Dejan Milojicic, HPE Labs
Invited Talk: NeuraChip – Accelerating GNN Computations with a Hash-based Decoupled Spatial Accelerator
Kaustubh Shivdikar, AMD
Keynote Talk
Norm Jouppi, Google
ORGANIZATION
Program Co-Chairs
Avi Mendelson, Technion
David Kaeli, Northeastern University
Dejan S. Milojicic, Hewlett Packard Labs
Program Committee
Jose Luis Abellan, University of Murcia
Rosa M. Badia, Barcelona Supercomputing Center
Chaim Baskin, Technion
Jose Cano, University of Glasgow
Freddy Gabbay, Ruppin Academic Center
John Kim, KAIST
Paolo Faraboschi, Hewlett Packard Labs
Alexandra Posoldova, Sigma
Qiong Chang, Institute of Science Tokyo
Bin Ren, William & Mary
Carole-Jean Wu, Meta
Zhibin Yu, Shenzhen Institute of Advanced Technology
Kaustubh Shivdikar, AMD
Zlatan Feric, Northeastern University
Publicity Chair
Pavana Prakash, Hewlett Packard Labs
Web Chair
Zlatan Feric, Northeastern University
Contact Us
For queries regarding submissions, please contact the organizers.