ARC-LG workshop
@ ISCA 2024

New Approaches for Addressing the Computing Requirements of LLMs and GNNs


About The Workshop

Training and deploying huge machine learning models, such as GPT, Llama, or large GNNs, requires vast amounts of compute, power, storage, and memory. The size of such models is growing exponentially, as are training time and the resources required, and the cost of training large foundation models has become prohibitive for all but a few large players. While the challenges are most visible in training, similar considerations apply to deploying and serving large foundation models to a large user base.

The workshop aims to bring together AI/ML researchers, computer architects, and engineers working on a range of topics focused on training and serving large ML models. It provides a forum for presenting and exchanging new ideas and experiences in this area, and for discussing and exploring hardware/software techniques and tools that lower the significant computational barrier to entry posed by AI foundation models.


We seek innovative, evolutionary, and revolutionary ideas for software and hardware architectures for training such challenging models, and strive to present and discuss new approaches that may lead to alternative solutions.


The workshop will be held in Buenos Aires, Argentina, co-located with ISCA 2024.

Date: 30 June 2024

Invited Talks


Carole-Jean Wu

Meta


LLMs, Fast and Everywhere: Acceleration for the Age of GenAI



Igor Arsovski

Chief Architect at Groq

Think Faster: 1300 t/s/u on Llama3 with Groq software scheduled LPU Inference Engine



Sumanth Gudaparthi

Member of Technical Staff, AMD Research

Training Massive-Scale AI Foundational Models on Frontier


Event Schedule

ARC-LG workshop on Large Language Models and Graph Neural Networks


Registration and Welcome Note

Keynote: Carole-Jean Wu, Meta

LLMs, Fast and Everywhere: Acceleration for the Age of GenAI


Session 1 Chair: David Kaeli

Can Tree-Based Model Improve Performance Prediction for LLMs?

Karthick Panner Selvam and Mats Brorsson


CaR: An Efficient KV Cache Reuse System for Large Language Model Inference

Kexin Chu, Tzechinh Liu, Yunding Li, Pengchao Yuan and Wei Zhang


LGNNIC: Acceleration of Large-Scale GNN Training using SmartNICs

Liad Gerstman, Aditya Dhakal, Sai Rahul Chalamalasetti, Chaim Baskin and Dejan Milojicic

Invited Talk: Igor Arsovski, Groq

Think Faster: 1300 t/s/u on Llama3 with Groq software scheduled LPU Inference Engine



Break for Lunch

Session 2 Chair: Avi Mendelson

Comparing Data Precision on Low-Rank Adaptation for Fine-tuning Large Language Models

Bagus Hanindhito, Bhavesh Patel and Lizy K. John


Casting off the Old Guard: Achieving Superior A.I. Performance through Simplification

Jerry Felix, Steve Brunker and Carol Hibbard


SECDA-LLM: Designing Efficient LLM Accelerators for Edge Devices

Jude Haris, Rappy Saha, Wenhao Hu and José Cano

Invited Talk: Sumanth Gudaparthi, AMD

Training Massive-Scale AI Foundational Models on Frontier



Break for Snacks

Session 3 Chair: Paolo Faraboschi

PrefixSmart: Enhancing Large Language Model Efficiency through Advanced Prompt Management

Yunding Li, Kexin Chu, Nannan Zhao and Wei Zhang


hLLM: A NUMA-aware Heterogeneous Platform for High-throughput Large Language Models Service

Kexin Chu, Tzechinh Liu, Pengchao Yuan and Wei Zhang


LLM-VeriPPA: Power, Performance, and Area-aware Verilog Code Generation and Refinement with Large Language Models

Kiran Gautam Thorat, Amit Hasan, Jiahui Zhao, Yaotian Liu, Xi Xie, Hongwu Peng, Bin Lei, Jeff Zhang and Caiwen Ding


PANEL: What is the path forward to environmentally friendly LLMs?

Lizy John, Univ. of Texas

Josep Torrellas, UIUC

Binbin Meng, Huawei


Concluding remarks


Program Co-Chairs

Avi Mendelson Technion

David Kaeli Northeastern University

Paolo Faraboschi Hewlett Packard Labs

Program Committee

Jose Luis Abellan University of Murcia

Rosa M. Badia Barcelona Supercomputing Center

Chaim Baskin Technion

José Cano University of Glasgow

Freddy Gabbay Ruppin College

John Kim KAIST

Dejan S. Milojicic HPE

Alexandra Posoldova Sigma

Bin Ren William and Mary

Carole-Jean Wu Meta

Zhibin Yu Shenzhen Institute of Technology

Kaustubh Shivdikar Northeastern University

Publicity Chair

Pavana Prakash Hewlett Packard Labs

Web Chair

Kaustubh Shivdikar Northeastern University

Contact Us

For queries regarding submissions, contact:

David Kaeli

Paolo Faraboschi