ARC-LG workshop
@ ISCA 2024

New Approaches for Addressing the Computing Requirements of LLMs and GNNs

About The Workshop

Training and deploying huge machine learning models, such as GPT, Llama, or large GNNs, requires vast amounts of compute, power, storage, and memory. The size of such models is growing exponentially, as are the training time and the resources required. The cost of training large foundation models has become prohibitive for all but a few large players. While the challenges are most visible in training, similar considerations apply to deploying and serving large foundation models to a large user base.

The workshop aims to bring together AI/ML researchers, computer architects, and engineers working on a range of topics focused on training and serving large ML models. It will provide a forum for presenting and exchanging new ideas and experiences in this area, and for discussing and exploring hardware/software techniques and tools that lower the significant barrier to entry posed by the computational requirements of AI foundation models.

Motivation

We seek innovative ideas, both evolutionary and revolutionary, around software and hardware architectures for training such challenging models, and we strive to present and discuss new approaches that may lead to alternative solutions.

Location

The workshop will be held in Buenos Aires, Argentina, co-located with ISCA 2024.

Date: 30 June 2024

Invited Talks

Michael Gschwind, Meta

LLMs, Fast and Everywhere: Acceleration for the Age of GenAI

Igor Arsovski, Chief Architect, Groq

Think Faster: 1300 t/s/u on Llama3 with the Groq Software-Scheduled LPU Inference Engine

Sumanth Gudaparthi, Member of Technical Staff, AMD Research

Training Massive-Scale AI Foundational Models on Frontier

Event Schedule

ARC-LG workshop on Large Language Models and Graph Neural Networks

Welcome

Registration and welcome note

Keynote: Michael Gschwind, Meta

LLMs, Fast and Everywhere: Acceleration for the Age of GenAI

Session 1 Chair: David Kaeli

Can Tree-Based Model Improve Performance Prediction for LLMs?

Karthick Panner Selvam and Mats Brorsson

CaR: An Efficient KV Cache Reuse System for Large Language Model Inference

Kexin Chu, Tzechinh Liu, Yunding Li, Pengchao Yuan and Wei Zhang

LGNNIC: Acceleration of Large-Scale GNN Training using SmartNICs

Liad Gerstman, Aditya Dhakal, Sai Rahul Chalamalasetti, Chaim Baskin and Dejan Milojicic

Invited Talk: Igor Arsovski, Groq

Think Faster: 1300 t/s/u on Llama3 with the Groq Software-Scheduled LPU Inference Engine

Lunch

Break for Lunch

Session 2 Chair: Avi Mendelson

SECDA-LLM: Designing Efficient LLM Accelerators for Edge Devices

Jude Haris, Rappy Saha, Wenhao Hu and José Cano

Casting off the Old Guard: Achieving Superior A.I. Performance through Simplification

Jerry Felix, Steve Brunker and Carol Hibbard

Comparing Data Precision on Low-Rank Adaptation for Fine-tuning Large Language Models

Bagus Hanindhito, Bhavesh Patel and Lizy K. John

Invited Talk: Sumanth Gudaparthi, AMD

Training Massive-Scale AI Foundational Models on Frontier

Break

Break for Snacks

Session 3 Chair: Paolo Faraboschi

PrefixSmart: Enhancing Large Language Model Efficiency through Advanced Prompt Management

Yunding Li, Kexin Chu, Nannan Zhao and Wei Zhang

hLLM: A NUMA-aware Heterogeneous Platform for High-throughput Large Language Models Service

Kexin Chu, Tzechinh Liu, Pengchao Yuan and Wei Zhang

LLM-VeriPPA: Power, Performance, and Area-aware Verilog Code Generation and Refinement with Large Language Models

Kiran Gautam Thorat, Amit Hasan, Jiahui Zhao, Yaotian Liu, Xi Xie, Hongwu Peng, Bin Lei, Jeff Zhang and Caiwen Ding

PANEL: What is the path forward to environmentally friendly LLMs?

Michael Gschwind, Meta

Lizy John, Univ. of Texas

Josep Torrellas, UIUC

Binbin Meng, Huawei

Closing

Concluding remarks

Organization

Program Co-Chairs

Avi Mendelson, Technion

avi.mendelson@technion.ac.il

David Kaeli, Northeastern University

kaeli@ece.neu.edu

Paolo Faraboschi, Hewlett Packard Labs

paolo.faraboschi@hpe.com

Program Committee

Jose Luis Abellan, University of Murcia

Rosa M. Badia, Barcelona Supercomputing Center

Chaim Baskin, Technion

José Cano, University of Glasgow

Freddy Gabbay, Ruppin College

John Kim, KAIST

Dejan S. Milojicic, HPE

Alexandra Posoldova, Sigma

Bin Ren, William & Mary

Carole-Jean Wu, Meta

Zhibin Yu, Shenzhen Institutes of Advanced Technology

Kaustubh Shivdikar, Northeastern University

Publicity Chair

Pavana Prakash, Hewlett Packard Labs

Web Chair

Kaustubh Shivdikar, Northeastern University

shivdikar.k@northeastern.edu

Contact Us

For queries regarding submissions, contact:

David Kaeli

kaeli@ece.neu.edu

Paolo Faraboschi

paolo.faraboschi@hpe.com