ARC-LG workshop
@ ISCA 2024

New Approaches for Addressing the Computing Requirements of LLMs and GNNs


About the Workshop

Training and deploying huge machine learning models, such as GPT, Llama, or large GNNs, requires vast amounts of compute, power, storage, and memory. The size of such models is growing exponentially, as are the training time and resources required. The cost of training large foundation models has become prohibitive for all but a few large players. While the challenges are most visible in training, similar considerations apply to deploying and serving large foundation models for a large user base.

The workshop aims to bring together AI/ML researchers, computer architects, and engineers working on a range of topics focused on training and serving large ML models. It will provide a forum for presenting and exchanging new ideas and experiences in this area, and for discussing and exploring hardware/software techniques and tools that lower the significant barrier to entry posed by the computational requirements of AI foundation models.

Motivation

We seek innovative ideas, both evolutionary and revolutionary, around software and hardware architectures for training such challenging models, and we strive to present and discuss new approaches that may lead to alternative solutions.

Location

The workshop will be held in Buenos Aires, Argentina.

The workshop will be co-located with ISCA 2024.

Date: 30 June 2024

Invited Talks


Carole-Jean Wu

Meta

LLMs, Fast and Everywhere: Acceleration for the Age of GenAI

[Abstract]


Igor Arsovski

Chief Architect at Groq

Think Faster: 1300 t/s/u on Llama3 with Groq's software-scheduled LPU Inference Engine

[Abstract]


Sumanth Gudaparthi

Member of Technical Staff, AMD Research

Training Massive-Scale AI Foundational Models on Frontier

[Abstract]

Event Schedule

ARC-LG workshop on Large Language Models and Graph Neural Networks

Welcome

Registration and welcome note

Keynote: Carole-Jean Wu, Meta

LLMs, Fast and Everywhere: Acceleration for the Age of GenAI

[slides]

Session 1 Chair: David Kaeli

Can Tree-Based Models Improve Performance Prediction for LLMs?

Karthick Panner Selvam and Mats Brorsson

[slides]

CaR: An Efficient KV Cache Reuse System for Large Language Model Inference

Kexin Chu, Tzechinh Liu, Yunding Li, Pengchao Yuan and Wei Zhang

[slides]

LGNNIC: Acceleration of Large-Scale GNN Training using SmartNICs

Liad Gerstman, Aditya Dhakal, Sai Rahul Chalamalasetti, Chaim Baskin and Dejan Milojicic

[slides]

Invited Talk: Igor Arsovski, Groq

Think Faster: 1300 t/s/u on Llama3 with Groq's software-scheduled LPU Inference Engine

[slides]

Lunch

Break for Lunch

Session 2 Chair: Avi Mendelson

Comparing Data Precision on Low-Rank Adaptation for Fine-tuning Large Language Models

Bagus Hanindhito, Bhavesh Patel and Lizy K. John

[slides]

Casting off the Old Guard: Achieving Superior A.I. Performance through Simplification

Jerry Felix, Steve Brunker and Carol Hibbard

[slides]

SECDA-LLM: Designing Efficient LLM Accelerators for Edge Devices

Jude Haris, Rappy Saha, Wenhao Hu and José Cano

[slides]

Invited Talk: Sumanth Gudaparthi, AMD

Training Massive-Scale AI Foundational Models on Frontier

[slides]

Break

Break for Snacks

Session 3 Chair: Paolo Faraboschi

PrefixSmart: Enhancing Large Language Model Efficiency through Advanced Prompt Management

Yunding Li, Kexin Chu, Nannan Zhao and Wei Zhang

[slides]

hLLM: A NUMA-aware Heterogeneous Platform for High-throughput Large Language Models Service

Kexin Chu, Tzechinh Liu, Pengchao Yuan and Wei Zhang

[slides]

LLM-VeriPPA: Power, Performance, and Area-aware Verilog Code Generation and Refinement with Large Language Models

Kiran Gautam Thorat, Amit Hasan, Jiahui Zhao, Yaotian Liu, Xi Xie, Hongwu Peng, Bin Lei, Jeff Zhang and Caiwen Ding

[slides]

Panel: What is the path forward to environmentally friendly LLMs?

Lizy John, Univ. of Texas

Josep Torrellas, UIUC

Binbin Meng, Huawei

Closing

Concluding remarks

Organization

Program Co-Chairs

Avi Mendelson, Technion

avi.mendelson@technion.ac.il

David Kaeli, Northeastern University

kaeli@ece.neu.edu

Paolo Faraboschi, Hewlett Packard Labs

paolo.faraboschi@hpe.com

Program Committee

Jose Luis Abellan, University of Murcia

Rosa M. Badia, Barcelona Supercomputing Center

Chaim Baskin, Technion

José Cano, University of Glasgow

Freddy Gabbay, Ruppin College

John Kim, KAIST

Dejan S. Milojicic, HPE

Alexandra Posoldova, Sigma

Bin Ren, William and Mary

Carole-Jean Wu, Meta

Zhibin Yu, Shenzhen Institute of Technology

Kaustubh Shivdikar, Northeastern University

Publicity Chair

Pavana Prakash, Hewlett Packard Labs

Web Chair

Kaustubh Shivdikar, Northeastern University

shivdikar.k@northeastern.edu

Contact Us

For queries regarding submissions, contact:

David Kaeli

kaeli@ece.neu.edu

Paolo Faraboschi

paolo.faraboschi@hpe.com