Utham Kamath
Director of Machine Learning Systems, Groq


The Groq Tensor Streaming Processor (TSP) and the Value of Deterministic Instruction Execution


The explosion of machine learning and its many applications has motivated a variety of new domain-specific architectures to accelerate deep learning workloads. The Groq Tensor Streaming Processor (TSP) is based on a deterministic instruction set architecture (ISA) with a single large core. The ISA exposes temporal information: the number of cycles each instruction requires to produce its output stream, known as its functional latency. Determinism allows the compiler to reason about program correctness and to track the exact spatial and temporal position of every tensor on the chip. Events in a deterministic system cannot be permuted by the underlying hardware; that is, the total program order is the interleaving of the individual instruction queues of each functional unit. This total ordering is entirely software controlled: the hardware cannot reorder events, and each event must complete in a fixed amount of time. This has implications for hardware design: it removes the need for hardware interlocks to coordinate between functional units and for any “reactive components” in the data path, such as arbiters, caching agents, and replay or retransmission mechanisms. It also has several consequences for system design: zero-variance latency, low latency and high throughput at batch size 1, reduced total cost of ownership (TCO) for data centers with diverse service level agreements (SLAs), and the ability to scale to large training and inference systems without traversing networking switches. In this talk we discuss the TSP and the design implications of its architecture.
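
The scheduling idea in the abstract can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Groq's compiler or ISA: the unit names ("mem", "matmul", "vector"), the operation names, and the latencies are all hypothetical. It shows how, once every instruction has a fixed functional latency, a compiler can compute the exact cycle at which each result appears and derive the total program order by interleaving the per-unit instruction queues.

```python
from dataclasses import dataclass
import heapq


@dataclass(frozen=True)
class Instr:
    unit: str      # hypothetical functional-unit name
    issue: int     # cycle at which the compiler schedules the instruction
    latency: int   # fixed functional latency in cycles (made-up values)
    name: str

    @property
    def done(self) -> int:
        # With no hardware reordering or stalls, completion time is known statically.
        return self.issue + self.latency


def schedule_chain(ops):
    """Statically schedule a dependent chain of (unit, name, latency) ops.

    Each op is issued on the exact cycle its input becomes available.
    Returns the per-unit instruction queues and the final completion cycle.
    """
    queues, cycle = {}, 0
    for unit, name, latency in ops:
        instr = Instr(unit, cycle, latency, name)
        queues.setdefault(unit, []).append(instr)
        cycle = instr.done
    return queues, cycle


def total_order(queues):
    """Interleave the per-unit queues (each sorted by issue cycle) into one order."""
    return list(heapq.merge(*queues.values(), key=lambda i: (i.issue, i.unit)))


# Hypothetical workload: read operands, multiply, apply an activation, write back.
queues, finish = schedule_chain([
    ("mem", "read_weights", 5),
    ("matmul", "matmul", 20),
    ("vector", "relu", 3),
    ("mem", "write_result", 5),
])
order = total_order(queues)

for instr in order:
    print(f"cycle {instr.issue:3d}: {instr.unit:6s} {instr.name} (done at {instr.done})")
print("program completes at cycle", finish)
```

Because nothing in this model can stall or reorder, running the schedule any number of times yields the same issue and completion cycles, which is the property the abstract describes as zero-variance latency.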


Utham Kamath is Director of Machine Learning Systems at Groq, where he works on the implementation and optimization of ML models for Groq's hardware and on performance analysis of ML workloads. He has over twenty years of industry experience, including technical and management roles at Qualcomm, Atheros Communications and Hewlett Packard. He has a Bachelor's degree in Engineering from Bangalore University and an MS and PhD from the University of Southern California.

Please note that the in-person conferences (i.e. AI Hardware Expo on May 5th-6th 2020, AI Enterprise Expo on August 25th-26th 2020 and AI Robotics Expo on November 12th-13th 2020 at SEMI, Milpitas, Silicon Valley) have been converted into this virtual AI Expo series.

Attendee Resources: Live Zoom Webinars, Community Rooms for each week/theme, Presentations and Recordings (after live sessions)