In-SRAM Compute for Generative AI and Large Language Models
The recent uptick in generative artificial intelligence (GAI) has put more pressure on hardware vendors to reduce the carbon footprint of running power-hungry large language models (LLMs) in the datacenter. One way to lower in-silicon power consumption is to break the von Neumann bottleneck, the energy and latency cost of shuttling data between separate memory and processing units, by tightly integrating traditional SRAM memory cells with interleaved programmable processors on the same die.