MEMSYS 2023

Welcome to MEMSYS23. We are happy to be back after three years of virtual conferences!

Keynotes:

Keynote: Memory Technologies — Truths, Myths, and Hype

Ever wondered why SRAM, DRAM, and Flash are the only three successful memory technologies today, despite several emerging technologies being hyped over decades with outrageous claims and promises? These new emerging technologies either failed to deliver, or overpromised, or misrepresented their benefits. In this talk we will first discuss salient memory attributes such as energy, performance, persistence, and endurance. We will then describe how to measure these attributes for existing memory technologies, exposing some of the myths and hype. Next, we will establish a system level value metric for comparison and evaluate different memory technologies, comparing their attributes using the established metric, and conclude with what is required for a memory technology to succeed!

Shekhar Borkar is a Sr. Director of Technology at Qualcomm Inc. He started his career with Intel Corp, worked on the 8051 family of microcontrollers, supercomputers, and high performance & low power digital circuits research. He has authored over 100 peer reviewed publications in conferences and journals, over 60 invited papers and keynotes, five book chapters, and has more than 60 issued patents. His research interests are low power, high performance digital circuits and system level optimization.

Keynote: Next Steps in 3D Memory Systems

Modern machine learning workloads are increasing in scale at a rapid rate. For example, GPT-3 has 175 billion parameters, requiring 570 GB of storage, and GPT-4 is even larger. ML workloads are typically memory bandwidth constrained. For example, a saturated Google TPU can support four HBM channels operating at full capacity. The Cerebras Wafer Scale Engine is designed for ML workloads and supports 20 PBps of memory bandwidth. The problem is not just in inference but also in training. The cost of training is a serious issue, especially for large models, and can benefit from acceleration. Compute is made more difficult due to the irregular nature of training and Transformer workloads. Processor near memory (PnM) paradigms with high density memory stacked on logic are especially attractive options.

We describe multiple memory alternatives to address these issues and the resulting opportunities. All are 3DIC technology enabled with aggressive use of high density Through Silicon Vias (TSVs) and two-sided hybrid bonding. First, we modified the design of the Tezzaron 64 Gb diRAM memory to expose over 130 Tbps of peak memory bandwidth. Second, we revisit the DRAM stack using more conventional bank designs and aggressive use of 3DIC technologies. Finally, we explore a mix of non-volatile memory and SRAM to enable a stack that can be built with accessible foundry technologies. In each instance the memory array(s) are matched with a network layer and an array of SIMD processors to create a high performance, high capacity, power-efficient solution. The resulting solution is very area efficient and has the best power efficiency out of programmable solutions.

Paul D. Franzon is currently the Cirrus Logic Distinguished Professor and the Director of Graduate Programs in the Department of Electrical and Computer Engineering at North Carolina State University. He is also a site Director for the Center for Advanced Electronics through Machine Learning (CAEML). He earned his Ph.D. from the University of Adelaide, Adelaide, Australia. He has also worked at AT&T Bell Laboratories, DSTO Australia, Australia Telecom, Rambus, and four companies he cofounded: Communica, LightSpin Technologies, Polymer Braille Inc., and Indago Technologies. His current interests include applying machine learning to EDA, building AI accelerators, RFID, advanced packaging, heterogeneous integration, 2.5D and 3DICs, and secure chip design. He has led several major efforts and published over 300 papers in these areas. In 1993 he received an NSF Young Investigators Award, in 2001 he was selected to join the NCSU Academy of Outstanding Teachers, and in 2003 he was selected as a Distinguished Undergraduate Alumni Professor. He received the Alcoa Research Award in 2005, the Board of Governors Teaching Award in 2014, and the Distinguished Graduate Alumni Professor in 2021. He has been awarded faculty awards from Qualcomm, IBM, Synopsys, and Google. He served with the Australian Army Reserve for 13 years as an Infantry Soldier and Officer. He is a Fellow of the IEEE.

Monday, October 2nd

18:00	Welcome Reception

Tuesday, October 3rd

8:00	Breakfast in Restaurant
8:50	Opening Remarks
9:00	Keynote: Memory Technologies: Truths, Myths, and Hype, Shekhar Borkar
10:00	Break
Session	1: Applications
10:20	An Empirical Analysis on Memcached’s Replacement Policies
10:40	Large-scale Graph Processing on Commodity Systems: Understanding and Mitigating the Impact of Swapping
11:00	LLVM Static Analysis for Program Characterization and Memory Reuse Profile Estimation
11:20	Evaluating Gather Scatter Performance on CPUs and GPUs *
11:40	Writeback Modeling: Theory and Application to Zipfian Workloads *
12:00	Lunch
Session	2: Safety and Security
13:00	RAMPART: RowHammer Mitigation and Repair for Server Memory Systems
13:20	ECC-Map: A Resilient Wear-Leveled Memory-Device Architecture with Low Mapping Overhead
13:40	Error Detecting and Correcting Codes for DRAM Functional Safety
14:00	An LPDDR4 Safety Model for Automotive Applications *
14:20	Break
Session	3: Modeling and Simulation
14:40	Memory Workload Synthesis Using Generative AI
15:00	Multifidelity Memory System Simulation in SST
15:20	Modeling and Characterizing Shared and Local Memories of the Ampere GPUs
15:40	PPT-SASMM: Scalable Analytical Shared Memory Model *
16:00	Break
16:30	Spirited Discussion
19:00	TPC Dinner

Wednesday, October 4th

8:00	Breakfast in Restaurant
9:00	Keynote: Next Steps in 3D Memory Systems, Paul Franzon
10:00	Break
Session	4: Architecture
10:20	Efficient Mobility Centric Caching
10:40	Linear-Mark: Locality vs. Accuracy in Mark-Sweep Garbage Collection
11:00	The Feasibility of Utilizing Low-Performance DRAM in Disaggregated Systems
11:20	CLAM: Compiler Lease of Cache Memory *
11:40	Thoughts on Merging the File System with the Virtual Memory System
12:00	Lunch
Session	5: Processing in/near Memory
13:00	An In-Storage Processing Architecture with 3D NAND Heterogeneous Integration for Spectra Open Modification Search
13:20	Sadram: A new Memory Addressing Paradigm
13:40	Streaming Sparse Data on Architectures with Vector Extensions using Near Data Processing
14:00	PEPERONI: Pre-Estimating the Performance of Near-Memory Integration *
14:20	Break
Session	6: The Devil is in the Details
14:40	A Precise Measurement Platform for LPDDR4 Memories
15:00	Extending the Life of Old Systems with More Memory
15:20	Addressing DRAM Performance Analysis Challenges for Network-on-Chip (NoC) Design
15:40	Break
16:30	Spirited Discussion
19:00	Conference Dinner

Thursday, October 5th

8:00	Breakfast in Restaurant
Session	7: Prefetching and Paging
9:00	An Empirical Evaluation of PTE Coalescing
9:20	Building Efficient Neural Prefetcher
9:40	Protean: Resource-efficient Instruction Prefetching
10:00	Break
Session	8: Non Volatile Memories
10:20	MC-ELMM: Multi-Chip Endurance-Limited Memory Management
10:40	Critical Issues in Advanced ReRAM Development
11:00	ENTS: Flush-and-Fence-Free Failure Atomic Transactions
11:20	Closing Remarks and Award Ceremony

Papers marked with * are pandemic papers, presented at virtual MEMSYS 2020, 2021, or 2022.

Sponsors
