Ataberk Olgun is a Computer Architecture researcher and a Ph.D. student in SAFARI Research Group at ETH Zürich led by Prof. Onur Mutlu. He obtained his BSc and MSc degrees from TOBB ETÜ under Prof. Oğuz Ergin’s supervision.
His research interests lie primarily in the area of Computer Architecture. He is interested in designing reliable (and as performance and energy efficient) memory systems from the ground up. These days, he is trying to develop a better understanding of the RowHammer problem and come up with more efficient system-level solutions to it. Previously, he developed an FPGA-based DDR4 memory testing platform that enabled cutting-edge research characterizing real DDR3/4 and HBM2 DRAM chips. He built an end-to-end system design for processing-in-memory techniques using real off-the-shelf DDR3 chips, which he also prototyped on an FPGA-based system.
He worked at Kasirgalabs until 2021 where he led the design of a RISC-V system-on-chip that was later manufactured using SKY130 technology.
Even before then, he worked with Prof. Kemal Bicakci to develop an authentication method based on behavioral biometrics.
- USENIX SecurityABACuS: All-Bank Activation Counters for Scalable and Low Overhead RowHammer MitigationAtaberk Olgun, Yahya Can Tugrul, Nisa Bostanci, Ismail Emir Yuksel, Haocong Luo, Steve Rhyner, Abdullah Giray Yaglikci, Geraldo F. Oliveira, and Onur MutluIn USENIX Security, 2024
We introduce ABACuS, a new low-cost hardware-counter-based RowHammer mitigation technique that performance-, energy-, and area-efficiently scales with worsening RowHammer vulnerability. We observe that both benign workloads and RowHammer attacks tend to access DRAM rows with the same row address in multiple DRAM banks at around the same time. Based on this observation, ABACuS’s key idea is to use a single shared row activation counter to track activations to the rows with the same row address in all DRAM banks. Unlike state-of-the-art RowHammer mitigation mechanisms that implement a separate row activation counter for each DRAM bank, ABACuS implements fewer counters (e.g., only one) to track an equal number of aggressor rows. Our evaluations show that ABACuS securely prevents RowHammer bitflips at low performance/energy overhead and low area cost. We compare ABACuS to four state-of-the-art mitigation mechanisms. At a near-future RowHammer threshold of 1000, ABACuS incurs only 0.58% (0.77%) performance and 1.66% (2.12%) DRAM energy overheads, averaged across 62 single-core (8-core) workloads, requiring only 9.47 KiB of storage per DRAM rank. At the RowHammer threshold of 1000, the best prior low-area-cost mitigation mechanism incurs 1.80% higher average performance overhead than ABACuS, while ABACuS requires 2.50X smaller chip area to implement. At a future RowHammer threshold of 125, ABACuS performs very similarly to (within 0.38% of the performance of) the best prior performance- and energy-efficient RowHammer mitigation mechanism while requiring 22.72X smaller chip area.
- ASP-DACFundamentally Understanding and Solving RowHammerOnur Mutlu, Ataberk Olgun, and A Giray YağlıkcıIn ASP-DAC, 2023
We provide an overview of recent developments and future directions in the RowHammer vulnerability that plagues modern DRAM (Dynamic Random Memory Access) chips, which are used in almost all computing systems as main memory. RowHammer is the phenomenon in which repeatedly accessing a row in a real DRAM chip causes bitflips (i.e., data corruption) in physically nearby rows. This phenomenon leads to a serious and widespread system security vulnerability, as many works since the original RowHammer paper in 2014 have shown. Recent analysis of the RowHammer phenomenon reveals that the problem is getting much worse as DRAM technology scaling continues: newer DRAM chips are fundamentally more vulnerable to RowHammer at the device and circuit levels. Deeper analysis of RowHammer shows that there are many dimensions to the problem as the vulnerability is sensitive to many variables, including environmental conditions (temperature & voltage), process variation, stored data patterns, as well as memory access patterns and memory control policies. As such, it has proven difficult to devise fully-secure and very efficient (i.e., low-overhead in performance, energy, area) protection mechanisms against RowHammer and attempts made by DRAM manufacturers have been shown to lack security guarantees. After reviewing various recent developments in exploiting, understanding, and mitigating RowHammer, we discuss future directions that we believe are critical for solving the RowHammer problem. We argue for two major directions to amplify research and development efforts in: 1) building a much deeper understanding of the problem and its many dimensions, in both cutting-edge DRAM chips and computing systems deployed in the field, and 2) the design and development of extremely efficient and fully-secure solutions via system-memory cooperation.
- ACM TACOPiDRAM: A Holistic End-to-end FPGA-based Framework for Processing-in-DRAMAtaberk Olgun, Juan Gómez Luna, Konstantinos Kanellopoulos, Behzad Salami, Hasan Hassan, Oguz Ergin, and Onur MutluACM TACO, 2023
Processing-using-memory (PuM) techniques leverage the analog operation of memory cells to perform computation. Several recent works have demonstrated PuM techniques in off-the-shelf DRAM devices. Since DRAM is the dominant memory technology as main memory in current computing systems, these PuM techniques represent an opportunity for alleviating the data movement bottleneck at very low cost. However, system integration of PuM techniques imposes non-trivial challenges that are yet to be solved. Design space exploration of potential solutions to the PuM integration challenges requires appropriate tools to develop necessary hardware and software components. Unfortunately, current specialized DRAM-testing platforms, or system simulators do not provide the flexibility and/or the holistic system view that is necessary to deal with PuM integration challenges. We design and develop PiDRAM, the first flexible end-to-end framework that enables system integration studies and evaluation of real PuM techniques. PiDRAM provides software and hardware components to rapidly integrate PuM techniques across the whole system software and hardware stack (e.g., necessary modifications in the operating system, memory controller). We implement PiDRAM on an FPGA-based platform along with an open-source RISC-V system. Using PiDRAM, we implement and evaluate two state-of-the-art PuM techniques: in-DRAM (i) copy and initialization, (ii) true random number generation. Our results show that the in-memory copy and initialization techniques can improve the performance of bulk copy operations by 12.6x and bulk initialization operations by 14.6x on a real system. Implementing the true random number generator requires only 190 lines of Verilog and 74 lines of C code using PiDRAM’s software and hardware components.
- DSNAn Experimental Analysis of RowHammer in HBM2 DRAM ChipsAtaberk Olgun, Majd Osseiran, Yahya Can Tuğrul, Haocong Luo, Steve Rhyner, Behzad Salami, Juan Gomez Luna, and Onur MutluIn DSN, 2023
RowHammer (RH) is a significant and worsening security, safety, and reliability issue of modern DRAM chips that can be exploited to break memory isolation. Therefore, it is important to understand real DRAM chips’ RH characteristics. Unfortunately, no prior work extensively studies the RH vulnerability of modern 3D-stacked high-bandwidth memory (HBM) chips, which are commonly used in modern GPUs. In this work, we experimentally characterize the RH vulnerability of a real HBM2 DRAM chip. We show that 1) different 3D-stacked channels of HBM2 memory exhibit significantly different levels of RH vulnerability (up to 79% difference in bit error rate), 2) the DRAM rows at the end of a DRAM bank (rows with the highest addresses) exhibit significantly fewer RH bitflips than other rows, and 3) a modern HBM2 DRAM chip implements undisclosed RH defenses that are triggered by periodic refresh operations. We describe the implications of our observations on future RH attacks and defenses and discuss future work for understanding RH in 3D-stacked memories.
- IEEE TCADDRAM Bender: An Extensible and Versatile FPGA-based Infrastructure to Easily Test State-of-the-art DRAM ChipsAtaberk Olgun, Hasan Hassan, A Giray Yağlıkçı, Yahya Can Tuğrul, Lois Orosa, Haocong Luo, Minesh Patel, Oğuz Ergin, and Onur MutluIEEE TCAD, 2023
To understand and improve DRAM performance, reliability, security and energy efficiency, prior works study characteristics of commodity DRAM chips. Unfortunately, state-of-the-art open source infrastructures capable of conducting such studies are obsolete, poorly supported, or difficult to use, or their inflexibility limit the types of studies they can conduct. We propose DRAM Bender, a new FPGA-based infrastructure that enables experimental studies on state-of-the-art DRAM chips. DRAM Bender offers three key features at the same time. First, DRAM Bender enables directly interfacing with a DRAM chip through its low-level interface. This allows users to issue DRAM commands in arbitrary order and with finer-grained time intervals compared to other open source infrastructures. Second, DRAM Bender exposes easy-to-use C++ and Python programming interfaces, allowing users to quickly and easily develop different types of DRAM experiments. Third, DRAM Bender is easily extensible. The modular design of DRAM Bender allows extending it to (i) support existing and emerging DRAM interfaces, and (ii) run on new commercial or custom FPGA boards with little effort. To demonstrate that DRAM Bender is a versatile infrastructure, we conduct three case studies, two of which lead to new observations about the DRAM RowHammer vulnerability. In particular, we show that data patterns supported by DRAM Bender uncovers a larger set of bit-flips on a victim row compared to the data patterns commonly used by prior work. We demonstrate the extensibility of DRAM Bender by implementing it on five different FPGAs with DDR4 and DDR3 support.
- ISCARowPress: Amplifying Read Disturbance in Modern DRAM ChipsHaocong Luo, Ataberk Olgun, Abdullah Giray Yağlıkçı, Yahya Can Tuğrul, Steve Rhyner, Meryem Banu Cavlak, Joël Lindegger, Mohammad Sadrosadati, and Onur MutluIn ISCA, 2023
Memory isolation is critical for system reliability, security, and safety. Unfortunately, read disturbance can break memory isolation in modern DRAM chips. For example, RowHammer is a well-studied read-disturb phenomenon where repeatedly opening and closing (i.e., hammering) a DRAM row many times causes bitflips in physically nearby rows. This paper experimentally demonstrates and analyzes another widespread read-disturb phenomenon, RowPress, in real DDR4 DRAM chips. RowPress breaks memory isolation by keeping a DRAM row open for a long period of time, which disturbs physically nearby rows enough to cause bitflips. We show that RowPress amplifies DRAM’s vulnerability to read-disturb attacks by significantly reducing the number of row activations needed to induce a bitflip by one to two orders of magnitude under realistic conditions. In extreme cases, RowPress induces bitflips in a DRAM row when an adjacent row is activated only once. Our detailed characterization of 164 real DDR4 DRAM chips shows that RowPress 1) affects chips from all three major DRAM manufacturers, 2) gets worse as DRAM technology scales down to smaller node sizes, and 3) affects a different set of DRAM cells from RowHammer and behaves differently from RowHammer as temperature and access pattern changes. We demonstrate in a real DDR4-based system with RowHammer protection that 1) a user-level program induces bitflips by leveraging RowPress while conventional RowHammer cannot do so, and 2) a memory controller that adaptively keeps the DRAM row open for a longer period of time based on access pattern can facilitate RowPress-based attacks. To prevent bitflips due to RowPress, we describe and evaluate a new methodology that adapts existing RowHammer mitigation techniques to also mitigate RowPress with low additional performance overhead. We open source all our code and data to facilitate future research on RowPress.
- arXivUnderstanding Read Disturbance in High Bandwidth Memory: An Experimental Analysis of Real HBM2 DRAM ChipsAtaberk Olgun, Majd Osseiran, Yahya Can Tuğrul, Haocong Luo, Steve Rhyner, Behzad Salami, Juan Gomez Luna, and Onur Mutlu2023
DRAM read disturbance is a significant and worsening safety, security, and reliability issue of modern DRAM chips that can be exploited to break memory isolation. Two prominent examples of read-disturb phenomena are RowHammer and RowPress. However, no prior work extensively studies read-disturb phenomena in modern high-bandwidth memory (HBM) chips. In this work, we experimentally demonstrate the effects of read disturbance and uncover the inner workings of undocumented in-DRAM read disturbance mitigation mechanisms in HBM. Our characterization of six real HBM2 DRAM chips shows that (1) the number of read disturbance bitflips and the number of row activations needed to induce the first read disturbance bitflip significantly varies between different HBM2 chips and different 3D-stacked channels, pseudo channels, banks, and rows inside an HBM2 chip. (2) The DRAM rows at the end and in the middle of a DRAM bank exhibit significantly fewer read disturbance bitflips than the rest of the rows. (3) It takes fewer additional activations to induce more read disturbance bitflips in a DRAM row if the row exhibits the first bitflip already at a relatively high activation count. (4) HBM2 chips exhibit read disturbance bitflips with only two row activations when rows are kept active for an extremely long time. We show that a modern HBM2 DRAM chip implements undocumented read disturbance defenses that can track potential aggressor rows based on how many times they are activated, and refresh their victim rows with every 17 periodic refresh operations. We draw key takeaways from our observations and discuss their implications for future read disturbance attacks and defenses. We explain how our findings could be leveraged to develop both i) more powerful read disturbance attacks and ii) more efficient read disturbance defense mechanisms.
- ACM TACOMetaSys: A Practical Open-Source Metadata Management System to Implement and Evaluate Cross-Layer OptimizationsNandita Vijaykumar, Ataberk Olgun, Konstantinos Kanellopoulos, F Nisa Bostanci, Hasan Hassan, Mehrshad Lotfi, Phillip B Gibbons, and Onur MutluACM TACO, 2022
This paper introduces the first open-source FPGA-based infrastructure, MetaSys, with a prototype in a RISC-V core, to enable the rapid implementation and evaluation of a wide range of cross-layer techniques in real hardware. Hardware-software cooperative techniques are powerful approaches to improve the performance, quality of service, and security of general-purpose processors. They are however typically challenging to rapidly implement and evaluate in real hardware as they require full-stack changes to the hardware, OS, system software, and instruction-set architecture (ISA). MetaSys implements a rich hardware-software interface and lightweight metadata support that can be used as a common basis to rapidly implement and evaluate new cross-layer techniques. We demonstrate MetaSys’s versatility and ease-of-use by implementing and evaluating three cross-layer techniques for: (i) prefetching for graph analytics; (ii) bounds checking in memory unsafe languages, and (iii) return address protection in stack frames; each technique only requiring 100 lines of Chisel code over MetaSys. Using MetaSys, we perform the first detailed experimental study to quantify the performance overheads of using a single metadata management system to enable multiple cross-layer optimizations in CPUs. We identify the key sources of bottlenecks and system inefficiency of a general metadata management system. We design MetaSys to minimize these inefficiencies and provide increased versatility compared to previously-proposed metadata systems. Using three use cases and a detailed characterization, we demonstrate that a common metadata management system can be used to efficiently support diverse cross-layer techniques in CPUs.
- arXivA Case for Self-Managing DRAM Chips: Improving Performance, Efficiency, Reliability, and Security via Autonomous in-DRAM Maintenance OperationsHasan Hassan, Ataberk Olgun, A Giray Yaglikci, Haocong Luo, and Onur Mutlu2022
The memory controller is in charge of managing DRAM maintenance operations (e.g., refresh, RowHammer protection, memory scrubbing) in current DRAM chips. Implementing new maintenance operations often necessitates modifications in the DRAM interface, memory controller, and potentially other system components. Such modifications are only possible with a new DRAM standard, which takes a long time to develop, leading to slow progress in DRAM systems. In this paper, our goal is to 1) ease, and thus accelerate, the process of enabling new DRAM maintenance operations and 2) enable more efficient in-DRAM maintenance operations. Our idea is to set the memory controller free from managing DRAM maintenance. To this end, we propose Self-Managing DRAM (SMD), a new low-cost DRAM architecture that enables implementing new in-DRAM maintenance mechanisms (or modifying old ones) with no further changes in the DRAM interface, memory controller, or other system components. We use SMD to implement new in-DRAM maintenance mechanisms for three use cases: 1) periodic refresh, 2) RowHammer protection, and 3) memory scrubbing. We show that SMD enables easy adoption of efficient maintenance mechanisms that significantly improve the system performance and energy efficiency while providing higher reliability compared to conventional DDR4 DRAM. A combination of SMD-based maintenance mechanisms that perform refresh, RowHammer protection, and memory scrubbing achieve 7.6% speedup and consume 5.2% less DRAM energy on average across 20 memory-intensive four-core workloads.
- ISCAQUAC-TRNG: High-Throughput True Random Number Generation Using Quadruple Row Activation in Commodity DRAM ChipsAtaberk Olgun, Minesh Patel, A Giray Yağlıkçı, Haocong Luo, Jeremie S Kim, F Nisa Bostancı, Nandita Vijaykumar, Oğuz Ergin, and Onur MutluIn ISCA, 2021
True random number generators (TRNG) sample random physical processes to create large amounts of random numbers for various use cases, including security-critical cryptographic primitives, scientific simulations, machine learning applications, and even recreational entertainment. Unfortunately, not every computing system is equipped with dedicated TRNG hardware, limiting the application space and security guarantees for such systems. To open the application space and enable security guarantees for the overwhelming majority of computing systems that do not necessarily have dedicated TRNG hardware, we develop QUAC-TRNG. QUAC-TRNG exploits the new observation that a carefully-engineered sequence of DRAM commands activates four consecutive DRAM rows in rapid succession. This QUadruple ACtivation (QUAC) causes the bitline sense amplifiers to non-deterministically converge to random values when we activate four rows that store conflicting data because the net deviation in bitline voltage fails to meet reliable sensing margins. We experimentally demonstrate that QUAC reliably generates random values across 136 commodity DDR4 DRAM chips from one major DRAM manufacturer. We describe how to develop an effective TRNG (QUAC-TRNG) based on QUAC. We evaluate the quality of our TRNG using NIST STS and find that QUAC-TRNG successfully passes each test. Our experimental evaluations show that QUAC-TRNG generates true random numbers with a throughput of 3.44 Gb/s (per DRAM channel), outperforming the state-of-the-art DRAM-based TRNG by 15.08x and 1.41x for basic and throughput-optimized versions, respectively. We show that QUAC-TRNG utilizes DRAM bandwidth better than the state-of-the-art, achieving up to 2.03x the throughput of a throughput-optimized baseline when scaling bus frequencies to 12 GT/s.