3D NAND: How Will It Evolve?

Nov 11, 2025

Since its introduction to the memory market in the late 1980s, NAND flash memory has fundamentally changed the way large amounts of data are stored and retrieved.

This non-volatile memory, designed specifically for high-density data storage, is used in almost every segment of the electronics market, from smartphones to data centers. It is found in most removable and portable storage devices, such as SD cards and USB drives. In recent years, 3D NAND has also played an important role in the rapid growth of artificial intelligence, providing efficient storage for the large amounts of data required to train AI models.

With data storage demand growing explosively, chip companies are competing to increase the storage cell density of NAND flash memory, measured in gigabits per square millimeter (Gb/mm²), while reducing the cost per bit. More than a decade ago, the semiconductor industry transitioned from 2D NAND to 3D NAND to overcome the limits of conventional memory scaling. In recent years, companies have raised storage density by increasing both the number of storage cell layers per chip and the number of bits stored per cell (commercial NAND flash memory stores up to four bits per cell).

One of the most important advances is the transition from floating gate transistors to charge trap cells. Floating gate technology stores charge in a conductor, while charge trap cells store charge in an insulator. This reduces the electrostatic coupling between storage cells, thereby improving read and write performance. In addition, because charge trap cells can be manufactured smaller than floating gate transistors, the change also paves the way for higher storage densities.

But as 3D NAND technology pushes against physical limits, the semiconductor industry is turning to new technologies to pack storage cells more tightly - not only horizontally, but also vertically. Two innovations developed by IMEC enable vertical scaling without sacrificing the performance and reliability of the memory: air gap integration and charge trap layer separation.

Inside the Charge Trap Cell: The Basic Building Block of 3D NAND

The semiconductor industry plans to introduce gate-all-around (GAA), or nanosheet, transistors in logic chips in the coming years. The GAA architecture, however, is already widely used in 3D NAND flash memory, where it is the workhorse of high-density data storage. In this 3D architecture, storage cells are stacked in vertical strings and addressed through horizontal word lines.

In most cases, charge trap cells act as the storage devices in 3D NAND. The cell is similar to a MOSFET, but it embeds a thin layer of silicon nitride (SiN) within the gate oxide of the transistor. This turns the gate dielectric into a three-layer stack called the oxide-nitride-oxide (ONO) stack, whose layers serve as the blocking oxide, the trap nitride, and the tunnel oxide, respectively (Figure 1).

Figure 1: A 3D NAND GAA architecture with a vertical string of charge trap cells that have oxide-nitride-oxide (ONO) gate dielectrics, shown with a limited number of word lines (WL).

When a positive bias voltage is applied to the gate, electrons in the channel region tunnel through the tunnel oxide and are captured in the silicon nitride layer. This raises the threshold voltage of the transistor. The state of a storage cell can be read by applying a voltage between the source and drain. If current flows, no electrons are trapped and the cell is in the "1" state. If no current is measured, electrons are trapped and the cell is in the "0" state.
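The read operation described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the cell is treated as an ideal switch that conducts when a read voltage on the gate exceeds its threshold voltage, and all voltage values and names are hypothetical, chosen for illustration rather than taken from a real device.

```python
def read_cell(threshold_v: float, read_v: float) -> int:
    """Return the bit read from a single-bit charge trap cell.

    Toy model: the cell conducts (reads as 1) when the read voltage on
    the gate exceeds the cell's threshold voltage. Trapped electrons
    raise the threshold, so a programmed cell stays off (reads as 0).
    """
    return 1 if read_v > threshold_v else 0


# Hypothetical voltages, for illustration only.
ERASED_VT = 0.0      # no electrons trapped in the SiN layer
PROGRAMMED_VT = 2.0  # electrons trapped, threshold voltage raised
READ_V = 1.0         # read voltage placed between the two states

print(read_cell(ERASED_VT, READ_V))      # current flows: state "1"
print(read_cell(PROGRAMMED_VT, READ_V))  # no current: state "0"
```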

The charge trap cell is implemented in the 3D NAND structure using a GAA vertical channel. Imagine rotating a planar transistor 90 degrees, so that the vertical conduction channel is surrounded by the gate stack.

Manufacturing the GAA channel begins with depositing alternating layers of conductor (silicon, used as word lines) and insulator (silicon oxide, used to separate the word lines). Next, advanced dry etching tools drill down through the stack to form cylindrical holes. Finally, silicon oxide and silicon nitride layers are deposited in turn on the sidewalls of the holes, with the polysilicon transistor channel at the center of all the layers. This structure is commonly referred to as the 'macaroni channel'.

Next Generation 3D NAND: Cell Stacking and Cell Scaling

In the coming years, the memory industry will push the GAA-based 3D NAND flash roadmap to its ultimate limit.

Today, mainstream manufacturers are launching 3D NAND flash memory chips with more than 300 oxide/word line layers (Figure 2). This number is expected to rise to 1000 layers by 2030, equivalent to a storage density of approximately 100 Gbit/mm². The challenge is to maintain a consistent word line diameter throughout a stack 30 microns thick. Keeping all features uniform in such tight dimensions steadily increases process complexity and cost, and places higher demands on the deposition of tall stacks and on high-aspect-ratio etching.
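The two figures quoted here, 1000 layers and a stack 30 microns thick, imply an average layer pitch. A minimal sketch of that arithmetic follows, assuming that each counted layer is one word line plus one oxide layer and that the whole stack height is taken up by these layers. The function name is illustrative only.

```python
def average_layer_pitch_nm(stack_height_um: float, layers: int) -> float:
    """Average pitch (one word line plus one oxide layer) in nanometers."""
    if layers <= 0:
        raise ValueError("layers must be positive")
    return stack_height_um * 1000.0 / layers


# Figures from the text: 1000 layers in a stack 30 microns thick.
print(average_layer_pitch_nm(30.0, 1000))  # 30.0
```

Under these assumptions, a 1000-layer chip has roughly 30 nm available per word line and oxide pair.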

Figure 2: A 3D NAND flash image highlighting the z-spacing between adjacent word lines.

To complement the stacking of more layers, semiconductor companies are investing in a range of supporting techniques to improve the storage density of 3D NAND. These 'scaling boosters' include increasing the number of bits per cell and reducing the xy spacing of the GAA cells (lateral scaling). In addition to improving bit density and cell density, companies are also working to increase the area efficiency of the storage array.

Another way to increase storage capacity is tier stacking, in which stacks of flash memory cells are built on top of each other to increase the total number of layers. In 3D NAND flash memory, storage cells are connected in series to form a string, which is made by stacking alternating insulator and conductor layers and etching holes through them. This cell stack process can be repeated two to three times - possibly even four times in the future - to create longer strings on each chip. Each cell stack is sometimes referred to as a 'tier'.

By building a large number of storage cells into each cell stack (tier) and stacking tiers to create taller 3D NAND chips, companies can increase the total number of layers without having to manufacture all layers at once. For example, a company can build cell stacks of 250 layers and stack four of them into a 3D NAND chip with 1000 layers. The main challenge is to etch sufficiently deep holes through these multi-layer stacks and to fill them uniformly.
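The benefit of splitting the stack can be made concrete with a small sketch that compares the hole depth per etch step for different numbers of cell stacks. The 30 nm layer pitch used here is an assumption, derived from the 1000 layers in a 30 micron stack mentioned earlier, and the function name is hypothetical.

```python
def stack_plan(total_layers: int, stacks: int, pitch_nm: float) -> tuple[int, float]:
    """Return (layers per cell stack, hole depth per etch step in microns)."""
    if total_layers % stacks != 0:
        raise ValueError("total_layers must divide evenly across the stacks")
    layers_per_stack = total_layers // stacks
    depth_per_etch_um = layers_per_stack * pitch_nm / 1000.0
    return layers_per_stack, depth_per_etch_um


PITCH_NM = 30.0  # assumed average pitch, not a figure for any specific product

print(stack_plan(1000, 1, PITCH_NM))  # (1000, 30.0): one very deep etch
print(stack_plan(1000, 4, PITCH_NM))  # (250, 7.5): four shallower etches
```

Each additional cell stack shortens the hole that has to be etched and filled in one step, at the price of repeating the process and aligning the stacks to each other.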

In addition, some companies are separating the underlying logic from the NAND array and reintegrating it on top of the array in a configuration called CMOS bonded array (CbA). In this configuration, the CMOS chip is manufactured on a separate silicon wafer and then connected to the NAND array using advanced packaging techniques, in particular hybrid bonding. CbA is the next stage of development after CMOS under array (CuA), in which the NAND array is manufactured directly on top of the CMOS circuitry in a single monolithic process.

Looking ahead, companies are considering bonding multiple storage arrays onto a single CMOS wafer as an alternative to stacking tiers of cells - or even bonding multiple array wafers onto multiple CMOS wafers.

To control continuously rising manufacturing costs, IMEC and semiconductor companies are actively exploring vertical, or "z-spacing", scaling techniques that reduce the thickness of the oxide layers and word line layers. In this way, more storage layers can be stacked at a controlled cost.

Advantages and disadvantages of Z-spacing scaling in 3D NAND flash memory

Reducing the spacing between storage layers is crucial for continuing to lower the cost of next-generation 3D NAND. The spacing between adjacent word lines is currently about 40 nanometers, and the purpose of z-spacing scaling is to further reduce the thickness of the word line layers and the silicon oxide layers in the stack. In this way, more storage layers fit into every micrometer of stack height, which increases the number of storage cells and ultimately reduces cost.
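How much a smaller spacing buys can be estimated with simple arithmetic. The sketch below starts from the roughly 40 nm spacing quoted above. The other pitch values are hypothetical targets chosen for illustration, and the function names are not from any library.

```python
def layers_per_micrometer(pitch_nm: float) -> float:
    """Number of word line layers that fit in one micrometer of stack height."""
    return 1000.0 / pitch_nm


def layer_gain(old_pitch_nm: float, new_pitch_nm: float) -> float:
    """Factor by which the layer count grows at a fixed stack height."""
    return old_pitch_nm / new_pitch_nm


CURRENT_PITCH_NM = 40.0  # approximate spacing quoted in the text

for pitch_nm in (40.0, 35.0, 30.0, 25.0):  # hypothetical targets below 40 nm
    print(
        f"{pitch_nm:.0f} nm pitch: "
        f"{layers_per_micrometer(pitch_nm):.1f} layers per micrometer, "
        f"{layer_gain(CURRENT_PITCH_NM, pitch_nm):.2f}x the layers at 40 nm"
    )
```

Going from 40 nm to 30 nm, for example, fits about a third more layers into the same stack height.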

However, without optimization, z-spacing scaling degrades the electrical performance of the storage cell. It can lead to a lower threshold voltage, a higher subthreshold swing, and poorer data retention. It also raises the voltages required to program and erase the cell, which increases power consumption, slows the memory down (through RC delay), and may lead to breakdown of the dielectric between adjacent cells.

These effects can be traced back to two physical phenomena that become more pronounced when memory cells are squeezed closer together: cell-to-cell interference and lateral charge migration.

When the thickness of the word line layer decreases, the gate length of the charge trap transistor shortens accordingly. As a result, the gate's control over the channel gradually weakens, which promotes electrostatic coupling between neighboring cells.

In addition to cell-to-cell interference, shrinking the storage cells in the vertical direction can also lead to lateral charge migration (or vertical charge loss): the charges trapped inside a storage cell tend to migrate out of it along the vertical SiN layer, which degrades data retention.

The charge trap cell has two geometric directions: z and xy (because of the cell's cylindrical symmetry, x and y are equivalent). Charge can leak from the storage cell in both directions. Along the xy direction, charge can escape the cell through the tunnel oxide and/or the blocking oxide of the gate stack. Along the z direction, it can also escape and end up inside adjacent cells or too close to them. This is lateral charge migration, which becomes more significant as the vertical size of the cells and the distance between them decrease.

Next, we will discuss the technology enablers that can address these drawbacks, allowing researchers to unlock z-spacing scaling for future generations of 3D NAND flash memory.

Between word lines: using air gaps to reduce cell interference

Integrating air gaps between adjacent word lines is a potential solution to cell-to-cell interference. The dielectric constant of these air gaps is lower than that of the inter-gate dielectric, which reduces the electrostatic coupling between storage cells. The technique has been widely used in planar 2D NAND flash memory. However, integrating air gaps into tall silicon oxide/word line stacks is more challenging.
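The effect of the lower dielectric constant can be illustrated with a rough parallel-plate estimate. This is a deliberately crude sketch: real word lines are not parallel plates, the air gap replaces only part of the oxide, and the plate area and gap used here are hypothetical. The relative permittivities (about 3.9 for silicon oxide, about 1.0 for air) are standard textbook values.

```python
EPS0_F_PER_M = 8.854e-12  # vacuum permittivity
K_SIO2 = 3.9              # relative permittivity of silicon oxide
K_AIR = 1.0               # relative permittivity of air (approximately)


def parallel_plate_capacitance(k: float, area_m2: float, gap_m: float) -> float:
    """Capacitance of an ideal parallel-plate capacitor, in farads."""
    return k * EPS0_F_PER_M * area_m2 / gap_m


# Hypothetical geometry, for illustration only.
AREA_M2 = 1e-15  # overlap area between two neighboring word lines
GAP_M = 15e-9    # dielectric thickness between them

c_oxide = parallel_plate_capacitance(K_SIO2, AREA_M2, GAP_M)
c_air = parallel_plate_capacitance(K_AIR, AREA_M2, GAP_M)

print(f"coupling with air relative to oxide: {c_air / c_oxide:.2f}")
```

In this idealized picture, the ratio depends only on the two dielectric constants, so a gap filled entirely with air would couple neighboring gates about four times more weakly than one filled with silicon oxide.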

To overcome these complexities, IMEC proposed a unique integration scheme at the IEEE International Memory Workshop (IMW) in 2025, which enables precise control of the air gap position between word lines.

In 3D NAND memory, a thin layer of silicon oxide is present both inside the gate stack of the storage cell - as a "gate dielectric", separating the word line from the transistor channel - and between the word lines of different storage cells - as an "inter-gate dielectric", separating adjacent cells from each other (Figure 3). The gate dielectric forms the tunnel layer and blocking layer of the ONO stack and surrounds the charge trap SiN layer.

Figure 3: Process flow for integrating the air gap in 3D NAND (a-d), together with transmission electron microscopy (TEM) and energy dispersive X-ray spectroscopy (EDS) images of the air gap (e-f).

Silicon oxide is therefore present not only inside each storage cell, but also between cells. Because of the way 3D NAND storage cells are manufactured, the gate dielectric extends continuously from one cell to the next and meets the inter-gate dielectric in the space between adjacent storage cells. IMEC believes that this is the ideal location for the air gap. However, with current process technology, removing (or cutting) the charge trap SiN layer between cells remains a huge challenge.

At IMEC, we have found a new way to integrate air gaps without cutting the SiN between storage cells. The approach introduces the air gap from inside the memory hole by recessing the inter-gate silicon oxide before depositing the ONO stack. The air gap is self-aligned to the word lines, which allows very precise placement. The method is also potentially scalable, whereas scalability is the main problem with other proposed solutions.

The results show that devices with air gaps are less sensitive to interference from adjacent cells than devices without air gaps. This conclusion was drawn by applying a so-called "pass voltage" to the unselected gates, which produced a smaller threshold voltage shift in the air gap devices (Figure 4). The result was obtained on a test device with a limited number of word line layers, a pitch of 30 nm (gate length of 15 nm, inter-gate silicon oxide thickness of 15 nm), and a memory hole diameter of 80 nm.

Figure 4: Threshold voltage shift of charge trap devices with and without air gaps (left) at different pass voltages.

IMEC researchers also investigated the impact of air gaps on memory performance and reliability. The results indicate that the air gap does not affect memory operation, and that endurance reaches 1000 program/erase cycles, comparable to devices without air gaps.

Based on these results, hole-side air gap integration is considered a key step toward future z-spacing scaling.

Charge trap cutting: its position in the future development of flash memory

IMEC has shown that introducing an air gap between the word lines is feasible. For now, however, these gaps extend only as far as the blocking oxide of the storage cell. What if we could go deeper into the storage cell and extend the air gaps into the regions of the blocking oxide and the charge trap layer?

We tested this approach in simulation, and the results showed that this charge trap layer separation (or charge trap cut) increases the memory window of the storage cell (Figure 5). In addition, the charge trap cut prevents trapped charge from migrating laterally along the SiN layer, up and down the height of the oxide/word line stack.

Figure 5: The difference between a continuous gate stack (left) and a gate stack with charge trap cut and air gap integration (right).

Data is stored in flash memory cells by programming the threshold voltage to different levels. To store one bit of data, a cell requires two levels: for example, 0 V and 1 V. To store two bits of data, a cell requires four levels: for example, 0 V, 0.5 V, 1 V, and 1.5 V. As the number of bits increases, the number of required voltage levels increases as well.

To fit more levels, either the total threshold voltage range (the memory window) must widen or the spacing between adjacent levels must shrink (in the example above, 1 V for one bit and 0.5 V for two bits). However, when the voltage levels are too close together, they become harder to distinguish. By widening the memory window, the charge trap cut can help each storage cell support more levels and therefore store more bits.
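The relationship between bits, levels, and level spacing can be written out explicitly. The sketch below assumes evenly spaced levels and defines the window as the span between the lowest and highest level, which matches the example voltages above. The function names are illustrative, and the 1.5 V window used for three and four bits is purely hypothetical.

```python
def levels_needed(bits_per_cell: int) -> int:
    """Number of distinct threshold voltage levels needed for n bits."""
    return 2 ** bits_per_cell


def level_spacing_v(window_v: float, bits_per_cell: int) -> float:
    """Spacing between adjacent, evenly spaced levels within the window."""
    return window_v / (levels_needed(bits_per_cell) - 1)


# The two examples from the text.
print(levels_needed(1), level_spacing_v(1.0, 1))  # 2 levels, 1.0 V apart
print(levels_needed(2), level_spacing_v(1.5, 2))  # 4 levels, 0.5 V apart

# Hypothetical: keep the 1.5 V window and add more bits per cell.
for bits in (3, 4):
    print(levels_needed(bits), round(level_spacing_v(1.5, bits), 3))
```

With a fixed window, each extra bit roughly halves the spacing between levels, which is why a wider memory window makes more bits per cell practical.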

However, integrating a charge trap cut in 3D NAND flash memory is not an easy task, as it requires directional etching and deposition on the sidewalls of extremely deep and narrow holes. For such structures, the technology toolbox used for 2D NAND flash memory is no longer applicable. IMEC is currently collaborating with its suppliers to develop new techniques for a controllable charge trap cut.

Once the charge trap layer can be interrupted, IMEC plans to combine it with an air gap integration scheme to provide a complete and scalable solution for the z-spacing scaling challenge.