Maximum DDR4 memory bandwidth. The beginning of a new era

For modern games to run faster, the computer needs, among other things, a sufficient amount of RAM. Why is this necessary? Today's games have very large locations with a considerable number of objects, all of which are held in RAM. If there is not enough RAM, the game falls back to persistent storage, and if that storage is a slow HDD, the user will inevitably see freezes.

Corridor shooters may not require a lot of memory, but in large-scale RTS or FPS games the amount of RAM makes a difference. For example, the recommended system requirements for Battlefield 1 call for 16 GB of RAM or more.

Samsung DDR4 2666 DIMM 8Gb

Almost every user has probably heard of this much-discussed Samsung model. Unfortunately, it is not sold as a kit of several modules, but nothing prevents you from buying the sticks one at a time and installing them together. In addition to its extremely low cost, this memory has excellent overclocking potential, which is why overclockers love it. The stock speed is only 2666 MHz, but on a good motherboard the module reaches 3200 to 3666 MHz without much difficulty, even though dual-rank memory usually overclocks worse than single-rank memory.

Advantages

Excellent overclocking potential
Very cheap
Very common in the market

Disadvantages

Very plain appearance

Patriot Memory PV416G320C6K

If you don't want to overclock and your budget is very limited, we recommend looking at Patriot Memory. This factory-overclocked kit is rated at 3200 MHz. You can, of course, try to squeeze out more, but you most likely won't succeed. The dual-rank PV416G320C6K runs at 3200 MHz only when the XMP profile is enabled, which also relaxes the timings. Out of the box you will see a mere 2133 MHz.

In addition to the high frequency, the kit offers an attractive design that fits well into a red-themed build. The heatsink can also be detached if it gets in the way of a tower CPU cooler. The kit comes with a 10-year warranty!

Advantages

Low cost
High frequency with XMP profile support
Nice design
Detachable heatsinks
10-year warranty

Disadvantages

Memory chip vendor is not guaranteed

Kingston HyperX HX432C16PB3K2/16

Kingston is one of the oldest memory manufacturers on the market. Its HyperX brand is aimed at gamers, and its products meet high quality standards. It is not surprising that the HyperX HX432C16PB3K2/16 memory kit comes with a lifetime warranty. Of course, this is not the cheapest option, but it still fits a budget build very well.

The operating frequency of this kit is the same as that of the previous one, 3200 MHz via an XMP profile, but it overclocks much more stably. Apparently, this is exactly what the buyer pays extra for compared with the Patriot Memory PV416G320C6K. Kingston's traditional aggressive black styling is also worth noting.

Advantages

Lifetime Warranty
Stable overclocking
Interesting design

Disadvantages

Slightly overpriced

Patriot Memory PVS416G400C9K

If you own a Ryzen processor, you are probably looking at high-frequency memory that is overclocked from the factory. Patriot Viper offers the cheapest kit on the market rated at 4000 MHz. Of course, getting the sticks to run at that frequency will take a lot of trial and error in any case, but the performance gain is worth it. Note that even the highest-quality single-rank modules built on B-die chips cannot always reach the 4000 MHz mark. So why pay extra for a brand?

Advantages

Interesting design
High frequency from factory
B-die chips are often found
Low cost
Almost always in stock

DDR4 SDRAM is the latest JEDEC memory standard. It provides higher levels of performance, with lower power consumption and greater reliability than DDR3.

JEDEC began work on DDR4 back in 2005 and published the final specification in September 2012. Samsung released the first prototype DDR4 modules at the end of 2010 and the first sample of a 16GB DDR4 module in July 2012. The first motherboards to support DDR4 memory were released with the Intel X99 chipset in August 2014.

DDR4 SDRAM modules use the Pseudo Open Drain (POD) interface (previously used in high-performance graphics DRAM) and operate at a lower voltage of 1.2 V (compared to 1.5 V for DDR3). This allows DDR4 modules to consume 40% less total power than previous modules, which saves energy and generates less heat. To improve system reliability, DDR4 also supports a cyclic redundancy check (CRC) on data writes.
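As a rough sanity check on the power figure, the sketch below looks at the voltage drop alone. It assumes the common approximation that dynamic power scales with the square of the supply voltage at a fixed frequency and capacitance; that assumption and the function name are ours, not part of the DDR4 specification.

```python
def relative_dynamic_power(v_new: float, v_old: float) -> float:
    """Ratio of dynamic power at v_new to dynamic power at v_old.

    Assumes P is proportional to V^2 (frequency and capacitance unchanged),
    which is only an approximation of real module behavior.
    """
    return (v_new / v_old) ** 2


ratio = relative_dynamic_power(1.2, 1.5)   # DDR4 voltage vs DDR3 voltage
saving_percent = (1 - ratio) * 100
print(f"Relative power: {ratio:.2f}, saving: {saving_percent:.0f}%")
```

Under this assumption the voltage change by itself accounts for a saving in the mid-thirties of percent, the same order as the 40% quoted above.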

The 288-pin DDR4 SDRAM module is 1mm longer and 1mm taller than 240-pin DDR3/DDR2 modules. The extra contacts fit because each pin is only 0.85mm wide, narrower than the 1mm pins used on previous modules. The bottom edge of a DDR4 module is also slightly curved: starting about halfway between each end and the center notch, the contacts toward the ends are shorter than those near the notch, which makes installation easier. Because of the different dimensions and signals, DDR4 modules are physically and electrically incompatible with previous memory modules and socket designs.

DDR4 modules were initially available at speeds of 1600 MHz (effective) and higher, and JEDEC-standard speeds currently reach 3,200 MHz (effective). As with DDR, DDR2 and DDR3, the true clock speed is half the effective speed, which is technically expressed in millions of transfers per second (MTps). The table below shows JEDEC's officially approved DDR4 module types and their throughput specifications.

JEDEC Standard DDR4 Modules (288-pin DIMM) Speeds and Transfer Rates

Module type  Chip type  Base clock speed  Cycle time  Cycles per clock  Bus speed  Bus width  Module transfer rate  Dual-channel transfer rate
PC4-12800    DDR4-1600  800MHz            1.25ns      2                 1,600MTps  8 bytes    12,800MBps            25,600MBps
PC4-14900    DDR4-1866  933MHz            1.07ns      2                 1,866MTps  8 bytes    14,933MBps            29,866MBps
PC4-17000    DDR4-2133  1,066MHz          0.94ns      2                 2,133MTps  8 bytes    17,066MBps            34,133MBps
PC4-19200    DDR4-2400  1,200MHz          0.83ns      2                 2,400MTps  8 bytes    19,200MBps            38,400MBps
PC4-21300    DDR4-2666  1,333MHz          0.75ns      2                 2,666MTps  8 bytes    21,333MBps            42,666MBps
PC4-25600    DDR4-3200  1,600MHz          0.63ns      2                 3,200MTps  8 bytes    25,600MBps            51,200MBps

DDR = Double Data Rate
MHz = million cycles per second
MTps = million transfers per second
MBps = million bytes per second
ns = nanoseconds (billionths of a second)
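The columns of the table follow from the transfer rate by simple arithmetic, which can be sketched as below. The function name `ddr4_row` and the dictionary keys are ours, chosen for illustration; the 8-byte bus width and the two transfers per clock come from the table itself.

```python
def ddr4_row(data_rate_mtps: float, bus_bytes: int = 8) -> dict:
    """Derive the table columns for one DDR4 speed grade from its data rate."""
    base_clock_mhz = data_rate_mtps / 2        # two transfers per clock cycle
    cycle_time_ns = 1000 / base_clock_mhz      # period of the base clock
    module_mbps = data_rate_mtps * bus_bytes   # 8 bytes moved per transfer
    return {
        "base_clock_mhz": base_clock_mhz,
        "cycle_time_ns": cycle_time_ns,
        "module_mbps": module_mbps,
        "dual_channel_mbps": 2 * module_mbps,
    }


# Names such as DDR4-2133 are rounded; the underlying rate is 6400/3 MT/s.
for name, rate in [("DDR4-1600", 1600), ("DDR4-2133", 6400 / 3), ("DDR4-3200", 3200)]:
    row = ddr4_row(rate)
    print(name,
          f"{row['base_clock_mhz']:.0f} MHz",
          f"{row['cycle_time_ns']:.2f} ns",
          f"{row['module_mbps']:.0f} MBps",
          f"{row['dual_channel_mbps']:.0f} MBps")
```

Printed values for the fractional speed grades may differ from the table by one unit in the last digit, because the table truncates where this sketch rounds.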

Technically, the DDR4 topology is not a bus, as was used in DDR3 and earlier memory standards. Instead, DDR4 SDRAM uses a point-to-point connection, where each channel in the memory controller is connected to a single module.

Typically, you can find DDR4 modules rated CL12 - CL16.

RDRAM

Rambus DRAM (RDRAM) is a proprietary memory technology (not a JEDEC standard) that was used primarily in some Intel-based Pentium III and Pentium 4 systems from 2000 to 2002. Today these systems are almost never used.

Last year we ran a quick test of LGA1151 processors with both DDR3 and DDR4 memory, and this year we extended it slightly toward the budget models for this platform. The overall impression was that the new type of memory brings no performance advantage but does save some energy, which in recent years has been the main focus of Intel's efforts when developing new microarchitectures. However, we did not study how memory affects the power consumption of Intel's higher-end processors. Moreover, those tests used the old testing methodology, very different motherboards and so on, so last year's conclusions may be outdated. We therefore decided to investigate the issue more carefully and in more detail.

Test bench configuration

CPU                           Intel Celeron G3900   Intel Pentium G4500T  Intel Core i3-6100    Intel Core i5-6400    Intel Core i7-6700K
Core name                     Skylake               Skylake               Skylake               Skylake               Skylake
Production technology         14 nm                 14 nm                 14 nm                 14 nm                 14 nm
Core frequency std/max, GHz   2.8                   3.0                   3.7                   2.7/3.3               4.0/4.2
Number of cores/threads       2/2                   2/2                   2/4                   4/4                   4/8
L1 cache (total), I/D, KB     64/64                 64/64                 64/64                 128/128               128/128
L2 cache, KB                  2×256                 2×256                 2×256                 4×256                 4×256
L3 (L4) cache, MiB            2                     3                     3                     6                     8
RAM                           2×DDR3-1600 / 2×DDR4-2133 (all five processors)
TDP, W                        51                    35                    51                    65                    91
Graphics                      HDG 510               HDG 530               HDG 530               HDG 530               HDG 530
Number of EUs                 12                    23                    23                    24                    24
GPU frequency std/max, MHz    350/950               350/950               350/1050              350/950               350/1150

We used five processors, two of which had already been tested earlier. That is why today we use the results of the Pentium G4500T rather than the G4500/G4520, which are somewhat more relevant to retail buyers: it simply saved time. Still, we are most interested not in them but in processors of a slightly higher class, for example the entry-level models of their lines, the Core i3-6100 and i5-6400. Why the entry-level ones? It seems to us that their buyers are the most likely to want to save money by upgrading the system while keeping their DDR3 instead of moving to DDR4. And when buying a new system, the fact that budget boards with DDR3 support are currently slightly cheaper than their counterparts with DDR4 slots matters most to those assembling a budget computer. Anyone who can afford something like a Core i3-6320 would do better to stretch to the “real quad-core” Core i5-6400. Nevertheless, we also could not help but test the top-end Core i7-6700K with DDR3: it is Intel’s fastest (and most power-hungry) offering for this platform, and is therefore essential for assessing the maximum potential effect of switching to the new memory standard.

As for the memory modules themselves, in both cases we used a pair with a total capacity of 8 GB. The frequency matched what the platform officially supports: 1600 MHz for DDR3 and 2133 MHz for DDR4. In principle, some motherboard manufacturers offer memory overclocking for DDR3, but there is one delicate point: to reach high frequencies, the supply voltage is usually raised to 1.65 V (instead of the standard 1.5 V). Intel has advised against this since the days of LGA1156, warning that increased voltage can damage the processor. Officially, LGA1151 processors are not even specified for DDR3 but for DDR3L running at 1.35 V, so for them the problem may be more pronounced. To be fair, over the past seven years we have never encountered a processor failure, even with “overclocker” modules. Nor have we heard of a case in which such a problem could be unambiguously confirmed. Still, better safe than sorry :) Besides, the various “high-end” modules with decorative heatsinks and LEDs do not fit the idea of minimizing system cost anyway, since they already cost more than mass-market DDR4. But plain DDR3-1600 can still be useful.

Two motherboards were required. Ideally, of course, such testing should be done on a universal model that accepts both memory types. ASRock already has three of them in its range, but we have not gotten our hands on one yet. So we simply took two boards that are as similar as possible in design and even in purpose: the ASRock Fatal1ty B150 Gaming K4 and the Asus B150 Pro Gaming D3. Both are based on the same chipset, which may also matter, and have a similar (ten-phase) processor power circuit.

Testing methodology

The methodology is described in detail in a separate article. Let us briefly recall here that it rests on several pillars, including:

  • Methodology for measuring power consumption when testing processors
  • Methodology for monitoring power, temperature and processor load during testing

Detailed results of all tests are available as a complete table (in Microsoft Excel 97-2003 format). In the articles themselves we use processed data. This applies especially to the application tests, where everything is normalized to a reference system (as last year, a laptop based on a Core i5-3317U with 4 GB of memory and a 128 GB SSD) and grouped by area of computer use.

iXBT Application Benchmark 2016

The very first group of programs brought a surprise: on three of the five processors, DDR3 turned out to be faster than DDR4. The detailed results show that we have one program to “thank” for this, namely Adobe After Effects CC 2015. Its previous version, as we recall, gave us a lot of trouble with its demands on memory capacity (which also depended on the rest of the hardware), and now there is a new problem, again related to memory. On slow processors it is imperceptible: there the confidence intervals of the measurements overlap significantly. But once four or more computation threads are available, the difference can no longer be attributed to error: on the Core i3-6100 and i5-6400 it exceeds 10%. On the i7-6700K it shrinks slightly, apparently thanks to the larger cache. “Progress” sometimes looks like this. The other programs in the group run either the same or a little faster with DDR4, so the group results end up almost equal for the two memory types. They are not equal for different processors, of course, so this is exactly the case where the money saved by keeping old memory can go toward a faster processor, which will pay off handsomely.

In this case, on the contrary, we have some increase in results when using DDR4, and the faster the processor, the higher it is. But even in extreme cases it does not exceed 3%, i.e. it’s not worth rushing to change memory just because of performance.

Formally, the new memory is better, but in fact the difference of a fraction of a percent may be of interest only to fans of benchmarks, but not for practical use.

A similar case. No, of course, the results are consistently higher. But such an increase in performance cannot be recorded without a photo finish, so it is better to simply not pay attention to it.

Again the differences are within 1%. Even where they exist at all. For buyers of entry-level systems, it makes even more sense not to worry, but to try to save money. Even when buying a new computer, you can still think about this, not to mention the case when a sufficient amount of DDR3 remains from the old one.

When packing the data, the Core i7-6700K still managed to heroically squeeze out as much as 2% of the difference due to the higher bandwidth. For the rest, DDR3-1600 is more than enough, and DDR4 may even get in the way due to still high latencies.

File operations have been able to “load” memory heavily for the last five years, but in this case we are not inclined to attribute the effect to memory performance. Other factors are more likely, such as the memory controller operating in the mode for which it was mainly designed.

Judging by the results of the lower-end Intel processors, the higher latencies of DDR4 seemed to hurt this program in general. However, the faster models show that as processor performance grows, so do the demands on memory bandwidth. As a result, it is possible to “squeeze out” up to 3-4%. That looks good only against the background of the other application groups and is too small to matter in practice.

Overall, we arrive at almost complete equivalence of the two memory types, since the difference between them is within the margin of error. As we saw above, some programs do “vote firmly” for one of the options, but in such a strange way that it can be put down to bugs (or, which amounts to the same thing, excessive and unnecessary optimization) that will be fixed over time. In any case, the results come nowhere near increasing by a third, which is what scaling in proportion to the effective frequency (2133 versus 1600 MHz) would give.

Energy consumption and energy efficiency

To keep the diagrams to a manageable size, we limited ourselves to three processors: the two extremes and the middle one (the results of the other two systems are in the summary file). They show well why all this was started, and also that for the lower configurations the effect can be neglected: some savings are observed even with the Celeron G3900, but given its very small “appetite” overall, plus or minus five watts in a desktop system is not a problem. The 10-15 W seen with top processors is already something, but in relative terms it is not worth attention either.

But, of course, it can bring a little moral satisfaction to a big fan of “green” technology. The same goes for LGA1151 in general: according to our tests, even with DDR3 it is the most energy-efficient desktop platform today, on a par even with “surrogate” systems while offering incomparably higher performance. LGA1150 was not bad in this respect either, and even the “old” LGA1155 would still look good had its life been extended in the absence of newer platforms. In fact, desktop platforms have had no competition in energy efficiency for a long time, so the “strengthening and deepening” of work in this direction is an echo of events in completely different markets.

One question remains open, however: how different memory types affect the power consumption of the processor itself. “Platform” efficiency is easy to explain, since the memory modules themselves draw different amounts of power. But does this directly affect the memory controller integrated into the processor? That cannot be said in advance. A discrete video card, for example, also “spoils” the energy-efficiency figures without directly affecting the processor in any way. So we need to measure. This is not a problem on recent platforms: since LGA1150, the processor has been powered entirely through a dedicated power supply line.

As we can see, there is an effect. It is more modest than for the “platform”, but it does not favor the older memory type. Again, for the lower-end models in Intel's range it can be neglected, but with the higher-end ones you can get an extra ten watts “under the hood”. And that is with standard DDR3 modules at 1.5 V; raising the voltage (in an attempt to increase the memory frequency) will of course only make things worse. So the recommendation not to raise the supply voltage of memory modules can be trusted: it will bring nothing good, and quite possibly something bad. But let everyone decide for themselves whether to take the risk. In any case, the effect of DDR3 memory on the CPU's own power consumption (and hence heat dissipation) is a documented fact, as is the small size of this effect for budget and even mid-range processors.

iXBT Game Benchmark 2016

To avoid overloading the article with many broadly similar diagrams, we once again made do with the integral score (remember: it reflects not absolute figures but the ability of a system to deliver at least 30 frames per second in different games).

Everything here is obvious. Higher memory bandwidth does benefit the integrated GPU, but it cannot fundamentally change the situation. In some places it raises the frame rate from, say, 28 to 31, which affects the overall score, but there are no wow effects. This once again confirms that when buying a computer for gaming, you should start from the video card. Then you can think about the processor, and everything else is a matter of taste, if any money remains :) But the demands of modern (and even not so modern) games are such that money is unlikely to remain after the first step. So if using “old” memory lets you buy a slightly faster video card, you should definitely take advantage of it. And attempts to improve integrated graphics performance without radical changes are not worth even the time spent, let alone the money.

Summary

So, we have refined the earlier results and concluded that, for now, the effect of moving to DDR4 is even more modest than it previously seemed. It does not follow, however, that the transition should be actively resisted. First, the new memory saves some power. Moreover (and this matters too), it is not only the system as a whole that becomes more efficient: the processor's own consumption is also slightly lower, so it runs under gentler conditions and cooling is easier. Second, shipments of DDR3 are declining quite quickly, so this memory will certainly not get cheaper, unlike DDR4. We will have to switch to DDR4 sooner or later anyway, and we would not be surprised if DDR3 support disappeared from new processors even within LGA1151. On the other hand, if you already have enough DDR3 and do not plan to add more in the near future, the transition can be postponed until a financially better moment. This will cause no problems even with a top-end processor, let alone mid-range and entry-level ones. But, naturally, you should not get carried away with raising the voltage on the modules, since this has a definite negative effect on the processor.

Here come the Intel Haswell-E processors. The site has already tested the top 8-core Core i7-5960X, as well as the ASUS X99-DELUXE motherboard. And, perhaps, the main feature of the new platform is support for the DDR4 RAM standard.

The beginning of a new era, the DDR4 era

About the SDRAM standard and memory modules

The first SDRAM modules appeared back in 1993. They were released by Samsung. And by 2000, SDRAM memory, due to the production capacity of the Korean giant, had completely ousted the DRAM standard from the market.

The abbreviation SDRAM stands for Synchronous Dynamic Random Access Memory. Let us explain each part. The memory is dynamic because the capacitors are so small that their charge must be refreshed constantly. Besides dynamic memory there is also static memory (SRAM), which does not need constant refreshing; SRAM underlies cache memory, for example. The memory is also synchronous, unlike asynchronous DRAM. Synchronous means that the memory performs each operation in a known amount of time (a known number of clock cycles). For example, when requesting data, the memory controller knows exactly how long it will take for the data to arrive. This property makes it possible to control the flow of data and queue it. Finally, a few words about “random access memory” (RAM): any cell can be accessed by its address at any moment for reading or writing, and always in the same amount of time, regardless of its location.

SDRAM memory module

As for the design of the memory itself, its cells are capacitors. If a capacitor holds a charge, the processor treats it as a logical one. If there is no charge, it is treated as a logical zero. The cells are arranged in a flat structure, and the address of each cell is given by its row and column number in the table.

Each chip contains several independent memory arrays, each of which is such a table. They are called banks. Only one cell in a bank can be worked with at any one time, but several banks can be worked with at once. The information being written does not have to be stored in a single array. Often it is split into several parts and written to different banks, while the processor continues to treat the data as a single whole. This method is called interleaving. In theory, the more banks in memory, the better. In practice, chips with a density of up to 64 Mbit have two banks, those from 64 Mbit to 1 Gbit have four, and those of 1 Gbit and higher have eight.
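As a rough illustration of interleaving, the sketch below spreads consecutive data blocks across banks in round-robin fashion, so that neighboring blocks land in different banks. The function name `locate`, the four banks and the 1,024 columns per row are hypothetical values chosen for the example; real memory controllers use their own address mappings.

```python
def locate(block_index: int, banks: int = 4, columns: int = 1024) -> tuple:
    """Map a linear block index to a (bank, row, column) triple.

    The lowest part of the index selects the bank, so consecutive blocks
    go to different banks and can be worked on in parallel.
    """
    bank = block_index % banks
    within_bank = block_index // banks
    row, column = divmod(within_bank, columns)
    return bank, row, column


# Four consecutive blocks end up in four different banks.
for index in range(4):
    print(index, locate(index))
```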

What is a memory bank

And a few words about the structure of the memory module. The module itself is a printed circuit board with chips soldered onto it. As a rule, the devices on sale come in the DIMM (Dual In-line Memory Module) or SO-DIMM (Small Outline Dual In-line Memory Module) form factor. The first is intended for full-size desktop computers, the second for laptops. Despite sharing a form factor, memory modules of different generations differ in the number of contacts and the position of the key. For example, an SDRAM module has 168 pins for connecting to the motherboard, DDR has 184, DDR2 and DDR3 have 240 each, and DDR4 has 288. These figures are for DIMM modules. SO-DIMM devices naturally have fewer contacts because of their smaller size; a DDR4 SO-DIMM, for example, connects to the motherboard with 260 pins.

The DDR module (bottom) has more pins than SDRAM (top)

It is also fairly obvious that the capacity of a memory module is the sum of the capacities of the chips soldered onto it. Memory chips, of course, can differ in density (or, more simply, in capacity). For example, last spring Samsung launched mass production of chips with a density of 4 Gbit, and chips with a density of 8 Gbit are planned for the foreseeable future. Memory modules also have their own bus. The minimum bus width is 64 bits, which means that 8 bytes of information are moved per transfer. There are also 72-bit memory modules in which the “extra” 8 bits are reserved for ECC (Error Checking & Correction). The bus width of a memory module is likewise the sum of the bus widths of the individual memory chips. That is, if the module bus is 64 bits wide and eight chips are soldered onto the stick, each chip has a bus width of 64/8 = 8 bits.
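The two sums described above can be sketched in a few lines. The function names are ours, and the eight-chip module built from 4 Gbit chips is an assumed example configuration, not a specific product.

```python
def module_capacity_gb(chip_density_gbit: int, chips: int) -> float:
    """Module capacity in gigabytes: the sum of chip densities, 8 bits per byte."""
    return chip_density_gbit * chips / 8


def chip_bus_width_bits(module_bus_bits: int, chips: int) -> int:
    """Bus width of each chip when the module bus is split evenly between chips."""
    return module_bus_bits // chips


print(module_capacity_gb(4, 8))      # eight 4 Gbit chips
print(chip_bus_width_bits(64, 8))    # 64-bit module, eight chips
print(chip_bus_width_bits(72, 9))    # 72-bit ECC module with a ninth chip
```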

To calculate the theoretical bandwidth of a memory module, you can use the formula A*64/8=PS, where “A” is the data transfer rate in millions of transfers per second and “PS” is the resulting bandwidth in MB/s. As an example, take a DDR3 memory module with an effective frequency of 2400 MHz. Its throughput is 2400*64/8=19200 MB/s. This is the number that appears in the PC3-19200 module marking.
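The same formula as a small function, with the marking derived from the result. The names `module_bandwidth_mb_s` and `module_marking` are hypothetical helpers written for this example.

```python
def module_bandwidth_mb_s(data_rate_mtps: int, bus_bits: int = 64) -> int:
    """Theoretical bandwidth in MB/s: transfers per second times bytes per transfer."""
    return data_rate_mtps * bus_bits // 8


def module_marking(generation: int, data_rate_mtps: int) -> str:
    """Build a PCn-xxxxx style marking from the DDR generation and data rate."""
    return f"PC{generation}-{module_bandwidth_mb_s(data_rate_mtps)}"


print(module_bandwidth_mb_s(2400))
print(module_marking(3, 2400))
```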

How is information actually read from memory? First the address signal is sent to the appropriate row (Row), and only then is the information read from the desired column (Column). The information is read into the so-called sense amplifiers (Sense Amplifiers), the circuits that also recharge the capacitors. In most cases, the memory controller reads an entire burst of data (Burst) at once, several consecutive bits on each line of the bus. Accordingly, when writing, every 64 bits (8 bytes) are divided into several parts. There is also such a thing as burst length (Burst Length). If this length is 8, then 8*64=512 bits are transferred at once.
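The burst arithmetic is easy to check. The helper name `burst_size_bits` below is ours; the 64-bit bus and burst length of 8 are the values from the text.

```python
def burst_size_bits(burst_length: int, bus_bits: int = 64) -> int:
    """Number of bits moved by one burst: one bus-wide transfer per burst beat."""
    return burst_length * bus_bits


bits = burst_size_bits(8)
print(bits, "bits =", bits // 8, "bytes")
```

For a burst length of 8 on a 64-bit bus this gives 64 bytes per burst.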

Memory modules and chips also have a characteristic called geometry, or organization (Memory Organization). The geometry gives the width and the depth. For example, a chip with a density of 512 Mbit and a width of 4 bits has a depth of 512/4 = 128M. In turn, 128M = 32M * 4 banks. Each 32M bank is a matrix of roughly 16,000 rows by 2,000 columns (16,384 * 2,048 exactly), that is, 32M addressable locations of 4 bits each. As for the memory module itself, its width is almost always 64 bits. The depth is easily calculated as follows: the capacity of the module is multiplied by 8 to convert from bytes to bits and then divided by the width.
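The geometry arithmetic can be sketched as follows. The function names are ours, the 1,024 MB module is an assumed example, and 16,384 by 2,048 is taken here as the exact power-of-two form of the rounded row and column counts.

```python
def chip_depth_m(density_mbit: int, width_bits: int) -> int:
    """Chip depth in M locations: density divided by the chip width."""
    return density_mbit // width_bits


def module_depth_m(capacity_mb: int, width_bits: int = 64) -> int:
    """Module depth in M locations: capacity in bits divided by the module width."""
    return capacity_mb * 8 // width_bits


depth = chip_depth_m(512, 4)      # 512 Mbit chip, 4 bits wide
per_bank = depth // 4             # four banks per chip
print(depth, per_bank)
print(16384 * 2048 == per_bank * 2**20)   # rows * columns fills one bank
print(module_depth_m(1024))       # assumed 1,024 MB module, 64 bits wide
```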

You can easily find the timing values on the markings

A few words are needed about the characteristic of memory modules known as timings. At the very beginning of the article we said that under the SDRAM standard the memory controller always knows how long a particular operation takes. Timings specify exactly the time required to execute a particular command. This time is measured in memory bus clock cycles. The shorter it is, the better. The most important delays are:

  • TRCD (RAS to CAS Delay) - the time required to activate a bank row. Minimum time between the activation command and a read/write command;
  • CL (CAS Latency) - time between issuing a read command and the start of data transfer;
  • TRAS (Active to Precharge) - row active time. The minimum time between activating a row and the command to close it;
  • TRP (Row Precharge) - time required to close a row;
  • TRC (Row Cycle time, Activate to Activate/Refresh time) - time between activations of rows in the same bank;
  • TRRD (Active bank A to Active bank B) - time between activation commands for different banks;
  • TWR (Write Recovery time) - time between the end of a write and the command to close the bank row;
  • TWTR (Internal Write to Read Command Delay) - time between the end of a write and a read command.

Of course, these are not all the delays that exist in memory modules. You can list a dozen more different timings, but only the above parameters significantly affect memory performance. By the way, only four delays are indicated in the labeling of memory modules. For example, with parameters 11-13-13-31, the CL timing is 11, TRCD and TRP are 13, and TRAS is 31 clock cycles.
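Because timings are counted in clock cycles, their real duration depends on the memory clock. The sketch below converts a four-number marking into nanoseconds. The function name `timings_ns` is ours, and pairing 11-13-13-31 with a data rate of 1600 MT/s is an assumption made for the example, since the text does not say which module the marking belongs to.

```python
def timings_ns(label: str, data_rate_mtps: float) -> dict:
    """Convert a 'CL-tRCD-tRP-tRAS' marking from clock cycles to nanoseconds."""
    names = ("CL", "tRCD", "tRP", "tRAS")
    cycles = [int(part) for part in label.split("-")]
    # The real clock is half the data rate, so one cycle lasts 2000 / rate ns.
    cycle_ns = 2000 / data_rate_mtps
    return {name: count * cycle_ns for name, count in zip(names, cycles)}


result = timings_ns("11-13-13-31", 1600)   # assumed data rate of 1600 MT/s
for name, value in result.items():
    print(f"{name}: {value:.2f} ns")
```

Comparing modules in nanoseconds rather than in cycles shows why a higher CL number on faster memory does not automatically mean a longer real delay.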

Over time, the potential of SDRAM reached its ceiling, and manufacturers were faced with the problem of increasing the speed of RAM. This is how the DDR standard was born.

The Coming of DDR

The development of the DDR (Double Data Rate) standard began back in 1996 and ended with the official presentation in June 2000. With the advent of DDR, SDRAM memory became a thing of the past and was simply called SDR. How does the DDR standard differ from SDR?

Once the resources of SDR were exhausted, memory manufacturers had several ways to improve performance. One was simply to increase the number of memory chips, thereby increasing the capacity of the whole module. However, this would have raised the cost of such solutions considerably. The JEDEC manufacturers association therefore took a different route. It decided to double the width of the bus inside the chip and to transfer data on both edges of the clock signal, that is, twice per clock. This is where the abbreviation DDR, Double Data Rate, comes from.

Kingston DDR Memory Module

With the advent of the DDR standard, such concepts as real and effective memory frequency appeared. For example, many DDR memory modules ran at 200 MHz. This frequency is called real. But due to the fact that data transfer was carried out on both edges of the clock signal, manufacturers, for marketing purposes, multiplied this figure by 2 and obtained a supposedly effective frequency of 400 MHz, which was indicated in the labeling (in this case, DDR-400). At the same time, the JEDEC specifications indicate that using the term “megahertz” to characterize the level of memory performance is completely incorrect! Instead, "millions of transfers per second per data output" should be used. However, marketing is a serious matter, and few people were interested in the recommendations specified in the JEDEC standard. Therefore, the new term never took root.

Also in the DDR standard, a dual-channel memory mode appeared for the first time. It could be used if there was an even number of memory modules in the system. Its essence is to create a virtual 128-bit bus by interleaving modules. In this case, 256 bits were sampled at once. On paper, a dual-channel mode can double the performance of the memory subsystem, but in practice the speed increase is minimal and is not always noticeable. It depends not only on the RAM model, but also on timings, chipset, memory controller and frequency.

Four memory modules operate in dual-channel mode

Another innovation in DDR was the DQS (data strobe) signal. It is routed on the printed circuit board alongside the data lines. DQS is useful when two or more memory modules are installed: data from them arrives at the memory controller with a slight time difference because the modules sit at different distances from it. This makes it hard to choose a clock signal for reading the data, a problem that DQS solves.

As mentioned above, DDR memory modules were made in DIMM and SO-DIMM form factors. DDR DIMMs had 184 pins. To make DDR and SDRAM modules physically incompatible, the key (the notch in the contact area) was moved to a different position on DDR modules. In addition, DDR modules operated at 2.5 V, while SDRAM devices used 3.3 V, so DDR had lower power consumption and heat dissipation than its predecessor. The maximum frequency of DDR modules reached 350 MHz (DDR-700), although the JEDEC specifications only provided for 200 MHz (DDR-400).

DDR2 and DDR3 memory

The first DDR2 modules went on sale in the second quarter of 2003. Compared to DDR, second-generation RAM did not change significantly. DDR2 kept the same prefetch approach but moved from 2n-prefetch to 4n-prefetch: where the internal data bus had previously been twice as wide as the external one, it now became four times wider. The extra data delivered by the chip is sent over an external bus whose clock frequency is doubled. It is the frequency that doubles, not the number of transfers per clock. As a result, while the memory core of a DDR-400 chip ran at a real frequency of 200 MHz, the core of a DDR2-400 chip ran at 100 MHz, but with an internal bus twice as wide.

DDR2 modules also received a larger number of contacts for connection to the motherboard, and the key was moved to another position to make them physically incompatible with SDRAM and DDR sticks. The operating voltage was reduced again: while DDR modules operated at 2.5 V, DDR2 modules operated at 1.8 V.

By and large, this is where all the differences between DDR2 and DDR end. At first, DDR2 modules were characterized by high latencies, which made them inferior in performance to DDR modules with the same frequency. However, the situation soon returned to normal: manufacturers reduced latencies and released faster sets of RAM. The maximum DDR2 frequency reached an effective 1300 MHz.

Different key positions for DDR, DDR2 and DDR3 modules

The transition from DDR2 to DDR3 followed the same approach as the transition from DDR to DDR2. Data is still transferred on both edges of the clock signal, and the theoretical throughput has doubled. DDR3 modules kept the prefetch architecture but widened it to 8n-prefetch (DDR2 had 4n), so the internal bus became eight times wider than the external one. Because of this, timings once again increased with the change of memory generation. The nominal operating voltage for DDR3 was reduced to 1.5 V, making the modules more energy efficient. In addition to DDR3 there is DDR3L memory (the letter L stands for Low), which operates at a voltage reduced to 1.35 V. It is also worth noting that DDR3 modules are neither physically nor electrically compatible with any of the previous generations of memory.
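
The way prefetch depth relates the memory core clock to the external bus can be checked with a short Python sketch. The table and function names are illustrative only, and DDR3-1600 is an example speed grade chosen for this sketch rather than one mentioned in the text.

```python
# Prefetch depth per generation: how many bits per data pin the core
# fetches internally for each core clock cycle.
PREFETCH = {"DDR": 2, "DDR2": 4, "DDR3": 8}


def clocks(generation, data_rate_mt_s):
    """Return (core clock, external bus clock) in MHz for a given data rate."""
    core_mhz = data_rate_mt_s / PREFETCH[generation]
    bus_mhz = data_rate_mt_s / 2  # two transfers per external clock
    return core_mhz, bus_mhz


for gen, rate in [("DDR", 400), ("DDR2", 400), ("DDR3", 1600)]:
    core, bus = clocks(gen, rate)
    print(f"{gen}-{rate}: core {core:.0f} MHz, bus {bus:.0f} MHz")
```

For DDR-400 and DDR2-400 this reproduces the 200 MHz and 100 MHz core frequencies quoted earlier, with the external bus running at 200 MHz in both cases.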

Of course, DDR3 chips have received support for some new technologies: for example, automatic signal calibration and dynamic signal termination. However, in general, all changes are predominantly quantitative.

DDR4 - another evolution

Finally, we get to the completely new DDR4 memory. The JEDEC association began developing the standard back in 2005, but the first devices went on sale only in the spring of 2014. As stated in a JEDEC press release, the engineers aimed for the highest performance and reliability while increasing the energy efficiency of the new modules. Well, we hear this every time. Let's see what specific changes DDR4 memory has received in comparison with DDR3.

In this picture you can trace the evolution of DDR technology: how voltage, frequency and capacity changed

One of the first DDR4 prototypes. Oddly enough, these are laptop modules

As an example, consider an 8 Gb DDR4 chip with a 4-bit wide data bus. Such a device contains 4 bank groups of 4 banks each. Each bank holds 131,072 (2^17) rows of 512 bytes each. For comparison, take a similar DDR3 chip. It contains 8 independent banks; each bank holds 65,536 (2^16) rows, and each row holds 2048 bytes. As you can see, a DDR4 row is four times shorter than a DDR3 row. This means that DDR4 scans its banks faster than DDR3. Switching between the banks themselves is also much faster. Note that each bank group can independently carry out its own operation (activate, read, write or refresh), which increases memory efficiency and bandwidth.
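
As a sanity check, the bank and row figures quoted above can be multiplied out in Python. The function name is illustrative; the numbers are the ones given in the text, with the DDR3 chip treated as a single group of 8 banks.

```python
def chip_capacity_bytes(bank_groups, banks_per_group, rows_per_bank, row_bytes):
    """Total chip capacity implied by the bank/row organisation."""
    return bank_groups * banks_per_group * rows_per_bank * row_bytes


ddr4 = chip_capacity_bytes(4, 4, 2**17, 512)
ddr3 = chip_capacity_bytes(1, 8, 2**16, 2048)  # 8 banks, no bank groups

print(ddr4 == ddr3)         # True: both organisations hold the same amount
print(ddr4 * 8 // 2**30)    # 8, i.e. an 8-gigabit chip
print(2048 // 512)          # 4: the DDR4 row is four times shorter
```

Both layouts come to 2^30 bytes, which is 8 gigabits (1 gigabyte) per chip, so the capacity in this example is best read in gigabits.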

The main advantages of DDR4: low power consumption, high frequency, large capacity of memory modules

Technology is evolving rapidly, bringing more advanced, more compact and less power-hungry standards to the production of processors, SSDs and RAM. Prices for previous product lines are falling quickly, since they can no longer satisfy users' ever-growing appetites.

In the second half of 2014, DDR4 RAM modules went into mass production. It took about two years for the new technology to gain popularity and for prices to drop, and now these modules can be bought at a reasonable price and in a suitable configuration. With that in mind, we decided to prepare a review of the new RAM standard and explain what DDR4 RAM is and how it differs from previous generations of RAM.

First, a few words about what RAM is in general. Imagine for a moment that you are a middle manager in a company with several departments under you. Your company has a corporate portal where all internal news is published. Like everyone else, you post new tasks and requirements for your subordinates on this portal every morning, and old tasks are deleted so that they do not pile up and cause confusion. Every morning your colleagues open the page in a browser and read their tasks for the day, while the previous day's requirements have already been deleted. RAM works in much the same way. Essentially, it is a temporary store where the working data of the operating system and running applications is written. Every time you turn off your computer, the contents of RAM are cleared, and they are filled again as new applications are launched. The amount of RAM can vary from roughly 1-2 GB to 16-32 GB in modern gaming systems that need a lot of system resources. There were times when the amount of RAM was just a few MB, but that’s history.

The first platform that could take DDR4 modules was the Intel Haswell-E line on the X99 Express chipset, released in the third quarter of 2014. The new flagship 8-core Core i7-5960X processor was released for it, and the first motherboard to support it was the ASUS X99-DELUXE. The main feature of this platform is, of course, support for the new RAM standard - DDR4.

Now a little history. JEDEC began developing DDR4 back in 2005, but the first devices based on it went on sale only in the spring of 2014. JEDEC engineers were tasked with achieving higher performance and reliability than DDR3, while also improving the energy efficiency of the new standard. However, we hear such promises in literally every announcement. How much progress have the engineers actually made?

Like earlier generations, DDR4 uses a prefetch architecture; it keeps the 8n-prefetch of DDR3. Each new memory chip can contain two or four independent bank groups.

To take a real example, let's look more closely at an 8 Gb DDR4 chip with a 4-bit data bus. This chip contains 4 bank groups with 4 banks in each group. Each bank contains 131072 rows of 512 bytes each. For comparison, consider the corresponding DDR3 chip. It contains only 8 independent banks. Each bank contains 65536 rows, and each row holds 2048 bytes. As you may have noticed, a DDR4 row is four times shorter than a DDR3 row. This means that DDR4 RAM works through its memory banks much faster than DDR3. Moreover, switching between banks is much faster. Note that each bank group can independently perform its own operation (refresh, read, write or activate), which increases memory bandwidth and efficiency.

Performance

A significant innovation in the DDR4 standard is an interface with a point-to-point topology, whereas DDR3 uses a multi-drop bus. Why is this necessary? A multi-drop bus provides only a couple of channels connecting the modules to the memory controller. When all four DIMM slots are populated, the controller communicates with each pair of modules over a single shared channel, which hurts the efficiency of the memory subsystem.

A point-to-point bus gives each DIMM slot its own channel, meaning every module connects directly to the controller without sharing the channel with any other. We saw a similar solution when video cards moved from PCI to PCI Express. Of course, this approach has drawbacks too. For example, 4-channel systems will be limited to four DIMM slots, and 2-channel systems to two. However, given the larger capacity of DDR4 modules, this should not limit users in practice. We will talk about this in more detail later.

Each DDR4 DIMM has 288 pins. The number of pins was increased so that the largest possible amount of RAM can be addressed. The largest capacity of a single module is 128 GB (assuming 8 Gb dies and QDP, quad-die package, technology, which places four dies in a single package). It is also quite likely that larger 16 Gb dies will be used, along with denser packaging (up to 8 dies in a single package). Under those conditions, the capacity of one module could reach 512 GB.
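
The two module capacities can be cross-checked with a little arithmetic. The Python sketch below assumes die capacities are given in gigabits, as is conventional for DRAM dies; the function names are illustrative, and the resulting package count is only what the quoted figures imply, not a number stated in the text.

```python
def package_capacity_gb(die_gbit, dies_per_package):
    """Capacity of one package in gigabytes (8 bits per byte)."""
    return die_gbit * dies_per_package / 8


def packages_needed(module_gb, die_gbit, dies_per_package):
    """How many packages a module of the given size would have to carry."""
    return module_gb / package_capacity_gb(die_gbit, dies_per_package)


print(package_capacity_gb(8, 4))      # 4.0 GB per quad-die package
print(packages_needed(128, 8, 4))     # 32.0 packages for a 128 GB module
print(package_capacity_gb(16, 8))     # 16.0 GB per 8-die package
print(packages_needed(512, 16, 8))    # 32.0 packages for a 512 GB module
```

Both scenarios imply the same number of packages per module, which suggests the 128 GB and 512 GB figures differ only in die size and stacking depth.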

Not only the capacity of RAM modules will increase, but also their frequency. Within the DDR4 standard, the effective frequency can be 2133 MHz (DDR4-2133).

Energy efficiency

To reduce power consumption and heat generation, the DDR4 standard lowers the operating voltage once again, this time to 1.2 V. At the same time, the voltage inside the chip itself was increased, which allows faster access and, other things being equal, lower leakage current. In theory, the total power consumption of DDR4 should be 30% lower than that of DDR3. Manufacturers will most likely use the resulting headroom to raise RAM frequencies.
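
A rough feel for the effect of the voltage drop can be had from the first-order rule that dynamic power in CMOS logic scales with the square of the supply voltage. That rule is an assumption brought in for this sketch, not something the text states, and it ignores frequency changes, leakage and I/O termination. The 1.5 V figure is the nominal DDR3 voltage mentioned earlier.

```python
def relative_dynamic_power(v_new, v_old):
    """Ratio of dynamic power at v_new to that at v_old, assuming P ~ V^2
    with capacitance and frequency held constant (a simplification)."""
    return (v_new / v_old) ** 2


ratio = relative_dynamic_power(1.2, 1.5)   # DDR4 vs DDR3 nominal voltage
saving = 1 - ratio
print(f"{saving:.0%}")   # 36%
```

This idealised estimate lands in the same range as the 30% quoted above; the real saving depends on the parts of the module that do not scale with the supply voltage.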

Reliability

The remaining changes relate primarily to reliability. For example, DDR4 chips can detect and handle command and address parity errors on their own. In addition, the DDR4 standard supports a connectivity test, which lets the controller identify faults without running the DRAM initialization sequence. The memory register has also been improved: it can now be configured to block commands that contain parity errors. The register in the previous standard, DDR3, had no such function, and commands with parity errors occasionally reached the RAM chips, which was one of the leading causes of PC failures. DDR4 also includes a number of auxiliary features aimed at improving the reliability of the memory subsystem. One of them is verifying checksums before data is written to memory.
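
The idea of blocking a command with a parity error can be illustrated with a toy Python model. This is a sketch of even parity in general, not the actual DDR4 signalling: the bit pattern, the function names and the 7-bit command word are all hypothetical.

```python
def parity_bit(bits):
    """Even parity: 1 if the word has an odd number of set bits, else 0."""
    return bin(bits).count("1") % 2


def send(command_bits):
    """Sender side: attach the parity bit to the command word."""
    return command_bits, parity_bit(command_bits)


def accept(command_bits, received_parity):
    """Receiver side: recompute parity and block the command on a mismatch."""
    return parity_bit(command_bits) == received_parity


command, par = send(0b1011001)            # hypothetical command/address word
print(accept(command, par))               # True: delivered intact
print(accept(command ^ 0b0000100, par))   # False: one flipped bit is blocked
print(accept(command ^ 0b0000110, par))   # True: two flipped bits slip through
```

The last line shows the known limit of a single parity bit: it catches any odd number of flipped bits but misses an even number, which is one reason additional checks on written data are useful.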

Today, choosing DDR4 RAM is a safe bet. The modules are already widespread enough to plan a purchase. They are a solid foundation for future computer performance, and with prices for lower-capacity modules steadily falling, they are becoming an attractive buy. Prices for DDR4 range from 2,400 rubles for a single 8 GB module at 2133 MHz to 5,900 rubles for a kit of two 8 GB modules at 2666 MHz. Note that it is better to buy two lower-capacity modules than one higher-capacity module, since a matched pair of modules with the same frequency and similar characteristics runs in dual-channel mode, which adds another 10-15% to the overall speed of the PC.

This concludes the review of the innovations that DDR4 RAM has brought. Judging by the descriptions and specifications of the new standard, in theory everything looks quite promising. In addition to the basic improvements (higher frequencies and lower voltage), the technology supports a new bus and a number of features designed to improve reliability. The latter will be especially useful in the server segment, which is a big plus for corporate workloads.