Test setup
An overclocked and slightly undervolted AMD Ryzen 7 9800X3D runs on an MSI MAG X870E Carbon WiFi with 2x 16 GB T-Force Delta RGB DDR5-6000 (32 GB kit) at CL30, as well as a TB MSI Spatium M480 Pro and a 4 TB M580. The whole system is powered by a Be Quiet! Dark Power Pro 13 with 1600 watts. A semi-custom water cooler from Alphacool, the Eiswolf Extreme, keeps the CPU cool.
Review of the AMD benchmarks using the example of Cyberpunk 2077
Today, Cyberpunk 2077 serves as the first practical reference to understand the AMD benchmark data. The title offers one of the most consistent integrated benchmarks ever, which is essential for comparing individual upscaling and ray tracing modes. Precisely because only one graphics card is tested, an absolutely reproducible scene is crucial in order to be able to reliably quantify differences between the modes.
The official AMD slide shows an increase from 26 FPS native to 123 FPS in the Performance mode of FSR Redstone in Cyberpunk 2077, i.e. a factor of about 4.7. I can reproduce this figure almost exactly in my own benchmark. The corresponding bar in my diagram (FSR4 RT On FG Performance, Foil, i.e. the settings of the AMD slide) shows an average of 126 FPS and is therefore within the expected range. The deviation of roughly 2 percent can be fully explained by the natural scatter between individual benchmark runs and by minor differences in the render path and driver version.
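The arithmetic behind these two statements can be checked with a few lines. This is only a worked example using the three figures quoted above; the function names are illustrative and not part of any benchmark tool.

```python
# Figures quoted in the text: AMD slide (native, FSR Performance) and own measurement.
AMD_NATIVE_FPS = 26.0
AMD_FSR_PERFORMANCE_FPS = 123.0
MEASURED_FPS = 126.0


def speedup(base_fps: float, new_fps: float) -> float:
    """Return how many times faster new_fps is compared to base_fps."""
    return new_fps / base_fps


def relative_deviation_percent(reference: float, measured: float) -> float:
    """Return the deviation of measured from reference in percent."""
    return (measured - reference) / reference * 100.0


factor = speedup(AMD_NATIVE_FPS, AMD_FSR_PERFORMANCE_FPS)
deviation = relative_deviation_percent(AMD_FSR_PERFORMANCE_FPS, MEASURED_FPS)
print(f"speedup: {factor:.2f}x, deviation from slide: {deviation:+.1f} %")
```

The speedup comes out at about 4.73 and the measured value lies about 2.4 percent above the slide.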
It is noteworthy that AMD uses the Performance mode for the slide. This is legitimate, as the image quality of FSR Redstone has been significantly improved at this level as well. However, I deliberately use the Balanced mode for the further tests, as it offers a considerably better balance between image quality and performance. In Cyberpunk 2077, Balanced produces visibly more stable details, more precise edge reconstruction and a subjectively more coherent image overall than Performance, even if the performance advantage is smaller than in AMD’s marketing material.
Average FPS and percentiles
This decision is also reflected in the benchmark diagram, which focuses on technically sound comparability. The bars show very clearly how the different modes relate to each other, not only in terms of the average frame rate, but above all in terms of the 1-percent-low values. The latter are particularly important in the context of an ML-based render stack, as they indicate how stable the entire pipeline is and how consistent the reconstruction remains under load.
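How the two metrics relate to the raw frame times can be sketched as follows. This is a minimal illustration and not the method of the capture tool used here: the function names are hypothetical, and the 1-percent low is assumed to be the average frame rate over the slowest 1 percent of frames. Other tools report the 99th-percentile frame time instead, which gives slightly different numbers.

```python
def average_fps(frame_times_ms):
    """Average FPS: number of frames divided by the total time in seconds."""
    return 1000.0 * len(frame_times_ms) / sum(frame_times_ms)


def one_percent_low_fps(frame_times_ms):
    """Assumed definition: average FPS over the slowest 1 % of all frames."""
    slowest_first = sorted(frame_times_ms, reverse=True)
    # Integer ceiling of n / 100, but always at least one frame.
    count = max(1, (len(slowest_first) + 99) // 100)
    worst = slowest_first[:count]
    return 1000.0 * len(worst) / sum(worst)


# Synthetic data: 198 smooth frames at 10 ms plus two 40 ms spikes.
frame_times = [10.0] * 198 + [40.0] * 2
print(f"average: {average_fps(frame_times):.1f} FPS")
print(f"1-percent low: {one_percent_low_fps(frame_times):.1f} FPS")
```

In this synthetic run the two spikes barely move the average (about 97 FPS), while the 1-percent low drops to 25 FPS. That is the reason the lows say more about stability than the mean does.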
The results show that FSR Redstone achieves a significant performance increase in Balanced mode with ray tracing and frame generation enabled, one that is in the same order of magnitude as the values specified by AMD. The difference is that Balanced delivers visibly better image quality without the performance gain shrinking disproportionately. The performance ratio between FSR3, FSR4 and the native modes remains structurally identical; only the absolute values shift due to the higher input resolution.
The comparison of the two benchmarks therefore makes two things clear. Firstly, AMD’s figures are not just approximate, but almost exactly reproducible, provided the same quality modes are used. Secondly, my own benchmark shows that FSR4 in Cyberpunk 2077 also scales consistently and in a technically comprehensible way outside of Performance mode, which keeps the render path stable and makes good use of the game's known strengths as a benchmark platform.
FSR3 RT Off delivers 95 FPS on average, but only 30.43 FPS in the 1-percent lows. This is a relatively large gap and indicates temporal drops. FSR4 RT Off achieves a lower average of 87.64 FPS, while its 1-percent lows of 29.74 FPS are practically on par with FSR3. The gap between average and lows is therefore somewhat smaller in relative terms, and despite the slightly lower average frame rate the swings are less pronounced. The reason for this is that FSR4 prioritizes internal image consistency, which smooths out short-term frame time peaks.
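The relation between average and 1-percent lows for these two modes can be put into numbers. The sketch below uses only the four values quoted above; expressing the lows as a share of the average is my own choice of metric for illustration, not something taken from the diagram.

```python
# (average FPS, 1-percent low FPS) as quoted in the text.
results = {
    "FSR3 RT Off": (95.0, 30.43),
    "FSR4 RT Off": (87.64, 29.74),
}


def low_to_average_ratio(average_fps: float, low_fps: float) -> float:
    """Share of the average frame rate that is still reached in the 1-percent lows."""
    return low_fps / average_fps


ratios = {
    mode: low_to_average_ratio(avg, low) for mode, (avg, low) in results.items()
}
for mode, ratio in ratios.items():
    print(f"{mode}: 1-percent low is {ratio:.1%} of the average")
```

FSR3 RT Off ends up at roughly 32 percent and FSR4 RT Off at roughly 34 percent, so the absolute lows are nearly identical while the relative gap to the average is slightly smaller with FSR4.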
FSR3 RT FG achieves 110.43 FPS, but drops more sharply in fast scenes than FSR4 RT On FG. FSR4 RT On FG shows a slightly lower average of 103.33 FPS, but significantly higher stability. The ML frame generation produces fewer double contours and miscalculations, which means that the render pipeline runs into unfavorable reprojections less frequently. Path tracing shows the same pattern. While FSR3 RT Pathtracing FG delivers an average of 61.31 FPS, FSR4 RT Pathtracing FG only achieves 56.50 FPS, but remains more consistent across the entire percentile spectrum. The differences can be explained by the way FSR4 applies Ray Regeneration before upscaling, which results in higher-quality input data and in turn improves temporal reconstruction. In all modes where ray tracing is enabled, the 1-percent lows benefit more from FSR4 than the average frame rates. This is typical for ML-based methods, as they mainly reduce the extremes of temporal instability.
By choosing the Balanced mode for the later diagrams, I deliberately focus on a technically objective view in which image quality and reproducibility take precedence over maximum marketing figures. This approach is intended as a useful addition to the many visually oriented image comparisons that can be expected over the course of today's release day. My focus is therefore on a structured presentation of the pipeline, its strengths and limitations, and its comparability with NVIDIA’s technologies. Cyberpunk 2077 offers the ideal basis for this with its stable benchmark.
Analysis of the percentile curves
The percentile diagram clearly shows that FSR4 offers improved temporal consistency compared to FSR3, regardless of the RT settings. For identical scenes, the FSR4 curves lie above the corresponding FSR3 curves in all percentile ranges and, above all, show a flatter drop-off between the 50th and the 99.9th percentile. For FSR3, the upper half of the diagram shows that the curves deliver decent FPS on average, but that the last ten percent of the distribution diverge more strongly between the median, the 95th percentile and the 99th percentile. This is typical for a method that is more strongly affected by temporal instability. FSR3 has structurally higher fluctuations, especially with fast object changes and transparent effects, as its temporal upscaling copes less well with contradictory motion vectors.
FSR4 shows a much more homogeneous picture here. The ML-based reconstruction has a stabilizing effect and, above all, reduces the short-term dropouts that occur in the upper percentile range with FSR3. The FSR4 RT On, FSR4 RT Off and FSR4 RT On FG lines in particular show that the method offers greater temporal stability in both ray tracing and raster mode. The percentiles fall more slowly and remain closer to their medians. FSR3 RT Pathtracing and FSR4 RT Pathtracing are lower overall, as expected given the extreme RT load, but here too FSR4 is visibly smoother. The FSR3 curve shows typical dips above the 90th percentile, while FSR4 shows an almost continuous, nearly linear decline. The best curve is provided by FSR4 RT On FG Performance, which dominates practically from start to finish. This shows that frame generation and ML upscaling together result in an exceptionally stable pipeline. The difference between FSR3 RT FG and FSR4 RT On FG is particularly large, as FSR4’s ML-based motion estimation produces fewer mismatches.
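For readers who want to build such a percentile curve from their own captures, the following is a minimal sketch. It assumes the nearest-rank method on frame times and converts each percentile to FPS, so the curve falls towards the higher percentiles as in the diagram described above. The actual interpolation used for the diagram is not stated in this article, and the function names are hypothetical.

```python
import math


def frame_time_percentile(sorted_times_ms, percentile):
    """Nearest-rank percentile of an ascending list of frame times."""
    n = len(sorted_times_ms)
    rank = max(1, math.ceil(percentile * n / 100))
    return sorted_times_ms[min(rank, n) - 1]


def percentile_curve_fps(frame_times_ms, percentiles=(50, 90, 95, 99, 99.9)):
    """Map each percentile to the FPS equivalent of its frame time."""
    ordered = sorted(frame_times_ms)
    return {p: 1000.0 / frame_time_percentile(ordered, p) for p in percentiles}


# Synthetic data: 90 frames at 10 ms, 9 frames at 20 ms, one 50 ms outlier.
frame_times = [10.0] * 90 + [20.0] * 9 + [50.0]
curve = percentile_curve_fps(frame_times)
for p, fps in curve.items():
    print(f"P{p}: {fps:.1f} FPS")
```

A flat curve means that the slow frames are not much slower than the median frame; a sharp drop towards the 99th and 99.9th percentile points to isolated spikes.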
Analysis of the frame time results
The frame time display shows what percentage of the frames falls into which time window and thus makes micro-stuttering visible. The advantage of FSR4 becomes particularly clear here: FSR3 RT Off shows a high proportion of frames in the range around 16.66 ms, but also a large number of outliers beyond 22 ms. FSR4 RT Off reduces the proportion of these outliers, which indicates a more even pipeline. FSR3 RT On FG scatters visibly into the yellow and light green areas. These are uneven transition zones with temporal uncertainty in the frame interpolation.
FSR4 RT On FG, on the other hand, has significantly more frames in the ranges below 11.11 ms and below 16.66 ms. This means that the synthetic intermediate frames are generated more stably and the optical flow fluctuates less. FSR3 path tracing shows a clear clustering in the range above 33 ms and many outliers, while FSR4 path tracing is slower overall, but generates less micro-stuttering because ML upscaling with Ray Regeneration receives significantly cleaner input data.
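The distribution described above can be reproduced from any frame time log by sorting the frames into time windows. The sketch below is only an illustration: the window limits are taken from the values mentioned in the text (11.11, 16.66, 22 and 33 ms), and it is an assumption that the diagram uses exactly these boundaries.

```python
from bisect import bisect_right

# Assumed window limits in milliseconds, taken from the values named in the text.
EDGES_MS = [11.11, 16.66, 22.0, 33.0]
LABELS = ["< 11.11 ms", "11.11-16.66 ms", "16.66-22 ms", "22-33 ms", ">= 33 ms"]


def bucket_shares(frame_times_ms):
    """Return the share of frames per time window in percent."""
    counts = [0] * (len(EDGES_MS) + 1)
    for frame_time in frame_times_ms:
        # bisect_right yields the index of the window the frame time falls into.
        counts[bisect_right(EDGES_MS, frame_time)] += 1
    total = len(frame_times_ms)
    return [100.0 * count / total for count in counts]


# Synthetic data for illustration only.
frame_times = [8.0] * 50 + [12.0] * 30 + [18.0] * 15 + [40.0] * 5
shares = bucket_shares(frame_times)
for label, share in zip(LABELS, shares):
    print(f"{label:>15}: {share:5.1f} %")
```

The more frames end up in the fast windows and the fewer in the slow ones, the less micro-stuttering is to be expected.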
The technical reasons are clear. FSR3 uses purely algorithmic temporal filters, which quickly stumble over contradictory motion vectors (e.g. particles, dense vegetation, transparent objects). FSR4 uses an ML model that has been trained on typical patterns and recognizes irregularities in movement and structure better. This reduces the likelihood of a frame falling out of step due to incorrect reprojection. Frame Generation in FSR4 works with a more precise optical flow model. As a result, there are fewer faulty frames and the pipeline has to make fewer corrections, which stabilizes the frame times. Ray Regeneration acts as a pre-cleaner for the RT data: FSR3 has to work with noisy RT inputs, whereas FSR4 already receives a stabilized signal. FSR4 also prioritizes internal frame consistency over maximum average performance. As a result, the average FPS are sometimes slightly lower, but the percentiles and frame times are significantly better. As expected, the native path tracing mode shows the worst distribution and at the same time demonstrates why ML-based methods will be essential in the future.
Classification of the frame time variance and significance for practice
The analysis of the frame time variance clearly shows that the order of the tested modes deviates considerably from the pure FPS rankings. While high average rates and good percentiles initially indicate strong overall performance, the variance graph reveals how evenly the render pipeline actually works. The decisive factor here is not the absolute speed, but the consistency of the frame output. A mode can deliver high frame rates, but still create a choppy feel if the intervals between frames fluctuate greatly. The next graphic serves as an important anchor point, as it shows that modern upscaling and frame generation methods should not be evaluated solely on the basis of FPS. The frame time variance in particular provides a realistic picture of the actual stability of the game and the technical integrity of the respective render paths.
The graphic shows that the classic raster modes without frame generation deliver particularly consistent frame times, regardless of whether FSR3 or FSR4 is used. FSR4 benefits here from the new ML pipeline, which works more stably internally and absorbs abrupt outliers better. It is also clear that the path tracing modes generally generate more variance, as stochastic effects in the lighting cause the computing load per frame to fluctuate more. This characteristic is independent of the upscaler and defines the lower part of the graphic.
Frame generation further amplifies the differences. It can smooth the motion display, but is also sensitive to input fluctuations in motion vectors or temporal artifacts. FSR4 is more stable here than FSR3, but both variants still rank behind the pure raster modes due to the additional calculation stage. The observed order thus results from the combination of render complexity, temporal processing and the quality of the input data, not solely from the achievable frame rate.
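Why a mode can rank well in FPS and poorly in variance is easy to show with two synthetic frame time series. This is a minimal sketch with hypothetical function names; which variance measure the diagram is based on is not specified in this article, so the population variance and the mean frame-to-frame delta are used here as two common options.

```python
from statistics import mean, pvariance


def frame_time_stats(frame_times_ms):
    """Average FPS plus two simple consistency measures for a frame time series."""
    deltas = [abs(b - a) for a, b in zip(frame_times_ms, frame_times_ms[1:])]
    return {
        "avg_fps": 1000.0 / mean(frame_times_ms),
        "variance_ms2": pvariance(frame_times_ms),
        "mean_abs_delta_ms": mean(deltas),
    }


# Two synthetic series with the same average frame time of 10 ms.
even = frame_time_stats([10.0, 10.0, 10.0, 10.0])
choppy = frame_time_stats([5.0, 15.0, 5.0, 15.0])
print(even)
print(choppy)
```

Both series reach 100 FPS on average, but the second one alternates between 5 ms and 15 ms and therefore shows a variance of 25 ms² instead of 0. That alternating pattern is what is perceived as choppy.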
Power consumption of the GPU in the various FSR modes
The graph of the average power consumption initially shows that the GPU operates in a comparatively narrow power window in all modes. The pure GPU values lie between around 345 and 361 watts. The magenta segments mark the total consumption of GPU plus CPU and accordingly lie between around 435 and 456 watts, depending on the mode. The differences between FSR3 and FSR4 at GPU level are smaller than one would expect from the FPS jumps alone. FSR3 RT Pathtracing FG has the lowest GPU power consumption at around 345 watts, while FSR4 RT Off and FSR4 RT On FG are at the upper end at just under 360 watts. The combined GPU plus CPU consumption then rises to around 450 watts and slightly above. This indicates that the GPU is working close to its practical utilization limit in all cases and that the additional work for ML upscaling and frame generation is absorbed within the pipeline rather than pushing the GPU into completely new load regions.
The comparison between the native modes and the ML-supported modes is interesting. Native RT On (Foil) is in the midfield with around 357 watts of GPU power consumption and around 445 watts of total consumption. FSR3 RT On and FSR4 RT On have very similar GPU values of around 357 to 359 watts and a slightly higher total consumption, as the CPU has to do more work when higher frame rates are achieved. The FSR4 RT On FG Performance (Foil) combination stands out in terms of total consumption, not because the GPU draws significantly more power, but because the additional CPU share drives up the system power draw. This becomes clear in the second diagram.
CPU power consumption and its significance for efficiency
The CPU power consumption varies significantly more than the GPU load and ranges between around 78 and 99 watts depending on the mode. The decisive point is that the CPU consumption allows conclusions about the share of the workload caused by the increased frame rate, motion calculation and frame interpolation. The lowest CPU loads occur in the path tracing modes without frame generation. These modes generate the lowest FPS and therefore also reduce the load on the CPU, as fewer simulation steps and fewer motion vectors have to be calculated per second. FSR3 RT Pathtracing and FSR4 RT Pathtracing are both in the lower range here, with FSR4 working slightly more efficiently, as the ML denoising path on the GPU takes over some of the pre-calculations that used to depend on the CPU.
With ray tracing, the CPU load increases visibly. This becomes particularly clear in the transition to FSR4 RT On FG and FSR4 RT On FG Performance. The increased frame rate means that the CPU has to prepare more frames per second, update motion vectors more frequently and run simulations faster. The CPU is therefore more heavily involved in the render pipeline. The peak load occurs in the classic raster mode with FSR off, because the highest native GPU load is generated here and the CPU has to provide a correspondingly large number of render commands. The difference between FSR3 and FSR4 in the RT modes can be explained by two factors. Firstly, FSR4 uses an ML model that generates less noise and structural instability internally. This means that fewer correction steps are required, which would otherwise take up CPU cycles. Secondly, the variance of the frame pipeline is reduced, which makes the CPU load more predictable.
Efficiency analysis
The efficiency analysis is based solely on GPU power consumption and thus shows how much energy the graphics card requires to generate a single frame. The differences between the modes are clear and allow conclusions to be drawn about the internal operation of the respective render paths. The large gap between the ML-supported methods and all classic render paths is particularly striking. Frame generation in conjunction with ML upscaling generates synthetic intermediate images, the calculation of which requires only a fraction of the energy of a fully rendered frame. This significantly reduces the GPU effort per frame, although the GPU continues to operate close to the power limit on average. FSR4 achieves the best efficiency in combination with Frame Generation and, in the corresponding modes, falls below the mark of three watts per FPS (i.e. three joules per frame) for the first time. FSR3 follows with slightly higher values, as its temporal reconstruction is less stable and generates inefficient intermediate states more frequently.
The purely rasterized modes without frame generation are significantly higher. Here, each frame must be rendered completely, which leads to higher watt-per-frame figures. Native rendering without upscaling is even more inefficient, because the number of pixels is not reduced and the render pipeline is not relieved by ML methods. The worst performers are the path tracing modes. The GPU has to calculate a large number of ray tracing samples per image, which leads to extremely high energy requirements, even if FSR3 or FSR4 subsequently upscale the result. The difference in efficiency between FSR3 and FSR4 remains small in this area, as neither variant substantially reduces the basic computing load of path tracing. Overall, FSR4 utilizes the GPU more efficiently thanks to its more stable ML pipeline and can deliver more usable images per watt than FSR3, while the classic and especially the path tracing modes are significantly less energy-efficient.
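The efficiency metric itself is a simple quotient of average GPU power and average frame rate. The sketch below uses the approximate figures quoted in this article for FSR3 RT Pathtracing FG (around 345 watts, 61.31 FPS) as a worked example; the function name is hypothetical, and the result is only as exact as the rounded power value.

```python
def watts_per_fps(gpu_power_w: float, average_fps: float) -> float:
    """GPU power divided by frame rate; numerically equal to joules per frame."""
    if average_fps <= 0:
        raise ValueError("average_fps must be positive")
    return gpu_power_w / average_fps


# Approximate values quoted in the text for FSR3 RT Pathtracing FG.
path_tracing_fg = watts_per_fps(345.0, 61.31)
print(f"FSR3 RT Pathtracing FG: {path_tracing_fg:.2f} W per FPS")
```

With these inputs the path tracing mode with frame generation lands at roughly 5.6 watts per FPS. Since a watt is one joule per second, dividing by frames per second yields joules per frame, which is the energy cost of a single displayed image.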
For those who are particularly curious, I have included a comparison of the frame time and power consumption measurements (low-pass filter set to 1 ms) for each individual benchmark run. I would also like to point out that frame pacing still needs to be improved. A simple visual comparison of the curves with and without FSR is sufficient to see this, although it is definitely not a problem of FSR4 alone, but affects FSR in general.
- 1 - Introduction, three looks back and one forward
- 2 - ML Radiance Caching in Detail
- 3 - ML Ray Regeneration in Detail
- 4 - ML Upscaling in Detail
- 5 - ML Frame Generation in Detail
- 6 - Activation of FSR4 in game or in the Adrenalin drivers
- 7 - Benchmarks and Metrics
- 8 - Quality Comparison and Conclusion