Introduction and special features of the thermal path for GPUs
I will now transfer the previously explained approach for direct die and notebook or embedded CPUs to a GPU and adapt it to the typical conditions of a graphics processor. Modern desktop GPUs generally work without a classic heatspreader; the silicon chip is located directly under the external thermal paste and a cooling base or a cooling plate. The internal chain of silicon, possibly very thin internal layers and package material is still present, but there is no additional solid copper cover as with a CPU with IHS. This makes the thermal path from the active area in the silicon to the cooling medium significantly shorter and more direct, similar to a direct die CPU, but usually with significantly higher power dissipation and often a larger die area.
In the internal representation of the GPU structure, the relevant sections of the chain can be arranged in series. From the GPU die, the heat passes through the silicon itself, then through the external TIM layer into the cooler base or cold plate, and from there by convective heat transfer into the coolant flowing through it. In the laboratory this is water, whose temperature a chiller holds constant at 30 °C. Because there is no separate heatspreader, the external thermal paste is much more important than on a CPU with IHS. At the same time, however, the overall consideration must not end at the TIM layer, as the cooler and in particular the convection to the liquid are an integral part of the thermal resistance chain. In the case of a GPU, these relationships are exacerbated by the fact that power losses of well over 200 watts are not the exception but the rule, and the heat flux densities on the die are very high for compact chips.
Transparent calculation path for the GPU including convection
I first define the variables I am calculating with, i.e. the variables for the electrical and thermal side:
P_GPU in watts, real power dissipation of the GPU
T_Water in degrees Celsius, water temperature in the cooling circuit kept constant by the chiller
T_GPU in degrees Celsius, temperature directly on the surface of the silicon die
R_th,Die in kelvin per watt, thermal resistance of the silicon from the active area to the die surface
R_th,TIM in kelvin per watt, external thermal resistance of the applied paste, measured with TIMA
R_th,Block in kelvin per watt, conductive resistance in the cooler base from the contact surface to the fluid side
R_th,convection in kelvin per watt, convective transfer resistance from the inner walls of the cooler to the water
R_th,cooler in kelvin per watt, total resistance of the cooler from the contact surface to the water temperature
R_th,ges in kelvin per watt, total (German: gesamt) sum of all serial partial resistances from the die to the water
The cooler resistance is thus composed of the conduction and convection side. Formally, the following applies:
R_th,cooler = R_th,Block + R_th,convection
The total serial thermal resistance from the GPU die to the water is accordingly:
R_th,ges = R_th,Die + R_th,TIM + R_th,cooler
The temperature on the GPU surface then follows from the usual relationship between power dissipation, total resistance and reference temperature of the fluid:
T_GPU = T_Water + P_GPU × R_th,ges
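The three relations above can be sketched in a few lines of Python. This is only an illustration of the series chain: the function names and all resistance values are assumptions chosen for the example, not measured data from the test setup.

```python
# Minimal sketch of the serial thermal resistance chain.
# All numeric values below are hypothetical illustration values.

def cooler_resistance(r_block: float, r_convection: float) -> float:
    """R_th,cooler = R_th,Block + R_th,convection (K/W)."""
    return r_block + r_convection


def total_resistance(r_die: float, r_tim: float, r_cooler: float) -> float:
    """R_th,ges = R_th,Die + R_th,TIM + R_th,cooler (K/W)."""
    return r_die + r_tim + r_cooler


def gpu_temperature(t_water: float, p_gpu: float, r_ges: float) -> float:
    """T_GPU = T_Water + P_GPU * R_th,ges (degrees Celsius)."""
    return t_water + p_gpu * r_ges


# Assumed example values in K/W (not taken from the article)
R_DIE = 0.020
R_TIM = 0.015
R_BLOCK = 0.010
R_CONVECTION = 0.030

r_cooler = cooler_resistance(R_BLOCK, R_CONVECTION)
r_ges = total_resistance(R_DIE, R_TIM, r_cooler)
t_gpu = gpu_temperature(t_water=30.0, p_gpu=300.0, r_ges=r_ges)

print(f"R_th,cooler = {r_cooler:.3f} K/W")
print(f"R_th,ges    = {r_ges:.3f} K/W")
print(f"T_GPU       = {t_gpu:.1f} °C")
```

Because every element is in series, a change in any single resistance shifts T_GPU by exactly P_GPU times that change.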
For the convective component, a physical relationship can be established via the heat transfer coefficient h and the wetted inner surface of the cooling channels A_Fluid, so that:
R_th,convection = 1 ÷ (h × A_Fluid)
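As a rough sketch of this relationship, the following Python snippet computes the convective resistance from h and the wetted area. The values for h and A_Fluid are assumptions picked for easy arithmetic and do not describe the cooler used in the laboratory.

```python
# Sketch: convective resistance from heat transfer coefficient and wetted area.
# h and a_fluid below are hypothetical example values.

def convective_resistance(h: float, a_fluid: float) -> float:
    """R_th,convection = 1 / (h * A_Fluid), with h in W/(m^2*K) and A_Fluid in m^2."""
    if h <= 0 or a_fluid <= 0:
        raise ValueError("h and a_fluid must be positive")
    return 1.0 / (h * a_fluid)


H_ASSUMED = 10_000.0      # W/(m^2*K), assumed for illustration
A_FLUID_ASSUMED = 0.005   # m^2 (50 cm^2 of wetted fin surface), assumed

r_conv = convective_resistance(H_ASSUMED, A_FLUID_ASSUMED)
r_conv_double_h = convective_resistance(2 * H_ASSUMED, A_FLUID_ASSUMED)

print(f"R_th,convection           = {r_conv:.4f} K/W")
print(f"R_th,convection at 2 x h  = {r_conv_double_h:.4f} K/W")
```

Doubling h or doubling the wetted area halves the convective resistance, which is why volume flow and channel geometry must stay fixed for the comparison between pastes to be valid.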
In the practical implementation in the laboratory, this variable is not determined anew for each configuration. Instead, it is fixed as a boundary condition by selecting a powerful cooler, a defined volume flow and a chiller held at constant temperature. R_th,cooler thus remains constant throughout the entire test program, so that temperature differences between pastes are not distorted by fluctuating convection conditions.
For the external TIM of the GPU, the relationship between layer thickness, thermal conductivity and surface area continues to apply formally. With d as the real layer thickness in meters, λ as the thermal conductivity of the paste in watts per meter and kelvin, and A_GPU as the effective contact area of the GPU die in square meters, the following results:
R_th,TIM = d ÷ (λ × A_GPU)
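To make the units concrete, here is a small Python sketch of this formula with explicit conversion from micrometers and square millimeters to SI units. The layer thickness, conductivity and die area are assumed example values, not data for any specific paste or GPU.

```python
# Sketch: theoretical TIM resistance R_th,TIM = d / (lambda * A_GPU).
# Input values are hypothetical; helper names are chosen for this example.

def tim_resistance(d_m: float, lam: float, a_gpu_m2: float) -> float:
    """Return R_th,TIM in K/W for thickness d (m), conductivity lam (W/(m*K)), area (m^2)."""
    return d_m / (lam * a_gpu_m2)


def micrometers_to_meters(d_um: float) -> float:
    return d_um * 1e-6


def mm2_to_m2(a_mm2: float) -> float:
    return a_mm2 * 1e-6


d = micrometers_to_meters(50.0)   # assumed 50 µm layer thickness
lam = 5.0                         # assumed 5 W/(m*K) bulk conductivity
a_gpu = mm2_to_m2(600.0)          # assumed 600 mm^2 die area

r_tim = tim_resistance(d, lam, a_gpu)
print(f"R_th,TIM = {r_tim:.4f} K/W")
```

The theoretical value only covers the bulk layer. Contact resistances at both interfaces are not part of this expression, which is one reason a measured effective resistance is the more useful input.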
In the calculation, I replace this theoretical expression with the effective resistance R_th,TIM,measured determined with TIMA, which already includes all real influences from bond line thickness (BLT), micro-roughness and contact quality under pressure. A_GPU is therefore implicitly included in the measurement. The actual temperature calculation for various real power points is then carried out in the same way as for direct die, except that I am now looking at a GPU instead of a CPU. For a typical high load of 600 watts, the result is:
P_GPU,600 = 600 W
T_GPU,600 = T_Water + P_GPU,600 × R_th,ges
In all cases, R_th,ges contains the unchanged portion of the die and the cooler including convection. The differences between two pastes arise exclusively from the difference in R_th,TIM,measured, as the entire remaining structure stays constant. The GPU area A_GPU also determines, via the heat flux density q″ (q double prime), how strongly the TIM and the die are loaded locally. Formally, the following applies here:
q″ = P_GPU ÷ A_GPU
The smaller A_GPU is at high P_GPU, the more sensitively the system reacts to additional kelvin per watt in R_th,TIM or R_th,cooler, which once again increases the importance of the TIM on the GPU.
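The link between heat flux density and the temperature drop across the paste can be sketched as follows. Combining q″ = P_GPU ÷ A_GPU with R_th,TIM = d ÷ (λ × A_GPU) gives a drop across the TIM of q″ × d ÷ λ for the idealized bulk layer. The power, area, thickness and conductivity used here are assumptions for illustration only.

```python
# Sketch: heat flux density and the resulting temperature drop across the TIM.
# All inputs are hypothetical example values in SI units.

def heat_flux_density(p_gpu: float, a_gpu_m2: float) -> float:
    """q'' = P_GPU / A_GPU in W/m^2."""
    return p_gpu / a_gpu_m2


def tim_temperature_drop(q: float, d_m: float, lam: float) -> float:
    """Delta T across an ideal TIM layer: q'' * d / lambda, in K."""
    return q * d_m / lam


P_GPU = 600.0   # W, the high-load point used in the text
D = 50e-6       # m, assumed layer thickness
LAM = 5.0       # W/(m*K), assumed conductivity

for area_mm2 in (600.0, 300.0):  # assumed die areas
    q = heat_flux_density(P_GPU, area_mm2 * 1e-6)
    drop = tim_temperature_drop(q, D, LAM)
    print(f"A_GPU = {area_mm2:.0f} mm^2: q'' = {q / 1e4:.0f} W/cm^2, "
          f"TIM drop = {drop:.1f} K")
```

With the same power, layer thickness and paste, halving the die area doubles both the heat flux density and the temperature drop across the layer.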
Practical temperature evaluation for GPUs based on complete resistance chains
Finally, I would like to make it clear for the GPU evaluation that the new, practical temperature calculation is only realistic if the complete path from the silicon surface to the cooling medium is included. The heat path does not physically end in the paste, but in the water of the circuit; the convective part in particular remains an essential link in the chain. In the laboratory setup, I use a powerful cooler, a stable volume flow and a chiller regulated to 30 °C to ensure that R_th,cooler including convection remains a constant and reproducible variable. This ensures that differences in R_th,TIM,measured are not masked by fluctuating cooling conditions.
This creates a very sensitive thermal situation, especially for GPUs with high power dissipation and often relatively small die area, in which even small differences in the external thermal resistance of the paste lead to clearly measurable changes in T_GPU. The calculation based on real R_th values measured with the TIMA system translates these differences directly into practice-relevant die temperatures for a defined cooling concept and constant water temperature. In this way, the ranking of pastes for GPU use is not derived from theoretical data sheet specifications, but from a complete, physically consistent resistance chain including convection, as it exists in practice.
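A ranking of this kind could be computed roughly as in the following Python sketch. The paste names, their R_th,TIM,measured values and the fixed die and cooler resistances are all placeholders invented for the example. Only the 600 W load point and the 30 °C water temperature come from the text.

```python
# Sketch: rank pastes by resulting die temperature at a fixed load point.
# Paste names and all resistance values are hypothetical placeholders.

T_WATER = 30.0     # °C, chiller setpoint from the text
P_GPU = 600.0      # W, high-load point from the text
R_DIE = 0.020      # K/W, assumed constant
R_COOLER = 0.040   # K/W, assumed constant (block plus convection)

# Hypothetical R_th,TIM,measured values in K/W, deliberately unsorted
measured_tim = {
    "Paste C": 0.020,
    "Paste A": 0.012,
    "Paste B": 0.015,
}


def die_temperature(r_tim_measured: float) -> float:
    """T_GPU = T_Water + P_GPU * (R_th,Die + R_th,TIM,measured + R_th,cooler)."""
    return T_WATER + P_GPU * (R_DIE + r_tim_measured + R_COOLER)


# Sort from coolest to hottest resulting die temperature
ranking = sorted(
    ((name, die_temperature(r)) for name, r in measured_tim.items()),
    key=lambda item: item[1],
)

for name, temp in ranking:
    print(f"{name}: {temp:.1f} °C")
```

Since the die and cooler terms are identical for every paste, the order follows the measured TIM resistance alone, while the absolute temperatures still reflect the complete chain.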
Please turn the page again, then there is a conclusion!




































