A slightly better way to overclock and tweak your Nvidia GPU

 

Hello all. In this doc I’ll try to show a better way to overclock modern Nvidia graphics cards based on the Pascal and Turing architectures.

 

Most guides you can find online about overclocking Nvidia GPUs only talk about opening an overclocking tool and putting arbitrary offsets on the GPU core and GPU memory for as long as the card doesn’t crash.

Sadly, that’s no longer a great way to overclock Pascal and Turing cards.

 

Since Nvidia implemented the GPUBOOST technology on their GTX 600 series cards, people started facing some annoying issues: a power limit that is too low and enforced by the card’s vBIOS, weird adaptive clocking behaviour that keeps cards from running at full speed under some specific loads, and the card automatically clocking down when it reaches specific temperature “steps”.

GPUBOOST is the reason why old-style bare offset overclocking isn’t really a great way to OC anymore. Overclocking just by applying offsets will result in higher, but still really unstable, core clocks and poorer performance.

 

-DISCLAIMER

 

Overclocking is, by definition, an attempt to make hardware run faster at out-of-spec frequencies and ranges. Even though for most users and most cards the chances of damaging something are pretty low, there’s always some risk involved.

Also keep in mind that, due to the million different combinations of cards, BIOSes and systems in general, you may experience weird or different behaviour when doing these things.

I also want to point out that this is just informative content and I do not take any responsibility if you damage your hardware or something goes wrong.

 

-SOME OF THE THEORY BEHIND 

 

–THE VOLTAGE/FREQUENCY CURVE (V/F CURVE)

Before starting we need to talk a bit about how these graphics cards handle adaptive clocking. To handle different GPU power states and loads, GPUBOOST raises and lowers the core frequency based on a voltage/frequency curve at the NVAPI level.

 

Starting with Pascal cards, Nvidia implemented a new way to handle GPU clocks: they are now managed by voltage/frequency curves.

Just like Maxwell and Kepler cards, Pascal and Turing based cards have a default V/F curve, which is completely BIOS dependent.

The main difference is that we can now live-modify the curve parameters from the OS using overclocking software like MSI Afterburner.

 

“Old style” core clock offset calls are still present at the NVAPI level, but they just shift the whole curve, as is, a bit higher on the graph. The result is higher, but still unstable, boost clocks because of various limitations I will talk about later in the doc.

V/F curves are nothing more than a series of voltage and frequency points located every 15 MHz and 25mV.
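To make the difference between an offset and a curve edit concrete, here is a minimal sketch that models a curve as a list of (voltage, frequency) points. The point values are made up for illustration and do not come from a real BIOS, and the function names are my own, not part of NVAPI or Afterburner.

```python
from typing import List, Tuple

# A V/F curve as (voltage in mV, frequency in MHz) pairs.
# These values are invented for illustration, not read from a real card.
Curve = List[Tuple[int, int]]

EXAMPLE_CURVE: Curve = [
    (800, 1605),
    (900, 1785),
    (1000, 1920),
    (1043, 1980),
    (1093, 2025),
]


def apply_core_offset(curve: Curve, offset_mhz: int) -> Curve:
    """Shift every point by the same amount, like a classic core clock offset."""
    return [(mv, mhz + offset_mhz) for mv, mhz in curve]


def raise_single_point(curve: Curve, voltage_mv: int, target_mhz: int) -> Curve:
    """Change only the frequency of the point sitting at voltage_mv."""
    return [(mv, target_mhz if mv == voltage_mv else mhz) for mv, mhz in curve]


shifted = apply_core_offset(EXAMPLE_CURVE, 60)
edited = raise_single_point(EXAMPLE_CURVE, 1043, 2070)
```

The offset moves all five points, while the single point edit leaves the rest of the curve where it was, which is the approach this doc is about.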

 

You can access the curve by pressing CTRL+F on your keyboard while on the main page of MSI Afterburner, or by clicking the little “bars” symbol on the left side of the core clock slider.

 

 

There’s obviously a limit on how far, frequency and voltage-wise, we can actually make the GPU run on that curve.

Parameters like the curve itself, the maximum voltage, and the allowed power and temperature limits are completely BIOS dependent.

 

The “voltage slider” in tools like Afterburner “unlocks” the upper part of the curve, potentially allowing the card to access and use higher voltage points.

 

As a general baseline, without considering the power limit (read below), most cards on standard, non-special BIOSes are allowed to go up to 1.093V when the voltage slider is maxed out.

Extreme OC BIOSes (XOC BIOSes) are usually not publicly available, but they exist and, among other things, they usually raise the maximum allowed voltage to around 1.1-1.2V.

 

Fantastic, so should I just take the 1.093V point on the curve and drag it up to the frequency I want to run?
Sadly that’s not quite the case, and here’s the second problem our cards will run into: the power limit.

 

–THROTTLING CAUSES

–POWER LIMIT

 

The power limit is part of GPUBOOST itself and it’s mainly meant to balance performance and power savings, but it’s configured so tightly that the cards are almost power starved.

Such a tight power limit is nice for power efficiency, but it’s also really disruptive for a graphics card’s overclocking potential, since it heavily limits the frequencies the GPU can reach.

 

Every card has power monitoring circuitry on board that, as you might guess, monitors how much power the graphics card is pulling from your PSU. When the maximum allowed power limit is reached, the card automatically steps down to a lower V/F curve point, lowering its frequency and voltage to keep the power draw within the allowed spec.
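The step-down behaviour can be pictured with a toy model. This is only a sketch: the power formula (power proportional to voltage squared times frequency) is a common rough approximation that I am assuming here, the constant and the curve points are invented, and the real GPUBOOST logic is not public.

```python
# Toy model: which curve point can the card hold under a given power limit?
# ASSUMPTION: power ~ k * V^2 * f, with an invented constant k.
# The curve values are made up and do not describe a real card.

CURVE = [(800, 1605), (900, 1785), (1000, 1920), (1043, 1980), (1093, 2025)]
K_WATTS = 0.12  # hypothetical scaling constant, W per (V^2 * MHz)


def estimated_power_w(voltage_mv, freq_mhz, k=K_WATTS):
    """Very rough power estimate for one curve point."""
    volts = voltage_mv / 1000
    return k * volts * volts * freq_mhz


def highest_point_within_limit(curve, limit_w):
    """Return the highest (mV, MHz) point whose estimate fits the limit, or None."""
    fitting = [p for p in sorted(curve) if estimated_power_w(*p) <= limit_w]
    return fitting[-1] if fitting else None


at_260w = highest_point_within_limit(CURVE, 260)
at_300w = highest_point_within_limit(CURVE, 300)
```

With the made-up numbers above, a 260 W limit keeps the card on the 1043 mV point, and only a 300 W limit lets it stay on the 1093 mV point.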

 

The maximum allowed power limit is another parameter that is completely dependent on, and enforced by, the vBIOS the card is currently using.

Users are allowed to increase or decrease the power limit to some extent, but not nearly enough to let the card constantly run on high voltage/frequency curve points like, for example, 1.093V.

 

This is the main reason why graphics cards seem to randomly throttle and have a pretty unstable core frequency.

Combining the previous information, we can say that most Pascal and Turing cards can constantly and stably run voltages somewhere in the 1.00-1.043V area of the V/F curve (again, all this is card and BIOS dependent).

 

–POWER LIMIT REMOVAL/MITIGATION

 

For Nvidia’s GTX 900 series and earlier generations, BIOS modding tools were, and still are, publicly available on the net. These tools let users modify critical BIOS parameters like frequencies and power, temperature and voltage limits.

Sadly, starting with the Pascal generation, Nvidia began encrypting their BIOSes and no more BIOS modding tools were officially published online.

 

So what can we do to deal with the power limit?

 

-The most effective way would be physically hardmodding the graphics card, but I won’t be covering that in this doc. It’s definitely advanced stuff that, if done wrong, can turn your graphics card into a nice expensive paperweight.

 

-Software-wise, there’s sadly not much that can be done to completely remove the power limit, but there are some mitigations that can definitely make a difference when overclocking.

One of them is flashing your graphics card with a different BIOS that allows a higher maximum power limit.

There are some special BIOSes (XOC BIOSes) for top end card models, like the old GTX 1080 and 1080 Ti and the current top end RTX 2080 Ti, that completely remove the power limit for extreme overclocking purposes, but most of these specially made BIOSes are private and not publicly available (there are some exceptions, of course).

 

Although flashing a different vBIOS is not a mandatory step, it can definitely affect how far you’ll be able to push your card. I’ll cover part of this procedure later in the doc.

 

–THE TEMPERATURE ASPECT

 

On Pascal and Turing cards temperature is another key factor of GPUBOOST, and it also heavily dictates the card’s behaviour under load and its overclockability.

These cards automatically run at higher core frequencies at lower temperatures, so cooling them effectively and properly is actually the most important step to take.

On top of that, lowering the GPU temperature also lowers the GPU power draw, possibly resulting in a higher sustainable voltage point on the V/F curve.

 

[THERMAL DOWNCLOCK]

One of the big problems related to temperature, besides actual physical core clock stability, is the GPU downclocking on its own when specific temperature thresholds are reached. These thresholds are present all along the normal operating temperature range.

This behaviour is pretty much unavoidable with ambient temperature cooling and is tied to the card’s normal operating behaviour (first downclocking steps starting at even 3°C). Luckily, there’s a workaround to stop thermal downclocking.

I have to say, it’s a pretty buggy method and it doesn’t always work, depending on your card and BIOS, but by using a very old Nvidia laptop power saving feature implemented in their drivers we can try to somewhat mitigate adaptive clocking and thermal downclocking. I’ll talk about this later in the doc.
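As a rough picture of what thermal downclocking looks like, here is a small sketch. The threshold temperatures are purely hypothetical (the real ones depend on the card and BIOS and are not documented here), and I am assuming one 15 MHz step is lost per threshold crossed.

```python
STEP_MHZ = 15  # one clock step, as used elsewhere in this doc

# HYPOTHETICAL thresholds in degrees C, chosen only to illustrate the idea.
HYPOTHETICAL_THRESHOLDS_C = [40, 48, 56, 64, 72]


def clock_after_thermal_steps(base_mhz, temp_c, thresholds=HYPOTHETICAL_THRESHOLDS_C):
    """Drop one step for every threshold the temperature has reached."""
    crossed = sum(1 for t in thresholds if temp_c >= t)
    return base_mhz - crossed * STEP_MHZ


cool = clock_after_thermal_steps(2070, 35)
warm = clock_after_thermal_steps(2070, 50)
hot = clock_after_thermal_steps(2070, 80)
```

In this toy model the same card that holds 2070 MHz at 35°C ends up 75 MHz lower at 80°C, without ever hitting the power limit.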

 

-THE TL;DR

 

-GPUBOOST, under load, will automatically use higher points on the V/F curve as long as the temperature or power limit is not reached.

-To overclock Pascal and Turing cards it’s better to use the NVAPI voltage/frequency curve and edit one specific stable voltage point.

-The power limit is the biggest issue for Pascal and Turing overclocking, but it can be somewhat mitigated.

-Temperature is a key factor for frequencies and stability.

-A combination of the power limit and reached temperature thresholds will make cards downclock and have an unstable core clock under load.

 

–THE PRACTICAL STUFF

 

First of all, let’s prepare the basic tools needed for the work.

 

Overclock utilities

The OC utility: MSI Afterburner, downloadable here https://www.guru3d.com/files-details/msi-afterburner-beta-download.html

General purpose graphics card monitoring and information utility: GPU-Z, downloadable here https://www.techpowerup.com/gpuz/

Stress test/benchmarking software: UNIGINE Superposition benchmark, downloadable here https://benchmark.unigine.com/superposition

 

BIOS flashing and backup tools

Stock NVFlash

Modified NVFlash with board ID mismatch disabled

[Advanced] Tools for the potential clocking fix (see the “Potential fix to thermal downclocking” section below)

NVPMManager

ThermSpyPremium

 

POWER LIMIT MITIGATION WITH BIOS FLASHING

I won’t write a full BIOS flashing guide here since it would make everything too long and confusing, but I can give you some advice.

As said before, this step is not mandatory but it can drastically change the result of your overclock.

A little disclaimer about BIOS flashing:

Although the chances of permanently breaking your card by flashing another BIOS are quite low, keep in mind that there’s always a risk with this kind of procedure.

Worst case scenario, if things go wrong there’s a high chance you will be able to recover the card by flashing the stock BIOS again while driving the display from your iGPU or from a different dedicated GPU.

First things first, back up your original BIOS. There are different ways to do it, but GPU-Z is probably the easiest: just open it and, on the right side of the interface, right next to the UEFI text, there’s the BIOS dump button.

 

 

After saving your original BIOS you need to find a compatible BIOS for your card with a higher allowed power limit.

First you need to check your current maximum allowed power limit. To do so, open the “Advanced” tab in GPU-Z, select “Nvidia BIOS” in the menu and check your Default and Maximum power limits under the Power Limit section.
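If you prefer a command line check, the same two values can usually be read with nvidia-smi, which ships with the Nvidia driver. This is a minimal sketch, assuming nvidia-smi is on your PATH and that your card reports these fields (some cards return [N/A], which this code does not handle). The function names are mine, just for illustration.

```python
import subprocess

# nvidia-smi query for the default and maximum board power limits, in watts.
QUERY = [
    "nvidia-smi",
    "--query-gpu=power.default_limit,power.max_limit",
    "--format=csv,noheader,nounits",
]


def parse_power_limits(csv_line):
    """Turn a line like '250.00, 300.00' into (default_w, max_w).

    Raises ValueError if a field is not a number (for example '[N/A]').
    """
    default_w, max_w = (float(field) for field in csv_line.split(","))
    return default_w, max_w


def read_power_limits():
    """Run nvidia-smi and parse the first GPU's limits."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_power_limits(out.strip().splitlines()[0])
```

Calling `read_power_limits()` on a machine with a supported card should return a pair of watt values to compare against what GPU-Z shows.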

 

 

Now that you know your card’s maximum power limit, you need to find a BIOS that allows a higher one.

To find a potentially better BIOS you may want to go to TechPowerUp’s massive VGA BIOS archive https://www.techpowerup.com/vgabios/

Just select NVIDIA as the GPU brand and select your card model, and a list of all the uploaded BIOSes will pop up.

Sadly there’s no fast way to compare the power limits of various BIOSes all at once, so you will have to open the different BIOSes one by one and compare their maximum allowed power limits.
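Since the comparison has to be done by hand, it can help to note down the values as you go and let a few lines of code do the sorting. A minimal sketch follows; the BIOS names and wattages are placeholders I made up, so fill in the numbers you actually read from each page.

```python
from dataclasses import dataclass


@dataclass
class BiosEntry:
    name: str
    default_limit_w: float
    max_limit_w: float


def better_candidates(current_max_w, entries):
    """Keep only BIOSes with a higher max power limit, best first."""
    better = [e for e in entries if e.max_limit_w > current_max_w]
    return sorted(better, key=lambda e: e.max_limit_w, reverse=True)


# Placeholder data: replace with values noted from the archive pages.
candidates = [
    BiosEntry("Hypothetical BIOS A", 250, 280),
    BiosEntry("Hypothetical BIOS B", 280, 330),
    BiosEntry("Hypothetical BIOS C", 260, 300),
]

ranked = better_candidates(300, candidates)
```

A higher maximum power limit is only one criterion. The BIOS still has to be compatible with your card.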

 

Here’s an example of what a BIOS page on TechPowerUp looks like

 

 

Once you have found a BIOS with a higher power limit, just press the “Download now” button and save the .rom file.

To flash the BIOS, here’s a nice guide on how to do it (Turing cards) https://www.overclockersclub.com/guides/how_to_flash_rtx_bios/

Here’s a guide for Pascal cards, though the procedure is the same anyway https://www.overclock.net/forum/69-nvidia/1627212-how-flash-different-bios-your-1080-ti.html

 

–THE ACTUAL OVERCLOCKING 

 

OK, now here’s the actual overclocking part.

 

-GPU CORE CLOCK OC

 

Let’s start by opening MSI Afterburner and setting it up correctly. Open the settings menu by clicking the gear icon on the UI.

Open the “General” tab in the settings and set it as shown in the picture

Now open the “Monitoring” tab and enable the GPU Voltage graph

Hit Apply and press OK. Afterburner should ask to be restarted, so do it and let it reopen.

 

Now we have to prepare the graphics card for its first test run.

-Make sure your GPU is at stock. Reset it by pressing the reset arrow in the middle of Afterburner or by pressing CTRL+D.

-Increase the Voltage, Temperature limit and Power limit sliders to maximum.

-I highly suggest setting a fixed fan speed at a noise level you’re still comfortable with. As said earlier, temperature is a key factor, so running the card cooler will automatically help it reach higher frequencies.

Your Afterburner UI should look similar to this

 

 

Now hit the Apply check mark in Afterburner and open the UNIGINE Superposition benchmark.

Do not close Afterburner, since we need it running in the background to monitor the GPU for later.

Navigate to the “Game” tab and select either the 1080p Extreme or the 8K Optimized preset. Those are the heaviest presets in Superposition. 8K Optimized is suggested if you want to load the GPU memory modules more heavily, especially on cards with a lot of VRAM.

Using other presets will invalidate the whole procedure, because they are lighter on the GPU and make it draw slightly less power.

 

 

Now click Run and wait for the test to load.

Once the test has loaded, press the “Cinematic mode” button in the top left corner. Superposition will loop through its preset scenes until you stop it manually.

Leave it running for at least 10 minutes, or longer, until you are sure the GPU has reached thermal and voltage stability.

Once you’re done, close the stress test and quickly open Afterburner. Click the “Detach” button on the lower part of the UI to see the full length graphs.

 

Look for the GPU Voltage graph. You’ll see it’s pretty unstable, with a lot of dips and peaks. You have to find the lowest dip that happened while the card was still under load.

As you can see, in MY particular case the lowest voltage under load for my card, with this particular BIOS and cooling, was 1.037V

Again, keep in mind that your voltage can differ from mine.
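If you would rather not eyeball the graph, the same rule (lowest voltage while the card is still loaded) is easy to express in code. This is a sketch that works on samples you have collected yourself. The sample values and the 95% usage cutoff are assumptions of mine, and no particular log file format is implied.

```python
def lowest_voltage_under_load(samples, min_usage_percent=95):
    """Return the lowest voltage (mV) among samples taken under load.

    samples: iterable of (voltage_mv, gpu_usage_percent) pairs.
    Idle samples are ignored so that idle voltages don't count as dips.
    """
    loaded = [mv for mv, usage in samples if usage >= min_usage_percent]
    if not loaded:
        raise ValueError("no samples were taken under load")
    return min(loaded)


# Made-up samples: two idle readings and four readings under load.
samples = [(650, 3), (1093, 99), (1062, 99), (1037, 98), (1050, 99), (700, 10)]

stable_mv = lowest_voltage_under_load(samples)
```

Filtering on GPU usage matters here: without it, the idle voltage at the start or end of the run would be picked instead of the real dip under load.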

 

Now that you know your card’s stable voltage under load, we can actually get into raising the core clock frequency using the V/F curve.

Open the V/F curve in Afterburner by pressing CTRL+F

and find your exact stable voltage point. Again, in MY case it is the 1037mV point on the curve

 

 

Now select that particular point with your mouse and “drag” it upwards until it matches a frequency you want to try on your GPU.

Remember that Nvidia uses a 15MHz clockgen for these cards, so you should only increase the point by 15MHz at a time.

For example, let’s say I want to try running 2070 MHz on my GPU.

I have to drag the 1037mV point up to the 2070MHz mark (MHz are on the left vertical axis).
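The 15MHz rule can be written down as a tiny helper that tells you which frequency you can actually land on when moving a point in whole steps. The starting frequency used below is an invented example, and the function name is mine.

```python
STEP_MHZ = 15  # clock step size mentioned above


def reachable_target(current_mhz, wanted_mhz, step=STEP_MHZ):
    """Highest frequency not above wanted_mhz that is a whole number
    of steps away from current_mhz."""
    steps = (wanted_mhz - current_mhz) // step  # floor division rounds down
    return current_mhz + steps * step


# Example: a point that currently sits at 1980 MHz (made-up starting value).
exact = reachable_target(1980, 2070)    # 90 MHz away, exactly 6 steps
rounded = reachable_target(1980, 2075)  # 2075 is not on a step, falls back
```

Rounding down rather than to the nearest step is a deliberate choice here, so the helper never suggests more clock than you asked for.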

 

 

After raising the point to the desired frequency, keep the V/F graph open and hit Apply in Afterburner.

If everything was done correctly, the curve should now look something like this

 

 

As you can see, the curve from the stable voltage point (in this case 1037 mV) and above is slightly higher.

Make sure there are no other V/F points at lower voltages on the same horizontal line as your stable voltage point. The stable voltage point must be the last one on its horizontal line, with a slightly lower point right before it.
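The check described above can be sketched in a few lines: no point at a lower voltage may sit at the same frequency as the stable point, or higher. This reflects my reading of the rule, and the curve values are invented for the example.

```python
def conflicting_points(curve, stable_mv):
    """Return points below stable_mv whose frequency is not lower than
    the frequency of the stable point.

    curve: list of (voltage_mv, freq_mhz) pairs containing stable_mv.
    """
    stable_mhz = dict(curve)[stable_mv]
    return [(mv, mhz) for mv, mhz in curve if mv < stable_mv and mhz >= stable_mhz]


# Made-up curves around a 1037 mV stable point set to 2070 MHz.
good_curve = [(1000, 1920), (1025, 1950), (1037, 2070), (1043, 2070), (1093, 2070)]
bad_curve = [(1000, 1920), (1025, 2070), (1037, 2070), (1043, 2070)]

ok = conflicting_points(good_curve, 1037)
problems = conflicting_points(bad_curve, 1037)
```

In the second curve the 1025 mV point shares the 2070 MHz line, so the card could settle there instead of on the voltage you tested.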

 

The last thing to do is lock that V/F point, forcing the card to always run in the P0 state at full clock speed. To do that, click on the point you just set up and lock it by pressing L on your keyboard.

The point will now turn yellow.

 

 

You can now see that the GPU and MEM clocks and the GPU voltage are locked to full speed.

This frequency lock will make your card run slightly hotter at idle. Even though it’s not strictly needed, it’s highly suggested to prevent clock fluctuations under load.

Now you just need to run UNIGINE Superposition again in “Game” mode and let it run for at least another 10 minutes to make sure there’s no more voltage fluctuation.

 

 

As you can see in the pic, after running Superposition again the difference is quite dramatic: the voltage line is literally flat, with no fluctuations at all, and so is the GPU core clock.

 

IMPORTANT NOTE: remember that, because of temperature thresholds, the GPU core clock might still drop by one or more 15MHz steps if your card’s temperature keeps increasing under load.

The crucial thing to check, though, is that your GPU voltage is stable and constant. This means you nailed the card’s stable V/F point and you’re running the card within the power limit spec.
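“Stable and constant” can be turned into a simple number: the spread between the highest and lowest voltage sample under load. A minimal sketch, with made-up samples and a tolerance parameter that is my own addition:

```python
def voltage_spread_mv(samples_mv):
    """Difference between the highest and lowest voltage sample, in mV."""
    return max(samples_mv) - min(samples_mv)


def is_flat(samples_mv, tolerance_mv=0):
    """True when the voltage never moved by more than tolerance_mv."""
    return voltage_spread_mv(samples_mv) <= tolerance_mv


# Made-up readings under load, before and after locking the curve point.
before = [1093, 1062, 1037, 1050, 1081]
after = [1037, 1037, 1037, 1037]

spread_before = voltage_spread_mv(before)
flat_after = is_flat(after)
```

A spread of zero under load is what the flat line in the Afterburner graph corresponds to.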

 

OK cool, but I want MOAR core clock

To increase your core clock further, until your GPU physically can’t run any higher frequency because of voltage or temperature, just open the V/F curve editor in Afterburner again and drag the stable V/F point even higher.

If you crash because you pushed too far, I HIGHLY suggest resetting your card completely in Afterburner by pressing CTRL+D and redoing the curve from scratch, still using the same stable V/F curve point of course.

 

-GPU MEMORY OC 

 

GPU memory OC is pretty straightforward. Just like on older card generations, add an offset to the stock frequency in Afterburner. To stay within reasonable ranges, I’d say push your memory clock by +250/300 at a time.

So basically, add +250 MHz to your memory offset and run Superposition to be sure it’s stable (8K preset recommended for memory OC stability).

If it’s stable, add another 250MHz or so, and so on until you crash or start losing performance.
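The memory procedure is a simple search loop, sketched below. The `run_benchmark` callback is hypothetical: it stands in for you applying the offset, running Superposition and writing down the score, and it returns None when the run crashed. The scores in the example are invented.

```python
def find_memory_offset(run_benchmark, step_mhz=250, max_offset_mhz=1500):
    """Raise the memory offset step by step until the benchmark crashes
    (returns None) or the score stops improving. Returns the best offset."""
    best_offset = 0
    best_score = run_benchmark(0)  # stock result as the baseline
    offset = step_mhz
    while offset <= max_offset_mhz:
        score = run_benchmark(offset)
        if score is None or score <= best_score:
            break  # crashed, or performance started to drop
        best_offset, best_score = offset, score
        offset += step_mhz
    return best_offset


# Invented results: the score peaks at +500 and the card crashes at +1000.
fake_scores = {0: 100, 250: 103, 500: 105, 750: 104, 1000: None}
best = find_memory_offset(fake_scores.get)
```

Stopping when the score drops, and not only on a crash, reflects the “start losing performance” case: memory can keep running at an offset that already makes results worse.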

 

 

–Potential fix to thermal downclocking on modern nvidia cards (BSOD RISK)

 

–DISCLAIMER 2

 

Before starting: as far as I could test, this should theoretically work fine on most Pascal based cards. On Turing, though, this procedure can result in a BSOD from the driver, forcing you to boot into safe mode and reinstall the driver itself.

I’m still not sure why it sometimes works and sometimes doesn’t. I think it might have something to do with the various card and BIOS combinations, or with something else at the driver level.

Anyway, here I’ll try to write a guide on how to try to fix the thermal downclocking issue on modern cards.

 

–First things first, open MSI Afterburner or similar software and reset your card to stock values.

–Now open the Nvidia control panel, click on the “Desktop” tab on top and check “Enable developer settings”. A new menu should pop up in the panel. Open it and select “Allow access to the GPU performance counters to all users”. Now hit Apply.

Your screen may flicker for a second and then recover.

 

 

–After applying that, restart your system

 

–Once the system has restarted, download and open NSudo, select TrustedInstaller as the user and check “Enable all privileges”

–Now download this small script I made

–Click on the “Browse” box in NSudo and select the script you just downloaded

–Hit Run in NSudo and a small cmd box should pop up on the screen confirming that the GPU clock policy was set to unrestricted

–After running the script you can close NSudo and the cmd prompt.

 

–Now download NVPMManager, a handy little tool that applies all the registry entries automatically so you don’t have to do it manually

 

–Open NVPMManager with admin rights

–Click on “Create PowerMizer settings”

–Below, check “Enable PowerMizer feature”

–Set the first 2 boxes below as shown, so “fixed performance level” Max Perf/Min Powersave

–Check the “Overheat Slowdown Override” box and set it to “Disable Overheat Slowdown”

–Now hit Apply and reboot your system

 

 

–Now comes the critical part: rebooting might or might not result in a BSOD. If you manage to reboot successfully, the procedure is probably working fine. If you can’t boot into Windows and you get BSODs, usually BAD_POOL_HEADER or one pointing straight at nvlddmkm.sys, something is obviously wrong.

As said before, that’s what I’m currently trying to figure out. I’m not yet sure what makes this procedure work or fail.

This is pretty much all Turing dependent. On Pascal cards I’ve never seen it fail so far.

–Continuing with the guide: if you managed to boot correctly and everything is working fine, open Afterburner and you should see your GPU locked at around 700ish MHz at idle, with the GPU memory clock always at full speed

 

 

Now just set a stable curve point, as shown in the first part of the doc, so you won’t power throttle, and then run something to warm up the GPU.

Theoretically, your GPU should no longer downclock when it reaches the temperature steps.

 

–Alternative method to potentially stop thermal downclocking

 

–Here’s the second method to try to achieve the same thing

–Again, reset the GPU to stock values and allow access to all users in the Nvidia control panel

–Run set_unrestricted_clock_policy.bat with NSudo as the TrustedInstaller user with all privileges enabled

–Download ThermSpy and open it with NSudo, again as the TrustedInstaller user with all privileges enabled

 

 

–Now open the Test P-States section of ThermSpy

 

–Keep Afterburner open next to ThermSpy to check live whether the changes take effect

–Now just click on “Turn off” right next to the adaptive clocking box (don’t mind if the adaptive clocking box resets to the ON position)

–The card should now be locked at full speed on both core and memory

 

 

–Now set a stable curve point as shown before and run something to check if the card is still downclocking

–If nothing went wrong and everything works, your GPU should not downclock anymore because of temps

So far I can confirm these 2 methods work for me on my 2080 Ti FTW3 with all Galax XOC BIOSes and also with the stock 375W EVGA BIOS

 

Here we are at the end of the doc. I hope this guide and procedure were useful and helped someone.

I’d be glad to help if you have doubts or have run into issues.

 
