The ThrottleStop Guide

Discussion in 'Hardware Components and Aftermarket Upgrades' started by unclewebb, Nov 7, 2010.

  1. Xonar

    Xonar Notebook Deity

    Reputations:
    1,457
    Messages:
    1,518
    Likes Received:
    13
    Trophy Points:
    56
    Understand though, when using programs like wPrime it throttles at ~88°C. Also, notice in the graph that even before my temperature peaks, I have constant fluctuations (look at the dark blue line). SC2 temps max out at ~90°C, with the multiplier going from x31 eventually down to x28. That's why my initial assumption was that TDP was the factor. I could always test it with my cooler to double-check whether it's temps, but I really don't think it is.
     
  2. Zero989

    Zero989 Notebook Virtuoso

    Reputations:
    910
    Messages:
    2,823
    Likes Received:
    572
    Trophy Points:
    131
    It's OK Xonar, I can't maintain my CPU's advertised 4-core speeds unless the CPU is not at 100% usage on each core. I get 2.7 GHz when using 8 threads at 100%. My CPU is also at 76°C on the highest core -_-.
     
  3. T1mur

    T1mur Notebook Guru

    Reputations:
    0
    Messages:
    55
    Likes Received:
    3
    Trophy Points:
    15
    TPL is an *average*. When the CPU is put under constant load it is throttled to match that average; when the load is going up and down, the peaks can exceed that average for a limited time.

    By increasing the TPL of my mobile i7 Quad by 3 watts I can increase the throttled multiplier by 1x on constant Prime95 load.

    So both temperature and *average* wattage affect the effective multiplier.
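A minimal sketch of the "average" behaviour described above, assuming the limit is enforced against an exponentially weighted running average. The thread does not say how the CPU actually computes its average, and the 45 W limit, the 60 W and 10 W demand figures and the `alpha` smoothing factor are hypothetical values chosen only for illustration.

```python
def grant_power(demand_watts, limit_watts, alpha=0.25):
    """Return the power granted for each sample of demanded power.

    Assumption: the limit applies to a running average, not to each
    individual sample, so short peaks above the limit are allowed while
    the average is still below it.
    """
    avg = 0.0  # running average, starting from an idle system
    granted = []
    for demand in demand_watts:
        if avg < limit_watts:
            power = demand                    # average has headroom: allow the peak
        else:
            power = min(demand, limit_watts)  # average at the limit: throttle
        avg += alpha * (power - avg)          # update the running average
        granted.append(power)
    return granted


# Constant load: the first samples run unthrottled, then power is held at the limit.
constant = grant_power([60] * 20, limit_watts=45)

# Fluctuating load: the average stays under the limit, so every peak is granted.
bursty = grant_power([60, 10] * 10, limit_watts=45)
```

With these made-up numbers the constant load ends up clamped to the limit, while the fluctuating load keeps its 60 W peaks because its average never reaches 45 W.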

    Furthermore, *all* of the factory Windows 7 power profiles keep *half* the CPU cores unparked (= usually only the hyperthreaded cores are parked). This keeps Turboboost from reaching its maximum multiplier (= single-core operation).
     
  4. Dufus

    Dufus .

    Reputations:
    1,194
    Messages:
    1,336
    Likes Received:
    548
    Trophy Points:
    131
    From what I've seen, hyper-threaded cores are parked so that threads run on individual physical cores for maximum performance. This helps alleviate core contention when two logical cores run on one physical core. AFAIK parking should not affect your ability to turbo, as the parked cores are a result of OS scheduling, i.e. the OS avoids assigning a thread to a parked core; however, un-parked cores can still transition to the higher C-states to enable traditional turbo to work.
     
  5. T1mur

    T1mur Notebook Guru

    Reputations:
    0
    Messages:
    55
    Likes Received:
    3
    Trophy Points:
    15
    Yes, but because threads are constantly being moved between cores, you hardly see enough cores in C3/6/7 unless they are actively parked by the OS (which means nothing more than the OS/scheduler not making use of the cores and then letting them go to C3/6/7). And Windows' power profiles only park 50% of your logical cores while moving threads around for temperature balancing (which is not a bad thing), so you hardly ever get the full single-core Turboboost.

    Example: My i7 Quad does x31 with all 8 logical cores (4 physical + 4 HT), x32 with 4 cores and x34 with 2 cores. Windows usually never allows more than 4 cores to go to C3/6/7 when several threads are running medium to high load, so you never get to x34.
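The multipliers in this example can be put into a small lookup to show why the choice of parked logical CPUs matters. This is only a sketch: it assumes logical CPUs 2k and 2k+1 share physical core k, and it uses just the three multipliers quoted in the post (no value is given for three active physical cores, so that case returns `None`). The function names are made up for illustration.

```python
# Multipliers quoted in the post, keyed by active *physical* cores:
# 8 logical = 4 physical -> x31, 4 logical = 2 physical -> x32,
# 2 logical = 1 physical -> x34.
QUOTED_MULTIPLIERS = {4: 31, 2: 32, 1: 34}


def active_physical_cores(busy_logical_cpus):
    """Count physical cores with at least one busy logical CPU.

    Assumption: logical CPUs 2k and 2k+1 sit on physical core k.
    """
    return len({cpu // 2 for cpu in busy_logical_cpus})


def turbo_multiplier(busy_logical_cpus):
    """Look up the quoted multiplier, or None if the post gives no figure."""
    return QUOTED_MULTIPLIERS.get(active_physical_cores(busy_logical_cpus))


# Parking every second logical CPU still leaves all four physical cores busy.
half_parked = turbo_multiplier({0, 2, 4, 6})

# Both logical CPUs of one physical core busy, everything else asleep.
single_core = turbo_multiplier({0, 1})
```

Under these assumptions, parking CPU 1/3/5/7 while CPU 0/2/4/6 stay busy gives the same x31 as running on all eight logical CPUs, whereas x34 needs the load confined to one physical core.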

    By setting up a custom power profile you can make Windows park all cores but one and thus more easily reach x34 on a single physical core (= 2 logical cores).

    I'm not saying that this is an advantage, but the whole core parking vs. thread moving thing does have an effect on your current multiplier.
     
  6. Dufus

    Dufus .

    Reputations:
    1,194
    Messages:
    1,336
    Likes Received:
    548
    Trophy Points:
    131
    A thread isn't moved around for temperature balancing; it's just a product of the OS scheduling. A thread gets a time-slice from the OS, typically 15.6 ms unless the clock resolution has been increased. Depending on priority, after its slice (quantum) that thread goes to the back of the queue, and once it's rescheduled it can end up on any of the 4 cores. It's more beneficial for it to end up on the same core, as the L1/L2 cache may still be populated with some of the data used by that thread (fewer cache misses). You might find the W7 OS will actually prefer the same core if it's free.
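As a rough illustration of the rescheduling described above, here is a toy core picker. It is not the Windows scheduler: `pick_core` and `quanta_in` are hypothetical helpers, the 15.6 ms slice is taken to be 1/64 of a second, and the fallback of choosing the lowest-numbered free core is an arbitrary choice for the sketch.

```python
QUANTUM_MS = 1000 / 64  # 15.625 ms, assumed to be the "15.6 ms" time-slice


def pick_core(last_core, free_cores):
    """Choose a core for a thread that is being rescheduled.

    Prefer the core it last ran on (its cache may still be warm);
    otherwise take any free core rather than wait.
    """
    if last_core in free_cores:
        return last_core
    return min(free_cores) if free_cores else None


def quanta_in(duration_ms):
    """Number of whole time-slices that fit into a duration."""
    return int(duration_ms // QUANTUM_MS)


same = pick_core(last_core=3, free_cores={0, 3})   # previous core is free
moved = pick_core(last_core=3, free_cores={0, 1})  # previous core is occupied
per_second = quanta_in(1000)                       # rescheduling chances per second
```

Even in this toy model a thread only moves when its previous core is busy at the moment it is rescheduled, which fits the point that the movement is a scheduling side effect rather than deliberate balancing.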

    Having the thread move around still leaves plenty of time for the other cores to transition to higher c-states if they become free so not such a big impact on turbo IMO.


    It's 4 cores, 8 threads. Both threads of a physical core are equal, i.e. it's not a core thread and an HT thread.
    Processor C-states come in 2 flavours, core and package. 2 threads of the same physical core have to have the same shared core state. If you request C3 for one of those threads while the other thread is in C0 the core will remain at C0 and there will be no C3 state obtained.
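The shared core state rule can be sketched in a couple of lines: the core ends up in the shallowest state any of its threads is in. `resolved_core_state` is a hypothetical helper name, and the states are written as plain strings such as "C0" or "C3".

```python
def resolved_core_state(thread_states):
    """Return the C-state a physical core ends up in.

    The core can only go as deep as its shallowest thread, so the
    result is the lowest-numbered state among its threads.
    """
    depth = min(int(state[1:]) for state in thread_states)
    return f"C{depth}"


# One thread requests C3 while its sibling is still running in C0.
blocked = resolved_core_state(["C3", "C0"])

# Both threads idle: the core can follow the shallower of the two requests.
idle = resolved_core_state(["C6", "C3"])
```

Here `blocked` is "C0", matching the case described above where no C3 state is obtained.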

    You're right, you are most likely sacrificing performance just to reach a higher multi by only using half your processing power, unless you only have one or two main threads running. You could probably set the affinity mask to achieve much the same.

    Anyway, enough from me as it's getting too OT. Sorry about that Uncle.
     
  7. Xonar

    Xonar Notebook Deity

    Reputations:
    1,457
    Messages:
    1,518
    Likes Received:
    13
    Trophy Points:
    56
    I have disabled HT just to check and, regardless, on any application that uses 4 cores (BF3, SC2, Prime95, etc.) the same thing happens.
     
  8. T1mur

    T1mur Notebook Guru

    Reputations:
    0
    Messages:
    55
    Likes Received:
    3
    Trophy Points:
    15
    Running a single thread of Prime95 on an otherwise mostly idle system sees Windows shifting the Prime thread between cores all the time, with Windows unparking/parking cores accordingly.

    While I might miss some point, I don't see why Windows would do that for any other reason than load/temperature balancing. As you explain yourself it would be more beneficial to keep the single thread running on a single core for cache reasons.

    My statement was that the maximum possible Turboboost multiplier is affected by Windows' behavior of only parking half the logical cores and moving threads around non-parked cores. Sure non-parked cores will transition to C3/6/7 and thus allow Turboboost to increase the multiplier. But parked cores are *always* in C3/6/7, because parking means nothing else than the OS Scheduler not making use of a core.

    That's why I wrote "logical" cores. If every second "core" is parked (unparked: CPU 0/2/4/6, parked: CPU 1/3/5/7) then you can say that only "HT" cores are parked.
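The CPU numbering above can also be written as bitmasks, in the form affinity masks use, where bit n stands for logical CPU n. This is only a sketch of the arithmetic; `cpu_mask` is a made-up helper and nothing here changes parking or affinity on a real system.

```python
def cpu_mask(cpus):
    """Build a bitmask with bit n set for every logical CPU n in cpus."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


unparked = cpu_mask([0, 2, 4, 6])  # every second logical CPU, starting at 0
parked = cpu_mask([1, 3, 5, 7])    # the logical CPUs in between

# The two sets are disjoint and together cover all 8 logical CPUs.
all_cpus = unparked | parked
```

In hexadecimal the unparked set is 0x55 and the parked set is 0xAA.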

    Thanks for the explanation, but since I helped Unclewebb improve this part of ThrottleStop I guess I already knew that. ;)

    You misunderstood me here. Setting up Windows to allow more cores to be parked is not the same as setting up affinity. At least not unless you force these cores to *stay* parked, which I do not. I just allow a higher maximum of cores to be parked, while Windows can still unpark all cores if necessary. Whether this has any beneficial effect in a real-world scenario is another story.

    More important: I have to correct what I wrote about "average TDP", I meant "average TPL". Sorry for the confusion.
     
  9. jlells01

    jlells01 Notebook Geek

    Reputations:
    0
    Messages:
    76
    Likes Received:
    0
    Trophy Points:
    15
    Uncle Webb,

    Just wanted to say thank you for such a great, useful piece of software:
    CPU-Z Validator 3.1

    Default is 1.263v, and I'm at 0.963v (a 0.3v drop!) while being totally stable (12hrs. Prime95 blend/50 runs IBT).

    Thanks again!
     
  10. Dufus

    Dufus .

    Reputations:
    1,194
    Messages:
    1,336
    Likes Received:
    548
    Trophy Points:
    131
    Okay, let's try to put this in order to help explain.

    Firstly there is a distinction between cores and threads using HT. With a system that has 2 threads per core it's 4 cores and 8 hyperthreads. So saying only HT cores are parked is incorrect. Better to say one thread of each core is parked.


    For you to have a parked core, both threads of that core need to be parked, not just one. Do you really mean thread here instead of core? If you have one thread of a core parked and the other unparked, the core will be at the lowest C-state of the 2 threads, i.e. if logical CPU0 and CPU1 share the same physical core and CPU1 is parked while CPU0 is in C-state C0, the core will be in C-state C0.


    There may be something already running on the preferred thread. The OS doesn't know how long a software thread is going to run, nanoseconds or hours, other than by using the previous history of the software thread. If the hardware thread is occupied by another application, then wouldn't it be more efficient to assign a different hardware thread rather than wait? How are you monitoring these switches between threads? Probably with software that needs direct access to specific threads/cores, in which case this often influences the scheduling behavior.

    Here's an example assigning Linpack to one software thread with the default affinity of using any core. By using a 20 second sampling time there is less influence on the result.

    [​IMG]

    [​IMG]

    [​IMG]

    Notice how much of the 20 seconds is spent on core 3. ;)
     