Skylake / Kaby Lake Hyper-threading bug

Discussion in 'Hardware Components and Aftermarket Upgrades' started by Assembler, Jun 26, 2017.

Thread Status:
Not open for further replies.
  1. Glzmo

    Glzmo Notebook Deity

    Reputations:
    475
    Messages:
    822
    Likes Received:
    86
    Trophy Points:
    41
    @XMG, has the XMG U727 2017 BIOS been updated with this microcode update yet? My laptop still has the BIOS it shipped with: version 1.05.05, KBC/EC Firmware Revision 1.05.02, and ME FW 11.6.10.1196. Does this BIOS already contain the fixed microcode, or is an updated BIOS available, and if so, where? I can't find anything newer than my installed BIOS in the mysn downloads section. I also don't seem to be able to disable Hyperthreading in my BIOS (options such as Hyperthreading, XD bit, Virtualization, etc. should really be available in any release).

    Update: My CPU Microcode Update Revision appears to be 48 according to HWInfo.
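
    For anyone who wants to check the loaded revision without HWInfo, here is a minimal sketch. It assumes a Linux environment (a live USB would do) where /proc/cpuinfo exposes a "microcode" line in hex; on Windows you would still need a tool such as HWInfo. The function names are made up for illustration, and minimum_fixed is a placeholder: the actual fixed revision for a given CPU has to come from Intel or the system vendor, not from this snippet.

```python
import re


def microcode_revisions(cpuinfo_text):
    """Return the set of microcode revisions (as ints) found in
    /proc/cpuinfo-style text. Each logical CPU reports its own line."""
    revisions = set()
    for line in cpuinfo_text.splitlines():
        match = re.match(r"microcode\s*:\s*(0x[0-9a-fA-F]+)", line)
        if match:
            revisions.add(int(match.group(1), 16))
    return revisions


def needs_update(revisions, minimum_fixed):
    """True if any logical CPU reports a revision below minimum_fixed.
    minimum_fixed is an assumption supplied by the caller."""
    return any(rev < minimum_fixed for rev in revisions)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            found = microcode_revisions(f.read())
    except OSError:
        found = set()  # not Linux, or /proc is unavailable
    print("microcode revisions:", sorted(hex(r) for r in found))
```

    If the 48 reported above is hexadecimal, it would show up here as 0x48. Comparing it against a vendor-confirmed revision with needs_update() is then a one-liner.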

    It might be a good idea if other vendors chimed in on the status of their Skylake and Kabylake based systems regarding this issue as well.
     
    Last edited: Jun 28, 2017
    JorgeManuelSilva91 and hmscott like this.
  2. hmscott

    hmscott Notebook Nobel Laureate

    Reputations:
    6,904
    Messages:
    20,284
    Likes Received:
    25,106
    Trophy Points:
    931
    These are both excellent questions to ask each and every vendor / maker, to get the status of the fix in the BIOS release cycle, and to get the Hyperthreading enable/disable option back into the BIOS.
     
    Last edited: Jun 27, 2017
  3. hmscott

    hmscott Notebook Nobel Laureate

    Reputations:
    6,904
    Messages:
    20,284
    Likes Received:
    25,106
    Trophy Points:
    931
    If you can recreate the symptoms by running that software again, and then disable hyperthreading, it would be interesting to know if it was the cause.

    The instability isn't just "crashing"; it can be any corruption caused at the time of an event. Sometimes it will have no visible effect, sometimes it will.

    I've often debugged anomalous behavior so rare that I needed thousands of servers to get a "clue" as to the cause. Then, once I have a good theory and can recreate the problem at will, the underlying hardware failure can be isolated.

    I've found improperly installed memory this way many times. And, failing battery backup on RAID cards. Failing ID chips in motherboards (Sun). And, finding and isolating varying levels of firmware compliance for failing OOB support.

    With bad memory, everything seems to run fine, except that different OS features report data back "differently": the size or value isn't exactly what was expected, yet it still passes the database sanity checks.

    Or an application that runs various simulations comes up with "odd" answers compared to runs on previous systems, when validating it on new hardware and OS.

    Firmware variances across 10k machines can have similar effects to what is seen with errata fixes missing across some machines, but not others.

    When you are running a large number of machines 24 / 7, 1 in a million situations can come up hourly :)
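
    To put rough numbers on that, here is a back-of-the-envelope sketch. All inputs are assumptions for illustration (a 10k-machine fleet as mentioned above, one triggering operation per machine per second, and a one-in-a-million chance per operation); none of them are measurements of this particular bug.

```python
def expected_events_per_hour(machines, operations_per_machine_per_second,
                             probability_per_operation):
    """Expected number of rare events per hour across a fleet,
    assuming independent operations with a fixed per-operation probability."""
    operations_per_hour = machines * operations_per_machine_per_second * 3600
    return operations_per_hour * probability_per_operation


# Assumed inputs, for illustration only.
fleet = expected_events_per_hour(10_000, 1, 1e-6)
single = expected_events_per_hour(1, 1, 1e-6)

print(f"fleet of 10,000: about {fleet:.0f} events per hour")
print(f"single machine: about one event every {1 / single:.0f} hours")
```

    Under those assumptions the fleet sees roughly 36 events an hour, while a single machine would see one about every 278 hours, which is why the same bug can look like a daily nuisance to one person and a myth to another.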

    It's a matter of experience and perception. Better safe than sorry: checking whether turning hyperthreading off solves strange anomalous behavior can be a quick test that takes it off the checklist, and off your mind.
     
    Papusan and tilleroftheearth like this.
  4. hmscott

    hmscott Notebook Nobel Laureate

    Reputations:
    6,904
    Messages:
    20,284
    Likes Received:
    25,106
    Trophy Points:
    931
    The Kabylake CPUs are supposed to be affected, but the only info we have on a fix is for the 6th Gen CPUs. I only found SKL150 for Skylake; search this doc for SKL150 for more info:
    https://www3.intel.com/content/dam/...s/desktop-6th-gen-core-family-spec-update.pdf

    The Debian, Intel, and other notes so far say there is no fix for the 7th Gen Kabylake, but that it is affected.

    I would imagine that the Kabylake X CPUs are a clone of the existing Kabylake CPUs, so they would also have the same issue.

    IDK if the Skylake-X CPUs are counted in the same errata as the Skylake CPUs, but since they are so new I would assume frequent BIOS updates should get them covered quickly after release.

    All good questions :)
     
  5. hmscott

    hmscott Notebook Nobel Laureate

    Reputations:
    6,904
    Messages:
    20,284
    Likes Received:
    25,106
    Trophy Points:
    931
    Here is someone who was able to come up with a repeatable failure test that didn't fail when Hyperthreading was disabled:

    https://hardforum.com/threads/skyla...oken-hyper-threading.1938149/#post-1043078134

    "I found a set of conditions that each time I perform it, my machine will lock up.

    I just tried without HT enabled and it works fine.
    So looks like this bug affects me with my 6700K.

    The bizarre condition is:
    Win7-64
    Running torrent client in a sandbox.
    Sandbox and torrent client have limited system access via Comodo Firewall HIPS (not sure if this is relevant but it may help trigger it).

    If I am downloading a torrent and try to play a card game (windows built in games) to pass the time, my machine will hard lock. Every time.

    If I open the card game first or stop the torrent or let the torrent finish it works without issue.

    I thought it might be related to locked down security issues but as its something basic I can cope with I didnt pursue a solution.

    I tested opening a card game with a torrent downloading with HT disabled and it works fine.

    So that explains that."

    Even better, he found that the latest BIOS for his motherboard fixed the problem :)

    "Replying to my post to confirm that updating to the latest motherboard bios has fixed the problem.
    No crash with HT enabled now.
    My motherboard is the Asus Maximus VIII Hero.
    Previous bios version 2001 29/08/16
    New bios 3401 07/04/17

    If the problem was caused by this bug, the CPU fix is in the latest bios."
    https://hardforum.com/threads/skyla...yper-threading.1938149/page-2#post-1043079608
     
    Last edited: Jun 27, 2017
    tilleroftheearth likes this.
  6. XMG

    XMG Company Representative

    Reputations:
    695
    Messages:
    1,722
    Likes Received:
    2,078
    Trophy Points:
    181
    I am leaving this post as a placeholder so that I can make a more comprehensive reply and deal with all the questions in one place. To do this, I am checking the situation with BIOS versions and associated microcode for both the Skylake and Kaby Lake CPU systems in order to cover all bases.

    But for now I will say two things:

    1/ if people are reporting a problem with software running with HT enabled and that disabling it means that the software runs better or without problems, this can't be used to conclude that the root cause with HT on is the exact problem that we're discussing on this thread. It could be due to this problem, but the link needs to be proved before we can argue that someone has definitely replicated the exact same issue.

    2/ it's extremely rare and only a handful of people have been able to replicate it (literally a handful). It is a bug, Intel have confirmed this via other media, and there is a fix.

    Will update once 100% of the information is gathered in order to try and clarify the status for everything.

    UPDATE http://forum.notebookreview.com/thr...er-threading-bug.806317/page-10#post-10558834
     
    Last edited: Jul 5, 2017
  7. hmscott

    hmscott Notebook Nobel Laureate

    Reputations:
    6,904
    Messages:
    20,284
    Likes Received:
    25,106
    Trophy Points:
    931
    And, I think it just goes to show that disabling HT is still, after so many years, a good step to take when debugging odd situations like the one Nenu from HardForum.com posted.

    As he posted, in Nenu's specific case, doing a BIOS update that likely includes the Intel Skylake HT microcode patch fixed his repeatable problem, the same as turning off HT in the BIOS - now he can run with HT Enabled.

    Maybe it's just too good to be true to find a person who can demonstrate the problem (to himself) and who found the solution in a BIOS update. He was running a year-old BIOS, so maybe other BIOS fixes during that year fixed his specific issue?

    Maybe if you have an odd problem on your Skylake / Kabylake system that so far debugging hasn't solved, you could try disabling HT in the BIOS and see if the problem goes away.

    Or, find and update your BIOS with the latest Errata patches, and not worry about it any more ;)
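
    If you do try the HT-off test, it helps to confirm that the BIOS setting actually took effect. Below is a minimal sketch, again assuming a Linux environment where /proc/cpuinfo exposes "siblings" (logical CPUs per package) and "cpu cores" (physical cores per package); the function name is made up for illustration. On Windows, Task Manager's "Cores" versus "Logical processors" counts give the same information.

```python
def hyperthreading_active(cpuinfo_text):
    """Compare 'siblings' with 'cpu cores' for the first package listed.
    Returns True or False when both fields are present, otherwise None."""
    siblings = cores = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if not value.isdigit():
            continue  # skip fields that are not plain integers
        if key == "siblings" and siblings is None:
            siblings = int(value)
        elif key == "cpu cores" and cores is None:
            cores = int(value)
    if siblings is None or cores is None:
        return None
    # More logical CPUs than physical cores means SMT/HT is on.
    return siblings > cores


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print("hyperthreading active:", hyperthreading_active(f.read()))
    except OSError:
        print("could not read /proc/cpuinfo (not Linux?)")
```

    Run it once before and once after changing the BIOS option. If the result doesn't flip, the setting was not applied and the test tells you nothing about the bug.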
     
    Last edited: Jun 27, 2017
  8. JorgeManuelSilva91

    JorgeManuelSilva91 Newbie

    Reputations:
    5
    Messages:
    9
    Likes Received:
    12
    Trophy Points:
    6
    Last edited: Jun 27, 2017
  9. Kent T

    Kent T Notebook Virtuoso

    Reputations:
    265
    Messages:
    2,961
    Likes Received:
    753
    Trophy Points:
    131
    Also, another good rule of thumb on virtual machines: give the host machine 2x-3x the CPU speed and RAM that the software and guest OS you're running virtually need. Letting the VMware drivers install the updates they need is also wise. A full-voltage Core i7 at higher clock speeds is a good CPU for virtual machines. I also recommend 12-16 GB of RAM or more; you'll have fewer issues that way.
     
  10. hmscott

    hmscott Notebook Nobel Laureate

    Reputations:
    6,904
    Messages:
    20,284
    Likes Received:
    25,106
    Trophy Points:
    931
    Sorry man, you misread the point of the thread. o_O

    Thanks for the VM tips though, I'm sure someone will find them useful in another thread focused on VMs.

    This thread is about getting Skylake / Kabylake (when available) CPU microcode updates into Windows, not into a virtual machine.

    Only the last post, just before yours, mentioned a method that uses a tool from VMware on the host machine to update CPU microcode.

    Useful when VMware discovers a bug in a CPU and Intel provides a fix - the tool can apply the fix immediately instead of waiting for a BIOS update from the motherboard maker.

    Read from the 1st post, and find out about the errata available to fix a CPU Hyperthreading bug :D
     
    Kent T and JorgeManuelSilva91 like this.