Clevo P370sm-a fan and graphics card problem

Discussion in 'Sager and Clevo' started by KuroSan, Feb 22, 2018.

  1. KuroSan

    KuroSan Notebook Enthusiast

    Reputations:
    3
    Messages:
    38
    Likes Received:
    4
    Trophy Points:
    16
    Hello,

    I have a few problems with my XMG P724. In another thread I tried to get help with my dead GTX 980M, but maybe it was in the wrong section.

    I tried to fix my GTX 980M, as it seemed to be a MOSFET problem. Now the card is working again, but not normally.
    The first time I switch on the computer each day, everything works: it boots normally and the card is recognized.
    The card takes everything I throw at it (benchmarks, games, rendering tests) without any problems.

    But when I reboot the machine, one of three things can happen:

    1. The computer doesn't boot at all: black screen, 22 beeps, then shutdown.
    2. The computer boots normally, but the card isn't recognized: code 43 in Device Manager, no external display.
    3. The computer boots normally, but about 20 seconds later, once the OS has loaded, it gives 22 beeps and shuts down.

    When the computer works normally, the fans sometimes don't spin up and the card overheats. All readable sensors are present and working, so I assume there is another sensor that only the vBIOS can read, and that this one is malfunctioning.

    Maybe someone in this forum has knowledge of and experience with this kind of defect and can tell me more about the components involved, especially MOSFET power management in relation to temperature, the VRM third-phase driver, and what the motherboard EC does.

    I have read all the technical documents and datasheets, but for the Clevo side there is not much to find.


    thank you
     
  2. Meaker@Sager

    Meaker@Sager Company Representative

    Reputations:
    7,520
    Messages:
    47,198
    Likes Received:
    12,742
    Trophy Points:
    931
    I went the whole hog and got all 6 places for VRM chips filled but I did not see this kind of behaviour. What chip did you replace it with? Where did you source it?

    [IMG]

    For reference.
     
    Last edited: Feb 22, 2018
  3. KuroSan

    KuroSan Notebook Enthusiast

    Reputations:
    3
    Messages:
    38
    Likes Received:
    4
    Trophy Points:
    16
    Hi,

    I replaced the MOSFETs as shown in your reference picture. I bought them on eBay from a seller in Hong Kong. But the VRM controller is on the back of the card: a uP1642.
    Its datasheet mentions a temperature-sensing circuit that controls MOSFET power. With 3 MOSFETs, 80% load or high temperature causes throttling; with 6 this doesn't happen as long as you don't raise the power limit.

    http://international.download.nvidia.com/openvreg/openvreg-type2-plus-1-specification.pdf

    I think that when I reboot, the card starts in a protection mode and stays at boot voltage, and that sometimes during boot the EC loses the signal from the VRM (uP1642) temperature sensor.

    But I don't understand how the EC and the power management chip communicate through the video BIOS. My next step is to replace the uP1642 with one I bought on eBay GB, because I think a false reading could be causing both the fan problem and the protection mode.

    Maybe you know something about the checks between the EC and the vBIOS. I couldn't find anything for Clevo, or anything more specific about the power-up sequence of the computer.

    Thank you

    Sorry for my bad English, it is not my native language.
     
  4. KuroSan

    KuroSan Notebook Enthusiast

    Reputations:
    3
    Messages:
    38
    Likes Received:
    4
    Trophy Points:
    16
    OK, I have to give up on this one. The VID trace ripped off the PCB on my second attempt to remove the QFP24 IC. The first attempt to solder the new chip went well, but I had a little too much solder on the ground contact, which caused a short, and while trying to fix that I had bad luck. There is no way of fixing it now.

    For anyone who's interested: the problem was that the card's video BIOS could not communicate the card load (INA3221 IC, which measures the current through the MOSFETs) and the voltage regulator temperature (uP1642 IC, which monitors itself) to the EC. That's why the PC sometimes started but the fan didn't spin up at high temperatures (the card was reporting wrong conditions to the vBIOS and EC). After a reboot the signal was lost completely, resulting in a boot attempt with 22 beeps and then shutdown.

    When code 43 appeared in Device Manager, the uP1642 was not able to change from boot voltage to any other voltage requested by the driver, and the card was running in safe mode.
    With everything OK, my card reached 82°C at most under heavy load. When malfunctioning it quickly and easily went over 90°C, with strange throttling behavior.
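
    The symptom pattern described above can be summarized as a small toy model in Python. It is only a sketch of the observations in this post, not actual EC or vBIOS logic; `CardState`, its fields, and `expected_symptom` are hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class CardState:
    # Hypothetical flags, named for this sketch only.
    telemetry_reaches_ec: bool    # load (INA3221) and regulator temp (uP1642) reach the EC
    telemetry_is_correct: bool    # the reported values match the real conditions
    can_leave_boot_voltage: bool  # uP1642 follows the voltage the driver requests


def expected_symptom(state: CardState) -> str:
    """Map a card state to the symptom reported in this thread."""
    if not state.telemetry_reaches_ec:
        # Signal completely lost, as after a reboot.
        return "22 beeps, then shutdown"
    if not state.can_leave_boot_voltage:
        # Card stuck at boot voltage, running in safe mode.
        return "code 43 in Device Manager"
    if not state.telemetry_is_correct:
        # Wrong conditions reported to vBIOS and EC.
        return "fan does not spin up, card overheats"
    return "normal operation"


if __name__ == "__main__":
    print(expected_symptom(CardState(True, False, True)))
```

    The order of the checks is an assumption: a completely lost signal is treated as the most severe case because it stops the machine before anything else can be observed.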

    I didn't understand it all, but maybe what I found out is useful for someone with similar problems.

    The central part of all this is the uP1642 and its temperature sensor. My card was not cooled on the back: Clevo apparently decided the backplate was too heavy or something.
    That's why I think this chip was damaged, and why I tried to repair it.

    thank you
     
  5. Meaker@Sager

    Meaker@Sager Company Representative

    Reputations:
    7,520
    Messages:
    47,198
    Likes Received:
    12,742
    Trophy Points:
    931
    I had the chips professionally soldered on. The connections need to be very good, not just making contact, to ensure better load balancing and low wastage.
     
  6. KuroSan

    KuroSan Notebook Enthusiast

    Reputations:
    3
    Messages:
    38
    Likes Received:
    4
    Trophy Points:
    16
    Yes, soldering the MOSFETs wasn't that hard, but on my card they weren't dead, as I found out later. The power management on the back was broken. That's why my card could not undervolt stably, even with an ASIC score of nearly 78%. Only one step down was possible under load (from 1.025 V to 1.012 V), and even that was not really stable.

    I had problems with this card from day one. In the middle of a game it started overheating because the fan went quiet at 80°C. The temperature then kept rising, with no more power throttling. Strange behavior.

    This chip was causing the problem all along, but it took some kind of "reverse engineering" to understand it.

    https://www.upi-semi.com/en-article-upi-362-1472
     
  7. Danishblunt

    Danishblunt Notebook Virtuoso

    Reputations:
    578
    Messages:
    3,909
    Likes Received:
    1,553
    Trophy Points:
    231
    That is indeed strange behavior. I was under the impression that the EC of the Clevo would determine the fan speeds according to the temps that the card reports.
     
  8. KuroSan

    KuroSan Notebook Enthusiast

    Reputations:
    3
    Messages:
    38
    Likes Received:
    4
    Trophy Points:
    16
    Yes, I thought the same, but what was triggering the 22-beep alarm must be some other sensor on the card. It's the "THALERT" function, triggered by the NVIDIA card.

    It's a contact from the MXM slot to the EC, and the notebook behaves the same way when no card is inserted. With the card inserted, though, it started with a picture on screen, then after 10 to 20 seconds gave the alarm and shut down.

    The GPU temperature sensor is readable without a driver and showed around 40°C before shutdown. From my understanding, the GPU sensor triggers the temperature limit function, but the other one, shown in the uP1642 datasheet and the NVIDIA documentation, triggers the power limit function. My card sometimes showed a "no load" limit at 99% load and 85 to 87°C.

    I tested my notebook with an HD 7970M and everything worked as it should.

    As I said, I didn't understand it all, which is why I asked for help in this forum.
     
  9. Danishblunt

    Danishblunt Notebook Virtuoso

    Reputations:
    578
    Messages:
    3,909
    Likes Received:
    1,553
    Trophy Points:
    231
    As far as I know, the 30-second beeping + shutdown actually happens when the EC cannot read the temp sensors of the card. That's why cards like the MXM RX 480 and RX 580 boot fine, show a picture etc., but will make the system beep and shut down: the EC cannot read their temp sensors.
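
    A minimal Python sketch of that idea, assuming the EC polls the card once per second and gives up after about 30 missed readings. The real Clevo EC firmware is not documented here, so `ec_decision`, `grace_polls`, and the polling interval are all hypothetical.

```python
from typing import Iterable, Optional


def ec_decision(readings: Iterable[Optional[float]], grace_polls: int = 30) -> str:
    """Toy model of an EC that shuts down when the card's temp sensor is unreadable.

    `readings` holds one value per poll; None means the sensor could not be read.
    `grace_polls` is an assumed limit, loosely based on the ~30 seconds seen in practice.
    """
    missed = 0
    for temp in readings:
        if temp is None:
            missed += 1
            if missed >= grace_polls:
                return "beep and shut down"
        else:
            # A valid reading resets the counter.
            missed = 0
    return "keep running"


if __name__ == "__main__":
    # A card whose sensor the EC can never read.
    print(ec_decision([None] * 30))
```

    In this model a card that boots and shows a picture can still be shut down later, because the decision depends only on whether the sensor is readable, not on whether the GPU itself works.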

    I don't know enough about how the whole card is built to tell you exactly how everything fits together. Maybe people like @Prema or @Khenglish can tell you more about it.
     
  10. KuroSan

    KuroSan Notebook Enthusiast

    Reputations:
    3
    Messages:
    38
    Likes Received:
    4
    Trophy Points:
    16
    Exactly, but it seems not to be the GPU core sensor alone. The service manuals mention several different VGA temperature lines coming from the MXM slot.

    In the case of the AMD RX cards there is no sensor the EC can read. SMBus is involved, or maybe they use I2C instead. I'm sure this is hardware related, as it was with the older HD 7xxx cards in Alienware and Clevo machines.

    Anyway, my card is dead now. When I'm in the mood, maybe I'll try to fix the broken trace on the PCB. But thank you for the interest.
     