Nvidia driver 461.09 causes WUs to stall/run indefinitely; 460.79 works fine
Joined: 14 Apr 17 Posts: 5 Credit: 361 RAC: 0
As the title says, Nvidia's new GPU driver is breaking MilkyWay OpenCL workunits. Reinstalling the previous driver, version 460.79, fixes the issue. Another user's 1080 Ti and my 2070 Super both had the same problem and the same fix, so other Nvidia GPUs are likely affected by the bug too. While testing to confirm it was the driver, I noticed that CPU usage was identical between the two driver versions, but with the newer driver GPU load never went above idle. The tasks don't error out, at least not overnight; BOINC just shows an ever-increasing remaining time estimate. I suppose it would be possible to see what is going on by using the Nvidia Visual Profiler tool? I'm on mobile data at the moment, so I just want to check that the tool can profile OpenCL before downloading it and trying to figure out how it all works. Happy to do it if the logs could be helpful to the devs here.
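For anyone wanting to confirm the symptom before rolling back, a minimal sketch along these lines may help (assumptions: Python 3 with the pyopencl package installed and nvidia-smi on the PATH; none of this is part of BOINC or MilkyWay@home). It lists the OpenCL devices the driver exposes and then samples GPU utilization, which under the broken 461.09 driver should sit near 0% even while a MilkyWay task claims to be running:

    # Sketch only: assumes Python 3, pyopencl installed, nvidia-smi on the PATH.
    import subprocess
    import pyopencl as cl

    # 1. Confirm the driver still exposes the GPU to OpenCL at all.
    for platform in cl.get_platforms():
        for device in platform.get_devices():
            print(f"{platform.name}: {device.name}")

    # 2. Sample GPU utilization and SM clock once per second (Ctrl+C to stop).
    subprocess.run([
        "nvidia-smi",
        "--query-gpu=timestamp,utilization.gpu,clocks.sm",
        "--format=csv",
        "-l", "1",
    ])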
Joined: 9 Dec 11 Posts: 38 Credit: 1,497,896,956 RAC: 0
Not the first time a driver update has broken a DC project. I think it was last year that any NVIDIA driver past a certain point caused problems for F@H. You got it right: roll back the driver and keep truckin' :)
Joined: 14 Apr 17 Posts: 5 Credit: 361 RAC: 0
Yeah, I had a look into the F@H thing. Apparently there was a bug that went unfixed by Nvidia for 2+ years, and while the project had a workaround, it hurt performance quite a bit. I'm probably going to move on to SRBase; after reading a bit more, it seems a little silly to be using this GPU for MilkyWay when old AMD cards are just as fast or faster. And while this PC spends more time crunching than gaming, it is primarily a gaming PC, so I'd prefer to keep the drivers up to date. Still happy to do some profiling or extra troubleshooting if it would be helpful. But I guess if most people are running older hardware, or only crunching, then the driver issues aren't really a problem for the project in general.
Joined: 10 Mar 11 Posts: 9 Credit: 16,497,101 RAC: 0
The 2021/01/26 update to driver version 461.40 fixed the same issue for me. But tonight I upgraded from Windows 10 Home to Pro and every work unit now ends in a compute error. Even after detaching and reattaching the project, every work unit still errors out, so I've stopped the project until I find a solution. Life's short; make fun of it!
Joined: 10 Mar 11 Posts: 9 Credit: 16,497,101 RAC: 0
In typical RTFM fashion, I uninstalled the nVidia drivers and reinstalled them, and BOINC is happily crunching once again. Life's short; make fun of it!
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0
"In typical RTFM fashion, I uninstalled the nVidia drivers and reinstalled them, and BOINC is happily crunching once again." You don't have to uninstall the old ones if you reinstall the same version; it will simply overwrite them and you will be good to go after a reboot. Windows has a VERY bad habit of messing things up for crunchers and gamers by deciding THEIR drivers are better. They aren't, but that doesn't matter to Microsoft.
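A quick way to spot when Windows Update has silently swapped the driver is to check which version is actually loaded after a reboot. A small sketch (assumptions: nvidia-smi on the PATH, which the standard Nvidia driver install provides; Python is used here only for illustration):

    # Sketch only: prints the driver version and GPU name currently in use,
    # so a silent replacement by Windows Update is easy to spot.
    import subprocess

    subprocess.run(["nvidia-smi", "--query-gpu=driver_version,name", "--format=csv"])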
Joined: 30 Dec 12 Posts: 7 Credit: 10,011,100 RAC: 0
I converted one of my computers from Linux to Win 10 a couple of days ago. At the time I installed the 461.09 drivers, and my GPU has been happily crunching MW tasks since. The drivers were loaded after installing Windows and I have not let Windows update anything, so I suspect it was some Windows update causing the problem, not the Nvidia drivers.
Joined: 1 Dec 10 Posts: 82 Credit: 15,452,009,012 RAC: 0
It is definitely the nVidia 461.09 drivers. I have 3 hosts, all Win 10, on which the driver downclocks my overclock and the units then take an age. With the overclock the units were taking about 1 min until this glitch, which is triggered simply by being connected to a monitor/TV. I use an HDMI switch for the 3 hosts: once a reboot is instigated on one machine, I switch to the next and repeat the process until all 3 machines have been done. I can then turn the switch off as soon as the last host starts to reboot, and all 3 hosts behave. The previous driver and the 8 before it all work without the glitch, but the units take 1 min 15 secs.
Joined: 10 Mar 11 Posts: 9 Credit: 16,497,101 RAC: 0
I was noticing something glitchy, as some units were crunching for 4-12 hours and stuck at around 40-70% complete, so I realized there was a driver issue and came here to the forums for a solution. Life's short; make fun of it!
Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0
They also had a couple of bad batches of workunits; that could have figured in as well.
Joined: 1 Dec 10 Posts: 82 Credit: 15,452,009,012 RAC: 0
"They also had a couple of bad batches of workunits; that could have figured in as well." Those batches would have been problematic regardless of the nVidia driver used.