download failures on GPU tasks
Author | Message |
---|---|
Joined: 17 Nov 09 Posts: 4 Credit: 111,166,377 RAC: 0 |
I built out a new Mint 19.1 rig with an AMD processor and an Nvidia 1030 GPU. All was well until the new servers started back up. Everything seems in order, but I now get download failures for GPU jobs. Here is what the logs show:

Fri 22 Mar 2019 12:18:39 PM CDT | Milkyway@Home | Started download of milkyway_1.46_x86_64-pc-linux-gnu__opencl_nvidia_101
Fri 22 Mar 2019 12:20:41 PM CDT | | Project communication failed: attempting access to reference site
Fri 22 Mar 2019 12:20:41 PM CDT | Milkyway@Home | Temporarily failed download of milkyway_1.46_x86_64-pc-linux-gnu__opencl_nvidia_101: transient HTTP error
Fri 22 Mar 2019 12:20:41 PM CDT | Milkyway@Home | Backing off 00:16:09 on download of milkyway_1.46_x86_64-pc-linux-gnu__opencl_nvidia_101
Fri 22 Mar 2019 12:20:43 PM CDT | | Internet access OK - project servers may be temporarily down.

The log suggests that the issue is on the server side, yet I am not getting these errors on my other Mint machine, which is running Mint 18. Any suggestions on what the issue may be or where to look? There are no related errors in the syslogs.

A back story on this machine: I had issues on the initial install due to a sour SSD with bad sectors. That seems to be resolved. I also played around with some of the extra library packages specific to Boinc, so I may have inadvertently created a conflict. I have since removed everything except the basic Boinc install (manager and client).

Thanks, Ron |
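For anyone comparing logs across several hosts, here is a minimal Python sketch that pulls the failed downloads out of event-log lines shaped like the ones quoted above. It assumes the three-field `timestamp | project | message` layout and the exact "Temporarily failed download of" wording seen in this post; the function names `parse_line` and `failed_downloads` are made up for illustration and are not part of Boinc.

```python
from collections import Counter

# Message prefix as it appears in the log excerpt quoted in this thread.
FAILED_PREFIX = "Temporarily failed download of "


def parse_line(line):
    """Split one event-log line into (timestamp, project, message).

    Assumes the 'timestamp | project | message' layout from the post;
    the project field may be empty. Returns None for other lines.
    """
    parts = [part.strip() for part in line.split("|", 2)]
    if len(parts) != 3:
        return None
    return tuple(parts)


def failed_downloads(lines):
    """Count failed downloads per (project, filename, reason)."""
    counts = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        _, project, message = parsed
        if message.startswith(FAILED_PREFIX):
            rest = message[len(FAILED_PREFIX):]
            filename, _, reason = rest.partition(": ")
            counts[(project, filename, reason)] += 1
    return counts


# Two of the lines from the post, used here as sample input.
SAMPLE = [
    "Fri 22 Mar 2019 12:20:41 PM CDT | | Project communication failed: "
    "attempting access to reference site",
    "Fri 22 Mar 2019 12:20:41 PM CDT | Milkyway@Home | Temporarily failed "
    "download of milkyway_1.46_x86_64-pc-linux-gnu__opencl_nvidia_101: "
    "transient HTTP error",
]

for key, count in failed_downloads(SAMPLE).items():
    print(count, *key)
```

Feeding it the saved event log from each machine should make it easy to see whether the same file is failing everywhere.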
Joined: 14 Oct 16 Posts: 4 Credit: 25,072,475 RAC: 0 |
I'm also having a problem starting up some new machines (which have not run m@h before): I get a lot of tasks, but they are all listed as "downloading", apparently waiting for one single file, milkyway_1.46_x86_64-pc-linux-gnu, that just refuses to download. It sits waiting for hours and hours, still at 0%. It just has to be some server issue. I hope the admins can attend to it soon, so I can get my 5 new machines running some m@h. //Gunnar |
Joined: 13 Oct 16 Posts: 112 Credit: 1,174,293,644 RAC: 0 |
In another thread, someone mentioned that they went to the "https://milkyway.cs.rpi.edu/milkyway/download/" folder, manually downloaded the file, and placed it in the MilkyWay folder on their machine. |
Joined: 14 Oct 16 Posts: 4 Credit: 25,072,475 RAC: 0 |
Yes, I've read it, and I also tested it successfully on one of my computers: after I had downloaded the file, chmod'ed it, and placed it in the correct folder, it started by itself after a while. However, this is certainly not the correct way to solve the problem! I still wonder what stopped Boinc from downloading it in the first place. Is it some kind of certificate problem with the https server? //Gunnar |
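For anyone who wants to script the workaround described above (download, chmod, place in the project folder), here is a minimal Python sketch. Only the download folder URL comes from this thread. Everything else is an assumption: `PROJECT_DIR` is the location the Debian/Ubuntu/Mint Boinc packages commonly use and may differ on your system, the file is assumed to sit directly under the download folder, and the names `fetch_app_file` and `make_executable` are made up for illustration.

```python
import os
import shutil
import stat
import urllib.request
from pathlib import Path

# Download folder mentioned earlier in this thread.
DOWNLOAD_BASE = "https://milkyway.cs.rpi.edu/milkyway/download/"

# ASSUMPTION: typical project folder for the packaged Boinc client on
# Debian/Ubuntu/Mint. Check where your own client keeps its data.
PROJECT_DIR = Path("/var/lib/boinc-client/projects/milkyway.cs.rpi.edu_milkyway")


def make_executable(path):
    """Add execute permission for user, group and others (like chmod a+x)."""
    path = Path(path)
    path.chmod(path.stat().st_mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


def fetch_app_file(filename, project_dir=PROJECT_DIR, base_url=DOWNLOAD_BASE):
    """Download base_url + filename into project_dir and mark it executable.

    ASSUMPTION: the file is served directly under base_url.
    The data goes to a temporary '.part' file first, so the client never
    sees a half-written file under the final name.
    """
    target = Path(project_dir) / filename
    partial = target.with_name(target.name + ".part")
    with urllib.request.urlopen(base_url + filename, timeout=60) as response:
        with open(partial, "wb") as out:
            shutil.copyfileobj(response, out)
    make_executable(partial)
    os.replace(partial, target)
    return target
```

Usage would be something like `fetch_app_file("milkyway_1.46_x86_64-pc-linux-gnu__opencl_nvidia_101")`, run by a user that is allowed to write to the project folder. On packaged installs that folder usually belongs to a dedicated boinc user, so you may also need to fix the ownership of the file afterwards.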
Joined: 17 Nov 09 Posts: 4 Credit: 111,166,377 RAC: 0 |
Just a follow-up to my original post. I decided to stop all Boinc tasks and do a complete rebuild of the problem host. I reformatted the SSD and loaded a brand-new image of Mint 19.1 Cinnamon, updated the Nvidia driver, and installed only the base Boinc client and manager. I also rebooted between each component install to be sure there were no conflicts or issues. Once the apps were installed, I added some of the projects I support and let Boinc run. As before, I got a download error from Milkyway but from none of the other projects. I then tried the manual download of the offending project file, as mentioned in the previous post. This corrected the initial problem, and I am now processing Milkyway tasks. So, for now, I am up and running on my new rig. As Mint 19 is a new long-term release and has some changes from earlier versions, I can only conjecture that the issue has something to do with the combination of Mint 19.1 and the new servers on Milkyway. I also suspect that other combinations involving a fresh client OS install may not suffer from the same issue. Until someone can do proper testing or fix the issue on Milkyway's end, I'll just do what tasks I can and wait. |
Joined: 14 Oct 16 Posts: 4 Credit: 25,072,475 RAC: 0 |
You do not need to worry about Linux Mint in particular being part of the problem: I'm having the same problem with Xubuntu 14.04, 16.04.06, and also 18.04.02! I suspect it has something to do with the servers, or maybe the servers are using some combination of https and other settings that makes the client unable to download. I don't know whether the project admins read these forums daily, or whether we should PM them. //Gunnar |
Joined: 17 Nov 09 Posts: 4 Credit: 111,166,377 RAC: 0 |
I have to agree with your assessment. I found a similar error on my Win10 machine this morning. It appears that when a new task is requested/sent and it requires a new "config" file, the download fails. As this all seems to have started with the new servers, that would seem to be where the error is. There is a note that the admins are working on the work units, but nothing regarding this kind of failure. It seems likely that this issue only affects new clients that were not running prior to the server upgrade. My old client is working like a champ. As for messaging the admins, that seems like a good idea. I have not done so, as I am usually late to the game on such things and my report ends up redundant. Ron |
©2025 Astroinformatics Group