Scheduled Maintenance Concluded
Author | Message |
---|---|
Send message Joined: 13 Oct 16 Posts: 112 Credit: 1,174,293,644 RAC: 0 |
Hey Everyone, Working good now. Great job! Seems to be way more efficient as well!!! |
Send message Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0 |
Peter, I am hopeful that a factor of 5x fewer WU requests will improve server stability; that was the motivation for this upgrade. I also plan to trim the WUs table either late this week or sometime next week during another scheduled maintenance period, which should improve WU generation further. Jake |
Send message Joined: 6 Oct 14 Posts: 46 Credit: 20,017,425 RAC: 0 |
Hi Jake, I see the app's CPU priority is now "normal". Will it be changed to "below normal" in the next build? When I open a video, it causes CPU spikes. |
Send message Joined: 30 Apr 14 Posts: 67 Credit: 160,674,488 RAC: 0 |
The application works OK right now, except for the progress reverting to 0% (but that's not an immediate issue). So I believe it would be appropriate to increase the min time to contact the MilkyWay@Home server to 5 minutes, or at least to much more than the current 30s (or so). |
Send message Joined: 18 Jul 10 Posts: 76 Credit: 635,998,708 RAC: 0 |
New V1.43 working on Windows XP. Tasks being validated. One cosmetic issue with progress counter. It goes: 0%->20%->0%->40%->0%->60%->0%->80%->0%->100% Seems like it should go 0%->100% five times OR 0%->100% once. Jake - Buried in all of these posts is the issue of LINUX GPU bundled tasks failing validation. Is that on your plate? |
Send message Joined: 10 Feb 09 Posts: 52 Credit: 16,291,447 RAC: 202 |
On my Windows 10 (and RX 260X): CPU usage 0.46. With a dedicated CPU core: 266 seconds. Without a dedicated CPU core: 330 seconds. ALL validation inconclusive... :-( |
Send message Joined: 30 Apr 14 Posts: 67 Credit: 160,674,488 RAC: 0 |
I just love that it takes less than 5 times as long as a single, non-bundled WU, but the credits are x5. There are slight problems, but I think we will get past them. My WUs are already validated 50/50 :) |
Send message Joined: 19 Feb 08 Posts: 350 Credit: 141,284,369 RAC: 0 |
Great work, Jake, works fine on three pc's now. |
Send message Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0 |
Hey Everyone, I did see the Linux GPU ones were getting a few invalid results, but I thought that was just due to the significant number of errors on the release day. Is that still an issue now that the error stats are reduced? (Can anyone confirm there are invalid results on Linux from WUs assigned today?) I will work on getting a fix for the cosmetic issues for the next scheduled server maintenance in a week or so. I also need to get a Mac application working and released. I'll make a new news thread with a date to expect the next updates. Thank you all for your help with debugging and words of encouragement. The MilkyWay@home community is the best. Jake |
Send message Joined: 14 Nov 14 Posts: 9 Credit: 214,644,261 RAC: 0 |
Hello Jake, So far my 750Ti's are doing okay, thanks for all your hard work. Just got home so I'll fire up my 7970 GPUs and see how they do. Rich |
Send message Joined: 18 Jul 10 Posts: 76 Credit: 635,998,708 RAC: 0 |
Jake - I just ran another task on a LINUX machine to make sure I wasn't writing about an old issue. Received a Validate error. Here is the link. http://milkyway.cs.rpi.edu/milkyway/result.php?resultid=1887487375 |
Send message Joined: 27 Jun 09 Posts: 12 Credit: 148,038,330 RAC: 0 |
Tasks are too short. WUs validate correctly. 4xWU -> 130 sec http://milkyway.cs.rpi.edu/milkyway/results.php?hostid=618442&offset=0&show_names=0&state=4&appid= |
Send message Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0 |
Hi wb8ili, Thank you so much for giving me a result to look at. Looks like these should be listed as a computation error. I will work on rereleasing that application later tonight once I get a fix. Thanks for the report. Jake |
Send message Joined: 18 Jul 10 Posts: 76 Credit: 635,998,708 RAC: 0 |
Hi Jake - I hope that means you aren't just going to make it a "computational error" but actually fix whatever is causing "the problem". By the way, have you now realized that making a major program update on a Friday afternoon is never a good idea? As a retired engineer involved in computer systems, when I saw that, I was thinking this is not going to be a good weekend for Jake. On the other hand, if you get paid overtime and need the money, Friday afternoon is a good time! |
Send message Joined: 30 Apr 14 Posts: 67 Credit: 160,674,488 RAC: 0 |
Jake, would it be possible to create a subproject for bundles of 25/50/100? Right now we have two subprojects: MilkyWay@Home and MilkyWay@Home N-Body Simulation. And MilkyWay@Home is clearly both CPU & GPU. Wouldn't it be better to split it up a little: (1) MilkyWay@Home CPU (single WU); (2) MilkyWay@Home GPU (bundle of 5/(20?) WUs, for lower-end GPUs); (3) MilkyWay@Home GPU (bundle of 50/(100?) WUs, for high-end GPUs)? Although we might have a problem with a larger number of hosts spamming computational errors and "unable to validate" results overall. This was seen some time ago when some hosts were "rejecting" several thousand WUs per hour. A bundle of 5 still takes only 2 minutes (when computing 4 at the same time, so essentially 30s for 5 old WUs). Thus I think that a subproject that bundles more into a WU would still be a valid idea, especially since a bundle of 5 takes 120-130s on my PC, while a single WU took ~26-30s (when computing 4 at a time). There is some improvement, so bundles of 20/50/100 would potentially increase our throughput even further. Also, as I have mentioned, increasing "min time to contact" from 30s to 5min would also decrease the load on the server. |
Send message Joined: 4 Feb 11 Posts: 86 Credit: 60,913,150 RAC: 0 |
Actually, 32-bit applications waste plenty of time in the memory system that 64-bit applications can often avoid, because AMD doubled the number of registers in the register files from 8 to 16 while creating AMD64. Some programs will be able to fit all of their speed-critical data into the registers, like AQUA@home used to do with its 64-bit application when it was active, while the 32-bit version kept having to shuffle data into and out of memory, making it significantly slower. However, the speedup a GPU provides outweighs the problems that 32-bit x86 introduces, like memory system overhead and conversion of 32-bit calls to OpenCL into 64-bit calls to OpenCL, but every little bit of speed helps. |
Send message Joined: 13 Feb 09 Posts: 51 Credit: 72,633,257 RAC: 1,196 |
OK, for what it's worth, I downloaded 1.43 Nvidia apps for WinXP and Win10 awhile ago. Results so far: WinXP: Validated - 2, Validation Inconclusive - 4. Win10: Validated - 0, Validation Inconclusive - 2. Several are still awaiting validation on both systems. Run times appear reasonable, but CPU times for the XP system are running around 25%. Looks better for the Win10 64-bit system, but not enough have run to get good numbers. |
Send message Joined: 5 Jul 11 Posts: 990 Credit: 376,143,149 RAC: 0 |
Jake, Other projects (eg SETI, Einstein) have much larger work units sent to GPUs, each WU lasts for 15 minutes to an hour on my Radeon R9 290. Is there a reason MW ones are much smaller and have to be bundled? |
Send message Joined: 30 Apr 14 Posts: 67 Credit: 160,674,488 RAC: 0 |
Quote: "OK, for what it's worth, I downloaded 1.43 Nvidia apps for WinXP and Win10 awhile ago. Results so far:" For 12h of work I got 3 "unable to validate", about 1200 "validated", and a little above 200 "validation inconclusive". As for CPU, since the fast Modification Fit, the app takes 1 CPU core for the first few seconds (3-4s), then only the GPU, and then needs 1 CPU core again at the end for 5-6s. This essentially creates a situation where a WU needs about as much CPU as GPU (on my system at least) :) That's why I run a few at the same time, so the GPU never rests. I run 4, and at first I start only 2. After a few seconds, I start the next 2. That way their CPU/GPU cycles will not be identical, so the GPU stays saturated constantly. Quote: "Other projects (eg SETI, Einstein) have much larger work units sent to GPUs, each WU lasts for 15 minutes to an hour on my Radeon R9 290. Is there a reason MW ones are much smaller and have to be bundled?" It's just a methodology. Bigger isn't necessarily better. I thought that one of those bundled WUs, but right now I can't find that info. For example, ClimatePrediction@Home has WUs that take 10 or even more DAYS. That doesn't mean they're great; they do have checkpointing and they upload their data periodically, but it's still quite a lot of time. It is similar for some subprojects in PrimeGrid. However, there a single error causes 2-3 days' worth of GPU processing to go to waste, since it's not possible to upload a partial result. Thus WU size needs to stay within reason :) Quote: "Although we might have a problem with a larger number of hosts spamming computational errors and "unable to validate" results overall. This was seen some time ago when some hosts were "rejecting" several thousand WUs per hour." As for this problem that I pointed out with bigger WUs, there is a solution: send bundles only to "proven" hosts, where "proven" means more than 1000 "Consecutive valid tasks". This is already tracked in "Hosts" -> "Details" -> "Application details". Of course, 1000 can be changed to any reasonable amount. Bigger bundles would also increase my relative queue size. Previously I had 80 tasks, each effectively taking 30/4 = 7.5s (since I process 4 tasks at the same time). This totaled 600s, or 10 minutes. Right now my queue is 80 tasks, each taking 2min01sec (121s), or effectively 121/4 = 30.25s with 4 running at once. Thus my queue gives me 80 x 30.25s = 2420s = 40min20s. That's a lot better. However, in the event of server problems, that only gives me 40 minutes of work. Bigger bundles would allow us to be prepared for any server connection problems, whether local or remote. |
Send message Joined: 30 Apr 09 Posts: 101 Credit: 29,874,293 RAC: 0 |
Jake Weiss wrote: "Hey Everyone," Thank you. :-) I would love to run an x64 app on my x64 OS/hardware... :-) But I don't know whether the last x64 app really was an x64 app, because both versions report x86: MilkyWay@Home v1.39 (opencl_ati_101): <stderr_txt> <search_application> milkyway_separation 1.39 Windows x86 double OpenCL </search_application> MilkyWay@Home v1.43 (opencl_ati_101): <stderr_txt> <search_application> milkyway_separation 1.43 Windows x86 double OpenCL </search_application> With the 1.43 app (Win8.1 x64): AMD R9 Fury X (MSI Afterburner), 3 WUs/GPU: memory usage 143 MB. NV GT 730 (GPU-Z), 2 WUs/GPU: memory usage (dedicated) 110 MB, memory usage (dynamic) 73 MB. In the past the project delay was 60 seconds. It was then changed, and now it's: "Project requested delay of 91 seconds" and "[sched_op] Deferring communication for 00:01:31". If these settings need to be changed, please think of the very fast PCs that are around (like mine with 4x R9 Fury X cards ;-) so that they can be fed/saturated 24/7... :-) |
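
The queue-length arithmetic discussed in the thread (80 cached tasks, 4 running at once, giving 600s before bundling and 2420s after) can be checked with a short sketch. This is only an illustration under simple assumptions: every task takes the same time, and the host always keeps the same number of tasks running. The function names `queue_seconds` and `fmt` are made up for this example and are not part of BOINC or the MilkyWay@home application.

```python
def queue_seconds(tasks: int, seconds_per_task: float, concurrent: int) -> float:
    """Wall-clock time to drain a queue when `concurrent` equal tasks run at once.

    Assumption (not from the project): all tasks take the same time and the
    host always keeps `concurrent` tasks running.
    """
    if concurrent < 1:
        raise ValueError("concurrent must be at least 1")
    return tasks * seconds_per_task / concurrent


def fmt(seconds: float) -> str:
    """Format a duration in the thread's own style, e.g. 40min20s."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes}min{secs:02d}s"


# Figures quoted in the thread: 80 cached tasks, 4 running at the same time.
old_queue = queue_seconds(tasks=80, seconds_per_task=30, concurrent=4)   # single WUs
new_queue = queue_seconds(tasks=80, seconds_per_task=121, concurrent=4)  # bundles of 5

print(fmt(old_queue))  # 10min00s
print(fmt(new_queue))  # 40min20s
```

Under the same assumptions, changing `seconds_per_task` to a larger bundle's run time shows how much longer a host could keep working through a server outage with the same 80-task cache.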
©2024 Astroinformatics Group