server issues
Author | Message |
---|---|
Send message Joined: 19 May 14 Posts: 73 Credit: 356,131 RAC: 0 |
Hey all, We are currently having a few issues with the server. Everything seems to be up and running yet no work units are being sent out. If any of you see any erroneous behavior on your end please let us know. Thanks, Sidd |
Send message Joined: 4 Sep 12 Posts: 219 Credit: 456,474 RAC: 0 |
If we are - finally - to pay some attention to the server, could I remind you of three messages where I've posted about the BOINC server code being outdated? Message 63188 - unfinished web update, corrupts < and > in [ pre ] and [ code ] blocks. Message 63274 - php warning when 'don't move stickies to top' is selected. BOINC message 62439 - recent ATI cards aren't recognised as being OpenCL capable. And you'll know about the connection errors and timeouts since I started drafting the above. |
Send message Joined: 27 Mar 15 Posts: 1 Credit: 9,381,903 RAC: 0 |
I know I am not getting any work units. Other than that and the n-body issue everything seems fine. |
Send message Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0 |
Hey guys, Looks like we have the server getting some work units out again. As for the other issues, now that the spring semester is over maybe I can get some help from Travis on fixing some of the persistent server issues. The timeouts and connection errors were the result of us working on the server trying to unstick the runs. You can expect those to continue a bit as we try to fix the nbody runs, but after today they shouldn't happen as frequently. Sorry these issues are taking so long to resolve. Jake W. |
Send message Joined: 5 Aug 11 Posts: 1 Credit: 3,232,133 RAC: 0 |
It has been the same for days. I received the same work unit and completed it twice; each time it sat at 100% for a long time, still running and never becoming ready to report. After about 16 hours each time, I reset the project. Since the second reset, all I get is a "communication deferred" message, no matter how many times I try to update the project. |
Send message Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0 |
The work units getting stuck is a different issue than what we are having with the server. No worries though, Sidd thinks he found the issue with the nbody client and will be working on a solution to that. The server issue resulted in no work units being sent out for separation and nbody, but that issue has been resolved. There are still a few other unresolved issues on the server so expect a few restarts over the next few days. Jake W. |
Send message Joined: 2 May 15 Posts: 1 Credit: 87,313,200 RAC: 0 |
The problem is not solved. I didn't receive WUs for ATI just now and had to update the project. It's been very unstable since the power outage. |
Send message Joined: 2 Oct 14 Posts: 43 Credit: 55,103,888 RAC: 1,267 |
Something else seems strange. I am getting responses that some of my results (14) are inconclusive. I see no errors for my other BOINC project, and all results are verified correct. Also, the cobblestones on my profile do not match what BOINC charts on my machine. According to BOINC, I have 2.02 billion cobblestones, yet my profile shows slightly less than 2.0 billion. That is a 2 percent difference. Now the count for this post in the signature area shows slightly over 2 billion. Thought you might like to know. |
Send message Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0 |
Hey Wisesooth, I'm not sure what is causing your credit discrepancy. I think your profile on our website will only show the credits you earned through MilkyWay@Home while BOINC will add in credits from other projects. That is just a guess though and maybe someone else can give you more insight on that. As for the inconclusive problem, that is standard operating procedure for our project. Our server does not award credits until after your results have been validated against the results of other users. Inconclusive just means the server is waiting to hear back from other users who are still crunching the workunit so it can be validated. As far as I understand, this is different than many other projects where they use different validation strategies. Jake W. |
Send message Joined: 12 Feb 11 Posts: 1 Credit: 972,109 RAC: 0 |
For the last several weeks I have seen projects listed as "100% complete". Why do you insist on listing completed projects? |
Send message Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0 |
Hey Richard, I am currently looking into updating our BOINC libraries for both the client and server. I think this will fix many of the issues people have been running into, both the errors you mentioned here and errors mentioned in other places on the forum. I don't know how long it will take us to get these libraries updated, but it is being looked into. For the sake of some transparency, the major hurdle we have is that the current libraries we use were customized for our project. This means simply building the newest version of the BOINC libraries and deploying that on the server and client will likely cause other unforeseen problems. Hopefully we can get a stable version working soon though. Sorry for the delayed response. Jake W. |
Send message Joined: 8 Oct 10 Posts: 1 Credit: 204,226,540 RAC: 0 |
same issue here. "job" states 100% complete: 'stuck', no change in last 3-days; i suspended it so that i could free-up resources for other "idle-work-projects" on this machine. no other MW tasks received. |
Send message Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
"same issue here." As long as it is suspended, you won't get other work, as MW thinks you already have plenty. Personally, after 1 day of no progress I would have aborted the unit and moved on to another one. Crunching is one thing, wasting time is another. |
Send message Joined: 8 May 09 Posts: 3339 Credit: 524,010,781 RAC: 0 |
"Something else seems strange. I am getting responses that some of my results (14) are inconclusive. I see no errors for my other BOINC project and all results are verified correct." The only 2 errors I can see now (I am just a cruncher, not an admin here) are validation errors. You can't compare one BOINC project to another, as each project writes its own application files and each has its own set of priorities and things it is looking for. This project could be much more picky in the details than your other project. "Also, the cobblestones on my profile do not match what BOINC charts on my machine. According to BOINC, I have 2.02 billion cobblestones, yet my profile shows slightly less than 2.0 billion. That is a 2 percent difference. Now the count for this post in the signature area shows slightly over 2 billion. Thought you might like to know." Try this page for your stats: http://stats.free-dc.org/stats.php?page=userbycpid&cpid=da1cb0a1901cbb23c6241de969db356e It shows you at 2.8 million combined cobblestones, with MW at just over 2 million. |
Send message Joined: 26 Jan 09 Posts: 12 Credit: 53,679,035 RAC: 0 |
Seem to have a lot of 'validation inconclusive' nvidia opencl units - about 50 over the last couple of days since the wu's started coming in again. Not sure whether that's an anomaly, but it seems unusual enough to let you guys know ... |
Send message Joined: 18 Jul 09 Posts: 300 Credit: 303,565,482 RAC: 0 |
"Seem to have a lot of 'validation inconclusive' nvidia opencl units - about 50 over the last couple of days since the wu's started coming in again. Not sure whether that's an anomaly, but it seems unusual enough to let you guys know ..." This is normal. The results are simply waiting to be confirmed by other crunchers. |
Send message Joined: 26 Jan 09 Posts: 12 Credit: 53,679,035 RAC: 0 |
Ermm...pretty much the whole batch shows as 'validate errors'. In fact everything back to the server dropout shows as a validation error. All but two out of 188 tasks show as validation errors. That seems much less normal ... To be fair, I put a new card in a week or so back - a new Asus GTX 750. But that model ran fine on another machine and is handling SETI with no problems. Comments? |
Send message Joined: 18 Jul 09 Posts: 300 Credit: 303,565,482 RAC: 0 |
"Seem to have a lot of 'validation inconclusive' nvidia opencl units - about 50 over the last couple of days since the wu's started coming in again. Not sure whether that's an anomaly, but it seems unusual enough to let you guys know ..." "This is normal. The results are simply waiting to be confirmed by other crunchers." "Ermm...pretty much the whole batch shows as 'validate errors'. In fact everything back to the server dropout shows as a validation error. All but two out of 188 tasks show as validation errors. That seems much less normal ..." You asked about "validation inconclusive". Validate errors are a different thing and can be caused by any number of issues. Sit tight and one of the crunchers who knows far more about nvidia than I do will be able to assist you. |
Send message Joined: 2 Oct 14 Posts: 43 Credit: 55,103,888 RAC: 1,267 |
Thanks for the info, Jake. You asked about server issues at our end following the scheduled power outage. Earlier today, I saw that user update requests were not handling tasks that were ready to report. I checked the "server status" button and found that about 7 server tasks were down with errors. All of them seemed to be related to database corruption in MySQL. I suspect that you may have corrupted indexes. I know little about the "secret sauce" that lubricates the grid, but I have had some past experiences with MySQL that were not very pretty. The servers may think they are running, but not be aware that they are running in circles. Seti@home had a similar problem months ago. They had to rebuild their database from scratch. Hope this is helpful. |
Send message Joined: 25 Feb 13 Posts: 580 Credit: 94,200,158 RAC: 0 |
Hey Wisesooth, That was me doing some server maintenance. I put up some new runs on Tuesday using some obscure settings to see if they helped the runs finish faster. It turns out they broke the work unit generator for modfit, so that was just me rebooting everything to get it running again. Sorry, maybe I should have made a news post. Jake W. |
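
For anyone still puzzled by "inconclusive" results, here is a rough Python sketch of the idea Jake described earlier in the thread: a result earns nothing until enough other hosts have returned matching results for the same workunit. This is not the project's or BOINC's actual validator code. The names `Result` and `validate`, the quorum of 2, and the numeric tolerance are all assumptions made up for illustration.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Result:
    host: str               # hypothetical host label
    value: Optional[float]  # None while that host is still crunching


def validate(results: List[Result], min_quorum: int = 2,
             rel_tol: float = 1e-9) -> Dict[str, str]:
    """Return a status for every result that has been reported so far."""
    returned = [r for r in results if r.value is not None]

    # Find the largest group of returned results that agree with one another.
    best: List[Result] = []
    for candidate in returned:
        group = [r for r in returned
                 if math.isclose(r.value, candidate.value, rel_tol=rel_tol)]
        if len(group) > len(best):
            best = group

    # Not enough matching results yet: everything reported stays inconclusive.
    if len(best) < min_quorum:
        return {r.host: "inconclusive" for r in returned}

    # Quorum reached: results in the agreeing group are valid, the rest are not.
    agreed = {r.host for r in best}
    return {r.host: ("valid" if r.host in agreed else "invalid")
            for r in returned}


# One host has reported, the other is still crunching.
waiting = validate([Result("host_a", 1.5), Result("host_b", None)])

# Both hosts have reported matching values.
done = validate([Result("host_a", 1.5), Result("host_b", 1.5)])
```

In this sketch `waiting` marks `host_a` as inconclusive simply because its wingman has not reported yet, while `done` marks both hosts as valid. A result that disagrees with the quorum is a separate outcome from one that is merely waiting.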
©2024 Astroinformatics Group