since upgrading to 7.0.25

artemis8
04-24-12, 07:47 PM
I upgraded BOINC Manager to 7.0.25 (x64).
Since then I can't get any Rosetta or SETI tasks.
I've reset the projects. The event log just says:

Not reporting or requesting tasks
Scheduler request completed.
Sometimes it says "temporarily failed download of (somename).png" on SETI, but just the above two lines on Rosetta.

I can get SubsetSum tasks, so it's not as if I get none.
But when I suspend all projects except Rosetta and SETI and hit Update, it still doesn't get any new tasks. The server status pages for both say they have tasks available.
Any ideas?

zombie67
04-24-12, 07:56 PM
What happens when you set all other CPU tasks to SUSPEND? Will it pull down rosetta tasks then?

artemis8
04-25-12, 01:31 AM
When I suspend all other cpu tasks it still gets nothing. Looks like wcg might be doing the same thing :(

DrPop
04-25-12, 01:55 AM
Hmmm...I remember having a problem like this on some project a while back. Did you actually remove the project, and then re-attach to it? You might actually have to remove the project, then reboot your rig, and then re-attach to the project on the fresh boot. This will cause it to try downloading that file it got stuck on last time. Worth a shot. ;)

Fire$torm
04-25-12, 02:29 AM
The (somename).png thing is a support file for S@H but if all those files are not DL'd the project will not send you work.

Not sure about Rosetta. I'm running it now with 7.0.25. Try shutting down BOINC, go to the projects folder in the BOINC data directory, and delete the setiathome.berkeley.edu and boinc.bakerlab.org_rosetta folders. Then reboot and see what messages the projects give you.

zombie67
04-25-12, 07:23 AM
Just to be sure....So you have these set?

Activity -> Run always
Activity -> Network activity always available

If either is set to Suspend, nothing will happen while you are sitting at your machine watching it.

Slicker
04-25-12, 08:20 AM
As of BOINC 7.x, they did away with short and long term debts and switched to using hysteresis. I'm pretty sure the reason was that the algorithm was a lot more complicated and we all know K.I.S.S. is the opposite of what DA does. That and everyone thinks they sound smarter when they say "hysteresis" compared to "short term debt". You can check out some of the algorithms here: http://en.wikipedia.org/wiki/Hysteresis#Types_of_hysteresis
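For anyone unfamiliar with the term, here is a minimal sketch of what hysteresis means for work fetch. This is not BOINC's actual code: the class name, the thresholds, and the "hours of buffered work" unit are all assumptions made for illustration. The idea is that the client starts asking for work only once the buffer drops to a low-water mark, and keeps asking until it reaches a high-water mark, instead of topping up continuously.

```python
from dataclasses import dataclass


@dataclass
class HysteresisFetcher:
    """Hypothetical two-threshold work-fetch switch (illustration only)."""
    low_water: float    # start fetching at or below this many buffered hours
    high_water: float   # stop fetching at or above this many buffered hours
    fetching: bool = False

    def should_fetch(self, buffered_hours: float) -> bool:
        # Between the two thresholds the previous decision is kept;
        # that memory of the past state is the hysteresis.
        if buffered_hours <= self.low_water:
            self.fetching = True
        elif buffered_hours >= self.high_water:
            self.fetching = False
        return self.fetching


def simulate(levels, low_water=2.0, high_water=10.0):
    """Return the fetch decision for each successive buffer level."""
    fetcher = HysteresisFetcher(low_water, high_water)
    return [fetcher.should_fetch(level) for level in levels]


if __name__ == "__main__":
    print(simulate([12, 8, 4, 2, 5, 9, 10, 7]))
```

With these made-up thresholds, the client stays quiet while the buffer drains from 12 to 4 hours, starts requesting at 2, and keeps requesting until the buffer is back at 10. That would be consistent with a client that sits on "Not reporting or requesting tasks" for long stretches even though the projects have work.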

For a version to become "recommended", most of its bugs are supposed to have been fixed. Note that flaws in the design don't count as bugs; they are considered differences of opinion, so what is "broke" for some may be by design and will never get fixed. It usually takes about 40-60 builds to get the new features working properly after a major release. 7.0.25 means only build 25 so far. Therefore, I don't trust it and won't run it unless I own an AMD 79xx card and need to run OpenCL. I don't, so I won't.

artemis8
04-25-12, 08:33 AM
Activity is fine, and other cpu projects are running, just not getting tasks from Rosetta, SETI, or WCG. I have removed and reattached rosetta and seti to no avail. I hit update and it makes the request, gets nothing and then communication deferred (4 minutes on rosetta, and 5 minutes on SETI).

zombie67
04-25-12, 09:24 AM
Do you have any local settings which might be messing things up?

Tools -> Computing Preferences -> Clear

After doing that, try suspending all projects except one (say Rosetta). And then leave it alone for 30 minutes.

Also, add these to your cc_config.xml. They will provide more detail on what your client is asking for and what the project is responding with:

<cc_config>
  <log_flags>
    <sched_ops>1</sched_ops>
    <sched_op_debug>1</sched_op_debug>
  </log_flags>
</cc_config>
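
If you already have a cc_config.xml with other settings in it, you don't want to overwrite it. Here is a rough sketch, using only the Python standard library, that merges the two flags above into whatever is already there. The function name enable_log_flags is made up for this example, and you would still have to read and write the file in your own BOINC data directory yourself.

```python
import xml.etree.ElementTree as ET


def enable_log_flags(xml_text, flags):
    """Return cc_config XML text with each named log flag set to 1.

    xml_text may be empty (no existing cc_config.xml) or the contents
    of an existing file; other sections are left as they are.
    """
    if xml_text.strip():
        root = ET.fromstring(xml_text)
    else:
        root = ET.Element("cc_config")
    if root.tag != "cc_config":
        raise ValueError("root element is not <cc_config>")

    # Reuse the existing <log_flags> section if there is one.
    log_flags = root.find("log_flags")
    if log_flags is None:
        log_flags = ET.SubElement(root, "log_flags")

    for name in flags:
        node = log_flags.find(name)
        if node is None:
            node = ET.SubElement(log_flags, name)
        node.text = "1"

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(enable_log_flags("", ["sched_ops", "sched_op_debug"]))
```

Save the result as cc_config.xml in the BOINC data directory and restart the client so it picks up the new flags.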

artemis8
04-25-12, 09:52 AM
Thanks for the suggestions, after I get home from work I'll try some more things and let you all know. I appreciate it!
At least it's not idle, it's still crunching SAT@home, for the moment.

Slicker
04-25-12, 10:22 AM
The following are all from the BOINC mailing lists today:


One of the downsides of scheduling the cache is that, except maybe for CPDN, the projected hours of single-threaded jobs seem to be spread across all cores, so work fetch freezes when there's a big one.


I'm continuing to see this on a regular basis with v7.0.26. If you look at http://setiweb.ssl.berkeley.edu/beta/results.php?hostid=50707, any task 'sent' between say 01:00 and 06:00 UTC will have been a scheduler request while networking was suspended - so it's happened for the last six consecutive nights. (it's also happened during the daytime as well, but less frequently).

Looking at the bugs which have come to light since I reported this, I would imagine there will be pressure to promote the 'next release' to recommended as quickly as possible.

So, Rom, would it be possible to drop me a client build (Win64) so I can investigate this in parallel, while the other issues are being worked on?



I had a host that I upgraded to 7.0.25 and had it set to cache 0 and 10.
It has a 2-core CPU, so it ran down to the last 2 work units. When it finished one of them and uploaded/reported it, it requested more work. Cool, that is the way it should work, if I read it right.
The remaining work unit was a very long-running RNA task, 300-plus hours, and it was running in H.P. status with about 8 days left to go. When it got work, it downloaded 9 YAFU work units.
Now, YAFU work units are of the (MT) type with a 2-day deadline. That host did not stop running the RNA work unit: one core stayed busy and the other core just sat there doing nothing.
The 9 YAFU work units hit their deadline, and BOINC aborted them for having hit the deadline without being started. Cool. Now the fun part: when it reported the 9 aborted/missed-deadline/not-started work units back to the YAFU server, BOINC requested more work from YAFU and got 9 more YAFU work units with a 2-day deadline. And the same thing happened again: the YAFU work units expired, were aborted, were reported, it got 9 more, and so on in a loop.
I stopped it after the 3rd time. I'm just letting you know that I saw this.
If this does not belong here, please move it. Thanks.

While you may know more by setting the debug flags, they are just a diagnostic tool and it sounds to me like you aren't the only one still having problems. I'll say this again: If you don't need 7.x.x, you may want to downgrade to a stable 6.12.x or earlier version. If, like zombie67, you want to be the first to reach 100K etc. on every project, you just might need the latest versions for certain projects and will need to keep track of the issues and how to diagnose and fix the problems. The vast majority of us just want stability and that NEVER happens with the first version of a major release. EVER. Think about it. It took 25 tries (7.0.25 means 25 builds and bug fixes of version 7.0) just to get a version reasonably stable to release and it was released knowing there were still issues that would "be fixed in the next version".

So, let's consider what changed in 7.0.x:
I don't like having to hunt for the event log instead of having it on a tab (or was this a 6.12 change also?). I liked being able to zero the debts since I really don't care about the past time spent on projects. I don't like the cache changes for the scheduler and how the same fields now mean different things in different versions. I don't like the high priority changes and how it switches between multiple high priority tasks and then never completes any of them - or how it marks a task as high priority when it is 50% complete, has 4 hours remaining and 4 days until the deadline. I don't like how OpenCL support was added as a sub-type of AMD or nVidia rather than as a separate processor type, because OpenCL on CPU (a real easy way to create mt tasks) won't work and the scheduler has to do double the work (first check CAL, then check OpenCL, and then decide which to use). I don't like how GPU tasks are killed vs suspended. I couldn't care less about the basic screen changes. Yep, it will be version 9.0 before they either get all those things fixed or convince the majority that it works as designed even though the design is flawed.

artemis8
04-25-12, 11:13 PM
I rolled back, got plenty of rosetta tasks and the seti uploader/downloader is now down so that's why I can't get any from there. I'll just stick with tried and true for a while.

coronicus
04-25-12, 11:52 PM
Hate that you went through so much trouble, but thank you for sharing. I was on the fence about upgrading, but at this point I will be staying on 6.xx for now.