-
Re: Donate never ending jobs
Quote:
Originally Posted by
Bryan
Congrats on taking over top dog in Donate ... anytime you outproduce stoneageman you are doing well !!
Not there yet. And if I do get to my goal of being close to 5 Mill, he will probably improve as well. Don't care to chase him or anyone really, just want to wring what I can out of what I've got. I am sure you know that urge well.
Last night was a setback. The BOINC client aborted on both AMDs; the Intel is just fine. One aborted shortly after I last looked at it at 1:30 a.m. and the other at 7 a.m. So I lost a good 10 hours, and 5 Mill may not happen any time soon. Why is it always the AMDs that crap out? You would think the graphics driver would work better on them.
-
Re: Donate never ending jobs
Quote:
Originally Posted by
Mike029
I'm loving the read on this EmSti.
Would you mind starting another thread with your progress? You have lots of good info here on the 7970 cards for this project. (BTW, you are making me itch for a new card.) We have a few members that may be getting ready to jump at a few 7970s and would love to keep this info and your experiences handy. ;)
Once I get it down, I will try to remember to do that. Not much point yet, because there is a post on the Donate forum saying that a new application is on the way that is supposed to fix some of the issues. No date the last I looked.
-
My guess would be the combined Catalyst driver. AMD uses it for both the GPUs and the CPU chipset on the mobo. So when the GPU crashes, I'm betting it takes down all of Catalyst with it.
Of course the Intel rigs do not have this issue; they are using an entirely different driver for the chipset...
-
Re: Donate never ending jobs
Quote:
Originally Posted by
Bryan
All 4 of mine have the 3 10" fans. But if you take a look at the side and top there are mounting holes for 120mm fans. The review said they had dropped the temp big time by changing the 10" side fan to 4 120mm. I'll have to see if I can find that article.
With fans, bigger isn't better, just quieter. In order for a small fan to blow 65 cfm of air it has to spin really fast. (My 24 core 1U webserver sounds like a freaking jet engine with 8 x 25mm fans. I think they spin at around 8000 RPM at full throttle.) By comparison, the larger fan can spin its bigger blades at 1500 RPM and move the same amount of air. Most fans, regardless of their size, move about 45-65 cfm. So, replacing the one 250mm fan with four 120mm fans should increase the airflow considerably, as that's like going from only 1 case fan to 4.
When buying a fan, the goal should be the most air movement with the fewest decibels. If you keep your computers in your crawl space and don't have to listen to the fans whirring, then focus only on cfm.
LEDs look cool, but remember, it needs to BE cool, not just look cool. Your GPU and CPU want more air, not a dazzling light show. It's like a good paint job on an NHRA dragster. Oohs and aahs on the paint job won't pay the bills. It's the time on the clock, or lack thereof, that counts.
Consider the layout of your case and the placement of the fans. Pulling cool air into the case at the bottom and pushing out hot air near the top may give better results than just sucking in air with all fans and hoping that the PSU fan will handle pushing out all the hot air.
-
Re: Donate never ending jobs
Boy is my face red....Bad version of MSI afterburner, newer version understands 7970 better.
I have been reading up on ASIC quality and what it means; will post more later. But in that reading I came across the MSI Afterburner 2.2.0 beta thread and their work on getting Afterburner to manage the 7970.
More speed here I come.
-
Re: Donate never ending jobs
What follows is pieced together from Unwinder's posts in the MSI Afterburner 2.2.0 Beta thread:
A few interesting observations on Tahiti GPUs. First, it looks like, similar to Fermi, Tahiti GPUs are binned, and there are 4 types of Tahiti GPUs with different ASIC leakage levels. So depending on it there are 4 possible default 3D voltages: 1.17500V, 1.11250V, 1.05000V and 1.02500V.
Currently all voltage control tools use hardcoded default voltage (1.175V) for 7970, so pressing "Reset"/"Defaults" buttons in any voltage control tool will put any GPU to 1.175V regardless of leakage level.
As far as I can understand BIOS code, this value is ASIC leakage level fused during ASIC binning and manufacturing. An ideal GPU with no leakage current is 00003ff0, lower values indicate higher leakage and higher default voltage.
AMD uses the following table to select VID depending on ASIC quality:
up to 2F90 (up to 75% quality) - 1.1750V
up to 34D0 (up to 80% quality) - 1.1125V
up to 3820 (up to 85% quality) - 1.0500V
up to 3A90 (up to 90% quality) - 1.0250V
You shouldn't create a hysteria around this value; "quality" is way too artificial a name to say that a card is bad or good. Take it as an indicator of default voltage, nothing more, nothing less. It doesn't matter what "quality" is if you're running at stock clocks. Lower voltage (high leakage) chips are better overclockers because of more room for voltage tweaking, but they eat more power compared to higher voltage chips when running at the same voltage.
--------------------------------------------------------------------------------
Alexey Nicolaychuk aka Unwinder, RivaTuner creator
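For anyone who wants to play with the table quoted above, here is a minimal sketch of it as a lookup. It only encodes the four rows from the quote; the names `LEAKAGE_UPPER_BOUNDS`, `DEFAULT_VID_VOLTS` and `default_vid` are made up for illustration, nothing here reads a real card, and values above the last row are reported as unknown because the quote does not cover them.

```python
from bisect import bisect_left

# Upper bounds of the fused leakage value and the matching default VID,
# copied from the table quoted above. Illustrative only: this does not
# talk to any hardware or to Afterburner.
LEAKAGE_UPPER_BOUNDS = [0x2F90, 0x34D0, 0x3820, 0x3A90]
DEFAULT_VID_VOLTS = [1.1750, 1.1125, 1.0500, 1.0250]


def default_vid(leakage_value):
    """Return the default 3D voltage for a fused leakage value, or None
    when the value is above the last row of the quoted table."""
    # bisect_left finds the first bound that is >= leakage_value,
    # which matches the "up to X" wording of the table.
    index = bisect_left(LEAKAGE_UPPER_BOUNDS, leakage_value)
    if index == len(LEAKAGE_UPPER_BOUNDS):
        return None  # the table does not cover values above 0x3A90
    return DEFAULT_VID_VOLTS[index]


if __name__ == "__main__":
    for value in (0x2F90, 0x3000, 0x3A90):
        print(f"{value:#06x} -> {default_vid(value):.4f} V")
```

A sorted list plus `bisect_left` keeps the "up to" semantics in one place, so adding a row later (if one turns up) is a one-line change.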
-
Re: Donate never ending jobs
That is very interesting, thanks. Definitely good info for the max O/C! :D
-
Re: Donate never ending jobs
Thank you EmSti. Good stuff there.
-
Re: Donate never ending jobs
Quote:
Originally Posted by
EmSti
Boy is my face red....Bad version of MSI afterburner, newer version understands 7970 better.
No need for AB Beta ---> http://www.setiusa.us/showthread.php...ll=1#post39709
-
Re: Donate never ending jobs
Quote:
Originally Posted by
Fire$torm
Right, beta discussion thread, but I went and downloaded the latest release, 2.2.1, which was released a week or so ago. The beta thread made me understand the error of my ways.
It's working well so far. The cards with low ASIC quality (those with the high default voltage of 1175) I left at stock voltage. Those with high ASIC quality I upped the voltage on a little, but not to 1175, tweaking a bit here and there for heat and stability. I told Afterburner not to synchronize the settings. Now all 5 cards are running at 1125 core clock. It's been 4 hours so far and only one computation error, when I was messing around on my office PC (i.e. my fault, not the setup). No crashes or BOINC restarts (knock on wood). But the test always seems to be getting through the night and the first 24 hrs.
For those that are counting, at 1125 and with the app_info setup being used (see earlier in the thread), each 7970 is producing 2 Donate WUs every 18:05. That's about 30 secs faster than I was getting at 1090 (yesterday's stable speed). No idea what it does at the reference speed of 925. :cool:
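For those that like to see the counting spelled out, here is a rough sketch of the arithmetic. It assumes a card runs flat out for 24 hours with no hung or failed WUs, which is optimistic for this project; the 18:35 figure is just 18:05 plus the 30 secs mentioned above, and the function names are invented for the example.

```python
def cycle_seconds(mmss):
    """Convert a 'MM:SS' cycle time such as '18:05' into seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def wus_per_day(wus_per_cycle, cycle):
    """WUs one card finishes in 24 hours at a steady cycle time."""
    return wus_per_cycle * 86400 / cycle_seconds(cycle)


if __name__ == "__main__":
    at_1125 = wus_per_day(2, "18:05")  # 2 WUs per cycle at 1125 MHz
    at_1090 = wus_per_day(2, "18:35")  # same card, 30 secs slower per cycle
    print(f"1125 MHz: {at_1125:.1f} WUs/day per card")
    print(f"1090 MHz: {at_1090:.1f} WUs/day per card")
    print(f"gain: {100 * (at_1125 / at_1090 - 1):.1f}%")
```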
-
Re: Donate never ending jobs
Quote:
Originally Posted by
EmSti
Right, beta discussion thread, but I went and downloaded the latest release, 2.2.1, which was released a week or so ago. The beta thread made me understand the error of my ways.
Roger that. Question: Is "Unofficial Overclocking" still permitted in that version? That is, can you OC past CCC limits?
-
Re: Donate never ending jobs
Has a solution for the never ending tasks ever been found? For me with 5870s and 7970, I get them frequently. I lose several hours every day on each of them, and that is with frequent checking. I tried using BOINCtasks to suspend a hung task, which does work. But for some reason, I can't get replacement tasks to download. So the hung task is suspended, and the GPU sits idle.
-
Re: Donate never ending jobs
Quote:
Originally Posted by
zombie67
Has a solution for the never ending tasks ever been found? For me with 5870s and 7970, I get them frequently. I lose several hours every day on each of them, and that is with frequent checking. I tried using BOINCtasks to suspend a hung task, which does work. But for some reason, I can't get replacement tasks to download. So the hung task is suspended, and the GPU sits idle.
As soon as I went to Zydor's app_info.xml (see post #3 of this thread) the problem went away on 2 of the machines. (I also went to CCC 12.4 at the same time, manually removing the amdocl and opencl DLLs from \windows\system32 if they were not from 4/5 before reinstalling 12.4.) After that it came down to one card that would abort the GPU drivers when it hit 83C. The drivers restart of course, but test615.exe would keep right on going forever. Just happened to see it occur one day. If you suspended the hung tasks and resumed them, they would start from the beginning and finish fine. Fixed the temp problem and no more nevers.
Hey...looking at that post #3, I have knocked off 3 full minutes of the WU cycle. The dual 7970s are now producing 4 Donate WUs every 18:05 and back then it was 21 minutes. :o)
-
Re: Donate never ending jobs
Yeah, I have used the app_info.xml, and 12.4. Oh well, I will struggle on until I hit my 100M MM, and then just bail.
And yeah, I know restarting them will let them finish. But that doesn't make up for the lost time. I am curious why BOINC won't DL more tasks while the task is suspended. Or maybe the problem is on the server side. Either way, suspending a task doesn't solve the problem of wasted time.
-
Re: Donate never ending jobs
Quote:
Originally Posted by
zombie67
Has a solution for the never ending tasks ever been found? For me with 5870s and 7970, I get them frequently. I lose several hours every day on each of them, and that is with frequent checking. I tried using BOINCtasks to suspend a hung task, which does work. But for some reason, I can't get replacement tasks to download. So the hung task is suspended, and the GPU sits idle.
I have noticed that suspending and then resuming the task after a few minutes will get the WU to complete. Since you can have up to three actions per rule in BOINC Tasks, you could set the 2nd action to start a Windows batch file/Linux script that waits several minutes and then calls boinccmd to resume the task.
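A rough sketch of what such a script could look like, written in Python rather than as a batch file. It relies on boinccmd's `--task URL task_name {suspend|resume|abort}` form; the project URL and task name below are placeholders, the 300 second wait is an arbitrary pick from the "few minutes" above, and how BOINC Tasks would hand the task name to the script is not covered here.

```python
import subprocess
import time

VALID_TASK_OPS = ("suspend", "resume", "abort")


def task_command(project_url, task_name, op):
    """Build the argument list for `boinccmd --task URL task_name op`."""
    if op not in VALID_TASK_OPS:
        raise ValueError(f"unsupported task operation: {op}")
    return ["boinccmd", "--task", project_url, task_name, op]


def bounce_task(project_url, task_name, wait_seconds=300,
                run=subprocess.run, sleep=time.sleep):
    """Suspend one task, wait a few minutes, then resume it.

    `run` and `sleep` are parameters only so the function can be tried
    out without a BOINC client installed.
    """
    run(task_command(project_url, task_name, "suspend"), check=True)
    sleep(wait_seconds)
    run(task_command(project_url, task_name, "resume"), check=True)


if __name__ == "__main__":
    # Dry run: print the two commands instead of executing them.
    # Both values are hypothetical placeholders.
    url, task = "http://example.invalid/donate/", "hypothetical_task_0"
    for op in ("suspend", "resume"):
        print(" ".join(task_command(url, task, op)))
```

Calling `bounce_task(url, task)` with a real URL and task name would run the two commands for real, assuming boinccmd is on the PATH and is allowed to talk to the local client.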
-
Re: Donate never ending jobs
Hmmm...is this related to the "test-something.exe" stopping problem? Puma gets that every now and then for no reason that I can see - doesn't matter if I have the GPUs O/Ced or not, doesn't seem to have any schedule, just random. What's even more strange is sometimes it looks like it stops Donate WUs from running...other times it doesn't seem to matter and it just starts new WUs! :confused: I don't know what to think. This is two 5870s running 12.3 drivers...
-
Re: Donate never ending jobs
Quote:
Originally Posted by
DrPop
Hmmm...is this related to the "test-something.exe" stopping problem? Puma gets that every now and then for no reason that I can see - doesn't matter if I have the GPUs O/Ced or not, doesn't seem to have any schedule, just random. What's even more strange is sometimes it looks like it stops Donate WUs from running...other times it doesn't seem to matter and it just starts new WUs! :confused: I don't know what to think. This is two 5870s running 12.3 drivers...
Yep, Test615.exe is the Donate app/wrapper.
-
Re: Donate never ending jobs
Getting closer to goal: last 2 days over 4 Mill credits. I started upping voltage and clocks yesterday. Errors went up for a while and blackdragon was locking up for a while. Lost a few hours during those lock-ups and when I shut it down to adjust fans. But I think they are all stable at the next level now. Might need some minor voltage tweaks yet. 7970 cards now at 1165 MHz and each is finishing 2 WUs every 17:35, so another 25 sec off.
-
Re: Donate never ending jobs
Yippy, king of the hill for a day at Donate. \:D/
It changes day to day, grabbing snapshot while it is there. It is a first for me. :D
Attachment 912
-
Congrats! Keep it up! SUSA wants to be 1st to 1 Billion.
-
Re: Donate never ending jobs
Quote:
Originally Posted by
EmSti
Yippy, king of the hill for a day at Donate. \:D/
It changes day to day, grabbing snapshot while it is there. It is a first for me. :D
Attachment 912
Waaaaaahoooooooo baby! YEAH! :)) :D That is what we like to see, go get 'em S.USA!!!**==
-
Re: Donate never ending jobs
:) Congratulations EmSti, very good job. Keep on keeping... Unfortunately Donate doesn't work with XP 64 or I would be a real threat to you all. :D
Bad Jokes Duke :-B
-
Re: Donate never ending jobs
Nice job getting to the top!
-
Re: Donate never ending jobs
Quote:
Originally Posted by
EmSti
Yippy, king of the hill for a day at Donate. \:D/
It changes day to day, grabbing snapshot while it is there. It is a first for me. :D
Attachment 912
Oorah! **==
-
Re: Donate never ending jobs
Nah, fame is so fleeting! Today he got his butt handed to him big time ... a whole 13k :))
Nyaah nah nah nah nah!
Okay so I got lucky when the project was down it didn't slow me down as much as the big boys :D
Tomorrow be sure and wave as you blow by EmSti :-h
-
Re: Donate never ending jobs
How is this done? I would have to check in on my tasks every 15 minutes, to restart the stalled tasks. And even then, that would be 15 minutes lost. Babysitting tasks to completion is a 24/7 job with this project. And if you consider all the idle time, it pays crap.
-
Re: Donate never ending jobs
Quote:
Originally Posted by
zombie67
How is this done? I would have to check in on my tasks every 15 minutes, to restart the stalled tasks. And even then, that would be 15 minutes lost. Babysitting tasks to completion is a 24/7 job with this project. And if you consider all the idle time, it pays crap.
One alternative would be to run a batch file/script to pause the project and then say 2~5 minutes later resume the project. Since a stalled task isn't dead, it will complete. This way no need to babysit the project.
I don't know how to compose a Linux script but a batch file I can do in a few minutes. Task Scheduler takes care of starting the batch file.
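For the Linux side (or anyone who prefers Python to a batch file), a minimal sketch of the same pause-and-resume idea at the project level. It uses boinccmd's `--project URL suspend` and `--project URL resume` operations; the URL is a placeholder, the 3 minute pause is an arbitrary value inside the 2~5 minute range above, and the scheduling itself (Task Scheduler, cron) is left out.

```python
import subprocess
import time


def project_command(project_url, op):
    """Build the argument list for `boinccmd --project URL op`."""
    return ["boinccmd", "--project", project_url, op]


def pause_then_resume(project_url, pause_minutes=3,
                      run=subprocess.run, sleep=time.sleep):
    """Suspend the whole project, wait, then resume it.

    The resume sits in a `finally` block so the project is not left
    suspended if the wait is interrupted.
    """
    run(project_command(project_url, "suspend"), check=True)
    try:
        sleep(pause_minutes * 60)
    finally:
        run(project_command(project_url, "resume"), check=True)


if __name__ == "__main__":
    # Dry run with a hypothetical URL: show what would be executed.
    url = "http://example.invalid/donate/"
    for op in ("suspend", "resume"):
        print(" ".join(project_command(url, op)))
```

Unlike a per-task script, this pauses every Donate task on the machine, including ones that were running fine, so how often it is scheduled is a trade-off.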
-
Re: Donate never ending jobs
Quote:
Originally Posted by
zombie67
How is this done? I would have to check in on my tasks every 15 minutes, to restart the stalled tasks. And even then, that would be 15 minutes lost. Babysitting tasks to completion is a 24/7 job with this project. And if you consider all the idle time, it pays crap.
I don't have the problems some do, but I'm just running the old stuff... 58xx and 69xx. On occasion I get a hung WU, but it isn't all that common, and when I do I usually lose several hours of a GPU. I don't babysit the computers.
What I found on MY systems is:
1. The app_info works excellently with single cards. Gives 5% or better increase in credits.
2. On a dual card system (6990) it caused bunches of 590 credit WU (I7 2600k). Even though my crunch time went down per WU I got far less credits. 26 590 credit WU over a 16 hour period. When running 1 WU per GPU I almost NEVER see 590 credit WU.
3. On a dual card + single card (5970 + 5870) system I got bunches of hung WU. A hung WU is RARE on that system when running 1 WU per GPU.
4. A Q6600 machine with 2 5870s barfed almost immediately with only 2 cores reserved. I didn't try freeing all 4.
On #2 and #3 I was reserving 1/2 of a V8 ... it might have worked with more CPU reserved or it may be a function of the "dual" GPUs. I don't know but it isn't worth worrying about. I'm heat limited at this point so increasing the GPU loading by 5% isn't beneficial! All but 2 cards are downclocked.
I'm getting 4M+ per day (normally) so it pays far better than anything else.
Z you might try shutting down CPU crunching on your 7970 machine and try a 24 hour experiment. Before I did the Collatz Pentathlon thingie I was running Donate to the tune of 4.3M per day (winter I could clock higher). Although I did see hung WU they were VERY rare. After the Pentathlon I went back to Donate and for a couple of days I had a lot of trouble. It finally dawned on me the difference between before and after was the amount of CPU I was using for crunching. Prior I was reserving 4 threads on the V8s and after I was crunching 6 CPU wu and doing Donate. I reserved 2 more threads (1/2 the I7s) and my problems went away. Anyway it might be worth a try on one of your systems!
-
Re: Donate never ending jobs
Quote:
Originally Posted by
Bryan
Nah, fame is so fleeting! Today he got his butt handed to him big time ... a whole 13k :))
Nyaah nah nah nah nah!
Okay so I got lucky when the project was down it didn't slow me down as much as the big boys :D
Tomorrow be sure and wave as you blow by EmSti :-h
I did expect my total to be down because I worked on 2 of them for hours. Redid air flow on 2 PCs and redid the TIM on one of the hot GPUs (long ugly story I will share after the nightmares stop). After I got BlackDragon sorted out, or so I thought, I started up Donate right in the middle of the problems they had last night and almost freaked. Until I noticed all of my machines were up and down, processing, not processing, dooooooooinnnnnnngggggg nothing, repeat. I had to wait hours to know I hadn't totally FUBARed things.
That being said, and with totals expected to be down, I was surprised when the daily update on BOINCstats showed that I was top for the day again. I had to search for a bit to find that you were looking at Free-DC, I think (they showed the 13k diff). Free-DC shows a larger amount for me yesterday than BOINCstats, so I guess it depends on which 24 hour period they look at. I was going to post a pic of both, but I guess I have hit my space limit and I can't seem to clear the previous uploads (I need help, PM me please).
I am running even faster now. I am looking forward to seeing my totals for the next 48 hours and whether the new setups are stable. 4 of the 5 7970s are now doing 2 Donate work units in about 17:05. I also shaved 30 secs off the 6970 times. The 7970s are now running at 1185, 1185, 1185, 1175, and 1170 MHz. 2 of the ones at 1185 are still running cool; they may be capable of 1200, but I don't want to push any harder for a while.
-
Re: Donate never ending jobs
Quote:
Originally Posted by
Bryan
I reserved 2 more threads (1/2 the I7s) and my problems went away. Anyway it might be worth a try on one of your systems!
I agree; I think reserving the thread(s) is key. I am becoming more and more convinced that a lot of the project's problems are timing problems between the GPU and CPU. Reserving a CPU seems to cut down on that. The main problem I get lately (not daily) is the BOINC client crashing when Donate and one of the CPU projects finish at the same time. Which is why I wrote the program to restart the client. Lots of computation errors yesterday, but that was me playing with voltage and speed. Much more settled today.
-
Re: Donate never ending jobs
Just FYI: Yes, you can suspend/resume a "hung" task. And it will complete with credits. But that is no better than aborting a task, and starting a new one. A hung task that is suspended/restarted restarts from zero time, and takes the normal amount of time to complete.
-
Re: Donate never ending jobs
Quote:
Originally Posted by
zombie67
Just FYI: Yes, you can suspend/resume a "hung" task. And it will complete with credits. But that is no better than aborting a task, and starting a new one. A hung task that is suspended/restarted restarts from zero time, and takes the normal amount of time to complete.
That's my experience with Donate also. There also seems to be a delay while it decides to restart the job. I have learned to time reboots around the % of time complete.
-
Re: Donate never ending jobs
Oh so close to the 5 mill a day total on Donate. According to Free-DC today I got 4,829,439. Only 3.42% to go. :-B
-
Re: Donate never ending jobs
Quote:
Originally Posted by
EmSti
Oh so close to the 5 mill a day total on Donate. According to Free-DC today I got 4,829,439. Only 3.42% to go. :-B
Squeezing every little stone out of those puppies. :p
-
Re: Donate never ending jobs
Quote:
Originally Posted by
EmSti
Oh so close to the 5 mill a day total on Donate. According to Free-DC today I got 4,829,439. Only 3.42% to go. :-B
No problemo, you'll hit it tomorrow. Their server had trouble for a couple of hours this afternoon ... I lost 300k and you would have lost more than that! Assuming the project server stays healthy you should blow right through it tomorrow!
-
Re: Donate never ending jobs
Guys, to me what counts is not so much what we as volunteers are doing but what the projects are NOT doing. While we as users have considerable knowledge, either in experience or indeed in programming, you just need to look at the Message Boards of most sites: the Admins are saying nothing, or at best dismissing any problems as if to say that there is NO problem, you must be stupid, for I do not have that problem here... Not even an acknowledgement that there is a problem, or even a thank you for using our project. This hurts me more than anything else in DC Computing. Are we so dumb that we accept the apparent fact that they do not care? If I am wrong, then please tell me how it is communicated to us, for I do not see this in most projects. Some do not even bother to talk to us, because there is no Message Board that is active, and they do not care whether the results of our daily score are promulgated worldwide or not.
I think it is time we voice our opinion: if you find that a project does not conform to a standard that it should be following, then stop using that project. There are plenty of others that do the right thing. Please do not come back with the argument that we should be aware that, as a new project in Alpha phase, they need time to work things out. All I can say to this is that we have a lot of knowledge amongst us as users, and they should listen to us and use the Message Board as their method of communicating with us. If the problem is so serious that it affects all users, then they should acknowledge its existence and say so openly in the News section, outlining what action they have taken to correct the problem. I am sick and tired of wasting my money on running my computers on a project that does not acknowledge my presence or listen to my concerns about running their program. If they are not interested in our input, then they should realize their demise....
I am sorry to say that although I have followed this project from the beginning, I am now under the impression that they are using us, for they are not interacting with any of the users in their Message Boards. To confirm this, just look at how many problems are posed to them and how many are in fact replied to.
There have been instances of others reflecting this stance, yet no reply from the project.. It is your choice what you intend to do...
I think it is time that we told all projects, not just this one, that we need recognition and acknowledgement of our problems, and if we do not get that simple thank you and acknowledgement, then YOU should choose another project that cares... It is up to you.. Do what your own conscience desires...
-
Re: Donate never ending jobs
Quote:
Originally Posted by
zombie67
Just FYI: Yes, you can suspend/resume a "hung" task. And it will complete with credits. But that is no better than aborting a task, and starting a new one. A hung task that is suspended/restarted restarts from zero time, and takes the normal amount of time to complete.
Yes, that is correct. All in all Donate is a decent project that pays very well. And the reality is that Donate is in the same league as quite a few other projects. They can be very quirky and a pain in the ass. But, if you want credit or MM in that project then you have to accept what it throws at you.
So better idea, use a BOINCTasks rule set to abort the wu if runtime exceeds max_runtime + 10~15%. Still no babysitting involved, and credit output will exceed any other project. On a good day with Donate I have hit 1 Mil+ from that project alone using three 5830s ---> http://stats.free-dc.org/stats.php?p...oj=don&name=89
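The rule boils down to one comparison, sketched below. This is only the threshold logic, not how BOINCTasks evaluates its rules; the function names are invented, and the 1085 second "normal" runtime in the demo is just the 18:05 cycle reported earlier in the thread, used as a stand-in for max_runtime.

```python
def runtime_limit(max_runtime_seconds, margin=0.15):
    """Longest runtime tolerated before a task is treated as hung."""
    if margin < 0:
        raise ValueError("margin must not be negative")
    return max_runtime_seconds * (1 + margin)


def should_abort(elapsed_seconds, max_runtime_seconds, margin=0.15):
    """True once a task has run longer than max_runtime plus the margin."""
    return elapsed_seconds > runtime_limit(max_runtime_seconds, margin)


if __name__ == "__main__":
    normal = 1085  # assumed normal runtime in seconds (18:05)
    for elapsed in (1100, 1200, 1300):
        verdict = "abort" if should_abort(elapsed, normal) else "keep"
        print(f"{elapsed} s elapsed -> {verdict}")
```

A 10% margin catches hung tasks sooner but is more likely to kill a WU that was merely slow; 15% is the safer end of the range suggested above.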
-
Re: Donate never ending jobs
Quote:
Originally Posted by
Fire$torm
So better idea, use a BOINCTasks rule set to abort the wu if runtime exceeds max_runtime
How do you set it up for ABORT? I only show SUSPEND task.
-
Re: Donate never ending jobs
Woot! \:D/
Reached my goal. According to Free-DC, I did 5,109,429 credits today, finally reaching my goal of tuning the 6 GPUs to the 5 Mill level. Yay for me. I think 2 or 3 of the GPUs can go higher, but I am going to give it a rest for a while and work harder on house projects.
Other impressive numbers: Looks like Cruncher Pete has stomped on the Donate gas pedal and reached 5,266,177 (king of the hill again). I think I counted 19 GPUs in the mix (of course it's hard to tell if all of those were actually doing Donate or just in the machine doing other things); that is just plain impressive. ^:)^ Are the Tahiti cards playing nice now?
Bryan is also flying high at 4,563,877. :-bd
I believe those are the highest numbers I have seen on Donate from all three since I started my little adventure. :-h
-
Re: Donate never ending jobs
Quote:
Originally Posted by
Bryan
How do you set it up for ABORT? I only show SUSPEND task.
Use the run program action in the rule to start a batch file or script that aborts the Donate cache using boinccmd's --project <URL> reset command.
Edit: This way only one command needs to be issued.
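A minimal sketch of the script that the run program action could start, in Python instead of a batch file. It only wraps the `--project <URL> reset` command named above; the URL is a placeholder and the function names are made up for the example.

```python
import subprocess


def reset_project_command(project_url):
    """Build the argument list for `boinccmd --project URL reset`."""
    return ["boinccmd", "--project", project_url, "reset"]


def reset_project(project_url, run=subprocess.run):
    """Reset the project; `run` is a parameter so this can be tried
    out without a BOINC client installed."""
    return run(reset_project_command(project_url), check=True)


if __name__ == "__main__":
    # Dry run with a hypothetical URL: print the command only.
    print(" ".join(reset_project_command("http://example.invalid/donate/")))
```

Keep in mind that a reset drops every task from that project on the machine, not just the hung one, so any WU that was close to finishing is lost along with it.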