View Full Version : Donate never ending jobs



EmSti
05-19-12, 01:04 PM
I've been having a problem with Donate and never-ending WUs that I eventually have to abort (they run for hours). I captured a picture of it and wanted to post it to the donate@home forum. I didn't see a way to upload it there, but it looks like I can link to a picture. I thought I would try loading it here and linking to it from there; sorry if that's frowned on. Note the GPU usage and the progress on the WU being pointed to.

[Attached image 884]

This is the Donate Forum thread if you are interested:

http://donateathome.org/forum_thread.php?id=105&sort=5

Maxwell
05-19-12, 06:13 PM
I haven't run into that problem in quite a while. I used to, but since I've run 12.3, nothing has hung...

EmSti
05-19-12, 06:52 PM
I have tried Zydor's suggested app_info.xml and moved the BOINC setting for CPU % to 88% (removing 1 of 8 CPUs from BOINC), and so far it seems to be helping. It will need to run for a while to know for sure. The effect of these changes is that only 4 of the 8 CPUs on each machine are used by CPU projects, cutting each machine's CPU output in half. On the plus side, I am producing more Donate WUs per hour (two 7970s went from 4 WUs in 21 minutes to 4 in 19 minutes). Bad for CPU challenges, good for RAC and Donate. If it stays stable without hangs, I will try moving it back to 100%.

Zydor's app_info.xml, which is placed in the project's folder, contains the following. With these changes, each GPU will do 2 WUs at a time at a higher GPU usage, with an overall time savings.
<app_info>
    <app>
        <name>test</name>
    </app>
    <file_info>
        <name>test615.exe</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>test</app_name>
        <version_num>615</version_num>
        <avg_ncpus>0.9</avg_ncpus>
        <max_ncpus>1</max_ncpus>
        <plan_class>ati14ati</plan_class>
        <coproc>
            <type>ATI</type>
            <count>0.5</count>
        </coproc>
        <file_ref>
            <file_name>test615.exe</file_name>
            <main_program/>
        </file_ref>
    </app_version>
</app_info>
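For anyone trying to read what those numbers imply, here is a minimal Python sketch. It assumes that a coproc count of 0.5 means two tasks share one GPU (which matches the "2 WUs at a time" described above) and that avg_ncpus is the CPU share budgeted per task. The function name `summarize` and the two-GPU machine are illustrative assumptions, and the real BOINC scheduler may round the CPU budget differently.

```python
import xml.etree.ElementTree as ET

# The app_version values posted above, trimmed to the fields this sketch reads.
APP_INFO = """
<app_info>
  <app_version>
    <app_name>test</app_name>
    <avg_ncpus>0.9</avg_ncpus>
    <coproc>
      <type>ATI</type>
      <count>0.5</count>
    </coproc>
  </app_version>
</app_info>
"""


def summarize(xml_text, gpus_in_machine):
    """Return the concurrency and CPU budget implied by the first app_version."""
    version = ET.fromstring(xml_text.strip()).find("app_version")
    gpu_share = float(version.findtext("coproc/count"))  # GPU fraction per task
    cpu_share = float(version.findtext("avg_ncpus"))     # CPU fraction per task
    tasks_per_gpu = int(1.0 / gpu_share + 1e-9)          # 0.5 -> 2 tasks per GPU
    gpu_tasks = tasks_per_gpu * gpus_in_machine
    return {
        "tasks_per_gpu": tasks_per_gpu,
        "gpu_tasks": gpu_tasks,
        "cpus_budgeted": gpu_tasks * cpu_share,
    }


if __name__ == "__main__":
    # Assumption: a machine with two 7970s, as described in the post.
    print(summarize(APP_INFO, gpus_in_machine=2))
```

With two GPUs this budgets roughly 3.6 CPUs for GPU tasks, which is in line with the observation that only about half of the 8 CPUs are left for CPU projects.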

spingadus
05-19-12, 07:59 PM
Been having a problem with Donate and never ending wus, that I eventually have to abort (they run for hours). I captured a picture of it and wanted to post a picture to the donate@home forum. I didn't see a way to upload it, but looks like I can link to a picture. I thought I would try loading it here and link to it from there, sorry if that frowned on. Note the GPU usage and and progress on the Wu being pointed too.



This is the Donate Forum thread if you are interested:

http://donateathome.org/forum_thread.php?id=105&sort=5

Yup, I started that forum thread you linked back in March. I haven't gone back because the admins haven't fixed the issue and I, like others, don't want to babysit my machines. It would be cool if it works for you, EmSti. For me, I'm just going to wait until the fix is in.

DrPop
05-19-12, 08:27 PM
Hmmm...don't the 7970s actually get more credit per day on DiRT? Just wondering, because I don't own one...(yet???) :D Fry's has them on sale for their big anniversary sale for $447, but I know Kim would kill me if I pulled the trigger right now. :rolleyes::o:p

EmSti
05-19-12, 08:47 PM
Hmmm...don't the 7970s actually get more credit per day on DiRT? Just wondering, because I don't own one...(yet???) :D Fry's has them on sale for their big anniversary sale for $447, but I know Kim would kill me if I pulled the trigger right now. :rolleyes::o:p

Maybe, don't know. Not a fan of DiRT, don't intend to crunch it unless there is a challenge. Even then I may put all GPUs in as Hydra and not EmSti.

DrPop
05-19-12, 09:03 PM
Oh, OK. No worries - I was just told the 7970s could get over 1Mil per day on DiRT, so I was curious how that compared to what you were seeing per day on Donate. The best ATI/AMD cards I have are 5870s...Donate is apparently much better for those. :)

Fire$torm
05-20-12, 09:04 AM
Maybe, don't know. Not a fan of DiRT, don't intend to crunch it unless there is a challenge. Even then I may put all GPUs in as Hydra and not EmSti.

I'm just being curious, what is wrong with DistrRTgen?

Edit: BTW, the only recurring issue I have with Donate is that it randomly kills my Windows desktop by crashing explorer.exe. Basically, the Donate app is not very stable.

EmSti
05-20-12, 11:36 AM
I'm just being curious, what is wrong with DistrRTgen?

Edit: BTW, the only reoccurring issue I have with Donate is that it will randomly kills my Windows desktop by crashing explorer.exe. Basically, the Donate app is not very stable.

Basically, I am not a fan of using a lot of free horsepower to help people crack passwords. The language on their site is nice about helping security experts, but the freely available tables are used to crack passwords freely. RainbowCrack and a few other programs exist to do just that. In my opinion, helping make what were secure password setups less secure isn't security. Granted, this could be done with massive horsepower via other means; I just don't care to put my GPUs to the effort. My personal preference is to crunch projects that I feel provide some benefit (of course the key is "I feel", so it depends on what speaks to me).

Donate, for instance, interests me in 2 ways. I am intrigued by the bitcoin concept and, more importantly, that the funds are used to fund medical research (gpugrid). For me RAC is secondary.

BTW, on the topic of the thread, the app_info.xml change has stabilized Donate for me; I ran overnight without getting a never-ending job. However, the "509 problem" is still there. That is where the WUs run extra long and produce an empty result, so you only get 590 credits instead of 5,900: 10% of the credit for 3 to 4 times the computing time. I did notice that this always occurs in pairs. Both WUs that are sharing the same GPU will go out to lunch together. I think they have a state timing problem in the app. Freeing up more CPU didn't stop it.
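To put a number on how much an empty result costs, here is a small Python sketch using only the figures in the post above (590 credits instead of 5,900, for 3 to 4 times the normal run time). The function name `relative_credit_rate` is made up for illustration.

```python
NORMAL_CREDIT = 5900   # credit for a normal WU, per the post
EMPTY_CREDIT = 590     # credit for an empty result, per the post


def relative_credit_rate(credit, normal_credit, time_multiplier):
    """Credit earned per unit of GPU time, relative to a normal WU (1.0 = normal)."""
    return (credit / normal_credit) / time_multiplier


if __name__ == "__main__":
    for multiplier in (3, 4):
        rate = relative_credit_rate(EMPTY_CREDIT, NORMAL_CREDIT, multiplier)
        print(f"{multiplier}x run time: {rate:.1%} of the normal credit rate")
```

So while a pair of WUs is stuck in that state, the GPU is earning only a few percent of its normal rate.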

Fire$torm
05-20-12, 03:32 PM
OK EmSti. Understood.

Cruncher Pete
05-20-12, 07:55 PM
Basically I am not a fan of using a lot of free horse power to help people crack passwords. The language on their site is nice about help security experts, but the freely available tables are used to crack passwords freely. RainbowCrack and a few other programs exist to do just that. In my opinion, helping make what were secure password setups less secure isn't security. Granted this could be done with massive horsepower via other means, I just don't care to put my GPUs to the effort. My personnel preference is to crunch projects that I feel provide some benefit (of course the key is I feel, so it depends on what speaks to me).

Donate for instance interests me in 2 ways. I am intrigued by the bitcoin concept and more importantly that the funds are use to fund medical research (gpugrid). For me RAC is secondary.

BTW, On the topic of the thread, the app_info.xml change has stabilized Donate for me, I ran overnight without getting a neverending job. However, the "509 problem" is there. The 509 is the wus run extra long and produce empty result so you only get 590 credit instead of 5,900. 10% for 3 to 4 time the computing time. I did notice that this always occurs in pairs. Both wus that are sharing the same GPU will go out to lunch together. I think they have a state timeing problem in the app. Freeing up more CPU didn't stop it.

My sentiments exactly, and therefore I second that motion. Having said that, I also have to admit that I am a credit whore, and as such I tend to crunch the projects that give me the best outcome first, and then crunch the others as a secondary target to meet a specific need.

EmSti
05-22-12, 01:01 AM
Frinken GPUs. Too much work lately. I think I found a combination of cards in the PCs that makes Donate stable in 2 of the 3 machines (well, as stable as it seems you can get Donate). Turns out the one problem is the MSI R7970 Lightning card. It has been impossible to get it to work alongside any of the other cards without cutting the output of one of the other 7970 cards by half. It runs fine by itself. Maybe a BIOS level difference issue (its BIOS is different from the rest). Catalyst 12.3 or 12.4 didn't matter. It is currently in Hydra6 with a disabled 6970. It will not crunch a Donate WU all the way through with the 6970 card enabled (non-CrossFire, of course). I didn't have any issues when a Gigabyte 7970 was in there. <sigh>

Also, I have not confirmed it, but I highly suspect some of the issues are related to MSI Afterburner or to setting EnableUlps and EnableUlps_NA to 0, which allows MSI Afterburner to monitor the GPU temp on the second GPU. When I removed Afterburner and stomped on Catalyst (remove, boot, driver sweeper, boot, you know the drill), I finally got the GPUs in the 2 machines to run at their full potential. The reason I suspect those 2 items is that I finally got FireDragon to work full out, then remembered I hadn't rebooted since I installed Afterburner and made the registry changes. Did the reboot and the 2nd GPU started to dog. Did the whole strip and replace bit without those 2 items and it's been good since, even with reboots. I have no desire at the moment to figure out if it's one or both.

I swear I feel like I have reinstalled Catalyst 100 times today, trying different combos of things. Wasted an entire day working on this. :mad:

DrPop
05-22-12, 01:39 AM
Dang man, sorry to hear this! What a headache! I am not suggesting you do this yet (because I know next to nothing about the 7970), but with the 5870 you could flash the BIOS from another brand. I think it is TechPowerUp that has the BIOS collection from all the gamers, along with a GUI BIOS flashing and editing utility for the Radeons. Anyway, if you are sure it's a BIOS issue, it may *possibly* be worth looking into. I'd probably read up on what the gamers are doing first, though. A lot of times they find the brand that gives the best O/C BIOS and then flash other cards to that one. I am sure there will be some good info out there...

Wish this stuff would just work like CPU crunching does (for the most part). The sad part is, there is no way for a team to remain even remotely competitive unless it chases high credit GPU projects as well. What to do? Because I feel you on this one... :(

Sent from my MB860 using Tapatalk 2

EmSti
05-22-12, 10:49 AM
Dang man, sorry to hear this! What a headache! I am not suggesting you do this yet (because I know next to nothing about 7970) but with the 5870 you could flash the bios from another brand. It think it is TechPowerup that has the BIOS collection from all the gamers, along with a GUI BIOS flashing and editing utility for the Radeons. Anyway, if you are sure it's a BIOS issue, may *possibly* be worth looking into. I'd probably read up on what the gamers are doing first, though. A lot of times they find the brand that gives the best O/C BIOS and then flash other cards to that one. I am sure there will be some good info out there...

Wish this stuff would just work like CPU crunching does (for the most part). The sad part is, there is no way for a team to remain even remotely competitive unless it chases high credit GPU projects as well. What to do? Because I feel you on this one... :(

Sent from my MB860 using Tapatalk 2

You are correct it is TechPowerup and I checked yesterday for MSI Bios. Since the lightning isn't a reference card, I am hesitant to use one that hasn't been built for the card. No word anywhere that I can find on someone upgrading the lightning.

EmSti
05-22-12, 05:01 PM
Yeah, making progress. BlackDragon and FireDragon are still running strong with dual 7970s. Hydra6 has been interesting. I got the MSI R7970 Lightning to run overnight by downclocking from the factory settings. The first problem is heat. Hydra6 can't keep it cool enough. The driver aborts at 83c, but it doesn't stop the Donate job; the job pretends it's still running. I happened to be looking when the message briefly showed on the screen. So I downclocked to 7970 reference specs (925/1375) and it runs at 76c. Not the best, but a start, and it has proven stable. The Gigabyte card handled it better (which is why it was in Hydra6 to begin with).

Next problem: the 6970, when enabled, wasn't doing jack and showed as "Disabled Adapter" in Catalyst 12.4. GPU-Z showed it running at only 250 MHz. Did some searching and found someone reporting this: http://devgurus.amd.com/thread/159062. Same issue, different cards, and unanswered. But it got me thinking: what if it wanted a monitor on the card (that old problem)? I connected it to a TV's VGA input and, sure enough, Catalyst enabled management of it and GPU-Z saw it go to 940 MHz. Woot! I have had a 7970 and a 6970 in this box for a while without a second monitor, so color me surprised by this one. Guess I will need to find old posts on building a fake VGA dongle.

Of course now heat problem is even worse. I am using TThrottle to control Boinc on the GPUs to keep the temp down until I come up with another plan. But at least for the moment I have all 6 GPUs running. Time will tell if they stay stable.

Now, solving the air flow and heat....~X( Take a look at the Hydra6 pictures, what else can a man do?

EmSti
05-22-12, 07:42 PM
The saga continues...

I now suspect TIM on the R7970 lightning. Bad Tim bad. I have a floor fan pointed into Hydra6 and the card is running 100% at 82c (its own fans going full). Now I need to get up enough nerve to take the card apart and investigate or send it in for repair.

This is like having my own blog. 8-x

DrPop
05-22-12, 07:45 PM
Nice! Thanks for the update. Interesting that the old "need a dummy plug" thing reared its head again. Wonder if I still have one somewhere? There is a real simple way to make them out of the correct resistors (3) and a VGA plug. If I remember right YoDude on here had a bunch of them he made up awhile back.

Yeah, heat is becoming more and more the limiting factor on credits per day. I don't know if there is any other option than going full size tower case or open air on a bench with a house fan pointed at it. Even Cheetah and Puma are running hot and they are big cases with lots of airflow - easily on par with the HAF 932 I had, and the Thor V2 case is definitely larger and with more room than that.
My top card is always way hotter than the bottom one. Have you tried putting the 6970 in the top slot (and backing off the clocks on it if necessary), and then put the better 7970 on the bottom? This would allow you to at least get full blast out of the more powerful 7970 while only having to sacrifice a little from the slower card - if at all.
Just some ideas, but I think we're all in about the same boat... :)

Sent from my MB860 using Tapatalk 2

EmSti
05-22-12, 08:05 PM
Have you tried putting the 6970 in the top slot (and backing off the clocks on it if necessary), and then put the better 7970 on the bottom? This would allow you to at least get full blast out of the more powerful 7970 while only having to sacrifice a little from the slower card - if at all.
:)

Sent from my MB860 using Tapatalk 2

Good call, I will try that before ripping it apart. Now that you mention it, the last time I tried this card in Hydra6 it was better at the bottom. I don't recall it ever running this hot. I wonder if moving it around has loosened things. After dinner I will pull it inspect and swap spots. Silly that I didn't think to try the swap earlier.

DrPop
05-22-12, 08:29 PM
Good deal, please keep us updated! Also, taking the cooler off the GPU is not too hard - I did that with an older HD5830 several months ago, and do you have any Arctic Silver or similar stuff to use? If you are good at applying the very thin layer, and surfaces are tight - cooler to chip, then you will easily see a drop of 5°C over the stock TIM the companies use...

Sent from my MB860 using Tapatalk 2

EmSti
05-22-12, 08:58 PM
I use PK-1 these days. It seems to work better than Arctic Silver and doesn't seem to need break-in time. That is better for machines like these, which are always on, since they don't get the cooling cycles needed for break-in. The only worry I have about redoing the card is the warranty. The card isn't even 2 months old yet (purchased 4/3). OK, off to move the cards around, again.

Fire$torm
05-23-12, 01:14 AM
.........Guess I will need to find old posts for building a fake VGA dongle.

Errr emmmm FAQ anyone? ---> http://www.setiusa.us/faq.php?faq=gpu#faq_dummyplug

Note the second solution titled "The 5 Second Dummy Plug"

EmSti
05-23-12, 01:29 AM
Errr emmmm FAQ anyone? ---> http://www.setiusa.us/faq.php?faq=gpu#faq_dummyplug

Note the second solution titled "The 5 Second Dummy Plug"

Thanks, I haven't had a chance to look for that yet.
It's only 5 sec if you don't count the time to get to the electronics store or shop online, and for me that ends up being expensive and time consuming. I have poor impulse control. @-)

DrPop
05-23-12, 01:29 AM
@EmSti - I will have to look that stuff up, then! Think I have half of an Arctic Silver tube left, though, so may be a little bit before I need it.

@F$ - :)) Nice. ;)

Sent from my MB860 using Tapatalk 2

EmSti
05-23-12, 01:44 AM
\:D/ Now we do the dance of joy <where is a damn dancing banana when you need one>

I took DrPop's suggestion and went further. I thought what was really needed was a bottom chassis fan. No place for one, unless I remove the power supply and use its intake holes for a chassis fan. Then I remembered another thread where I believe Fire$torm was given credit for moving the power supply to the drive bays. So, I ripped Hydra6 apart. Used bungee cords to fasten the PSU in the bay and rerouted all the wires. Put the DVD player on the parts pile. Added a fan to the bottom of the case. Swapped the GPUs. The cool thing about the PSU in the upper drive bays is that you can use it as an exit fan also. I used cardboard around the front opening to make sure it was pulling air from inside the case and sending it out. The power cord sticks out the front, but who cares, it's already got a growth out the back (see previous Hydra6 pics).

The results... the 7970 is running at 72c and the 6970 at 77c. I did tighten the cooler bolts on the 7970 and to my surprise they were a little loose, so that may have helped also. I am happy to add that all cards are running full out and O.C. I couldn't be happier. Correction: if all are still running in the morning and through tomorrow, I will buy myself a beer. <where's a beer drinking icon when you need one>

I will update pics and Hydra6 specs, some other time.

EmSti
05-23-12, 09:03 AM
:-bd \:D/\m/

Fire$torm
05-23-12, 01:01 PM
:-bd \:D/\m/

Great job EmSti! WTG. Team SETI.USA FTW **==

DrPop
05-23-12, 01:06 PM
That is an interesting project. Sounds like it freed up a lot of airflow in the mid-tower case. What I wonder now is if the bottom fan makes a difference in the full tower cases? Both Cheetah and Puma's cases have a slot for the fan down there, I think 120mm - just in front of where the PSU goes, but I have never put one in, wonder if it would make a difference or if it would interfere with the air flow from big side fan?...hmmm...:confused::-??

EmSti
05-23-12, 01:12 PM
That is an interesting project. Sounds like it freed up a lot of airflow in the mid-tower case. What I wonder now is if the bottom fan makes a difference in the full tower cases? Both Cheetah and Puma's cases have a slot for the fan down there, I think 120mm - just in front of where the PSU goes, but I have never put one in, wonder if it would make a difference or if it would interfere with the air flow from big side fan?...hmmm...:confused::-??

I put the biggest bottom fan in I can. All three have them now that I modified H6.

Mike029
05-23-12, 02:22 PM
Thanks, ...for me that ends up being expensive and time consuming. I have poor impulse controls. @-)


Man I know the feeling. LOL

Fire$torm
05-23-12, 02:29 PM
That is an interesting project. Sounds like it freed up a lot of airflow in the mid-tower case. What I wonder now is if the bottom fan makes a difference in the full tower cases? Both Cheetah and Puma's cases have a slot for the fan down there, I think 120mm - just in front of where the PSU goes, but I have never put one in, wonder if it would make a difference or if it would interfere with the air flow from big side fan?...hmmm...:confused::-??

Well I know many of the current full towers from Corsair, Cooler Master and other manufacturers have bottom fan mount points so it most likely helps.

DrPop
05-23-12, 03:24 PM
Hmmm...I will have to try it this weekend then. Both Cheetah and Puma have mounts for at least a 120mm, maybe even a 140mm there, so who knows? Worth a shot, I guess. Summer is coming...:cool:

EmSti
05-24-12, 01:07 AM
Next goal: to try to get as close to 5 mill a day as I can out of Donate. I'm at about 4 million now. Looking at Zydor's output on Donate, I should be able to get more. I estimate he is getting over 900k a day from each of his 7970s, and I have 5 plus a 6970 which is doing about 60% of a 7970. It's funny, it is not so much the credits as that he has set the bar high and I want my machines to get there. Pride, not credits, on a project I like has me fixated. Out of my way, I am on a mission ...

Need to figure out how to adjust voltage without Afterburner or EVGA Precision....it's odd, I have learned so much, but still know so little.

On the knowledge sharing front: with the 7970 cards and GPU-Z from TechPowerUp you can read the ASIC quality of the GPU, which basically predicts how good your GPU is and indicates its power, heat and overclocking ability. The higher the score, the more overclocking room you will have in theory. FireDragon has one at 65.5 and another at 80.3; BlackDragon has 2 at about the same (all 4 are the same Gigabyte model with the same BIOS). Hydra6 has 79.2. Hydra6 overclocks to 1125 without changing voltage (925 is reference), fricken awesome. Its 7970 does 2 Donate WUs every 18:05 (min:sec). The other 7970 cards in CrossFire can only do 1090 and 18:35. But I bet if I pair up the 80+ cards I will get more out of that CrossFire pair. When I have time I will try it.

If folks don't mind, I am going to keep blogging this for future reference.

Bryan
05-24-12, 01:25 AM
I'm assuming you are referring to a single card of Zydor's and not his daily output. He has NEVER come close to 5M a day!

http://boincstats.com/en/stats/131/user/detail/43/bestxdays

:D

Actually I should say he hit it ONE time but hasn't been over 4M a day since.

EmSti
05-24-12, 01:31 AM
I'm assuming you are referring to a single card of Zydor's and not his daily output. He has NEVER come close to 5M a day!

http://boincstats.com/en/stats/131/user/detail/43/bestxdays

:D

Actually I should say he hit it ONE time but hasn't been over 4M a day since.

Sorry, I may have been misleading...I figure he is doing over 900k per 7970. I have the equivalent of 5.6 of them. 900k x 5.6 = 5 mill. My goal is to get to 5 mill, or as close as I can. I have already passed his daily output and I am closing in on your Donate output.
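For reference, the 5 mill target works out as follows. This is a rough Python sketch using the figures from this thread (five 7970s, one 6970 at about 60% of a 7970, an estimated 900k per 7970 per day, and about 4 million per day currently); the function names are illustrative only.

```python
PER_7970_DAILY = 900_000     # estimated credits per day per 7970, per the post
CURRENT_DAILY = 4_000_000    # approximate current daily output, per the thread


def card_equivalents(full_cards, partial_fractions):
    """Express a mixed set of GPUs as a number of '7970 equivalents'."""
    return full_cards + sum(partial_fractions)


def daily_target(equivalents, per_card_daily):
    """Daily credit if every card equivalent hits the per-card estimate."""
    return equivalents * per_card_daily


if __name__ == "__main__":
    equivalents = card_equivalents(5, [0.6])   # five 7970s plus a 6970 at ~60%
    target = daily_target(equivalents, PER_7970_DAILY)
    print(f"{equivalents:.1f} card equivalents -> {target:,.0f} credits/day")
    print(f"gap to close: {target - CURRENT_DAILY:,.0f} credits/day")
```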

EmSti
05-24-12, 01:36 AM
For reference this is what I was looking at:

http://boincstats.com/en/stats/131/user/list/12/0/0

Bryan
05-24-12, 01:39 AM
Sorry I may have been misleading...I figure he is doing over 900k per 7970. I have 5.6 of them. 900k x 5.6 = 5 mill. My goal is to get to 5 mill. or as close as I can. I have already passed is daily output and I am closing in on your Donate output.

Cuz I had to downclock everything due to heat :)) I've got to go get some 12" fans and put to the side of the machines. It was nice a month ago when I was running Donate I would just open the window and with it down in the 20s and 30s I had no heat problems :))

BTW, there was a guy who posted a HAF 932 review on the SUSA forum last year. He had taken a 932 w/ 2 or 3 highend GPUs and made a bunch of temperature measurements. Then they did some changes to the 932 that REALLY helped.

IIRC, they changed the 10" side fan to 4 120mm fans (screw holes already there) and changed the 120mm CPU output fan to a 140mm. It dropped the temp amazingly. I'm not saying get a 932, I'm saying if you can find that article, some of what they did might be applicable to any tower case!

PS I have 4 932s so when you find the thread please let me know the link so I can reread it :))

EmSti
05-24-12, 01:49 AM
PS I have 4 932s so when you find the thread please let me know the link so I can reread it :))

I have http://www.newegg.com/Product/ProductList.aspx?Submit=ENE&DEPA=0&Order=BESTMATCH&N=100006644&isNodeId=1&Description=amd+932&x=0&y=0

And replaced its left panel with a replacement part from the Cooler Master store for this:

http://www.newegg.com/Product/Product.aspx?Item=N82E16811119160&nm_mc=KNC-GoogleAdwords&cm_mmc=KNC-GoogleAdwords-_-pla-_-NA-_-NA

That side fan made a world of difference. The 7970s, O.C., in the 932 with the side fan are running at 62c and 72c. Guess which is the top card :). Edit: It is currently 81F in my office, so those cards are doing well.

If I find the thread I will let you know.

With Hydra6, I am determined to see how far I can push that case. I almost bought a HAF XM to replace it, but stubbornness won.

EmSti
05-24-12, 01:51 AM
BTW this is the link to the replacement door:

http://www.cmstore-usa.com/haf-932-left-side-panel-oem-package/

Bryan
05-24-12, 09:47 AM
All 4 of mine have the 3 10" fans. But if you take a look at the side and top there are mounting holes for 120mm fans. The review said they had dropped the temp big time by changing the 10" side fan to 4 120mm. I'll have to see if I can find that article.

Congrats on taking over top dog in Donate ... anytime you outproduce stoneageman you are doing well !!

Mike029
05-24-12, 10:03 AM
I'm loving the read on this EmSti.
Would you mind starting another thread with your progress? You have lots of good info here on the 7970 cards for this project. (BTW, you are making me itch for an new card) We have a few members that may be getting ready to jump at a few 7970's and would love to keep this info and your experiences handy. ;)

EmSti
05-24-12, 11:04 AM
Congrats on taking over top dog in Donate ... anytime you outproduce stoneageman you are doing well !!


Not there yet. And if I do get to my goal of being close to 5 mill, he will probably improve as well. I don't care to chase him or anyone really, I just want to wring what I can out of what I've got. I am sure you know that urge well.

Last night was a setback. The BOINC client aborted on both AMDs; the Intel is just fine. One aborted shortly after I last looked at it at 1:30 am and the other at 7 a.m. So I lost a good 10 hours; 5 mill may not be any time soon. Why is it always the AMDs that crap out? You would think the graphics driver would work better on them.

EmSti
05-24-12, 11:09 AM
I'm loving the read on this EmSti.
Would you mind starting another thread with your progress? You have lots of good info here on the 7970 cards for this project. (BTW, you are making me itch for an new card) We have a few members that may be getting ready to jump at a few 7970's and would love to keep this info and your experiences handy. ;)

Once I get it down, I will try to remember to do that. Not much point yet, because there is a post on the Donate forum that a new application is on the way that is supposed to fix some of the issues. No date the last I looked.

DrPop
05-24-12, 11:10 AM
My guess would be the combined Catalyst driver. AMD uses it for both the GPUs and the CPU chipset on the mobo. So when the GPU crashes, I'm betting it takes down all of Catalyst with it.
Of course the Intel rigs do not have this issue; they are using entirely different driver for the chipset...

Sent from my MB860 using Tapatalk 2

Slicker
05-24-12, 11:17 AM
All 4 of mine have the 3 10" fans. But if you take a look at the side and top there are mounting holes for 120mm fans. The review said they had dropped the temp big time by changing the 10" side fan to 4 120mm. I'll have to see if I can find that article.

With fans, bigger isn't better, just quieter. In order for a small fan to blow 65 cfm of air it has to spin really fast. (My 24 core 1U webserver sounds like a freaking jet engine with 8 x 25mm fans. I think they spin at around 8000 RPM at full throttle.) By comparison, a larger fan can spin its bigger blades at 1500 RPM and move the same amount of air. Most fans, regardless of their size, move about 45-65 cfm. So, replacing the one 250mm fan with four 120mm fans should increase the airflow considerably, as that's like going from only 1 case fan to 4.

When buying a fan, the goal should be the most air movement with the fewest decibels. If you keep your computers in your crawl space and don't have to listen to the fans whirring, then focus only on cfm.

LEDs look cool, but remember, it needs to BE cool, not just look cool. Your GPU and CPU want more air, not a dazzling light show. It's like a good paint job on a NHRA dragster. Oohs and aahs on the paint job won't pay the bills. It's the time on the clock, or lack thereof, that counts.

Consider the layout of your case and the placement of the fans. Pulling cool air into the case at the bottom and pushing out hot air near the top may give better results than just sucking in air with all fans and hoping that the PSU fan will handle pushing out all the hot air.

EmSti
05-24-12, 03:36 PM
Boy is my face red....Bad version of MSI afterburner, newer version understands 7970 better.

I have been reading up on the ASIC quality and what it means; will post more later. But in that reading I came across the MSI Afterburner 2.2.0 beta thread and their work on getting Afterburner to manage the 7970.

More speed here I come.

EmSti
05-24-12, 05:02 PM
What follows is pieced together from Unwinder's posts in the MSI Afterburner 2.2.0 Beta thread:

A few interesting observations on Tahiti GPUs. First, it looks like, similar to Fermi, Tahiti GPUs are binned and there are 4 types of Tahiti GPUs with different ASIC leakage levels. So depending on it there are 4 possible default 3D voltages: 1.17500V, 1.11250V, 1.05000V and 1.02500V.
Currently all voltage control tools use hardcoded default voltage (1.175V) for 7970, so pressing "Reset"/"Defaults" buttons in any voltage control tool will put any GPU to 1.175V regardless of leakage level.

As far as I can understand BIOS code, this value is ASIC leakage level fused during ASIC binning and manufacturing. An ideal GPU with no leakage current is 00003ff0, lower values indicate higher leakage and higher default voltage.

AMD uses the following table to select VID depending on ASIC quality:

up to 2F90 (up to 75% quality) - 1.1750V
up to 34D0 (up to 80% quality) - 1.1125V
up to 3820 (up to 85% quality) - 1.0500V
up to 3A90 (up to 90% quality) - 1.0250V


You shouldn't create hysteria around this value; "quality" is way too artificial a name for it to say whether a card is bad or good. Take it as an indicator of default voltage, nothing more, nothing less. It doesn't matter what the "quality" is if you're running at stock clocks. Lower voltage (high leakage) chips are better overclockers because of more room for voltage tweaking, but they eat more power compared to higher voltage chips when running at the same voltage.

--------------------------------------------------------------------------------

Alexey Nicolaychuk aka Unwinder, RivaTuner creator
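The VID table quoted above can be written as a simple lookup. This Python sketch only encodes the four rows as posted; treating each "up to" bound as inclusive is my assumption, and the quoted table does not say what happens above 3A90, so the function returns None there rather than guessing. The name `default_voltage` is made up for illustration.

```python
# (upper bound of the fused leakage value, default 3D voltage in volts),
# copied from the table quoted above.
VID_TABLE = [
    (0x2F90, 1.1750),
    (0x34D0, 1.1125),
    (0x3820, 1.0500),
    (0x3A90, 1.0250),
]


def default_voltage(fuse_value):
    """Return the default voltage for a fused leakage value, or None if not covered."""
    for upper_bound, volts in VID_TABLE:
        if fuse_value <= upper_bound:   # assumption: "up to" is inclusive
            return volts
    return None                         # above 3A90: not listed in the quoted table


if __name__ == "__main__":
    for value in (0x2000, 0x3000, 0x3700, 0x3A00, 0x3FF0):
        print(f"{value:04X} -> {default_voltage(value)}")
```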

DrPop
05-24-12, 06:58 PM
That is very interesting, thanks. Definitely good info for the max O/C! :D

Mike029
05-24-12, 09:06 PM
Thank you EmSti. Good stuff there.

Fire$torm
05-24-12, 09:12 PM
Boy is my face red....Bad version of MSI afterburner, newer version understands 7970 better.

No need for AB Beta ---> http://www.setiusa.us/showthread.php?3331-AMD-no-longer-allowing-quot-Unofficial-quot-OC-with-Catalyst-12-2-and-above&p=39709&viewfull=1#post39709

EmSti
05-24-12, 09:44 PM
No need for AB Beta ---> http://www.setiusa.us/showthread.php?3331-AMD-no-longer-allowing-quot-Unofficial-quot-OC-with-Catalyst-12-2-and-above&p=39709&viewfull=1#post39709

Right, beta discussion thread, but I went and downloaded the latest release 2.2.1 which was released a week or so ago. The beta thread made me understand the errors of my ways.

It's working well so far. The cards with low ASIC quality (those with the high default voltage of 1.175V) I left at stock voltage. Those with high ASIC quality I upped the voltage on a little, but not to 1.175V, tweaking a bit here and there for heat and stability. I told Afterburner not to synchronize the settings. Now all 5 cards are running at an 1125 core clock. It's been 4 hours so far with only one computation error, which happened when I was messing around on my office PC (i.e. my fault, not the setup). No crashes or BOINC restarts (knock on wood). But the real test always seems to be getting through the night and the first 24 hrs.

For those that are counting, at 1125 and with the app_info setup being used (see earlier in the thread), each 7970 is producing 2 Donate WUs every 18:05. That's about 30 secs faster than I was getting at 1090 (yesterday's stable speed). No idea what it does at the reference speed of 925. :cool:

Fire$torm
05-24-12, 10:04 PM
Right, beta discussion thread, but I went and downloaded the latest release, 2.2.1, which came out a week or so ago. The beta thread made me understand the error of my ways.

Roger that. Question: Is "Unofficial Overclocking" still permitted in that version? That is, can you OC past CCC limits?

zombie67
05-25-12, 12:21 AM
Has a solution for the never ending tasks ever been found? For me with 5870s and 7970, I get them frequently. I lose several hours every day on each of them, and that is with frequent checking. I tried using BOINCtasks to suspend a hung task, which does work. But for some reason, I can't get replacement tasks to download. So the hung task is suspended, and the GPU sits idle.

EmSti
05-25-12, 01:15 AM
Has a solution for the never ending tasks ever been found? For me with 5870s and 7970, I get them frequently. I lose several hours every day on each of them, and that is with frequent checking. I tried using BOINCtasks to suspend a hung task, which does work. But for some reason, I can't get replacement tasks to download. So the hung task is suspended, and the GPU sits idle.

As soon as I went to Zydor's app_info.xml (see post #3 of this thread) the problem went away on 2 of the machines (I also went to CCC 12.4 at the same time, manually removing amdocl and opencl dlls from \windows\system32 if they are not from 4/5 before reinstalling 12.4). After that it came down to one card that would abort the GPU drivers when it hit 83C. The drivers restart of course, but test615.exe would keep right on going forever. I just happened to see it occur one day. If you suspended the hung tasks and resumed them, they would start from the beginning and finish fine. Fixed the temp problem and no more nevers.

Hey...looking at that post #3, I have knocked nearly 3 full minutes off the WU cycle. The dual 7970s are now producing 4 Donate WUs every 18:05 and back then it was 21 minutes. :o)
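
For anyone who wants to redo the comparison with their own numbers, here is a small sketch of the arithmetic using the two cycle times quoted in this post (4 WUs every 21:00 versus 4 WUs every 18:05). The function name wus_per_hour is just an illustrative helper, not anything from BOINC.

```python
def wus_per_hour(wus, minutes, seconds=0):
    """Work units completed per hour when `wus` finish every minutes:seconds."""
    cycle_seconds = minutes * 60 + seconds
    return wus * 3600 / cycle_seconds


# Cycle times quoted in the post for the dual-7970 machine.
before = wus_per_hour(4, 21)      # 4 WUs every 21:00
after = wus_per_hour(4, 18, 5)    # 4 WUs every 18:05

# Time saved per cycle, in seconds.
saved_seconds = 21 * 60 - (18 * 60 + 5)

if __name__ == "__main__":
    print(f"before: {before:.2f} WU/h, after: {after:.2f} WU/h")
    print(f"cycle shortened by {saved_seconds} s, "
          f"throughput up {after / before - 1:.1%}")
```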

zombie67
05-25-12, 02:15 AM
Yeah, I have used the app_info.xml, and 12.4. Oh well, I will struggle on until I hit my 100M MM, and then just bail.

And yeah, I know restarting them will let them finish. But that doesn't make up for the lost time. I am curious why BOINC won't DL more tasks while the task is suspended. Or maybe the problem is on the server side. Either way, suspending a task doesn't solve the problem of wasted time.

Fire$torm
05-25-12, 02:49 AM
Has a solution for the never ending tasks ever been found? For me with 5870s and 7970, I get them frequently. I lose several hours every day on each of them, and that is with frequent checking. I tried using BOINCtasks to suspend a hung task, which does work. But for some reason, I can't get replacement tasks to download. So the hung task is suspended, and the GPU sits idle.

I have noticed that suspending then resuming the task after a few minutes will get the wu to complete. Since you can have up to three actions per rule in BOINCTasks, you could set the 2nd action to start a Windows batch file/Linux script to wait several minutes and then call boinccmd to resume the task.
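
A rough sketch of that suspend, wait, resume idea, written in Python rather than a batch file. It relies on boinccmd's "--task URL task_name {suspend | resume | abort}" form. The function names, the 180 second default wait and the placeholder URL and task name are assumptions for illustration; how BOINCTasks would hand the task name to the script is not covered here.

```python
import subprocess
import time


def task_command(project_url, task_name, operation, boinccmd="boinccmd"):
    """Build the boinccmd argument list for a single task operation."""
    if operation not in ("suspend", "resume", "abort"):
        raise ValueError(f"unsupported task operation: {operation}")
    return [boinccmd, "--task", project_url, task_name, operation]


def bounce_task(project_url, task_name, wait_seconds=180,
                run=subprocess.run, sleep=time.sleep):
    """Suspend a task, wait a few minutes, then resume it.

    `run` and `sleep` are injectable so the sequence can be checked without
    a BOINC client installed.
    """
    run(task_command(project_url, task_name, "suspend"), check=True)
    sleep(wait_seconds)
    run(task_command(project_url, task_name, "resume"), check=True)


if __name__ == "__main__":
    # Placeholder values; substitute the real project URL and task name.
    print(task_command("http://example.invalid/", "some_task_name", "suspend"))
```

Keep in mind what is reported elsewhere in this thread: a resumed task starts over from zero, so this saves babysitting but not the lost runtime.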

DrPop
05-25-12, 11:25 AM
Hmmm...is this related to the "test-something.exe" stopping problem? Puma gets that every now and then for no reason that I can see - doesn't matter if I have the GPUs O/Ced or not, doesn't seem to have any schedule, just random. What's even more strange is sometimes it looks like it stops Donate WUs from running...other times it doesn't seem to matter and it just starts new WUs! :confused: I don't know what to think. This is two 5870s running 12.3 drivers...

Fire$torm
05-25-12, 01:17 PM
Hmmm...is this related to the "test-something.exe" stopping problem? Puma gets that every now and then for no reason that I can see - doesn't matter if I have the GPUs O/Ced or not, doesn't seem to have any schedule, just random. What's even more strange is sometimes it looks like it stops Donate WUs from running...other times it doesn't seem to matter and it just starts new WUs! :confused: I don't know what to think. This is two 5870s running 12.3 drivers...

Yep, Test615.exe is the Donate app/wrapper.

EmSti
05-26-12, 10:03 PM
Getting closer to goal, last 2 days over 4 Mill credits. I started upping voltage and clocks yesterday. Errors went up for a while and BlackDragon was locking up for a while. Lost a few hours during those lockups and when I shut it down to adjust fans. But I think they are all stable at the next level now. Might need some minor voltage tweaks yet. 7970 cards now at 1165 MHz and each is finishing 2 wus every 17:35, so another 25 sec off.

EmSti
05-27-12, 01:57 PM
Yippy, king of the hill for a day at Donate. \:D/
It changes day to day, grabbing snapshot while it is there. It is a first for me. :D

912

Al
05-27-12, 02:10 PM
Congrats! Keep it up! SUSA wants to be 1st to 1 Billion.


DrPop
05-27-12, 02:51 PM
Yippy, king of the hill for a day at Donate. \:D/
It changes day to day, grabbing snapshot while it is there. It is a first for me. :D

912

Waaaaaahoooooooo baby! YEAH! :)) :D That is what we like to see, go get 'em S.USA!!!**==

Duke of Buckingham
05-27-12, 05:41 PM
:) Congratulations EmSti, very good job. Keep on keeping... Unfortunately Donate doesn't work with XP 64 or I would be a real threat to you all. :D

Bad Jokes Duke :-B

spingadus
05-28-12, 11:16 AM
Nice job getting to the top!

Fire$torm
05-28-12, 09:05 PM
Yippy, king of the hill for a day at Donate. \:D/
It changes day to day, grabbing snapshot while it is there. It is a first for me. :D

912

Oorah! **==

Bryan
05-29-12, 12:13 AM
Nah, fame is so fleeting! Today he got his butt handed to him big time ... a whole 13k :))

Nyaah nah nah nah nah!

Okay so I got lucky when the project was down it didn't slow me down as much as the big boys :D

Tomorrow be sure and wave as you blow by EmSti :-h

zombie67
05-29-12, 12:19 AM
How is this done? I would have to check in on my tasks every 15 minutes, to restart the stalled tasks. And even then, that would be 15 minutes lost. Babysitting tasks to completion is a 24/7 job with this project. And if you consider all the idle time, it pays crap.

Fire$torm
05-29-12, 01:04 AM
How is this done? I would have to check in on my tasks every 15 minutes, to restart the stalled tasks. And even then, that would be 15 minutes lost. Babysitting tasks to completion is a 24/7 job with this project. And if you consider all the idle time, it pays crap.

One alternative would be to run a batch file/script to pause the project and then say 2~5 minutes later resume the project. Since a stalled task isn't dead, it will complete. This way no need to babysit the project.

I don't know how to compose a Linux script but a batch file I can do in a few minutes. Task Scheduler takes care of starting the batch file.

Bryan
05-29-12, 01:11 AM
How is this done? I would have to check in on my tasks every 15 minutes, to restart the stalled tasks. And even then, that would be 15 minutes lost. Babysitting tasks to completion is a 24/7 job with this project. And if you consider all the idle time, it pays crap.

I don't have the problems some do, but I'm just running the old stuff... 58xx and 69xx. On occasion I get a hung WU, but it isn't all that common, and when I do I usually lose several hours of a GPU. I don't babysit the computers.

What I found on MY systems is:

1. The app_info works excellently with single cards. Gives 5% or better increase in credits.
2. On a dual card system (6990) it caused bunches of 590-credit WUs (i7 2600K). Even though my crunch time per WU went down, I got far fewer credits: 26 590-credit WUs over a 16 hour period. When running 1 WU per GPU I almost NEVER see a 590-credit WU.
3. On a dual card + single card (5970 + 5870) system I got bunches of hung WU. A hung WU is RARE on that system when running 1 WU per GPU.
4. A Q6600 machine with 2 5870s barfed almost immediately with only 2 cores reserved. I didn't try freeing all 4.

On #2 and #3 I was reserving 1/2 of a V8 ... it might have worked with more CPU reserved or it may be a function of the "dual" GPUs. I don't know but it isn't worth worrying about. I'm heat limited at this point so increasing the GPU loading by 5% isn't beneficial! All but 2 cards are downclocked.

I'm getting 4M+ per day (normally) so it pays far better than anything else.

Z you might try shutting down CPU crunching on your 7970 machine and try a 24 hour experiment. Before I did the Collatz Pentathlon thingie I was running Donate to the tune of 4.3M per day (winter I could clock higher). Although I did see hung WU they were VERY rare. After the Pentathlon I went back to Donate and for a couple of days I had a lot of trouble. It finally dawned on me the difference between before and after was the amount of CPU I was using for crunching. Prior I was reserving 4 threads on the V8s and after I was crunching 6 CPU wu and doing Donate. I reserved 2 more threads (1/2 the I7s) and my problems went away. Anyway it might be worth a try on one of your systems!

EmSti
05-29-12, 01:24 AM
Nah, fame is so fleeting! Today he got his butt handed to him big time ... a whole 13k :))

Nyaah nah nah nah nah!

Okay so I got lucky when the project was down it didn't slow me down as much as the big boys :D

Tomorrow be sure and wave as you blow by EmSti :-h


I did expect my total to be down because I worked on 2 of them for hours. Redid the air flow on 2 PCs and redid the TIM on one of the hot GPUs (long ugly story I will share after the nightmares stop). After I got BlackDragon sorted out (or so I thought), I started up Donate right in the middle of the problems they had last night and almost freaked, until I noticed all of my machines were up and down: processing, not processing, dooooooooinnnnnnngggggg nothing, repeat. I had to wait hours to know I hadn't totally FUBARed things.

That being said, and with totals expected to be down, I was surprised when the daily update on BOINCstats showed that I was top for the day again. I had to search for a bit to find that you were looking at Free-DC (I think; they showed the 13k diff). Free-DC shows a larger amount for me yesterday than BOINCstats, so I guess it depends on which 24 hour period they look at. I was going to post a pic of both, but I guess I have hit my space limit and I can't seem to clear the previous uploads (I need help, PM me please).

I am running even faster now, and I am looking forward to seeing my totals for the next 48 hours and whether the new setups are stable. 4 of the 5 7970s are now doing 2 Donate work units in about 17:05. I also shaved 30 secs off the 6970 times. The 7970s are now running at 1185, 1185, 1185, 1175 and 1170 MHz. 2 of the ones at 1185 are still running cool; they may be capable of 1200, but I don't want to push any harder for a while.

EmSti
05-29-12, 01:32 AM
I reserved 2 more threads (1/2 the I7s) and my problems went away. Anyway it might be worth a try on one of your systems!

I agree, I think reserving the thread(s) is key. I am becoming more and more convinced a lot of the project's problems are timing problems between the GPU and CPU. Reserving a CPU seems to cut down on that. The main problem I get lately (not daily) is the BOINC client crashing when Donate and one of the CPU projects finish at the same time, which is why I wrote the program to restart the client. Lots of computation errors yesterday, but that was me playing with voltage and speed. Much more settled today.

zombie67
05-29-12, 01:37 AM
Just FYI: Yes, you can suspend/resume a "hung" task, and it will complete with credits. But that is no better than aborting a task and starting a new one. A hung task that is suspended/resumed restarts from zero time, and takes the normal amount of time to complete.

EmSti
05-29-12, 01:42 AM
Just FYI: Yes, you can suspend/resume a "hung" task, and it will complete with credits. But that is no better than aborting a task and starting a new one. A hung task that is suspended/resumed restarts from zero time, and takes the normal amount of time to complete.

That's my experience with Donate too. There also seems to be a delay while it decides to restart the job. I have learned to time reboots around the % of time complete.

EmSti
05-29-12, 07:51 PM
Oh so close to the 5 mill a day total on Donate. According to Free-DC today I got 4,829,439. Only 3.42% to go. :-B

Mike029
05-29-12, 09:16 PM
Oh so close to the 5 mill a day total on Donate. According to Free-DC today I got 4,829,439. Only 3.42% to go. :-B

Squeezing every little stone out of those puppies. :p

Bryan
05-30-12, 12:11 AM
Oh so close to the 5 mill a day total on Donate. According to Free-DC today I got 4,829,439. Only 3.42% to go. :-B

No problemo, you'll hit it tomorrow. Their server had trouble for a couple of hours this afternoon ... I lost 300k and you would have lost more than that! Assuming the project server stays healthy you should blow right through it tomorrow!

Cruncher Pete
05-30-12, 04:16 AM
Guys, to me what counts is not so much what we as volunteers are doing but what the projects are NOT doing. We as users have considerable knowledge, whether from experience or indeed from programming, yet you just need to look at the Message Boards of most sites: the Admins are saying nothing, or at best dismissing any problem, as if to say there is NO problem, you must be stupid, for I do not have that problem here... They do not even acknowledge that there is a problem, or offer a thank you for using their project. This hurts me more than anything else in DC computing. Are we so dumb that we accept the apparent fact that they do not care? If I am wrong, then please tell me how it is communicated to us, for I do not see this in most projects. Some do not even bother to talk to us, because there is no active Message Board, and they do not care whether the results of our daily scores are promulgated worldwide or not.

I think it is time we voiced our opinion: if you find that a project does not conform to the standard it should be following, then stop using that project. There are plenty of others that do the right thing. Please do not come back with the argument that, as a new project in the Alpha phase, they need time to work things out. All I can say to this is that we have a lot of knowledge amongst us as users, so they should listen to us and use the Message Board as their method of communicating with us. If a problem is so serious that it affects all users, then they should acknowledge its existence and say so openly in the News section, outlining what action they have taken to correct it. I am sick and tired of wasting my money running my computers on a project that does not acknowledge my presence or listen to my concerns about running their program. If they are not interested in our input, then they should realize their demise....

I am sorry to say that although I have followed this project from the beginning, I am now under the impression that they are using us, for they are not interacting with any of the users on their Message Boards. To confirm this, just look at how many problems are posed to them and how many are in fact replied to.

There have been instances of others reflecting this stance, yet no reply from the project. It is your choice what you intend to do...

I think it is time that we told all projects, not just this one, that we need recognition and acknowledgement of our problems, and if we do not get that simple thank you and acknowledgement, then YOU should choose another project that cares... It is up to you. Do what your own conscience desires...

Fire$torm
05-30-12, 02:57 PM
Just FYI: Yes, you can suspend/resume a "hung" task, and it will complete with credits. But that is no better than aborting a task and starting a new one. A hung task that is suspended/resumed restarts from zero time, and takes the normal amount of time to complete.

Yes, that is correct. All in all Donate is a decent project that pays very well. And the reality is that Donate is in the same league as quite a few other projects. They can be very quirky and a pain in the ass. But, if you want credit or MM in that project then you have to accept what it throws at you.

So better idea, use a BOINCTasks rule set to abort the wu if runtime exceeds max_runtime + 10~15%. Still no baby sitting involved and credit output will exceed any other project. On a good day with Donate I have hit 1 Mil+ from that project alone using three 5830s ---> http://stats.free-dc.org/stats.php?page=user&proj=don&name=89
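
The "max_runtime + 10~15%" condition boils down to a one-line comparison. A minimal sketch, assuming you already know the task's elapsed time and a typical good runtime for your card; should_abort, expected_seconds and the 15% default margin are illustrative choices, not BOINCTasks settings.

```python
def should_abort(elapsed_seconds, expected_seconds, margin=0.15):
    """True when a task has run longer than its expected runtime plus a margin.

    `expected_seconds` is whatever a healthy task normally takes on the card
    in question; `margin` is the 10~15% headroom suggested in the post.
    """
    return elapsed_seconds > expected_seconds * (1 + margin)


if __name__ == "__main__":
    # Example: a pair of WUs normally finishes in 18:05 (figure from this thread).
    expected = 18 * 60 + 5
    for elapsed in (1000, 1200, 1300, 7200):
        print(elapsed, should_abort(elapsed, expected))
```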

Bryan
05-30-12, 04:09 PM
So better idea, use a BOINCTasks rule set to abort the wu if runtime exceeds max_runtime


How do you set it up for ABORT? I only show SUSPEND task.

EmSti
05-30-12, 07:34 PM
Woot! \:D/

Reached my goal. According to Free-DC, I did 5,109,429 credits today, finally reaching my goal of tuning the 6 GPUs to the 5 Mill level. Yay for me. I think 2 or 3 of the GPUs can go higher, but I am going to give it a rest for a while and work harder on house projects.

Other impressive numbers: Looks like Cruncher Pete has stomped on the Donate gas pedal and reached 5,266,177 (king of the hill again). I think I counted 19 GPUs in the mix (of course it's hard to tell if all of those were actually doing Donate or just in the machine doing other things), and that is just plain impressive. ^:)^ Are the Tahiti cards playing nice now?

Bryan is also flying high at 4,563,877. :-bd

I believe those are the highest numbers I have seen on Donate from all three since I started my little adventure. :-h

Fire$torm
05-30-12, 07:46 PM
How do you set it up for ABORT? I only show SUSPEND task.

Use the run program action in the rule to start a batch file or script that aborts the Donate cache using boinccmd's --project <URL> reset command.

Edit: This way only one command needs to be issued.
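
A sketch of how a script might build that call, using boinccmd's "--project URL operation" form. The helper name project_command and the placeholder URL are assumptions for illustration. Note that a project reset throws away every task the client holds for that project, not only the hung one, so it is a blunt tool.

```python
# Project-level operations used in this thread (a subset of what boinccmd accepts).
PROJECT_OPS = ("reset", "suspend", "resume", "update")


def project_command(project_url, operation, boinccmd="boinccmd"):
    """Build the boinccmd argument list for a project-level operation."""
    if operation not in PROJECT_OPS:
        raise ValueError(f"unsupported project operation: {operation}")
    return [boinccmd, "--project", project_url, operation]


if __name__ == "__main__":
    # Placeholder URL; substitute the project's master URL as shown in BOINC.
    # To actually run it, pass the list to subprocess.run(cmd, check=True).
    print(" ".join(project_command("http://example.invalid/", "reset")))
```

The same helper covers the suspend-then-resume-the-project approach mentioned earlier in the thread, by issuing "suspend" and later "resume" instead of "reset".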