-
2 gpu problems
Cross-posting this here from https://srbase.my-firewall.org/sr5/f...p?id=1561#7446
"I have been successfully crunching Milkyway, Einstein, Collatz, etc. on a dual HD 7970 machine.
I just got some TF WUs. It starts 2 and says one is on device 0 and the other on device 1, but only one of the cards has a load on it; the other is idle.
I recently updated my driver, so to confirm it wasn't that, I suspended SRbase and ran some Einstein, and both cards were utilized.
The 2 WUs completed successfully. The estimated time, ~1600 seconds, was significantly longer than the actual run times of 753 and 758 seconds.
As the Einstein task finished on one card, SRbase started 1 WU on that card. That one finished in 250 seconds. I watched the other Einstein task finish (dev 1), and SRbase did not start a WU on that card. While dev 1 sat idle, the SRbase WU on dev 0 finished and another one started, again solo.
I'm sure someone has tackled this before; I just need someone to point me to the fix."
-
Re: 2 gpu problems
Not too concerned about this, as the 7970s don't look very efficient on SRbase TF.
-
Re: 2 gpu problems
I don't believe yoyo supports 2 GPUs, sort of like Poem used to be.
Edit: Sorry, SRbase
-
Re: 2 gpu problems
This is SRbase, but you might still be right. I've moved everything to Einstein to push for 100m.
-
Re: 2 gpu problems
SRbase supports only a single GPU. You will need to edit your cc_config.xml to ignore one of the GPUs for that project. Instructions here:
https://srbase.my-firewall.org/sr5/f...stid=5765#5765
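For anyone who can't open the link, here is a rough sketch of what such a cc_config.xml exclusion could look like, using BOINC's `<exclude_gpu>` option. The project URL, the device number, and the `ATI` type are assumptions based on the dual HD 7970 setup described above; the URL has to match the project URL exactly as your client shows it, and the app name `TF` may differ on your install.

```xml
<cc_config>
  <options>
    <exclude_gpu>
      <!-- Assumption: must match the project URL shown in your BOINC client -->
      <url>https://srbase.my-firewall.org/sr5/</url>
      <!-- Assumption: exclude device 1 so the project only sees device 0 -->
      <device_num>1</device_num>
      <!-- GPU vendor type: NVIDIA, ATI, or intel_gpu -->
      <type>ATI</type>
      <!-- Optional: limit the exclusion to one app; omit to exclude for the whole project -->
      <app>TF</app>
    </exclude_gpu>
  </options>
</cc_config>
```

The file goes in the BOINC data directory. Restarting the client after editing it is the safe way to make sure the exclusion is picked up.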
-
Re: 2 gpu problems
I'm not sure about now, but SRbase TF previously ran only on GPU 0. If you had more than one GPU, you had to set up multiple clients and use app_config.xml files to control which GPU each client used.
Client A running TF on GPU 0
Code:
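<app_config>
  <app>
    <name>TF</name>
    <!-- run at most one TF task at a time in this client -->
    <max_concurrent>1</max_concurrent>
  </app>
  <app_version>
    <app_name>TF</app_name>
    <plan_class>cuda100</plan_class>
    <!-- -d selects the GPU the app runs on: device 0 for client A -->
    <cmdline>-d 0</cmdline>
  </app_version>
</app_config>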
Client B running TF on GPU 1
Code:
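<app_config>
  <app>
    <name>TF</name>
    <!-- run at most one TF task at a time in this client -->
    <max_concurrent>1</max_concurrent>
  </app>
  <app_version>
    <app_name>TF</app_name>
    <plan_class>cuda100</plan_class>
    <!-- -d selects the GPU the app runs on: device 1 for client B -->
    <cmdline>-d 1</cmdline>
  </app_version>
</app_config>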
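The two app_config.xml files above only make sense if two clients are running on the same host. As a minimal sketch of that part, assuming a standard BOINC client, each client's cc_config.xml would need something like the following. Each client also has to run from its own data directory, and that part of the setup is not shown here.

```xml
<cc_config>
  <options>
    <!-- Allow more than one BOINC client on this host; each client needs a separate data directory -->
    <allow_multiple_clients>1</allow_multiple_clients>
  </options>
</cc_config>
```

Each data directory then gets its own copy of the project's app_config.xml, one with `-d 0` and the other with `-d 1`.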