Problem with Poem app_info



Al
06-25-12, 10:45 PM
I got a working app_info from Mad Matt. Works great on my q6600 with a single 560ti.

When I tried it on my 1035t with two 570s, I got errors that the file referenced in the app_info, "poemcl_1.3_windows_intelx86__opencl_nvidia_100", could not be found. Sure enough, it wasn't there, and I couldn't get it to download even by resetting or detaching. I did see these lines in the event log, which seemed like they might be part of the problem:

6/25/2012 2:58:18 PM | | NVIDIA GPU 0: GeForce GTX 570 (driver version 301.42, CUDA version 4.20, compute capability 2.0, 1280MB, 1173MB available, 1405 GFLOPS peak)
6/25/2012 2:58:18 PM | | NVIDIA GPU 1: GeForce GTX 570 (driver version 301.42, CUDA version 4.20, compute capability 2.0, 1280MB, 1169MB available, 1405 GFLOPS peak)
6/25/2012 2:58:18 PM | | OpenCL: NVIDIA GPU 0 (not used): GeForce GTX 570 (driver version 301.42, device version OpenCL 1.1 CUDA, 1280MB, 1173MB available)
6/25/2012 2:58:18 PM | | OpenCL: NVIDIA GPU 1 (not used): GeForce GTX 570 (driver version 301.42, device version OpenCL 1.1 CUDA, 1280MB, 1169MB available)

Not sure why they were marked "(not used)", but I reinstalled the NVIDIA driver and I haven't seen that message since. I had to copy the poemcl_1.3_windows_intelx86__opencl_nvidia_100 file over from the 560 system to the system housing the two 570s; it never would download. Now I can download GPU WUs for Poem, but 2 run on one GPU and the other 2 say they're running while showing no GPU load at all. For the moment I have 2 Poem tasks running on one 570 and 3 Einstein tasks running on the other. It's working for now, but it's still not right. Anyone have any additional ideas on what could be causing this? Both boxes run Win7, BM 7.0.28 and NVIDIA driver 301.42; one works with this app_info and the other doesn't. I really don't think the app_info file is the problem, so I could sure use some other ideas.


<app_info>
    <app>
        <name>poemcl</name>
        <user_friendly_name>POEM++ OpenCL version</user_friendly_name>
    </app>
    <file_info>
        <name>poemcl_1.3_windows_intelx86__opencl_nvidia_100</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>poemcl</app_name>
        <version_num>1</version_num>
        <avg_ncpus>1.0</avg_ncpus>
        <max_ncpus>1.0</max_ncpus>
        <flops>2.1e10</flops>
        <plan_class>opencl_nvidia_100</plan_class>
        <file_ref>
            <file_name>poemcl_1.3_windows_intelx86__opencl_nvidia_100</file_name>
            <main_program/>
        </file_ref>
        <coproc>
            <type>NVIDIA</type>
            <count>0.5</count>
        </coproc>
        <gpu_ram>256.000000</gpu_ram>
    </app_version>
</app_info>

kmanley57
06-26-12, 12:09 AM
Does this line really have a space in it?

<name>poemcl_1.3_windows_intelx86__opencl_nvidia_1 00</name>
^

Al
06-26-12, 04:22 AM
No

kmanley57
06-26-12, 11:22 AM
Looking at the scheduler reply file, I think you want version 103.

This is what my Scheduler reply has:

<version_num>103</version_num>

Now my machine with the ATI cards uses <version_num>1</version_num>.

Al
06-26-12, 01:23 PM
Changed it to 103 and it gets 1.03 tasks now. It runs 2 tasks on GPU 0 at 85% GPU utilization. GPU 1 continues to say it is running, but elapsed time jumps from "1" second to "---", back and forth, and the % completed never increases. If I suspend those 2 tasks, Einstein starts running 3 on GPU 1. When Poem has completed its 2 WUs, GPU 0 goes to Einstein and stays there. Very weird. Any other ideas on what the problem may be? Anyone...Anyone...
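
For reference, a sketch of the app_version block with only the version number changed as described here. It assumes every other element stays exactly as in the app_info posted at the top of the thread; the value 103 is the one reported from the scheduler reply, nothing else has been verified against the project.

```xml
<!-- Sketch only: same app_version as in the original app_info,
     with version_num changed from 1 to 103 as discussed above. -->
<app_version>
    <app_name>poemcl</app_name>
    <version_num>103</version_num>
    <avg_ncpus>1.0</avg_ncpus>
    <max_ncpus>1.0</max_ncpus>
    <flops>2.1e10</flops>
    <plan_class>opencl_nvidia_100</plan_class>
    <file_ref>
        <file_name>poemcl_1.3_windows_intelx86__opencl_nvidia_100</file_name>
        <main_program/>
    </file_ref>
    <coproc>
        <type>NVIDIA</type>
        <!-- 0.5 of a GPU per task, i.e. two tasks per card -->
        <count>0.5</count>
    </coproc>
    <gpu_ram>256.000000</gpu_ram>
</app_version>
```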

Al
06-26-12, 03:23 PM
While the tasks are flipping from 1% to ---, here are the event log entries:

6/26/2012 3:19:33 PM | Poem@Home | Task poempp_gpucrystal_1340603378_144242941_0 exited with zero status but no 'finished' file
6/26/2012 3:19:33 PM | Poem@Home | If this happens repeatedly you may need to reset the project.
6/26/2012 3:19:33 PM | Poem@Home | Restarting task poempp_gpucrystal_1340603378_144242941_0 using poemcl version 103 (opencl_nvidia_100) in slot 14
6/26/2012 3:19:35 PM | Poem@Home | Task poempp_gpucrystal_1340603378_144242941_0 exited with zero status but no 'finished' file
6/26/2012 3:19:35 PM | Poem@Home | If this happens repeatedly you may need to reset the project.
6/26/2012 3:19:35 PM | Poem@Home | Restarting task poempp_gpucrystal_1340603378_144242941_0 using poemcl version 103 (opencl_nvidia_100) in slot 14
6/26/2012 3:19:37 PM | Poem@Home | Task poempp_gpucrystal_1340603378_144242941_0 exited with zero status but no 'finished' file
6/26/2012 3:19:37 PM | Poem@Home | If this happens repeatedly you may need to reset the project.
6/26/2012 3:19:37 PM | Poem@Home | Restarting task poempp_gpucrystal_1340603378_144242941_0 using poemcl version 103 (opencl_nvidia_100) in slot 14
6/26/2012 3:19:40 PM | Poem@Home | Task poempp_gpucrystal_1340603378_144242941_0 exited with zero status but no 'finished' file
6/26/2012 3:19:40 PM | Poem@Home | If this happens repeatedly you may need to reset the project.
6/26/2012 3:19:40 PM | Poem@Home | Restarting task poempp_gpucrystal_1340603378_144242941_0 using poemcl version 103 (opencl_nvidia_100) in slot 14
6/26/2012 3:19:42 PM | Poem@Home | Task poempp_gpucrystal_1340603378_144242941_0 exited with zero status but no 'finished' file
6/26/2012 3:19:42 PM | Poem@Home | If this happens repeatedly you may need to reset the project.
6/26/2012 3:19:42 PM | Poem@Home | Restarting task poempp_gpucrystal_1340603378_144242941_0 using poemcl version 103 (opencl_nvidia_100) in slot 14
6/26/2012 3:19:44 PM | Poem@Home | Task poempp_gpucrystal_1340603378_144242941_0 exited with zero status but no 'finished' file
6/26/2012 3:19:44 PM | Poem@Home | If this happens repeatedly you may need to reset the project.
6/26/2012 3:19:44 PM | Poem@Home | Restarting task poempp_gpucrystal_1340603378_144242941_0 using poemcl version 103 (opencl_nvidia_100) in slot 14

I have reset the project with no change. Mean anything to anyone?

kmanley57
06-26-12, 05:48 PM
Well!!!

I had this message earlier today. All I did was remove the app_info file I had used and reset the project, and the message stopped appearing. So I think it IS something in the file, but I have not looked at the app_info/scheduler reply any further to figure out WHAT is wrong. Sorry! I'm not currently running Poem, so I can't look more.

I found out one of my NVIDIA cards will not run two WUs on it! So I will have to adjust the app_info to deal with this. But since it is two different cards, what can you expect! :rolleyes:
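
A minimal sketch of that kind of adjustment, assuming the same coproc block as in the app_info posted at the top of the thread. Setting count to 1.0 asks for a whole GPU per task instead of half of one. As far as the app_info format goes, this value applies to every NVIDIA card in the host, so it would limit both cards to one task each rather than only the card that has trouble.

```xml
<!-- Sketch only: goes inside the existing <app_version> block.
     count = 1.0 means one task per GPU (0.5 meant two tasks per GPU). -->
<coproc>
    <type>NVIDIA</type>
    <count>1.0</count>
</coproc>
```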

Al
06-26-12, 09:38 PM
Maybe that's the problem. Both are 570s, but from different manufacturers. Tomorrow I'm going to move the 2nd card to another computer and try running it there by itself. If it won't run by itself, then that's the problem. I think I'll also copy these posts and put them on the Poem board to see if they have any ideas. Thanks.

Duke of Buckingham
06-30-12, 08:38 AM
:mad: I am having problems with the POEM app and cc_config. I restarted the system and everything went crazy. The XML files are all giving errors, and the system is not seeing that I installed (and reinstalled) Catalyst 12.3. Well, it all went crazy. I will reinstall Windows again in a little while. :confused:

Duke:p