AQUA: Mass testing of the ROQS application will begin



RSS
12-24-10, 04:47 PM
We have been performing test runs on the ROQS (Redfield Open-system Quantum Simulation) application for the past few days. We will be deploying a large number of test units today (24-Dec-2010). Please bear with us while we remove the kinks.

More... (http://aqua.dwavesys.com/forum_thread.php?id=692)

Fire$torm
12-24-10, 07:29 PM
Ima crunching away...

trigggl
12-25-10, 12:18 PM
Ima crunching away...

What run time and credits are you seeing?

Beerdrinker
12-25-10, 01:04 PM
What run time and credits are you seeing?


Run time is about 1 hour, with 300-ish credits. Still using all cores on the CPU....

Fire$torm
12-25-10, 02:07 PM
What run time and credits are you seeing?



Task     Work unit  Computer  Run time (sec)  CPU time (sec)  Credit
6069460  5204337    64219     2,626.30        16,178.01       321.99
6069452  5204329    64219     2,607.16        16,201.94       321.99
6068204  5203084    64219     2,613.98        16,237.27       321.99
6068201  5203081    64219     2,596.92        16,183.92       321.99
6066977  5201862    64219     2,623.04        16,194.01       321.99
6066970  5201855    64219     2,597.17        16,195.88       321.99
6065740  5200631    64219     2,628.89        16,204.12       321.99
6065732  5200623    64219     2,611.75        16,212.28       321.99
6064432  5199328    64219     2,622.41        16,272.29       321.99
6064416  5199312    64219     2,627.56        16,145.71       321.99
6063240  5198137    64219     2,619.42        16,205.12       321.99
6063234  5198131    64219     2,618.19        16,231.45       321.99
6057428  5192356    64219     2,687.39        16,459.97       321.99
6057415  5192343    64219     2,704.05        17,176.54       321.99
6057407  5192335    64219     2,605.44        16,272.60       321.99
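
For anyone comparing these numbers against other projects, here is a minimal Python sketch that turns run time, CPU time, and credit into per-hour rates. It uses only the first three rows of the table above; the function name `summarize` is made up for illustration. The ratio of CPU time to run time is only a rough hint at how many threads were busy on that host.

```python
# (run_time_s, cpu_time_s, credit) copied from the first three rows of the table above.
tasks = [
    (2626.30, 16178.01, 321.99),
    (2607.16, 16201.94, 321.99),
    (2613.98, 16237.27, 321.99),
]


def summarize(run_s, cpu_s, credit):
    """Return (busy threads, credit per wall-clock hour, credit per CPU hour)."""
    busy_threads = cpu_s / run_s             # CPU seconds accumulated per wall second
    per_wall_hour = credit / (run_s / 3600)  # what the whole machine earns per hour
    per_cpu_hour = credit / (cpu_s / 3600)   # what a single thread earns per hour
    return busy_threads, per_wall_hour, per_cpu_hour


for run_s, cpu_s, credit in tasks:
    threads, wall_rate, cpu_rate = summarize(run_s, cpu_s, credit)
    print(f"threads~{threads:.1f}  credit/wall-hour={wall_rate:.0f}  credit/cpu-hour={cpu_rate:.0f}")
```

Credit per CPU hour is probably the fairer figure when comparing against single-threaded projects, since this app keeps several threads busy at once.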

trigggl
12-25-10, 04:55 PM
What OS? [more characters]

Fire$torm
12-26-10, 12:25 AM
What OS? [more characters]

I'm running Win7 Home Premium x64 [More characters indeed :P]

Mike029
12-26-10, 03:17 PM
300 stones for 4 cores in an hr? How is that stacking up against other jobs? Aqua still the place for cpu?

Fire$torm
12-26-10, 03:47 PM
300 stones for 4 cores in an hr? How is that stacking up against other jobs? Aqua still the place for cpu?

The credits per hour or day for this sub-project are a fraction of the other Aqua sub-projects. Me no like. :mad:

trigggl
12-26-10, 06:18 PM
On a 64-bit dual-core Linux box, it's taking hours (as in, more than 6) and giving me a measly 600. Let me know when they decide to give credit for these things.

Mike029
12-27-10, 02:23 PM
I'm noticing that my quad core is jittery when running this Aqua app. In the settings I have it set to use 100% of the processors and 80% of CPU time, like always, but now the mouse gets stuck and takes a second to move. For the GPU, I'm running Collatz, which should not be an issue. Anyone have any ideas?

Fogle
04-10-11, 03:26 PM
Argh, I've been running an Aqua WU for 22 hours now and it's only 51% done. I hate to abort and lose 22 hours of crunching, but I don't know if it will ever finish. What should I do? :confused:

Maxwell
04-10-11, 04:17 PM
Argh, I've been running an Aqua WU for 22 hours now and it's only 51% done. I hate to abort and lose 22 hours of crunching, but I don't know if it will ever finish. What should I do? :confused:
Let it keep running. Aqua WUs take a notoriously long time. And you'll be well compensated. Don't panic.

Fogle
04-10-11, 07:32 PM
Well, I got a computational error after 23 hours. I would advise our team to stay away until this gets more stable. I lost a lot of production on this project. :mad:

dan
04-10-11, 08:37 PM
I have 28 hours with 6 threads now and it's only 1.14% done. My other box is at 5% and 35 hours of processing.

JerWA
04-10-11, 10:52 PM
% done and ETA are irrelevant to the work it's doing, it's a wrapper app so BOINC has no idea.

As for the errors check your logs and see if you're hitting the disk usage issue. Then go tell AQUA about it, so they can actually fix it.

I'm still not sure what I did differently as I've not dropped a ROQS unit since early last month, and that's the only one ever (and was unrelated to the current errors it seems).

Edit: Out of curiosity are you guys time sharing AQUA and other apps? Pausing a lot? Exiting BOINC? Restarting? etc?

The two machines I have running this full time do not time share (due to scheduler problems with multi-core apps), never pause, reboot, etc or at least very very rarely.

dan
04-11-11, 05:37 AM
% done and ETA are irrelevant to the work it's doing, it's a wrapper app so BOINC has no idea.

As for the errors check your logs and see if you're hitting the disk usage issue. Then go tell AQUA about it, so they can actually fix it.

I'm still not sure what I did differently as I've not dropped a ROQS unit since early last month, and that's the only one ever (and was unrelated to the current errors it seems).

Edit: Out of curiosity are you guys time sharing AQUA and other apps? Pausing a lot? Exiting BOINC? Restarting? etc?

The two machines I have running this full time do not time share (due to scheduler problems with multi-core apps), never pause, reboot, etc or at least very very rarely.

I've had Aqua tasks that take between 50 and 100 hours, but they usually pay around 20,000+ credits. I've never seen one at 20 hours give 1.14% complete.

What does pausing, rebooting, etc. have to do with performance? Context switching does incur a performance hit, but nothing really big, like you're suggesting.

Dan

Mike029
04-11-11, 08:32 AM
% done and ETA are irrelevant to the work it's doing, it's a wrapper app so BOINC has no idea.

As for the errors check your logs and see if you're hitting the disk usage issue. Then go tell AQUA about it, so they can actually fix it.

I'm still not sure what I did differently as I've not dropped a ROQS unit since early last month, and that's the only one ever (and was unrelated to the current errors it seems).

Edit: Out of curiosity are you guys time sharing AQUA and other apps? Pausing a lot? Exiting BOINC? Restarting? etc?

The two machines I have running this full time do not time share (due to scheduler problems with multi-core apps), never pause, reboot, etc or at least very very rarely.
Some of my boxes are used from time to time by the family and shut down at night. That may be causing the WU failures. I'll have to try this out while running Aqua.

JerWA
04-11-11, 11:30 AM
I've had Aqua tasks that take between 50 and 100 hours, but they usually pay around 20,000+ credits. I've never seen one at 20 hours give 1.14% complete.

What does pausing, rebooting, etc. have to do with performance? Context switching does incur a performance hit, but nothing really big, like you're suggesting.

Dan
Not performance related. But pausing, or stopping, an app means it has to save state. Plenty of apps over the years have had problems doing this (including some that wouldn't at all, so if you interrupted it in any way it would reset to 0% and start over). Just wondering if the problem with the file growing over its size limit could be related to the app doing this process wrong and leaving behind a little more junk every time it happens.

Total conjecture, was just curious. :)

Oh, and the % complete thing isn't related to the time at all. % and ETA require the app to tell the BOINC framework certain things. If the app isn't BOINC integrated enough to do that, you get no progress and ETA is BOINC making wild guesses without any data to back it up. This app isn't communicating with BOINC very well yet, so the % done is absolutely unrelated to how much work the app has done. Keep in mind that this isn't "just another AQUA" app, usually their apps are way better behaved than this one. Until they put it into the app, the % is not telling you anything at all. Most of my machines sit at 1% for more than 80% of the time it will actually take for the WU to finish. Then they update to something like 42%, which is wrong, and then finish.
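
To show why the reported percentages look so alarming, here is a small Python sketch of the naive linear extrapolation people tend to do in their heads. The function name `naive_hours_remaining` is made up, and the inputs are the figures posted earlier in this thread. The estimate is only as good as the % done it is fed, which for this app is not tied to the real work done.

```python
def naive_hours_remaining(elapsed_hours, percent_done):
    """Linear extrapolation: assumes percent_done tracks the real work done.

    For the ROQS wrapper app that assumption does not hold, so treat the
    result as an illustration of the arithmetic, not as a prediction.
    """
    fraction = percent_done / 100.0
    if fraction <= 0:
        raise ValueError("percent_done must be greater than 0")
    return elapsed_hours * (1.0 - fraction) / fraction


# (poster, elapsed hours, reported % done) as posted in this thread.
reports = [("Fogle", 22, 51.0), ("dan", 28, 1.14)]

for name, hours, pct in reports:
    remaining = naive_hours_remaining(hours, pct)
    print(f"{name}: about {remaining:.0f} more hours by naive extrapolation")
```

The second report extrapolates to thousands of hours, which is a good sign that the percentage, not the work unit, is what is broken.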

DrPop
04-11-11, 03:16 PM
Alright, well looks like it's an AQUA project problem with those WUs. Lame. I lost quite a few days worth of credits on this myself. Sorry for steering anyone wrong when they obviously need to work a few kinks out of it yet. I guess we need to switch to other AQUA sub projects or back to Primaboinca.

rgathright
04-11-11, 03:56 PM
UPDATE!

Looks like version 0.18 fixes our problems?

http://aqua.dwavesys.com/forum_thread.php?id=788&nowrap=true#10859

Alright, well looks like it's an AQUA project problem with those WUs. Lame. I lost quite a few days worth of credits on this myself. Sorry for steering anyone wrong when they obviously need to work a few kinks out of it yet. I guess we need to switch to other AQUA sub projects or back to Primaboinca.

dan
04-11-11, 09:22 PM
My 1.14% Aqua task is now at 2.275%. Plus it's now eating 5.3 GB of memory. Ouch!!!

Reset both computers running Aqua. 100 hrs on 6 threads each is wasted!!!

rgathright
04-13-11, 08:05 AM
My 1.14% Aqua task is now at 2.275%. Plus it's now eating 5.3 GB of memory. Ouch!!!

Reset both computers running Aqua. 100 hrs on 6 threads each is wasted!!!

Welcome aboard!

I feel your pain, it was so tough watching all my unreported credits go to waste with ANDROINC.

DrPop
04-13-11, 12:59 PM
Please try detaching and reattaching and run the Ver. 19 WUs. This should have fixed their problems.

JerWA
04-13-11, 01:48 PM
Or you can just abort any .17 or .18 WUs you have, it will automatically d/l the .19 app when it gets a .19 WU.

Fire$torm
04-13-11, 03:11 PM
Please try detaching and reattaching and run the Ver. 19 WUs. This should have fixed their problems.

Not trying to berate you Doc (Lies Lies Lies.... :)) J/K). I've posted this elsewhere - Detaching and reattaching can sometimes mess up your stats. I have found it is better to abort all tasks for the project you wish to reset, shut down BOINC, find its folder and delete it. When you restart BOINC the project will automatically DL all necessary files including new apps if available. The absolute best way to guarantee a full reset.

Side note: It is sometimes necessary to reset all project debts to zero using the <zero_debts>1</zero_debts> option in your cc_config.xml file. Similar to my instructions above, you will need to shut down BOINC, add the option to the cc_config.xml file using any text editor, like Notepad or gedit (for Linux), and restart BOINC. Then do a manual update for all projects in your list, shut down BOINC, remove the zero_debts option from cc_config.xml, and finally restart BOINC.

The default installation locations for BOINC Manager are as follows:

BOINC Client Directory: C:\Program Files\BOINC
BOINC Data Directory: C:\Documents and Settings\All Users\Application Data\BOINC

If the cc_config.xml does not exist then you can create one with a text editor.
Here is mine for reference:

<cc_config>
  <options>
    <start_delay>30</start_delay>
    <network_test_url>http://www.yahoo.com/</network_test_url>
    <dont_check_file_sizes>1</dont_check_file_sizes>
    <report_results_immediately>1</report_results_immediately>
  </options>
</cc_config>
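
If hand-editing the file feels error-prone, the add-then-remove step for zero_debts described above could be scripted. This is a rough Python sketch using only the standard library; the function name `set_zero_debts` is made up, and it assumes the option belongs inside the <options> element as in the reference config. BOINC should be shut down before running it, and the file path is whatever applies to your install.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def set_zero_debts(config_path, enabled):
    """Add <zero_debts>1</zero_debts> to cc_config.xml, or remove it again.

    Hypothetical helper: creates a minimal file if none exists and leaves
    every other option untouched.
    """
    path = Path(config_path)
    if path.exists():
        tree = ET.parse(str(path))
        root = tree.getroot()
    else:
        root = ET.Element("cc_config")
        tree = ET.ElementTree(root)

    options = root.find("options")
    if options is None:
        options = ET.SubElement(root, "options")

    existing = options.find("zero_debts")
    if enabled:
        if existing is None:
            existing = ET.SubElement(options, "zero_debts")
        existing.text = "1"
    elif existing is not None:
        options.remove(existing)

    tree.write(str(path), encoding="unicode")
```

Usage would be `set_zero_debts(path, True)` before the restart and manual update, then `set_zero_debts(path, False)` before the final restart.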

rgathright
04-13-11, 04:28 PM
Nice write up Fire$torm!

I am running 0.19 right now on my boxes and hope to see better results tomorrow.

We are helping them so much with this BETA project. Everyone, please keep posting on the Androinc forums to show your support for the system admins there. We get a lot of positive visibility for this.

dan
04-13-11, 05:55 PM
The default installation locations for BOINC Manager are as follows:

BOINC Client Directory: C:\Program Files\BOINC
BOINC Data Directory: C:\Documents and Settings\All Users\Application Data\BOINC


One quick addendum: in Win 7, the BOINC data is in C:\ProgramData\BOINC. ProgramData is also a hidden folder.

Dan
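
Since the data directory differs between Windows versions, a quick way to find it is to check each known default in turn. This is a minimal Python sketch; the function name `find_data_dir` is made up, and the list holds only the two locations mentioned in this thread, so a custom install path would have to be added by hand.

```python
from pathlib import Path

# Default data directories quoted in this thread; other installs may differ.
CANDIDATE_DATA_DIRS = [
    r"C:\ProgramData\BOINC",                                        # Win 7 (hidden folder)
    r"C:\Documents and Settings\All Users\Application Data\BOINC",  # older Windows
]


def find_data_dir(candidates=CANDIDATE_DATA_DIRS):
    """Return the first candidate that exists as a directory, else None."""
    for candidate in candidates:
        if Path(candidate).is_dir():
            return Path(candidate)
    return None


print(find_data_dir())
```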

Fire$torm
04-13-11, 07:22 PM
One quick addendum: in Win 7, the BOINC data is in C:\ProgramData\BOINC. ProgramData is also a hidden folder.

Dan

Thanks. I forgot about Win7. I always unhide my system folders right after an OS install so I tend to forget that too.

DrPop
04-13-11, 07:36 PM
Ah, thanks F$. I didn't know about that method - seems like a better route. I'll do that on the next one. :D