Thread: Test4Theory (LHC2): several news items + get your account already!

  1. #1
    Diamond Member
    zombie67's Avatar
    Join Date
    October 24th, 2010
    Reno, NV

    Test4Theory (LHC2): several news items + get your account already!


    We are very happy to announce our advance from Alpha-test level to Beta.

    This means an invitation code is no longer needed to join the project, so please tell your friends that we are now open to everyone. We aim to accept about 1000 active Beta testers, with about 2000 PCs running.

    During the Beta period we will be introducing a graphic display to show volunteers what is happening with the CERN jobs that their Virtual Machine is running.

    We are also working on a longer-term enhancement to allow volunteers to get more directly engaged with the science - stay tuned!

    Congratulations and thanks to all our Alpha testers for your patience and help, and welcome to Beta...

    The LHC@home 2.0 team
    So get your account created now!



    Dear All,

    Following unexpected posts on major news sites on the 11th of August, we started seeing exponential growth in the number of connected virtual machines. Every minute several new virtual machines popped up. Just to give you a rough idea: before, when we were only on major tech blogs and news sites, the curve showing the number of active virtual machines looked like this (first graph); after BBC and MSNBC wrote about LHC@home, the shape of the curve in no time became like this (second graph).

    Some of the connected virtual machines apparently had problems with the application software (the causes are still under investigation), so they were getting jobs from our queues, failing to run them, reporting them back, and asking for more jobs. CernVM Co-Pilot (the framework that we use to distribute jobs from CERN to Virtual Machines and gather the results back) normally takes about a couple of seconds to generate a job request in a Virtual Machine, send it over to the server, pick the waiting job from the queue, prepare it, send it for execution, and start running it in the Virtual Machine. So, as you can imagine, with about 1000 active virtual machines, several of which were 'rogue' and were basically doing nothing but draining the job server, our queues became empty very fast. We quickly ended up with queues that were draining faster than our scripts could possibly feed them.

    Normally, this would not have been a problem at all: Virtual Machines were not supposed to jump on the servers all at once when the queues were empty. There is a built-in mechanism which makes them back off exponentially, precisely so that they do not overload our servers. Everything would have been fine if we had not had a bug which prevented our exponential back-off algorithm from working, and instead turned all our virtual machines into cannons firing hundreds of requests per second at our server. Because of that we had to take our servers down and empty the BOINC server queue (which has nothing to do with the Co-Pilot queue). This meant that BOINC clients would not pop up new Virtual Machines anymore.
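
For readers unfamiliar with the idea, exponential back-off as described above can be sketched roughly as follows. This is only an illustration and not the Co-Pilot code: the function names, the base delay, the cap, and the random jitter are all assumptions made for the example.

```python
import random
import time


def backoff_delay(attempt, base=1.0, cap=300.0, rng=random):
    """Return how many seconds to wait before retry number `attempt` (0-based).

    Hypothetical helper: base delay, cap, and jitter are assumed values
    for illustration, not settings taken from CernVM Co-Pilot.
    """
    # Double the ceiling after every failed attempt, but never exceed the cap.
    ceiling = min(cap, base * (2 ** attempt))
    # Pick a random delay below the ceiling so that many clients
    # do not all retry at the same instant.
    return rng.uniform(0, ceiling)


def poll_for_job(fetch_job, max_attempts=8, sleep=time.sleep):
    """Call fetch_job() until it returns a job, backing off between tries.

    `fetch_job` is a hypothetical callable that returns None when the
    queue is empty.
    """
    for attempt in range(max_attempts):
        job = fetch_job()
        if job is not None:
            return job
        # Queue was empty: wait longer and longer before asking again.
        sleep(backoff_delay(attempt))
    return None
```

If the waiting step is skipped, a loop like this asks again as fast as the network allows, which is consistent with the flooding described above.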

    It took a while to figure the problem out. The updated code was pushed into the CernVM File System repository at about 2 AM on August 12th (GMT+2), and at about 10 AM on August 12th the server was configured to reject connections from agents still running the buggy code (those agents still contain the bug and would otherwise keep flooding us). We put an announcement on the forums asking users to reboot (to make sure they pick up the updated code). After that the system started working again. Currently there are about 300 concurrently active machines. These belong to the users who got virtual machines before we emptied the BOINC server queue and who rebooted them after our announcement. We are currently planning to slowly start adding new Work Units to the BOINC server queue (100 at a time), which means that virtual machines should start to pop up on registered users' machines soon.

    Our initial goal was to recruit about 2,000 volunteers (remember, we just wanted to do a Beta test) so that several hundred of them would be active at any time. We carried out alpha tests with about 300 registered volunteers (peaking at up to 100 online volunteers). As you will soon read in the other, more general announcement that we are about to publish on our main page, we are already very close to having 8,000 registered volunteers. We will try to slowly ramp that number up to 10,000, after which we plan to stabilize for a while before future increases.

    We would like to ask everyone to remember that we are still in the Beta testing phase, which means that outages like this are likely to happen again; in fact, we expect them to. To discover and eliminate bugs, we are, together with you, intentionally pushing the system well beyond its limits.

    Last but not least, we would like to thank all of you again for your enthusiasm, help, patience and understanding!

    On behalf of the LHC@home 2.0 Team.

    So get your account created now!
    "Don't confront me with my failures, I had not forgotten them" - Jackson Browne


  2. #2
    Past Administrator
    DrPop's Avatar
    Join Date
    October 13th, 2010
    SoCal, USA

    Re: Test4Theory (LHC2): several news items + get your account already!

    This is only 30 minutes after your post, and account creation is already disabled.

  3. #3
    Diamond Member
    zombie67's Avatar
    Join Date
    October 24th, 2010
    Reno, NV

    Re: Test4Theory (LHC2): several news items + get your account already!

    Quote Originally Posted by DrPop
    This is only 30 minutes after your post, and account creation is already disabled.
    Not surprising. Did you see the two graphs?

    The news about them going public was all over the mainstream news outlets. LHC generates a lot of interest.
    "Don't confront me with my failures, I had not forgotten them" - Jackson Browne


  4. #4
    Platinum Member
    Mumps's Avatar
    Join Date
    October 28th, 2010
    Milwaukee, WI

    Re: Test4Theory (LHC2): several news items + get your account already!

    Well, so much for maintaining a #1 spot there.
