What's with the batches of very short deadlines? |
Message boards : Number crunching : What's with the batches of very short deadlines?
Author | Message |
---|---|
JerWA Joined: 1 Jul 09 Posts: 7 Credit: 131,358 RAC: 0 |
It seems to be getting more and more common that I get a "normal" batch of WUs with 10+ day deadlines, and then throughout the day, as the queue refills, it gets bursts of WUs with 24-hour deadlines. By the end of the day I've got nothing but those short-deadline WUs. Normally I wouldn't care, but I share CPU time across multiple projects, including a GPU project that BOINC thinks is a CPU project, so it needs to "trigger" the WU to start, then it offloads to the GPU. My CPU is quad core, and I usually run it so that there is never any task switching; each project gets a core. When I get more than 2 or 3 of these short-deadline Enigma units, however, the BOINC Manager panics, stops everything else (including the GPU tasks that only use 5 seconds of CPU time and even then don't max the core), and runs nothing but Enigma on all 4 cores until those WUs are gone. Again, irritating (and underhanded if intentional), but not a deal breaker. Except when it just keeps refilling my WU queue with those WUs, starting a cycle of never-ending scheduler mayhem. Granted, I've only been running this project since the start of the month because it's my team's project of the month, but it's been nothing but a headache requiring constant monitoring and micro-managing to keep everything playing nicely. Having to manually suspend tasks so that other applications can run as intended, and then having to resume tasks one at a time, is extremely obnoxious and time-consuming. Is there any chance that this BOINC scheduler mayhem could be stopped, pretty please? It seems a bit unfair to monopolize resources by bypassing the normal scheduling, since these "high priority" WUs break resource share, timed application switching, work queuing for all projects, and debts as BOINC tries to even out the mayhem caused by those WUs. Especially when the standard WU has a 10+ day deadline. What's so magical about these other WUs that they have to be done in 24 hours? Thanks for your time. |
quel Joined: 19 May 09 Posts: 34 Credit: 32,923,471 RAC: 0 |
The high priority units are generally flagged that way because Enigma@Home is a BOINC wrapper around the main project. It checks out WUs from the M4 project and has to get them back by a deadline. What generally happens is that a WU times out (the user dropped the project, didn't compute it in time, etc.) and it becomes urgent to get the unit done; otherwise it is wasted CPU time, because by the time the result gets back to the M4 server it is rejected as a duplicate. |
JerWA Joined: 1 Jul 09 Posts: 7 Credit: 131,358 RAC: 0 |
I know the project is just a wrapper, but I find it unlikely that all of these WUs are that close to timeout. The 24 hours is arbitrary, as it is based on when the work is created. For example, last night I watched it getting new WUs and every one of them had a deadline of exactly 24 hours. To the minute. So it's not like the deadline was fixed; it was being made up based on when the WU was created or when it was sent, or something. So why do most of them get 10+ days, and then groups of them get 24 hours from the time of creation? If the scheduler here is aware enough of some situation that warrants "high priority" handling, then it should also be made aware of the distribution of such units so that it doesn't stack them one on top of the other for hours on end. Since the time of my previous post, every single Enigma WU I've gotten on this machine (ironically the only one time-sharing) has had a deadline of 24 hours from creation. Conversely, the computer sitting 2 feet away has a WU queue of 20 or so, all due 08/01. If all of these WUs are being automatically given short deadlines because other people didn't finish them before their arbitrary deadline, maybe that's a hint that something is wrong with the deadlines. Like I said previously, I get enough of them that aborting them or babysitting them individually is the only way to keep the manager running correctly. I can't be alone in this, though I imagine most people would have just dropped the project that was causing problems... this one. I'm trying to avoid that in an effort to support my team, and that's why I've got another project active again. It helps keep the short Enigma WUs in check, but even then it's just a delaying tactic. At least once a day I have to go in, clear things out, suspend the rest, and watch every WU until it's back in balance. 
When Enigma was the only thing running, sharing time only with the GPU app, it would download enough short-deadline units that nothing but Enigma would run, even with a resource share of 1 compared to the GPU project's share of 200. I added back one more project to help keep that from happening, but like I said it still does; it just takes longer before it digs that deep a hole. |
mdoerner Volunteer developer Volunteer tester Joined: 30 Jul 08 Posts: 202 Credit: 6,998,388 RAC: 0 |
There is only one project, THIS PROJECT!!!! You gotta problem with that??!???!? ;-) Mike Doerner |
JerWA Joined: 1 Jul 09 Posts: 7 Credit: 131,358 RAC: 0 |
And in other unrelated equally useful news... Thanks for your feedback. Anyone with some constructive input (and not about the smiley)? |
TJM Project administrator Project developer Project scientist Joined: 25 Aug 07 Posts: 843 Credit: 267,994,998 RAC: 0 |
Most of the high priority workunits were resends caused by the download server failure which happened not so long ago. The deadline was getting closer and closer, so the server started sending them out as high priority. There were around 240k such workunits in the DB; I've removed them manually. There's also a significant number of 'normal' high priority results. I'll try to tweak the server code a bit to handle resend task deadlines in a better way. Right now the code sets high priority for a resend task without checking how close the deadline is. M4 Project homepage M4 Project wiki |
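
The check TJM describes as missing could look roughly like the sketch below. This is an illustration only, not the project's actual server code: the function name `resend_deadline`, the constants `NORMAL_WINDOW` and `URGENT_THRESHOLD`, and the idea of passing in the upstream (M4) deadline as a timestamp are all assumptions made for the example. The 10-day and 24-hour values are taken from the deadlines mentioned in the thread.

```python
from datetime import datetime, timedelta

# Hypothetical values, based on the deadlines reported in the thread.
NORMAL_WINDOW = timedelta(days=10)      # usual deadline given to a fresh task
URGENT_THRESHOLD = timedelta(hours=24)  # below this, a resend is treated as urgent


def resend_deadline(now, upstream_deadline,
                    normal_window=NORMAL_WINDOW,
                    urgent_threshold=URGENT_THRESHOLD):
    """Return (deadline, high_priority) for a task that is being resent.

    `upstream_deadline` is assumed to be the time by which the result
    must be returned to the upstream project.
    """
    remaining = upstream_deadline - now
    # Never promise more time than the upstream deadline allows.
    deadline = min(now + normal_window, upstream_deadline)
    # Only flag the task as urgent when the upstream deadline is actually close,
    # instead of flagging every resend.
    high_priority = remaining <= urgent_threshold
    return deadline, high_priority


if __name__ == "__main__":
    now = datetime(2009, 7, 20, 12, 0)
    for slack in (timedelta(days=15), timedelta(days=3), timedelta(hours=12)):
        print(slack, resend_deadline(now, now + slack))
```

With a rule like this, a resend that still has several days of slack keeps an ordinary deadline, and only tasks that are genuinely about to expire upstream are sent out as high priority.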