Error while computing |
Message boards : Number crunching : Error while computing
Author | Message |
---|---|
Sami Joined: 8 Apr 10 Posts: 8 Credit: 11,933,109 RAC: 0 |
What can I do to avoid the "error while computing" situation? The reason is shown here: Stderr output <core_client_version>6.12.26</core_client_version> <![CDATA[ <message> Maximum elapsed time exceeded </message> <stderr_txt> wrapper: starting Unrecognized XML in parse_init_data_file: userid Skipping: 0 Skipping: /userid Unrecognized XML in parse_init_data_file: teamid Skipping: 0 Skipping: /teamid Unrecognized XML in parse_init_data_file: hostid Skipping: 26098 Skipping: /hostid Unrecognized XML in parse_init_data_file: result_name Skipping: rxpsb70-p5_0_13708584_1130_0 Skipping: /result_name Unrecognized XML in parse_init_data_file: starting_elapsed_time Skipping: 0.000000 Skipping: /starting_elapsed_time running enigma2_0.76_i686-apple-darwin wrapper: running ../../projects/www.enigmaathome.net/enigma2_0.76_i686-apple-darwin (-R) 2011-06-15 19:29:54 enigma: working on range ... </stderr_txt> ]]> This only happens on the Mac, not on Windows. The Mac host is 26098. The Mac estimates that a WU will be done in 35 seconds(!), but it actually takes about 50 minutes, give or take a few minutes. The event log shows this line: Ke 15 Kes 20:54:53 2011 | Enigma@Home | Aborting task rxpsb70-p5_0_13708582_1114_0: exceeded elapsed time limit 2547.63 (1000000.00G/392.52G) |
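The numbers in that event log line can be checked by hand. The sketch below assumes, based only on the "(1000000.00G/392.52G)" part of the message, that the limit is the task's operation bound divided by the speed the client assumes for the host. The variable and function names are made up for illustration.

```python
# Figures copied from the event log line above. The formula is an assumption
# inferred from the "(1000000.00G/392.52G)" part of that line.
FPOPS_BOUND = 1_000_000.00e9   # first number in the parentheses, in operations
ASSUMED_FLOPS = 392.52e9       # second number, in operations per second


def elapsed_time_limit(fpops_bound: float, flops: float) -> float:
    """Seconds a task may run before it is aborted (assumed formula)."""
    return fpops_bound / flops


limit = elapsed_time_limit(FPOPS_BOUND, ASSUMED_FLOPS)
actual_runtime = 50 * 60  # about 50 minutes, as reported in the post

print(f"limit: {limit:.1f} s, actual: {actual_runtime} s")
print("aborted" if actual_runtime > limit else "would finish")
```

Under this assumption the limit comes out at roughly 2548 seconds, which matches the log up to rounding of the displayed figures. A task that needs about 3000 seconds is therefore always cut off before it can finish.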
Ageless Volunteer moderator Volunteer tester Joined: 11 Sep 07 Posts: 104 Credit: 155,932 RAC: 0 |
Can you please check the value of the Duration Correction Factor (DCF) on that computer? You can find it in the details of the computer, down near the bottom. Mine reads: Task duration correction factor 1.10601 Like my figure, it should be around 1. Please check first; I can give instructions on how to reset this value when you return with an answer. Jord. BOINC FAQ Service. |
Sami Joined: 8 Apr 10 Posts: 8 Credit: 11,933,109 RAC: 0 |
DCF is 0,8033. It has been around 1 before without problems. I did upgrade the OS and BOINC, but that has nothing to do with this, or does it? Should BOINC set the DCF value itself? |
Ageless Volunteer moderator Volunteer tester Joined: 11 Sep 07 Posts: 104 Credit: 155,932 RAC: 0 |
BOINC normally sets the DCF value itself, and with each task it completes it adjusts the value up or down, as this is part of how BOINC learns how long tasks actually take. A DCF of 0.8 isn't too bad, but let's reset it anyway, just to check whether that fixes the estimated run time for you. First exit BOINC completely: BOINC Manager->Advanced view->File->Exit->Check "Stop running science applications when exiting the Manager?"->OK. Then navigate to your BOINC Data directory, which on OS X is by default at /Library/Application Support/BOINC Data/ Edit the file called client_state.xml with a simple text editor. There is no need for a specialized XML editor: the XML that BOINC uses was developed especially for BOINC, and XML editors won't know how to deal with it. Search in it for Enigma, then stop searching and read down through the lines to the <duration_correction_factor>X</duration_correction_factor> line. Change the number there from 0.8033 to 1.000000 (mind that you use a decimal point, not a comma!). Make sure not to change anything else! Save the client_state.xml file. Restart BOINC. Now let it fetch work from here. What are the estimates on the work time, now? Jord. BOINC FAQ Service. |
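For anyone who prefers a script to a text editor, the manual edit above can be sketched in Python. This is only a sketch under assumptions: it assumes each project sits in its own <project> ... </project> block that mentions the project's URL and contains one <duration_correction_factor> line, and the function name reset_dcf and the sample text are made up for illustration. It uses plain text matching rather than an XML parser, in line with the advice that generic XML tools may not cope with the file. BOINC must be fully stopped before the real file is touched, and a backup copy is a good idea.

```python
import re


def reset_dcf(state_text: str, project_hint: str = "enigma") -> str:
    """Set duration_correction_factor to 1.000000 in matching <project> blocks."""

    def fix_project(match: re.Match) -> str:
        block = match.group(0)
        if project_hint.lower() not in block.lower():
            return block  # leave other projects untouched
        return re.sub(
            r"(<duration_correction_factor>)[^<]*(</duration_correction_factor>)",
            r"\g<1>1.000000\g<2>",
            block,
        )

    return re.sub(r"<project>.*?</project>", fix_project, state_text, flags=re.S)


# Made-up sample for illustration, not a real client_state.xml.
sample = """<client_state>
<project>
    <master_url>http://www.enigmaathome.net/</master_url>
    <duration_correction_factor>0.803300</duration_correction_factor>
</project>
<project>
    <master_url>http://example.org/other/</master_url>
    <duration_correction_factor>1.250000</duration_correction_factor>
</project>
</client_state>"""

fixed = reset_dcf(sample)
print(fixed)
```

Only the block that mentions the hint is changed, which mirrors the "make sure not to change anything else" warning. To apply it to a real file, read the file into a string, pass it through the function, and write the result back.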
Sami Joined: 8 Apr 10 Posts: 8 Credit: 11,933,109 RAC: 0 |
"Now let it fetch work from here. What are the estimates on the work time, now?" Estimates are now 44 seconds, which is a big underestimate. Settings I had when requesting new work: - Number of usable CPUs has changed from 2 to 1. - Cache is default (0.25 days) - BOINC says DCF is exactly 1 Guess how many WUs I got? Pe 17 Kes 19:56:02 2011 | Enigma@Home | Scheduler request completed: got 248 new tasks I now have work for at least 4-5 days. I will soon know whether these end up in the "error while computing" state. Maybe DCF should be 8.0 or more? (Sorry about my poor English) |
Sami Joined: 8 Apr 10 Posts: 8 Credit: 11,933,109 RAC: 0 |
No luck, still the same error :-( I tried a higher DCF value: it is now 20. Now BOINC estimates that a WU will be done in 14 minutes 42 seconds. That is still only about 1/4 of the real calculating time. |
Sami Joined: 8 Apr 10 Posts: 8 Credit: 11,933,109 RAC: 0 |
A DCF value of 20 did not help. The task is still being aborted: Pe 17 Kes 22:31:11 2011 | Enigma@Home | Aborting task rxpsb70-p6_0_13708764_10_0: exceeded elapsed time limit 2547.63 (1000000.00G/392.52G) What next? |
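The figures reported so far show why changing the DCF could not help. The short check below uses only numbers from the posts above: the estimate moved in step with the DCF, while the abort limit in the event log stayed the same. The variable names are made up for illustration.

```python
# Figures reported in the thread.
estimate_at_dcf_1 = 44               # seconds, with DCF reset to 1
estimate_at_dcf_20 = 14 * 60 + 42    # seconds, with DCF set to 20
abort_limit = 2547.63                # seconds, identical in both event log lines
actual_runtime = 50 * 60             # seconds, roughly

scaling = estimate_at_dcf_20 / estimate_at_dcf_1
print(f"estimate grew by a factor of {scaling:.1f}")
print(f"limit still below actual runtime: {abort_limit < actual_runtime}")
```

The estimate grew by about the same factor as the DCF, but the limit of 2547.63 seconds did not move, so tasks that need around 3000 seconds were still cut off.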
Ageless Volunteer moderator Volunteer tester Joined: 11 Sep 07 Posts: 104 Credit: 155,932 RAC: 0 |
Sorry, I forgot all about this thread. I had some other things on my mind. I'll ask TJM to check whether the resource estimates are correct for the Mac. I hope he can spare some time away from his other project. Jord. BOINC FAQ Service. |
Sami Joined: 8 Apr 10 Posts: 8 Credit: 11,933,109 RAC: 0 |
Any new information about this problem? |
TJM Project administrator Project developer Project scientist Joined: 25 Aug 07 Posts: 843 Credit: 267,994,998 RAC: 0 |
Resource estimates are the same for all platforms, as they're hardcoded in the WU template. I think it might be a problem with the core client/manager. Is the benchmarked CPU speed correct? M4 Project homepage M4 Project wiki |
Sami Joined: 8 Apr 10 Posts: 8 Credit: 11,933,109 RAC: 0 |
"I think it might be a problem with the core client/manager - is the benchmarked CPU speed correct ?" Benchmark results are: Ke 24 Elo 12:44:57 2011 | | Running CPU benchmarks Ke 24 Elo 12:44:57 2011 | | Suspending computation - CPU benchmarks in progress Ke 24 Elo 12:45:29 2011 | | Benchmark results: Ke 24 Elo 12:45:29 2011 | | Number of CPUs: 2 Ke 24 Elo 12:45:29 2011 | | 3103 floating point MIPS (Whetstone) per CPU Ke 24 Elo 12:45:29 2011 | | 7329 integer MIPS (Dhrystone) per CPU They seem to be correct. The BOINC version is 6.12.26. If I remember right, this problem started after I upgraded the OS from Leopard to Snow Leopard. At the same time I upgraded BOINC. |
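It may help to put the benchmark next to the speed that appears in the abort message. The comparison below is a rough sketch: it assumes the Whetstone figure can be read as floating point operations per second and that the limit is the operation bound divided by the assumed speed, neither of which is confirmed in this thread. The variable names are made up for illustration.

```python
# Figures from the benchmark output and the abort messages in this thread.
whetstone_flops = 3103e6       # "3103 floating point MIPS (Whetstone) per CPU"
limit_flops = 392.52e9         # denominator shown in the abort message
fpops_bound = 1_000_000.00e9   # numerator shown in the abort message

# How much faster the abort message assumes the host is than the benchmark says.
ratio = limit_flops / whetstone_flops

# Limit that would result if the benchmark speed were used instead (assumption).
limit_from_benchmark = fpops_bound / whetstone_flops

print(f"speed in abort message is about {ratio:.0f}x the benchmark")
print(f"limit at benchmark speed: about {limit_from_benchmark / 3600:.0f} hours")
```

So the benchmark itself looks sane, but the speed used for the time limit is more than a hundred times higher than the benchmark. At the benchmarked speed the limit would be days rather than minutes.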
TJM Project administrator Project developer Project scientist Joined: 25 Aug 07 Posts: 843 Credit: 267,994,998 RAC: 0 |
I think I know what's going on. The wrapper doesn't return CPU time, because it can't read it on the Mac. The BOINC core client probably thinks that the tasks are completed very fast and then underestimates the runtime (isn't the runtime stored somewhere in local files?). The solution would be to rebuild the Mac wrapper. I'll try to get remote access to a Mac with dev tools; perhaps I'll be able to fix the problem. M4 Project homepage M4 Project wiki |
Sami Joined: 8 Apr 10 Posts: 8 Credit: 11,933,109 RAC: 0 |
I noticed that my Mac is crunching Enigma@Home again, so someone has managed to repair whatever the problem was. Thanks :-) |
TJM Project administrator Project developer Project scientist Joined: 25 Aug 07 Posts: 843 Credit: 267,994,998 RAC: 0 |
Nope, nothing was repaired on the app side. Maybe the newer server software just fixed something related to this problem. M4 Project homepage M4 Project wiki |