Boinc 7.0.64 and nci uploads, trickles and requests


Claggy
Joined: 29 Jul 12
Posts: 9
Credit: 149,374
RAC: 0

Message 1736 - Posted: 22 May 2013 | 21:36:07 UTC

I sent a bug report to the Boinc_alpha list about a minor bug involving nci (non-CPU-intensive) apps, trickles and work requests.
For every Wu done the project loses up to 15 minutes of sensor data, because Boinc doesn't wait for the upload to complete before asking for work,
and once it has been refused work and the upload completes, it doesn't immediately ask for work again.
(Meaning there isn't any data collected for about two and a half hours in every 24 hour period.)
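
As a rough check on that figure, here is a small sketch of the arithmetic. It assumes a task length of 8054 seconds (the estimated duration the client reports in the log further down, not a measured run time) and the worst-case 15 minute gap after every task; the function name is just for illustration.

```python
# Rough estimate of sensor time lost per day to the idle gap between tasks.
TASK_SECONDS = 8054      # assumption: estimated task duration taken from the client log
GAP_SECONDS = 15 * 60    # assumption: worst-case idle gap before the next work request


def lost_hours_per_day(task_seconds: float, gap_seconds: float) -> float:
    """Hours in every 24 during which no task is running."""
    cycle = task_seconds + gap_seconds       # one task plus the idle gap that follows it
    cycles_per_day = 24 * 3600 / cycle
    return cycles_per_day * gap_seconds / 3600


print(f"{lost_hours_per_day(TASK_SECONDS, GAP_SECONDS):.2f} hours lost per day")
# prints "2.41 hours lost per day"
```

With a gap closer to the 13 to 14 minutes seen in the logs the loss comes out a little lower, but still over two hours a day.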

There has been no response to my bug report. If someone can replicate it and add their tuppence on the Boinc Alpha list, it would be appreciated:

http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_alpha

I have a minor problem with Boinc 7.0.50 and later (7.0.50 is when I started running Boinc 7 on this host) and Radioactive@home's nci app.
When the app finishes, Boinc 7 does its trickle-up and requests work, but that gets refused as the upload is still in progress.
Often after that, Boinc 7.0.64 will sit around for up to 15 minutes before asking again:

21/05/2013 05:38:24 | Radioactive@Home | Computation for task sample_1824982_0 finished
21/05/2013 05:38:24 | Radioactive@Home | [dcf] DCF: 1.192714->1.187009, raw_ratio 1.135663, adj_ratio 0.952167
21/05/2013 05:38:27 | Radioactive@Home | Started upload of sample_1824982_0_0
21/05/2013 05:38:27 | Radioactive@Home | [sched_op] Starting scheduler request
21/05/2013 05:38:27 | Radioactive@Home | Sending scheduler request: To send trickle-up message.
21/05/2013 05:38:27 | Radioactive@Home | Requesting new tasks for CPU
21/05/2013 05:38:27 | Radioactive@Home | [sched_op] CPU work request: 1.00 seconds; 0.00 devices
21/05/2013 05:38:27 | Radioactive@Home | [sched_op] NVIDIA work request: 0.00 seconds; 0.00 devices
21/05/2013 05:38:27 | Radioactive@Home | [sched_op] ATI work request: 0.00 seconds; 0.00 devices
21/05/2013 05:38:31 | Radioactive@Home | Finished upload of sample_1824982_0_0
21/05/2013 05:38:31 | Radioactive@Home | Scheduler request completed: got 0 new tasks
21/05/2013 05:38:31 | Radioactive@Home | [sched_op] Server version 613
21/05/2013 05:38:31 | Radioactive@Home | No tasks sent
21/05/2013 05:38:31 | Radioactive@Home | This computer has reached a limit on tasks in progress
21/05/2013 05:38:31 | Radioactive@Home | Project requested delay of 7 seconds
21/05/2013 05:38:31 | Radioactive@Home | [sched_op] Deferring communication for 7 sec
21/05/2013 05:38:31 | Radioactive@Home | [sched_op] Reason: requested by project

21/05/2013 05:52:15 | Radioactive@Home | [sched_op] Starting scheduler request
21/05/2013 05:52:15 | Radioactive@Home | Sending scheduler request: To fetch work.
21/05/2013 05:52:15 | Radioactive@Home | Reporting 1 completed tasks
21/05/2013 05:52:15 | Radioactive@Home | Requesting new tasks for CPU
21/05/2013 05:52:15 | Radioactive@Home | [sched_op] CPU work request: 1.00 seconds; 0.00 devices
21/05/2013 05:52:15 | Radioactive@Home | [sched_op] NVIDIA work request: 0.00 seconds; 0.00 devices
21/05/2013 05:52:15 | Radioactive@Home | [sched_op] ATI work request: 0.00 seconds; 0.00 devices
21/05/2013 05:52:19 | Radioactive@Home | Scheduler request completed: got 1 new tasks
21/05/2013 05:52:19 | Radioactive@Home | [sched_op] Server version 613
21/05/2013 05:52:19 | Radioactive@Home | Project requested delay of 7 seconds
21/05/2013 05:52:19 | Radioactive@Home | [sched_op] estimated total CPU task duration: 8054 seconds
21/05/2013 05:52:19 | Radioactive@Home | [sched_op] estimated total NVIDIA task duration: 0 seconds
21/05/2013 05:52:19 | Radioactive@Home | [sched_op] estimated total ATI task duration: 0 seconds
21/05/2013 05:52:19 | Radioactive@Home | [sched_op] handle_scheduler_reply(): got ack for task sample_1824982_0
21/05/2013 05:52:19 | Radioactive@Home | [sched_op] Deferring communication for 7 sec
21/05/2013 05:52:19 | Radioactive@Home | [sched_op] Reason: requested by project
21/05/2013 05:52:22 | Radioactive@Home | Starting task sample_1825341_0 using radac version 176 in slot 8
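
For anyone wanting to check their own event log for the same gap, here is a minimal sketch that measures the time from "Computation for task ... finished" to the next "Starting task". It assumes the Boinc 7 log layout shown above (`dd/mm/yyyy hh:mm:ss | project | message`); the function names are made up for this example and the sample is trimmed to four of the lines above.

```python
from datetime import datetime

# Four lines taken from the log above, enough to show one idle gap.
LOG = """\
21/05/2013 05:38:24 | Radioactive@Home | Computation for task sample_1824982_0 finished
21/05/2013 05:38:31 | Radioactive@Home | This computer has reached a limit on tasks in progress
21/05/2013 05:52:15 | Radioactive@Home | Sending scheduler request: To fetch work.
21/05/2013 05:52:22 | Radioactive@Home | Starting task sample_1825341_0 using radac version 176 in slot 8
"""


def parse(line):
    """Split a 'timestamp | project | message' line into (datetime, message)."""
    stamp, _project, message = (part.strip() for part in line.split("|", 2))
    return datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S"), message


def idle_gaps(log_text):
    """Seconds between each finished computation and the next task start."""
    gaps, finished_at = [], None
    for line in log_text.splitlines():
        if line.count("|") < 2:
            continue  # skip blank or malformed lines
        when, message = parse(line)
        if message.startswith("Computation for task") and message.endswith("finished"):
            finished_at = when
        elif message.startswith("Starting task") and finished_at is not None:
            gaps.append((when - finished_at).total_seconds())
            finished_at = None
    return gaps


print(idle_gaps(LOG))  # prints [838.0], i.e. 13 minutes 58 seconds
```

The Boinc 6 log format further down uses a different timestamp and bracket layout, so this sketch would need its own parse step for that.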

Note: the trickle/request and the upload often happen in a different order:

21/05/2013 01:16:38 | Radioactive@Home | Computation for task sample_1824236_0 finished
21/05/2013 01:16:38 | Radioactive@Home | [dcf] DCF: 1.205722->1.198897, raw_ratio 1.137476, adj_ratio 0.943399
21/05/2013 01:16:39 | Radioactive@Home | [sched_op] Starting scheduler request
21/05/2013 01:16:39 | Radioactive@Home | Sending scheduler request: To send trickle-up message.
21/05/2013 01:16:39 | Radioactive@Home | Requesting new tasks for CPU
21/05/2013 01:16:39 | Radioactive@Home | [sched_op] CPU work request: 1.00 seconds; 0.00 devices
21/05/2013 01:16:39 | Radioactive@Home | [sched_op] NVIDIA work request: 0.00 seconds; 0.00 devices
21/05/2013 01:16:39 | Radioactive@Home | [sched_op] ATI work request: 0.00 seconds; 0.00 devices
21/05/2013 01:16:40 | Radioactive@Home | Started upload of sample_1824236_0_0
21/05/2013 01:16:43 | Radioactive@Home | Finished upload of sample_1824236_0_0
21/05/2013 01:16:43 | Radioactive@Home | Scheduler request completed: got 0 new tasks
21/05/2013 01:16:43 | Radioactive@Home | [sched_op] Server version 613
21/05/2013 01:16:43 | Radioactive@Home | No tasks sent
21/05/2013 01:16:43 | Radioactive@Home | This computer has reached a limit on tasks in progress
21/05/2013 01:16:43 | Radioactive@Home | Project requested delay of 7 seconds
21/05/2013 01:16:43 | Radioactive@Home | [sched_op] Deferring communication for 7 sec
21/05/2013 01:16:43 | Radioactive@Home | [sched_op] Reason: requested by project

21/05/2013 01:29:22 | Radioactive@Home | [sched_op] Starting scheduler request
21/05/2013 01:29:22 | Radioactive@Home | Sending scheduler request: To fetch work.
21/05/2013 01:29:22 | Radioactive@Home | Reporting 1 completed tasks
21/05/2013 01:29:22 | Radioactive@Home | Requesting new tasks for CPU
21/05/2013 01:29:22 | Radioactive@Home | [sched_op] CPU work request: 1.00 seconds; 0.00 devices
21/05/2013 01:29:22 | Radioactive@Home | [sched_op] NVIDIA work request: 0.00 seconds; 0.00 devices
21/05/2013 01:29:22 | Radioactive@Home | [sched_op] ATI work request: 0.00 seconds; 0.00 devices
21/05/2013 01:29:26 | Radioactive@Home | Scheduler request completed: got 1 new tasks
21/05/2013 01:29:26 | Radioactive@Home | [sched_op] Server version 613
21/05/2013 01:29:26 | Radioactive@Home | Project requested delay of 7 seconds
21/05/2013 01:29:26 | Radioactive@Home | [sched_op] estimated total CPU task duration: 8123 seconds
21/05/2013 01:29:26 | Radioactive@Home | [sched_op] estimated total NVIDIA task duration: 0 seconds
21/05/2013 01:29:26 | Radioactive@Home | [sched_op] estimated total ATI task duration: 0 seconds
21/05/2013 01:29:26 | Radioactive@Home | [sched_op] handle_scheduler_reply(): got ack for task sample_1824236_0
21/05/2013 01:29:26 | Radioactive@Home | [sched_op] Deferring communication for 7 sec
21/05/2013 01:29:26 | Radioactive@Home | [sched_op] Reason: requested by project

Boinc 6.10.58 used to do it in two requests, one after the other, and 20 secs after the task finished a new Wu would start:

22-Feb-2013 19:31:14 [Radioactive@Home] Computation for task sample_1500438_0 finished
22-Feb-2013 19:31:14 [Radioactive@Home] [dcf] DCF: 1.008261->1.012175, raw_ratio 1.047396, adj_ratio 1.038814
22-Feb-2013 19:31:17 [Radioactive@Home] Started upload of sample_1500438_0_0
22-Feb-2013 19:31:17 [Radioactive@Home] [sched_op_debug] Starting scheduler request
22-Feb-2013 19:31:17 [Radioactive@Home] Sending scheduler request: To send trickle-up message.
22-Feb-2013 19:31:17 [Radioactive@Home] Requesting new tasks for CPU
22-Feb-2013 19:31:17 [Radioactive@Home] [sched_op_debug] CPU work request: 1.00 seconds; 0.00 CPUs
22-Feb-2013 19:31:17 [Radioactive@Home] [sched_op_debug] NVIDIA GPU work request: 0.00 seconds; 0.00 GPUs
22-Feb-2013 19:31:17 [Radioactive@Home] [sched_op_debug] ATI GPU work request: 0.00 seconds; 0.00 GPUs
22-Feb-2013 19:31:18 [Radioactive@Home] Finished upload of sample_1500438_0_0
22-Feb-2013 19:31:19 [Radioactive@Home] Scheduler request completed: got 0 new tasks
22-Feb-2013 19:31:19 [Radioactive@Home] [sched_op_debug] Server version 613
22-Feb-2013 19:31:19 [Radioactive@Home] Message from server: No tasks sent
22-Feb-2013 19:31:19 [Radioactive@Home] Message from server: This computer has reached a limit on tasks in progress
22-Feb-2013 19:31:19 [Radioactive@Home] Project requested delay of 7 seconds
22-Feb-2013 19:31:19 [Radioactive@Home] [sched_op_debug] Deferring communication for 7 sec
22-Feb-2013 19:31:19 [Radioactive@Home] [sched_op_debug] Reason: requested by project

22-Feb-2013 19:31:30 [Radioactive@Home] [sched_op_debug] Starting scheduler request
22-Feb-2013 19:31:30 [Radioactive@Home] Sending scheduler request: To report completed tasks.
22-Feb-2013 19:31:30 [Radioactive@Home] Reporting 1 completed tasks, requesting new tasks for CPU
22-Feb-2013 19:31:30 [Radioactive@Home] [sched_op_debug] CPU work request: 1.00 seconds; 0.00 CPUs
22-Feb-2013 19:31:30 [Radioactive@Home] [sched_op_debug] NVIDIA GPU work request: 0.00 seconds; 0.00 GPUs
22-Feb-2013 19:31:30 [Radioactive@Home] [sched_op_debug] ATI GPU work request: 0.00 seconds; 0.00 GPUs
22-Feb-2013 19:31:32 [Radioactive@Home] Scheduler request completed: got 1 new tasks
22-Feb-2013 19:31:32 [Radioactive@Home] [sched_op_debug] Server version 613
22-Feb-2013 19:31:32 [Radioactive@Home] Project requested delay of 7 seconds
22-Feb-2013 19:31:32 [Radioactive@Home] [sched_op_debug] estimated total CPU job duration: 6922 seconds
22-Feb-2013 19:31:32 [Radioactive@Home] [sched_op_debug] estimated total NVIDIA GPU job duration: 0 seconds
22-Feb-2013 19:31:32 [Radioactive@Home] [sched_op_debug] estimated total ATI GPU job duration: 0 seconds
22-Feb-2013 19:31:32 [Radioactive@Home] [sched_op_debug] handle_scheduler_reply(): got ack for result sample_1500438_0
22-Feb-2013 19:31:32 [Radioactive@Home] [sched_op_debug] Deferring communication for 7 sec
22-Feb-2013 19:31:32 [Radioactive@Home] [sched_op_debug] Reason: requested by project
22-Feb-2013 19:31:34 [Radioactive@Home] Starting sample_1500753_0
22-Feb-2013 19:31:34 [Radioactive@Home] [cpu_sched] Starting sample_1500753_0 (initial)
22-Feb-2013 19:31:34 [Radioactive@Home] Starting task sample_1500753_0 using radac version 175

I see three possible changes:
make Boinc 7 wait for the upload to complete before doing the trickle/request (you wouldn't want to do that with CPDN uploads, just nci uploads),
make Boinc 7 request again straight away like Boinc 6.10.58 would,
or get Radioactive@home to allow two tasks in progress, so the request wouldn't get refused.


Claggy

Claggy
Joined: 29 Jul 12
Posts: 9
Credit: 149,374
RAC: 0

Message 1737 - Posted: 22 May 2013 | 21:44:22 UTC - in response to Message 1736.
Last modified: 22 May 2013 | 22:31:22 UTC

I take that back. David has just responded:

Wed May 22 14:34:32 PDT 2013:

Fixed.
-- David


Edit:

http://boinc.berkeley.edu/trac/changeset/73bd46c3fa3e3bb4d687c4964a6ddff65ec0e4f8/boinc-v2#

client: don't ask an NCI project for work if current job still uploading

Note: we currently assume NCI projects have only 1 app.
Removing this assumption would be a little work.
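
To illustrate what that commit message describes, here is a rough Python sketch of the idea. It is not the actual client code (the Boinc client is C++, and I haven't copied anything from the changeset); the `Task` and `Project` classes and `may_request_work` are hypothetical names for this example only.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    computation_done: bool = False
    upload_done: bool = False


@dataclass
class Project:
    name: str
    non_cpu_intensive: bool
    tasks: list = field(default_factory=list)


def may_request_work(project: Project) -> bool:
    """Hypothetical check: don't ask an NCI project for work while a finished task is still uploading."""
    if not project.non_cpu_intensive:
        return True  # ordinary projects are not affected by this rule
    return not any(t.computation_done and not t.upload_done for t in project.tasks)


# Usage: the request is held back until the upload has finished.
task = Task("sample_1824982_0", computation_done=True)
radac = Project("Radioactive@Home", non_cpu_intensive=True, tasks=[task])
print(may_request_work(radac))  # prints False
task.upload_done = True
print(may_request_work(radac))  # prints True
```

Holding the request until the upload is done means the finished task can be reported in the same request, so the tasks-in-progress limit should no longer cause a refusal.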


Claggy


Copyright © 2024 BOINC@Poland | Open Science for the future