no longer on the map


Profile Euphoriabuzz
Joined: 29 May 12
Posts: 38
Credit: 117,401
RAC: 142

Message 3091 - Posted: 4 Mar 2015 | 12:26:14 UTC

Hi, sensor No. 13304 was on the map and running along nicely, but it seems to have disappeared off the map again. I do not know when it happened, as I was alerted to it by a friend. I have looked in my account and, for me, the local map is showing and my pointer is still in the place I put it; however, the main map does not show me.

Any thoughts would be welcome.
____________

Profile ChertseyAl
Joined: 16 Jun 11
Posts: 152
Credit: 394,231
RAC: 150

Message 3093 - Posted: 4 Mar 2015 | 16:59:44 UTC - in response to Message 3091.

You haven't sent back any results since '1 Mar 2015 | 14:02:43 UTC' but you have 2 WUs running. Maybe they are both hung because they are fighting for the sensor?


____________

Profile Euphoriabuzz
Joined: 29 May 12
Posts: 38
Credit: 117,401
RAC: 142

Message 3094 - Posted: 4 Mar 2015 | 21:28:16 UTC - in response to Message 3093.

Yes, I saw that, although my BOINC Manager only shows one actual task. I tried resetting the project, but that does not seem to have fixed the problem.
____________

Profile KarmannGaz
Joined: 12 Mar 13
Posts: 100
Credit: 129,537
RAC: 43

Message 3095 - Posted: 5 Mar 2015 | 0:46:19 UTC

Is the sensor still plugged into the same usb port as before, and I take it you've tried the old reboot-the-pc method of seeing if that'll fix it?

You could try some sort of utility to list the usb ports and devices connected, see if Windows still sees the sensor as being there. This one might do it :
http://download.cnet.com/USBDeview/3000-2094_4-10614190.html

On Linux I used the lsusb command (not much use to you, but you should see a similar name and vendor ID in the list in Windows too):

Bus 006 Device 002: ID 16c0:05df Van Ooijen Technische Informatica HID device except mice, keyboards, and joysticks
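
For anyone who wants to script that check on Linux, here is a minimal Python sketch. It assumes `lsusb` is on the PATH and that the sensor always reports the `16c0:05df` ID shown in the line above. That ID is also used by other hobbyist HID devices, so a match is a hint rather than proof. The function names are made up for this example.

```python
import re
import subprocess

# Vendor:product ID taken from the lsusb line above. Other HID gadgets may
# report the same ID, so treat a match as a hint only.
SENSOR_ID = "16c0:05df"

# Matches lines such as "Bus 006 Device 002: ID 16c0:05df Some device name"
LSUSB_LINE = re.compile(
    r"Bus (?P<bus>\d+) Device (?P<dev>\d+): "
    r"ID (?P<id>[0-9a-fA-F]{4}:[0-9a-fA-F]{4})\s*(?P<name>.*)"
)

def find_devices(lsusb_output, wanted_id=SENSOR_ID):
    """Return (bus, device, name) for every lsusb line whose ID matches."""
    found = []
    for line in lsusb_output.splitlines():
        m = LSUSB_LINE.match(line.strip())
        if m and m.group("id").lower() == wanted_id.lower():
            found.append((m.group("bus"), m.group("dev"), m.group("name")))
    return found

def sensor_present():
    """Run lsusb and report whether a device with the sensor's ID is listed."""
    result = subprocess.run(["lsusb"], capture_output=True, text=True, check=True)
    return bool(find_devices(result.stdout))
```

Calling `sensor_present()` from a scheduled job would at least show whether the operating system still sees the device when a work unit starts overrunning.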


Not sure what else to suggest. Hope someone else has some ideas for you. I'm hoping it's just Windows being annoying and not the sensor.

Profile Euphoriabuzz
Joined: 29 May 12
Posts: 38
Credit: 117,401
RAC: 142

Message 3096 - Posted: 5 Mar 2015 | 8:02:15 UTC - in response to Message 3095.
Last modified: 5 Mar 2015 | 8:03:28 UTC

Thank you Gary and ChertseyAl. Yes, the sensor is still plugged into the original USB port, and the program you suggested has detected that it is plugged in. I tried unplugging the sensor, resetting the project and restarting the PC, but the rogue work unit sample_4525948_0 is still showing as running when it is not. I think this is the issue, as ChertseyAl suggested earlier, because for some reason my 2 hr WUs are taking considerably longer. I had hoped it would go once I reset everything, but it still shows on my account as in progress.
____________

Profile Euphoriabuzz
Joined: 29 May 12
Posts: 38
Credit: 117,401
RAC: 142

Message 3097 - Posted: 5 Mar 2015 | 9:11:01 UTC - in response to Message 3096.

I seem to have re-appeared on the map although the rogue WU is still showing as in progress on my account. I have messaged Szopler about it.

Profile KarmannGaz
Joined: 12 Mar 13
Posts: 100
Credit: 129,537
RAC: 43

Message 3098 - Posted: 5 Mar 2015 | 10:05:27 UTC

Your work units are getting credits again too, it looks back to normal apart from that odd task. How strange.

I'm pretty sure the sensor in Cork, Ireland (number 13512, belonging to CzYsTy) also disappeared for a while and now is back. If you look at the tasks for that one, it also had something strange happen around March 1st, a work unit with no credits. That one is also running Windows 7. Not that I'm trying to blame Windows 7 (well, I am, but it's the only thing I can see in common at the moment). I wonder if it didn't handle the end of February very well? Lots of others on Windows 7 though, and I'm sure they all didn't disappear, so maybe not.

Hopefully Szopler or one of the others can pinpoint what happened for you!

Profile Euphoriabuzz
Joined: 29 May 12
Posts: 38
Credit: 117,401
RAC: 142

Message 3099 - Posted: 6 Mar 2015 | 18:20:42 UTC - in response to Message 3098.

So today at 18.05 I got this message:

Hello,

Your sensor connected to host http://radioactiveathome.org/boinc/show_host_detail.php?hostid=13304 is not responding. If you did not remove it on purpose, you may have to check if it is connected and working properly.

I looked and it seems to be working, i.e. the numbers on screen are fluctuating as per normal and the LCD is lit. My WU is running OK according to BOINC Manager. I guess this message may relate to the rogue WU that is on my account.

Profile Euphoriabuzz
Joined: 29 May 12
Posts: 38
Credit: 117,401
RAC: 142

Message 3100 - Posted: 7 Mar 2015 | 13:32:21 UTC - in response to Message 3099.

Today I have this WU that ran for over 7 hrs before I aborted it. (I am set for 2 hr WUs in preferences.)

sample_4551105

created 7 Mar 2015 | 5:00:01 UTC
minimum quorum 1
initial replication 1
max # of error/total/success tasks 1, 1, 1
errors Too many total results

Profile ChertseyAl
Joined: 16 Jun 11
Posts: 152
Credit: 394,231
RAC: 150

Message 3102 - Posted: 7 Mar 2015 | 18:00:53 UTC - in response to Message 3100.

Couple of thoughts ...

Have you made sure that power saving for the USB ports is disabled?
Are you running some crappy AV or internet security (lol!) software?
Is your internet connection permanent, or does it drop out frequently?
Do the problems coincide with using another USB device? Scanner? External HDD?
Is Windows update running automatically? Scheduled virus scan?

Bit odd looking at your failed tasks, could be down to several different causes as there's nothing really obvious. I see you are running AVG actually - I bet that's screwing you over.


____________

Profile Euphoriabuzz
Joined: 29 May 12
Posts: 38
Credit: 117,401
RAC: 142

Message 3103 - Posted: 7 Mar 2015 | 20:21:41 UTC - in response to Message 3102.

It was all working absolutely fine until 1/3/2015. I am moving it to another USB port to see if that makes any difference. I do have AVG, but I am loath to remove it just for BOINC, as it has not caused a problem before and is giving me no indication that it is causing a problem now. Any suggestions for an alternative antivirus?
I am also going to change the time server to windows.com (from NIST), as Gary suggested the change of date may have been an issue.
My internet connection is stable, no dropouts.
I can tell you, though, that when I have a WU that is running at double the expected time, the sensor is showing higher figures than normal, i.e. 0.23 when it is normally at 0.5 - 0.6. Please excuse my ignorance in all of this, by the way, and thank you for the help. I have not heard back from Szopler so far regarding the rogue WU. Perhaps I have to wait for that to expire on the 15th for things to settle down. Is there a reset for the actual sensor, or is unplugging it enough to reset it?

Profile Euphoriabuzz
Joined: 29 May 12
Posts: 38
Credit: 117,401
RAC: 142

Message 3104 - Posted: 8 Mar 2015 | 8:17:36 UTC - in response to Message 3103.

Haha, my last post should have said 0.05 - 0.06. Anyway, I changed the USB port and the time server, and it seems to have settled down overnight. I shall keep a close eye on it and see how it goes. I am wondering why no credit has been awarded for my tasks that took double the specified time to run but were completed and validated?
____________

Profile KarmannGaz
Joined: 12 Mar 13
Posts: 100
Credit: 129,537
RAC: 43

Message 3105 - Posted: 8 Mar 2015 | 10:50:32 UTC

Re. not getting any credit for the long running tasks - this is exactly what I was getting right at the very start when I first plugged my sensor in. Default 2-hour WUs taking 5 hours or more, very little cpu, validated ok but no credit. It was something strange in the Linux security setup though, once I'd updated the relevant file to say the sensor was allowed to be plugged in, the run time dropped to 2 hours for subsequent WUs, cpu went up a bit, and they started getting credits. I don't see why Windows should throw the same wobbly after it's been working for a while, though as ChertseyAl said, it could be security related. Does AVG have any USB-specific settings? I think if it was USB power saving, the sensor would go off, and if you're still seeing numbers then power must be getting to it (and yes, unplugging it & putting it back in effectively resets it).

Profile KarmannGaz
Joined: 12 Mar 13
Posts: 100
Credit: 129,537
RAC: 43

Message 3106 - Posted: 8 Mar 2015 | 12:51:22 UTC

Just out of interest - is there anything in AVG's "virus vault" (or quarantine), and/or if you look at its history, did it do a scan at about the time(s) your sensor stopped being recognised?

Profile KarmannGaz
Joined: 12 Mar 13
Posts: 100
Credit: 129,537
RAC: 43

Message 3107 - Posted: 8 Mar 2015 | 13:04:36 UTC

Ooooohhhhhh, wait a minute, look at this:

http://answers.microsoft.com/en-us/windows/forum/windows_vista-hardware/what-does-usb-selective-suspend-mean/71fa747b-914f-4d17-b476-cf4bb1bb783c

This sounds like it could quite possibly be your problem. If it puts the USB port in a low power mode, the sensor will still work (it doesn't draw much current), but it may interfere with Windows being able to talk to it. The link above tells you how to find the selective suspend setting and turn it off, I think?

Profile Euphoriabuzz
Joined: 29 May 12
Posts: 38
Credit: 117,401
RAC: 142

Message 3121 - Posted: 9 Mar 2015 | 21:40:54 UTC - in response to Message 3107.

OK, thank you Gary and ChertseyAl. I read the article and had a look, and yes, the power management box was ticked to allow Windows to control power to USB. I have disabled this for all USB ports, as I am not sure how to work out which is which in Device Manager. This promptly gave me a BSOD, which disappeared too quickly for me to read anything beyond a USB driver error.

So I set no new tasks, reset the project, unplugged the sensor, restarted the PC, uninstalled the USB driver and removed the device. I then restarted the PC and plugged everything back in again. It took a good 5 mins before Windows found the driver, but it did reinstall it. I have now allowed the project to collect a WU, and it states it will take 3 hrs to run even though I am set to 2 hrs, so there is possibly still a problem.

My question is why the random crashes of this project when it runs 24 hrs a day, why would USB power management kick in at such random intervals? I also wonder why it took over a week before this started to happen? Maybe I have other problems here too. No reports in AVG, or windows firewall.

I also wonder if my CPU/GPU graphics card setup has anything to do with the issues. I have dual graphics (not CrossFire) comprising an AMD APU as primary and an HD 6670 as a secondary card.

This is so frustrating, as I have been waiting nearly 3 yrs to get this project up and running.

I appreciate the help from you both.

Profile KarmannGaz
Joined: 12 Mar 13
Posts: 100
Credit: 129,537
RAC: 43

Message 3122 - Posted: 10 Mar 2015 | 18:51:37 UTC

I can understand your concern, it is a weird one. I've been having a look back through all your work units to see if I can find any pattern or anything out of the ordinary, but am stumped. It seems to be working again at the moment, which is good I suppose.

Just out of interest, did you alter the resource share setting in your preferences for this project (or any others)? I bumped mine up to 150, so it's the highest of my three projects I'm running (the others are at 100 and 50). I was thinking this would keep RadioactiveAtHome running regardless of whatever else was running, but I don't really know if it makes much difference, it was just a thought. I noticed you are running a lot more projects than me, though unless others are using usb devices too, I wouldn't have thought it would be the problem. Just clutching at straws at the moment.

Anyone else, feel free to chip in with your thoughts/comments/suggestions!

Profile Euphoriabuzz
Joined: 29 May 12
Posts: 38
Credit: 117,401
RAC: 142

Message 3123 - Posted: 11 Mar 2015 | 0:02:52 UTC - in response to Message 3122.

Hi, I am getting fed up now, as it has happened again today at about 17.30. I have no idea why it is doing it. I cannot place a single event that coincides with this project messing up, but it is happening nearly every day now. I am going to remove the project and add it again in a last-ditch attempt to sort it out. It would have been nice to have had a reply from the project admins, or to have had them analyse my WUs which are allegedly still running (I think there are 6 now that think they are running).
____________

Profile Euphoriabuzz
Joined: 29 May 12
Posts: 38
Credit: 117,401
RAC: 142

Message 3126 - Posted: 11 Mar 2015 | 10:15:32 UTC - in response to Message 3123.

Do you guys think I should try Application/Sensor debug mode? So far everything has been based on my pc having the error, as opposed to the sensor itself.
____________

Profile Euphoriabuzz
Joined: 29 May 12
Posts: 38
Credit: 117,401
RAC: 142

Message 3127 - Posted: 11 Mar 2015 | 11:28:46 UTC - in response to Message 3126.

I also notice it is using more than one slot; it seems to switch between 6 and 0. Is this normal? I thought projects stayed on the same slot. I also have an stderr file that I copied. I won't post it yet, as it is of course rather long. If it will help work out the problem, let me know and I will post it.

Profile KarmannGaz
Joined: 12 Mar 13
Posts: 100
Credit: 129,537
RAC: 43

Message 3128 - Posted: 11 Mar 2015 | 13:05:36 UTC

I don't know what the debug mode does. More messages I guess, but haven't tried it.

I noticed ChertseyAl is running loads of projects too, so that in itself can't be a problem. He has set his work unit size to half an hour though, maybe that makes a difference? I did dabble with altering the work unit size myself, just to see if it made any difference, but it seems to be about 3 credits per half hour regardless of the size, so I left mine at 1.6 so it does them in (about) 10 credit chunks every hour and a half or so. I suppose with smaller units, you'll see sooner if there's a problem or not. Also, something is ringing bells that I read somewhere that reducing WU size resolved somebody else's problem, but I can't find the post now or remember what the problem was.

My other thought was that surely somebody else must be having similar problems. I've been looking back through the sensors on "the map" to see if I can find anyone else running Windows 7 on an AMD A8 processor, but so far haven't found one. The only two AMD A8 users so far that I've spotted are you and me! How strange. Anyway, just letting you know I'm still thinking on this one. It would be nice if one of the project's programmers (or was it just Szopler?) could comment on this, even if just to say no idea. Hint, hint.


Profile ChertseyAl
Joined: 16 Jun 11
Posts: 152
Credit: 394,231
RAC: 150

Message 3130 - Posted: 11 Mar 2015 | 17:06:30 UTC - in response to Message 3128.

FWIW, my sensor is currently on an old Vista32 laptop, and it's generally running Radioactive, Primaboinca, Universe, SRBase and WUProp at the moment (only a single core, so only one non-NCI project at a time obviously). It never gets powered down, nothing auto-updates, and there's no AV or internet security (lol) junk on it. Oh, and it's on a wired network connection. I have also run it successfully on an ancient Celeron desktop and an Atom laptop, both XP32, again with no problems, except both had slightly flaky network connections which is why I switched to a machine that's wired straight to my router.

WU length makes no difference either - I've run everything from 30 minutes to 2 days in the past, no problems at either extreme or anywhere in between. I'd suggest that the OP switches to 30 minutes and see what happens.

My money is still on some piece of spurious crapware screwing you over though :)


____________

Profile Euphoriabuzz
Joined: 29 May 12
Posts: 38
Credit: 117,401
RAC: 142

Message 3131 - Posted: 11 Mar 2015 | 17:53:29 UTC - in response to Message 3130.

Gary, I believe the comment you are referring to was from someone who had set their WU to something like 20 hrs, and the advice was that it should be shorter because the stderr could not hold all the relevant info needed to sort their particular error out.

ChertseyAl - I am on a wired connection and also run many projects. I turned WUProp off as a possible culprit, along with SRBase, as they just happened to be running when r@h was having issues. The 'stuck' WUs have now been cleared and are marked as ''abandoned'' (which is a new one on me and made me laugh).

This project does not seem to pick up on the fact that I have a graphics card as well as my APU and I wonder if that is not helping.

I am also getting this

Radac $Rev: 585 $ starting...
Could not find any of the devices listed in sensors.xml: Device communication error
sensors.xml: 7 nodes found
Found sensor [unknown version 609 (2,61)]

is this normal? I have no idea.

Also :

Radac $Rev: 585 $ starting...
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Breakpoint Encountered (0x80000003) at address 0x74F73226
Engaging BOINC Windows Runtime Debugger...

BOINC Windows Runtime Debugger Version 6.11.4

there is obviously a lot more to that stderr message.

If this problem continues I will change the WU time to 30 mins as you suggest. I am loath to remove my AV as this is my main PC. Is there a different one I should be using? I did have McAfee as part of my BT subscription, but that has recently been cancelled, hence the 'free' version of AVG on my machine.

Many thanks to both of you for your continued interest and patience.

Profile ChertseyAl
Joined: 16 Jun 11
Posts: 152
Credit: 394,231
RAC: 150

Message 3132 - Posted: 11 Mar 2015 | 19:51:41 UTC - in response to Message 3131.

re: "Found sensor [unknown version 609 (2,61)]"

I've only found one other user with the exact same entry in their stderrs, and those WUs completed OK, so I don't know how relevant it is. That was on XP, so it's probably unrelated to the OS. I don't like the look of it though.

The unhandled exception stuff is just the fallout from aborting the WU, don't worry about it.

AV software? They're all garbage (no, really, they are). If you must use one I guess Malwarebytes is the least worst.

One thing you might want to look for is spooky USB drivers ... Try USBDeview from http://www.nirsoft.net/utils/usb_devices_view.html - Needs a bit of effort to work out what's what though. I've found it pretty handy though (thanks to Nate who recommended this on the BU forums).


____________

Profile Euphoriabuzz
Joined: 29 May 12
Posts: 38
Credit: 117,401
RAC: 142

Message 3134 - Posted: 11 Mar 2015 | 22:54:00 UTC - in response to Message 3132.

Thanks, Malwarebytes is the one I uninstalled, typical. I shall see how it goes and add removing AVG to my list of 'things to try'. At the moment it seems to have settled. I have said that several times before, so I am not confident it is resolved yet.

USBDeview is one Gary also recommended to me on here. I have had another look and nothing untoward seems to be going on.

It could be that the WU's that don't work are just not compatible with my APU.

So we three shall keep pressing and guessing until the project owner can have a look and see if he can work out what the issue is.

Profile KarmannGaz
Joined: 12 Mar 13
Posts: 100
Credit: 129,537
RAC: 43

Message 3138 - Posted: 12 Mar 2015 | 10:55:10 UTC

I get that same unknown version 609 (2,61) message at the start of all my tasks as well, and the tasks are completing ok, so I don't think that's a problem. The 2,61 is the same as the version number of the kit's firmware too, v2.61, so it does at least confirm the computer is communicating with the sensor.
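
If it helps when comparing stderr output between hosts, here is a small Python sketch that pulls that pair of numbers out of the "Found sensor" line. It assumes the line always has the shape quoted earlier in the thread, and it follows the reading above that "(2,61)" means firmware v2.61. The function name is invented for this example.

```python
import re

# Matches the stderr line quoted earlier in the thread, e.g.
# "Found sensor [unknown version 609 (2,61)]"
FOUND_SENSOR = re.compile(
    r"Found sensor \[(?P<label>.*?) \((?P<major>\d+),(?P<minor>\d+)\)\]"
)

def firmware_versions(stderr_text):
    """Return one 'major.minor' string per 'Found sensor' line in the text."""
    return [
        f"{m.group('major')}.{m.group('minor')}"
        for m in FOUND_SENSOR.finditer(stderr_text)
    ]
```

An empty list from a task's stderr would suggest the application never got as far as talking to the sensor in that run.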

I do get some other intermittent error messages in my stderr output too actually, either "timer expired" or "protocol error", depending on which type of USB port I have the sensor plugged in to (one message with USB2, the other with USB3), but again, it doesn't seem to affect the tasks, they still run on to complete ok. Not much help to you I know, but I just thought I'd mention it while we're looking at the output.

I'm using an AMD A8-3270K APU, which is similar to your 6600K, I think.

Profile Euphoriabuzz
Joined: 29 May 12
Posts: 38
Credit: 117,401
RAC: 142

Message 3142 - Posted: 13 Mar 2015 | 17:07:47 UTC - in response to Message 3138.
Last modified: 13 Mar 2015 | 17:23:43 UTC

So, it was running along nicely and now once again it has hit a snag. It seems to run fine in 'slot 0', but when that changes, as it did today to slot 15, the WUs just run over time to double the expected time or longer. Can anyone tell me how to resolve this issue?

I am removing AVG.

Profile ChertseyAl
Joined: 16 Jun 11
Posts: 152
Credit: 394,231
RAC: 150

Message 3143 - Posted: 13 Mar 2015 | 17:36:21 UTC - in response to Message 3142.

My guess is that you've excluded slot 0 from AVG's meddling and as a result it runs happily from there. Any other slot could be being screwed up by AVG. You need to exclude the whole of the BOINC data folder structure from your program destroyer, sorry, 'virus scanner'. But even then a new subdirectory (slots folder) might not be automatically excluded from interference by AV crapware.
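
To see which slot folders actually exist and what is left in them, a short Python sketch like the one below could help. It assumes the BOINC data directory is `C:\ProgramData\BOINC` (a common default on Windows 7, but check your own install) and that slots live in numbered subfolders of `slots`. The function name is hypothetical.

```python
from pathlib import Path

# Assumed default BOINC data directory on Windows 7; adjust for your install.
DEFAULT_DATA_DIR = Path(r"C:\ProgramData\BOINC")

def list_slots(data_dir=DEFAULT_DATA_DIR):
    """Return {slot_number: sorted file names} for each numbered slot folder."""
    slots_dir = Path(data_dir) / "slots"
    result = {}
    if not slots_dir.is_dir():
        return result
    for entry in slots_dir.iterdir():
        if entry.is_dir() and entry.name.isdigit():
            result[int(entry.name)] = sorted(p.name for p in entry.iterdir())
    return result

if __name__ == "__main__":
    for number, files in sorted(list_slots().items()):
        state = ", ".join(files) if files else "(empty)"
        print(f"slot {number}: {state}")
```

Running it before and after a work unit overruns would show whether a slot is left holding files that a new task then trips over.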

Not a fan of AV or 'internet security' (lol) software, in case you hadn't noticed ;)


____________

Profile KarmannGaz
Joined: 12 Mar 13
Posts: 100
Credit: 129,537
RAC: 43

Message 3144 - Posted: 14 Mar 2015 | 0:17:02 UTC

ChertseyAl - lol :-)

Euphoriabuzz - I notice also that your last couple of work units have been given more credit than your previous ones were getting (12.6 rather than 12.4). Not a lot, but interesting, as runtime still seems to be the same.

I haven't sussed out slots, you both are ahead of me on that, can't seem to find any mention of it in my prefs or output, but never mind. I hope it was just AVG and this fixes it for you now.

Profile Euphoriabuzz
Joined: 29 May 12
Posts: 38
Credit: 117,401
RAC: 142

Message 3145 - Posted: 15 Mar 2015 | 22:26:56 UTC - in response to Message 3144.

OK, so it wasn't AVG. I have just cancelled an unfinished WU that had been running for 12 hrs. I have not been on the PC today until 10.15 pm, so it is not a programme I am using during the day that is causing the issue.
____________

Profile Euphoriabuzz
Joined: 29 May 12
Posts: 38
Credit: 117,401
RAC: 142

Message 3146 - Posted: 15 Mar 2015 | 23:11:20 UTC - in response to Message 3144.
Last modified: 15 Mar 2015 | 23:15:40 UTC

Regarding the credit awarded, it depends on the ''run time secs'' of the WU, not the CPU time. There is no change there: if you go back through my list of tasks, you will see that all WUs that go over 7300 run time get awarded the higher credit. These vary through the day and not all tasks receive the higher credits, as you can see below from my list of the last few tasks my PC has done. I think this is probably normal.


Task      Work unit   Run time (sec)   CPU time (sec)   Credit
4609479   4590195     ---              ---              ---
4607139   4587855     ---              ---              ---
4606757   4587473     7,239.48         0.23             12.51
4606399   4587115     7,311.93         0.22             12.64
4606021   4586737     7,301.07         0.23             12.61
4605673   4586389     7,303.29         0.22             12.63
4605303   4586019     7,291.94         0.44             12.53
4604905   4585621     7,252.23         0.75             12.52
4604525   4585241     7,219.01         0.31             12.48
4604113   4584829     7,255.02         0.33             12.53
4603705   4584421     7,232.22         0.44             12.50
4603284   4584000     7,325.34         0.25             12.64
4602910   4583626     7,237.73         0.33             12.51
4602540   4583256     7,240.25         0.27             12.50
4602172   4582888     7,249.44         0.17             12.52
4601818   4582534     7,253.33         0.19             12.52
4601420   4582136     7,323.71         0.20             12.66
4601084   4581800     7,266.40         0.22             12.56
4600754   4581470     7,199.38         0.31             12.47
4600371   4581087     7,292.06         0.30             12.61
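
As a rough check of the run time link, the Python sketch below works out credit per hour of run time for a handful of the rows above. The rows are copied from the list; the function name is made up, and this only illustrates the arithmetic, not how the project actually calculates credit.

```python
# (run time in seconds, credit) pairs copied from a few rows of the list above
TASKS = [
    (7239.48, 12.51),
    (7311.93, 12.64),
    (7291.94, 12.53),
    (7325.34, 12.64),
    (7199.38, 12.47),
]

def credit_per_hour(run_time_sec, credit):
    """Credit earned per hour of wall-clock run time."""
    return credit / run_time_sec * 3600

rates = [credit_per_hour(t, c) for t, c in TASKS]

if __name__ == "__main__":
    for (t, c), rate in zip(TASKS, rates):
        print(f"{t:>9.2f} s  {c:>6.2f} credit  {rate:.2f} credit/hour")
    print(f"spread: {min(rates):.2f} to {max(rates):.2f} credit/hour")
```

For these rows the rate comes out at a little over 6 credits per hour, which fits the "about 3 credits per half hour" mentioned earlier in the thread.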

Profile KarmannGaz
Joined: 12 Mar 13
Posts: 100
Credit: 129,537
RAC: 43

Message 3147 - Posted: 16 Mar 2015 | 12:41:56 UTC - in response to Message 3146.

Ok, that's probably not significant then.

I found the BOINC Wiki which gives a little overview of slots and how they're supposed to work - http://boinc.berkeley.edu/trac/wiki/BoincFiles - and it also mentions (near the bottom) a file deleter daemon - http://boinc.berkeley.edu/trac/wiki/FileDeleter - which is supposed to tidy up any unused slots (if I am reading this correctly). The last line is puzzling :

If the web-server account on your system is not 'apache', add a <httpd_user> element to your config.xml file. Otherwise antique deletion won't work.


I don't think I even have a web server on this machine, I wouldn't have thought many people would have, so don't know if this is relevant, but it's the nearest I've found to anything about slots not being tidied up. So far anyway.

My other thought was maybe one of your other projects is interfering with Radioactive at home. You could try temporarily disabling each of the others one by one (or a few at a time) with the "no more tasks" option so they stop running when the current task finishes, then if you still have problems with Radioactive, you'll know those projects weren't the problem, so re-enable them & try it on different one(s)?
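
If clicking through each project in BOINC Manager is too fiddly, the same thing can be scripted. The Python sketch below builds `boinccmd --project URL nomorework` / `allowmorework` commands. It assumes `boinccmd` is on the PATH and can reach the local client without extra options, and the Radioactive@Home URL is a guess based on the host link posted earlier in the thread, so check it against the URL shown in your own BOINC Manager. The function names are invented.

```python
import subprocess

# Guessed from the host-detail link earlier in the thread; verify it against
# the project URL shown in your own BOINC Manager before using it.
RADIOACTIVE_URL = "http://radioactiveathome.org/boinc/"

def boinccmd_args(project_url, allow_new_work):
    """Build the boinccmd arguments to allow or stop new tasks for a project."""
    operation = "allowmorework" if allow_new_work else "nomorework"
    return ["boinccmd", "--project", project_url, operation]

def set_new_work(project_urls, allow_new_work, dry_run=True):
    """Apply one setting to several projects; a dry run only returns the commands."""
    commands = [boinccmd_args(url, allow_new_work) for url in project_urls]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)
    return commands
```

Leaving `dry_run=True` just returns the commands so they can be checked first; passing a list of the other projects' URLs with `allow_new_work=False` would stop them fetching work while Radioactive carries on.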

Profile Euphoriabuzz
Joined: 29 May 12
Posts: 38
Credit: 117,401
RAC: 142

Message 3149 - Posted: 18 Mar 2015 | 13:40:35 UTC - in response to Message 3147.

Ok so I do get an error message in the event log regarding deletion of r@h files stating that boinc is unable to remove ??.xml.?? then when a new wu comes it mentions something about being unable to download r@h?? as it already exists.

I have put question marks as I deleted the project and re installed it so that particular message is not there for me to read right now.

As I run so many projects the second suggestion would involve me having to suspend all projects at the same time or set to 'no new task' and wait until all my current wu's have finished, but I have wu's in queue up to 20th April deadline so that will take a while. I am not prepared to cancel my current queue of wu's as I have some from projects that I hardly ever receive from so wish to crunch those while I can. I would then have to allow new task one by one for each project as opposed to stopping them one by one.

Of course, even if I find that this project is in fact conflicting with another, the total lack of communication from the project admins means that it will probably never be resolved. I am more inclined, however, to think it is to do with my APU, which you pointed out was not being used by anyone else on this project, as a project conflict would likely be affecting other users also and it seems to be only me with this issue. Or perhaps others have just given up waiting for an answer!

I do appreciate the continued help and suggestions though, thanks guys.

Profile KarmannGaz
Joined: 12 Mar 13
Posts: 100
Credit: 129,537
RAC: 43

Message 3150 - Posted: 18 Mar 2015 | 15:14:14 UTC

a project conflict would likely be affecting other users also

- ah, yes, of course, scrap that idea then!

Hmm, maybe I'll just plod right through the list of users who've made it onto the map and see if we really are the only two using AMD A8 APUs. May take me a while but I'm interested to know.

Did you try the debug mode to see what it does? If it causes hassle on the admins side, maybe they'll take notice and have a look at it too? ;-)

Profile Euphoriabuzz
Joined: 29 May 12
Posts: 38
Credit: 117,401
RAC: 142

Message 3151 - Posted: 18 Mar 2015 | 15:28:14 UTC - in response to Message 3150.
Last modified: 18 Mar 2015 | 15:37:46 UTC

Lol, good plan, but I thought they would have noticed all the aborted or failed WUs, and I did contact Szopler directly. Is it safe for my PC to try debug mode? I don't know what it does.

I had an email off the project (automated) telling me my sensor is not responding and to check the connection, hahahahaha. That must have been generated when I deleted the project folder from the Boinc projects directory.

Profile ChertseyAl
Joined: 16 Jun 11
Posts: 152
Credit: 394,231
RAC: 150

Message 3153 - Posted: 18 Mar 2015 | 17:30:40 UTC - in response to Message 3149.

Ok so I do get an error message in the event log regarding deletion of r@h files stating that boinc is unable to remove ??.xml.?? then when a new wu comes it mentions something about being unable to download r@h?? as it already exists.


Aha ... So there's a file permissions problem then. Do you switch users on that machine, or log off at all? Did you install BOINC to be shared by all users or just the user that installed it? Maybe you did a service install?

I'm thinking (almost certainly incorrectly!) that you've got multiple users or accounts on the machine, and as a result sometimes the account that's logged in doesn't own the files that were downloaded by the previous user, so can't delete them or overwrite them with the new WU. The only file that sticks around is the sensors.xml file I believe, so maybe when the project switches slots something goes wrong with creating a new one. Don't know much about this sort of stuff to be honest :)

Perhaps it might be simpler for you to build a Wilson cloud chamber and just look at the pretty streaks that appear. Old-skool FTW ;)


____________

Profile Euphoriabuzz
Joined: 29 May 12
Posts: 38
Credit: 117,401
RAC: 142

Message 3154 - Posted: 18 Mar 2015 | 18:25:21 UTC - in response to Message 3153.
Last modified: 18 Mar 2015 | 18:43:19 UTC


Aha ... So there's a file permissions problem then. Do you switch users on that machine, or log off at all? Did you install BOINC to be shared by all users or just the user that installed it? Maybe you did a service install?


Ok so no other users on this pc, guest account is switched off also, no need to log out or switch user. Not a service install.

so maybe when the project switches slots something goes wrong with creating a new one.


Aha, we go back to the slot problem. I tend to agree with you here, ChertseyAl, but have yet to work out why it only happens sometimes. I wonder if the time allowed for this function is not quite long enough, so that when the new WU tries to use the slot, it appears to be still in use by the previous WU because it has not emptied quickly enough.

Perhaps it might be simpler for you to build a Wilson cloud chamber and just look at the pretty streaks that appear. Old-skool FTW ;)


In the spirit of kit making, here is one I knocked up using my son's bicycle pump and an old eye solution bottle, lol.

Profile KarmannGaz
Joined: 12 Mar 13
Posts: 100
Credit: 129,537
RAC: 43

Message 3156 - Posted: 19 Mar 2015 | 22:19:50 UTC

I've finished checking every single dot on the map looking for anyone else with v2.61 sensors up and running (they show up with GRS v0 on the little graphic under the world map when you select a sensor on there) - there are only 5 so far! Two are running Linux on ARM processors, there's me with Linux on an AMD A8, you with Win 7 on an AMD A8, and one with Win XP on a Pentium. That might explain why nobody else is having the same problem yet, if it's specific to the new v2.61 sensors.

Sensor ID  User          Computer
13304      Euphoriabuzz  AMD A8 6600K, Win 7
14082      (me)          AMD A8 3270K, Linux 3.13.0 (Mint 17.1)
14133      Krzysztof     Pentium E2180, Win XP
14139      DarkSoul      ARM v6, Linux 3.18.7
14233      jaca          ARM v6, Linux 3.18.7
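
For anyone who wants to keep the tally up to date as more dots appear, here is a small Python sketch that counts the list above by operating system and works out what share of the 102 kits is on the map. The rows are copied from the list; `os_family` is just an illustrative helper that guesses the OS from the text.

```python
from collections import Counter

# Rows copied from the list above: (sensor id, user, computer)
sensors = [
    (13304, "Euphoriabuzz", "AMD A8 6600K, Win 7"),
    (14082, "(me)", "AMD A8 3270K, Linux 3.13.0 (Mint 17.1)"),
    (14133, "Krzysztof", "Pentium E2180, Win XP"),
    (14139, "DarkSoul", "ARM v6, Linux 3.18.7"),
    (14233, "jaca", "ARM v6, Linux 3.18.7"),
]


def os_family(computer):
    """Crude OS guess from the free-text computer description."""
    if "Linux" in computer:
        return "Linux"
    if "Win" in computer:
        return "Windows"
    return "other"


by_os = Counter(os_family(computer) for _, _, computer in sensors)

kits_sent = 102  # figure quoted in this post
share = len(sensors) / kits_sent * 100

print(dict(by_os))
print(f"{len(sensors)} of {kits_sent} kits on the map ({share:.1f}%)")
```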


I'll keep checking back to see if any new v2.61 dots have been added! Can't believe there are only 5 so far out of the 102 kits which were sent out, but I'm pretty sure I've checked the list correctly.

Profile KarmannGaz
Joined: 12 Mar 13
Posts: 100
Credit: 129,537
RAC: 43

Message 3157 - Posted: 19 Mar 2015 | 23:08:47 UTC

ps - in an effort to boost numbers, I've just bought another kit off eBay which someone was selling because they didn't know how to build it. I'll get that built and hopefully sell it on as a working unit in a week or so, so we can get another v2.61 user on the map! :-)

Profile Euphoriabuzz
Joined: 29 May 12
Posts: 38
Credit: 117,401
RAC: 142

Message 3158 - Posted: 19 Mar 2015 | 23:19:31 UTC - in response to Message 3156.

Wow, you must have been at that for ages. Thanks. Well, so far I have managed 18 WUs in a row since deleting the project from the program files and then reinstalling it. It is too soon to know whether it has been resolved or not.

There is a post somewhere mentioning the lack of connected sensors and possible reasons why. I think people possibly did not realise it was in kit form, so they have yet to make them, or are in the middle of making them and still waiting for parts. Perhaps you were just super speedy with your soldering. :)

Profile Euphoriabuzz
Joined: 29 May 12
Posts: 38
Credit: 117,401
RAC: 142

Message 3159 - Posted: 20 Mar 2015 | 7:57:19 UTC - in response to Message 3157.

ps - in an effort to boost numbers, I've just bought another kit off eBay which someone was selling because they didn't know how to build it.


That's a shame. There are probably a lot of people out there like me who cannot make the sensor themselves, and if they have been waiting forever to get on the list then they may not realise it is not sent out whole any more.

I'll get that built and hopefully sell it on as a working unit in a week or so, so we can get another v2.61 user on the map! :-)


Good news for you though, will that be another Linux? You will have to find another roundabout !!

Profile KarmannGaz
Joined: 12 Mar 13
Posts: 100
Credit: 129,537
RAC: 43

Message 3160 - Posted: 20 Mar 2015 | 8:52:23 UTC - in response to Message 3159.

you must have been at that for ages

hmm - it did take a couple of hours, in between cups of tea & doing other things to stop me going cross-eyed! Interesting seeing what others are using though.

they may not realise it is not sent out whole any more

Yes, it seems to have caught a few out, though as I read it this was just a trial run, so it's possible that future ones may be ready-made again. That would be a shame, because it's a really nicely designed PCB and box and I enjoyed building ours.

will that be another Linux? You will have to find another roundabout

I think there were more Windows users than Linux overall; it's just that the Linux ones seem to be quicker at getting the kits built, perhaps. We like to fiddle with things like this more, I guess! Yes, my favourite roundabout, not far from where I live ;-)

Profile Euphoriabuzz
Joined: 29 May 12
Posts: 38
Credit: 117,401
RAC: 142

Message 3161 - Posted: 20 Mar 2015 | 17:38:01 UTC - in response to Message 3160.

Well considering the server has been down for 8 hours today and there is no message from Szopler or Admin about it - I have given up on this thread ever getting an answer from them. If they cannot interact and communicate about anything other than ordering and paying for kits, then this feels less like a project and more like a small business. I hope your new kit is straightforward for you to make.

I am so far (whispers to Gary and ChertseyAl) still running OK. I hope today's outage doesn't set me back into the cycle of failing WUs. For now, though, it does seem that deleting the files from the projects folder in the BOINC program data folder, thus forcing a clean re-install, has cleared the issue. If I manage to run all weekend with no issues, I reckon that it is fixed.
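
For anyone wanting to repeat the clean re-install, a cautious first step is to find the project folder and list what would be removed before deleting anything. This is only a sketch with assumptions stated up front: it assumes BOINC names each project folder after the project URL with the scheme dropped and "/" replaced by "_", and that project folders live under `projects` in the BOINC data directory. `project_dir_name` and `files_to_remove` are hypothetical helper names, and nothing here deletes any files.

```python
from pathlib import Path


def project_dir_name(master_url):
    """Guess the project folder name from the project URL (assumed naming rule)."""
    name = master_url.split("://", 1)[-1].strip("/")
    return name.replace("/", "_")


def files_to_remove(data_dir, master_url):
    """Dry run: list the files under the project folder without touching them."""
    project_dir = Path(data_dir) / "projects" / project_dir_name(master_url)
    if not project_dir.is_dir():
        return []
    return sorted(p for p in project_dir.rglob("*") if p.is_file())


folder = project_dir_name("http://radioactiveathome.org/boinc/")
print(folder)

# Example call; the data directory path is an assumption and will differ per machine.
for path in files_to_remove(r"C:\ProgramData\BOINC", "http://radioactiveathome.org/boinc/"):
    print(path)
```

It is probably safest to exit BOINC before removing anything from that folder, then let the manager download the project files again.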

Watch this space guys.

Profile Euphoriabuzz
Joined: 29 May 12
Posts: 38
Credit: 117,401
RAC: 142

Message 3162 - Posted: 20 Mar 2015 | 18:14:15 UTC - in response to Message 3161.
Last modified: 20 Mar 2015 | 18:14:45 UTC

Well considering the server has been down for 8 hours today and there is no message from Szopler or Admin about it-


I am guessing it was to do with the eclipse. Perhaps they shut it down to prevent false readings.
____________

Profile KarmannGaz
Joined: 12 Mar 13
Posts: 100
Credit: 129,537
RAC: 43

Message 3163 - Posted: 20 Mar 2015 | 22:55:52 UTC - in response to Message 3162.

I'm hoping it wasn't deliberate. I was interested to see if the radiation levels increased or decreased during the eclipse; they wouldn't be false readings if the level really did change. I forgot to check the front panel to see what it was reading, too busy taking photos!

Profile krzyszp
Project administrator
Project developer
Project tester
Project scientist
Joined: 16 Apr 11
Posts: 373
Credit: 462,192
RAC: 135

Message 3164 - Posted: 21 Mar 2015 | 7:48:42 UTC - in response to Message 3163.
Last modified: 21 Mar 2015 | 7:49:25 UTC

Yesterday's server outage was due to a late bill payment. The email reminder about the payment simply didn't reach the right person...

Also, the project never was a commercial project.
____________
Regards,
Krzysztof 'krzyszp' Piszczek
Android Radioactive@Home Map
Android Radioactive@Home Map - donated
My Workplace

Profile Euphoriabuzz
Joined: 29 May 12
Posts: 38
Credit: 117,401
RAC: 142

Message 3170 - Posted: 22 Mar 2015 | 9:38:32 UTC - in response to Message 3164.

Well, thank you for the reply, nice to know you are reading the posts here. Shame that you do not have anything to say about the actual issue of failing WUs. There are 45 posts on this thread and only one of them (the one about money) generates a reply!

Also, the project never was a commercial project.


I said it 'FEELS' like it, as the only post that gets any attention is the one to order new kits. This project should be mindful of the fact that its participants have paid money and spent time making the sensors in order to run your project. If someone is having problems running the project or issues with your WUs, you would think that, as project scientists, someone would be interested enough to try and look into it.

I think this is a worthwhile project to run and it would be a shame if it did not progress beyond its current reach across the map because it has earned itself a bad reputation!

Ask people who run multiple BOINC projects and they will tell you there are good projects and poor projects, and the list of poor ones nearly always comes down to the fact that the participants cannot get any response from the Team running the project.

Profile krzyszp
Project administrator
Project developer
Project tester
Project scientist
Joined: 16 Apr 11
Posts: 373
Credit: 462,192
RAC: 135

Message 3180 - Posted: 24 Mar 2015 | 17:31:21 UTC - in response to Message 3170.

Firstly, apologies that I'm not always on the forum.
I'm not involved with the kits and can't really answer questions about them. Also, many of our volunteers know more about some aspects of the project than I do.

Just remember, please, that this project was started without any support from any educational, government or other agency - just volunteers, like you and me. We started it three years ago, completely from scratch, and from day one we called it Open Source and Open Hardware, because nobody planned to make any profit and it was clear to us that we would not always find the time to keep everything under control.

We tried to build a community of people who would move the project forward, and this is another reason why everything in the project is open, including the historical results data, the API for accessing the results database, the firmware sources and the application sources. Just everything.

Why is most of the activity on the kits topic? Because software problems, including getting the software running on various systems, have been discussed in various parts of the forum and in 99.9% of cases solved somewhere, and Szopler probably prefers to concentrate on answering questions where he is personally involved.

PS. I have participated in loads of projects since SETI@Home in 1999. I'm also involved in other projects as an administrator...
____________
Regards,
Krzysztof 'krzyszp' Piszczek
Android Radioactive@Home Map
Android Radioactive@Home Map - donated
My Workplace

Profile KarmannGaz
Joined: 12 Mar 13
Posts: 100
Credit: 129,537
RAC: 43

Message 3184 - Posted: 27 Mar 2015 | 11:48:03 UTC

Thanks for the response(s) Krzyszp :-)

Euphoriabuzz - no news for a few days, hope that's good and your setup is behaving itself! Re the debug mode, I see user Sean has it switched on - there are a lot of extra messages in his task output, see http://radioactiveathome.org/boinc/hosts_user.php?userid=4232 and look at the task outputs for his Linux machine. Just noticed it when I was looking at a post he's put on here ( http://radioactiveathome.org/boinc/forum_thread.php?id=154&nowrap=true#3182 ) which looks a lot like the fun I was having with mine. I've replied to him on there about it.

Profile Euphoriabuzz
Joined: 29 May 12
Posts: 38
Credit: 117,401
RAC: 142

Message 3185 - Posted: 27 Mar 2015 | 12:57:57 UTC

Gary and ChertseyAl - famous last words, but since I deleted the folder from the program data everything has been running smoothly. I assume from this that my initial download of the R@H master files was in some way corrupt, or became corrupted on 1st March (possibly from not reading the date jump from 28 Feb correctly). So it was a project software issue, I think, not anything to do with the sensor itself.
I wonder if the change of clocks this weekend will also affect it?
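
Just to pin down the dates behind that guess, here is a tiny Python check. It shows only that 2015 was not a leap year, so the day after 28 Feb was 1 Mar, which is the date of the last result returned according to the timestamp quoted earlier in the thread. It says nothing about how the project software itself handles dates; that part remains a guess.

```python
import calendar
from datetime import date, datetime, timedelta

last_feb_day = date(2015, 2, 28)
next_day = last_feb_day + timedelta(days=1)  # the "date jump" in question
is_leap = calendar.isleap(2015)

# Last result returned before the trouble started, as quoted earlier in the thread (UTC).
last_result = datetime(2015, 3, 1, 14, 2, 43)

print("Day after 28 Feb 2015:", next_day.isoformat())
print("2015 is a leap year:", is_leap)
print("Last result was on that day:", last_result.date() == next_day)
```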

Gary, yes, I had seen Sean's task stderr files and noted he was also receiving no credits. He seemed to be getting some help from project admin TJM though. I saw something about the code being written incorrectly! It is good that your struggle to get yours up and running has paid off by you being able to advise others of the possible issues. (You will be the Linux go-to man!)

Anyhow, many thanks for all the suggestions and helpful ideas. Have a great weekend both.


Copyright © 2019 BOINC@Poland | Open Science for the future