The CUDA app
Message boards : Number crunching : The CUDA app
ai5000 Send message Joined: 24 Sep 07 Posts: 8 Credit: 1,057,607 RAC: 0 |
Any news, good or bad, would be appreciated!
trigggl Send message Joined: 23 Apr 09 Posts: 3 Credit: 271,124 RAC: 0 |
I assume a CUDA app wouldn't require a lot of memory on Linux? I can't use my 8600GT in most projects due to its low RAM (256 MB). So far, on Linux, I'm only able to crunch Collatz. On Windows I can crunch AP26 as well, but they have been unwilling to do an x86 Linux CUDA app. I only installed Windows on my computer to run TurboTax, so I don't boot into it very often. 6r39 7ri99 Beware the dual headed Gentoo with Wine!
TJM Project administrator Project developer Project scientist Send message Joined: 25 Aug 07 Posts: 843 Credit: 267,994,998 RAC: 0 |
> Any news, good or bad, would be appreciated!
Right now I have very little time for active development; during the last few weeks I've barely had time to keep the server running. That should change during the first or second week of May, when I'll have finished some of the ongoing work-related projects. M4 Project homepage M4 Project wiki
TJM Project administrator Project developer Project scientist Send message Joined: 25 Aug 07 Posts: 843 Credit: 267,994,998 RAC: 0 |
> I assume a CUDA app wouldn't require a lot of memory on Linux? [...]
It should also run dnetc@home tasks without problems, provided the project admin has already changed the minimum memory requirements. As far as I remember, the BOINC server has a hardcoded minimum of 384 MB somewhere in the scheduler code. The bombe simulator works fine with just 256 MB, but right now I have no idea how fast it'll run on nVidia's 8400/8500/8600 cards. The good news is that one of the BOINC@Poland members sent me his old 8500GT for free, and it's installed in my development machine, so I'll be able to test the app's performance on low-end cards.
Falconet Send message Joined: 9 May 09 Posts: 2 Credit: 3,152 RAC: 0 |
Any news? |
frankk Send message Joined: 2 Sep 10 Posts: 1 Credit: 3,170,867 RAC: 0 |
Friendly ping... any luck with the CUDA app? Can any of us help? Thanks, Frank. P.S. Not a coder, but a sysadmin.
TJM Project administrator Project developer Project scientist Send message Joined: 25 Aug 07 Posts: 843 Credit: 267,994,998 RAC: 0 |
I'm in the middle of developing the server software for the CUDA bombe app, and I've run into several problems.

The main problem, at least for now, is that the automatically generated bombe menus are sometimes far from optimal. I don't know how this affects average results, but in not-so-rare cases a non-optimal menu prevents the bombe from finding the solution (which, for test workunits, is known to lie within the search limits), most probably due to middle-wheel turnovers. It's relatively easy to create a menu for a given crib manually, on paper. However, a single menu is only useful for a single workunit, so menu creation has to be automated, and programming a good algorithm for that might be beyond my skill level.

Another *big* problem is the bombe's output data. Since the bombe can produce several "stops", an automated review system is needed to reject the junk and keep only the "good" data. Most junk results are easily recognized by a simple program; the rest, however, has to be fed to something more sophisticated - either directly to the scoring algorithms and eventually to human review (if the menu was strong and the bombe returned most of the steckers), *or* to other software (perhaps a hillclimbing algorithm with a forced starting position) that tries to guess the remaining steckers.

This is going to be more complicated than I initially thought.
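The menu-generation problem described above can be sketched in code. This is a minimal, hypothetical illustration (the function names and the crib/ciphertext pair are invented, not taken from the project's sources): a menu is essentially a graph linking each crib letter to the cipher letter at the same position, and the cycles ("loops") in that graph are what give a menu its rejection power.

```python
# Hypothetical sketch: derive a bombe "menu" graph from a crib aligned
# against a stretch of ciphertext. Each position links the plain letter
# to the cipher letter; loops (cycles) in this graph are what let the
# bombe reject wrong stecker guesses, so more loops = a stronger menu.

def build_menu(crib: str, cipher: str):
    assert len(crib) == len(cipher)
    edges = []
    for pos, (p, c) in enumerate(zip(crib, cipher)):
        if p == c:
            # Enigma never encrypts a letter to itself, so this
            # alignment of the crib must be wrong.
            raise ValueError("letter maps to itself; bad crib alignment")
        edges.append((p, c, pos))          # edge labelled with its position
    return edges

def count_loops(edges):
    # Union-find: an edge joining two letters already in the same
    # component closes an independent cycle.
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    loops = 0
    for a, b, _ in edges:
        ra, rb = find(a), find(b)
        if ra == rb:
            loops += 1                     # this edge closes a loop
        else:
            parent[ra] = rb
    return loops

# Invented 6-letter example: positions W->E and E->W form one loop.
menu = build_menu("WETTER", "EWQZRT")
print(count_loops(menu))  # -> 1
```

An automated menu generator would slide the crib along the ciphertext, discard alignments that raise the self-encryption error, and prefer the alignment whose graph has the most loops.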
TJM Project administrator Project developer Project scientist Send message Joined: 25 Aug 07 Posts: 843 Credit: 267,994,998 RAC: 0 |
The post above describes the major problems, but there are others. For example, the Linux app runs fine (except that I probably found a couple of bugs), but I have trouble compiling the sources on Windows. I got it working with my old combination of Dev-C++ and gcc, but I'd prefer MS Visual Studio, because it would be far easier to build the app against the BOINC API that way, and it might also be the only way to build the GPU app. I'm looking for someone who could help patch the source to work in Visual Studio - I believe that mostly (or only) the preprocessor code has to be changed.
TJM Project administrator Project developer Project scientist Send message Joined: 25 Aug 07 Posts: 843 Credit: 267,994,998 RAC: 0 |
Well, today I finally tested the Windows CUDA code, tweaked for MS VS and compiled by sprint. The good news: it works. The bad news: it uses very little of the GPU, or none at all, so something is wrong.
L473ncy Send message Joined: 5 Jan 11 Posts: 5 Credit: 285,258 RAC: 0 |
Keep trying. If you get it working, I can put an 8800GT and a GTX 460 to work on the project, and I'd suspect a lot of other users could get their CUDA cards working on it too. The speed gains would be enormous, and we could probably finish all the WUs in a few months.
TJM Project administrator Project developer Project scientist Send message Joined: 25 Aug 07 Posts: 843 Credit: 267,994,998 RAC: 0 |
Yesterday the app was rebuilt, and the GPU finally works on Windows. However, this time it stresses the GPU too much: on my system I can barely move the mouse while it's running Enigma M4 menus, and someone else told me that it switched off his laptop :O This is quite strange, because within the M3 range it seems to work fine.
Pepo Send message Joined: 15 Nov 07 Posts: 1 Credit: 19,390 RAC: 0 |
> However, this time it stresses the GPU too much. On my system I can barely move the mouse while it's running Enigma M4 menus ...
PrimeGrid's GPU tasks behave the same way - they bring my machine's graphical responsiveness to its knees :-) So we will get another efficient mouse-killer app? Peter
TJM Project administrator Project developer Project scientist Send message Joined: 25 Aug 07 Posts: 843 Credit: 267,994,998 RAC: 0 |
I don't know; perhaps the high GPU load is a bug. While running M3 workunits the app is barely noticeable.
L473ncy Send message Joined: 5 Jan 11 Posts: 5 Credit: 285,258 RAC: 0 |
I would probably chalk it up to a bug, unless for some reason the complexity of M4 versus M3 is huge - something on the order of O(n!). If the M3 side works, release the CUDA app for M3 units and let the CPU crunchers work on the M4s. That way the work gets done, and after that all that's left to crack are the M4 messages.
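For what it's worth, the M3-to-M4 jump shouldn't be factorial. The M4's fourth thin wheel (Beta or Gamma) never steps during operation but still has 26 settings, so per wheel order the position space grows by a constant factor of 26. A back-of-the-envelope check (the exact rotor-set assumptions here are mine, not the project's search configuration):

```python
# Rough keyspace comparison, M3 vs M4. Assumptions: three stepping
# rotors chosen in order from the Kriegsmarine set of 8; the M4 adds a
# non-stepping fourth wheel (Beta or Gamma) and 2 thin reflectors.
from math import perm  # perm(n, k) = n! / (n-k)!, ordered choices

m3_orders    = perm(8, 3)        # 336 ordered rotor choices
m3_positions = 26 ** 3           # 17,576 start positions per order

m4_orders    = perm(8, 3) * 2 * 2  # x2 fourth wheels, x2 thin reflectors
m4_positions = 26 ** 4             # 456,976 start positions per order

print(m3_positions, m4_positions, m4_positions // m3_positions)
# -> 17576 456976 26
```

So per wheel order the bombe run is only ~26x longer - linear in the extra wheel's settings - which supports the "it's a bug" theory rather than a complexity blow-up.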
TJM Project administrator Project developer Project scientist Send message Joined: 25 Aug 07 Posts: 843 Credit: 267,994,998 RAC: 0 |
The app is far from release. There are certain bugs that affect both the GPU and CPU versions, while the GPU version has at least a couple of issues of its own.
thinking_goose Send message Joined: 12 Nov 07 Posts: 119 Credit: 2,750,621 RAC: 0 |
PrimeGrid makes my 8600GT run a bit slower too, but it hasn't caused any problems yet. I can always switch the system over to the onboard graphics if it gets too slow...
quel Send message Joined: 19 May 09 Posts: 34 Credit: 32,923,471 RAC: 0 |
To decrease latency, send smaller buffers of work to the GPU, but send more batches in total. This gives the GPU more chances to handle other tasks. It isn't optimal for the GPUs that can handle bigger buffers, but it's better than crushing GPU crunchers. We took that approach for DistrRTgen.

There are a lot of fancy things you can do with the various custom app plans in BOINC, plus runtime detection to scale the buffer sizes to the card, but it's a giant pain. Everything is GPL-licensed, so feel free to take a look.

Oh, and actually follow the BOINC dev doc recommendations on how to do the thread yielding. That code does it manually, but that's to work around bugs in Windows clients < 6.10.59 (or maybe 6.10.60), and BOINC Linux CUDA has no stable release that fixes all the bugs yet - it's only in the upstream dev trees. (The bug in question: if the GPU task is suspended for whatever reason, then upon resume it would enter a state where every WU would error. My Linux hack makes it so that, at worst, the WU that was suspended might error instead of your entire task list.)
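The batching idea above can be sketched as a plain loop. This is a hypothetical illustration only - `run_keyspace` and `launch_kernel` are invented names, and in the real app the callback would be a CUDA kernel launch followed by a synchronize and a BOINC yield/checkpoint:

```python
# Hypothetical sketch of the "smaller buffers, more batches" approach:
# instead of one huge launch covering the whole keyspace, issue many
# small launches so the driver can service the desktop (and BOINC can
# checkpoint) in between.
def run_keyspace(total_keys: int, batch_size: int, launch_kernel):
    """launch_kernel(start, count) stands in for the real CUDA launch."""
    done = 0
    while done < total_keys:
        count = min(batch_size, total_keys - done)
        launch_kernel(done, count)   # small launch -> short GPU stall
        done += count
        # real app: cudaDeviceSynchronize() + BOINC yield/checkpoint here
    return done

# Dry run: record the launches instead of hitting a GPU.
calls = []
run_keyspace(total_keys=456976, batch_size=65536,
             launch_kernel=lambda s, n: calls.append((s, n)))
print(len(calls), calls[-1])  # -> 7 (393216, 63760)
```

Tuning `batch_size` per card (small for an 8500GT, large for a GTX 460) is exactly the "runtime detection" pain mentioned above.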
quel Send message Joined: 19 May 09 Posts: 34 Credit: 32,923,471 RAC: 0 |
Also, despite this not being the "proper" BOINC method: add cudaSetDeviceFlags(cudaDeviceBlockingSync); as the first line of cuda_turing_run in cuda_turing.cu and see what the responsiveness looks like. It will decrease performance, but should leave you with a usable computer.
Romero Send message Joined: 11 Sep 10 Posts: 1 Credit: 195,497 RAC: 0 |
Hi, could someone tell me whether the CUDA application for Enigma@Home is still in development? Thanks