PrimeGrid
1) Message boards : Problems and Help : How to enable NVIDIA GPU on Ubuntu 10.10 64 bit (Message 35154)
Posted 3814 days ago by Bent Vangli
I just want to share how I got my GPU working with PrimeGrid. It may save you from a few of the pitfalls I was caught in :-)

Equipment:
- Gigabyte GA-890GPA-UD3H Rev 2.1 motherboard. BIOS version FF
- 16 GByte RAM distributed on four 4 GByte modules
- AMD Phenom II 1100T 6-core processor
- GeForce GTX 460 (1 GByte RAM, 336 CUDA processors)
- Ubuntu 10.10 64 bit.
- BOINC installed from the package manager (Synaptic)

Do the following:
- Update all your programs from Update Manager, including any kernel updates. Reboot if requested.
- Download the latest NVIDIA driver, version 270.26 Beta, from NVIDIA. Keep away from the 260.xx versions: they fail on Cullen/Woodall Sieve (cuda23) and maybe on other CUDA applications.

You should now be ready for doing some magic :-)

With the package manager installation, BOINC is started automatically when the computer is turned on. This is comparable to service mode on Windows machines. On Windows, BOINC isn't able to use your GPU for CUDA processing in service mode. However, it is possible on Linux if the NVIDIA driver is loaded before the BOINC application. By default this is not the case, so a little modification is necessary.
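
A quick way to check the precondition described above (NVIDIA driver loaded) is to look for the driver module in /proc/modules. This is only a minimal sketch: the function name nvidia_module_loaded is made up for illustration, and it assumes the driver shows up as a kernel module named exactly "nvidia".

```shell
# Return success (exit status 0) if a module named "nvidia" is listed.
# $1: module list to read; defaults to /proc/modules, where each line
#     starts with the module name. Passing a file makes this testable.
# Assumption: the proprietary driver registers itself as "nvidia".
nvidia_module_loaded() {
    modules_file="${1:-/proc/modules}"
    awk '$1 == "nvidia" { found = 1 } END { exit !found }' "$modules_file"
}

# Example use after boot:
# if nvidia_module_loaded; then echo "driver is loaded"; else echo "driver is NOT loaded"; fi
```

This only tells you that the module is loaded at the moment you run it, not in which order things were started during boot.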

Before you do the actual tricks to change this automated startup, take a look at runlevels.

My Ubuntu starts in runlevel 2, which I believe is the standard on Ubuntu. You find your runlevel by typing "runlevel" on a command line. The response will look like "N 2", where 2 is your runlevel. Knowing this, you are now prepared.
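
If you want to pick the runlevel out of that response in a script, something like the following should do. It is a small sketch; parse_runlevel is just an illustrative name, and it assumes the two-field "previous current" output format described above.

```shell
# Extract the current runlevel from `runlevel` output such as "N 2".
# The first field is the previous runlevel ("N" means none),
# the second field is the current runlevel.
parse_runlevel() {
    # $1: one line in the form "<previous> <current>"
    printf '%s\n' "$1" | awk '{ print $2 }'
}

# Example use: show the directory holding the start links for this runlevel.
# echo "/etc/rc$(parse_runlevel "$(runlevel)").d"
```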

Steps (Do it as root. You switch to root by typing "sudo su" in a terminal window.):

1) Rename /etc/rc2.d/S20boinc-client to /etc/rc2.d/S99boinc-client (the 2 reflects the runlevel). This loads BOINC very late in the boot process.

2) Switch to single-user text mode by typing "telinit 1" in your terminal window. This leaves graphical mode. Become root again if you are not already.

3) Change to the directory where you downloaded the NVIDIA driver, using cd.

4) Install the driver by typing "sh ./NVIDIA-Linux-x86_64-270.26.run" (or the version you downloaded). This installation has to be done in non-graphical mode.

5) Reboot. Just type "reboot" and press Enter.

If all went well, you should return to graphical mode with the new driver and can log in as normal. Your BOINC client should start automatically with GPU processing enabled. If you later update your kernel version, it may be necessary to repeat steps 2) to 5) above.
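
The rename in step 1) can be wrapped in a small function so that it refuses to do anything if the link is not where you expect it. This is a sketch under assumptions, not a tested installer: move_boinc_late is a made-up name, and it assumes the start link is called S20boinc-client (check with "ls /etc/rc2.d" first). The remaining steps are shown only as comments because they reboot the machine.

```shell
# Move the BOINC start link to the end of the boot order for one runlevel.
# $1: rc directory, for example /etc/rc2.d for runlevel 2.
# Assumption: the link is named S20boinc-client; adjust if yours differs.
move_boinc_late() {
    rc_dir="$1"
    if [ -e "$rc_dir/S20boinc-client" ] || [ -L "$rc_dir/S20boinc-client" ]; then
        mv "$rc_dir/S20boinc-client" "$rc_dir/S99boinc-client"
    else
        echo "no S20boinc-client in $rc_dir" >&2
        return 1
    fi
}

# As root, on a runlevel 2 system:
# move_boinc_late /etc/rc2.d
# telinit 1                                 # step 2: leave graphical mode
# cd /path/to/download                      # step 3: hypothetical path
# sh ./NVIDIA-Linux-x86_64-270.26.run       # step 4: or the version you downloaded
# reboot                                    # step 5
```

The function takes the directory as an argument so you can try it on a scratch directory before touching /etc.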

Good luck :-)

Bent Vangli


PS! If anybody finds improvements or errors in this description, please comment. :-)
2) Message boards : Problems and Help : GCW(Sieve) cuda errors, but PPS (Sieve) cuda runs just fine. (Message 34958)
Posted 3819 days ago by Bent Vangli
I can confirm, driver 270.26 beta works perfectly. Thanks again rroonnaalldd

Bent :-)
3) Message boards : Problems and Help : GCW(Sieve) cuda errors, but PPS (Sieve) cuda runs just fine. (Message 34955)
Posted 3819 days ago by Bent Vangli
Thanks rroonnaalldd

I will try that and report back.

Best regards Bent
4) Message boards : Problems and Help : GCW(Sieve) cuda errors, but PPS (Sieve) cuda runs just fine. (Message 34949)
Posted 3819 days ago by Bent Vangli
Hi

This week I was able to get my GPU up and running (I am working on compiling a "what I did list" for this forum), but then I stumbled into a peculiar problem with Cullen/Woodall Prime Search (Sieve) v1.12 (cuda23) tasks. They all error out after 1-3 GPU seconds (10-20 CPU seconds).

System:
- Gigabyte GA-890GPA-UD3H Rev 2.1 motherboard. BIOS version FF
- 16 GByte RAM distributed on four 4 GByte modules (ganged mode)
- AMD Phenom II 1100T 6-core processor
- GeForce GTX 460 (336 CUDA processors, 1 GByte RAM)
- Ubuntu 10.10 64-bit version
- NVIDIA driver version 260.19.44

Task example giving this output:
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)
</message>
<stderr_txt>
Unrecognized XML in parse_init_data_file: hostid
Skipping: 186828
Skipping: /hostid
Unrecognized XML in parse_init_data_file: starting_elapsed_time
Skipping: 0.000000
Skipping: /starting_elapsed_time
--pmin=332856740000000 from command line overrides pmin=1000000001 from `input'
Sieve started: 332856740000000 <= p < 332856860000000
Thread 0 starting
Detected GPU 0: GeForce GTX 460
Detected compute capability: 2.1
Detected 7 multiprocessors.
Warning: A kernel failed with error unknown error. Retry 2.
Warning: A kernel failed with error unknown error. Retry 3.
Warning: A kernel failed with error unknown error. Retry 4.
Warning: A kernel failed with error unknown error. Retry 5.
Warning: A kernel failed with error unknown error. Retry 6.
Warning: A kernel failed with error unknown error. Retry 7.
Warning: A kernel failed with error unknown error. Retry 8.
Warning: A kernel failed with error unknown error. Retry 9.
Warning: A kernel failed with error unknown error. Retry 10.
Cuda error: getting factors found: unknown error
called boinc_finish

</stderr_txt>


The peculiar thing, for me :-), is that every Proth Prime Search (Sieve) v1.38 (cuda23) task runs perfectly. I have read that a while ago some of the CW CUDA tasks had an error in them, but I don't think this is the case now. I can't really see that it is a GPU hardware error, since a heavy load of PPS CUDA tasks has finished without errors and still runs perfectly. GPU temperature is 74-75 degrees Celsius, far below the maximum of 104. Does anyone have a clue?
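
For anyone comparing several failed tasks, the failure signature in the output above can be counted with grep. A minimal sketch: count_kernel_failures is an illustrative name, and it assumes you have saved a task's stderr text to a file yourself (the file name in the example is hypothetical).

```shell
# Count "A kernel failed" warnings in stderr text read from standard input.
# grep -c prints the number of matching lines; "|| true" keeps the exit
# status at 0 when there are no matches (grep itself would return 1).
count_kernel_failures() {
    grep -c 'Warning: A kernel failed with error' || true
}

# Example use with a saved copy of the task output (hypothetical file name):
# count_kernel_failures < task_stderr.txt
```

For the task shown above this would report 9, one for each of the retries 2 to 10.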

With very best regards

Bent Vangli, Oslo, Norway
5) Message boards : Problems and Help : Ubuntu 10.10 64bit AMD Phenom II 6 core trouble with some projects (Message 34736)
Posted 3824 days ago by Bent Vangli
Thanks for information. I learn every day.

I haven't noticed any difference in speed, at least not with any of my BOINC projects. I have built this PC purely for BOINC'ing, and next week I will install a GPU to test accelerated tasks.

With best regards Bent
6) Message boards : Problems and Help : Ubuntu 10.10 64bit AMD Phenom II 6 core trouble with some projects (Message 34699)
Posted 3824 days ago by Bent Vangli
Hi everyone

The problem I reported in this thread seems to have a working solution. I write this in the hope that it may help others in the same situation.

Equipment:
- Gigabyte GA-890GPA-UD3H Rev 2.1 motherboard. BIOS version FF
- 16 GByte RAM distributed on four 4 GByte modules (1333 MHz)
- AMD Phenom II 1100T 6-core processor (No overclocking)

Problem description:
All longer, and occasionally shorter, LLR-based tasks seem to complete, but do not validate. All Sieve-based tasks complete perfectly and validate.

Solution:
- Enter BIOS during boot by pressing the Del key
- Enter MB Intelligent Tweaker (M.I.T.)
- Enter DRAM Configuration
- At the top of the screen, change DCTs Mode to "ganged"
- Press F10 for saving settings and reboot

What's happening:
This is more a guess than solid knowledge :-) The 1100T CPU has two memory controllers named DCT0 and DCT1. In the default setting "unganged" they may operate more or less independently. My guess is that the highly optimized assembler code in LLR isn't aware of this, so when the processor internally does task switches or other multi-core allocations, the LLR process may occasionally use the wrong DCT and then operate on the wrong memory allocations. With the DCT mode set to "ganged", the two controllers are tied together (made equal?), so this error can't show up.

Credits:
The clue to find this solution was given by TheDawgz, but thanks to everybody helping along the road. Every proposal and help brought me further in the investigation.

With very best regards

Bent, Oslo, Norway
7) Message boards : Problems and Help : Ubuntu 10.10 64bit AMD Phenom II 6 core trouble with some projects (Message 34632)
Posted 3826 days ago by Bent Vangli
Very interesting!!!!

A TRP LLR task now validated ( http://www.primegrid.com/workunit.php?wuid=173552059 )

However, it validated against another Phenom II processor of an earlier model. So before I can be conclusive, the other three LLR tasks have to complete.

TheDawgz, it may turn out that you gave the best clue even though I did the opposite :-)))

I hope PrimeGrid gives you some bonus credits :-))

Setting DCT to "ganged" seems to help in my situation. Motherboard defaults to "unganged".
8) Message boards : Problems and Help : Ubuntu 10.10 64bit AMD Phenom II 6 core trouble with some projects (Message 34624)
Posted 3826 days ago by Bent Vangli
As the easiest thing to do immediately, I checked the BIOS DCTs mode. By default it was set to "unganged". To try a change I switched it to "ganged", which should tie the two built-in memory controllers in the CPU together.

After restarting I aborted the LLR units that were already running and downloaded four new units:

3 x TRP LLR
1 x 321 PS LLR

Let us see what happens now :-)

Bent
9) Message boards : Problems and Help : Ubuntu 10.10 64bit AMD Phenom II 6 core trouble with some projects (Message 34623)
Posted 3826 days ago by Bent Vangli
New status:

SGP LLR finished, reported and validated. ( http://www.primegrid.com/workunit.php?wuid=173555973 )

TRP LLR finished, reported and marked as invalid ( http://www.primegrid.com/workunit.php?wuid=173481173 )

PPS LLR finished, reported and waiting for validation. ( http://www.primegrid.com/workunit.php?wuid=173556327 )

321 PS LLR finished, reported and marked with inconclusive validation ( http://www.primegrid.com/workunit.php?wuid=173501974 )

Woodall PS LLR still running ( http://www.primegrid.com/workunit.php?wuid=173483997 )

Cullen PS LLR still running ( http://www.primegrid.com/workunit.php?wuid=167451872 )

PSP LLR still running ( http://www.primegrid.com/workunit.php?wuid=173484291 )

Seventeen or Bust LLR still running ( http://www.primegrid.com/workunit.php?wuid=169605259)



In parallel I am running several different Sieve tasks:

Cullen/Woodall PS Sieve finished, reported and validated ( http://www.primegrid.com/workunit.php?wuid=171236217 )

321 Prime Search Sieve finished, reported and validated ( http://www.primegrid.com/workunit.php?wuid=172796298 )

The Riesel Problem Sieve finished, reported and validated ( http://www.primegrid.com/workunit.php?wuid=167527527 )

PPS Sieve still running ( http://www.primegrid.com/workunit.php?wuid=173125671)


I monitor several temperatures during the run. CPU temperature is about 40-41 degrees Celsius at 100% load on all cores, thanks to a liquid cooling system. The northbridge may reach up to 50 degrees Celsius. All hard disks are below 40 degrees Celsius.

Vato: I will try to follow your instructions and report back.
TheDawgz: I will check that too.

According to AMD the 1100T Phenom II is a 10h model, which has several improvements over earlier models in the streaming architecture, thus affecting the behaviour of the SSEx instructions. If I have read the documentation correctly, LLR depends heavily on assembler code. Could it be some kind of mismatch?

Very best regards

Bent
10) Message boards : Problems and Help : Ubuntu 10.10 64bit AMD Phenom II 6 core trouble with some projects (Message 34612)
Posted 3827 days ago by Bent Vangli
One unit from SGP LLR already finished and validated. ( http://www.primegrid.com/workunit.php?wuid=173555973 )

One unit from PPS LLR finished and waiting for validation. ( http://www.primegrid.com/workunit.php?wuid=173556327 )

It's late at night in Norway and I will follow up tomorrow :-)

Best regards Bent

Thanks TheDawgz, I really appreciate your help. I am really fumbling in the dark. Let the force be with me :-)



Copyright © 2005 - 2021 Rytis Slatkevičius (contact) and PrimeGrid community. Server load 1.40, 2.05, 2.32
Generated 20 Sep 2021 | 1:42:14 UTC