CUDA on Gentoo

If you're at least a slightly geeky person and fond of C, I definitely recommend checking out a thing called CUDA. Basically it's a way to use the GPU on your nVidia graphics card for arbitrary parallel processing. And when I say parallel, I'm not thinking about a measly four cores but about 192 cores (available on the GeForce GTS 450, a reasonable basic model; other cards have up to 1024 cores). I'm no CUDA guru, I've only been playing with it for a few days, but maybe this post will help you take the first steps more easily - especially if you're using Gentoo like me ...

Let's suppose you've successfully installed the card in a PCI-E slot and Gentoo is running. Check that the OS is aware of the new card, i.e. that it's listed in the lspci output:

$ lspci | grep -i 'vga .* nvidia'

If you see something like this

01:00.0 VGA compatible controller: nVidia Corporation Device 1245 ...

then everything seems to work fine. Now you need to install the packages - the drivers, the SDK and the toolkit:

  • x11-drivers/nvidia-drivers
  • dev-util/nvidia-cuda-sdk
  • dev-util/nvidia-cuda-toolkit

The current SDK and toolkit version is 4.0, but it's still masked in portage so you have to unmask it (no, I really don't want to use the ancient version 2.x). So let's put these two lines into /etc/portage/package.keywords

dev-util/nvidia-cuda-sdk
dev-util/nvidia-cuda-toolkit
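
If you prefer to add those lines from the shell, a minimal sketch could look like this. The helper name add_keyword is hypothetical (my own, not part of portage), and the target file is passed as an argument so you can try it on a scratch file first.

```shell
#!/bin/bash

# add_keyword FILE ATOM
# Hypothetical helper: append ATOM to FILE unless that exact line is already there.
add_keyword() {
        local file="$1" atom="$2"
        touch "$file"
        if ! grep -qxF -- "$atom" "$file"; then
                echo "$atom" >> "$file"
        fi
}

# Try it on a scratch file first; as root you would pass
# /etc/portage/package.keywords instead.
scratch=$(mktemp)
add_keyword "$scratch" dev-util/nvidia-cuda-sdk
add_keyword "$scratch" dev-util/nvidia-cuda-toolkit
add_keyword "$scratch" dev-util/nvidia-cuda-sdk    # duplicate, not added again
cat "$scratch"
rm -f "$scratch"
```

Running the helper twice with the same atom leaves the file unchanged, so it is safe to re-run.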

You don't need to unmask nvidia-drivers, but you should have the current stable version - I was not able to make it work with 270.41.06, but with 270.41.19 everything seems to work fine.
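
To check up front whether a driver version is new enough, you can compare version strings. This is only a sketch: version_ge is a hypothetical helper name, it relies on the -V (version sort) option of GNU sort, and the threshold 270.41.19 is simply the version that worked for me, not an official minimum.

```shell
#!/bin/bash

# version_ge A B
# Succeeds if version string A is greater than or equal to B.
# Assumes GNU sort, whose -V option sorts version numbers.
version_ge() {
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Compare the two driver versions mentioned in this post.
for v in 270.41.06 270.41.19; do
        if version_ge "$v" 270.41.19; then
                echo "$v: should be fine"
        else
                echo "$v: too old, upgrade nvidia-drivers"
        fi
done
```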

Ok, so let's install the packages. Run this

emerge nvidia-cuda-sdk nvidia-cuda-toolkit

and if this succeeds, you're on the right path to a working CUDA setup. Let's check that it works. The SDK installs a bunch of samples to /opt/cuda/sdk/C/bin/linux/release. The program we want to run is deviceQuery, and you have to run it as root:

$ sudo /opt/cuda/sdk/C/bin/linux/release/deviceQuery

That should print various info about the devices with CUDA support. In my case it prints this:

[deviceQuery] starting...
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Found 1 CUDA Capable device(s)

Device 0: "GeForce GTS 450"
  CUDA Driver Version / Runtime Version          4.0 / 4.0
  CUDA Capability Major/Minor version number:    2.1
  Total amount of global memory:                 1024 MBytes (...)
  ( 4) Multiprocessors x (48) CUDA Cores/MP:     192 CUDA Cores
  GPU Clock Speed:                               1.57 GHz
  Memory Clock rate:                             1804.00 Mhz
  Memory Bus Width:                              128-bit
  L2 Cache Size:                                 262144 bytes
  ...

and much more information.

This program actually does one important thing - it creates the necessary device files in /dev, namely

/dev/nvidiactl
/dev/nvidia0

and more /dev/nvidiaX devices if you have more CUDA-capable devices. I have just one, so I get /dev/nvidia0. Running this program without root privileges (and without the devices already existing) results in this strange error:

[deviceQuery] starting...
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 38
-> no CUDA-capable device is detected
[deviceQuery] test results...
FAILED

Press ENTER to exit...

which means "the devices in /dev do not exist and I can't create them." That's obviously a bit annoying because the files disappear after a restart and "modprobe nvidia" does not recreate them, so you have to create them every time you restart your machine. You can either run deviceQuery with root privileges, or you can create this simple script in /etc/local.d/:

#!/bin/bash

# Load the kernel module; only continue if that worked.
if /sbin/modprobe nvidia; then
        # Count the number of NVIDIA controllers found.
        NVDEVS=$(lspci | grep -i NVIDIA)
        N3D=$(echo "$NVDEVS" | grep -c "3D controller")
        NVGA=$(echo "$NVDEVS" | grep -c "VGA compatible controller")
        N=$((N3D + NVGA - 1))

        # Create one device file per controller, unless it already exists.
        for i in $(seq 0 "$N"); do
                if [ ! -e "/dev/nvidia$i" ]; then
                        mknod -m 666 "/dev/nvidia$i" c 195 "$i"
                fi
        done

        # Create the control device, unless it already exists.
        if [ ! -e "/dev/nvidiactl" ]; then
                mknod -m 666 /dev/nvidiactl c 195 255
        fi
else
        exit 1
fi

which is just a slightly enhanced version of the script from the CUDA Getting Started Guide (it checks that the devices do not exist before actually creating them). Name it cuda.start and mark it as executable.
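
To see what the script would do without touching /dev, roughly the same logic can be run as a dry run that only prints the mknod commands. This is a sketch, not part of the Getting Started Guide: plan_nodes is a hypothetical name, it reads lspci-style text from standard input, and the Intel line in the sample input is made up for illustration.

```shell
#!/bin/bash

# plan_nodes: read lspci-style output on stdin and print (not run) the
# mknod commands that cuda.start would issue for it.
plan_nodes() {
        local count i
        # NVIDIA lines that are either 3D or VGA controllers.
        count=$(grep -i nvidia | grep -c -e "3D controller" -e "VGA compatible controller")
        for ((i = 0; i < count; i++)); do
                echo "mknod -m 666 /dev/nvidia$i c 195 $i"
        done
        # The control device uses major 195 and minor 255.
        echo "mknod -m 666 /dev/nvidiactl c 195 255"
}

# Sample input: one integrated (non-NVIDIA) card and one nVidia card.
plan_nodes <<'EOF'
00:02.0 VGA compatible controller: Intel Corporation Device
01:00.0 VGA compatible controller: nVidia Corporation Device 1245
EOF
```

Only the nVidia line is counted, so the sample yields one /dev/nvidia0 command plus the one for /dev/nvidiactl.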

I personally use yet another script, /etc/local.d/cuda.stop, which does exactly the opposite, i.e. it removes the files created by cuda.start and unloads the nvidia module from the kernel:

#!/bin/bash

# Check if the nvidia module is loaded.
mod=$(lsmod | grep -c '^nvidia ')

if [ "$mod" -eq 1 ]; then
        # Count the number of NVIDIA controllers found.
        NVDEVS=$(lspci | grep -i NVIDIA)
        N3D=$(echo "$NVDEVS" | grep -c "3D controller")
        NVGA=$(echo "$NVDEVS" | grep -c "VGA compatible controller")
        N=$((N3D + NVGA - 1))

        # Remove the per-device files created by cuda.start.
        for i in $(seq 0 "$N"); do
                if [ -e "/dev/nvidia$i" ]; then
                        rm "/dev/nvidia$i"
                fi
        done

        # Remove the control device.
        if [ -e "/dev/nvidiactl" ]; then
                rm /dev/nvidiactl
        fi

        # Unload the kernel module.
        rmmod nvidia
else
        exit 1
fi

This way it should work fine even when restarting the /etc/init.d/local service.

I'm not quite sure how this is going to work if you are using the nVidia graphics card as the primary one (i.e. for X Window) - I'm using it just for CUDA, and the integrated graphics for the GUI etc.

Anyway I think these two scripts are quite handy and should be added to nvidia-cuda-sdk, so I've created this enhancement bug.

Update: According to the bug, the files in /dev may already be created after a restart. I'm not sure why that is not happening in my case, but I suspect it's because I use the nVidia card just for CUDA. Anyway, before creating the scripts (and before running deviceQuery with root privileges), check if the /dev/nvidia* files exist - if they do, then you don't need to bother with the start/stop scripts.
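
That check can be scripted. A minimal sketch, assuming a single card (so just nvidia0): the helper name have_nvidia_nodes is hypothetical, and the directory is a parameter so the function can be tried on any directory, not only /dev.

```shell
#!/bin/bash

# have_nvidia_nodes [DIR]
# Succeeds if DIR (default /dev) contains both nvidiactl and nvidia0.
have_nvidia_nodes() {
        local dir="${1:-/dev}"
        [ -e "$dir/nvidiactl" ] && [ -e "$dir/nvidia0" ]
}

if have_nvidia_nodes; then
        echo "device files exist, no start/stop script needed"
else
        echo "device files missing, create them as described above"
fi
```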

Just for the sake of completeness - using the old drivers (270.41.06) results in this error:

[deviceQuery] starting...
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 3
-> initialization error
[deviceQuery.exe] test results...
FAILED

Press ENTER to exit...

but installing the new ones fixes that.
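
To sum up the two failures I ran into, here is a tiny lookup of the cudaGetDeviceCount return codes seen in this post. It is just a sketch: explain_cuda_error is a hypothetical helper, and it only knows the two codes and causes described above (as printed by the 4.0 deviceQuery), nothing more.

```shell
#!/bin/bash

# explain_cuda_error CODE
# Hypothetical helper: map the return codes seen in this post to their likely cause.
explain_cuda_error() {
        case "$1" in
                38) echo "no CUDA-capable device is detected: check that /dev/nvidiactl and /dev/nvidia0 exist" ;;
                3)  echo "initialization error: nvidia-drivers may be too old (270.41.06 failed for me)" ;;
                *)  echo "code $1: not covered in this post" ;;
        esac
}

explain_cuda_error 38
explain_cuda_error 3
```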

Comments

Thanks, Especially for the sake of completeness

Yes!! I had always this initialisation error! Old damn driver!

Thanks and have a good day

Thanks

Well written. Helped me with "no CUDA-capable device is detected" error.

Thanks

Thanks

Sorted my "no CUDA-capable device is detected" problem. Thanks.
