CS475 Lab: Introduction to GPUs and CUDA

In this lab session you will learn about the GPUs installed in the CS lab machines and use an NVIDIA tool to find out their features.

1. Details of the GPUs on the CS machines (see the machines list)

  1. Log into 3 different CS machines: 1 of the capital machines and 2 in the hpc-lab. (Get the names from the machines list and choose one machine of each color on the slide.)
    1. ssh to each machine (for example, an hpc-lab machine) and run the command /sbin/lshw
    2. This lists the details of all the hardware in the machine
    3. Check the GPUs listed and record their details. When you are done, you should have a list of 3 different GPUs. Note that each machine has more than one GPU, and you have to identify which one is CUDA capable.
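The full lshw listing is long, so it can help to narrow it to the display adapters. The sketch below is one possible way to do that, not a required part of the lab. It uses lshw's -C option to select the "display" hardware class; the function names list_gpus and gpu_summary are made up for this example, and the exact fields lshw prints may differ between machines.

```shell
#!/bin/sh
# Print only the display adapters that lshw reports.
# "-C display" restricts the listing to the display hardware class.
list_gpus() {
    if command -v lshw >/dev/null 2>&1; then
        lshw -C display 2>/dev/null
    elif [ -x /sbin/lshw ]; then
        /sbin/lshw -C display 2>/dev/null
    else
        echo "lshw not found on this machine" >&2
        return 1
    fi
}

# Keep only the lines worth recording for each adapter:
# the entry header, product, vendor, and driver configuration.
# Reads lshw output on standard input.
gpu_summary() {
    grep -E '^[[:space:]]*(\*-display|product:|vendor:|configuration:)'
}

# Usage on a lab machine:
#   list_gpus | gpu_summary
```

The vendor and driver lines are a reasonable first hint about which adapter is the CUDA-capable one, but deviceQuery in the next step is the definitive check.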
  2. Check the features
    1. Use the NVIDIA built-in tool to check the features of the installed GPUs
    2. Copy the directory /usr/local/cuda-7.5/samples/1_Utilities/deviceQuery into your own workspace
    3. Run the command "make", which compiles the .cpp file
    4. The make command will fail, because the Makefile uses paths relative to the original samples directory
    5. Fix the Makefile by changing 3 parts of the file:
      INCLUDES  := -I../../common/inc
      to
      INCLUDES  := -I/usr/local/cuda-7.5/samples/common/inc
      
      $(EXEC) mkdir -p ../../bin/$(TARGET_ARCH)/$(TARGET_OS)/$(BUILD_TYPE)
      $(EXEC) cp $@ ../../bin/$(TARGET_ARCH)/$(TARGET_OS)/$(BUILD_TYPE)
      to
      $(EXEC) mkdir -p bin/$(TARGET_ARCH)/$(TARGET_OS)/$(BUILD_TYPE)
      $(EXEC) cp $@ bin/$(TARGET_ARCH)/$(TARGET_OS)/$(BUILD_TYPE)
      
      rm -rf ../../bin/$(TARGET_ARCH)/$(TARGET_OS)/$(BUILD_TYPE)/deviceQuery
      to
      rm -rf bin/$(TARGET_ARCH)/$(TARGET_OS)/$(BUILD_TYPE)/deviceQuery
      
    6. Run "make" again, then run the resulting executable: ./deviceQuery
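The three Makefile changes above can also be applied with sed instead of a text editor. The following is a minimal sketch under a few assumptions: the function name patch_makefile is hypothetical, the samples are installed under /usr/local/cuda-7.5/samples as stated above, and the Makefile contains the paths exactly as shown. It keeps a backup copy so the edit can be undone.

```shell
#!/bin/sh
# Rewrite the relative paths in a copied deviceQuery Makefile.
#   $1: path to the Makefile to patch
#   $2: samples directory (defaults to the path used in this lab)
patch_makefile() {
    makefile="$1"
    samples_dir="${2:-/usr/local/cuda-7.5/samples}"

    # Keep the original so the change can be reviewed or reverted.
    cp "$makefile" "$makefile.orig" || return 1

    # 1st expression: point INCLUDES at the installed common/inc directory.
    # 2nd expression: replace every ../../bin/ with a local bin/ directory
    #                 (covers the mkdir, cp, and rm lines).
    sed -e "s|-I\.\./\.\./common/inc|-I${samples_dir}/common/inc|" \
        -e "s|\.\./\.\./bin/|bin/|g" \
        "$makefile.orig" > "$makefile"
}

# Usage, from inside your copy of the deviceQuery directory:
#   patch_makefile Makefile
#   diff Makefile.orig Makefile    # review the three changed parts
#   make
```

Reviewing the diff before running make is worthwhile, since the second expression changes every occurrence of ../../bin/ and not only the three lines listed above.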
  3. Lab Report
    1. Prepare the machine description portion of a lab report. For each of the different GPUs installed on the machines, report the fields below, along with a description of each host machine (2 of which will be the same).
      1. CUDA Cores
      2. Clock Speed
      3. Memory Clock Speed
      4. Total amount of shared memory per block
      5. L2 Cache Size
      6. CUDA Capability level
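To collect these fields without copying them by hand, the deviceQuery output can be filtered with grep. This is only a sketch: the function name report_fields is made up, and the patterns assume that deviceQuery labels its lines with text such as "CUDA Cores", "Clock rate", "L2 Cache Size", and "shared memory per block". Check the filtered result against the full output, since the labels may vary between CUDA versions.

```shell
#!/bin/sh
# Keep only the deviceQuery lines needed for the lab report.
# Reads deviceQuery output on standard input.
# "Clock rate" is meant to match both the GPU and the memory clock lines.
report_fields() {
    grep -iE '^Device [0-9]+:|CUDA Capability|CUDA Cores|Clock rate|shared memory per block|L2 Cache Size'
}

# Usage, from the directory that contains the deviceQuery executable:
#   ./deviceQuery | report_fields > gpu_report_$(hostname).txt
```

Naming the file after the host keeps the reports from the 3 machines apart when they share a home directory.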

2. Run Hello World in CUDA!

  1. Compile and run the given HelloWorld.cu program
  2. Modify the number of threads and blocks and observe how the output changes
  3. Uncomment the if statement in the kernel code and run the program again
Instructions to run the program:
  1. Compile with the command below:
    nvcc HelloWorld.cu -o hello
  2. Run it with the command ./hello to see the output
  3. Alternatively, you can use a Makefile to compile it
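When comparing runs with different thread and block counts, it can help to save each run's output to a file. The helper names build_and_save and compare_runs below are hypothetical and not part of the lab materials. The sketch assumes that HelloWorld.cu is in the current directory and that you edit the thread and block counts in the source between runs.

```shell
#!/bin/sh
# Compile HelloWorld.cu and save the program output under a label,
# for example: build_and_save 2blocks_4threads
build_and_save() {
    label="$1"
    nvcc HelloWorld.cu -o hello || return 1
    ./hello > "run_${label}.txt"
}

# Print how many output lines each saved run produced.
# Takes one or more labels previously passed to build_and_save.
compare_runs() {
    for label in "$@"; do
        printf '%s: %s lines\n' "$label" "$(wc -l < "run_${label}.txt" | tr -d ' ')"
    done
}

# Usage:
#   build_and_save first          # original thread/block counts
#   (edit HelloWorld.cu)
#   build_and_save second
#   compare_runs first second
```

If each thread prints exactly one line (an assumption about HelloWorld.cu), the line count of a run should equal the number of threads per block times the number of blocks. The saved files also show the order in which the lines were printed.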