Monday, 7 December 2015

Project Euler #10: Summation of primes

This post is copied from another blog I run. I'll be shutting that blog down shortly and all future posts of that kind will go on this blog. These are solutions to various programming brain teasers. Optimizing your solution is the real challenge.

The problem reads as follows:
The sum of the primes below 10 is 2 + 3 + 5 + 7 = 17.
Find the sum of all the primes below two million.
Right from the start I can think of two methods:
  1. Using a similar solution as for PE-7 but adding the primes instead of storing them
  2. Sieve of Eratosthenes
I opted for the first method since I'm lazy. With minimal modifications, I get 173518 microseconds as my runtime. I was curious about the second method, so I copied and modified a solution found in the problem discussion thread. You can find that one under sieve.cpp. It has a runtime of 221631 microseconds. That's still pretty good.
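For reference, the second method can be sketched roughly as follows. This is not the sieve.cpp mentioned above, just a minimal illustration; the function name sum_primes_sieve is made up for this example.

```cpp
#include <cstdint>
#include <vector>

// Sum of all primes strictly below `limit`, using a plain Sieve of Eratosthenes.
// `sum_primes_sieve` is a hypothetical name, not taken from the original code.
std::uint64_t sum_primes_sieve(std::uint32_t limit) {
    if (limit < 3) return 0;  // the smallest prime is 2, so nothing to add
    std::vector<bool> composite(limit, false);
    std::uint64_t sum = 0;
    for (std::uint32_t i = 2; i < limit; ++i) {
        if (composite[i]) continue;  // already crossed out by a smaller prime
        sum += i;
        // Cross out multiples of i, starting at i*i (smaller ones are already done).
        for (std::uint64_t j = std::uint64_t(i) * i; j < limit; j += i) {
            composite[j] = true;
        }
    }
    return sum;
}
```

Calling sum_primes_sieve(10) gives 17, matching the example in the problem statement.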

The third way I can think of is to combine the first two methods. As with PE-7, automatically exclude all multiples of 2 and 3 for a 2/3 reduction in numbers considered. Then find all the primes below the square root of 2 million (so anything up to and including 1414) and sieve on the remaining 1/3 with those prime numbers. This could result in a nice speedup over the first method. As usual, another thing you could do is to remove the use of the modulus operator - it's expensive.
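Here is a rough sketch of that combined idea, under a few assumptions: it only visits numbers of the form 6k±1, and it checks each candidate by trial division against the primes found so far whose square is below the limit, rather than doing a true sieve pass. It still uses the modulus operator, so it is an illustration of the idea and not a tuned solution. The name sum_primes_wheel is made up.

```cpp
#include <cstdint>
#include <vector>

// Sum of all primes strictly below `limit`, skipping multiples of 2 and 3.
// `sum_primes_wheel` is a hypothetical name used only for this sketch.
std::uint64_t sum_primes_wheel(std::uint32_t limit) {
    std::uint64_t sum = 0;
    if (limit > 2) sum += 2;
    if (limit > 3) sum += 3;

    // Primes p >= 5 with p*p < limit; enough to test every candidate below limit.
    std::vector<std::uint32_t> small_primes;

    // Candidates are 5, 7, 11, 13, 17, ...: the step alternates between 2 and 4.
    std::uint32_t step = 2;
    for (std::uint64_t n = 5; n < limit; n += step, step = 6 - step) {
        bool is_prime = true;
        for (std::uint32_t p : small_primes) {
            if (std::uint64_t(p) * p > n) break;  // no divisor up to sqrt(n)
            if (n % p == 0) { is_prime = false; break; }
        }
        if (!is_prime) continue;
        sum += n;
        if (n * n < limit) small_primes.push_back(static_cast<std::uint32_t>(n));
    }
    return sum;
}
```

For a limit of 10 this returns 17, the same as the example in the problem statement.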

The code can be found here 

Wednesday, 2 December 2015

Installing libclc on Ubuntu 14.04 LTS

libclc is needed for using Clang with OpenCL. See here for more details. Unfortunately it requires LLVM to be 3.7 or higher, while Ubuntu 14.04 only has up to 3.6.

To get around this you first have to get LLVM 3.7. Make sure you don't have LLVM installed before proceeding, then add this line:
deb http://llvm.org/apt/precise/ llvm-toolchain-precise-3.7 main
to this file:
/etc/apt/sources.list.d/llvm.list
I used nano. The original instructions using output redirection didn't work for me. You can find the address for other versions of Ubuntu or LLVM here. Now install llvm-3.7 and clang-3.7:
sudo apt-get install llvm-3.7 clang-3.7
It will complain that it can't verify the source, but we added the source so go ahead and ignore it. Next get libclc if you don't already have it:
git clone http://llvm.org/git/libclc.git
Go through the readme and use llvm-config-3.7 instead of llvm-config. You may get an error like so:
/usr/bin/ld: cannot find -ledit
In this case just install libedit:
sudo apt-get install libedit-dev
However, I still got many warnings of the following type:
WARNING: Linking two modules of different data layouts: 'amdgcn--/lib/image/write_image_impl.ll.tahiti.bc' is '' whereas 'llvm-link' is 'e-p:32:32-p1:64:64-p2:64:64-p3:32:32-p4:64:64-p5:32:32-p24:64:64-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64'
A quick Google search returned nothing. Because the data layout of the first module is empty, I'm going to proceed with the assumption that I can ignore this.

Tuesday, 17 November 2015

Mixing C and C++

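First, use g++: it accepts both C++ and C source files (compiling the C files as C++) and links against the C++ standard library, which gcc does not do by default. If for whatever reason you need to use gcc for the C files, compile them to object files and then link those with g++ along with the cpp files.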

If you're using malloc and get an error like this,
Process.cpp:45:53: error: invalid conversion from void* to char* [-fpermissive]
     char *linkCopy = malloc(strlen(&line[count]) + 1);
it's because malloc returns void*, which C converts to other pointer types implicitly but C++ does not, so in C++ you need to cast the return value. See this SO Q&A.
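A minimal sketch of the cast, assuming a helper that copies a C string (copy_string is a made-up name, not from the original Process.cpp):

```cpp
#include <cstdlib>
#include <cstring>

// Returns a heap-allocated copy of `text`; the caller must free() it.
// `copy_string` is a hypothetical helper used only for illustration.
char *copy_string(const char *text) {
    // malloc returns void*; C++ needs the explicit cast, C does not.
    char *copy = static_cast<char *>(std::malloc(std::strlen(text) + 1));
    if (copy != nullptr) {
        std::strcpy(copy, text);  // the buffer has room for the terminating '\0'
    }
    return copy;
}
```

A C-style cast, (char *)malloc(...), also compiles in both languages if the file has to stay valid C.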

Linker complains that a function doesn't exist when it does. First verify with nm that the function does indeed exist. You can use it on .o object and .a archive (static library) files. If it does, then your problem is that you're including a header for a C file in a C++ file. The C++ compiler mangles function names, so the symbol it asks the linker for doesn't match the unmangled name in the C object file. See this SO Q&A for a solution.
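The usual fix is to wrap the C declarations in extern "C" when the header is seen by a C++ compiler. Below is a sketch squeezed into one file so it compiles on its own; the file names and the functions c_add and add_from_cpp are made up for illustration.

```cpp
// --- c_add.h (hypothetical header shared by the C and C++ code) ---
#ifdef __cplusplus
extern "C" {  // only seen by C++ compilers: use C linkage, no name mangling
#endif

int c_add(int a, int b);

#ifdef __cplusplus
}
#endif

// --- stand-in for c_add.c ---
// In a real project this definition lives in a .c file compiled as C.
// It is defined here with C linkage only so the sketch fits in one file.
extern "C" int c_add(int a, int b) { return a + b; }

// --- C++ caller ---
int add_from_cpp(int a, int b) { return c_add(a, b); }
```

If you can't edit the C header, wrapping the #include itself in extern "C" { ... } from the C++ side has the same effect.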

Thursday, 15 October 2015

Issues when installing GPU Ocelot

Although I encountered all of these problems while installing GPU Ocelot, most of them are not specific to it.

1. 'PTXLexer' is not a member of 'parser'
As per here, just edit ocelot/.release_build/ptxgrammar.hpp (make sure to use sudo) and comment out line 352.

2. typedef 'GlobalMap' locally defined but not used
Turn off Werror by using "--no_werr" when running ./build.py

3. /usr/local/lib/libocelot.so: undefined reference to `tigetnum'
As per here, edit SConscript (after running sudo ./build.py --install) on line 138 to add "-ltinfo" to ocelot_libs before it's used in OcelotConfig

4. /usr/bin/ld: cannot find -lboost_system-mt
According to this, -mt has been removed from Boost. As suggested I made the necessary symbolic links like so:
sudo ln -s /usr/lib/x86_64-linux-gnu/libboost_system.so /usr/lib/x86_64-linux-gnu/libboost_system-mt.so

Monday, 5 October 2015

Skype calls or Steam are/is silencing my games/music/whatever

This isn't actually Skype's fault - it's Windows'. The default setting in Windows is that a call will lower the volume of other applications by 80%! Thankfully, this is a quick fix. To solve this, do the following:
  1. Right click on the volume icon (the little speaker in your notification tray)
  2. Select "Sounds"
  3. Choose the "Communications" tab
  4. Pick whatever option you'd like.
For more details see here.

Thursday, 24 September 2015

Excel hangs when selecting the file menu

Also known as "<insert MS Office product here> hangs when selecting the file menu"

What happened in my case was that the printer was offline, which causes a problem when you select the file menu. Why, you ask? Because Excel tries to communicate with your printer to figure out the appropriate page margins or something of the sort. Of course, since your printer is offline it can't do that, so the program hangs. Forever.

You can do one of two things to fix this:

  1. Turn on your printer. You'll have to do this each time you encounter this problem.
  2. Set the Microsoft XPS Document Writer as the default printer.
Setting the default printer is what fixed it for me.

Monday, 20 July 2015

How to install OpenCL for Nvidia GPUs on Ubuntu

These instructions are for installing the Ubuntu packages that contain the necessary drivers and libraries for Nvidia and OpenCL. I prefer doing it this way over using the installers provided by Nvidia because it makes installation and/or cleanup a lot easier. However, you won't end up with the latest driver and toolkit. For reference, I did this with Ubuntu Server release 14.04 LTS and an Nvidia GTX 780.

Step 1: Get the Nvidia driver

Use the following command to have the packaged driver for your GPU listed. Use the non-updates versions.
sudo ubuntu-drivers devices
Then install the driver:
sudo apt-get install nvidia-<version here>
It should look like this, for example. Reference
sudo apt-get install nvidia-331

Step 2: Get more Nvidia driver packages

Not quite sure what the following packages are for, but you need them for this to work. Reference
sudo apt-get install nvidia-<version here>-uvm nvidia-opencl-dev nvidia-modprobe

Step 3: Get the toolkit

Easiest step here. I know it says cuda, but it includes OpenCL libraries too:
sudo apt-get install nvidia-cuda-toolkit

getPlatformIDs returns -1001

Error code -1001 is CL_PLATFORM_NOT_FOUND_KHR, meaning no OpenCL platform was found. For me this was resolved by applying step 2, but I saw people getting this error for other reasons. See here and here.

Next steps

Run nvidia-smi to confirm that the driver was installed correctly. Next build and run the deviceQuery sample to ensure you can build and run OpenCL programs successfully.