https://github.com/hartikainen/parallel-programming
- Host: GitHub
- URL: https://github.com/hartikainen/parallel-programming
- Owner: hartikainen
- Created: 2015-09-06T16:44:07.000Z (over 10 years ago)
- Default Branch: master
- Last Pushed: 2015-09-06T16:44:20.000Z (over 10 years ago)
- Last Synced: 2025-02-09T12:17:23.539Z (11 months ago)
- Language: C++
- Size: 2.31 MB
- Stars: 0
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
ICS-E4020 Programming Parallel Computers
========================================
Material for the weekly exercises.
See the course web page for details:
https://users.ics.aalto.fi/suomela/ppc-2015/
Files to edit
-------------
Weekly reports:
    report/week*.pdf
Tasks:
    mf*/mf.cc
    cp*/cp.cc
    is*/is.cc
    so*/so.cc
Quick start
-----------
Let us use task MF1 as an example:
    cd mf1
    make test
The test should fail, as we have not implemented the median filter
subroutine yet. Now open the file `mf.cc` in a text editor and
fill in the missing details. Then compile it:
    make
Run the test suite:
    make test
And do some benchmarking to see that it performs well:
    make benchmark
In the benchmarks, the last column is the running time in seconds.
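As a starting point for filling in `mf.cc`, here is a minimal, unoptimised sketch of a median filter. Everything about the interface is an assumption, since this README does not spell it out: the signature of `mf()`, the row-major layout `in[x + nx * y]`, a window of half-width `hx` and half-height `hy` that is clipped at the image boundaries, and the convention of averaging the two middle values when the window holds an even number of pixels. Check the course web page for the actual specification before relying on any of this.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical signature: the real interface of mf() is defined by the course
// material, not by this sketch.
// Assumed layout: pixel (x, y) is stored at in[x + nx * y].
// Assumed window: all pixels (i, j) with |i - x| <= hx and |j - y| <= hy,
// clipped at the image boundaries.
void mf(int ny, int nx, int hy, int hx, const float* in, float* out) {
    assert(ny >= 0 && nx >= 0 && hy >= 0 && hx >= 0);
    std::vector<float> window;
    for (int y = 0; y < ny; ++y) {
        for (int x = 0; x < nx; ++x) {
            // Clip the window to the image.
            const int y0 = std::max(y - hy, 0);
            const int y1 = std::min(y + hy, ny - 1);
            const int x0 = std::max(x - hx, 0);
            const int x1 = std::min(x + hx, nx - 1);

            // Collect the pixels inside the window.
            window.clear();
            for (int j = y0; j <= y1; ++j) {
                for (int i = x0; i <= x1; ++i) {
                    window.push_back(in[i + nx * j]);
                }
            }

            // Partial sort: afterwards window[mid] is the upper middle value
            // and everything before it is less than or equal to it.
            const std::size_t n = window.size();
            const std::size_t mid = n / 2;
            std::nth_element(window.begin(), window.begin() + mid, window.end());
            float median = window[mid];
            if (n % 2 == 0) {
                // Even count: average the two middle values (assumed convention).
                const float lower =
                    *std::max_element(window.begin(), window.begin() + mid);
                median = 0.5f * (lower + median);
            }
            out[x + nx * y] = median;
        }
    }
}
```

This version favours clarity over speed: it rebuilds the window from scratch for every pixel, so expect the benchmark times to leave plenty of room for improvement.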
Variants
--------
For more thorough tests and benchmarks, you can also try:
    make test2
    make benchmark2
For OpenMP code, tests and benchmarks are automatically run for
different numbers of threads (1, 2, 4, 8, and default). To run
only the default version, try:
    make test1
    make benchmark1
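To make the idea of a "default" thread count concrete, here is a small sketch of an OpenMP loop. The functions `default_thread_count()` and `sum_of_squares()` are hypothetical examples, not part of the course code. The sketch assumes the code is compiled with OpenMP enabled (for GCC, the `-fopenmp` flag); without it, the pragma is ignored and the loop simply runs on one thread. In standard OpenMP the default thread count can be overridden with the `OMP_NUM_THREADS` environment variable; whether the Makefile here uses that mechanism is an assumption.

```cpp
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: how many threads a parallel region would use by
// default. Returns 1 when compiled without OpenMP support.
int default_thread_count() {
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}

// Hypothetical example: sums the squares of the values. Each thread keeps a
// private partial sum, and the reduction clause combines them at the end.
double sum_of_squares(const std::vector<double>& values) {
    double total = 0.0;
    const long n = static_cast<long>(values.size());
    #pragma omp parallel for reduction(+:total)
    for (long i = 0; i < n; ++i) {
        total += values[i] * values[i];
    }
    return total;
}
```

Running the same code with 1, 2, 4, and 8 threads, as the test and benchmark targets do, is a quick way to see whether the loop actually scales.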
Environment
-----------
We will assume the following:
- Operating system: Linux or Mac OS X.
- Compiler: GCC version 5.1, 4.9, or 4.8, somewhere in the path,
with the name `g++-5.1`, `g++-5`, `g++-4.9`, `g++-4.8`, or `g++`.
- Libraries: libpng installed in a location where GCC can find it.
For assembly output, you also need `objdump` and `c++filt`.
For CUDA code, you also need to have CUDA 7.0 installed in
`/usr/local/cuda-7.0`.
### Classroom computers
Everything should work directly on the classroom computers (Maari-A).
### Your own computers
To run this on your own OS X computer, try the following:
- Install Homebrew: http://brew.sh/
- Run: `brew install gcc libpng binutils`
To run this on your own Ubuntu 14.04 Linux computer, try the
following:
- Run: `sudo apt-get install g++-4.8 libpng12-dev`
Advanced: debug builds
----------------------
Disable optimisations:
    cd mf1
    make clean
    make DEBUG=1
    make test
Disable optimisations and enable AddressSanitizer
(helps to catch many memory access errors):
    cd mf1
    make clean
    make DEBUG=2
    make test
Disable optimisations and enable AddressSanitizer
and the C++ standard library debug mode (helps to
catch e.g. out-of-bounds accesses with standard
containers):
    cd mf1
    make clean
    make DEBUG=3
    make test
Disable modern CPU instructions:
    cd mf1
    make clean
    make ARCH=1
    make test
You can also combine these, e.g., `ARCH=1 DEBUG=1`.
Remember to run `make clean` afterwards, before you go back to
a normal build.
Advanced: CUDA debug builds
---------------------------
You can compile CUDA code with `make DEBUG=1` for debug builds.
This will compile with `nvcc -g -G` so that you can easily debug
your code with `cuda-gdb`.
Sorry, `DEBUG=2` is not supported for CUDA code.
Advanced: see the assembly code
-------------------------------
To see the assembly code generated by the compiler, try the targets
`*.asm1` or `*.asm2`. Example:
    cd mf1
    make mf.asm1
    make mf.asm2
Open the files `mf.asm1` and `mf.asm2` in a text editor.
The first file is the assembly code produced by the compiler for
the file `mf.cc`, with some post-processing to make it more readable
(name demangling).
The second file is a disassembly of the object file `mf.o`, interleaved
with the source code from `mf.cc`.
Search for `mf(` to find the code related to function `mf()`.
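The reason searching for `mf(` works is name demangling: the compiler emits a mangled symbol, and the post-processing turns it back into a readable C++ signature. The sketch below illustrates the idea with `abi::__cxa_demangle` from `<cxxabi.h>`, which is available with GCC and Clang. The helper `demangle()` is hypothetical, and the mangled name in the usage note assumes a signature for `mf()` that this README does not confirm.

```cpp
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Hypothetical helper: returns the demangled form of a mangled C++ symbol,
// or the input unchanged if it cannot be demangled.
std::string demangle(const std::string& mangled) {
    int status = 0;
    // With null buffer arguments, the result is allocated with malloc()
    // and must be released with free().
    char* raw = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || raw == nullptr) {
        std::free(raw);  // free(nullptr) is a no-op
        return mangled;
    }
    std::string result(raw);
    std::free(raw);
    return result;
}
```

For example, if `mf()` took four `int` arguments followed by a `const float*` and a `float*` (an assumed signature), its mangled name under the Itanium C++ ABI would be `_Z2mfiiiiPKfPf`, and `demangle("_Z2mfiiiiPKfPf")` should give a string that starts with `mf(`.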
The assembly code may not be that easy to read. You can try to see
what it looks like without optimisation:
    cd mf1
    make clean
    make DEBUG=1 mf.asm1
    make DEBUG=1 mf.asm2