HELLENIC REPUBLIC

UNIVERSITY OF CRETE

CS527 - Parallel Computer Architecture

Unit: 2

Angelos Bilas

Computer Science Department

Parallel Programming

The goal of this programming problem is to implement a matrix multiplication program in both shared memory and message passing. You can implement this on any parallel system. The suggested platform is the research cluster at FORTH-ICS. You will have access to 8 nodes of the cluster.

The platform

Each node in the sub-cluster is a dual-processor AMD Opteron system. The cluster nodes you will use are physically interconnected with a Gigabit Ethernet network. Although this system can serve as both a shared memory platform (over a software shared memory abstraction) and a message passing platform, in this assignment you will use it only as a message passing system, through a standard library, the Message Passing Interface (MPI). For the shared memory part of the assignment you will use a single node in the cluster that has two quad-core CPUs, for a total of eight (8) cores. You can access the cluster by logging in via ssh to shark.ics.forth.gr. You can access shark only from UoC systems. From shark, you can then access the 4-node sub-cluster for MPI (piranha63, piranha71, piranha72, piranha73) and the eight-core node for SAS (penguin3) via ssh. User accounts for the cluster will be distributed in class.

The assignment

(a) SAS programming

To write a shared memory parallel program for the eight-core system you can use the m4 macros from Argonne National Laboratory (ANL), which allow you to create processes (threads), allocate global memory, and use synchronization primitives. The file ~cs527/sas/macros/c.linux.m4 contains this set of macros. The file ~cs527/sas/macros/c.null.m4 may be handy for running the sequential versions of the SPLASH-2 programs.

Tasks:

1. README_FIRST.SAS. Get, compile, and run FFT from ~cs527/sas/applications. This version of the application is similar to the original SPLASH-2 version on the SPLASH-2 web page, with minor modifications (mainly to support data placement and 64-bit addresses).
2. Write a shared memory program that reads two NxN matrices from a file and multiplies them on a system with P processors. You don't have to worry too much about corner cases (for instance you can assume that N is a power of P). For the input file format, use one array element per line, with the elements linearized in row-major order. Write the result to the standard output in the same format. The program should also report, on the standard output, the time it took to compute the result (not including initialization, reading files, or outputting results).
3. Run your SAS program on 1, 2, 3, 4, 5, 6, 7, and 8 cores and create a speedup curve.
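
The input format and one possible work decomposition for the matrix multiplication task can be sketched as follows. This is only an illustration in Python, not the C and ANL macro program the assignment asks for. The function names (`parse_matrices`, `row_block`, `multiply_rows`, `multiply`, `format_matrix`) are hypothetical, and the sketch assumes that a single input holds the N*N elements of the first matrix followed by the N*N elements of the second, and that P divides N evenly. The loop over `pid` runs sequentially and merely stands in for P workers that each own a contiguous block of rows.

```python
import math
import time


def parse_matrices(text):
    """Parse one element per line (assumption: A's N*N values, then B's)."""
    values = [float(tok) for tok in text.split()]
    n = math.isqrt(len(values) // 2)
    if 2 * n * n != len(values):
        raise ValueError("expected 2*N*N elements")
    a = [values[i * n:(i + 1) * n] for i in range(n)]
    b = [values[n * n + i * n:n * n + (i + 1) * n] for i in range(n)]
    return a, b


def row_block(n, p, pid):
    """Rows [start, end) owned by worker pid (assumption: P divides N)."""
    rows = n // p
    return pid * rows, (pid + 1) * rows


def multiply_rows(a, b, c, start, end):
    """Compute rows start..end-1 of C = A * B in place."""
    n = len(a)
    for i in range(start, end):
        for j in range(n):
            c[i][j] = sum(a[i][k] * b[k][j] for k in range(n))


def multiply(a, b, p):
    """Return C and the compute-only time; I/O and setup are not timed."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    t0 = time.perf_counter()
    for pid in range(p):  # each iteration stands in for one worker
        start, end = row_block(n, p, pid)
        multiply_rows(a, b, c, start, end)
    elapsed = time.perf_counter() - t0
    return c, elapsed


def format_matrix(c):
    """Linearize C row by row, one element per line."""
    return "\n".join(str(x) for row in c for x in row)
```

Because each worker writes a disjoint set of rows of C, this decomposition needs no locking during the compute phase; a real shared memory version would only need to synchronize before the timer is stopped and the result is printed.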

(b) MPI Programming

1. README_FIRST.MPI. Install MPICH locally in your account.
2. Copy, compile, and run the int_pi2 program from ~cs527/mpi (runmpi.txt). This application computes the value of pi. Read the instructions in ~cs527/mpi/Readme for compiling an MPI program. Experiment with the number of approximation intervals: try 100, 1000, and 1000000. Why is the error lower with 1000 approximation intervals than with 100? Why does the error increase for large numbers?
3. Write an MPI program that reads two NxN matrices and multiplies them, with the same input format, output format, and timing rules as task 2 of part (a).
4. Run your program on 1, 2, 3, and 4 processors (one processor per node) and on 2, 4, 6, and 8 processors (two processors per node) and create two speedup curves.
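
To make the interval experiment concrete, here is a minimal sketch of one common way to approximate pi in parallel: the midpoint rule applied to the integral of 4/(1+x^2) over [0, 1], with the intervals dealt out to ranks in round-robin order. Whether int_pi2 uses exactly this scheme is an assumption and is not taken from its source. The function names `partial_sum` and `estimate_pi` are hypothetical, and the "ranks" here are simulated in a single Python process instead of being MPI processes.

```python
import math


def partial_sum(n, rank, size):
    """Midpoint-rule terms i = rank, rank + size, ... out of n intervals."""
    h = 1.0 / n
    total = 0.0
    for i in range(rank, n, size):
        x = h * (i + 0.5)  # midpoint of interval i
        total += 4.0 / (1.0 + x * x)
    return h * total


def estimate_pi(n, size):
    """Add the per-rank partial sums, as a sum reduction to one rank would."""
    return sum(partial_sum(n, rank, size) for rank in range(size))


if __name__ == "__main__":
    for n in (100, 1000, 1000000):
        err = abs(estimate_pi(n, 4) - math.pi)
        print(f"n={n:>8}  error={err:.3e}")
```

Running the sketch with different interval counts, and with single instead of double precision in a C version, is one way to start investigating the two questions posed in task 2.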

Put the three speedup curves (one from SAS and two from MPI) on a single graph with appropriate legends, indicating application, programming abstraction, and input size.
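
Each speedup curve plots S(P) = T(1) / T(P), where T(P) is the measured compute time on P processors. The sketch below shows that calculation together with a legend label of the kind requested. The function names (`speedups`, `efficiencies`, `legend_label`) are hypothetical, and the timings in the usage lines are placeholders, not measurements.

```python
def speedups(times):
    """Map processor count P to T(1) / T(P); times are in seconds."""
    t1 = times[1]
    return {p: t1 / t for p, t in sorted(times.items())}


def efficiencies(times):
    """Map processor count P to speedup divided by P."""
    return {p: s / p for p, s in speedups(times).items()}


def legend_label(application, abstraction, n):
    """Build a legend entry naming application, abstraction, and input size."""
    return f"{application}, {abstraction}, N={n}"


if __name__ == "__main__":
    placeholder = {1: 8.0, 2: 4.0, 4: 2.0}  # placeholder values only
    print(legend_label("matmul", "MPI (1 proc/node)", 1024))
    print(speedups(placeholder))
    print(efficiencies(placeholder))
```

Using the same T(1) baseline definition for all three curves keeps them comparable when they share one graph.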

References

GNU m4 macro preprocessor. Most unix systems include the m4 macro preprocessor. Type "man m4" for more information.
ANL macros
Message Passing Interface
Open MPI

Submission

Turn in (by email to bilas@csd.uoc.gr) a tar file that contains your solutions and a README file stating any assumptions or special features.

Licenses

• This educational material is licensed under a Creative Commons license, specifically Attribution - Non Commercial - Non-derivatives 3.0 Greece.

• Material included in the course slides that is subject to a different type of license is excluded from the above license. The license that applies to such material is stated explicitly.

Funding

• This educational material was developed as part of the instructor's teaching work.

• The project "Open Academic Courses at the University of Crete" has funded only the reformatting of the educational material.

• The project is implemented within the framework of the Operational Programme "Education and Lifelong Learning" and is co-funded by the European Union (European Social Fund) and by national resources.


Last modified: Wednesday, 1 July 2015, 8:48 PM
