Parallel Programming for Machine Learning
Teacher
ECTS:
3
Course Hours:
18
Tutorials Hours:
0
Language:
French
Examination Modality:
Report (mémoire)
Objective
This course is taught by Xavier Dupré and Matthieu Durut.
This course covers CPU programming (the classical processing unit) and GPU programming (on graphics cards). Its goal is to teach how to write efficient programs that take advantage of the hardware architecture.
The first part of the course is dedicated to computer architecture, in particular everything that enables programs to run in parallel and communicate.
The following sessions put these notions into practice: first C++ programming on CPU, with several examples of efficient algorithm implementations, then GPU programming.
Planning
Architecture: hardware, shared memory, orders of magnitude of CPU speed, communication
Parallel execution: algorithms, multithreading, race conditions, locks
CPU parallelisation: development tools, examples of parallel programs, application to machine learning algorithms (Random Forest, etc.)
GPU programming:
CUDA, threads, memory management
Pointers, GPU/CPU interaction, using __inline__ and __global__
PyTorch: extension implementation
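As an illustration of the "race condition, lock" topics listed in the planning, here is a minimal C++ sketch, not taken from the course material. The function name count_with_lock and its parameters are hypothetical, chosen only for this example. Several threads increment one shared counter, and a std::mutex protects each update so that no increment is lost.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper (not from the course): each of `num_threads` threads
// adds 1 to a shared counter `increments` times. The mutex serializes the
// updates, so the final value is exactly num_threads * increments.
long count_with_lock(int num_threads, int increments) {
    assert(num_threads > 0 && increments >= 0);

    long counter = 0;          // shared state touched by every thread
    std::mutex counter_mutex;  // protects `counter`

    std::vector<std::thread> workers;
    workers.reserve(num_threads);
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, &counter_mutex, increments] {
            for (int i = 0; i < increments; ++i) {
                // lock_guard acquires the mutex here and releases it
                // automatically at the end of each loop iteration.
                std::lock_guard<std::mutex> guard(counter_mutex);
                ++counter;
            }
        });
    }

    // Wait for all threads to finish before reading the result.
    for (auto& worker : workers) {
        worker.join();
    }
    return counter;
}
```

Calling count_with_lock(4, 100000) returns 400000. If the std::lock_guard line is removed, the unsynchronized ++counter becomes a data race: the behaviour is undefined in C++, and in practice some increments are usually lost. On some platforms the program must be linked with a threading flag such as -pthread.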
References
Platform used for the practical sessions: SPP Cloud