Assignment I: Fundamentals of Computer Systems and Profiling Codes
- Due 31 Jan 2022 by 17:00
- Points 4
- Submitting a file upload
- File types pdf
Exercise I: Computer Systems Performance Measurements
The goal of this exercise is to review the major concepts of HPC computer system components and their technological development.
Task 1.1 Answer the following question: what are the main performance metrics that characterize each of the following?
- computing units
- memory units
- communication layers
Task 1.2. HPC Laws (Dead or Alive :) ). During the lecture, we briefly discussed some (phenomenological) laws that describe the evolution of hardware and, in general, of silicon-based technologies. Using the lectures and online sources (such as papers and websites), describe the following laws in your report and discuss their implications for hardware technology and for code development:
- Moore's law
- Dennard's scaling
- Amdahl's law (see the formula below)
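For reference, Amdahl's law for the speedup S achievable on N processors is commonly stated as

\[ S(N) = \frac{1}{(1 - p) + \frac{p}{N}} \]

where p is the fraction of the execution time that can be parallelized; as N grows, the speedup is bounded above by 1/(1 - p).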
Task 1.3 Your System Spec. Report the specifications of the computer system you are going to use in this course (a sketch of how to query some of these values from Python is given after the list):
- Processor model
- Processor clock frequency
- Cache memories and their sizes
- GPU model (this can be an integrated GPU)
- Size of RAM memory
- Size of HDD or SSD
- The operating system in use
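Some of these values can be queried directly from Python; below is a minimal sketch using the platform and psutil modules (psutil is presented in A.1 Tutorial: The psutil Module; the GPU model and cache sizes usually require OS-specific tools instead, such as lscpu on Linux or system_profiler on macOS):

import platform
import psutil

print("Processor:", platform.processor())
print("OS:", platform.system(), platform.release())
print("Physical cores:", psutil.cpu_count(logical=False))
freq = psutil.cpu_freq()  # may be None on some platforms
if freq:
    print("Max clock frequency (MHz):", freq.max)
print("RAM (GiB):", psutil.virtual_memory().total / 2**30)
print("Disk (GiB):", psutil.disk_usage("/").total / 2**30)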
Exercise II: Profiling the Julia Set Code
In this exercise, we will time and profile the Julia Set code (available for download on Canvas) that we used during the lectures. This exercise experiments with the most common approaches for timing and profiling, following the steps presented in the lectures.
Task 2.1 Calculate the Clock Granularity of different Python Timers (on your system). Whenever we use timers, it is critical to calculate and know the so-called clock granularity: any event with a duration shorter than the clock granularity won't be captured by the timer.
The standard time.time() function provides sub-second precision, though that precision varies by platform. It is typically around ±1 microsecond, and the timers are often improved in new Python versions.
The timeit module typically provides a higher resolution, i.e., a lower clock granularity. During the lecture, we used timeit from the command line; we can also use timeit within the code as a timer, as follows:
from timeit import default_timer as timer
...
t1 = timer()
...
t2 = timer()
Additionally, Python 3.7 introduced new functions in the time module that provide higher resolution. An example is time.time_ns(), which provides timestamps in nanoseconds (instead of seconds, as time.time() does):
...
t1 = time.time_ns()
...
t2 = time.time_ns()
The clock granularity can be measured experimentally by timing two successive instants with no computation between them. To measure the granularity of different timers experimentally, we can design code such as the following:
import numpy as np

def checktick():
    M = 200
    timesfound = np.empty((M,))
    for i in range(M):
        t1 = ...  # get timestamp from timer
        t2 = ...  # get timestamp from timer
        while (t2 - t1) < 1e-16:  # if zero, we are below the clock granularity: retake timing
            t2 = ...  # get timestamp from timer
        t1 = t2  # this is outside the while loop
        timesfound[i] = t1  # record the time stamp
    minDelta = 1000000
    Delta = np.diff(timesfound)  # it should be cast to int only when needed
    minDelta = Delta.min()
    return minDelta
As part of the task, measure and report the clock granularity (measured experimentally on your system) for 1) time.time(), 2) timeit, and 3) time.time_ns() (for the last one, remember that the time is output in nanoseconds!).
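One convenient way to reuse the template above for all three timers is to pass the timer function as an argument; a possible sketch (this parameterization is one option, not a required design):

import time
import timeit

import numpy as np

def checktick(timer):
    # Same template as above, with the timer function passed in as an argument
    M = 200
    timesfound = np.empty((M,))
    for i in range(M):
        t1 = timer()
        t2 = timer()
        while (t2 - t1) < 1e-16:  # below clock granularity: retake timing
            t2 = timer()
        t1 = t2
        timesfound[i] = t1
    Delta = np.diff(timesfound)
    return Delta.min()

print("time.time():   ", checktick(time.time), "s")
print("timeit:        ", checktick(timeit.default_timer), "s")
print("time.time_ns():", checktick(time.time_ns), "ns")  # output is in nanoseconds!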
Task 2.2 Timing the Julia set code functions. The goal is to time the calc_pure_python and calculate_z_serial_purepython functions. For this task, we ask you to develop a decorator to wrap the functions to be profiled. Use the decorator to add timing functionality, with the best timer you found in the previous task. As part of the task:
- Develop the timer decorator (a starting sketch is given after this list)
- Provide timing information for the two functions. Report averages and standard deviations. How does the standard deviation compare to the clock granularity measured in Task 2.1?
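A possible starting sketch for such a decorator (the name timefn is illustrative; substitute the timer you selected in Task 2.1):

import functools
from timeit import default_timer as timer

def timefn(fn):
    # Wrap fn so that every call prints its wall-clock duration
    @functools.wraps(fn)
    def measure_time(*args, **kwargs):
        t1 = timer()
        result = fn(*args, **kwargs)
        t2 = timer()
        print(f"@timefn: {fn.__name__} took {t2 - t1} seconds")
        return result
    return measure_time

@timefn
def calculate_z_serial_purepython(maxiter, zs, cs):
    ...  # function body unchanged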
Task 2.3 Profile the Julia set code with cProfile and line_profiler. Use cProfile and line_profiler to profile the computation in the Julia Set code (example invocations are given after the list below).
As part of the task:
- Report the results of cProfile and line_profiler (for the two functions)
- Use SnakeViz to visualize the profiling information from cProfile
- Measure the overhead added by using cProfile and line_profiler. For this, you can time the code with and without the profilers
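For reference, typical invocations look like the following (we assume here that the code is in a file named JuliaSet.py; adjust to your file name, and note that line_profiler requires decorating the functions of interest with @profile):

python -m cProfile -s cumulative JuliaSet.py     # print cProfile stats sorted by cumulative time
python -m cProfile -o profile.stats JuliaSet.py  # save the stats to a file
snakeviz profile.stats                           # visualize the saved stats in the browser
kernprof -l -v JuliaSet.py                       # run line_profiler and print its report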
Task 2.4 Memory-profile the Julia set code. Use memory_profiler and mprof to profile the computation in the Julia Set code: use memory_profiler to profile the memory usage of the two functions, and use mprof to collect and visualize the profiling information (typical commands are given after the list below).
As part of the task:
- Report the memory profiling results from memory_profiler and mprof (including the plot)
- Measure the overhead of memory_profiler and mprof.
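Typical invocations (again assuming the file name JuliaSet.py, with the functions of interest decorated with @profile for memory_profiler):

python -m memory_profiler JuliaSet.py  # line-by-line memory usage report
mprof run JuliaSet.py                  # sample memory usage over time
mprof plot                             # plot the data from the most recent run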
Exercise III: Profiling Diffusion Process Code
As part of this exercise, we use Python code to solve the diffusion equation. We will be solving the 2D version of the diffusion equation, i.e., we will be operating on a 2D matrix storing our variable u. The equation we solve in 2D geometry is the following:

\[ \frac{\partial u}{\partial t} = D \left( \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} \right) \]

We define a numerical grid on which we discretize the equation, and we solve the equation above numerically for one step with the function evolve:
grid_shape = (640, 640)

def evolve(grid, dt, D=1.0):
    xmax, ymax = grid_shape
    new_grid = [[0.0] * ymax for x in range(xmax)]
    for i in range(xmax):
        for j in range(ymax):
            grid_xx = (
                grid[(i + 1) % xmax][j] + grid[(i - 1) % xmax][j] - 2.0 * grid[i][j]
            )
            grid_yy = (
                grid[i][(j + 1) % ymax] + grid[i][(j - 1) % ymax] - 2.0 * grid[i][j]
            )
            new_grid[i][j] = grid[i][j] + D * (grid_xx + grid_yy) * dt
    return new_grid
The global variable grid_shape designates how big a region we will simulate. We are using periodic boundary conditions (which is why we use the modulo operator for the indices). To use this code, we must initialize a grid and call evolve on it.
def run_experiment(num_iterations):
    # Setting up initial conditions
    xmax, ymax = grid_shape
    grid = [[0.0] * ymax for x in range(xmax)]

    # These initial conditions are simulating a drop of dye in the middle of our
    # simulated region
    block_low = int(grid_shape[0] * 0.4)
    block_high = int(grid_shape[0] * 0.5)
    for i in range(block_low, block_high):
        for j in range(block_low, block_high):
            grid[i][j] = 0.005

    # Evolve the initial conditions
    for i in range(num_iterations):
        grid = evolve(grid, 0.1)
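A minimal driver for the profiling tasks below might then be (the iteration count is an arbitrary choice; pick one large enough to obtain stable measurements):

if __name__ == "__main__":
    run_experiment(100)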
The values for dt and the grid elements have been chosen to be sufficiently small that the algorithm is stable. See Numerical Recipes for a more in-depth treatment of this algorithm's convergence characteristics.
Task 3.1 Profile the diffusion code with cProfile and line_profiler. Use cProfile and line_profiler to profile the computation (the same invocations as in Exercise II apply).
As part of the task:
- Report the results of cProfile and line_profiler
- Use SnakeViz to visualize the profiling information from cProfile
Task 3.2 Memory-profile the diffusion code. Use memory_profiler and mprof to profile the computation: profile the memory usage with memory_profiler, and use mprof to collect and visualize the profiling information.
As part of the task:
- Report the memory profiling results from memory_profiler and mprof (including the plot)
Exercise IV - Interactive Version Control with Git / Learn Git Branching
Version control and git are critical for the development of HPC applications, and we require you to submit all the course assignments using git, shared via GitHub or GitLab.
Before starting the exercises, check our tutorial at A.2 Tutorial: Versioning Control with Git and train yourself by completing the online training at https://learngitbranching.js.org
Task 4.1 - Complete the Levels 1-4 of the Main Introduction track and Levels 1-4 of the Ramping Up track of https://learngitbranching.js.org
Task 4.2 - Complete the Levels 1-8 of the Remote Push & Pull track of the tutorial
Task 4.3. Reflect on the usage of version control and git in HPC and scientific application development and answer the questions:
- What are the advantages of using git compared to having local copies of the code on your computer?
- What are the challenges in using Git? Provide some examples.
- What other version control systems, besides git, do you know of?
Bonus Exercise: Develop your own profiler tool for monitoring CPU percentage use with psutil
The goal of this exercise is to develop your own profiler using the psutil module functions and Python timing capabilities. We briefly mentioned psutil for memory profilers, and the module is presented in the course tutorial: A.1 Tutorial: The psutil Module.
For the new profiling tool, we want to record the CPU usage percentage per core during the code execution and create a final plot and summary table. For the plotting, you can use the matplotlib module or other Python visualization modules.
For retrieving the CPU usage percentage per core, we can use the psutil.cpu_percent(interval=1, percpu=True) function.
The tool should produce, as the final result, a plot with the evolution of the CPU percentage for the different cores and a table with the recorded values.
The design of the tool is up to you: it can be simply a set of functions, decorators, Python classes, ... A possible starting sketch is given below.
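A minimal sketch of such a tool, built as a context manager that samples psutil.cpu_percent(percpu=True) in a background thread (all names, such as CPUMonitor, are illustrative, and the sampling interval is an arbitrary choice):

import threading
import time

import matplotlib.pyplot as plt
import psutil

class CPUMonitor:
    # Sample per-core CPU usage in a background thread while a code block runs
    def __init__(self, interval=0.5):
        self.interval = interval  # sampling interval in seconds
        self.samples = []         # one list of per-core percentages per sample
        self.timestamps = []
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._sample)

    def _sample(self):
        start = time.time()
        while not self._stop.is_set():
            # blocks for `interval` seconds, then returns one percentage per core
            percents = psutil.cpu_percent(interval=self.interval, percpu=True)
            self.samples.append(percents)
            self.timestamps.append(time.time() - start)

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, *exc):
        self._stop.set()
        self._thread.join()

    def plot(self, filename="cpu_usage.png"):
        # One line per core: CPU percentage versus elapsed time
        for core, series in enumerate(zip(*self.samples)):
            plt.plot(self.timestamps, series, label=f"core {core}")
        plt.xlabel("time (s)")
        plt.ylabel("CPU usage (%)")
        plt.legend()
        plt.savefig(filename)

It could be used, for example, as:

with CPUMonitor(interval=0.5) as mon:
    run_experiment(100)  # or the Julia set computation from Exercise II
mon.plot()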
In the report, you should describe the design and implementation of the profiler and report the results of your profiler against the codes used in Exercises II and III.