Exercises for Memory-Efficient Computing

Optimizing arithmetic expressions

Exercise 1

Use the script ``poly1.py`` to check how long it takes to evaluate the following polynomial:

    y = .25*x**3 + .75*x**2 - 1.5*x - 2

with x in the range [-1, 1] and 10 million points.
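
As a rough illustration of the kind of measurement involved, the sketch below times the polynomial with plain numpy. It is not the content of ``poly1.py``; the helper names ``poly`` and ``best_time`` are hypothetical and were chosen only for this example.

```python
import time

import numpy as np


def poly(x):
    """Evaluate the polynomial exactly as written in Exercise 1."""
    return .25*x**3 + .75*x**2 - 1.5*x - 2


def best_time(func, x, repeats=3):
    """Return the best wall-clock time (in seconds) over a few repeats."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        func(x)
        best = min(best, time.perf_counter() - t0)
    return best


if __name__ == "__main__":
    N = 10_000_000                 # 10 million points, as in the exercise
    x = np.linspace(-1, 1, N)      # x in the range [-1, 1]
    print("numpy: %.3f s" % best_time(poly, x))
```

Taking the best of several repeats is one common way to reduce noise from other processes; the script used in the exercise may measure time differently.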

Exercise 2

The expression below:

  y = ((.25*x + .75)*x - 1.5)*x - 2

represents the same polynomial as the original one, but with some interesting side effects on efficiency. Repeat the computation for numpy and numexpr and draw your own conclusions.
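
Before comparing timings, it can help to confirm that both forms really give the same values. The following is a minimal sketch of such a check using numpy only; the function names ``poly_expanded`` and ``poly_horner`` are hypothetical.

```python
import numpy as np


def poly_expanded(x):
    """Original form: uses explicit powers of x."""
    return .25*x**3 + .75*x**2 - 1.5*x - 2


def poly_horner(x):
    """Factored form: only multiplications and additions, no powers."""
    return ((.25*x + .75)*x - 1.5)*x - 2


x = np.linspace(-1, 1, 1001)
# The two forms agree up to floating-point rounding.
max_diff = np.abs(poly_expanded(x) - poly_horner(x)).max()
print("max abs difference:", max_diff)
```

The factored form avoids the power operations, which is one plausible source of the efficiency difference the exercise asks you to observe.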

Exercise 3

The C program ``poly.c`` does the same computation as above, but in pure C. Compile it like this:

  gcc -O3 -o poly poly.c -lm

and execute it.

Parallelism with threads

Exercise 4

Make sure that you are on a multi-processor machine and repeat the last computation in ``poly1.py``, increasing the number of threads one by one (change the number in the ``for nt in range(1):`` loop).

Exercise 5

On the same multi-processor machine, recompile ``poly.c`` with OpenMP support:

  gcc -O3 -o poly poly.c -lm -fopenmp    # notice the new -fopenmp flag!

and execute it for several numbers of threads:

  OMP_NUM_THREADS=desired_number_of_threads ./poly

Compare its performance with the parallel numexpr.

Exercise 6

With the previous examples, compute the expression:

  y = x

That is, do a simple copy of the `x` vector. What performance do you see? How does it change as the number of threads varies?
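
One way to express the performance of a plain copy is as memory bandwidth. The sketch below is a single-threaded numpy version; the function name ``copy_bandwidth`` is hypothetical, and the figure assumes that a copy moves each element twice (one read, one write), which ignores any extra traffic the hardware may generate.

```python
import time

import numpy as np


def copy_bandwidth(n, repeats=5):
    """Return the best observed GB/s when copying n float64 values."""
    x = np.linspace(-1, 1, n)
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        y = x.copy()               # the "y = x" computation
        best = min(best, time.perf_counter() - t0)
    # Assumption: the copy reads x.nbytes and writes x.nbytes.
    bytes_moved = 2 * x.nbytes
    return bytes_moved / best / 1e9


if __name__ == "__main__":
    print("copy: %.2f GB/s" % copy_bandwidth(10_000_000))
```

Since the copy does no arithmetic at all, its speed is set by how fast memory can be read and written, which makes it a useful baseline for the polynomial timings.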

Evaluating with carray

Exercise 7

Look into the sources of ``carray-eval.py`` and run it. For the first expression evaluation, i.e.:

    ((.25*x + .75)*x - 1.5)*x - 2

Exercise 8

Repeat your reasoning with the second expression:

    ((.25*x + .75)*x - 1.5)*x - 2 < 0
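
One difference between the two expressions that may matter for your reasoning is the type of the result: the comparison yields booleans rather than floats. The sketch below shows this with plain numpy (not carray) on a smaller array; the variable names are only illustrative.

```python
import numpy as np

x = np.linspace(-1, 1, 1_000_000)

y = ((.25*x + .75)*x - 1.5)*x - 2    # first expression: float64 result
mask = y < 0                         # second expression: boolean result

# Compare the element type and the memory footprint of both outputs.
print(y.dtype, y.nbytes)
print(mask.dtype, mask.nbytes)
```

A boolean output takes one byte per element instead of eight, so the second expression writes much less data.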

Querying Big Data

Exercise 9

Look into the sources of the ``carray-ctable.py`` script and run it.

Exercise 10

Enter the IPython console and generate the big `t` ctable (just copy and paste the appropriate statements from the previous ``carray-ctable.py`` script).

      ca.set_nthreads(your_number_of_threads)