These files may be downloaded from https://portal.g-node.org/python-autumnschool/materials/advanced_numpy

Please do explore beyond the problems given, and feel free to ask questions at any time.

Note

Solutions to some problems are provided, but try to avoid peeking at them until you've made an attempt!

Before we start, load NumPy:

import numpy as np

These exercises explore some of the more advanced features of NumPy. If you prefer to learn the basics from scratch, have a look at this introduction.

Let's explore the basics of a NumPy array. Create a 3x4 array of random values (using

`np.random.random`), and call it`x`. Create another array as follows:`y = x[2]`. What happens when you modify`y`--does`x`also change?In the array above, have a look at its different properties:

`x.shape`,`x.ndim`,`x.dtype`,`x.strides`. What does each property tell you about the array?Compare the

`strides`property of`x.T`to the above. What is`x.T`and can you explain the new strides?Compute the sum of the columns of

`x`. You may find the function to do that by using`np.<TAB>`in IPython or`print dir(np)`in standard Python. Remember that you can always look at the docstring of any function in Python (use`np.func_name?<ENTER>`in IPython or`help(np.func_name)`in standard Python).Construct the array

`x = np.array([0, 1, 2, 3], dtype=np.uint8)`(here,`uint8`represents a single byte in memory, an unsigned integer between 0 and 255). Can you explain the results obtained by`x + 1`and`x / 2`? Also try`x.astype(float) + 1`and`x.astype(float) / 2`.What is the maximum number of dimensions a NumPy array can have? Use one of the array constructors (

`np.zeros`,`np.empty`,`np.random.random`, etc.) to find out.

Design a data-type for storing the following record:

- Timestamp in nanoseconds (a 64-bit unsigned integer)
- Position (x- and y-coordinates, stored as floating point numbers)

Use it to represent the following data:

dt = np.dtype(<your parameters here>) x = np.array([(100, (0, 0.5)), (200, (0, 10.3)), (300, (5.5, 15.1))], dtype=dt)

Have a look at the

`np.dtype`docstring if you need help. After constructing`x`, try to print all the x values for which time is greater than 100 (hint: something of the form`y[y > 100]`).Consider

`structured_arrays/data.txt`. Modify`structured_arrays/load_txt_template.py`to load the data in this text file. This requires specifying a data-type that encapsulates a record such as# name x y block - 2x3 ints aaaa 1.0 8.0 1 2 3 4 5 6

Note that binary files can also be read with structured data-types, using

`np.fromfile`.Create two structured arrays of your choosing. Use the

`np.savez`command to save these to a single data-file. Load the data-file using`np.load`and confirm whether the data survived the round-trip. (Saving data using`save`or`savez`is highly recommended over pickling.)

Take a look at the following. Many of you may know it as a meshgrid:

x, y = np.mgrid[:10, :5] z = x + y

Now, reproduce

`z`from the above snippet, but this time use NumPy's`ogrid`together with broadcasting.In our solution, broadcasting is used "behind the scenes". To see what happens more explicitly, apply

`np.broadcast_arrays`on the`x`and`y`given by ogrid`ogrid`. This should correspond to the`x`and`y`produced by`mgrid`.Benchmark the two approaches (one using

`mgrid`, the other`ogrid`), using IPython's`timeit`function. Can you explain the difference in execution time?Given a list of 3-dimensional coordinates,

`p`,[[1.0, 2.0, 10], [3.0, 4.0, 20], [5.0, 6.0, 30], [7.0, 8.0, 40]]

Normalise each coordinate by dividing with its Z (3rd) element. For example, the first row becomes:

[1/10, 2/10, 10/10]

Hint: extract the last column as

`y`, and then change its dimensions so that`p / y`works.

Create a 3x3 ndarray. Use fancy indexing to slice out the diagonal elements.

Predict and verify the shape of the following slicing operation.

x = np.empty((10, 8, 6)) idx0 = np.zeros((3, 8)).astype(int) idx1 = np.zeros((3, 1)).astype(int) idx3 = np.zeros((1, 1)).astype(int) x[idx0, idx1, idx2]

**Advanced**: This is not strictly speaking a question on indexing, but it's a fun exercise either way.Construct an array

x = np.arange(12, dtype=np.int32).reshape((3, 4))

so that

`x`isarray([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11]])

Now, provide to

`np.lib.stride_tricks.as_strided`the strides necessary to view a sliding 2x2 window over this array. The output should bearray([[[[ 0, 1], [ 4, 5]], [[ 1, 2], [ 5, 6]], [[ 2, 3], [ 6, 7]]], [[[ 4, 5], [ 8, 9]], [[ 5, 6], [ 9, 10]], [[ 6, 7], [10, 11]]]], dtype=int32)

The code is of the form

z = as_strided(x, shape=(2, 3, 2, 2), strides=(..., ..., ..., ...))

This sort of stride manipulation is very handy to implement techniques such as region based statistics, convolutions, etc.

Documentation for NumPy's `__array_interface__` may be found in the
online docs.

An author of a foreign package (

`array_interface/mutable_str.py`) provides a string class that allocates its own memory:In [1]: from mutable_str import MutableString In [2]: s = MutableString('abcde') In [3]: print s abcde

You'd like to view these mutable strings as ndarrays, in order to manipulate the underlying memory.

- Add an
`__array_interface__`dictionary attribute to`s`, then convert`s`to an ndarray. Use the given`array_interface/template.py`as a guide. - Add "1" to the array. Now print the original string to ensure that its value was modified.

- Add an

Construct the following two arrays:

x = np.array([[1, 2], [3, 4]], order='C', dtype=np.uint8) y = np.array([[1, 2], [3, 4]], order='F', dtype=np.uint8)

Compare the bytes they store in memory by using

[ord(c) for c in x.data]

Note that, even though these arrays store data in different memory order, they are identical from the user's perspective.

print x print y

Examine the bytes stored by the following array (using the "ord" trick shown above).

x = np.array([[1, 2], [3, 4]], dtype=np.uint32)

Note that, on most laptops, the byte order will be little Endian, i.e. least significant byte first.

Attempt the Fortran-ordering quiz.