Advanced Scientific Programming in Python [advanced_python:exercises2]

Exercise E0 [warmup]

Execute the two following implementations of rm_change. Which one is faster? Why is there a difference?

from time import time

def rm_change(change):
    if change in COMMITS:
        COMMITS.remove(change)

COMMITS = range(10**7)
t = time()
rm_change(10**7); rm_change(10**7-1); rm_change(10**7-2)
print(time()-t)

def rm_change(change):
    try:
        COMMITS.remove(change)
    except ValueError:
        pass

COMMITS = range(10**7)
t = time()
rm_change(10**7); rm_change(10**7-1); rm_change(10**7-2)
print(time()-t)

The first version goes over the list twice: first time to check if the value is in the list, and second time to remove it. The second version goes over the list just once, and is faster. The difference is almost exactly 2×.

Exercise D1 (30 min)

Write a decorator which wraps functions to log function arguments and the return value on each call. Provide support for both positional and named arguments (your wrapper function should take both *args and **kwargs and print them both):

>>> @logged
... def func(*args):
...     return 3 + len(args)
>>> func(4, 4, 4)
you called func(4, 4, 4)
it returned 6
6

Solution (class)

class logged(object):
    def __init__(self, func):
        self.func = func

    def __call__(self, *args, **kwargs):
        print('you called {.__name__}({}{}{})'.format(
             func,
             str(list(args))[1:-1], # cast to list is because tuple
                                    # of length one has an extra comma
             ', ' if kwargs else '',
             ', '.join('{}={}'.format(*pair) for pair in kwargs.items()),
             ))
        val = func(*args, **kwargs)
        
        print('it returned', val)
        return val

Solution (function)

def logged(func):
    """Print out the arguments before function call and
    after the call print out the returned value
    """

    def wrapper(*args, **kwargs):
        print('you called {.__name__}({}{}{})'.format(
             func,
             str(list(args))[1:-1], # cast to list is because tuple
                                    # of length one has an extra comma
             ', ' if kwargs else '',
             ', '.join('{}={}'.format(*pair) for pair in kwargs.items()),
             ))
        val = func(*args, **kwargs)
        print('it returned', val)
        return val
    return wrapper

A version with doctests: logged.py

Exercise D2 (20 min)

Write a decorator to cache function invocation results. Store pairs arg:result in a dictionary in an attribute of the function object. The function being memoized is:

def fibonacci(n):
    assert n >= 0
    if n < 2:
        return n
    else:
        return fibonacci(n-1) + fibonacci(n-2)

Solution

def memoize(func):
    func.cache = {}
    def wrapper(n):
        try:
            ans = func.cache[n]
        except KeyError:
            ans = func.cache[n] = func(n)
        return ans
    return wrapper

@memoize
def fibonacci(n):
    """
    >>> print(fibonacci.cache)
    {}
    >>> fibonacci(1)
    1
    >>> fibonacci(2)
    1
    >>> fibonacci(10)
    55
    >>> fibonacci.cache[10]
    55
    >>> fibonacci(40)
    102334155
    """
    assert n >= 0
    if n < 2:
        return n
    else:
        return fibonacci(n-1) + fibonacci(n-2)

Exercise CM1

Write a context manager similar to @assert_raises@, which checks if the execution took at most the specified amount of time:

>>> with time_limit(10):
...       short_computation()
...
11339393393939393
>>> with time_limit(10):
...       loooong_computation()
...
⚡ function took 13s to execute — too long

Solution

import time
import functools
def time_limit(limit):
    def decorator(func):
        def wraper(*args, **kwargs):
            t = time.time()
            ans = func(*args, **kwargs)
            actual = t - time.time()
            if actual > limit:
                 print('⚡ function took %fs to execute — too long'%actual)
                 return None
            return ans
        return functools.update_wrapper(wraper, func)
    return decorator

Excercise G2 (20 min) [optional]

You are writing a file browser which displays files line by line. The list of files is specified on the commands line (in sys.argv). After displaying one line, the program waits for user input. The user can:

press Enter to display the next line
press n + Enter to forget the rest of the current file and start with the next file
or anything else + Enter to display the next line

The first part is already written: it is a function which displays the lines and queries the user for input. Your job is to write the second part — the generator read_lines with the following interface: during construction it is passed a list of files to read. If yields line after line from the first file, then from the second file, and so on. When the last file is exhausted, it stops. The user of the generator can also throw an exception into the generator (SkipThisFile) which signals the generator to skip the rest of the current file, and just yield a dummy value to be skipped.

 class SkipThisFile(Exception):
     "Tells the generator to jump to the next file in list."
     pass
 
 def read_lines(*files):
     """this is the generator to be written
 
     >>> list(read_lines('exercises.rst'))[:2]
     ['=============================', 'Advanced Python  — excercises']
     """
     for file in files:
         yield 'dummy line'
 
 def display_files(*files):
     source = read_lines(*files)
     for line in source:
         print(line, end='')
         inp = input()
         if inp == 'n':
             print('NEXT')
             source.throw(SkipThisFile) # return value is ignored

Solution

 def read_lines(*files):
     for file in files:
         for line in open(file):
             try:
                 yield line.rstrip('\n')
             except SkipThisFile:
                 yield 'dummy'
                 break

Exercise D3: plugin registration system (5 min) [optional]

This exercise is to be done at the end if time permits.

This is the plugin registration system from the lecture::

class WordProcessor(object):
    def process(self, text):
        for plugin in self.PLUGINS:
            text = plugin().cleanup(text)
        return text
 
    PLUGINS = []
    ...

@WordProcessor.plugin
class CleanMdashesExtension(object):
    def cleanup(self, text):
        return text.replace('&mdash;', u'\N{em dash}')

…implement the plugin decorator!

Solution

class WordProcessor(object):
    ...

    PLUGINS = []

    @classmethod
    def plugin(cls, plugin):
        cls.PLUGINS.append(plugin)

Excercise D4 (30 min) [optional]

This exercise is to be done at the end if time permits.

Write a decorator to memoize functions with an arbitrary set of arguments. Memoization is only possible if the arguments are hashable. If the wrapper is called with arguments which are not hashable, then the wrapped function should just be called without caching.

Note: To use args and kwargs as dictionary keys, they must be hashable, which basically means that they must be immutable. Variable args is already a tuple, which is fine, but kwargs have to be converted. One way is invoke tuple(sorted(kwargs.items())).

Example solution

import functools

def memoize(func):
    """
    >>> @memoize
    ... def f(*args, **kwargs):
    ...     ans = len(args) + len(kwargs)
    ...     print(args, kwargs, '->', ans)
    ...     return ans
    >>> f(3)
    (3,) {} -> 1
    1
    >>> f(3)
    1
    >>> f(*[3])
    1
    >>> f(a=1, b=2)
    () {'a': 1, 'b': 2} -> 2
    2
    >>> f(b=2, a=1)
    2
    >>> f([1,2,3])
    ([1, 2, 3],) {} -> 1
    1
    >>> f([1,2,3])
    ([1, 2, 3],) {} -> 1
    1
    """
    func.cache = {}
    def wrapper(*args, **kwargs):
        key = (args, tuple(sorted(kwargs.items())))
        try:
            ans = func.cache[key]
        except TypeError:
            # key is unhashable
            return func(*args, **kwargs)
        except KeyError:
            # value is not present in cache
            ans = func.cache[key] = func(*args, **kwargs)
        return ans
    return functools.update_wrapper(wrapper, func)

Exercise D5 (15 min) [really optional]

Modify deprecated2 (see the lecture slides) to take an optional argument — a function to call instead of the original function::

>>> def eot_new(): return 'EOT NEW'
>>> @deprecated3('using eot_new not {func.__name__}', eot_new)
... def eot(): return 'EOT'
...
>>> eot()
using eot_new not eot
'EOT NEW'

Exercise X1 [good for a laugh]

Execute the following code and explain the result.

f = lambda: map((yield), range(10))
for x in f(): print x

Answer

The lambda is equivalent to the following code:

def f():
    return map((yield), range(10))

The yield is executed first, and it returns something (dependent on the way that the generator is used):

def f():
    x = yield
    return map(x, range(10))

Since we are calling f() from a for loop, we use .next(), not .send(), so it is equivalent to:

def f():
    yield
    return map(None, range(10))

which is equivalent to (because map simply creates a list if None is given as the first argument):

def f():
    yield
    return range(10)

which in turn is equivalent to:

def f():
    yield

because the return value from a generator is ignored.

In case of a normal function, the final return wouldn't be allowed, because it doesn't make sense to return things from a generator function. In case of a lambda function, it's not possible to tell Python to ignore the return value, so the yield is allowed, but confusing.