Data Serialization: from pickle to databases and HDF5

Executive summary

When scientists finish their data reduction tasks, they need a way to consolidate the results in persistent storage so that the data can be easily recovered later. I'll talk about the basic tools that come with the Python standard library for this task, and introduce relational databases and general numerically oriented formats (NPY and HDF5).

The talk will be given in a tutorial style, so that people can see directly how things are done. During the tutorial, emphasis will be put on comparing serialized sizes and performance.
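As a rough illustration of the kind of comparison meant here, the sketch below serializes the same data with two standard-library tools, measures the sizes with and without zlib compression, and stores the rows in an in-memory SQLite table. It is not taken from the tutorial material: the sample data, the `records` and `sizes` names, and the `results` table are assumptions made up for this example.

```python
import json
import pickle
import sqlite3
import zlib

# Hypothetical sample data: 10,000 (index, value) measurements.
records = [(i, i * 0.5) for i in range(10_000)]

# Serialize the same data with two standard-library tools.
as_pickle = pickle.dumps(records, protocol=pickle.HIGHEST_PROTOCOL)
as_json = json.dumps(records).encode("utf-8")

# Record the serialized size of each payload, raw and zlib-compressed.
sizes = {
    "pickle": len(as_pickle),
    "json": len(as_json),
    "pickle+zlib": len(zlib.compress(as_pickle)),
    "json+zlib": len(zlib.compress(as_json)),
}
for name, size in sizes.items():
    print(f"{name:12s} {size:8d} bytes")

# Check that the pickled data can be recovered unchanged.
restored = pickle.loads(as_pickle)

# Store the same rows in a relational database (in-memory SQLite).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (idx INTEGER PRIMARY KEY, value REAL)")
conn.executemany("INSERT INTO results VALUES (?, ?)", records)
row_count = conn.execute("SELECT COUNT(*) FROM results").fetchone()[0]
conn.close()
```

The actual numbers depend on the data and the Python version, so the sketch prints the sizes rather than quoting them. Timing each step (for example with the `timeit` module) would extend the comparison to performance.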

Contents

1. The Basics

2. Numerical Binary Formats

3. Adding Compression

Preliminary slides here