Managing Large Datasets with Python and HDF5 – O’Reilly Webcast


Are you using Python to process large numerical datasets? Over the past few years, the Hierarchical Data Format (HDF5) has emerged as the mechanism of choice for processing, archiving and sharing scientific datasets ranging from gigabytes to terabytes and beyond. With a diverse user base spanning the range from NASA to the financial industry, HDF5 lets you create high-performance, portable, self-describing containers for your data. HDF5’s flexibility and speed make it particularly well-suited to analysis in Python.

This webcast provides a practical, Python-based introduction to the world of HDF5.

This webcast, led by Andrew Collette, will cover:

– The basics of the format
– Performance
– Best practices for making sharable data files that can be read by colleagues on other platforms

About Andrew Collette

Andrew Collette holds a Ph.D. in physics from UCLA and works as a laboratory research scientist at the University of Colorado. He has worked with the Python-NumPy-HDF5 stack at two multimillion-dollar research facilities: the Large Plasma Device at UCLA (entirely standardized on HDF5) and the hypervelocity dust accelerator at the Colorado Center for Lunar Dust and Atmospheric Studies, University of Colorado at Boulder. Dr. Collette is also a leading developer of the HDF5 for Python (h5py) project.

Produced by: Yasmina Greco

