Tutorial - Best Testing Practices for Data Science

May 13, 2017

Description

So you’re a data scientist wrangling data that’s continually avalanching in, and there are always errors cropping up: NaNs, strings where there are supposed to be integers, and more. Moreover, your team is writing code that is getting reused, but that code is failing in mysterious places. How do you solve this? Testing is the answer! In this tutorial, you will gain practical hands-on experience writing tests in a data science setting so that you can continually ensure the integrity of your code and data. You will learn how to use py.test, coverage.py, and hypothesis to write better tests for your code.
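To give a flavor of the kind of test the description has in mind, here is a minimal sketch, not taken from the tutorial materials. The helper find_bad_values and the sample values are hypothetical, invented purely for illustration; the test functions follow the test_ naming convention that py.test collects.

```python
def find_bad_values(values):
    """Return (index, value) pairs for entries that are not clean integers.

    Hypothetical helper for illustration only; not part of the tutorial.
    """
    bad = []
    for i, v in enumerate(values):
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_int = isinstance(v, int) and not isinstance(v, bool)
        # NaN is a float and "3" is a str, so both fail the check above.
        if not is_int:
            bad.append((i, v))
    return bad


def test_clean_column_has_no_bad_values():
    assert find_bad_values([1, 2, 3]) == []


def test_nan_and_string_are_flagged():
    bad = find_bad_values([1, float("nan"), "3"])
    # Compare indices only, because NaN never compares equal to itself.
    assert [i for i, _ in bad] == [1, 2]
```

Saved in a file whose name starts with test_ (for example test_data.py, a made-up name), these functions should be picked up when py.test is run in that directory.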

Instructor Bio

Eric Ma is a sixth-year PhD candidate in the Runstadler Lab in the Biological Engineering department at MIT. He studies the influenza virus, which is like a self-replicating deck of 8 poker cards, and uses Python to solve infectious disease data science problems.

Pre-Tutorial Instructions

Please follow the instructions in the GitHub repository: https://github.com/ericmjl/data-testing-tutorial

Other Notes

Food will not be provided, as we do not have sponsors for the event. Lunch options nearby in the Kendall/MIT area include Au Bon Pain, Chipotle, Clover, Champions, and more.

Meetup link: https://www.meetup.com/bostonpython/events/238341350/
