Python is a versatile language and one of the essential tools of a data engineer. This course builds on the Computing in Python course and teaches techniques for efficient data handling. Through teaching videos and practical exercises, you will learn how to handle medium to large data sets and build a basic data pipeline using Python.
What You Will Learn
Upon completion of this course, learners are expected to be able to:
- work with medium and large data sets using Python and pandas;
- overcome CPU and I/O limits using multithreading;
- use different data structures and algorithms to speed up data analysis; and
- use functional programming to build data pipelines with Python.
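To illustrate the last outcome, here is a minimal sketch of a data pipeline written in a functional style. It uses only the Python standard library. The sample data and the names `parse`, `compose`, and `pipeline` are hypothetical and are not taken from the course materials.

```python
from functools import reduce


def parse(line):
    """Split a 'name,amount' line into a (name, float) tuple."""
    name, amount = line.split(",")
    return name.strip(), float(amount)


def compose(*steps):
    """Chain steps left to right into one function over an iterable."""
    return lambda rows: reduce(lambda acc, step: step(acc), steps, rows)


# Hypothetical in-memory data standing in for lines read from a file.
raw_lines = ["alice, 10.0", "bob, -3.5", "carol, 7.25"]

pipeline = compose(
    lambda rows: map(parse, rows),                    # parse each line lazily
    lambda rows: filter(lambda r: r[1] > 0, rows),    # keep positive amounts
    lambda rows: sum(amount for _, amount in rows),   # aggregate to a total
)

total = pipeline(raw_lines)
print(total)  # 17.25
```

Because `map` and `filter` return lazy iterators, each record flows through every step one at a time, so the same pattern can be applied to inputs that do not fit in memory.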
You will need a computer or laptop with Microsoft Excel installed. The requirements are:
- For Windows: Core i3 or better, 4 GB RAM or more, MS Excel 2007 or newer
- For MacBook: ideally MS Excel 2013 or newer (some functions require this version on the Mac). If your version of MS Excel is 2011, download and install StatPlus.