Python for Data Engineering


Development Academy of the Philippines
Enrollment is Closed

Course Overview

Python is a versatile language and one of a data engineer's essential tools. This course builds on the Computing in Python course and teaches techniques for efficient data handling. Through teaching videos and practical exercises, participants learn how to handle medium to large data sets and build a basic data pipeline using Python.

Note to SPARTA scholars: Upon enrollment, you will have 6 months to finish a SPARTA course. Failure to complete the course in 6 months and/or inactivity for 3 months will result in course access revocation.

What You Will Learn

Upon completion of this course, learners are expected to be able to:

  • work with medium and large data sets using Python and pandas;
  • overcome CPU and I/O limits using multithreading;
  • use different data structures and algorithms to speed up data analysis; and
  • use functional programming to build data pipelines with Python.

Course Instructors

Simon Lorenzo

Subject Matter Expert



Pierre Allan Villena

Subject Matter Expert



Course Content

Week 1: Introduction to Python for Data Engineering

10 Videos | 3 Activities

10 Videos

  • Welcome to the course!
  • Variable Types
  • Variable Characteristics
  • Lists
  • Dictionaries
  • Tuples
  • Libraries
  • Conditional Statements
  • Loop
  • Functions

3 Activities

  • Recall Activities (2)
  • Peer-Graded Assignment: Creating Functions

Week 2: Gathering and Storing Data

5 Videos | 3 Activities

5 Videos

  • Handling CSV
  • Handling Excel File
  • Handling PDF Files
  • Scraping from Websites
  • Loading to Persistent Storage

3 Activities

  • Recall Activities (2)
  • Peer-Graded Assignment: Gathering and Storing Data

Week 3: Cleaning and Preparing Data (Part 1)

5 Videos | 2 Activities

5 Videos

  • Inspecting DataFrames
  • Editing Columns of DataFrames
  • Mapping Existing Values
  • Removing Duplicate Rows
  • Retrieving Specific Columns and Rows

2 Activities

  • Recall Activity
  • Peer-Graded Assignment: Data Preprocessing Part 1

Week 4: Cleaning and Preparing Data (Part 2)

5 Videos | 2 Activities

5 Videos

  • Filtering and Sorting
  • Standardizing Column Values
  • Merging DataFrames
  • Dealing with Missing Values
  • Outlier Handling

2 Activities

  • Recall Activity
  • Quiz

Week 5: Feature Engineering

6 Videos | 3 Activities

6 Videos

  • Discretization
  • Variable Transformation
  • Feature Scaling
  • Interaction Features
  • Engineering Datetime Variables
  • Key Takeaways

3 Activities

  • Recall Activity
  • Peer-Graded Assignment: Feature Engineering
  • Capstone Project

  1. Course Number: SP702
  2. Classes Start
  3. Classes End
  4. Estimated Effort: 1-2 hours/week (10 hours)
  5. Price: ₱1000