Python for Data Science Syllabus

Course I: Python for Data Science

15 hours (may need more for beginners)

Environment Setup

  • Installation
  • Virtual Environments
  • Downloads
  • Connections
  • Hello World!

Python Basics

  • Why Do We Program?
  • Types
  • Syntax
  • Expressions and Variables
  • Strings
  • String Operations

Python Data Structures

  • Lists
  • Tuples
  • Dictionaries
  • Sets
  • Comprehensions
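
As an illustration of the built-in collections and comprehensions listed above, here is a minimal sketch. The sample data and variable names are invented for illustration and are not taken from the course material.

```python
# Hypothetical sample data, invented for this sketch.
temperatures = [21.5, 19.0, 23.1, 19.0]            # list: ordered, mutable
location = (48.85, 2.35)                           # tuple: ordered, immutable
capitals = {"France": "Paris", "Japan": "Tokyo"}   # dict: maps keys to values
unique_temps = set(temperatures)                   # set: duplicates are dropped

# List comprehension: convert each Celsius reading to Fahrenheit.
fahrenheit = [round(t * 9 / 5 + 32, 1) for t in temperatures]

# Dict comprehension: map each country to the length of its capital's name.
name_lengths = {country: len(city) for country, city in capitals.items()}

# Set comprehension with a filter: keep only readings above 20 degrees.
warm = {t for t in temperatures if t > 20}

print(fahrenheit, name_lengths, warm, sep="\n")
```

The comprehension forms build a new collection in a single expression, which is usually easier to read than an explicit loop that appends to an empty container.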

Python Programming Fundamentals

  • Conditions and Branching
  • Loops and Iterations
  • Functions
  • Objects and Classes

Working with Data in Python

  • Reading Files with Open
  • Writing Files with Open
  • Loading Data with Pandas
  • Pandas - Working with and Saving Data
  • NumPy - One-Dimensional Arrays
  • NumPy - Two-Dimensional Arrays

Course II: Applied Data Science with Python

45 hours

Part I: Introduction

  • Introduction to the Specialization
  • Data Science & Jupyter Notebook
  • Python Basics & Functions
  • Types and Sequences
  • More on Strings
  • Reading and Writing CSV files
  • Dates and Times
  • Advanced Python Objects, map()
  • Lambda and List Comprehensions
  • NumPy Basics
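
Several of the topics above (CSV files, dates and times, map(), lambda, list comprehensions) can be sketched together using only the standard library. The records below are hypothetical, and an in-memory buffer stands in for a file on disk; this is one possible illustration, not the course's own exercise.

```python
import csv
import io
from datetime import datetime

# Hypothetical records, invented for this sketch.
rows = [
    {"name": "Ada", "joined": "2021-03-15", "score": "88"},
    {"name": "Grace", "joined": "2020-11-02", "score": "95"},
]

# Write the records as CSV into an in-memory text buffer.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["name", "joined", "score"])
writer.writeheader()
writer.writerows(rows)

# Rewind and read the CSV back; every value comes back as a string.
buffer.seek(0)
records = list(csv.DictReader(buffer))

# map() with a lambda converts the score column to integers.
scores = list(map(lambda r: int(r["score"]), records))

# The equivalent list comprehension.
scores_lc = [int(r["score"]) for r in records]

# Parse the date strings and compute the gap between the two join dates.
join_dates = [datetime.strptime(r["joined"], "%Y-%m-%d") for r in records]
days_apart = (join_dates[0] - join_dates[1]).days

print(scores, days_apart)
```

With a real file, `open(path, newline="")` would replace the buffer; the `csv` documentation recommends `newline=""` so that line endings are handled by the csv module itself.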

Part II: Data Cleansing and Processing with Pandas

  • The Series Data Structure
  • Querying a Series
  • The DataFrame Data Structure
  • Indexing, Loading, Querying
  • Missing Values
  • Merging DataFrames
  • Group by, Scales, Pivot Tables
  • Pandas Idioms

Part III: Statistical Techniques

  • Distributions
  • Hypothesis Testing

Data Visualization and Representation

Part I: Basic Plots

  • Matplotlib Architecture
  • Basic Plotting
  • Scatterplots
  • Line Plots
  • Bar Charts

Part II: Visualization Techniques

  • Subplots
  • Histograms
  • Box Plots
  • Plotting with Pandas

Machine Learning

Part I: Fundamentals

  • Introduction to ML
  • Scikit-Learn Basics
  • K-Nearest Neighbours

Part II: Supervised Learning

  • Overfitting and Underfitting
  • Linear Regression
  • Logistic Regression
  • SVMs
  • Cross-Validation
  • Decision Trees

Part III: Model Selection and Evaluation

  • Confusion Matrix
  • Evaluation Metrics
  • Precision-Recall and ROC Curves
  • Optimizing for Metrics
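
To make the evaluation topics above concrete, the following sketch computes the four confusion-matrix counts and a few common metrics by hand for a binary classifier. The labels are hypothetical. In practice scikit-learn's `sklearn.metrics` module offers functions such as `confusion_matrix` and `precision_score`; plain Python is used here only to show the arithmetic.

```python
# Hypothetical true labels and predictions (1 = positive, 0 = negative).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 0, 0]

pairs = list(zip(y_true, y_pred))

# The four cells of the binary confusion matrix.
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

accuracy = (tp + tn) / len(pairs)
precision = tp / (tp + fp)   # of the predicted positives, how many were right
recall = tp / (tp + fn)      # of the actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)

print(f"TP={tp} TN={tn} FP={fp} FN={fn}")
print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

This sketch assumes at least one predicted positive and one actual positive; otherwise the precision or recall denominator is zero and needs special handling.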

Part IV: Advanced Learning (Optional)

  • Naive Bayes
  • Random Forests
  • Gradient Boosted Trees
  • Neural Networks
  • Data Leakage
  • Clustering

Text Classification (Optional)

  • Text Mining Basics
  • Regex & Text Handling
  • NLTK for NLP
  • Text Classification
  • Feature Extraction
  • Naive Bayes, SVM
  • Sentiment Analysis Case Study
  • YOLO Case Study
  • Automation Case Study