Python for Data Science Syllabus
Course I: Python for Data Science
15 hours (may need more for beginners)
Environment Setup
- Installation
- Virtual Environments
- Downloads
- Connections
- Hello World!
Python Basics
- Why Do We Program?
- Types
- Syntax
- Expressions and Variables
- Strings
- String Operations
Python Data Structures
- Lists
- Tuples
- Dictionaries
- Sets
- Comprehensions
Python Programming Fundamentals
- Conditions and Branching
- Loops and Iterations
- Functions
- Objects and Classes
Working with Data in Python
- Reading Files with Open
- Writing Files with Open
- Loading Data with Pandas
- Pandas - Working with and Saving Data
- NumPy - One-Dimensional Arrays
- NumPy - Two-Dimensional Arrays
Course II: Applied Data Science with Python
45 hours
Part I: Introduction
- Introduction to Specialization
- Data Science & Jupyter Notebook
- Python Basics & Functions
- Types and Sequences
- More on Strings
- Reading and Writing CSV Files
- Dates and Times
- Advanced Python Objects, map()
- Lambda and List Comprehensions
- NumPy Basics
Part II: Data Cleansing and Processing with Pandas
- The Series Data Structure
- Querying a Series
- Data Frame Structure
- Indexing, Loading, Querying
- Missing Values
- Merging Data Frames
- Group By, Scales, Pivot Tables
- Pandas Idioms
Part III: Statistical Techniques
- Distributions
- Hypothesis Testing
Data Visualization and Representation
Part I: Basic Plots
- Matplotlib Architecture
- Basic Plotting
- Scatterplots
- Line Plots
- Bar Charts
Part II: Visualization Techniques
- Subplots
- Histograms
- Box Plots
- Plotting with Pandas
Machine Learning
Part I: Fundamentals
- Introduction to ML
- Scikit-Learn Basics
- K-Nearest Neighbours
Part II: Supervised Learning
- Overfitting and Underfitting
- Linear Regression
- Logistic Regression
- SVMs
- Cross-Validation
- Decision Trees
Part III: Model Selection and Evaluation
- Confusion Matrix
- Evaluation Metrics
- Precision-Recall and ROC Curves
- Optimizing for Metrics
Part IV: Advanced Learning (Optional)
- Naive Bayes
- Random Forests
- Gradient Boosted Trees
- Neural Networks
- Data Leakage
- Clustering
Text Classification (Optional)
- Text Mining Basics
- Regex & Text Handling
- NLTK for NLP
- Text Classification
- Feature Extraction
- Naive Bayes, SVM
- Sentiment Analysis Case Study
- YOLO Case Study
- Automation Case Study