📅 Duration: 4 Weeks (Twice a Week)
🖥 Format: Online Live Sessions + Hands-on Assignments
📌 Session 1:
Optimizing DataFrames in Pandas (Efficient Memory Usage, astype())
Advanced Data Cleaning Techniques (Handling Outliers, Duplicates)
Working with Large Datasets (dask, polars, vaex)
Mini Project: Optimizing a Large E-Commerce Dataset
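To illustrate the memory-usage topic in this session, here is a minimal sketch of downcasting columns with astype(). The DataFrame, its column names, and its values are invented for the example and are not the course dataset.

```python
import pandas as pd

# Hypothetical e-commerce-style data; column names are illustrative only.
df = pd.DataFrame({
    "order_id": range(1, 10_001),
    "quantity": [1, 2, 3, 4, 5] * 2_000,
    "category": ["books", "toys", "garden", "music"] * 2_500,
})

before = df.memory_usage(deep=True).sum()

# Downcast integers to the smallest width that still holds the values,
# and store the low-cardinality text column as a categorical.
optimized = df.astype({
    "order_id": "int32",
    "quantity": "int8",
    "category": "category",
})

after = optimized.memory_usage(deep=True).sum()
print(f"before: {before} bytes, after: {after} bytes")
```

Downcasting is only safe when the target type can represent every value in the column, so it is worth checking the column's minimum and maximum first.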
📌 Session 2:
SQL for Data Analysis (Joins, Subqueries, Window Functions)
Integrating Pandas with SQL (sqlite3, SQLAlchemy)
Writing Efficient Queries for Large Databases
Mini Project: Analyzing Customer Transactions with SQL
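As a small sketch of the SQL topics in this session, the example below combines a join, a subquery, and a window function using Python's built-in sqlite3 module. The tables, names, and amounts are made up for illustration, and it assumes the bundled SQLite is version 3.25 or newer (needed for window functions).

```python
import sqlite3

# In-memory database with two hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE transactions (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        amount REAL
    );
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO transactions (customer_id, amount) VALUES
        (1, 10.0), (1, 30.0), (2, 5.0), (2, 15.0), (2, 20.0);
""")

# Join: attach customer names to transactions.
# Subquery: keep transactions above half the overall average amount.
# Window function: running total per customer, in transaction order.
query = """
    SELECT c.name,
           t.amount,
           SUM(t.amount) OVER (
               PARTITION BY c.id ORDER BY t.id
           ) AS running_total
    FROM transactions AS t
    JOIN customers AS c ON c.id = t.customer_id
    WHERE t.amount > (SELECT AVG(amount) FROM transactions) / 2
    ORDER BY c.id, t.id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The same query string and connection can be passed to pandas.read_sql_query(query, conn) to load the result as a DataFrame.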
📌 Session 3:
Introduction to Time-Series Analysis
Working with Time-Series Data in Pandas (resample(), Rolling Windows)
Decomposing Time-Series (Trend, Seasonality, Noise)
Mini Project: Forecasting Sales Trends with Time-Series Data
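For the resample() and rolling-window topics in this session, a minimal sketch on synthetic data might look like the following. The series is a made-up run of 28 daily values, not real sales data.

```python
import pandas as pd

# Hypothetical daily sales: 28 days starting Monday 2024-01-01, values 1..28.
idx = pd.date_range("2024-01-01", periods=28, freq="D")
sales = pd.Series(range(1, 29), index=idx, name="sales")

# Aggregate daily values into weekly totals (weeks ending on Sunday).
weekly = sales.resample("W").sum()

# 7-day rolling mean to smooth day-to-day variation; the first six
# entries are NaN because the window is not yet full.
rolling = sales.rolling(window=7).mean()

print(weekly)
print(rolling.tail())
```

resample() changes the frequency of the series (one row per week here), while rolling() keeps the original daily frequency and computes a statistic over a sliding window.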
📌 Session 4:
Advanced Data Visualization Techniques
Creating Interactive Dashboards (Plotly, Dash, Streamlit)
Geographic Data Visualization with geopandas & folium
Mini Project: Building an Interactive Sales Dashboard
📌 Session 5:
Introduction to Inferential Statistics & Hypothesis Testing
Linear & Multiple Regression for Data Analysis
Feature Engineering & Selection for Predictive Analysis
Mini Project: Predicting House Prices Using Regression Models
📌 Session 6:
Introduction to Machine Learning for Data Analysis (scikit-learn)
Clustering Techniques (K-Means, DBSCAN)
Anomaly Detection for Fraud Analysis
Mini Project: Customer Segmentation Using Clustering
📌 Session 7:
Handling Big Data with PySpark & Hadoop Basics
Automating Data Pipelines with Airflow
Working with APIs & Web Scraping for Data Collection
Mini Project: Scraping and Analyzing Twitter Sentiment
📌 Session 8:
Final Project Implementation & Debugging
Code Optimization & Performance Tuning
Final Project Showcase & Code Review
Next Steps: Deep Learning, AI-Driven Analytics, Data Engineering