By the end of this program, participants will be able to analyze data, build statistical and machine learning models, and draw actionable insights from the results.
Overview of data science, its lifecycle, and applications. Understanding data types and sources.
Analyze a publicly available dataset and summarize key findings.
Techniques for cleaning and preparing data. Handling missing values, outliers, and categorical data.
Preprocess a messy dataset and handle missing values.
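As a rough illustration of this kind of preprocessing, the sketch below fills missing numeric values with the column median and one-hot encodes a categorical column, using only the Python standard library. The records, the `age` and `plan` fields, and the helper names `impute_median` and `one_hot` are hypothetical; in practice a library such as pandas would usually handle these steps.

```python
from statistics import median

# Hypothetical messy records: None marks a missing value.
rows = [
    {"age": 34, "plan": "basic"},
    {"age": None, "plan": "premium"},
    {"age": 52, "plan": "basic"},
    {"age": 41, "plan": None},
]

def impute_median(rows, column):
    """Return copies of the rows with missing values in `column` set to the median."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = median(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]

def one_hot(rows, column, missing_label="unknown"):
    """Replace a categorical column with one 0/1 indicator column per category."""
    values = [r[column] if r[column] is not None else missing_label for r in rows]
    categories = sorted(set(values))
    encoded = []
    for row, value in zip(rows, values):
        new_row = {k: v for k, v in row.items() if k != column}
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if value == category else 0
        encoded.append(new_row)
    return encoded

clean = one_hot(impute_median(rows, "age"), "plan")
for row in clean:
    print(row)
```

Treating a missing category as its own `unknown` level is only one option; dropping the row or imputing the most frequent category are common alternatives.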
Visualizing data and performing statistical analysis to uncover patterns.
Conduct an EDA on a real-world dataset and create visualizations.
Conducting hypothesis tests and understanding statistical significance.
Perform a hypothesis test and interpret the results.
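One way to carry out such a test without any statistics library is a permutation test, sketched below for a difference in means between two groups. This is only one of several possible tests (a t-test is the more common textbook choice), and the measurements, the function name `permutation_test`, and the 0.05 significance level are assumptions made for illustration.

```python
import random
from statistics import mean

def permutation_test(a, b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the group labels are exchangeable,
        # so shuffle the pooled data and split it back into two groups.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            extreme += 1
    # The add-one correction avoids reporting an impossible p-value of 0.
    return (extreme + 1) / (n_permutations + 1)

# Hypothetical measurements for two groups.
control = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3]
treatment = [12.9, 13.1, 12.7, 13.0, 12.8, 13.2]

alpha = 0.05  # assumed significance level
p_value = permutation_test(control, treatment)
reject_null = p_value < alpha
print(f"p = {p_value:.4f}, reject null hypothesis: {reject_null}")
```

The p-value estimates how often a difference at least as large as the observed one would arise if group membership were irrelevant; a value below `alpha` is conventionally read as statistically significant.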
Overview of machine learning models, including supervised and unsupervised learning.
Build a linear regression model using a sample dataset.
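For a single feature, a linear regression model can be fit with the closed-form least squares formulas, as in the minimal sketch below. The `hours` and `scores` data and the function names are hypothetical; a library such as scikit-learn would normally be used for anything beyond a toy example.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept

def r_squared(xs, ys, slope, intercept):
    """Fraction of the variance in ys explained by the fitted line."""
    y_bar = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical sample: hours studied versus exam score.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

slope, intercept = fit_line(hours, scores)
fit_quality = r_squared(hours, scores, slope, intercept)
print(f"score = {slope:.2f} * hours + {intercept:.2f}, R^2 = {fit_quality:.3f}")
```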
Implementing regression and classification models such as k-nearest neighbors (KNN), Decision Trees, and Random Forest.
Build a classification model to predict customer churn.
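As a small illustration, the sketch below classifies a customer with k-nearest neighbors, the simplest of the models listed for this module. The two features, the six training examples, and the `stay`/`churn` labels are invented for the example, and the features are assumed to be scaled to a comparable range already.

```python
from collections import Counter
from math import dist

def knn_predict(train_X, train_y, query, k=3):
    """Predict the majority label among the k training points nearest to `query`."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical scaled features: (monthly_usage, support_tickets).
train_X = [(0.9, 0.1), (0.8, 0.2), (0.85, 0.15), (0.2, 0.9), (0.1, 0.8), (0.15, 0.7)]
train_y = ["stay", "stay", "stay", "churn", "churn", "churn"]

print(knn_predict(train_X, train_y, (0.2, 0.8)))
print(knn_predict(train_X, train_y, (0.9, 0.2)))
```

Using an odd `k` with two classes avoids tied votes; the right value of `k` is normally chosen by validation rather than fixed in advance.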
Techniques for clustering and dimensionality reduction, including K-Means and PCA.
Apply K-Means clustering to group data points based on features.
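The core K-Means loop alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its assigned points. The sketch below shows that loop under simplifying assumptions: the data points are made up, and the initial centroids are passed in explicitly rather than chosen at random or with k-means++, so that the result is reproducible.

```python
from math import dist

def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
        for p in points
    ]

def k_means(points, centroids, n_iterations=100):
    """Run K-Means from the given initial centroids until assignments settle."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(n_iterations):
        labels = assign(points, centroids)
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                # Move the centroid to the mean of its members, axis by axis.
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                # Keep an empty cluster's centroid where it was.
                updated.append(old)
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)

# Hypothetical 2-D data with two visible groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centroids, labels = k_means(points, centroids=[points[0], points[3]])
print(centroids)
print(labels)
```

K-Means can converge to different groupings from different starting centroids, which is why library implementations typically run several initializations and keep the best result.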
Understanding accuracy, precision, recall, and F1 score. Techniques for model validation.
Evaluate multiple models and compare their performance.
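The four metrics named in this module can be computed directly from true and predicted labels, as sketched below for a binary problem. The labels and the two sets of predictions (`model_a`, `model_b`) are made up to show how models with the same accuracy can differ in precision and recall.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no predicted or actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical ground truth and predictions from two models.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
model_a = [1, 0, 1, 0, 0, 1, 1, 0]
model_b = [1, 1, 1, 1, 0, 1, 1, 0]

metrics_a = classification_metrics(y_true, model_a)
metrics_b = classification_metrics(y_true, model_b)
print("model A:", metrics_a)
print("model B:", metrics_b)
```

Here both models have the same accuracy, but the second finds every positive case at the cost of more false alarms, so which one is preferable depends on the relative cost of the two kinds of error.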
Introduction to deep learning and neural networks.
Create a simple neural network for image classification.
Advanced data visualization using Matplotlib, Seaborn, and Plotly.
Create interactive visualizations for a dataset using Plotly.
Introduction to big data frameworks such as Hadoop and Spark.
Perform basic data processing using Apache Spark.
Develop a data science project and prepare for interviews in the field.
Present your capstone project and undergo a mock interview.