Data Science with Python

Introduction to Data Science

  • 1. What is Data Science?
  • 2. Importance of data science
  • 3. Demand for Data Science Professionals
  • 4. Data Science Life cycle
  • 5. Tools and Technologies used in data science
  • 6. Roles and Responsibilities of a Data Scientist

COURSE 1: STATISTICS FOR DATA SCIENCE

  • 1. Module A: Introduction to Statistics
    • a. Statistics in Business
    • b. Types of Data
    • c. Data Measurement Scales
    • d. Fundamentals of Probability
  • 2. Module B: Descriptive Statistics
    • a. Measures of central tendency (Mean, Median and Mode)
    • b. Measures of dispersion/spread (Variance and Standard Deviation)
    • c. Kurtosis and Skewness
    • d. Types of Probability Distributions
  • 3. Module C: Inferential Statistics
    • a. What is inferential statistics
    • b. Different types of Sampling techniques
    • c. Central Limit Theorem
    • d. Point estimate and Interval estimate
    • e. Creating a confidence interval for a population parameter
    • f. Characteristics of Z-distribution and T-Distribution
  • 4. Module D: Hypothesis Testing
    • a. Basics of Hypothesis Testing
    • b. Types of Tests and Rejection Regions
    • c. Types of errors: Type I and Type II errors
    • d. Parametric vs Non-Parametric Testing
    • e. ANOVA and Chi-Square tests
  • 5. Module E: Correlation & Regression
    • a. Introduction to Regression
    • b. Types of Regression
    • c. Correlation
    • d. Weak and Strong Correlation

COURSE 2: PYTHON FOR DATA SCIENCE

  • 1. Module A: Programming Basics - Python
    • a. Installing Jupyter Notebook
    • b. Python Overview
    • c. Python Operators and Operator Precedence
    • d. Getting input from the user, comments, multi-line comments
  • 2. Module B: Making Decisions and Loops - Python
    • a. Types of Operators
    • b. Data Types
    • c. Flow Controls (Loops)
    • d. Functions
    • e. List comprehensions
  • 3. Module C: Lists, Tuples, Dictionaries – Python
    • a. Python Lists, Tuples, Dictionaries
    • b. Accessing Values
    • c. Basic Operations
    • d. Indexing, Slicing, and Matrices
    • e. Built-in Functions & Methods
  • 4. Module D: Functions And Modules – Python
    • a. Introduction To Functions – Why
    • b. Defining Functions
    • c. Calling Functions
    • d. Functions With Multiple Arguments
    • e. Anonymous Functions - Lambda
  • 5. Module F: Introduction to Essential Python Libraries for Data Science
    • a. NumPy
    • b. Pandas
    • c. Matplotlib
    • d. Scikit-learn
    • e. Seaborn
  • 6. Module G: NumPy Package
    • a. Importing NumPy
    • b. NumPy overview
    • c. NumPy Array creation and basic operations
    • d. Indexing and Slicing
    • e. Iterating over array
    • f. Array manipulation
    • g. NumPy universal functions
    • h. Shape Manipulation
    • i. Stacking and Splitting Arrays
    • j. Indexing: Arrays of Indices, Boolean Arrays
  • 7. Module H: Pandas Package
    • a. Importing Pandas
    • b. Pandas overview
    • c. Object Creation: Series Object, DataFrame Object
    • d. Handling the data and exporting the data
    • e. Pandas Sorting
    • f. Indexing, Selecting and filtering
  • 8. Module I: Python Advanced: Data Munging/Wrangling with Pandas
    • a. Handling Missing Data (fillna, dropna, replace, interpolate, etc.)
    • b. Group by Method
    • c. Merging, Joining and Concatenating DataFrames
    • d. Pivot Table
    • e. Reshaping the DataFrame using melt
    • f. Crosstab
  • 9. Module J: Python Advanced: Visualization with Matplotlib and Seaborn
    • a. Introduction to Matplotlib
    • b. Creating basic charts: Line Charts, Bar Charts and Pie Charts
    • c. Plotting from Pandas objects
    • d. Saving a plot
    • e. Multiple Plots
    • f. Plot Formatting: Custom Lines, Markers, Labels, Annotations, Colors
    • g. Statistical Plots with Seaborn (Distribution Plots, Categorical Plots, Matrix and regression plots)

COURSE 3: UNDERSTANDING AND IMPLEMENTING MACHINE LEARNING

  • 1. Module A: Introduction to Machine Learning
    • a. What is Machine Learning
    • b. Applications of Machine Learning
    • c. Types of Machine Learning
    • d. Machine Learning Process
    • e. Python libraries suitable for Machine learning
  • 2. Module B: Data Processing for Machine Learning
    • a. What is data preprocessing
    • b. Exploration of data (Univariate & Bivariate analysis)
    • c. Outlier Detection and Treatment
    • d. Preprocess Data
      • i. Formatting
      • ii. Cleaning
      • iii. Sampling
    • e. Transform Data
  • 3. Module C: Algorithms for Machine learning
    • a. Supervised Learning Algorithms
      • 1. Linear Regression
        • i. Concepts and Application
        • ii. Simple Linear Regression
        • iii. Multivariate Linear Regression
        • iv. Lasso Regression
        • v. Ridge Regression
      • 2. Logistic Regression – Concepts & Application
      • 3. kNN – Concepts & Application
      • 4. Decision Tree and Random Forest – Concepts & Application
      • 5. Support Vector Machines – Concepts & Application
      • 6. Naïve Bayes – Concepts & Application
    • b. Unsupervised Learning
      • i. k-Means Clustering
      • ii. Hierarchical Clustering
  • 4. Module D: Dimensionality Reduction Techniques
    • a. PCA – Principal Component Analysis
    • b. LDA – Linear Discriminant Analysis
  • 5. Module E: Other Topics
    • a. K-fold Cross Validation
    • b. Stratified Cross Validation
    • c. Boosting Techniques
      • i. AdaBoost
      • ii. XGBoost

About Instructor

KudVenkat

Software Architect, Trainer, Author and Speaker at Pragim Technologies.
