Mob: +254 721 130 397, +254 780 342 333 | Email: info@learnovate.co.ke

Python for Data Science


Course Description

Python is an open-source, interpreted, high-level language with strong support for object-oriented programming. It is among the languages most widely used by data scientists, and it offers a rich set of libraries for data science projects and applications.

Data science with Python has strong prospects. The field, and the role of the data scientist, keeps evolving and is projected to change considerably over the next ten years. Demand for data scientists is expected to grow as more companies look to hire them, which makes data science a promising career choice.

Objectives

  • Basic process of data science.
  • Python and Jupyter notebooks.
  • An applied understanding of how to manipulate and analyze uncurated datasets.
  • Basic statistical analysis and machine learning methods.
  • How to effectively visualize results.

By the end of the course, you should be able to find a dataset, formulate a research question, use the tools and techniques of this course to explore the answer to that question, and share your findings.

Target Audience

  • This course is intended for learners who have a basic knowledge of programming in any language (Java, C, C++, Pascal, Fortran, JavaScript, PHP, Python, etc.). You may have learned these basic programming skills on your own or through a programming course in high school or college.

Prerequisites

  • Before starting this course, you should have a basic knowledge of writing code in the Python programming language, using any Python IDE, and running Python programs. If you are completely new to Python, please refer to our Python tutorial to get a sound understanding of the language.

Duration

  • 10 Weeks

Course Outline

Module 1: Introduction to Data Science

  • What is Data Science?
  • What is Machine Learning?
  • What is Deep Learning?
  • What is AI?
  • Data Analytics & its types

Module 2: Introduction to Python

  • What is Python?
  • Why Python?
  • Installing Python
  • Python IDEs
  • Jupyter Notebook Overview

Module 3: Python Basics

  • Python Basic Data types
  • Lists
  • Slicing
  • IF statements
  • Loops
  • Dictionaries
  • Tuples
  • Functions
  • Arrays
  • Selection by position & Labels
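
As a small taste of this module, the sketch below touches a list, slicing, an if statement, a loop, a dictionary, a tuple and a function. The data, the names (`scores`, `grade`) and the pass mark of 65 are made up for illustration and are not taken from the course material.

```python
# Illustrative values only; not course data.
scores = [72, 85, 90, 64, 58]                  # a list

top_three = sorted(scores, reverse=True)[:3]   # slicing: the first three items
last_two = scores[-2:]                         # slicing from the end


def grade(score):
    """Return a pass/fail label (the threshold of 65 is an assumption)."""
    if score >= 65:
        return "pass"
    return "fail"


# A dictionary filled in a loop: score -> label
labels = {}
for s in scores:
    labels[s] = grade(s)

# A tuple is an immutable sequence; here (minimum, maximum)
score_range = (min(scores), max(scores))

print(top_three)
print(last_two)
print(labels)
print(score_range)
```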

Module 4: Python Packages

  • Pandas
  • NumPy
  • scikit-learn
  • Matplotlib

Module 5: Importing data

  • Reading CSV files
  • Saving Python data objects
  • Loading Python data objects
  • Writing data to a CSV file
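
The four steps above can be sketched with the standard library alone, as shown below. The file names and the two sample records are hypothetical. pandas, listed in Module 4, offers `read_csv` and `DataFrame.to_csv` for the CSV steps; this sketch avoids it so that it runs without extra packages.

```python
import csv
import os
import pickle
import tempfile

# Hypothetical records, for illustration only
rows = [
    {"name": "Amina", "score": "82"},
    {"name": "Brian", "score": "67"},
]

workdir = tempfile.mkdtemp()                  # throwaway folder for the demo
csv_path = os.path.join(workdir, "scores.csv")
pkl_path = os.path.join(workdir, "scores.pkl")

# Writing data to a CSV file
with open(csv_path, "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "score"])
    writer.writeheader()
    writer.writerows(rows)

# Reading the CSV file back (every value comes back as a string)
with open(csv_path, newline="") as f:
    loaded_rows = list(csv.DictReader(f))

# Saving a Python data object with pickle, then loading it again
with open(pkl_path, "wb") as f:
    pickle.dump(rows, f)
with open(pkl_path, "rb") as f:
    restored = pickle.load(f)

print(loaded_rows)
print(restored == rows)
```

Only unpickle files you created yourself or otherwise trust, because loading a pickle can execute arbitrary code.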

Module 6: Manipulating Data

  • Selecting rows/observations
  • Rounding numbers
  • Selecting columns/fields
  • Merging data
  • Data aggregation
  • Data munging techniques
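
The operations listed above can be illustrated on a tiny, made-up dataset. The sketch below uses plain Python lists and dictionaries so that it runs anywhere; the records, the `stores` lookup table and the threshold of 20 are assumptions for the example. In pandas the same ideas appear as boolean indexing, `DataFrame.merge` and `groupby`.

```python
from collections import defaultdict

# Hypothetical records, for illustration only
sales = [
    {"store_id": 1, "item": "pen", "amount": 12.456},
    {"store_id": 2, "item": "book", "amount": 30.104},
    {"store_id": 1, "item": "book", "amount": 25.5},
]
stores = {1: "Nairobi", 2: "Mombasa"}   # lookup table: store_id -> city

# Selecting rows/observations: keep sales above 20
large = [row for row in sales if row["amount"] > 20]

# Selecting a column/field and rounding the numbers to one decimal place
amounts = [round(row["amount"], 1) for row in sales]

# Merging: attach the city from the lookup table to each sale
merged = [{**row, "city": stores[row["store_id"]]} for row in sales]

# Aggregation: total amount per city
totals = defaultdict(float)
for row in merged:
    totals[row["city"]] += row["amount"]
totals = {city: round(total, 2) for city, total in totals.items()}

print(len(large))
print(amounts)
print(totals)
```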

Module 7: Statistics Basics

  • Central Tendency
  • Mean
  • Median
  • Mode
  • Skewness
  • Normal Distribution
  • Probability Basics
  • What is meant by probability?
  • Types of Probability
  • Odds ratio
  • Standard Deviation
  • Data deviation & distribution
  • Variance
  • Bias-variance trade-off
  • Underfitting
  • Overfitting
  • Distance metrics
  • Euclidean Distance
  • Manhattan Distance
  • Outlier analysis
  • What is an Outlier?
  • Inter Quartile Range
  • Box & whisker plot
  • Upper Whisker
  • Lower Whisker
  • Scatter plot
  • Cook’s Distance
  • Missing Value treatments
  • What is an NA?
  • Central Imputation
  • KNN imputation
  • Dummification
  • Correlation
  • Pearson correlation
  • Positive & Negative correlation
  • Error Metrics
  • Classification
    • Confusion Matrix
    • Precision
    • Recall
    • Specificity
    • F1 Score
  • Regression
    • MSE
    • RMSE
    • MAPE
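
Several of the measures in this module can be computed in a few lines. The sketch below uses only the standard library (`statistics.quantiles` and `math.dist` need Python 3.8 or later). All numbers are invented for illustration, and the 1.5 × IQR rule is the common convention for flagging outliers, assumed here rather than taken from the course notes.

```python
import math
import statistics

# Made-up sample; 30 is an obvious outlier.
values = [2, 3, 3, 4, 5, 6, 30]

# Central tendency
mean = statistics.mean(values)
median = statistics.median(values)
mode = statistics.mode(values)

# Spread (sample variance and sample standard deviation)
variance = statistics.variance(values)
std_dev = statistics.stdev(values)

# Outliers by the interquartile range (IQR) rule; box-plot whiskers
# are drawn no further than these two fences.
q1, _, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in values if v < lower_fence or v > upper_fence]

# Distance metrics between two points
a, b = (1, 2), (4, 6)
euclidean = math.dist(a, b)
manhattan = sum(abs(i - j) for i, j in zip(a, b))

# Classification error metrics from confusion-matrix counts (1 = positive)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
pairs = list(zip(y_true, y_pred))
tp = pairs.count((1, 1))   # true positives
tn = pairs.count((0, 0))   # true negatives
fp = pairs.count((0, 1))   # false positives
fn = pairs.count((1, 0))   # false negatives
precision = tp / (tp + fp)
recall = tp / (tp + fn)
specificity = tn / (tn + fp)
f1 = 2 * precision * recall / (precision + recall)

# Regression error metrics
observed = [100, 200, 400]
forecast = [110, 190, 400]
errors = [o - f for o, f in zip(observed, forecast)]
mse = sum(e ** 2 for e in errors) / len(errors)
rmse = math.sqrt(mse)
mape = 100 * sum(abs(e / o) for e, o in zip(errors, observed)) / len(errors)

print(mean, median, mode, outliers)
print(euclidean, manhattan)
print(precision, recall, specificity, f1)
print(mse, rmse, mape)
```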

Module 8: Machine Learning

Module 9: Supervised Learning

  • Linear Regression
  • Linear Equation
  • Slope
  • Intercept
  • R square value
  • Logistic regression
  • Odds ratio
  • Probability of success
  • Probability of failure
  • ROC curve
  • Bias-variance trade-off
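
To make the linear and logistic regression terms above concrete, here is a minimal sketch in plain Python. The data points are invented and roughly follow y = 2x + 1, and the log-odds value of 0.8 is arbitrary. In practice a library such as scikit-learn would do the fitting; the formulas are written out here only to show what slope, intercept, R square and odds mean.

```python
import math

# Made-up data that roughly follows y = 2x + 1
x = [1, 2, 3, 4, 5]
y = [3.1, 4.9, 7.2, 9.0, 10.8]

n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n

# Least-squares estimates for the line y = intercept + slope * x
sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
sxx = sum((xi - x_mean) ** 2 for xi in x)
slope = sxy / sxx
intercept = y_mean - slope * x_mean

# R square: share of the variation in y explained by the fitted line
predictions = [intercept + slope * xi for xi in x]
ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predictions))
ss_tot = sum((yi - y_mean) ** 2 for yi in y)
r_squared = 1 - ss_res / ss_tot


def sigmoid(z):
    """Logistic function: maps log-odds z to a probability between 0 and 1."""
    return 1 / (1 + math.exp(-z))


# Logistic regression models the log-odds as a linear function of the inputs.
log_odds = 0.8                      # arbitrary value for illustration
p_success = sigmoid(log_odds)       # probability of success
p_failure = 1 - p_success           # probability of failure
odds = p_success / p_failure        # equals exp(log_odds)
# An odds ratio compares two such odds, e.g. for two values of a predictor.

print(slope, intercept, r_squared)
print(p_success, p_failure, odds)
```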

Module 10: Unsupervised Learning

  • K-Means
  • K-Means++
  • Hierarchical Clustering
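
As an illustration of the first topic, the sketch below implements the basic K-Means loop (assign each point to its nearest centroid, then move each centroid to the mean of its points) in plain Python. The function name `k_means`, the sample points and the starting centroids are assumptions for the example. K-Means++ differs only in how the starting centroids are chosen; scikit-learn's `KMeans` class supports it through `init="k-means++"`.

```python
import math


def k_means(points, centroids, iterations=10):
    """Basic K-Means from given starting centroids (a list of tuples)."""
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)

        # Update step: move each centroid to the mean of its cluster
        new_centroids = []
        for old, cluster in zip(centroids, clusters):
            if cluster:
                new_centroids.append(
                    tuple(sum(coord) / len(cluster) for coord in zip(*cluster)))
            else:
                new_centroids.append(old)   # keep a centroid that lost all points

        if new_centroids == centroids:      # nothing moved: converged
            break
        centroids = new_centroids
    return centroids, clusters


# Two visibly separate groups of made-up 2-D points
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9.5), (8.5, 9)]
centroids, clusters = k_means(points, centroids=[(1, 1), (8, 8)])

print(centroids)
print(clusters)
```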

Module 11: Other Machine Learning algorithms

  • K-Nearest Neighbours
  • Naïve Bayes Classifier
  • Decision Tree – CART
  • Decision Tree – C5.0
  • Random Forest
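
Of the algorithms above, K-Nearest Neighbours is the simplest to write out by hand. The sketch below classifies a point by majority vote among its k closest training points. The function name `knn_predict`, the labels "A" and "B" and the sample points are hypothetical, and Euclidean distance is an assumed choice of metric.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Made-up training data: (point, label)
train = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.5), "A"),
    ((6.0, 6.5), "B"), ((7.0, 7.0), "B"), ((6.5, 8.0), "B"),
]

print(knn_predict(train, (1.8, 1.2)))
print(knn_predict(train, (6.8, 7.1)))
```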

Contact Information

Eco Bank Towers, 4th Floor Muindi Mbingu Street
P. O. Box 21857 - 00100 Nairobi

Mob: +254 780 342 333, +254 202 246145, 2246154 

Copyright © 2022 Learnovate Technologies Limited. All rights reserved