Python for Data Analytics
Course Description
This training course teaches the core concepts of Python programming. In Python for Data Analytics, you will gain knowledge of data analysis, machine learning, data visualization, web scraping, and natural language processing. By the end of the course, you will have mastered the essential tools of data science with Python.
The Data Analytics with Python course provides a complete overview of data analytics tools and techniques using Python. Python is a crucial skill for many data science roles, and knowing it is key to unlocking a career as a Data Scientist or Data Analyst.
Python is a general-purpose programming language, meaning it can be used in the development of both web and desktop applications. It’s also useful in the development of complex numeric and scientific applications. With this sort of versatility, it comes as no surprise that Python is one of the fastest-growing programming languages in the world.
Through a series of hands-on exercises, students will learn to turn data into actionable information. The world is drowning in data: each day, 2.5 exabytes of data are produced, the equivalent of 250 new Libraries of Congress or 90 years of HD video. The problem is getting the data into a format that can be used by tools that help in understanding and verifying it. Python is relatively quick to learn and has a great set of tools for importing, transforming, and exploring data, extracting insights from it, making predictions with it, and exporting it. This course introduces the major Python tools for preparing data for analysis, for understanding the data, and for using the data to generate insights and predictions. All class work and exercises are done in Python 3.x.
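To give a flavor of the import, clean, transform, explore, and export workflow described above, here is a minimal sketch. It uses only the Python standard library so that it runs anywhere; the course itself covers NumPy and pandas for these steps. The sample data, the column names (`city`, `temp_f`, `temp_c`), and the helper function names are hypothetical and chosen purely for illustration.

```python
import csv
import io
import statistics

# Hypothetical sample data standing in for a real CSV file.
# Note the missing temperature for Boston.
RAW_CSV = """city,temp_f
Austin,95
Boston,
Chicago,77
Denver,86
"""


def import_rows(text):
    """Import: parse CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_and_transform(rows):
    """Clean: drop rows with a missing temperature.
    Transform: convert Fahrenheit to Celsius in a new column."""
    cleaned = []
    for row in rows:
        if not row["temp_f"]:
            continue  # skip rows with missing data
        temp_f = float(row["temp_f"])
        cleaned.append({
            "city": row["city"],
            "temp_f": temp_f,
            "temp_c": round((temp_f - 32) * 5 / 9, 1),
        })
    return cleaned


def export_csv(rows):
    """Export: write the cleaned rows back out as CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["city", "temp_f", "temp_c"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


rows = clean_and_transform(import_rows(RAW_CSV))

# Explore: a simple summary statistic over the cleaned data.
mean_c = statistics.mean(r["temp_c"] for r in rows)

exported = export_csv(rows)
print(f"{len(rows)} usable rows, mean temperature {mean_c} C")
print(exported)
```

Each stage is a separate function so that the steps stay easy to test and to swap out, for example replacing the hand-written parsing with a pandas DataFrame once that material has been covered.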
Objectives
- Learn how to use Jupyter notebooks.
- Learn how to work with NumPy datatypes.
- Be proficient in pandas Series.
- Be proficient in pandas DataFrames.
- Understand how to use data visualization.
- Know how to import and clean data.
- Become familiar with statistical tools for working with data sets.
- Understand the problems of working with PDF data sources.
- Become familiar with machine learning tools for working with data sets.
- Work through a complete data analysis to understand how the tools interact with each other.
Target Audience
- Anyone wanting to use Python as part of their data analysis program.
Prerequisites
- Basic knowledge of Python.
Duration
- 12 Weeks
Course Outline
Unit 1: Advanced Python Review
- A Python Development Environment
- A Review of Data Types
- The New Class Structure
- Python Best Practices
Unit 2: IPython Notebook
- Functionality Provided – Why Use Them?
- CRUD for Notebooks
- Interface and Shortcuts
Unit 3: NumPy
- Datatypes
- Universal Functions
- Indexing
- Summary Methods
- Sorting
- Computations and Broadcasting
Unit 4: SciPy
- Overview of SciPy
- Statistical Functions
Unit 5: Pandas: Series
- Pandas Series Structure
- Series CRUD
- Indexing and Access Techniques
- Data Methods
Unit 6: Pandas: DataFrame Basics
- DataFrame Construction
- DataFrame Change and Reorganization
- Indexing and Access Techniques
- Grouping, Pivoting, and Reshaping
- DataFrame CRUD
Unit 7: Pandas: DataFrame Data Manipulation
- Statistics
- Data Methods
- Missing Data Tools
Unit 8: Understanding Data Visualization
- Visualization Is Storytelling
- Types of Charts
- Colors: Dos and Don'ts
- Common Mistakes
- Best Practices
- Reproducibility
Unit 9: Matplotlib for Data Visualization
- Steps for Creating a Data Visualization
- Jupyter Notebooks and Matplotlib
- Matplotlib Styles
- Small Multiples
- Pandas Series Plotting
- Pandas DataFrame Plotting
Unit 10: Advanced Techniques
- Seaborn
- Bokeh
Unit 11: Data Cleaning
- Importing Data: CSV, XML, HTML, XLS
- Problems of PDF Data Sources
- Data Transformations
- Missing Data
- Time Series Problems
- Automation of Process
Unit 12: Statistics for Understanding Data
- Exploratory Data Analysis Tools: PMF, CDF, Correlation, Least Squares
- A/B Testing
- Hypothesis Testing
- Statistical Significance, P-Values, and Confidence Intervals
- Z- and T-Statistics
Unit 13: Approach to Understanding Data
- Overview of Approach
- Great Data Sources
- Class Demonstration on Data Set
- Team Project: Working a Project
Unit 14: Introduction to Statistical Techniques
- Regression and Prediction
- Classification
- K-Nearest Neighbors
- Tree Models
- Clustering
Unit 15: Introduction to Machine Learning
- Regression and Prediction
- Classification
- K-Nearest Neighbors
- Clustering
- Neural Networks
- Deep Learning