This block of the Python for AIML online course covers Exploratory Data Analysis, including preprocessing, handling missing values, and related topics.
Data, Data Types, and Variables
This module will walk you through the essential data types and variables.
Central Tendency and Dispersion
Central tendency is expressed by the mean, median, and mode. Dispersion describes how widely the data is spread around that central value. It is measured by the range, deviation, variance, standard deviation, and standard error.
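As a minimal sketch of these measures, the snippet below uses Python's standard `statistics` module on a small made-up list of scores. The data and variable names are illustrative assumptions, not course material. The population formulas are used for variance and standard deviation, and the sample standard deviation for the standard error.

```python
import statistics

# Hypothetical sample, chosen only for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Central tendency
mean = statistics.mean(scores)      # 5
median = statistics.median(scores)  # 4.5
mode = statistics.mode(scores)      # 4

# Dispersion
data_range = max(scores) - min(scores)   # 7
variance = statistics.pvariance(scores)  # population variance: 4
std_dev = statistics.pstdev(scores)      # population standard deviation: 2.0

# Standard error of the mean: sample standard deviation / sqrt(n)
std_error = statistics.stdev(scores) / len(scores) ** 0.5

print(mean, median, mode, data_range, variance, std_dev, std_error)
```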
5 point summary and skewness of data
The 5 point summary is a compact set of five descriptive statistics (minimum, first quartile, median, third quartile, and maximum) that summarises the spread of a dataset. Skewness characterises the degree of asymmetry of a distribution around its mean.
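The two ideas can be sketched with the standard library alone. The helper names `five_point_summary` and `skewness` and the sample data are assumptions made for illustration. Quartiles use the "inclusive" interpolation method, which is one of several conventions, and skewness is computed as the moment coefficient (third central moment divided by the second central moment raised to the power 1.5).

```python
import statistics

def five_point_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    # method="inclusive" interpolates between data points and treats the
    # sample minimum and maximum as the 0th and 100th percentiles.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)

def skewness(values):
    """Moment coefficient of skewness: m3 / m2 ** 1.5."""
    mean = statistics.fmean(values)
    n = len(values)
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5, 6, 7, 8, 9]   # hypothetical data
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]  # one large value pulls the tail right

print(five_point_summary(symmetric))
print(skewness(symmetric), skewness(right_skewed))
```

A skewness of zero indicates a symmetric distribution, while a positive value indicates a longer right tail.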
Box-plot, Covariance, and Coefficient of Correlation
This module will teach you how to build box plots and compute covariance and the coefficient of correlation using Python.
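For covariance and correlation, a hand-written sketch makes the arithmetic visible. The function names and the sample data here are assumptions for illustration. The sample (n - 1) form of covariance is used, and the correlation is Pearson's coefficient. Drawing a box plot normally relies on a plotting library, so it is left out of this sketch.

```python
def sample_covariance(xs, ys):
    """Sample covariance: sum of products of deviations divided by (n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def pearson_correlation(xs, ys):
    """Covariance divided by the product of the two sample standard deviations."""
    std_x = sample_covariance(xs, xs) ** 0.5
    std_y = sample_covariance(ys, ys) ** 0.5
    return sample_covariance(xs, ys) / (std_x * std_y)

# Hypothetical data: y is exactly twice x, so the relationship is perfectly linear.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

cov_xy = sample_covariance(x, y)
corr_xy = pearson_correlation(x, y)
print(cov_xy, corr_xy)
```

Covariance depends on the units of the two variables, whereas the correlation coefficient is rescaled to lie between -1 and 1.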
Univariate and Multivariate Analysis
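Univariate analysis examines one variable at a time, while multivariate analysis examines the relationships among several variables together. Both are used for statistical comparisons.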
Encoding Categorical Data
You will learn how to encode and transform categorical data using Python in this module.
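Two common encodings, label encoding and one-hot encoding, can be sketched in plain Python. The helper names and the colour data are illustrative assumptions. In practice, libraries such as pandas or scikit-learn are typically used for this step.

```python
def label_encode(values):
    """Map each category to an integer, ordered alphabetically."""
    categories = sorted(set(values))
    mapping = {category: index for index, category in enumerate(categories)}
    return [mapping[v] for v in values], mapping

def one_hot_encode(values):
    """Turn each value into a 0/1 vector with one position per category."""
    categories = sorted(set(values))
    rows = [[1 if v == category else 0 for category in categories] for v in values]
    return rows, categories

colours = ["red", "green", "blue", "green"]  # hypothetical categorical column

labels, mapping = label_encode(colours)
one_hot, columns = one_hot_encode(colours)

print(labels)   # [2, 1, 0, 1]
print(columns)  # ['blue', 'green', 'red']
print(one_hot)  # [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]
```

Label encoding implies an ordering between categories, so one-hot encoding is often the safer choice for unordered categories.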
Scaling and Normalization
In scaling, you change the range of your data. In normalisation, you change the shape of the distribution of your data.
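The difference can be illustrated with a small sketch. Min-max scaling is used for the range change, and a base-10 logarithm stands in for a shape-changing transform. Both choices, along with the function names and data, are assumptions made for illustration. The course may use other techniques for either step.

```python
import math

def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def log_transform(values):
    """Apply log10 to positive values, compressing a long right tail."""
    return [math.log10(v) for v in values]

evenly_spread = [10, 20, 30, 40, 50]  # hypothetical data
long_tailed = [1, 10, 100, 1000]      # hypothetical, strongly right-skewed

scaled = min_max_scale(evenly_spread)
reshaped = log_transform(long_tailed)

print(scaled)    # [0.0, 0.25, 0.5, 0.75, 1.0]
print(reshaped)  # roughly evenly spaced after the transform
```

Scaling keeps the relative spacing between points, while the logarithm changes that spacing and therefore the shape of the distribution.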
What is Preprocessing?
Data preprocessing is the process of cleaning raw data so that it can be used for machine learning. It is the first step in a machine learning project, and it is generally the most time-consuming one as well. In this module, you will learn why preprocessing is required and all the steps involved in it.
Imputing missing values
Missing values cause problems for many machine learning algorithms. The process of identifying missing values and replacing them with substitute values, such as the mean of the column, is known as data imputation.
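A minimal sketch of mean imputation follows, assuming missing entries are represented by `None`. The function name `impute_mean` and the data are hypothetical, and the mean is only one of several possible fill strategies.

```python
import statistics

def impute_mean(values):
    """Replace None entries with the mean of the observed (non-missing) values."""
    observed = [v for v in values if v is not None]
    fill_value = statistics.fmean(observed)
    return [fill_value if v is None else v for v in values]

ages = [20, None, 30, 40, None]  # hypothetical column with two missing entries

missing_positions = [i for i, v in enumerate(ages) if v is None]
imputed = impute_mean(ages)

print(missing_positions)  # [1, 4]
print(imputed)            # [20, 30.0, 30, 40, 30.0]
```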
Working with Outliers
An object that deviates notably from the rest of the objects is known as an outlier. Outliers are often caused by measurement or execution errors, although they can also be genuine extreme values. This module will teach you how to work with outliers.
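One common way to flag outliers is the 1.5 x IQR rule, sketched below with the standard library. This rule, the function name `iqr_outliers`, and the sample readings are assumptions for illustration. The module may cover other detection methods.

```python
import statistics

def iqr_outliers(values, factor=1.5):
    """Return the values lying more than factor * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - factor * iqr
    upper_fence = q3 + factor * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]

readings = [10, 12, 11, 13, 12, 11, 95]  # hypothetical data; 95 looks suspicious

outliers = iqr_outliers(readings)
cleaned = [v for v in readings if v not in outliers]

print(outliers)  # [95]
print(cleaned)   # [10, 12, 11, 13, 12, 11]
```

Whether a flagged value should be removed, capped, or kept depends on whether it is an error or a genuine observation.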
The pandas-profiling library generates a complete report for a dataset, which includes data type information, descriptive statistics, correlations, etc.