2) Data Science With Python
Installing Libraries
If you installed Anaconda, you do not need to install any libraries separately: it ships with
the popular data science libraries such as Pandas, NumPy, Matplotlib, and Seaborn.
Importing Libraries
• Open your Jupyter Notebook.
• To import a library, we use the keyword import followed by the library name.
• We can use the as keyword to give a library a shorter alias.
• The common aliases are:
• pd for pandas
• np for numpy
• plt for matplotlib.pyplot
• sns for seaborn
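The import pattern above can be sketched as follows. This is a minimal illustration, assuming NumPy and Pandas are installed; the variable names and sample values are made up for the example. The two plotting imports are left as comments and follow the same pattern.

```python
# Import each library under its conventional alias.
import numpy as np
import pandas as pd

# The plotting libraries follow the same pattern (uncomment if installed):
# import matplotlib.pyplot as plt
# import seaborn as sns

# The alias is then used in place of the full library name.
values = np.array([1, 2, 3])          # illustrative sample data
table = pd.DataFrame({"x": values})   # illustrative column name
print(values)
print(table)
```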
Pandas Library for Data Science
• Pandas is a Python library for data manipulation and analysis.
• It allows exploring, cleaning, and processing tabular data.
• It provides two data structures for storing data:
• Series, which is a one-dimensional data structure
• DataFrame, which is a two-dimensional data structure
Series
DataFrame
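As a rough sketch of the two structures, the snippet below builds one Series and one DataFrame. The names and values are hypothetical sample data, not taken from the slides.

```python
import pandas as pd

# A Series is a one-dimensional, labelled sequence of values.
ages = pd.Series([25, 32, 47], name="age")

# A DataFrame is a two-dimensional table made of named columns.
people = pd.DataFrame({
    "name": ["Ann", "Bob", "Cy"],   # hypothetical sample data
    "age": [25, 32, 47],
})

print(ages.ndim)      # 1
print(people.ndim)    # 2
print(people.shape)   # (3, 2): 3 rows, 2 columns
```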
NumPy Library for Data Science
• NumPy stands for Numerical Python.
• It provides a data structure called the NumPy array, which is a grid of values.
• It also provides a collection of high-level mathematical functions that operate on
multi-dimensional NumPy arrays.
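A small sketch of both ideas, assuming an illustrative 2 x 3 grid of perfect squares: the array holds the grid, and a NumPy function is applied to every value at once.

```python
import numpy as np

# A NumPy array holding a 2 x 3 grid of values (illustrative data).
grid = np.array([[1, 4, 9],
                 [16, 25, 36]])

# A mathematical function applied to the whole array in one call.
roots = np.sqrt(grid)

print(roots)        # square root of every element
print(grid.sum())   # 91
```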
Pandas vs NumPy
NumPy
• It is used for scientific computing.
• It provides NumPy arrays, which can be multidimensional.
Pandas
• It is used for data manipulation such as storing, exploring, cleaning, and processing data.
• It provides two data structures: Series (one-dimensional) and DataFrames (two-dimensional).
Both
• NumPy and Pandas are both Python libraries for data science.
• We use Pandas for data manipulation and NumPy for mathematical computations.
• Since Pandas Series and DataFrames can be thought of as one- and two-dimensional
NumPy arrays respectively, we can apply NumPy mathematical functions to them as well.
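To illustrate the last point, here is a minimal sketch that applies a NumPy function to a Series and to a DataFrame. The variable names and values are made up for the example.

```python
import numpy as np
import pandas as pd

prices = pd.Series([4.0, 9.0, 16.0])                        # illustrative data
table = pd.DataFrame({"a": [1.0, 4.0], "b": [9.0, 16.0]})   # illustrative data

# NumPy functions work element by element and keep the Pandas type.
root_prices = np.sqrt(prices)   # still a Series
root_table = np.sqrt(table)     # still a DataFrame

print(root_prices)
print(root_table)
```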
Matplotlib Library for Data Science
• Matplotlib is a visualization Python library, i.e., it is used for plotting graphs.
• The pyplot module of Matplotlib provides the interface to the library's underlying
plotting functionality.
• We can create a number of different types of graphs using Matplotlib such as line
graphs, bar graphs, histograms, scatter plots, area plots, pie plots, and so on.
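A line graph, the first type listed above, might be drawn as in the sketch below. It assumes Matplotlib is installed; the data and axis labels are illustrative, and the "Agg" backend is selected only so the script also runs without a display (it is not needed inside a Jupyter Notebook).

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, not needed in Jupyter
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]     # illustrative data
y = [1, 4, 9, 16]

fig, ax = plt.subplots()
ax.plot(x, y)                 # draw a line graph
ax.set_xlabel("x")
ax.set_ylabel("x squared")
# In a Jupyter Notebook the figure is displayed below the cell.
```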
Seaborn Library for Data Science
• Seaborn is another visualization Python library built on top of Matplotlib.
• It extends the functionality of Matplotlib and allows creating a variety of different
graphs with less code.
NumPy Arrays
What are NumPy Arrays
• A NumPy array is a multidimensional data structure designed to handle large data sets
easily.
• NumPy's array type is called ndarray (N-dimensional array).
• We can find the number of dimensions of a NumPy array using its .ndim attribute.
• A 1-D NumPy array is one in which each element is a 0-D value (a scalar).
• We can create a NumPy array using the array() function in the NumPy library.
• We can create a NumPy array using either Python lists or tuples.
• To create a 1-D NumPy Array, we provide a 1-D Python list or tuple to the array()
function.
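A minimal sketch of creating a 1-D array from a list and from a tuple; the values are illustrative.

```python
import numpy as np

from_list = np.array([1, 2, 3, 4])    # built from a Python list
from_tuple = np.array((1, 2, 3, 4))   # built from a Python tuple

print(type(from_list))   # <class 'numpy.ndarray'>
print(from_list.ndim)    # 1
```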
Creating NumPy Arrays (2/3)
2-D NumPy Arrays
• A 2-D NumPy array is one in which each element of the outermost array is a 1-D array.
• To create a 2-D NumPy array, we provide a nested (2-D) Python list or tuple to the array()
function.
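For instance, a 2-D array could be created as in this sketch (illustrative values):

```python
import numpy as np

# A list of two inner lists becomes a 2-D array with 2 rows and 3 columns.
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

print(matrix.ndim)    # 2
print(matrix.shape)   # (2, 3)
```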
Creating NumPy Arrays (3/3)
3-D NumPy Arrays
• A 3-D NumPy array is one in which each element of the outermost array is a 2-D array.
• To create a 3-D NumPy array, we provide a nested (3-D) Python list or tuple to the array()
function.
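A 3-D array follows the same pattern with one more level of nesting. The sketch below uses illustrative values.

```python
import numpy as np

# Two 2-D arrays, each holding two 1-D arrays of three elements.
cube = np.array([[[1, 2, 3], [4, 5, 6]],
                 [[7, 8, 9], [10, 11, 12]]])

print(cube.ndim)    # 3
print(cube.shape)   # (2, 2, 3)
```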
Quiz Time
1. How many dimensions are in the array [[[1, 2, 3, 4]]]?
a) 1
b) 2
c) 3
Indexing NumPy Arrays (1/8)
1-D NumPy Arrays
• Indexing a 1-D NumPy array is the same as indexing a 1-D Python list.
• Provide the index of the element inside the square brackets to get that element.
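A short sketch of 1-D indexing with illustrative values; negative indices count from the end, as they do for Python lists.

```python
import numpy as np

arr = np.array([10, 20, 30, 40])

first = arr[0]    # 10, indexing starts at 0
last = arr[-1]    # 40, negative indices count from the end

print(first, last)
```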
Indexing NumPy Arrays (2/8)
2-D NumPy Arrays
• To index a 2-D NumPy array, we provide 2 values inside the square brackets ([ ]).
• The first value is the index of the inner array (the row).
• The second value is the index of the element inside that inner array (the column).
• In the following example, we get the first element of the second array.
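Getting the first element of the second inner array could look like the sketch below; the array contents are illustrative.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

# Index 1 selects the second inner array, index 0 its first element.
value = matrix[1, 0]

print(value)   # 4
```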
Indexing NumPy Arrays (3/8)
3-D NumPy Arrays
• To index a 3-D NumPy array, we provide 3 values inside the square brackets ([ ]).
• First value is the index of the inner 2-D array in the first dimension.
• Second value is the index of the inner 1-D array in the second dimension.
• Third value is the index of the element in the third dimension.
• In the following example, we get the first element of the second array of the first array.
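The 3-D case can be sketched the same way, using an illustrative array of shape (2, 2, 3):

```python
import numpy as np

cube = np.array([[[1, 2, 3], [4, 5, 6]],
                 [[7, 8, 9], [10, 11, 12]]])

# 0 -> first 2-D array, 1 -> its second 1-D array, 0 -> that array's first element.
value = cube[0, 1, 0]

print(value)   # 4
```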
Iterating Over NumPy Arrays (1/8)
1-D NumPy Arrays
• We can use a for loop to iterate over a 1-D array just as we do in a 1-D Python list.
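A minimal sketch of looping over a 1-D array; the running total is added only to show that each element is visited once.

```python
import numpy as np

arr = np.array([1, 2, 3])

total = 0
for element in arr:
    print(element)     # prints 1, then 2, then 3
    total += element   # illustrative: accumulate the visited values

print(total)   # 6
```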
Iterating Over NumPy Arrays (2/8)
2-D NumPy Arrays
• We use another for loop nested inside the outer for loop to iterate over the inner array.
• We print all the elements in each of the inner arrays.
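The nested loop might be written as in this sketch. The array values are illustrative, and the list named flat is a hypothetical helper that records the visiting order.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

flat = []                     # hypothetical helper: records the visiting order
for row in matrix:            # outer loop: each inner 1-D array
    for element in row:       # inner loop: each element of that array
        print(element)
        flat.append(int(element))
```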
Iterating Over NumPy Arrays (5/8)
3-D NumPy Arrays
• Each of the 2-D arrays contains 2 arrays in the second dimension, each of which is 1-D.
• We use another for loop nested within the first for loop to print these 1-D arrays.
Iterating Over NumPy Arrays (8/8)
3-D NumPy Arrays
• Each of the 1-D arrays contains 3 elements in the third dimension, each of which is a scalar (0-D).
• We use another for loop nested within the first 2 for loops to print these elements.
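Putting the three loops together gives something like the sketch below. It assumes an illustrative array of shape (2, 2, 3), matching the sizes mentioned above; the list named visited is a hypothetical helper.

```python
import numpy as np

cube = np.array([[[1, 2, 3], [4, 5, 6]],
                 [[7, 8, 9], [10, 11, 12]]])

visited = []                        # hypothetical helper: records the visiting order
for matrix in cube:                 # first dimension: each 2-D array
    for row in matrix:              # second dimension: each 1-D array
        for element in row:         # third dimension: each scalar
            print(element)
            visited.append(int(element))
```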
Mathematics for Data Science
.zeros()
• To create a NumPy array prefilled with zeros, we can use the built-in NumPy function
np.zeros().
• np.zeros() gives us an array prefilled with float zeros. To convert it into an integer
array, we use the .astype() method.
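A brief sketch of both steps, using an illustrative length of 5:

```python
import numpy as np

zeros_float = np.zeros(5)              # float zeros by default
zeros_int = zeros_float.astype(int)    # converted copy holding integers

print(zeros_float)   # [0. 0. 0. 0. 0.]
print(zeros_int)     # [0 0 0 0 0]
```

Passing dtype=int to np.zeros() directly is an alternative that avoids the conversion step.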
.ones()
• To create a NumPy array prefilled with ones, we can use the built-in NumPy function
np.ones().
• np.ones() gives us an array prefilled with float ones. To convert it into an integer
array, we use the .astype() method.
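The same pattern for ones, sketched here with an illustrative 2 x 3 shape:

```python
import numpy as np

ones_float = np.ones((2, 3))         # 2 x 3 array of float ones
ones_int = ones_float.astype(int)    # converted copy holding integers

print(ones_float)
print(ones_int)
```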
.full()
• To create a NumPy array prefilled with a specific value, we can use the built-in NumPy
function np.full().
• The first argument to np.full() is the shape of the array (an integer or a tuple of integers).
• The second argument to np.full() is the value that we want the array to be prefilled with.
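A short sketch with an illustrative shape and fill value:

```python
import numpy as np

# First argument: the shape. Second argument: the fill value.
sevens = np.full((2, 3), 7)
row = np.full(4, 0.5)

print(sevens)
print(row)
```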
Quiz Time
1. What is the correct syntax to create a NumPy array of 9 elements filled with
all zeros (float)?
a) np.zeros()
b) np.zeros(0)
c) np.zeros(9)
Scalar Operations (1/5)
Addition
• We can add a scalar to a NumPy array simply by using the (+) operator.
• The scalar quantity is added to each of the elements of the array.
• Note that adding a scalar to a Python list will result in an error.
Scalar Operations (2/5)
Subtraction
• We can subtract a scalar from a NumPy array simply by using the (-) operator.
• The scalar quantity is subtracted from each of the elements of the array.
• Note that subtracting a scalar from a Python list will result in an error.
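Scalar addition and subtraction can be sketched as follows; the values are illustrative, and the flag named list_add_failed is a hypothetical helper that records the list error.

```python
import numpy as np

arr = np.array([1, 2, 3])

added = arr + 10        # 11, 12, 13
subtracted = arr - 1    # 0, 1, 2
print(added, subtracted)

# The same operation on a plain Python list raises a TypeError.
list_add_failed = False   # hypothetical helper flag
try:
    [1, 2, 3] + 10
except TypeError as error:
    list_add_failed = True
    print("list + scalar failed:", error)
```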
Scalar Operations (3/5)
Multiplication
• We can multiply a scalar with a NumPy array simply by using the (*) operator.
• The scalar quantity is multiplied with each of the elements of the array.
• Note that multiplying a Python list by an integer does not multiply its elements; it
repeats the list that many times.
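The difference between the two behaviours is easiest to see side by side. The sketch below uses illustrative values.

```python
import numpy as np

arr = np.array([1, 2, 3])

scaled = arr * 3           # each element multiplied: 3, 6, 9
repeated = [1, 2, 3] * 3   # list repeated three times

print(scaled)
print(repeated)   # [1, 2, 3, 1, 2, 3, 1, 2, 3]
```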
Scalar Operations (4/5)
Division
• We can divide a NumPy array by a scalar simply by using the (/) operator for true (float)
division or the (//) operator for floor division.
• Each of the elements of the array is divided by the scalar.
• Note that dividing a Python list by a scalar will result in an error.
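Both division operators are sketched below with illustrative values:

```python
import numpy as np

arr = np.array([7, 8, 9])

true_div = arr / 2     # 3.5, 4.0, 4.5 (always floats)
floor_div = arr // 2   # 3, 4, 4 (rounded down)

print(true_div)
print(floor_div)
```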
Scalar Operations (5/5)
Power
• We can raise each element of a NumPy array to a power simply by using the (**)
operator.
• Note that raising the elements of a Python list using (**) operator will result in an error.
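A one-line sketch of the power operator on an illustrative array:

```python
import numpy as np

arr = np.array([1, 2, 3, 4])

squared = arr ** 2   # 1, 4, 9, 16
cubed = arr ** 3     # 1, 8, 27, 64

print(squared)
print(cubed)
```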
Transpose
• We can take the transpose of a NumPy array by using its .T attribute.
• Note that a Python list has no .T attribute, so trying to transpose a list this way will
result in an error.
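A minimal sketch of the transpose, using an illustrative 2 x 3 array:

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

transposed = matrix.T   # rows become columns

print(transposed.shape)   # (3, 2)
print(transposed)

# A plain nested list has no .T attribute.
nested_list = [[1, 2, 3], [4, 5, 6]]
print(hasattr(nested_list, "T"))   # False
```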
Element-wise Operations (1/4)
Addition
• We can add the elements of two NumPy arrays simply by using (+) operator.
• Each element of the first array is added to the corresponding element of the second
array.
• Note that the plus (+) operator does not add two Python lists element by element. Instead,
it concatenates the lists.
Element-wise Operations (2/4)
Subtraction
• We can subtract the elements of two NumPy arrays simply by using (-) operator.
• Each element of the second array is subtracted from the corresponding element of the
first array.
• Note that subtracting the elements of two Python lists using (-) operator will result in
an error.
Element-wise Operations (3/4)
Multiplication
• We can multiply the elements of two NumPy arrays simply by using (*) operator.
• Each element of the first array is multiplied by the corresponding element of the
second array.
• Note that multiplying the elements of two Python lists using (*) operator will result in
an error.
Element-wise Operations (4/4)
Division
• We can divide the elements of two NumPy arrays simply by using (/) operator.
• Each element of the first array is divided by the corresponding element of the second
array.
• Note that dividing the elements of two Python lists using (/) operator will result in an
error.
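The four element-wise operations can be sketched together as follows. The arrays are illustrative and must have the same shape here.

```python
import numpy as np

a = np.array([10, 20, 30])
b = np.array([1, 2, 5])

total = a + b        # 11, 22, 35
difference = a - b   # 9, 18, 25
product = a * b      # 10, 40, 150
quotient = a / b     # 10.0, 10.0, 6.0

print(total, difference, product, quotient)

# With plain lists, + concatenates instead of adding element by element.
joined = [10, 20, 30] + [1, 2, 5]
print(joined)   # [10, 20, 30, 1, 2, 5]
```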
Matrix Multiplication
• Apart from elementwise multiplication, NumPy also provides us with a built-in function
to compute the matrix multiplication of two arrays.
• We use the np.matmul() function from the NumPy library for matrix multiplication of
two arrays.
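The sketch below contrasts matrix multiplication with element-wise multiplication on two illustrative 2 x 2 arrays. The @ operator shown at the end is Python's matrix multiplication operator, which NumPy arrays also support.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[5, 6],
              [7, 8]])

matrix_product = np.matmul(a, b)   # rows of a times columns of b
elementwise = a * b                # corresponding elements multiplied

print(matrix_product)   # rows: [19, 22] and [43, 50]
print(elementwise)      # rows: [5, 12] and [21, 32]

same_result = a @ b     # equivalent to np.matmul(a, b)
```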
Quiz Time
1. Which of the following statements is correct?
a) * is used for array multiplication
b) * is used for element-wise multiplication
c) * is used for power
Statistics (1/7)
.min()
• The .min() method gives us the minimum value in a NumPy array.
• Python lists have no .min() method, but the function np.min() also accepts a Python list.
Statistics (2/7)
.max()
• The .max() method gives us the maximum value in a NumPy array.
• Python lists have no .max() method, but the function np.max() also accepts a Python list.
Statistics (3/7)
.sum()
• The .sum() method gives us the sum of all the values in a NumPy array.
• Python lists have no .sum() method, but the function np.sum() also accepts a Python list.
Statistics (4/7)
.mean()
• The .mean() method gives us the mean of all the values in a NumPy array.
• Python lists have no .mean() method, but the function np.mean() also accepts a Python list.
Statistics (5/7)
.std()
• The .std() method gives us the standard deviation of a NumPy array.
• Python lists have no .std() method, but the function np.std() also accepts a Python list.
Statistics (6/7)
.median()
• The np.median() function gives us the median of a NumPy array. Unlike the functions
above, it is not available as an array method, so we call np.median(arr).
• np.median() also accepts a Python list.
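The statistical functions above can be sketched on one illustrative data set. Note that the standard deviation computed here is the population standard deviation, which is NumPy's default.

```python
import numpy as np

data = np.array([2, 4, 4, 4, 5, 5, 7, 9])   # illustrative data

print(data.min())        # 2
print(data.max())        # 9
print(data.sum())        # 40
print(data.mean())       # 5.0
print(data.std())        # 2.0 (population standard deviation)
print(np.median(data))   # 4.5

# The np.* functions also accept a plain Python list.
list_mean = np.mean([1, 2, 3])   # 2.0
```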
Statistics (7/7)
• A detailed list of NumPy statistical functions can be found at the link below:
https://numpy.org/doc/stable/reference/routines.statistics.html
Resources
• https://www.w3schools.com/python/numpy/numpy_intro.asp
• https://www.tutorialspoint.com/numpy/index.htm