
Ccs346 Eda Unit 1

Uploaded by

ANITHARANI K

CCS346 EXPLORATORY DATA ANALYSIS                L T P C
                                                2 0 2 3

COURSE OBJECTIVES:
• To outline an overview of exploratory data analysis.
• To implement data visualization using Matplotlib.
• To perform univariate data exploration and analysis.
• To apply bivariate data exploration and analysis.
• To use data exploration and visualization techniques for multivariate and time series data.

UNIT I EXPLORATORY DATA ANALYSIS 6


EDA fundamentals – Understanding data science – Significance of EDA – Making sense of data –
Comparing EDA with classical and Bayesian analysis – Software tools for EDA - Visual Aids for
EDA- Data transformation techniques-merging database, reshaping and pivoting,
Transformation techniques.

UNIT II EDA USING PYTHON 6


Data Manipulation using Pandas – Pandas Objects – Data Indexing and Selection – Operating on
Data – Handling Missing Data – Hierarchical Indexing – Combining datasets – Concat, Append,
Merge and Join – Aggregation and grouping – Pivot Tables – Vectorized String Operations.

UNIT III UNIVARIATE ANALYSIS 6


Introduction to Single variable: Distribution Variables - Numerical Summaries of Level and
Spread - Scaling and Standardizing – Inequality.

UNIT IV BIVARIATE ANALYSIS 6


Relationships between Two Variables - Percentage Tables - Analysing Contingency Tables -
Handling Several Batches - Scatterplots and Resistant Lines.

UNIT V MULTIVARIATE AND TIME SERIES ANALYSIS 6


Introducing a Third Variable - Causal Explanations - Three-Variable Contingency Tables and
Beyond – Fundamentals of TSA – Characteristics of time series data – Data Cleaning – Time-
based indexing – Visualizing – Grouping – Resampling.
30 PERIODS

PRACTICAL EXERCISES: 30 PERIODS


1. Install the data Analysis and Visualization tool: R/ Python /Tableau Public/ Power BI.
2. Perform exploratory data analysis (EDA) with datasets like email data set. Export all your
emails as a dataset, import them inside a pandas data frame, visualize them and get different
insights from the data.
3. Working with NumPy arrays, Pandas data frames, and basic plots using Matplotlib.
4. Explore various variable and row filters in R for cleaning data. Apply various plot features in
R on sample data sets and visualize.
5. Perform Time Series Analysis and apply the various visualization techniques.
6. Perform Data Analysis and representation on a Map using various Map data sets with Mouse
Rollover effect, user interaction, etc.
7. Build cartographic visualization for multiple datasets involving various countries of the
world; states and districts in India etc.
8. Perform EDA on Wine Quality Data Set.
9. Use a case study on a data set and apply the various EDA and visualization techniques and
present an analysis report.

COURSE OUTCOMES:
At the end of this course, the students will be able to:
CO1: Understand the fundamentals of exploratory data analysis.
CO2: Implement the data visualization using Matplotlib.
CO3: Perform univariate data exploration and analysis.
CO4: Apply bivariate data exploration and analysis.
CO5: Use Data exploration and visualization techniques for multivariate and time series data.
TOTAL: 60 PERIODS

TEXT BOOKS:

1. Suresh Kumar Mukhiya, Usman Ahmed, “Hands-On Exploratory Data Analysis with Python”,
Packt Publishing, 2020. (Unit 1)
2. Jake VanderPlas, "Python Data Science Handbook: Essential Tools for Working with Data",
First Edition, O'Reilly, 2017. (Unit 2)
3. Catherine Marsh, Jane Elliott, “Exploring Data: An Introduction to Data Analysis for Social
Scientists”, Wiley Publications, 2nd Edition, 2008. (Unit 3,4,5)

REFERENCES:

1. Eric Pimpler, Data Visualization and Exploration with R, GeoSpatial Training service, 2017.
2. Claus O. Wilke, “Fundamentals of Data Visualization”, O’Reilly Publications, 2019.
3. Matthew O. Ward, Georges Grinstein, Daniel Keim, “Interactive Data Visualization:
Foundations, Techniques, and Applications”, 2nd Edition, CRC press, 2015.
Visual Aids for EDA
As data scientists, we have two important goals: to extract knowledge from the data and
to present the data to stakeholders. Presenting results to stakeholders is difficult
because the audience may not have enough technical know-how to follow programming
jargon and other technicalities. Visual aids are therefore very useful tools. In this
chapter, we focus on the different types of visual aids and visualization techniques
that can be used with our datasets.
In this chapter, we will cover the following topics:
• Line chart
• Bar chart
• Scatter plot
• Area plot and stacked plot
• Pie chart
• Table chart
• Polar chart
• Histogram
• Lollipop chart
• Choosing the best chart
• Other libraries to explore

Line chart
Line plots or line graphs are a fundamental type of chart used to represent data points
connected by straight lines. They are widely used to illustrate trends or changes in
data over time or across categories. Line plots are easy to understand, versatile, and
can be used to visualize different types of data, making them useful tools in data
analysis and communication.

Creating line plots:

When it comes to creating line plots in Python, you have two primary libraries to choose
from: `Matplotlib` and `Seaborn`.

Using “Matplotlib”:

`Matplotlib` is a highly customizable library that can produce a wide range of plots, including
line plots. With Matplotlib, you can specify the appearance of your line plots using a variety
of options such as line style, color, marker, and label.

1. “Single” line plot:

A single-line plot is used to display the relationship between two variables, where one
variable is plotted on the x-axis and the other on the y-axis. This type of plot is best used for
displaying trends over time, as it allows you to see how one variable changes in response to
the other over a continuous period.

Steps involved
Let's look at the process of creating the line chart:
1. Load and prepare the dataset.
2. Import the matplotlib library. It can be done with this command:
import matplotlib.pyplot as plt
3. Plot the graph:
plt.plot(df)
4. Display it on the screen:
plt.show()

#Loading Library
import matplotlib.pyplot as plt

#Creating sample Dataset


x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

#Creating plot and customizing visualization


plt.plot(x, y, color='green', linestyle='dashed',
linewidth=2, marker='o',
markerfacecolor='blue', markersize=8)

#Assigning labels
plt.xlabel('X-axis label')
plt.ylabel('Y-axis label')

#Giving title to graph


plt.title('Sample graph')
plt.show()

Limitations of line plots:

Line plots have some limitations that need to be considered when using them for data
visualization. These include:

1. Limited data types: Line plots are not suitable for all types of data. For example,
they may not work well with data that has multiple categories or data with nonlinear
relationships.
2. Can be misleading: If the scale of the y-axis is not carefully chosen, line plots can be
misleading. It is important to choose appropriate scales to avoid misinterpretation of
the data.
3. Lack of context: Line plots only show the relationship between two variables, and do
not provide context about other factors that may be influencing the data.
4. Limited visual impact: Line plots may not be as visually impactful as other types of
data visualizations, such as bar charts or scatter plots.
5. Difficulty comparing multiple datasets: When using multiple line plots to compare
different datasets, it can be difficult to visually compare the lines if they are not
plotted on the same scale or with the same y-axis limits
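
The last limitation can be reduced by drawing the lines on one shared Axes so that they use the same scale. A minimal sketch, assuming two made-up series (sales_a and sales_b are illustrative names and values, not data from this unit):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt

# Hypothetical data: two series measured at the same x positions
x = [1, 2, 3, 4, 5]
sales_a = [2, 4, 6, 8, 10]
sales_b = [3, 5, 4, 7, 9]

fig, ax = plt.subplots()
# Both lines share one y-axis, so their heights are directly comparable
ax.plot(x, sales_a, marker='o', label='Series A')
ax.plot(x, sales_b, marker='s', linestyle='dashed', label='Series B')
ax.set_ylim(0, 12)  # explicit y-axis limits that apply to both lines
ax.set_xlabel('X-axis label')
ax.set_ylabel('Y-axis label')
ax.legend()
```

Call plt.show() afterwards to display the figure when working interactively.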

Bar charts

This is one of the most common types of visualization, and almost everyone has
encountered it. Bars can be drawn horizontally or vertically to represent categorical
variables.
Bar charts are frequently used to compare values across distinct categories or to track
variations over time. In most cases, bar charts are very convenient when the changes
are large. For example, a pharmacy in Norway might keep track of the amount of Zoloft,
a medicine prescribed to patients suffering from depression, sold every month, using the
calendar Python library to map the month numbers 1 to 12 to January through December.
The code that follows uses a simpler dataset, the GDP per capita of five countries
labelled A to E:

# Let us import the required libraries


import numpy as np
import calendar
import matplotlib.pyplot as plt
# Step 1: Capture the data in Python
country = ['A', 'B', 'C', 'D', 'E']
gdp_per_capita = [45000, 42000, 52000, 49000, 47000]
#Step 2:Create the bar chart in Python using Matplotlib
colors = ['green', 'blue', 'purple', 'brown', 'teal']
plt.bar(country, gdp_per_capita, color=colors)
plt.title('Country Vs GDP Per Capita', fontsize=14)
plt.xlabel('Country', fontsize=14)
plt.ylabel('GDP Per Capita', fontsize=14)
plt.grid(True)
# Step 3: Display the graph on the screen.
plt.show()

#Horizontal Bar
colors = ['green', 'blue', 'purple', 'brown', 'teal']
plt.barh(country, gdp_per_capita, color=colors)
plt.title('Country Vs GDP Per Capita', fontsize=14)
# With barh the values run along the X-axis and the categories along the Y-axis
plt.xlabel('GDP Per Capita', fontsize=14)
plt.ylabel('Country', fontsize=14)
plt.grid(True)
plt.show()
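
The pharmacy scenario mentioned at the start of this section can be sketched with the calendar module. The sales figures below are invented for illustration; only the month handling is the point:

```python
import calendar
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt

months = list(range(1, 13))
# Hypothetical units of Zoloft sold per month (made-up numbers)
sold_quantity = [120, 135, 150, 110, 160, 175, 140, 155, 165, 180, 190, 200]

fig, ax = plt.subplots()
ax.bar(months, sold_quantity)
# calendar.month_name[1] is the name of month 1; keep its first three letters
ax.set_xticks(months)
ax.set_xticklabels([calendar.month_name[m][:3] for m in months])
ax.set_xlabel('Month')
ax.set_ylabel('Units sold')
ax.set_title('Monthly Zoloft sales (hypothetical data)')
```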

Scatter plot

Scatter plots are also called scatter graphs, scatter charts, scattergrams, and scatter diagrams.
They use a Cartesian coordinates system to display values of typically two variables for a
set of data.
Scatter plots are typically used in two situations:
• When one continuous variable is dependent on another variable, which is under the
control of the observer
• When both continuous variables are independent
There are two important concepts: the independent variable and the dependent variable. In
statistical modelling or mathematical modelling, the values of dependent variables rely on the
values of independent variables. The dependent variable is the outcome variable being
studied. The independent variables are also referred to as regressors. The takeaway message
here is that scatter plots are used when we need to show the relationship between two
variables, and hence are sometimes referred to as correlation plots.

import matplotlib.pyplot as plt


import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
colors = np.array([0, 10, 20, 30, 40, 45, 50, 55, 60, 70, 80, 90, 100])
plt.scatter(x, y, c=colors, cmap='viridis')

plt.colorbar()
plt.show()
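
Because scatter plots are read as correlation plots, it can help to put a number on the relationship. A small sketch, reusing the x and y arrays from the example above and NumPy's corrcoef function to compute the Pearson correlation coefficient:

```python
import numpy as np

x = np.array([5, 7, 8, 7, 2, 17, 2, 9, 4, 11, 12, 9, 6])
y = np.array([99, 86, 87, 88, 111, 86, 103, 87, 94, 78, 77, 85, 86])

# corrcoef returns a 2x2 matrix; the off-diagonal entry is r between x and y
r = np.corrcoef(x, y)[0, 1]
print(f"Pearson r = {r:.2f}")
```

For this data r is negative (about -0.76), which matches the downward trend visible in the plot.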

Combine Color Size and Alpha

You can combine a colormap with different sizes of the dots. This is best visualized if the
dots are transparent:

Example

Create random arrays with 100 values for x-points, y-points, colors and sizes:

import matplotlib.pyplot as plt


import numpy as np

x = np.random.randint(100, size=(100))
y = np.random.randint(100, size=(100))
colors = np.random.randint(100, size=(100))
sizes = 10 * np.random.randint(100, size=(100))

plt.scatter(x, y, c=colors, s=sizes, alpha=0.5, cmap='nipy_spectral')

plt.colorbar()

plt.show()

# import required modules
import matplotlib.pyplot as plt

# adjust coordinates
x = [1,2,3,4,5]
y1 = [2,4,6,8,10]
y2 = [3,6,9,12,15]
# depict illustration
plt.scatter(x, y1)
plt.scatter(x,y2)
# apply legend()
plt.legend(["x*2" , "x*3"])
plt.show()

Here, we are using seaborn to load the dataset:


1. Import seaborn and set some default parameters of matplotlib:
import seaborn as sns
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = (8, 6)
plt.rcParams['figure.dpi'] = 150
2. Use style from seaborn. Try commenting out the next line and see the difference
in the graph:
sns.set()
3. Load the Iris dataset:
df = sns.load_dataset('iris')
df['species'] = df['species'].map({'setosa': 0, "versicolor": 1,
"virginica": 2})
4. Create a regular scatter plot:
plt.scatter(x=df["sepal_length"], y=df["sepal_width"], c =
df.species)

5. Create the labels for the axes:
plt.xlabel('Sepal length')
plt.ylabel('Sepal width')
6. Display the plot on the screen:
plt.show()

Stacked area chart in matplotlib with stackplot

When using matplotlib, the stackplot function lets you create a stacked area plot in
Python. The function accepts data in two forms. The first is stackplot(x, y), where x is an
array of values for the X-axis and y is a multidimensional array holding the unstacked
values of all series, one row per series. The second is stackplot(x, y1, y2, ..., yn), where
y1, y2, ..., yn are the individual unstacked arrays, one for each of the n series or
areas. See the example below for clarification.

import numpy as np
import matplotlib.pyplot as plt

# Data
x = np.arange(2015, 2021, 1)
series1 = [2, 3, 5, 3, 5, 6]
series2 = [1, 3, 5, 2, 5, 3]
series3 = [4, 1, 2, 4, 6, 1]
y = np.vstack([series1, series2, series3])

# Stacked area plot


fig, ax = plt.subplots()

ax.stackplot(x, y)
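
The second calling form described above, with one array per series, can be sketched as follows. Assuming the same three series are used, both forms should draw the same three areas:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt

x = np.arange(2015, 2021, 1)
series1 = [2, 3, 5, 3, 5, 6]
series2 = [1, 3, 5, 2, 5, 3]
series3 = [4, 1, 2, 4, 6, 1]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
# Form 1: a single 2-D array, one row per series
areas_stacked = ax1.stackplot(x, np.vstack([series1, series2, series3]))
# Form 2: each series passed as its own argument
areas_separate = ax2.stackplot(x, series1, series2, series3)
```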

Axis limits

You might have noticed that there is a gap between the areas and the vertical lines of the box
of the plot. If you want, you can set the axis limits with the following line to remove the gaps.

import numpy as np
import matplotlib.pyplot as plt

# Data
x = np.arange(2015, 2021, 1)
series1 = [2, 3, 5, 3, 5, 6]
series2 = [1, 3, 5, 2, 5, 3]
series3 = [4, 1, 2, 4, 6, 1]
y = np.vstack([series1, series2, series3])

# Stacked area plot


fig, ax = plt.subplots()

ax.stackplot(x, y)

# Set the X-axis ticks and limits


ax.set(xlim = (min(x), max(x)), xticks = x)

plt.show()

Adding a legend

Note that the stackplot function provides an argument named labels. You can pass an array of
labels for each area to this argument in case you want to add a legend to the chart with
ax.legend.

import numpy as np
import matplotlib.pyplot as plt

# Data
x = np.arange(2015, 2021, 1)

series1 = [2, 3, 5, 3, 5, 6]
series2 = [1, 3, 5, 2, 5, 3]
series3 = [4, 1, 2, 4, 6, 1]
y = np.vstack([series1, series2, series3])

# Stacked area plot


fig, ax = plt.subplots()

ax.stackplot(x, y, labels = ["G1", "G2", "G3"])


ax.legend(loc = 'upper left')

# Axis limits
ax.set(xlim = (min(x), max(x)), xticks = x)

# plt.show()

Color customization

The colors argument can be used to modify the default color palette of the area chart. You
can pass as many colors as areas to this argument, as in the example below. Recall that the
transparency of the areas can be set with alpha.

import numpy as np
import matplotlib.pyplot as plt
# Data
x = np.arange(2015, 2021, 1)
series1 = [2, 3, 5, 3, 5, 6]
series2 = [1, 3, 5, 2, 5, 3]
series3 = [4, 1, 2, 4, 6, 1]
y = np.vstack([series1, series2, series3])

# Array of colors
cols = ['#FDF5E6', '#FFEBCD', '#DEB887']

# Stacked area plot


fig, ax = plt.subplots()

ax.stackplot(x, y, labels = ["G1", "G2", "G3"],
             colors = cols, alpha = 0.9)

# Legend
ax.legend(loc = 'upper left')

# Axis limits
ax.set(xlim = (min(x), max(x)), xticks = x)

plt.show()

Baseline methods

The stackplot function provides several methods to customize the baseline through the
baseline argument: 'zero', 'sym', 'wiggle', and 'weighted_wiggle'. By default, the
baseline is zero, i.e. baseline = 'zero'.

Symmetric stacked area plot around zero (ThemeRiver)

Setting baseline = 'sym' will create a symmetric stacked area chart around zero. This is
sometimes called “ThemeRiver”.

import numpy as np
import matplotlib.pyplot as plt

# Data
x = np.arange(2015, 2021, 1)
series1 = [2, 3, 5, 3, 5, 6]
series2 = [1, 3, 5, 2, 5, 3]
series3 = [4, 1, 2, 4, 6, 1]
y = np.vstack([series1, series2, series3])

# Stacked area plot


fig, ax = plt.subplots()
ax.stackplot(x, y, baseline = 'sym')

# Axis limits
ax.set(xlim = (min(x), max(x)), xticks = x)

# plt.show()
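
To compare the baseline options side by side, a sketch like the following can be used. The names 'zero', 'sym', 'wiggle' and 'weighted_wiggle' are the values accepted by stackplot's baseline argument; the 2x2 layout and the conversion of the data to floats are choices made for this sketch:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt

x = np.arange(2015, 2021, 1)
# Float data, so that the weighted baseline is computed without integer truncation
y = np.vstack([[2, 3, 5, 3, 5, 6],
               [1, 3, 5, 2, 5, 3],
               [4, 1, 2, 4, 6, 1]]).astype(float)

baselines = ['zero', 'sym', 'wiggle', 'weighted_wiggle']
fig, axes = plt.subplots(2, 2, figsize=(10, 6))
for ax, baseline in zip(axes.flat, baselines):
    # One panel per baseline method, same data in each
    ax.stackplot(x, y, baseline=baseline)
    ax.set_title(f"baseline = '{baseline}'")
    ax.set(xlim=(x.min(), x.max()), xticks=x)
fig.tight_layout()
```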

Area plot in matplotlib

Another plot we're going to familiarize ourselves with is the area chart. It is based on the
line chart. The main difference is that in an area chart, the part between the X-axis
and the line is filled with color. Area charts and line charts are good for visualizing data that
change over time. In this topic, we will learn to create area charts with matplotlib.

Creating a simple area chart

As you remember from the introductory topic, the first step is always to import matplotlib to
your code:

import matplotlib.pyplot as plt

You are ready to plot your first area chart! We'll do it with the help of plt.fill_between(x, y).
As you can see, this function takes two arguments – two arrays of numeric values. Let's say
you want to plot the number of carrots your hamster Bonnie chomps each month of the year.
We create a variable called months that stores numbers from 1 to 12 and a carrots variable
that contains a list of 12 values: carrots consumed in one month.

The X-axis values come first: first months, then carrots, not the other way around.
months = range(1, 13)
carrots = [14, 13, 10, 15, 17, 15, 15, 13, 12, 10, 14, 11]
plt.fill_between(months, carrots)

The plot is already cool. But as usual, we can work on the clarity. What do the values on the
X- and Y-axes represent? Couldn't it be better to have all numbers from 1 to 12 on the X-axis
rather than 2, 4, 6, 8, 10, and 12? Let's address these issues.

Titles and colors


If you have experience with any other kinds of matplotlib plots before, you're familiar with
the way titles and colors are set. If not – no stress; this section is for you!

Color is specified in the plt.fill_between() function. Use the color argument with a str as
its value. Matplotlib's list of named colors can help you. "Named" means you can type the
color name as a value; for other colors, you need a hex RGB code such as "#FF8C00".

plt.fill_between(months, carrots, color="darkorange")

The plot title and the labels for both axes are set with separate function calls:
plt.xlabel(), plt.ylabel(), and plt.title(), each taking a str as its argument.

plt.xlabel("Months")
plt.ylabel("Number of carrots")
plt.title("Bonnie's monthly carrot intake")

Here's what we get as a result:

Having taken a look at the graph, we don't require additional explanations on what all these
numbers mean. As we've mentioned, it'd be nice to change the numbers on the axes. Let's
move on to that!

Changing the axes

To change the numbers on the axes, we need to use separate functions, just like for titles.
These functions are plt.xticks() and plt.yticks() that take arrays of numeric values. We want
to display the numbers from 1 to 12 on the X-axis (you may want to create a list or use
range(1, 13) to save time) and numbers from 0 to 20 with a step of 5 on the Y-axis (range(0,
21, 5)):

plt.xticks(range(1, 13))
plt.yticks(range(0, 21, 5))

Now, our graph looks like this:

Much better, isn't it? You can see that your hamster had at least 10 carrots per month, and the
average value was somewhere between 10 and 15. It's a good thing to know when you're
planning your shopping list!
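
The reading of the chart above can be checked numerically. A short sketch using the carrots list from this topic and the standard statistics module, with a dashed line marking the mean (the line style is an arbitrary choice):

```python
import statistics
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt

months = range(1, 13)
carrots = [14, 13, 10, 15, 17, 15, 15, 13, 12, 10, 14, 11]

lowest = min(carrots)               # smallest monthly intake
average = statistics.mean(carrots)  # arithmetic mean over the 12 months

fig, ax = plt.subplots()
ax.fill_between(months, carrots, color="darkorange")
# Horizontal reference line at the mean value
ax.axhline(average, color="black", linestyle="dashed", label=f"mean = {average}")
ax.set_xticks(range(1, 13))
ax.set_yticks(range(0, 21, 5))
ax.legend()
```

Here the minimum is 10 and the mean is 13.25, consistent with the description of the graph.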

Stacked area chart

Imagine that you have two hamsters. Bonnie has a friend named Clyde. This gives you two carrot
datasets that you want to plot on the same graph to compare the data. When you plot several
datasets on one area chart, it becomes a stacked area chart. You can use it to display several
series stacked on top of each other and see how much each stacked group (in our case, each
hamster) contributes to the total.

To create this type of area chart, let's refer to another function: plt.stackplot(x, y1, y2). Note
that the X-axis data still comes first and is followed by your datasets (two or more). Take a
look at the data plot of carrot consumption for Bonnie and Clyde:

months = range(1, 13)


bonnie_carrots = [14, 13, 10, 15, 17, 15, 15, 13, 12, 10, 14, 11]
clyde_carrots = [13, 17, 12, 11, 11, 10, 15, 14, 13, 12, 11, 15]

plt.stackplot(months, bonnie_carrots, clyde_carrots)


plt.xlabel("Months")
plt.ylabel("Number of carrots")
plt.title("Bonnie and Clyde's monthly carrot intake")
plt.xticks(range(1, 13))
plt.yticks(range(0, 31, 5))

We have changed the values for plt.yticks() to accommodate the new data. Let's have a look
at the result:

You can see the difference between the two datasets, but which is Bonnie and which is
Clyde? To clarify that, we need to add a legend. You can do it in two steps: first, add the
labels argument to the plt.stackplot() function, and then add the plt.legend() function without
arguments. You can change the colors in this kind of area chart too — just pass a list of str to
the colors argument (note that it's colors, not color):

plt.stackplot(months, bonnie_carrots, clyde_carrots, labels=["Bonnie", "Clyde"],


colors=["yellow", "orange"])
plt.legend()

Here's the resulting plot:

Now, our data is presented clearly and concisely.

Filling the area between two lines

Sometimes, you want to represent only the difference between two datasets, not their
cumulative total. To do it, you need to plot two separate lines using plt.plot(x, y) and add
plt.fill_between(). In this example, we also add plt.grid() to add a grid in the background that
facilitates interpretation:

plt.plot(months, bonnie_carrots, color="green", label="Bonnie")


plt.plot(months, clyde_carrots, color="purple", label="Clyde")
plt.fill_between(months, bonnie_carrots, clyde_carrots, color="darkorange")

plt.xlabel('Months')
plt.ylabel('Number of carrots')
plt.title("The difference in monthly carrot intake")
plt.xticks(range(1, 13))
plt.yticks(range(0, 21, 5))

plt.legend()
plt.grid()

This is our result:

We can see that Bonnie and Clyde had the same number of carrots in July, but Bonnie
chomped more in April, May, and June.
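
The same comparison can be made explicit in code. The sketch below uses the where argument of fill_between to shade the two cases in different colors, and lists the months in which Bonnie ate more; the color choices are arbitrary:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt

months = np.arange(1, 13)
bonnie = np.array([14, 13, 10, 15, 17, 15, 15, 13, 12, 10, 14, 11])
clyde = np.array([13, 17, 12, 11, 11, 10, 15, 14, 13, 12, 11, 15])

fig, ax = plt.subplots()
ax.plot(months, bonnie, color="green", label="Bonnie")
ax.plot(months, clyde, color="purple", label="Clyde")
# Shade separately where each hamster is ahead; interpolate extends the fill to the crossings
ax.fill_between(months, bonnie, clyde, where=bonnie > clyde,
                interpolate=True, color="lightgreen")
ax.fill_between(months, bonnie, clyde, where=bonnie < clyde,
                interpolate=True, color="plum")
ax.legend()

# Months (1 = January) in which Bonnie ate more than Clyde
bonnie_ahead = months[bonnie > clyde].tolist()
print(bonnie_ahead)
```

For this data the list is [1, 4, 5, 6, 11], so Bonnie was also ahead in January and November.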

Pie Chart

Given a set of categories or groups with their corresponding values you can make use of the
pie function from matplotlib to create a pie chart in Python. Pass the labels and the values as
input to the function to create a pie chart counterclockwise, as in the example below. Note
that by default the area of the slices will be calculated as each value divided by the sum of
values.
import matplotlib.pyplot as plt

# Data
labels = ["G1", "G2", "G3", "G4", "G5"]
value = [12, 22, 16, 38, 12]

# Pie chart
fig, ax = plt.subplots()
ax.pie(value, labels = labels)
# plt.show()
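
The share of each slice can be computed directly to see what the pie function draws. A small sketch with the same values; the wedge angles are read back from the patches that pie adds to the Axes:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt

labels = ["G1", "G2", "G3", "G4", "G5"]
value = [12, 22, 16, 38, 12]

# Each slice covers value / sum(value) of the full circle
total = sum(value)
shares = [v / total for v in value]

fig, ax = plt.subplots()
ax.pie(value, labels=labels)

# Every wedge spans 360 degrees times its share
for label, share, wedge in zip(labels, shares, ax.patches):
    span = wedge.theta2 - wedge.theta1
    print(f"{label}: share = {share:.2f}, angle = {span:.1f} degrees")
```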

Partial pie
If your values sum to less than one and you don’t want them normalized, you can set
normalize = False, so a partial pie chart will be created. With normalize = False, values
that sum to more than one raise a ValueError.

import matplotlib.pyplot as plt

# Data
labels = ["G1", "G2", "G3", "G4", "G5"]
value = [0.1, 0.2, 0.1, 0.2, 0.1]

# Pie chart
fig, ax = plt.subplots()
ax.pie(value, labels = labels, normalize = False)
# plt.show()
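
A quick check of how normalize behaves, as a sketch: values summing to less than one give a partial pie, while values summing to more than one are rejected with a ValueError when normalize = False:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# The values sum to 0.7, so only 70% of the circle is drawn
ax.pie([0.1, 0.2, 0.1, 0.2, 0.1], normalize=False)
covered = sum(w.theta2 - w.theta1 for w in ax.patches)  # total angle in degrees

# The values sum to 100, which cannot be drawn without normalizing
try:
    ax.pie([12, 22, 16, 38, 12], normalize=False)
    rejected = False
except ValueError:
    rejected = True

print(round(covered), rejected)
```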

Clockwise pie chart

As stated before, the pie chart will be created by default counterclockwise. To set a clockwise
direction set the argument counterclock as False.

import matplotlib.pyplot as plt

# Data
labels = ["G1", "G2", "G3", "G4", "G5"]
value = [12, 22, 16, 38, 12]

# Pie chart
fig, ax = plt.subplots()
ax.pie(value, labels = labels, counterclock = False)
# plt.show()

Start angle

By default, the first slice starts at the positive X-axis and the slices are laid out
counterclockwise. You can change the start angle with startangle. For example, if you set
this argument to 90, the first slice will start perpendicular to the X-axis, at the top of
the pie, and the slices will still proceed counterclockwise.

import matplotlib.pyplot as plt

# Data
labels = ["G1", "G2", "G3", "G4", "G5"]
value = [12, 22, 16, 38, 12]

# Pie chart
fig, ax = plt.subplots()
ax.pie(value, labels = labels, startangle = 90)
# plt.show()

Size (radius)

The size of the pie can be controlled with the radius argument, which defaults to 1.

import matplotlib.pyplot as plt

# Data
labels = ["G1", "G2", "G3", "G4", "G5"]
value = [12, 22, 16, 38, 12]

# Pie chart
fig, ax = plt.subplots()
ax.pie(value, labels = labels, radius = 0.5)
# plt.show()

Explode

Note that you can also explode (offset) one or some slices of the pie passing an array of the
length of the data to explode.

import matplotlib.pyplot as plt

# Data
labels = ["G1", "G2", "G3", "G4", "G5"]
value = [12, 22, 16, 38, 12]
explode = [0, 0, 0, 0.1, 0]

# Pie chart
fig, ax = plt.subplots()
ax.pie(value, labels = labels, explode = explode)
# plt.show()

Add a shadow

The pie function also allows adding a shadow to the pie setting the shadow argument to True.

import matplotlib.pyplot as plt

# Data
labels = ["G1", "G2", "G3", "G4", "G5"]
value = [12, 22, 16, 38, 12]

# Pie chart
fig, ax = plt.subplots()
ax.pie(value, labels = labels, shadow = True)
# plt.show()

Add the frame of the plot

You might have noticed that the default pie doesn’t display the typical frame of the charts
created with matplotlib. In case you want to add it you can set frame = True.

import matplotlib.pyplot as plt

# Data
labels = ["G1", "G2", "G3", "G4", "G5"]
value = [12, 22, 16, 38, 12]

# Pie chart
fig, ax = plt.subplots()
ax.pie(value, labels = labels, frame = True)
# plt.show()

Pie chart labels

Pie chart with percentages

In addition to the group labels you can also display the count or the percentages for each slice
with the autopct argument, as shown below.

import matplotlib.pyplot as plt

# Data
labels = ["G1", "G2", "G3", "G4", "G5"]
value = [12, 22, 16, 38, 12]

# Pie chart
fig, ax = plt.subplots()
ax.pie(value, labels = labels, autopct = '%1.1f%%')
# plt.show()
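
autopct also accepts a function, which is one way to show the count next to the percentage. A sketch, where count_and_percent is a hypothetical helper name chosen for this example:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt

labels = ["G1", "G2", "G3", "G4", "G5"]
value = [12, 22, 16, 38, 12]
total = sum(value)

def count_and_percent(pct):
    """Turn the percentage passed in by pie() back into a count for the label."""
    count = round(pct / 100 * total)
    return f"{count} ({pct:.1f}%)"

fig, ax = plt.subplots()
ax.pie(value, labels=labels, autopct=count_and_percent)
```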

Percentage labels distance to the origin

Note that you can customize the distance of these labels from the origin with the
pctdistance argument, given as a fraction of the radius, and display them instead of the
group labels. The default value is 0.6.

import matplotlib.pyplot as plt

# Data
labels = ["G1", "G2", "G3", "G4", "G5"]
value = [12, 22, 16, 38, 12]

# Pie chart
fig, ax = plt.subplots()
ax.pie(value, autopct = '%1.1f%%', pctdistance = 1.1)
# plt.show()

Pie chart colors

The colors argument allows customizing the fill color for each slice. You can input an array
of ordered colors to change the color for each category.

import matplotlib.pyplot as plt

# Data
labels = ["G1", "G2", "G3", "G4", "G5"]
value = [12, 22, 16, 38, 12]
colors = ["#B9DDF1", "#9FCAE6", "#73A4CA", "#497AA7", "#2E5B88"]

# Pie chart
fig, ax = plt.subplots()
ax.pie(value, labels = labels, colors = colors)
# plt.show()

Border color

In case you want to add a border you can use the wedgeprops argument and set a line width
and a border color with a dict, as in the example below.

import matplotlib.pyplot as plt

# Data
labels = ["G1", "G2", "G3", "G4", "G5"]
value = [12, 22, 16, 38, 12]
colors = ["#B9DDF1", "#9FCAE6", "#73A4CA", "#497AA7", "#2E5B88"]

# Pie chart
fig, ax = plt.subplots()
ax.pie(value, labels = labels, colors = colors,
wedgeprops = {"linewidth": 1, "edgecolor": "white"})
# plt.show()

How to Create a Table with Matplotlib?

Method 1: Create a Table using matplotlib.pyplot.table() function

In this example, we create a dataset of average subject scores for 5 consecutive years.
We import the packages and plot a line for each year. A table can be added to the
Axes using matplotlib.pyplot.table(). The table sits below the plot, with one column per
subject along the x-axis and the marks as cell values.

Syntax

matplotlib.pyplot.table(cellText=None, cellColours=None, cellLoc='right', colWidths=None,
rowLabels=None, rowColours=None, rowLoc='left', colLabels=None, colColours=None,
colLoc='center', loc='bottom', bbox=None, edges='closed', **kwargs)

# importing packages and modules
import numpy as np
import matplotlib.pyplot as plt

# average marks data for 5 consecutive years
data = [[98, 95, 93, 96, 97],
        [97, 92, 95, 94, 96],
        [98, 95, 93, 95, 94],
        [96, 94, 94, 92, 95],
        [95, 90, 91, 94, 98]]

columns = ('English', 'Maths', 'Physics',
           'Chemistry', 'Biology')
rows = ['%d academic year' % x for x in (2015, 2016, 2017, 2018, 2019)]

# Get some pastel shades for the colors
colors = plt.cm.BuPu(np.linspace(0, 0.5, len(rows)))
n_rows = len(data)

# x positions of the subjects
index = np.arange(len(columns)) + 0.3

# Plot line plots and create text labels
# for the table
cell_text = []
for row in range(n_rows):
    plt.plot(index, data[row], color=colors[row])
    cell_text.append([str(x) for x in data[row]])

# Reverse colors, text labels and row labels together
# to display the last year at the top.
colors = colors[::-1]
cell_text.reverse()
rows = rows[::-1]

# Add a table at the bottom of the axes
the_table = plt.table(cellText=cell_text,
                      rowLabels=rows,
                      rowColours=colors,
                      colLabels=columns,
                      loc='bottom')

# Adjust layout to make room for the table:
plt.subplots_adjust(left=0.2, bottom=0.2)

plt.ylabel("marks")
plt.xticks([])
plt.title('average marks in each consecutive year')

plt.show()
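
If only the table is needed, without the line plot, one option is to hide the axes and place the table in the center. A sketch using Axes.table with the same marks data (the figure size is an arbitrary choice):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt

data = [[98, 95, 93, 96, 97],
        [97, 92, 95, 94, 96],
        [98, 95, 93, 95, 94],
        [96, 94, 94, 92, 95],
        [95, 90, 91, 94, 98]]
columns = ['English', 'Maths', 'Physics', 'Chemistry', 'Biology']
rows = [f'{year} academic year' for year in (2015, 2016, 2017, 2018, 2019)]

fig, ax = plt.subplots(figsize=(8, 3))
ax.axis('off')  # hide ticks and frame so that only the table is visible
table = ax.table(cellText=[[str(v) for v in row] for row in data],
                 rowLabels=rows,
                 colLabels=columns,
                 loc='center')
ax.set_title('average marks in each consecutive year')
```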

NumPy– Introduction
NumPy (Numerical Python) is an open source Python library that’s used in almost every field of
science and engineering. It’s the universal standard for working with numerical data in Python, and
it’s at the core of the scientific Python and PyData ecosystems. NumPy users include everyone from
beginning coders to experienced researchers doing state-of-the-art scientific and industrial research
and development. The NumPy API is used extensively in Pandas, SciPy, Matplotlib, scikit-learn, scikit-
image and most other data science and scientific Python packages.

The NumPy library contains multidimensional array and matrix data structures (you’ll find more
information about this in later sections). It provides ndarray, a homogeneous n-dimensional array
object, with methods to efficiently operate on it. NumPy can be used to perform a wide variety of
mathematical operations on arrays. It adds powerful data structures to Python that guarantee
efficient calculations with arrays and matrices and it supplies an enormous library of high-level
mathematical functions that operate on these arrays and matrices.

Operations using NumPy


Using NumPy, a developer can perform the following operations:
• Mathematical and logical operations on arrays.
• Fourier transforms and routines for shape manipulation.
• Operations related to linear algebra. NumPy has in-built functions for linear algebra
and random number generation.
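
Each of these kinds of operation can be illustrated in a few lines. A minimal sketch (the array values are arbitrary):

```python
import numpy as np

a = np.arange(6).reshape(2, 3)       # shape manipulation: 6 values as 2 rows x 3 columns

doubled = a * 2                      # element-wise arithmetic
mask = a > 2                         # element-wise logical test, gives a boolean array

spectrum = np.fft.fft([1, 0, 0, 0])  # Fourier transform of a unit impulse

m = np.array([[2.0, 0.0], [0.0, 3.0]])
determinant = np.linalg.det(m)       # linear algebra: determinant
inverse = np.linalg.inv(m)           # linear algebra: matrix inverse

rng = np.random.default_rng(seed=0)  # seeded random number generator
samples = rng.random(3)              # three floats in the interval [0, 1)
```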
NumPy–Environment
The standard Python distribution doesn't come bundled with the NumPy module. A
lightweight way to install NumPy is to use the popular Python package installer,
pip.

pip install numpy

The best way to enable NumPy is to use an installable binary package specific to your operating
system. These binaries contain full SciPy stack (inclusive of NumPy, SciPy, matplotlib, IPython, SymPy
and nose packages along with core Python).

NumPy– Introduction
NumPy is a Python package. It stands for 'Numerical Python'. It is a library consisting
of multidimensional array objects and a collection of routines for processing of
array.

To test whether NumPy module is properly installed, try to import it from python prompt.

import numpy

If it is not installed, the following error message will be displayed.

Traceback (most recent call last):


File "<pyshell#0>", line 1, in <module>
import numpy
ImportError: No module named 'numpy'

Alternatively, NumPy package is imported using the following syntax:

import numpy as np

NumPy – ndarray object


The most important object defined in NumPy is an N-dimensional array type called ndarray. It
describes the collection of items of the same type. Items in the collection can be accessed using a
zero-based index.
Every item in an ndarray takes the same size of block in the memory. Each element in ndarray is an
object of data-type object (called dtype).
Any item extracted from ndarray object (by slicing) is represented by a Python object of one of array
scalar types. The following diagram shows a relationship between ndarray, data type object (dtype)
and array scalar type:
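The relationship between the three objects can also be checked directly in code. The following is a minimal sketch; the variable names are illustrative only, and int32 is requested explicitly so that the result does not depend on the platform.

```python
import numpy as np

# A small 1-D ndarray; int32 is chosen explicitly so the result is the same everywhere
arr = np.array([10, 20, 30], dtype=np.int32)

array_type = type(arr)        # the container type: numpy.ndarray
element_dtype = arr.dtype     # the dtype object that describes every element
item = arr[0]                 # extracting one item yields an array scalar
item_type = type(item)        # numpy.int32, not the built-in int

print(array_type, element_dtype, item_type)
print("bytes per element:", arr.itemsize)
```

Every element occupies the same number of bytes (itemsize), which is what allows NumPy to locate an item from its index alone.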

The basic ndarray is created using an array function in NumPy as follows:

numpy.array

It creates an ndarray from any object exposing array interface, or from any method that returns an
array.

numpy.array(object, dtype=None, copy=True, order=None, subok=False, ndmin=0)

The above constructor takes the following parameters:

object   Any object exposing the array interface, an object whose __array__ method returns an array, or any (nested) sequence

dtype    Desired data type of array, optional

copy     Optional. By default (True), the object is copied

order    C (row major) or F (column major) or A (any) (default)

subok    By default, the returned array is forced to be a base class array. If True, sub-classes are passed through

ndmin    Specifies minimum dimensions of resultant array

Fixed-Type Arrays in Python


Python offers several different options for storing data in efficient, fixed-type data buffers. The
built-in array module can be used to create dense arrays of a uniform type:

import array
L = list(range(10))
A = array.array('i', L)
A

Output:
array('i', [0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Creating Numpy array from a list


A list in Python is a linear data structure that can hold heterogeneous elements; it needs no
declaration and can grow and shrink as required. An array, on the other hand, is a data structure
that holds homogeneous elements. Arrays are provided in Python by the NumPy library and
require less memory than lists. Arrays and lists are similar in that the elements of both can be
accessed by their index value.

Example

Input: [1, 7, 0, 6, 2, 5, 6]
Output: [1 7 0 6 2 5 6]
Explanation: Given Python List is converted into NumPy Array
Convert Python List to Numpy Arrays
In Python, lists can be converted into arrays by using two methods from the NumPy library:

 Using numpy.array()

 Using numpy.asarray()

Python List to NumPy Arrays using numpy.array()


In Python, the simplest way to convert a list to a NumPy array is by using numpy.array() function. It
takes an argument and returns a NumPy array as a result. It creates a new copy in memory and
returns a new array.

# importing library
import numpy

# initializing list
lst = [1, 7, 0, 6, 2, 5, 6]

# converting list to array


arr = numpy.array(lst)

# displaying list
print ("List: ", lst)

# displaying array
print ("Array: ", arr)

Output
List: [1, 7, 0, 6, 2, 5, 6]
Array: [1 7 0 6 2 5 6]
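The second method listed above, numpy.asarray(), can be sketched in the same way. For a plain list both functions build a new array; the difference shows up when the input is already an ndarray. This is an illustrative sketch, and the variable names are not from the original notes.

```python
import numpy as np

lst = [1, 7, 0, 6, 2, 5, 6]

# From a list, both functions build a new array with the same contents
from_array = np.array(lst)
from_asarray = np.asarray(lst)

# From an existing ndarray the behaviour differs
copied = np.array(from_array)     # makes an independent copy by default
shared = np.asarray(from_array)   # returns the same object, no copy is made

from_array[0] = 99                # modify the original array
print(copied[0])                  # the copy is unaffected
print(shared[0])                  # the shared array sees the change
```

numpy.asarray() is therefore a reasonable choice when the input may already be an array and an extra copy is not wanted.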

Creating Arrays from the scratch

NumPy stands for Numerical Python. It is a Python library used for working with arrays. Plain
Python lists can serve as arrays, but they are slow to process. The NumPy array is a powerful
N-dimensional array object that is used in linear algebra, Fourier transforms, and random number
generation, and it is typically much faster than a traditional Python list.

Types of Array:
1. One Dimensional Array
2. Multi-Dimensional Array

One Dimensional Array:


A one-dimensional array is a type of linear array.

One Dimensional Array

Example:

# importing numpy module
import numpy as np

# creating list
list_1 = [1, 2, 3, 4]

# creating numpy array
sample_array = np.array(list_1)

print("List in python : ", list_1)
print("Numpy Array in python :", sample_array)
print(type(list_1))
print(type(sample_array))
Output:

List in python : [1, 2, 3, 4]


Numpy Array in python : [1 2 3 4]
<class 'list'>
<class 'numpy.ndarray'>
Multi-Dimensional Array:
Data in multidimensional arrays are stored in tabular form.

Two Dimensional Array

Example:

# importing numpy module


import numpy as np

# creating list
list_1 = [1, 2, 3, 4]
list_2 = [5, 6, 7, 8]
list_3 = [9, 10, 11, 12]

# creating numpy array


sample_array = np.array([list_1, list_2,list_3])

print("Numpy multi dimensional array in python\n", sample_array)

Output
Numpy multi dimensional array in python
[[ 1 2 3 4]
[ 5 6 7 8]
[ 9 10 11 12]]

Ones and zeros


Name           Description                                                                  Syntax
empty()        Return a new array of given shape and type, without initializing entries.   numpy.empty(shape[, dtype, order])
empty_like()   Return a new array with the same shape and type as a given array.           numpy.empty_like(a[, dtype, order, subok])
eye()          Return a 2-D array with ones on the diagonal and zeros elsewhere.           numpy.eye(N[, M, k, dtype])
identity()     Return the identity array.                                                  numpy.identity(n[, dtype])
ones()         Return a new array of given shape and type, filled with ones.               numpy.ones(shape[, dtype, order])
zeros()        Return a new array of given shape and type, filled with zeros.              numpy.zeros(shape[, dtype, order])
full()         Return a new array of given shape and type, filled with fill_value.         numpy.full(shape, fill_value[, dtype, order])

numpy.empty() function
The numpy.empty() function is used to create a new array of given shape and type, without
initializing entries. It is typically used for large arrays when performance is critical, and the values will
be filled in later.

Syntax:

numpy.empty(shape, dtype=float, order='C')

Parameters:

Name    Description                                                                          Required / Optional
shape   Shape of the empty array, e.g., (2, 3) or 2.                                         Required
dtype   Desired output data-type for the array, e.g., numpy.int8. Default is numpy.float64.  Optional
order   Whether to store multi-dimensional data in row-major ('C' for C-style) or
        column-major ('F' for Fortran-style) order in memory.                                Optional

Return value:

[ndarray] Array of uninitialized (arbitrary) data of the given shape, dtype, and order. Object arrays
will be initialized to None.

Example: Create empty NumPy arrays using np.empty()


import numpy as np
np.empty(2)

Output
array([ 6.95033087e-310, 1.69970835e-316])

np.empty(32)
Output
array([ 6.95033087e-310, 1.65350412e-316, 6.95032869e-310,
6.95032869e-310, 6.95033051e-310, 6.95033014e-310,
6.95033165e-310, 6.95033167e-310, 6.95033163e-310,
6.95032955e-310, 6.95033162e-310, 6.95033166e-310,
6.95033160e-310, 6.95033163e-310, 6.95033162e-310,
6.95033167e-310, 6.95033167e-310, 6.95033167e-310,
6.95033167e-310, 6.95033158e-310, 6.95033160e-310,
6.95033164e-310, 6.95033162e-310, 6.95033051e-310,
6.95033161e-310, 6.95033051e-310, 6.95033013e-310,
6.95033166e-310, 6.95033161e-310, 2.97403466e+289,
7.55774284e+091, 1.31611495e+294])

np.empty([2, 3])
Output
array([[ 6.95033087e-310, 1.68240973e-316, 6.95032825e-310],
[ 6.95032825e-310, 6.95032825e-310, 6.95032825e-310]])

The above code demonstrates the use of the np.empty() function in NumPy to create empty arrays of
different shapes. The np.empty() function creates an array without initializing its
values, which means that the values of the array are undefined and may vary each time the function
is called.

In the third example, np.empty([2, 3]) creates an empty 2-D array of shape (2, 3) with the default
data type float64. The resulting array contains six undefined floating-point values. The values shown
in the output are machine-dependent and may vary each time the function is called.
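Because the contents of an empty array are arbitrary, the usual pattern is to allocate it and then overwrite every element before reading it. The sketch below illustrates that pattern; the array name and the values written into it are assumptions chosen for the example.

```python
import numpy as np

# Allocate without initializing; the contents are arbitrary at this point
squares = np.empty(5, dtype=np.int64)

# Overwrite every element before the array is read
for i in range(squares.size):
    squares[i] = i * i

print(squares)
```

If some elements might be left unwritten, np.zeros() is the safer choice, at the cost of the extra initialization step.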

Example : Creating an empty array with a user-defined data type

import numpy as np

# Define a custom data type


dt = np.dtype([('Employee Name:', np.str_, 16), ('Age:', np.int32), ('Salary:', np.float64)])

# Create an empty array with the custom data type


employee = np.empty((2, 3), dtype=dt)

# Print the array


print(employee)
Output:

[[('', 0, 0.) ('', 0, 0.) ('', 0, 0.)]


[('', 0, 0.) ('', 0, 0.) ('', 0, 0.)]]
In the above example, we first define a custom data type that consists of three fields - Employee
Name (string with length 16), Age (32-bit integer), and Salary (64-bit floating-point number). We
then create an empty array with dimensions (2, 3) and data type dt. When we print the array, we see
one record of the custom data type in each position. The values are uninitialized; here they happen
to appear as empty strings and zeros, but this is not guaranteed.

NumPy: numpy.eye() function


numpy.eye() function
The eye() function is used to create a 2-D array with ones on the diagonal and zeros elsewhere.
The eye() function is commonly used in linear algebra and matrix operations. It is useful for
generating matrices to transform, rotate, or scale vectors. It can also be used in scientific computing
for solving differential equations, optimization, and signal processing.

Syntax:

numpy.eye(N, M=None, k=0, dtype=<class 'float'>, order='C')

Parameters:

Name    Description                                                                          Required / Optional
N       Number of rows in the output.                                                        Required
M       Number of columns in the output. If None, defaults to N.                             Optional
k       Index of the diagonal: 0 (the default) refers to the main diagonal, a positive
        value refers to an upper diagonal, and a negative value to a lower diagonal.         Optional
dtype   Data-type of the returned array.                                                     Optional
order   Whether the output should be stored in row-major (C-style) or column-major
        (Fortran-style) order in memory.                                                     Optional
Return value:

[ndarray of shape (N,M)] An array where all elements are equal to zero, except for the k-th diagonal,
whose values are equal to one.

Example-1: Identity Matrix using NumPy eye function

import numpy as np
np.eye(2)

Output
array([[ 1., 0.],
[ 0., 1.]])

np.eye(2,3)

Output
array([[ 1., 0., 0.],
[ 0., 1., 0.]])

np.eye(3, 3)

Output
array([[ 1., 0., 0.],
[ 0., 1., 0.],
[ 0., 0., 1.]])

In linear algebra, an identity matrix is a square matrix with ones on the main diagonal and zeros
elsewhere.

In the first example, np.eye(2) creates a 2x2 identity matrix, with 2 rows and 2 columns.

In the second example, np.eye(2,3) creates a 2x3 matrix with ones on the main diagonal; the first
argument specifies the number of rows and the second argument specifies the number of columns.
Because it is not square, it is not an identity matrix in the strict sense.

In the third example, np.eye(3,3) creates a 3x3 identity matrix, with 3 rows and 3 columns.

Example: Create a matrix with ones on an offset diagonal:


import numpy as np

# Create a 5x5 matrix with ones on the diagonal offset by 1


nums = np.eye(5, k=1, dtype=int)
print(nums)

Output:

[[0 1 0 0 0]
[0 0 1 0 0]
[0 0 0 1 0]
[0 0 0 0 1]
[0 0 0 0 0]]
In the above example, we create a matrix with dimensions (5, 5) and a diagonal offset of 1. This
means that the ones are shifted one position to the right of the main diagonal, resulting in a matrix
with 1's on the first upper diagonal and 0's elsewhere.
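The k parameter also accepts negative values, which move the ones below the main diagonal. The following sketch compares a positive and a negative offset; the names upper and lower are illustrative.

```python
import numpy as np

upper = np.eye(4, k=1, dtype=int)    # ones on the first diagonal above the main one
lower = np.eye(4, k=-1, dtype=int)   # ones on the first diagonal below the main one

# np.diag with the same k reads the shifted diagonal back out
print(np.diag(upper, k=1))
print(np.diag(lower, k=-1))

# Each shifted diagonal of a 4x4 matrix holds 3 elements
print(upper.sum(), lower.sum())
```

The two results are transposes of each other, which is one quick way to check that the offset has the intended sign.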
NumPy: numpy.ones() function
numpy.ones() function
The numpy.ones() function is used to create a new array of given shape and type, filled with ones.
The ones() function is useful in situations where we need to create an array of ones with a specific
shape and data type, for example in matrix operations or in initializing an array with default values.
Syntax:
numpy.ones(shape, dtype=None, order='C')

Parameters:
Name    Description                                                                          Required / Optional
shape   Shape of the new array, e.g., (2, 3) or 2.                                           Required
dtype   The desired data-type for the array, e.g., numpy.int8. Default is numpy.float64.     Optional
order   Whether to store multi-dimensional data in row-major (C-style) or column-major
        (Fortran-style) order in memory.                                                     Optional
Return value:

[ndarray] Array of ones with the given shape, dtype, and order.

Example-1: Create arrays of ones with NumPy's ones() function

import numpy as np
np.ones(7)

Output
array([ 1., 1., 1., 1., 1., 1., 1.])
np.ones((2, 1))

Output
array([[ 1.],
[ 1.]])

np.ones(7,)

Output
array([ 1., 1., 1., 1., 1., 1., 1.])

x = (2, 3)
np.ones(x)

Output
array([[ 1., 1., 1.],
[ 1., 1., 1.]])
In the above code:

np.ones(7): This creates a 1-dimensional array of length 7 with all elements set to 1.

np.ones((2, 1)): This creates a 2-dimensional array with 2 rows and 1 column, with all elements set to
1.

np.ones(7,): This is equivalent to np.ones(7) and creates a 1-dimensional array of length 7 with all
elements set to 1.

x = (2, 3) and np.ones(x): This creates a 2-dimensional array with 2 rows and 3 columns, with all
elements set to 1.

NumPy: numpy.zeros() function
numpy.zeros() function
The numpy.zeros() function is used to create an array of specified shape and data type, filled with
zeros. The function is commonly used to initialize an array of a specific size and type, before filling it
with actual values obtained from some calculations or data sources. It is also used as a placeholder
to allocate memory for later use.

Syntax:

numpy.zeros(shape, dtype=float, order='C')

Parameters:

Name    Description                                                                          Required / Optional
shape   Shape of the new array, e.g., (2, 3) or 2.                                           Required
dtype   The desired data-type for the array, e.g., numpy.int8. Default is numpy.float64.     Optional
order   Whether to store multi-dimensional data in row-major (C-style) or column-major
        (Fortran-style) order in memory.                                                     Optional

Return value:

[ndarray] Array of zeros with the given shape, dtype, and order.

Example: Creating a numpy array of zeros with a tuple shape

import numpy as np
a = (3,2)
np.zeros(a)

Output
array([[ 0., 0.],
[ 0., 0.],
[ 0., 0.]])
In the above code a tuple (3, 2) is created and assigned to variable 'a'. The np.zeros()
function is called with 'a' as its argument, which creates a numpy array of zeros with a shape
of (3, 2).

Example: Creating arrays of zeros using NumPy


import numpy as np
np.zeros(6)

Output
array([ 0., 0., 0., 0., 0., 0.])

np.zeros((6,), dtype=int)

Output
array([0, 0, 0, 0, 0, 0])

np.zeros((3, 1))

Output
array([[ 0.],
[ 0.],
[ 0.]])

In the above code the first line, np.zeros(6) creates a one-dimensional array of size 6 with all
elements set to 0, and its data type is float.

In the second line, np.zeros((6,), dtype=int) creates a one-dimensional array of size 6 with all
elements set to 0, and its data type is integer.

In the third line, np.zeros((3, 1)) creates a two-dimensional array of size 3x1 with all elements set to
0, and its data type is float.

NumPy: numpy.full() function
numpy.full() function
The numpy.full() function is used to create a new array of the specified shape and type, filled with a
specified value.

Syntax:

numpy.full(shape, fill_value, dtype=None, order='C')

Parameters:

Name         Description                                                                     Required / Optional
shape        Shape of the new array, e.g., (2, 3) or 2.                                      Required
fill_value   Fill value.                                                                     Required
dtype        The desired data-type for the array. The default, None, means
             np.array(fill_value).dtype.                                                     Optional
order        Whether to store multidimensional data in C- or Fortran-contiguous (row- or
             column-wise) order in memory.                                                   Optional

Return value:

[ndarray] Array of fill_value with the given shape, dtype, and order.

Example: Create arrays filled with a constant value using numpy.full()

import numpy as np
np.full((3, 3), np.inf)

Output
array([[ inf, inf, inf],
[ inf, inf, inf],
[ inf, inf, inf]])

np.full((3, 3), 10.1)

Output
array([[ 10.1, 10.1, 10.1],
[ 10.1, 10.1, 10.1],
[ 10.1, 10.1, 10.1]])

The above code creates arrays filled with a constant value using the numpy.full() function. In the first
example, np.full((3, 3), np.inf) creates a 3x3 numpy array filled with np.inf (infinity). np.inf is a special
floating-point value that represents infinity, and is often used in calculations involving limits and
asymptotes.
In the second example, np.full((3, 3), 10.1) creates a 3x3 numpy array filled with the value 10.1.
Here, the dtype parameter is omitted, so numpy infers the data type of the array from the given
value.
Example: Create an array filled with a single value using np.full()

import numpy as np
np.full((3,3), 55, dtype=int)

Output
array([[55, 55, 55],
[55, 55, 55],
[55, 55, 55]])

In the above code, np.full((3,3), 55, dtype=int) creates a 3x3 numpy array filled with the integer
value 55. The dtype parameter is explicitly set to int, so the resulting array has integer data type.
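The effect of omitting or supplying dtype can be checked by inspecting the dtype attribute of the result. This is a small sketch with illustrative variable names.

```python
import numpy as np

inferred_float = np.full((2, 2), 10.1)            # dtype inferred from the float fill value
inferred_int = np.full((2, 2), 55)                # dtype inferred from the integer fill value
forced_float = np.full((2, 2), 55, dtype=float)   # dtype given explicitly

print(inferred_float.dtype, inferred_int.dtype, forced_float.dtype)
```

The integer case prints a platform-dependent integer type (normally int64 or int32), which is why an explicit dtype is often preferred when the width matters.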
The ndarray object consists of a contiguous one-dimensional segment of computer memory,
combined with an indexing scheme that maps each item to a location in the memory block. The
memory block holds the elements in row-major order (C style) or column-major order (Fortran
or MATLAB style).
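The two memory layouts can be observed from Python. The sketch below builds the same 2x3 array in both orders and reads the elements back in the order in which they sit in memory; the names c_arr and f_arr are illustrative.

```python
import numpy as np

c_arr = np.array([[1, 2, 3], [4, 5, 6]], order='C')   # row-major layout
f_arr = np.array([[1, 2, 3], [4, 5, 6]], order='F')   # column-major layout

# Both arrays index identically
print(c_arr[1, 0], f_arr[1, 0])

# order='K' reads the elements in the order they occur in memory
print(c_arr.ravel(order='K'))   # rows one after another
print(f_arr.ravel(order='K'))   # columns one after another

print(c_arr.flags['C_CONTIGUOUS'], f_arr.flags['F_CONTIGUOUS'])
```

The layout affects only how the block is stored and traversed, not the values returned by indexing.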
NumPy – Data Types
NumPy supports a much greater variety of numerical types than Python does.
The following table shows different scalar data types defined in NumPy.

Data Types   Description

bool_        Boolean (True or False) stored as a byte
int_         Default integer type (same as C long; normally either int64 or int32)
intc         Identical to C int (normally int32 or int64)
intp         Integer used for indexing (same as C ssize_t; normally either int32 or int64)
int8         Byte (-128 to 127)
int16        Integer (-32768 to 32767)
int32        Integer (-2147483648 to 2147483647)
int64        Integer (-9223372036854775808 to 9223372036854775807)
uint8        Unsigned integer (0 to 255)
uint16       Unsigned integer (0 to 65535)
uint32       Unsigned integer (0 to 4294967295)
uint64       Unsigned integer (0 to 18446744073709551615)
float_       Shorthand for float64 (alias removed in NumPy 2.0)
float16      Half precision float: sign bit, 5 bits exponent, 10 bits mantissa
float32      Single precision float: sign bit, 8 bits exponent, 23 bits mantissa
float64      Double precision float: sign bit, 11 bits exponent, 52 bits mantissa
complex_     Shorthand for complex128 (alias removed in NumPy 2.0)
complex64    Complex number, represented by two 32-bit floats (real and imaginary components)
complex128   Complex number, represented by two 64-bit floats (real and imaginary components)
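The ranges and sizes listed in the table do not need to be memorized; they can be queried with np.iinfo, np.finfo and the itemsize attribute. The following is a brief sketch with illustrative variable names.

```python
import numpy as np

# Print the representable range of a few integer types
for t in (np.int8, np.int16, np.uint8):
    info = np.iinfo(t)
    print(t.__name__, info.min, info.max)

int8_limits = (np.iinfo(np.int8).min, np.iinfo(np.int8).max)
float32_bits = np.finfo(np.float32).bits              # total bits in a float32
complex128_bytes = np.dtype(np.complex128).itemsize   # two 64-bit floats

print(int8_limits, float32_bits, complex128_bytes)
```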

Data Type Objects (dtype)
A data type object describes interpretation of fixed block of memory corresponding to an array,
depending on the following aspects:
 Type of data (integer, float or Python object)

 Size of data

 Byte order (little-endian or big-endian)

 In case of structured type, the names of fields, data type of each field and part of the
memory block taken by each field

 If data type is a subarray, its shape and data type


The byte order is decided by prefixing '<' or '>' to the data type. '<' means that the encoding is
little-endian (the least significant byte is stored in the smallest address). '>' means that the
encoding is big-endian (the most significant byte is stored in the smallest address).
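The difference between the two byte orders becomes visible when the raw bytes of an array are inspected. This sketch stores the value 1 as a 32-bit integer in both orders; the variable names are illustrative.

```python
import numpy as np

little = np.array([1], dtype='<i4')   # little-endian 32-bit integer
big = np.array([1], dtype='>i4')      # big-endian 32-bit integer

little_bytes = little.tobytes()       # least significant byte comes first
big_bytes = big.tobytes()             # most significant byte comes first

print(little_bytes)
print(big_bytes)
print(little[0] == big[0])            # the stored value is the same
```

The value read back is identical in both cases; only the arrangement of the four bytes in memory differs.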
A dtype object is constructed using the following syntax:

numpy.dtype(object, align, copy)

The parameters are:

 Object: To be converted to data type object
 Align: If true, adds padding to the field to make it similar to C-struct
 Copy: Makes a new copy of dtype object. If false, the result is a reference to the built-in data
type object
Example 1

# using array-scalar type
import numpy as np
dt = np.dtype(np.int32)
print(dt)

The output is as follows:

int32

Example 2

# int8, int16, int32, int64 can be replaced by equivalent string 'i1', 'i2', 'i4', etc.
import numpy as np
dt = np.dtype('i4')
print(dt)

The output is as follows:

int32

Example 3

# using endian notation
import numpy as np
dt = np.dtype('>i4')
print(dt)

The output is as follows:

>i4

The following examples show the use of structured data type. Here, the field name
and the corresponding scalar data type is to be declared.
Example 4

# first create structured data type
import numpy as np
dt = np.dtype([('age',np.int8)])
print(dt)

The output is as follows:

[('age', 'i1')]

Example 5

# now apply it to ndarray object
import numpy as np
dt = np.dtype([('age',np.int8)])
a = np.array([(10,),(20,),(30,)], dtype=dt)
print(a)

The output is as follows:

[(10,) (20,) (30,)]

Example 6

# field name can be used to access content of age column
import numpy as np
dt = np.dtype([('age',np.int8)])
a = np.array([(10,),(20,),(30,)], dtype=dt)
print(a['age'])

The output is as follows:

[10 20 30]

Example 7
The following example defines a structured data type called student with a string
field 'name', an integer field 'age' and a float field 'marks'. This dtype is then applied to
an ndarray object.

import numpy as np
student=np.dtype([('name','S20'), ('age', 'i1'), ('marks', 'f4')])
print(student)

The output is as follows:

[('name', 'S20'), ('age', 'i1'), ('marks', '<f4')]

Example 8

import numpy as np
student=np.dtype([('name','S20'), ('age', 'i1'), ('marks', 'f4')])
a = np.array([('abc', 21, 50),('xyz', 18, 75)], dtype=student)
print(a)

The output is as follows:

[(b'abc', 21, 50.) (b'xyz', 18, 75.)]

Each built-in data type has a character code that uniquely identifies it.
 'b': boolean
 'i': (signed) integer
 'u': unsigned integer
 'f': floating-point
 'c': complex-floating point
 'm': timedelta
 'M': datetime
 'O': (Python) objects
 'S', 'a': (byte-)string
 'U': Unicode
 'V': raw data (void)
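These codes are exposed through the kind attribute of a dtype object, and a code followed by a byte count (as in 'i2') names a concrete type. The sketch below looks up the kind of a few dtypes; the dictionary name kinds is illustrative.

```python
import numpy as np

# Map a few dtype names to their one-character kind code
kinds = {name: np.dtype(name).kind
         for name in ('bool', 'int16', 'uint8', 'float32', 'complex64', 'S5', 'U5')}
print(kinds)

# A kind code followed by a size in bytes identifies a concrete type
same_type = np.dtype('i2') == np.dtype(np.int16)
print(same_type)
```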

Array Concatenation and Splitting


All of the preceding routines worked on single arrays. It's also possible to combine multiple arrays
into one, and to conversely split a single array into multiple arrays. We'll take a look at those
operations here.

Concatenation of arrays
Concatenation, or joining of two arrays in NumPy, is primarily accomplished using the routines
np.concatenate, np.vstack, and np.hstack. np.concatenate takes a tuple or list of arrays as its first
argument, as we can see here:

import numpy as np
x = np.array([1, 2, 3])
y = np.array([3, 2, 1])
np.concatenate([x, y])
Output
array([1, 2, 3, 3, 2, 1])
You can also concatenate more than two arrays at once:
z = [99, 99, 99]
print(np.concatenate([x, y, z]))
Output
[ 1 2 3 3 2 1 99 99 99]

It can also be used for two-dimensional arrays:
grid = np.array([[1, 2, 3],
[4, 5, 6]])
# concatenate along the first axis
np.concatenate([grid, grid])

Output
array([[1, 2, 3],
[4, 5, 6],
[1, 2, 3],
[4, 5, 6]])

# concatenate along the second axis (zero-indexed)


np.concatenate([grid, grid], axis=1)
Output
array([[1, 2, 3, 1, 2, 3],
[4, 5, 6, 4, 5, 6]])

For working with arrays of mixed dimensions, it can be clearer to use the np.vstack (vertical stack)
and np.hstack (horizontal stack) functions:
numpy.vstack() function is used to stack the sequence of input arrays vertically to make a single
array.
x = np.array([1, 2, 3])
grid = np.array([[9, 8, 7],
[6, 5, 4]])

# vertically stack the arrays


np.vstack([x, grid])
Output
array([[1, 2, 3],
[9, 8, 7],
[6, 5, 4]])

(np.row_stack is an alias for vstack. They are the same function.)

numpy.hstack() function is used to stack the sequence of input arrays horizontally (i.e. column wise)
to make a single array.

# horizontally stack the arrays


y = np.array([[99],
[99]])
np.hstack([grid, y])
Output
array([[ 9, 8, 7, 99],
[ 6, 5, 4, 99]])

numpy.dstack
numpy.dstack(tup)
Stack arrays in sequence depth wise (along third axis).

This is equivalent to concatenation along the third axis after 2-D arrays of shape (M,N) have been
reshaped to (M,N,1) and 1-D arrays of shape (N,) have been reshaped to (1,N,1). Rebuilds arrays
divided by dsplit.

This function makes most sense for arrays with up to 3 dimensions. For instance, for pixel-data with
a height (first axis), width (second axis), and r/g/b channels (third axis). The functions concatenate,
stack and block provide more general stacking and concatenation operations.

Parameters:
tup : sequence of arrays
The arrays must have the same shape along all but the third axis. 1-D or 2-D arrays must
have the same shape.

Returns:
stacked : ndarray
The array formed by stacking the given arrays, will be at least 3-D.

import numpy as np
a = np.array((1,2,3))
b = np.array((2,3,4))
np.dstack((a,b))
Output
array([[[1, 2],
[2, 3],
[3, 4]]])

a = np.array([[1],[2],[3]])
b = np.array([[2],[3],[4]])
np.dstack((a,b))
Output
array([[[1, 2]],
[[2, 3]],
[[3, 4]]])
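The reshaping rule described above for 1-D inputs can be verified by building the same result by hand with np.concatenate. This is a sketch for illustration; the names depth and manual are not from the original notes.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([2, 3, 4])

depth = np.dstack((a, b))      # 1-D inputs of shape (3,) are treated as (1, 3, 1)
print(depth.shape)

# The same result built by hand: reshape, then concatenate along the third axis
manual = np.concatenate((a.reshape(1, 3, 1), b.reshape(1, 3, 1)), axis=2)
print(np.array_equal(depth, manual))

# np.dsplit undoes the stacking
first, second = np.dsplit(depth, 2)
print(first.shape, second.shape)
```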

numpy.column_stack
numpy.column_stack(tup)
Stack 1-D arrays as columns into a 2-D array.

Take a sequence of 1-D arrays and stack them as columns to make a single 2-D array. 2-D arrays are
stacked as-is, just like with hstack. 1-D arrays are turned into 2-D columns first.

Parameters:
tup : sequence of 1-D or 2-D arrays
Arrays to stack. All of them must have the same first dimension.

Returns:
stacked : 2-D array
The array formed by stacking the given arrays.

a = np.array((1,2,3))
print("Array'a':",a)

b = np.array((2,3,4))
print("Array'b':",b)
np.column_stack((a,b))
Output
Array'a': [1 2 3]
Array'b': [2 3 4]

array([[1, 2],
[2, 3],
[3, 4]])

Splitting of arrays
The opposite of concatenation is splitting, which is implemented by the functions np.split, np.hsplit,
and np.vsplit. For each of these, we can pass a list of indices giving the split points:

import numpy as np

x = [1, 2, 3, 99, 99, 3, 2, 1]

x1, x2, x3 = np.split(x, [3, 5])

print(x1, x2, x3)


Output

[1 2 3] [99 99] [3 2 1]
Notice that N split points lead to N + 1 subarrays. The related functions np.hsplit and np.vsplit are
similar:

grid = np.arange(16).reshape((4, 4))

grid
Output
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])

import numpy as np

grid = np.arange(16).reshape((4, 4))

upper, lower = np.vsplit(grid, [2])

print("upper:",upper)

print("lower:",lower)
Output
upper: [[0 1 2 3]
[4 5 6 7]]

lower: [[ 8 9 10 11]
[12 13 14 15]]

left, right = np.hsplit(grid, [2])

print(left)

print(right)
Output
[[ 0 1]
[ 4 5]
[ 8 9]
[12 13]]
[[ 2 3]
[ 6 7]
[10 11]
[14 15]]
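Splitting and stacking are inverse operations, so the pieces produced by np.vsplit and np.hsplit can be reassembled with np.vstack and np.hstack. The sketch below checks this round trip and also shows that np.split accepts an integer, meaning that many equal parts; the variable names are illustrative.

```python
import numpy as np

grid = np.arange(16).reshape((4, 4))

upper, lower = np.vsplit(grid, [2])   # split between row 1 and row 2
left, right = np.hsplit(grid, [2])    # split between column 1 and column 2

rebuilt_rows = np.vstack([upper, lower])
rebuilt_cols = np.hstack([left, right])
print(np.array_equal(rebuilt_rows, grid), np.array_equal(rebuilt_cols, grid))

# An integer instead of a list of indices asks for that many equal parts
quarters = np.split(np.arange(8), 4)
print(quarters)
```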

The Basics of NumPy Arrays

A few categories of basic array manipulations:


 Attributes of arrays: Determining the size, shape, memory consumption, and data types of
arrays
 Indexing of arrays: Getting and setting the value of individual array elements
 Slicing of arrays: Getting and setting smaller subarrays within a larger array
 Reshaping of arrays: Changing the shape of a given array
 Joining and splitting of arrays: Combining multiple arrays into one, and splitting one array
into many

NumPy Array Attributes


Example 1
How to create Single dimensional array using Numpy?

import numpy as np
a=np.array([1,2,3])
print(a)

The output is as follows:

[1 2 3]

Example 2
How to create two dimensional array using Numpy?

# more than one dimension
import numpy as np
a = np.array([[1, 2], [3, 4]])
print(a)

The output is as follows:

[[1 2]
[3 4]]

Example 3

# minimum dimensions
import numpy as np
a=np.array([1, 2, 3,4,5], ndmin=2)
print(a)

The output is as follows:

[[1 2 3 4 5]]

Example 4
How to use the dtype parameter in Numpy?

# dtype parameter
import numpy as np
a = np.array([1, 2, 3], dtype=complex)
print(a)

The output is as follows:

[1.+0.j 2.+0.j 3.+0.j]

import numpy as np
np.random.seed(0) # seed for reproducibility
x1 = np.random.randint(10, size=6) # One-dimensional array
x2 = np.random.randint(10, size=(3, 4)) # Two-dimensional array
x3 = np.random.randint(10, size=(3, 4, 5)) # Three-dimensional array
print("x3 ndim: ", x3.ndim)
print("x3 shape:", x3.shape)

print("x3 size: ", x3.size)
print("dtype:", x3.dtype)
Output
x3 ndim: 3
x3 shape: (3, 4, 5)
x3 size: 60
dtype: int32

NumPy – Array Attributes


ndarray.shape
This array attribute returns a tuple consisting of array dimensions. It can also be
used to resize the array.
Example 1
import numpy as np
a=np.array([[1,2,3],[4,5,6]])
print(a.shape)

The output is as follows:

(2, 3)

Example 2
# this resizes the ndarray
import numpy as np
a=np.array([[1,2,3],[4,5,6]])
a.shape=(3,2)
print(a)

The output is as follows:


[[1 2]
[3 4]
[5 6]]
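Assigning to shape changes the array in place. A common alternative is the reshape method, which returns a reshaped array and leaves the shape of the original untouched. The sketch below uses illustrative names and also shows -1, which asks NumPy to infer one dimension from the total number of elements.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

b = a.reshape(3, 2)     # new shape; a keeps its own shape (2, 3)
c = a.reshape(-1)       # -1 lets NumPy infer the length: 6 elements in one dimension
d = a.reshape(3, -1)    # 3 rows, number of columns inferred as 2

print(a.shape, b.shape, c.shape, d.shape)
```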

Array Indexing: Accessing Single Elements


Numpy Array Indexing
In NumPy, each element in an array is associated with a number. The number is known as an array
index.

Indexing is used to access individual elements. It is also possible to extract entire rows, columns, or
planes from multi-dimensional arrays with numpy indexing. Indexing starts from 0. Let's see an
array example below to understand the concept of indexing:

Element of array 2 3 11 9 6 4 10 12
Index 0 1 2 3 4 5 6 7

Indexing in 1 dimension
import numpy as np
arr1=np.arange(4)
print("Array arr1:",arr1)
print("Element at index 0 of arr1 is:",arr1[0])
print("Element at index 1 of arr1 is:",arr1[1])

Output
Array arr1: [0 1 2 3]
Element at index 0 of arr1 is: 0
Element at index 1 of arr1 is: 1

Explanation: In the above code example, a 1-D array of length 4 is created using the np.arange
function. The elements at index 0 and 1 of the array are printed as output.
numpy.arange() function
Syntax:

numpy.arange([start, ]stop, [step, ]dtype=None)

Parameters:

Name    Description                                                                          Required / Optional
start   Start of interval. The interval includes this value. The default start value is 0.  Optional
stop    End of interval. The interval does not include this value, except in some cases
        where step is not an integer and floating point round-off affects the length
        of out.                                                                              Required
step    Spacing between values. For any output out, this is the distance between two
        adjacent values, out[i+1] - out[i]. The default step size is 1. If step is
        specified as a position argument, start must also be given.                          Optional
dtype   The type of the output array. If dtype is not given, infer the data type from
        the other input arguments.                                                           Optional

Return value:

arange : ndarray - Array of evenly spaced values.


For floating point arguments, the length of the result is ceil((stop - start)/step). Because of floating
point overflow, this rule may result in the last element of out being greater than stop.

Example: arange() function in NumPy.

import numpy as np
np.arange(5)

Output
array([0, 1, 2, 3, 4])

np.arange(5.0)

Output
array([ 0., 1., 2., 3., 4.])

np.arange(5,9)

Output
array([5, 6, 7, 8])

np.arange(5,9,3)

Output
array([5, 8])

In the above example the first line of the code creates an array of integers from 0 to 4 using
np.arange(5). The arange() function takes only one argument, which is the stop value, and defaults
to start value 0 and step size of 1.

The second line of the code creates an array of floating-point numbers from 0.0 to 4.0 using
np.arange(5.0). Here, 5.0 is provided as the stop value, indicating that the range should go up to (but
not include) 5.0. Since floating-point numbers are used, the resulting array contains floating-point
values.
Both arrays have the same length and contain evenly spaced values.

np.arange(5,9) creates an array of integers from 5 to 8 (the stop value 9 is excluded).


In the last example, np.arange(5,9,3) starts at 5 and advances in steps of 3 while the value stays
below 9. The value after 8 would be 11, which is past the stop value, so the resulting array
contains only two values, 5 and 8.
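The length rule ceil((stop - start)/step) can be checked against the last example. The sketch below also shows np.linspace, which takes the number of points instead of a step and is often easier to reason about for non-integer spacing; the variable names are illustrative.

```python
import math
import numpy as np

start, stop, step = 5, 9, 3
values = np.arange(start, stop, step)
expected_len = math.ceil((stop - start) / step)   # ceil(4 / 3) = 2

print(values, len(values), expected_len)

# np.linspace takes the number of points and includes the end value by default
points = np.linspace(0.0, 1.0, 5)
print(points)
```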

Indexing in 2 Dimensions
Let's look at the example below to understand how numpy indexing is done in a 2-D array:

12 (0,0)   11 (0,1)   9 (0,2)
8 (1,0)    0 (1,1)    3 (1,2)
10 (2,0)   2 (2,1)    5 (2,2)

import numpy as np
arr=np.arange(12)
arr1=arr.reshape(3,4)
print("Array arr1:\n",arr1)
print("Element at 0th row and 0th column of arr1 is:",arr1[0,0])
print("Element at 1st row and 2nd column of arr1 is:",arr1[1,2])

Output
Array arr1:
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]
Element at 0th row and 0th column of arr1 is: 0
Element at 1st row and 2nd column of arr1 is: 6

Picking a Row or Column in 2-D NumPy Array

import numpy as np
arr=np.arange(12)
arr1=arr.reshape(3,4)
print("Array arr1:\n",arr1)
print("\n")
print("1st row :\n",arr1[1])

Output
Array arr1:
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]

1st row :
[4 5 6 7]

Explanation: As discussed above, a 2-D array is indexed along two dimensions, rows and columns.

In the above code example, a 2-D array is created using the np.arange function, which creates
the 1-D array, and the reshape method, which transforms the 1-D array into 3 rows and 4
columns.

Here, 1 in a 2-D array stands for the row at index 1 of an array, i.e., [4 5 6 7]. As a result, the row at
index 1 is printed as output.
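Picking a column works in a similar way, except that the row position is filled with a colon, which means "every row". The following sketch uses the same 3x4 array; the names row_1 and col_2 are illustrative.

```python
import numpy as np

arr1 = np.arange(12).reshape(3, 4)

row_1 = arr1[1]        # the row at index 1
col_2 = arr1[:, 2]     # ':' keeps every row, 2 picks the column at index 2

print("Row at index 1   :", row_1)
print("Column at index 2:", col_2)
```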

Indexing in 3 Dimensions
There are three dimensions in a 3-D array, suppose we have three dimensions as (i, j, k), where i
stands for the 1st dimension, j stands for the 2nd dimension and, k stands for the 3rd dimension.

Let's look at the given examples for a better understanding. Remember: Indexing starts from zero.

import numpy as np
arr=np.arange(12)
arr1=arr.reshape(2,2,3)
print("Array arr1:\n",arr1)
print("Element:",arr1[1,0,2])

Output
Array arr1:
[[[ 0 1 2]
[ 3 4 5]]

[[ 6 7 8]
[ 9 10 11]]]
Element: 8

Explanation: We want to access the element of the array at index (1, 0, 2).

Here 1 is the index along the 1st dimension, and the 1st dimension holds two 2-D arrays:
1st array: [[0, 1, 2], [3, 4, 5]] and 2nd array: [[6, 7, 8], [9, 10, 11]]
Indexing starts from 0, so selecting 1 gives the 2nd array: [[6, 7, 8], [9, 10, 11]]
The 2nd digit, 0, is the index along the 2nd dimension, which also contains two arrays: Array 1: [6, 7, 8] and Array 2: [9, 10, 11]
0 is selected, so we have the 1st array: [6, 7, 8]
The 3rd digit, 2, is the index along the 3rd dimension, which has three values: 6, 7, 8
As 2 is selected, 8 is the output.
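The same walk-through can be sketched in code by applying one index at a time. The intermediate names block and row are only illustrative:

```python
import numpy as np

arr1 = np.arange(12).reshape(2, 2, 3)

block = arr1[1]      # 2nd 2-D array: [[6 7 8], [9 10 11]]
row = block[0]       # its 1st row: [6 7 8]
element = row[2]     # 3rd value in that row: 8

print(element, arr1[1, 0, 2])   # both give 8
```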

Modify Array Elements Using Index


We can use indices to change the value of an element in a NumPy array. For example,

import numpy as np

# create a numpy array


numbers = np.array([2, 4, 6, 8, 10])

# change the value of the first element


numbers[0] = 12
print("After modifying first element:",numbers) # prints [12 4 6 8 10]

# change the value of the third element


numbers[2] = 14
print("After modifying third element:",numbers) # prints [12 4 14 8 10]

Output
After modifying first element: [12 4 6 8 10]
After modifying third element: [12 4 14 8 10]

In the above example, we have modified elements of the numbers array using array indexing.

 numbers[0] = 12 - modifies the first element of numbers and sets its value to 12
 numbers[2] = 14 - modifies the third element of numbers and sets its value to 14

NumPy Negative Array Indexing
NumPy allows negative indexing for its arrays. The index -1 refers to the last item, -2 to the second-to-last item, and so on.

import numpy as np

# create a numpy array


numbers = np.array([1, 3, 5, 7, 9])

# access the last element


print(numbers[-1]) # prints 9

# access the second-to-last element


print(numbers[-2]) # prints 7

Output
9
7

Modify Array Elements Using Negative Indexing


Similar to regular indexing, we can also modify array elements using negative indexing. For example.

import numpy as np

# create a numpy array


numbers = np.array([2, 3, 5, 7, 11])
# modify the last element
numbers[-1] = 13
print(numbers)
# modify the second-to-last element
numbers[-2] = 17
print(numbers)

Output
[ 2 3 5 7 13]
[ 2 3 5 17 13]

Array Slicing: Accessing Subarrays


Slicing a NumPy array means accessing a range of elements from the array rather than a single element. It works the same way as taking a sublist, substring, or subtuple from a Python list, string, or tuple, and it uses the same syntax.

Syntax: arr_name[start:stop:step]

Start: Starting index (included)
Stop: Ending index (excluded)
Step: Difference between consecutive selected indexes (1 if omitted).
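The examples that follow only use start and stop, so here is a small sketch of the step part of the syntax on a 1-D array:

```python
import numpy as np

arr = np.arange(10)      # [0 1 2 3 4 5 6 7 8 9]

print(arr[1:8:2])        # start at 1, stop before 8, every 2nd element: [1 3 5 7]
print(arr[::3])          # whole array, every 3rd element: [0 3 6 9]
print(arr[::-1])         # a negative step walks the array backwards
```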

Let's see how slicing is performed in arrays of different dimensions.


Slicing 1D NumPy Arrays
import numpy as np

arr = np.arange(6)

print("array arr:",arr)

print("sliced element of array:",arr[1:5])

Output
array arr: [0 1 2 3 4 5]
sliced element of array: [1 2 3 4]

Explanation: As indexing starts from 0, we want elements that start from the 1st index and stop
before index 5.

Element of array   0 1 2 3 4 5
Index              0 1 2 3 4 5

Slicing a 2D Array
In a 2-D array, we have to specify start:stop 2 times. One for the row and 2nd one for the column.

Code:

import numpy as np
arr=np.arange(12)
arr1=arr.reshape(3,4)
print("Array arr1:\n",arr1)
print("\n")
print("elements of 1st row and 1st column upto last column :\n",arr1[1:,1:4])

Output
Array arr1:
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]

elements of 1st row and 1st column upto last column :


[[ 5 6 7]
[ 9 10 11]]

Explanation: The 1st slice (1:) applies to the rows, so slicing starts from the row at index 1 and goes to the last row, as no ending index is mentioned. The 2nd slice (1:4) applies to the columns, so the columns at index 1 to 3 are selected and printed as output.

Here, rows and columns index are mentioned for better understanding.

Rows ↓, cols → 0 1 2 3
0 0 1 2 3
1 4 5 6 7
2 8 9 10 11
Slicing a 3D Array
We have to use start:stop:step 3 times. The 1st one is for the planes or layers, 2nd one is for the
rows and the last one is for columns.

Let's have a quick look at the example given below:

Code:

import numpy as np

arr = np.array([[[1, 2, 3], [11, 13, 14], [21, 22, 23]],
                [[4, 5, 6], [15, 16, 17], [24, 25, 26]],
                [[7, 8, 9], [18, 19, 20], [27, 28, 29]]])
print(arr)
print("\n")
print("sliced array: \n",arr[:2,1:,:2])
Output
[[[ 1 2 3]
[11 13 14]
[21 22 23]]

[[ 4 5 6]
[15 16 17]
[24 25 26]]

[[ 7 8 9]
[18 19 20]
[27 28 29]]]

sliced array:
[[[11 13]
[21 22]]

[[15 16]
[24 25]]]

Explanation: we want to slice the array as arr[:2,1:,:2]. This selects:

 The 1st set is for the planes or layers. As no start value is mentioned, slicing begins from the start and stops before index 2, so it selects the first two planes.
 The 2nd set is for rows. Rows from index 1 to the last index are sliced, as no stopping value is mentioned (the last 2 rows of each selected plane).
 The 3rd set is for columns. As no starting value is mentioned, slicing begins from the start and stops before index 2 (the first 2 columns).

A 3-D array stores several 2-D arrays in layers, which are called planes.

Note: the end value is always excluded. For example, [1:4] starts at index 1 and goes up to index 3.

There are 3 planes in the above example. Let's take a quick look at the planes in the above array.

1st plane:

1 2 3
11 13 14
21 22 23

2nd plane:

4 5 6
15 16 17
24 25 26

3rd plane:

7 8 9
18 19 20
27 28 29

Full Slices
A full slice (a bare colon, :) selects all the planes, rows, or columns along a dimension. Let's look at the examples below:

Code:

import numpy as np
arr = np.arange(12)
arr1=arr.reshape(3,4)
print(arr1)
print("\n")
print("2-D array sliced from first row to last:\n",arr1[1:3])

Output
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]

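2-D array sliced from first row to last:
[[ 4 5 6 7]
[ 8 9 10 11]]

Explanation: In arr1[1:3], 1 is the starting row index (the 2nd row) and 3 is the stopping row index, which is excluded, so the rows at index 1 and 2 are returned. No column slice is given, so every column of those rows is kept.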

Rows ↓, cols → 0 1 2 3

0 0 1 2 3

1 4 5 6 7

2 8 9 10 11

import numpy as np
arr = np.array([[[1, 2, 3], [11, 13, 14], [21, 22, 23]],
[[4, 5, 6], [15, 16,17], [24, 25, 26]],
[[7, 8, 9], [18, 19, 20], [27, 28, 29]]])
print("3-D array:\n",arr)
print("\n")
print("3-D array sliced from first row to last:\n",arr[1:,:,:])

Output
3-D array:
[[[ 1 2 3]
[11 13 14]
[21 22 23]]

[[ 4 5 6]
[15 16 17]
[24 25 26]]

[[ 7 8 9]
[18 19 20]
[27 28 29]]]
3-D array sliced from first row to last:
[[[ 4 5 6]
[15 16 17]
[24 25 26]]

[[ 7 8 9]
[18 19 20]
[27 28 29]]]

Explanation: [1:,:,:] = This selects from the 2nd plane to the end of the 3-D array because no
stopping value is given. There are three planes in the 3-D array: 1st plane:

1 2 3
11 13 14
21 22 23
2nd plane:

4 5 6
15 16 17
24 25 26
3rd plane:

7 8 9
18 19 20
27 28 29
Negative Slicing and Indexing
Negative indexing counts from the end of the array: the last element has index -1, the second-to-last has index -2, and so on. Slicing can use these negative indexes as well.

import numpy as np
arr = np.array([10,20,30,40,50,60,70,80,90])
print("Element at index 2 or -7 of an array arr:",arr[-7])
print("Sliced elements from index -8 (or 1) up to, but not including, index -3 (or 6) of an array arr:",arr[-8:-3])

Output
Element at index 2 or -7 of an array arr: 30
Sliced elements from index -8 (or 1) up to, but not including, index -3 (or 6) of an array arr: [20 30 40 50 60]
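To make the link between negative and positive positions explicit, the sketch below checks that a negative index i refers to the same element as the positive index n + i, where n is the length of the array. The variable n is introduced here only for illustration:

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90])
n = len(arr)

# A negative index i refers to the same element as the positive index n + i.
for i in (-1, -7, -8):
    print(i, "->", n + i, ":", arr[i], arr[n + i])

# The same mapping applies to slice bounds.
print(arr[-8:-3], arr[n - 8:n - 3])
```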

Slices vs Indexing
Slicing in Python refers to extracting a subset or specific part of a sequence (list, tuple, or string) in a specific range, while indexing refers to accessing a single element of the sequence.

Let's see in the following table how slicing is different from indexing in Python.

Slicing | Indexing
Returns a sub-part of a sequence as a new list/tuple/string. | Returns a single item from the sequence.
Out-of-range indices are handled smoothly when used for slicing. | IndexError is raised if you try to use an index that is too large.
Assigning a single element to a list slice raises TypeError; only iterables are accepted. | A single element or an iterable can be assigned to an index.
The length of the list can be changed, or the list can even be cleared, by slice assignment. | The length of the list cannot be changed by item assignment.

Note: the last two rows describe Python lists. A NumPy array has a fixed size, and assigning a single value to a slice of it sets every selected element to that value.
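A short sketch of the differences listed above. The first two checks use a NumPy array; the last one uses a plain Python list called values (a name chosen here for illustration), because only lists can change length:

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

# Slicing past the end is allowed: the slice is cut off at the last element.
print(arr[3:100])          # [40 50]

# Indexing past the end raises IndexError.
try:
    arr[100]
except IndexError as err:
    print("IndexError:", err)

# Python list: a slice assignment can change the length of the list.
values = [10, 20, 30, 40, 50]
values[1:4] = [0]
print(values)              # [10, 0, 50]
```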

Subarrays as No-Copy Views


Unlike Python list slices, NumPy array slices are returned as views rather than copies of the array
data. Consider our two-dimensional array from before:

import numpy as np
arr=np.arange(12)
arr1=arr.reshape(3,4)
print("Array arr1:\n",arr1)

Output
Array arr1:
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]

Let's extract a 2x2 subarray from this:

arr1_sub = arr1[:2, :2]


print(arr1_sub)

Output
[[0 1]
[4 5]]

Now if we modify this subarray, we'll see that the original array is changed! Observe:

arr1_sub[0, 0] = 99
print(arr1_sub)

Output
[[99 1]
[ 4 5]]

print(arr1)

Output
[[99 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]

Some users may find this surprising, but it can be advantageous: for example, when working with
large datasets, we can access and process pieces of these datasets without the need to copy the
underlying data buffer.
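One way to confirm that a slice is a view, sketched here with np.shares_memory and the base attribute, is to check whether it uses the same data buffer as the array it came from:

```python
import numpy as np

arr1 = np.arange(12).reshape(3, 4)
arr1_sub = arr1[:2, :2]                     # slice: a view, not a copy

print(np.shares_memory(arr1, arr1_sub))     # True: both use the same buffer
print(arr1_sub.base is not None)            # True: a view keeps a reference to the array that owns the data

arr1_sub[0, 0] = 99                         # writing through the view...
print(arr1[0, 0])                           # ...changes the original: 99
```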

Creating Copies of Arrays


Despite the nice features of array views, it is sometimes useful to instead explicitly copy the data
within an array or a subarray. This can be most easily done with the copy method:

arr1_sub_copy = arr1[:2, :2].copy()


print(arr1_sub_copy)

Output
[[99 1]
[ 4 5]]

If we now modify this subarray, the original array is not affected:

arr1_sub_copy[0, 0] = 42
print(arr1_sub_copy)

Output
[[42 1]
[ 4 5]]

print(arr1)

Output
[[99 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]

Reshaping of Arrays
Another useful type of operation is reshaping of arrays, which can be done with the reshape
method. For example, if you want to put the numbers 1 through 9 in a 3x3 grid, you can do the
following:

import numpy as np
grid = np.arange(1, 10).reshape(3, 3)
print(grid)
Output
[[1 2 3]
[4 5 6]
[7 8 9]]
Note that for this to work, the size of the initial array must match the size of the reshaped array, and
in most cases the reshape method will return a no-copy view of the initial array.
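Both points in the note can be checked with a small sketch: reshape fails when the sizes do not match, and in this simple case the reshaped array shares memory with the original. The name flat is used only for illustration:

```python
import numpy as np

flat = np.arange(1, 10)          # 9 elements
grid = flat.reshape(3, 3)        # 3 * 3 = 9, so the sizes match

print(np.shares_memory(flat, grid))   # True here: reshape returned a view

try:
    flat.reshape(2, 5)           # 2 * 5 = 10, which does not match 9
except ValueError as err:
    print("ValueError:", err)
```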

Another common reshaping pattern is the conversion of a one-dimensional array into a two-dimensional row or column matrix. You can do this with the reshape method, or more easily by using np.newaxis within a slice operation:

In[39]: x = np.array([1, 2, 3])

# row vector via reshape
x.reshape((1, 3))

Out[39]: array([[1, 2, 3]])

In[40]: # row vector via newaxis
x[np.newaxis, :]

Out[40]: array([[1, 2, 3]])

In[41]: # column vector via reshape
x.reshape((3, 1))

Out[41]: array([[1],
                [2],
                [3]])

In[42]: # column vector via newaxis
x[:, np.newaxis]

Out[42]: array([[1],
                [2],
                [3]])

Computations on arrays

NumPy ufuncs

What are ufuncs?


ufuncs stands for "Universal Functions" and they are NumPy functions that operate on the ndarray
object.

Why use ufuncs?


ufuncs are used to implement vectorization in NumPy which is way faster than iterating over
elements.
They also provide broadcasting and additional methods like reduce, accumulate etc. that are very
helpful for computation.
ufuncs also take additional arguments, like:
where: boolean array or condition defining where the operations should take place.
dtype: the return type of the elements.
out: output array where the return value should be copied.

What is Vectorization?
Converting iterative statements into a vector based operation is called vectorization.
It is faster as modern CPUs are optimized for such operations.
Add the Elements of Two Lists
list 1: [1, 2, 3, 4]

list 2: [4, 5, 6, 7]

One way of doing it is to iterate over both of the lists and add each pair of elements.

Example
Without ufunc, we can use Python's built-in zip() function:

x = [1, 2, 3, 4]
y = [4, 5, 6, 7]
z = []
for i, j in zip(x, y):
    z.append(i + j)
print(z)
[5, 7, 9, 11]
Definition and Usage
The zip() function returns a zip object, which is an iterator of tuples where the first items of the passed iterators are paired together, then the second items are paired together, and so on.

If the passed iterators have different lengths, the iterator with the least items decides the length of
the new iterator.

Syntax
zip(iterator1, iterator2, iterator3 ...)
Parameter Values
Parameter | Description
iterator1, iterator2, iterator3 ... | Iterator objects that will be joined together
Example
Join two tuples together:
a = ("John", "Charles", "Mike")
b = ("Jenny", "Christy", "Monica")

x = zip(a, b)
print(tuple(x))
(('John', 'Jenny'), ('Charles', 'Christy'), ('Mike', 'Monica'))
NumPy has a ufunc for this, called add(x, y) that will produce the same result.
Example
With ufunc, we can use the add() function:

import numpy as np

x = [1, 2, 3, 4]
y = [4, 5, 6, 7]
z = np.add(x, y)

print(z)
[ 5 7 9 11]
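To see the speed claim made for vectorization, the rough sketch below times the zip() loop against np.add on larger inputs. The helper names loop_add and ufunc_add and the array size are arbitrary choices, and the timings depend on the machine, so no expected numbers are shown:

```python
import timeit
import numpy as np

x = list(range(100_000))
y = list(range(100_000))
ax, ay = np.array(x), np.array(y)

def loop_add():
    # element-by-element addition in pure Python
    return [i + j for i, j in zip(x, y)]

def ufunc_add():
    # the same addition done by the add ufunc
    return np.add(ax, ay)

print("loop :", timeit.timeit(loop_add, number=20))
print("ufunc:", timeit.timeit(ufunc_add, number=20))
```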
How To Create Your Own ufunc
To create your own ufunc, you have to define a function, like you do with normal functions in
Python, then you add it to your NumPy ufunc library with the frompyfunc() method.

The frompyfunc() method takes the following arguments:

1. function - the name of the function.


2. inputs - the number of input arguments (arrays).
3. outputs - the number of output arrays.

Example
Create your own ufunc for addition:

import numpy as np
def myadd(x, y):
    return x+y
myadd = np.frompyfunc(myadd, 2, 1)
print(myadd([1, 2, 3, 4], [5, 6, 7, 8]))
[6 8 10 12]
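One detail worth knowing about ufuncs built with frompyfunc() is that they return arrays of Python objects rather than numeric arrays. The sketch below, which uses the illustrative name myadd_ufunc, shows the dtype and one way to convert it:

```python
import numpy as np

def myadd(x, y):
    return x + y

myadd_ufunc = np.frompyfunc(myadd, 2, 1)
result = myadd_ufunc([1, 2, 3, 4], [5, 6, 7, 8])

print(type(myadd_ufunc))          # <class 'numpy.ufunc'>
print(result.dtype)               # object
print(result.astype(np.int64))    # convert to a numeric dtype when needed
```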

Check if a Function is a ufunc
Check the type of a function to check if it is a ufunc or not.

A ufunc should return <class 'numpy.ufunc'>.

Example
Check if a function is a ufunc:

import numpy as np
print(type(np.add))
<class 'numpy.ufunc'>
If it is not a ufunc, it will return another type, like this built-in NumPy function for joining two or
more arrays:
Example
Check the type of another function: concatenate():

import numpy as np
print(type(np.concatenate))
<class 'builtin_function_or_method'>
The exact type printed here depends on the NumPy version; what matters is that it is not numpy.ufunc.
If the function is not recognized at all, an error is raised:
Example
Check the type of something that does not exist. This will produce an error:

import numpy as np
print(type(np.blahblah))
Traceback (most recent call last):
File "./prog.py", line 3, in <module>
AttributeError: module 'numpy' has no attribute 'blahblah'

Simple Arithmetic
You could use arithmetic operators + - * / directly between NumPy arrays, but this section discusses
an extension of the same where we have functions that can take any array-like objects e.g. lists,
tuples etc. and perform arithmetic conditionally.

Arithmetic condition: It means that we can define conditions where the arithmetic operation should
happen.

All of the discussed arithmetic functions take a where parameter in which we can specify that
condition.
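As a minimal sketch of conditional arithmetic, the example below adds two arrays only where the first one holds an even number. The names condition and result are illustrative; out is pre-filled with a copy of arr1 so that the positions where the condition is False keep a defined value:

```python
import numpy as np

arr1 = np.array([10, 11, 12, 13, 14, 15])
arr2 = np.array([20, 21, 22, 23, 24, 25])

condition = arr1 % 2 == 0            # add only where arr1 is even
result = arr1.copy()                 # positions where condition is False keep these values
np.add(arr1, arr2, out=result, where=condition)

print(result)                        # [30 11 34 13 38 15]
```

Without an out array, the positions where the condition is False are left uninitialized, which is why the sketch supplies one.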

Addition
The add() function sums the content of two arrays, and returns the results in a new array.

Example
Add the values in arr1 to the values in arr2:

import numpy as np

arr1 = np.array([10, 11, 12, 13, 14, 15])

arr2 = np.array([20, 21, 22, 23, 24, 25])
newarr = np.add(arr1, arr2)
print(newarr)
[30 32 34 36 38 40]
Subtraction
The subtract() function subtracts the values of one array from the values of another array, and returns the results in a new array.

Example
Subtract the values in arr2 from the values in arr1:
import numpy as np
arr1 = np.array([10, 20, 30, 40, 50, 60])
arr2 = np.array([20, 21, 22, 23, 24, 25])
newarr = np.subtract(arr1, arr2)
print(newarr)
[-10 -1 8 17 26 35]
Multiplication
The multiply() function multiplies the values from one array with the values from another array, and returns the results in a new array.

Example Multiply the values in arr1 with the values in arr2:


import numpy as np
arr1 = np.array([10, 20, 30, 40, 50, 60])
arr2 = np.array([20, 21, 22, 23, 24, 25])
newarr = np.multiply(arr1, arr2)
print(newarr)
[ 200 420 660 920 1200 1500]
The example above will return [200 420 660 920 1200 1500] which is the result of 10*20, 20*21,
30*22 etc.

Division
The divide() function divides the values of one array by the values of another array, and returns the results in a new array.

Example
Divide the values in arr1 with the values in arr2:

import numpy as np
arr1 = np.array([10, 20, 30, 40, 50, 60])
arr2 = np.array([3, 5, 10, 8, 2, 33])
newarr = np.divide(arr1, arr2)
print(newarr)
[ 3.33333333 4. 3. 5. 25. 1.81818182]
The example above will return [3.33333333 4. 3. 5. 25. 1.81818182] which is the result of 10/3, 20/5,
30/10 etc.

Power
The power() function raises the values from the first array to the power of the values of the second array, and returns the results in a new array.

Example
Raise the values in arr1 to the power of values in arr2:

import numpy as np
arr1 = np.array([10, 20, 30, 40, 50, 60])
arr2 = np.array([3, 5, 6, 8, 2, 33])
newarr = np.power(arr1, arr2)
print(newarr)
[ 1000 3200000 729000000 6553600000000 2500 0]
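The example above will return [1000 3200000 729000000 6553600000000 2500 0] which is the result of 10*10*10, 20*20*20*20*20, 30*30*30*30*30*30 etc. The final 0 is not a true result: 60 raised to the power 33 is too large for a 64-bit integer, so the value overflows.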

Remainder
Both the mod() and the remainder() functions return the remainder of the values in the first array
corresponding to the values in the second array, and return the results in a new array.

Example
Return the remainders:

import numpy as np
arr1 = np.array([10, 20, 30, 40, 50, 60])
arr2 = np.array([3, 7, 9, 8, 2, 33])
newarr = np.mod(arr1, arr2)
print(newarr)
[ 1 6 3 0 0 27]
The example above will return [1 6 3 0 0 27] which is the remainders when you divide 10 with 3
(10%3), 20 with 7 (20%7) 30 with 9 (30%9) etc.

You get the same result when using the remainder () function:

Example
Return the remainders:

import numpy as np
arr1 = np.array([10, 20, 30, 40, 50, 60])
arr2 = np.array([3, 7, 9, 8, 2, 33])
newarr = np.remainder(arr1, arr2)
print(newarr)
[ 1 6 3 0 0 27]
Quotient and Mod
The divmod() function returns both the quotient and the mod. The return value is two arrays: the first array contains the quotient and the second array contains the mod.

Example
Return the quotient and mod:

import numpy as np
arr1 = np.array([10, 20, 30, 40, 50, 60])
arr2 = np.array([3, 7, 9, 8, 2, 33])
newarr = np.divmod(arr1, arr2)

print(newarr)
(array([ 3, 2, 3, 5, 25, 1]), array([ 1, 6, 3, 0, 0, 27]))
The example above will return:
(array([3, 2, 3, 5, 25, 1]), array([1, 6, 3, 0, 0, 27]))
The first array represents the quotients (the integer value when you divide 10 with 3, 20 with 7, 30 with 9 etc.).
The second array represents the remainders of the same divisions.
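The relationship between the two returned arrays can be checked with a short sketch: each value in arr1 equals the divisor times the quotient plus the remainder. The names quotient and remainder are illustrative:

```python
import numpy as np

arr1 = np.array([10, 20, 30, 40, 50, 60])
arr2 = np.array([3, 7, 9, 8, 2, 33])

quotient, remainder = np.divmod(arr1, arr2)

# Each dividend equals divisor * quotient + remainder.
print(np.array_equal(arr2 * quotient + remainder, arr1))    # True

# divmod matches floor division and mod taken separately.
print(np.array_equal(quotient, arr1 // arr2), np.array_equal(remainder, arr1 % arr2))
```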

Absolute Values
Both the absolute() and the abs() functions do the same absolute operation element-wise (np.abs is an alias for np.absolute), but we should use absolute() to avoid confusion with Python's built-in abs() function.

import numpy as np
arr = np.array([-1, -2, 1, 2, 3, -4])
newarr = np.absolute(arr)
print(newarr)
[1 2 1 2 3 4]
The example above will return [1 2 1 2 3 4].

Rounding Decimals
There are primarily five ways of rounding off decimals in NumPy:
 truncation
 fix
 rounding
 floor
 ceil

Truncation
Remove the decimals, and return the float number closest to zero. Use the trunc() and fix()
functions.
Example
Truncate elements of following array:
import numpy as np
arr = np.trunc([-3.1666, 3.6667])
print(arr)

Output
[-3. 3.]
Example
Same example, using fix():
import numpy as np
arr = np.fix([-3.1666, 3.6667])
print(arr)

Output
[-3. 3.]

Rounding
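The around() function rounds to the given number of decimals: the last kept digit is incremented by 1 if the dropped part is more than half a unit, and left unchanged if it is less. Values exactly halfway between two candidates are rounded to the nearest even digit.
E.g. rounded off to 1 decimal place, 3.16666 is 3.2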
Example
Round off 3.1666 to 2 decimal places:
import numpy as np
arr = np.around(3.1666, 2)

print(arr)

Output 3.17
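One behaviour of around() that often surprises is that values exactly halfway between two integers go to the nearest even integer rather than always up. A small sketch, which also shows a negative decimals argument rounding to hundreds:

```python
import numpy as np

halves = np.array([0.5, 1.5, 2.5, 3.5])
print(np.around(halves))          # [0. 2. 2. 4.]

print(np.around(3.1666, 2))       # 3.17
print(np.around(1234.5678, -2))   # 1200.0
```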

Floor
The floor() function rounds off decimal to nearest lower integer.
E.g. floor of 3.166 is 3.
Example
Floor the elements of following array:
import numpy as np
arr = np.floor([-3.1666, 3.6667])
print(arr)

Output [-4. 3.]

Ceil
The ceil() function rounds off decimal to nearest upper integer.
E.g. ceil of 3.166 is 4.
Example
Ceil the elements of following array:

import numpy as np
arr = np.ceil([-3.1666, 3.6667])
print(arr)

Output [-3. 4.]


Trigonometric Functions
NumPy provides the ufuncs sin(), cos() and tan() that take values in radians and produce the
corresponding sin, cos and tan values.

Example:Find sine value of PI/2:


import numpy as np
x = np.sin(np.pi/2)
print(x)
1.0

Example:Find sine values for all of the values in arr:


import numpy as np
arr = np.array([np.pi/2, np.pi/3, np.pi/4, np.pi/5])
x = np.sin(arr)
print(x)

[1. 0.8660254 0.70710678 0.58778525]

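The cos() and tan() functions work the same way. Because np.pi is only a floating-point approximation of π, cos(π/2) comes out as a very small number rather than exactly 0, and tan(π/2) as a very large number rather than being undefined:

import numpy as np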
x = np.cos(np.pi/2)
print(x)
6.123233995736766e-17

import numpy as np
x = np.tan(np.pi/2)
print(x)
1.633123935319537e+16

Convert Degrees Into Radians


By default all of the trigonometric functions take radians as parameters but we can convert radians
to degrees and vice versa as well in NumPy.
Note: radians values are pi/180 * degree_values.

1 Radian is about 57.2958 degrees.
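The conversion formula in the note can be checked against NumPy's own function. In this sketch, manual and builtin are illustrative names:

```python
import numpy as np

degrees = np.array([90, 180, 270, 360])

manual = degrees * np.pi / 180      # radians = pi/180 * degrees
builtin = np.deg2rad(degrees)

print(np.allclose(manual, builtin))   # True
print(np.rad2deg(1.0))                # one radian in degrees, about 57.2958
```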

Example:Convert all of the values in following array arr to radians:


import numpy as np
arr = np.array([90, 180, 270, 360])
x = np.deg2rad(arr)
print(x)
[1.57079633 3.14159265 4.71238898 6.28318531]
Radians to Degrees
Example:Convert all of the values in following array arr to degrees:
import numpy as np
arr = np.array([np.pi/2, np.pi, 1.5*np.pi, 2*np.pi])
x = np.rad2deg(arr)
print(x)
[ 90. 180. 270. 360.]

Finding Angles
Finding angles from values of sine, cos, tan. E.g. sin, cos and tan inverse (arcsin, arccos, arctan).

NumPy provides ufuncs arcsin(), arccos() and arctan() that produce radian values for corresponding sin, cos and tan values given.

Example: Find the angle of 1.0:

import numpy as np
x = np.arcsin(1.0)
print(x)

1.5707963267948966

Angles of Each Value in Arrays
Example: Find the angle for all of the sine values in the array.
import numpy as np
arr = np.array([1, -1, 0.1])
x = np.arcsin(arr)
print(x)
[ 1.57079633 -1.57079633 0.10016742]

numpy.exp() in Python
numpy.exp(array, out = None, where = True, casting = 'same_kind', order = 'K', dtype = None) :
This mathematical function helps the user to calculate the exponential (e raised to the power x) of all the elements in the input array.
Parameters :

array : [array_like] Input array or object whose elements' exponential is to be calculated.
out : [ndarray, optional] Output array with the same dimensions as the input array, in which the result is placed.
where : [array_like, optional] A True value means to calculate the universal function (ufunc) at that position; a False value means to leave the value in the output alone.
**kwargs : Allows you to pass a variable number of keyword arguments to the function.

Return :

An array with the exponential of all elements of the input array.


# Python program explaining exp() function

import numpy as np
in_array = [1, 3, 5]
print ("Input array : ", in_array)
out_array = np.exp(in_array)
print ("Output array : ", out_array)
Input array : [1, 3, 5]
Output array : [ 2.71828183 20.08553692 148.4131591 ]
NumPy Hyperbolic Functions
NumPy has functions for the hyperbolic functions, which are the analogs of the trigonometric functions: hyperbolic sine, cosine, and tangent, and their inverses.

1. np.sinh()- This function returns the hyperbolic sine of the array elements.

import numpy as np
arr = np.array([30,60,90])
#hyperbolic sine function
print(np.sinh(arr * np.pi / 180))
Output

[0.54785347 1.24936705 2.3012989 ]
2. np.cosh()- This function returns the hyperbolic cosine of the array elements.

import numpy as np
arr = np.array([30,60,90])
#hyperbolic cosine function
print(np.cosh(arr * np.pi / 180))
Output

[1.14023832 1.60028686 2.50917848]


3. np.tanh()- This function returns the hyperbolic tan of the array elements.

import numpy as np
arr = np.array([30,60,90])
#hyperbolic tangent function
print(np.tanh(arr * np.pi / 180))
Output

[0.48047278 0.78071444 0.91715234]


4. np.arcsinh()- This function returns the hyperbolic inverse sine of the array elements.

import numpy as np
arr = np.array([150,60,90])
#hyperbolic inverse sine function
print(np.arcsinh(arr * np.pi / 180))
Output

[1.69018317 0.91435666 1.23340312]


5. np.arccosh()- This function returns the hyperbolic inverse cosine of the array elements.

import numpy as np
arr = np.array([150,60,90])
#hyperbolic inverse cosine function
print(np.arccosh(arr * np.pi / 180))
Output
[1.61690509 0.30604211 1.02322748]
6. np.arctanh()- This function returns the hyperbolic inverse tan of the array elements.

import numpy as np
arr = np.array([1,2,3])
#hyperbolic inverse tangent function
print(np.arctanh(arr * np.pi / 180))
Output

[0.01745507 0.03492077 0.05240781]


---------------------------------------------------------------------------------------------------------------------

NumPy Rounding Functions


There are various rounding functions that are used to round decimal float numbers to a particular
precision of decimal numbers. Let’s discuss them below:

1. np.around()- This function is used to round off a decimal number to desired number of positions.
The function takes two parameters: the input number and the precision of decimal places.

import numpy as np
arr = np.array([20.8999,67.89899,54.63409])
print(np.around(arr,1))
Output

[20.9 67.9 54.6]


2. np.floor()- This function returns the floor value of the input decimal value. Floor value is the largest integer number less than or equal to the input value.

import numpy as np
arr = np.array([20.8,67.99,54.09])
print(np.floor(arr))
Output
[20. 67. 54.]
3. np.ceil()- This function returns the ceiling value of the input decimal value. Ceiling value is the smallest integer number greater than or equal to the input value.

import numpy as np
arr = np.array([20.8,67.99,54.09])
print(np.ceil(arr))
Output

[21. 68. 55.]


NumPy Exponential and Logarithmic Functions
1. np.exp()- This function calculates the exponential of the input array elements.

import numpy as np
arr = np.array([1,8,4])
#exponential function
print(np.exp(arr))
Output

[2.71828183e+00 2.98095799e+03 5.45981500e+01]


2. np.log()- This function calculates the natural log of the input array elements. The natural logarithm is the inverse of the exponential function.

import numpy as np
arr = np.array([6,8,4])
#logarithmic function
print(np.log(arr))
Output

[1.79175947 2.07944154 1.38629436]


NumPy Complex Functions
1. np.isreal()- This function returns the output of a test for real numbers.

import numpy as np
arr = np.array([1+2j])

#real test function
print(np.isreal(arr))
Output

[False]
2. np.conj()- This function is useful for calculation of conjugate of complex numbers.

import numpy as np
arr = np.array([1+2j])
#conjugate function
print(np.conj(arr))
Output

[1.-2.j]
NumPy Logs
NumPy provides functions to perform log at the base 2, e and 10.
We will also explore how we can take log for any base by creating a custom ufunc.
The log functions place -inf in the result where the input is 0 and nan where the input is negative, since the log can not be computed there.
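A small sketch of what the log functions return for inputs that have no real logarithm: np.log gives -inf for 0 and nan for a negative number. np.errstate is used here only to silence the warnings NumPy would otherwise print:

```python
import numpy as np

arr = np.array([1.0, 0.0, -1.0])

# Silence the RuntimeWarnings NumPy would otherwise emit for 0 and negative inputs.
with np.errstate(divide="ignore", invalid="ignore"):
    result = np.log(arr)

print(result)                  # 0 for log(1), -inf for log(0), nan for log(-1)
print(np.isneginf(result))     # marks the -inf entry
print(np.isnan(result))        # marks the nan entry
```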

Log at Base 2
Use the log2() function to perform log at the base 2.

Example:Find log at base 2 of all elements of following array:


import numpy as np
arr = np.arange(1, 10)
print(np.log2(arr))
[0. 1. 1.5849625 2. 2.32192809 2.5849625
2.80735492 3. 3.169925 ]
Note: The arange(1, 10) function returns an array with integers starting from 1 (included) to 10 (not
included).

Log at Base 10
Use the log10() function to perform log at the base 10.

Example:Find log at base 10 of all elements of following array:


import numpy as np
arr = np.arange(1, 10)
print(np.log10(arr))
[0. 0.30103 0.47712125 0.60205999 0.69897 0.77815125
0.84509804 0.90308999 0.95424251]
Natural Log, or Log at Base e
Use the log() function to perform log at the base e.

Example:Find log at base e of all elements of following array:


import numpy as np
arr = np.arange(1, 10)
print(np.log(arr))
[0. 0.69314718 1.09861229 1.38629436 1.60943791 1.79175947
1.94591015 2.07944154 2.19722458]
Log at Any Base
NumPy does not provide any function to take log at any base, so we can use the frompyfunc()
function along with inbuilt function math.log() with two input parameters and one output
parameter:

Example
from math import log
import numpy as np
nplog = np.frompyfunc(log, 2, 1)
print(nplog(100, 15))
1.7005483074552052
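An alternative sketch, not taken from the example above, uses the change-of-base rule log_b(x) = ln(x) / ln(b). The helper log_base is a hypothetical name; unlike the frompyfunc() version, it works on whole arrays and returns a numeric array:

```python
import numpy as np

def log_base(x, base):
    """Log of x at an arbitrary base via the change-of-base rule (hypothetical helper)."""
    return np.log(x) / np.log(base)

print(log_base(100, 15))                      # about 1.70055, matching the result above
print(log_base(np.array([1, 8, 64]), 2))      # close to [0. 3. 6.]
```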

Specialized ufuncs
NumPy has many more ufuncs available, including hyperbolic trig functions, bitwise arithmetic, comparison operators, conversions from radians to degrees, rounding and remainders, and much more. A look through the NumPy documentation reveals a lot of interesting functionality.

Advanced Ufunc Features
Many NumPy users make use of ufuncs without ever learning their full set of features. We’ll outline a few specialized features of ufuncs here.

Specifying output
For large calculations, it is sometimes useful to be able to specify the array where the result of the calculation will be stored. Rather than creating a temporary array, you can use this to write computation results directly to the memory location where you’d like them to be. For all ufuncs, you can do this using the out argument of the function:

In[24]: x = np.arange(5)
y = np.empty(5)
np.multiply(x, 10, out=y)
print(y)

[ 0. 10. 20. 30. 40.]


This can even be used with array views. For example, we can write the results of a computation to
every other element of a specified array:

In[25]: y = np.zeros(10)
np.power(2, x, out=y[::2])
print(y)

[ 1. 0. 2. 0. 4. 0. 8. 0. 16. 0.]

If we had instead written y[::2] = 2 ** x, this would have resulted in the creation of a temporary
array to hold the results of 2 ** x, followed by a second operation copying those values into the y
array. This doesn’t make much of a difference for such a small computation, but for very large arrays
the memory savings from careful use of the out argument can be significant.
Aggregate and Statistical Functions in Numpy – Python
First, we have to import NumPy with import numpy as np. To make a NumPy array, you can use the np.array() function. The aggregate and statistical functions are given below:

1. np.sum(m): Used to find out the sum of the given array.

2. np.prod(m): Used to find out the product(multiplication) of the values of m.
3. np.mean(m): It returns the mean of the input array m.
4. np.std(m): It returns the standard deviation of the given input array m.
5. np.var(m): Used to find out the variance of the data given in the form of array m.
6. np.min(m): It returns the minimum value among the elements of the given array m.
7. np.max(m): It returns the maximum value among the elements of the given array m.
8. np.argmin(m): It returns the index of the minimum value among the elements of the array m.
9. np.argmax(m): It returns the index of the maximum value among the elements of the array m.
10. np.median(m): It returns the median of the elements of the array m.

The code using all of the above functions is given below:

import numpy as np
a=np.array([1,2,3,4,5])
print("a :",a)
total=np.sum(a)  # named total so Python's built-in sum() is not shadowed
print("sum :",total)
product=np.prod(a)
print("product :",product)
mean=np.mean(a)
print("mean :",mean)
standard_deviation=np.std(a)
print("standard_deviation :",standard_deviation)
variance=np.var(a)
print("variance :",variance)
minimum=np.min(a)
print("minimum value :",minimum)
maximum=np.max(a)
print("maximum value :",maximum)
minimum_index=np.argmin(a)
print("minimum index :",minimum_index)
maximum_index=np.argmax(a)
print("maximum-index :",maximum_index)
median=np.median(a)
print("median :",median)

Output is:
a : [1 2 3 4 5]
sum : 15
product : 120
mean : 3.0
standard_deviation : 1.4142135623730951
variance : 2.0
minimum value : 1
maximum value : 5
minimum index : 0
maximum-index : 4
median : 3.0

Python numpy sum
Python numpy sum function calculates the sum of values in an array.

arr1.sum()
arr2.sum()
arr3.sum()

The sum function accepts an optional argument called axis, which restricts the sum to a given axis.

For example, axis = 0 returns the sum of each column in an Numpy array.

arr2.sum(axis = 0)
arr3.sum(axis = 0)
axis = 1 returns the sum of each row in an array

arr2.sum(axis = 1)
arr3.sum(axis = 1)

You don’t have to write the axis keyword inside the parentheses: arr2.sum(axis = 1) is the same as arr2.sum(1).

arr2.sum(0)
arr2.sum(1)
arr3.sum(0)
arr3.sum(1)
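
The arrays arr1, arr2 and arr3 are not defined in these notes, so the following minimal sketch assumes a small two-dimensional array named arr2 purely for illustration. It shows the total sum, the column sums and the row sums.

```python
import numpy as np

# Hypothetical 2 x 3 array standing in for arr2 (not the original data)
arr2 = np.array([[1, 2, 3],
                 [4, 5, 6]])

total = arr2.sum()               # sum of every element
column_sums = arr2.sum(axis=0)   # collapse the rows: one sum per column
row_sums = arr2.sum(axis=1)      # collapse the columns: one sum per row

print(total)         # 21
print(column_sums)   # [5 7 9]
print(row_sums)      # [ 6 15]

# The positional form gives the same result as the keyword form
assert (arr2.sum(1) == row_sums).all()
```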

Python numpy average


Python numpy average function returns the average of a given array.

np.average(arr1)
np.average(arr2)
np.average(arr3)

Average along each axis (axis = 0 gives one value per column, axis = 1 one value per row):

np.average(arr2, axis = 0)
np.average(arr2, axis = 1)

The average can also be calculated without writing the axis keyword:

np.average(arr3, 0)
np.average(arr3, 1)
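
As a hedged illustration, the sketch below applies np.average to an assumed 2 x 3 array called scores (the name and values are made up). It also shows the optional weights argument, which is what distinguishes np.average from np.mean.

```python
import numpy as np

# Hypothetical data, not taken from the original notes
scores = np.array([[10, 20, 30],
                   [40, 50, 60]])

overall = np.average(scores)              # average of all six values
per_column = np.average(scores, axis=0)   # one average per column
per_row = np.average(scores, axis=1)      # one average per row

print(overall)      # 35.0
print(per_column)   # [25. 35. 45.]
print(per_row)      # [20. 50.]

# Weighted average of the first row: (10*1 + 20*1 + 30*2) / (1 + 1 + 2)
weighted = np.average(scores[0], weights=[1, 1, 2])
print(weighted)     # 22.5
```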

Python numpy prod


The Python numpy prod function finds the product of all the elements in a given array. It returns 1 for an empty array.

np.prod([])
np.prod(arr1)
np.prod(arr2) # multiplying any number by zero gives zero

This time we are using a two-dimensional array.

x = np.array([[1, 2, 3], [4, 5, 6]])


np.prod(x)

y = np.random.randint(1, 10, size = (5, 5))


np.prod(y)  # may overflow silently: 25 factors of up to 9 can exceed the 64-bit integer range

Next, we would like to calculate the product along axis 0 (one product per column) and along axis 1 (one product per row) separately.

np.prod(x, axis = 0)
np.prod(x, axis = 1)
np.prod(y, axis = 0)
np.prod(y, axis = 1)

Find the numpy array product without using the axis keyword:

np.prod(x, 1)
np.prod(y, 1)
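
The products for the small array x defined above can be checked by hand. The sketch below repeats that array and prints each result; it is only a worked check, not additional API.

```python
import numpy as np

x = np.array([[1, 2, 3],
              [4, 5, 6]])

empty_product = np.prod([])           # the product of no elements is 1
total_product = np.prod(x)            # 1*2*3*4*5*6
column_products = np.prod(x, axis=0)  # [1*4, 2*5, 3*6]
row_products = np.prod(x, axis=1)     # [1*2*3, 4*5*6]

print(empty_product)     # 1.0
print(total_product)     # 720
print(column_products)   # [ 4 10 18]
print(row_products)      # [  6 120]
```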

Python numpy min
The Python numpy min function returns the minimum value in an array or along a given axis.

arr1.min()
arr2.min()
arr3.min()

Next, we find the minimum value along axis 0 (per column) and axis 1 (per row):

arr2.min(axis = 0)
arr2.min(axis = 1)
arr3.min(0)
arr3.min(1)
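
A minimal sketch of min with and without the axis argument, using an assumed 2 x 3 array m (the name and values are illustrative, since arr2 and arr3 are not defined in these notes):

```python
import numpy as np

# Hypothetical array used only for this illustration
m = np.array([[7, 2, 9],
              [4, 8, 1]])

smallest = m.min()           # minimum over the whole array
column_min = m.min(axis=0)   # minimum of each column
row_min = m.min(axis=1)      # minimum of each row

print(smallest)     # 1
print(column_min)   # [4 2 1]
print(row_min)      # [2 1]
```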

Python Array minimum
Unlike the min function, the numpy minimum function accepts two arrays. It compares them element by element and returns an array containing the smaller value at each position.

This time we apply the minimum function to two randomly generated 5 x 5 matrices.

import numpy as np
x = np.random.randint(1, 10, size = (5, 5))
print(x)
print()

y = np.random.randint(1, 10, size = (5, 5))


print(y)

print('\n-----Minimum Array----')

print(np.minimum(x, y))
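
Because the matrices above are random, the output changes on every run. The following sketch uses two small fixed arrays (values chosen arbitrarily) so that the element-wise behaviour of np.minimum can be verified by eye and contrasted with the min function.

```python
import numpy as np

# Fixed, hand-picked values so the result is reproducible
x = np.array([[1, 8],
              [5, 3]])
y = np.array([[4, 2],
              [5, 9]])

# Element-wise comparison: the smaller of x[i, j] and y[i, j] at each position
elementwise = np.minimum(x, y)
print(elementwise)
# [[1 2]
#  [5 3]]

# By contrast, min reduces a single array to its smallest value
smallest_in_x = x.min()
print(smallest_in_x)   # 1
```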

Python numpy max


The Python numpy max function returns the maximum value in a given array or along a given axis.

It is used in the same way as the min function.
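
For completeness, here is a short sketch mirroring the min examples. The array m and its values are assumptions made for illustration; np.maximum is the element-wise counterpart of np.minimum.

```python
import numpy as np

# Hypothetical array used only for this illustration
m = np.array([[7, 2, 9],
              [4, 8, 1]])

largest = m.max()            # maximum over the whole array
column_max = m.max(axis=0)   # maximum of each column
row_max = m.max(axis=1)      # maximum of each row

print(largest)      # 9
print(column_max)   # [7 8 9]
print(row_max)      # [9 8]

# Element-wise maximum of the two rows
pairwise = np.maximum(m[0], m[1])
print(pairwise)     # [7 8 9]
```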

Broadcasting is a mechanism that permits NumPy to operate with arrays of different shapes when
performing arithmetic operations:

# Rule 1: Two dimensions are compatible if they are equal

# Create a two-dimensional array
A = np.ones((6, 8))
# Shape of A
print(A.shape)

# Create another array of the same shape
B = np.random.random((6, 8))
# Shape of B
print(B.shape)
# Sum of A and B; both matrices have the same shape
print(A + B)

Secondly, two dimensions are also compatible when one of them is 1. A missing leading dimension is treated as 1, so the shapes (3, 4) and (4,) in the example given here are compatible:

# Rule 2: Two dimensions are also compatible when one of them is 1
# Initialize `x`
x = np.ones((3,4))
print(x)
# Check shape of `x`
print(x.shape)
# Initialize `y`
y = np.arange(4)
print(y)
# Check shape of `y`
print(y.shape)
# Subtract `y` from `x`
print(x - y)
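
To make the second rule concrete, the sketch below repeats the same subtraction and shows that the one-dimensional y behaves as if it had shape (1, 4) and were repeated for every row of x. The names result and y_as_row are introduced here only for illustration.

```python
import numpy as np

x = np.ones((3, 4))
y = np.arange(4)            # shape (4,), values [0 1 2 3]

result = x - y              # y is broadcast across the 3 rows of x
print(result.shape)         # (3, 4)
print(result[0])            # [ 1.  0. -1. -2.]

# Reshaping y to an explicit (1, 4) row gives exactly the same result
y_as_row = y.reshape(1, 4)
assert np.array_equal(result, x - y_as_row)
```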

Lastly, there is a third rule that says two arrays can be broadcast together if they are compatible in
all of the dimensions. Check the example given here:

# Rule 3: Arrays can be broadcast together if they are compatible in all dimensions

x = np.ones((6,8))
y = np.random.random((10, 1, 8))
print(x + y)

The shapes of x (6, 8) and y (10, 1, 8) are different, yet the arrays can be added. Why is that? Now change the shape of y to (10, 2, 8) or (10, 1, 4): the addition raises a ValueError. Can you find out why? (Hint: check rule 1.)
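
One way to explore this question is to try each shape and see which additions succeed. The helper try_add below is a hypothetical name written for this sketch; it returns the resulting shape, or None when NumPy raises a ValueError.

```python
import numpy as np

x = np.ones((6, 8))

def try_add(y_shape):
    """Return the shape of x + y, or None if the shapes cannot be broadcast."""
    y = np.ones(y_shape)
    try:
        return (x + y).shape
    except ValueError:
        return None

# Shapes are compared from the last dimension backwards:
# (6, 8) is treated as (1, 6, 8) when matched against a 3-D array.
ok = try_add((10, 1, 8))        # 8 == 8, 6 vs 1, 1 vs 10: compatible
bad_middle = try_add((10, 2, 8))  # 6 vs 2: neither equal nor 1
bad_last = try_add((10, 1, 4))    # 8 vs 4: neither equal nor 1

print(ok)          # (10, 6, 8)
print(bad_middle)  # None
print(bad_last)    # None
```

In each failing case a pair of dimensions is neither equal nor 1, which is why the hint points back to rule 1.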
