Numpy 1

Uploaded by

kbhavana0602
Copyright
© © All Rights Reserved

Introduction to NumPy

Follow the installation instructions found there. Once you do, you can import
NumPy and double-check the version:

In[1]: import numpy
numpy.__version__

Out[1]:'1.11.1'

For the pieces of the package discussed here, I’d recommend NumPy version
1.8 or later. By convention, you’ll find that most people in the SciPy/PyData world
will import NumPy using np as an alias:

In[2]:import numpy as np

/* C code */
int result = 0;
for (int i = 0; i < 100; i++) {
    result += i;
}

While in Python the equivalent operation could be written this way:

# Python code
result = 0
for i in range(100):
    result += i

Creating Arrays from Python Lists

First, we can use np.array to create arrays from Python lists:

In[8]: # integer array:
np.array([1, 4, 2, 5, 3])

Out[8]: array([1, 4, 2, 5, 3])

Remember that unlike Python lists, NumPy is constrained to arrays that all
contain the same type. If types do not match, NumPy will upcast if possible (here,
integers are upcast to floating point):

In[9]:np.array([3.14,4,2,3])

Out[9]:array([3.14,4.,2.,3.])

If we want to explicitly set the data type of the resulting array, we can use the
dtype keyword:

In[10]:np.array([1,2,3,4],dtype='float32')

Out[10]:array([1.,2.,3.,4.],dtype=float32)
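The upcasting rule and the dtype keyword can both be checked by inspecting the dtype attribute of the result. This is a minimal sketch; the variable names mixed and as_float32 are illustrative only and do not come from the text.

```python
import numpy as np

# Mixed integers and floats: NumPy upcasts every element to floating point
mixed = np.array([3.14, 4, 2, 3])
print(mixed.dtype)        # float64

# An explicit dtype overrides the type NumPy would otherwise infer
as_float32 = np.array([1, 2, 3, 4], dtype='float32')
print(as_float32.dtype)   # float32
```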

Creating Arrays from Scratch

Especially for larger arrays, it is more efficient to create arrays from scratch using
routines built in to NumPy. Here are several examples:

In[12]: # Create a length-10 integer array filled with zeros
np.zeros(10, dtype=int)

Out[12]: array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In[13]: # Create a 3x5 floating-point array filled with 1s
np.ones((3, 5), dtype=float)

Out[13]: array([[ 1., 1., 1., 1., 1.],
                [ 1., 1., 1., 1., 1.],
                [ 1., 1., 1., 1., 1.]])

In[14]: # Create a 3x5 array filled with 3.14
np.full((3, 5), 3.14)

Out[14]: array([[ 3.14, 3.14, 3.14, 3.14, 3.14],
                [ 3.14, 3.14, 3.14, 3.14, 3.14],
                [ 3.14, 3.14, 3.14, 3.14, 3.14]])

In[15]: # Create an array filled with a linear sequence
# Starting at 0, ending at 20, stepping by 2
# (this is similar to the built-in range() function)
np.arange(0, 20, 2)

Out[15]: array([0, 2, 4, 6, 8, 10, 12, 14, 16, 18])

In[16]: # Create an array of five values evenly spaced between 0 and 1
np.linspace(0, 1, 5)

Out[16]: array([0., 0.25, 0.5, 0.75, 1.])

In[17]: # Create a 3x3 array of uniformly distributed
# random values between 0 and 1
np.random.random((3, 3))

Out[17]: array([[ 0.99844933, 0.52183819, 0.22421193],
                [ 0.08007488, 0.45429293, 0.20941444],
                [ 0.14360941, 0.96910973, 0.946117]])

In[19]: # Create a 3x3 array of random integers in the interval [0, 10)
np.random.randint(0, 10, (3, 3))

Out[19]: array([[2, 3, 4],
                [5, 7, 8],
                [0, 5, 0]])

In[20]: # Create a 3x3 identity matrix
np.eye(3)

Out[20]: array([[ 1., 0., 0.],
                [ 0., 1., 0.],
                [ 0., 0., 1.]])
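One detail worth checking in the routines above is how np.arange and np.linspace treat their end value: arange takes a step and stops before the end, while linspace takes a number of points and includes the end by default. The following is a small sketch using the same arguments as the cells above; the variable names are illustrative.

```python
import numpy as np

# arange: start, stop, step; the stop value itself is excluded
stepped = np.arange(0, 20, 2)

# linspace: start, stop, number of points; the stop value is included
spaced = np.linspace(0, 1, 5)

print(stepped[-1], len(stepped))   # 18 10
print(spaced[-1], len(spaced))     # 1.0 5
```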
Table 2-1. Standard NumPy data types

Data type     Description

bool_         Boolean (True or False) stored as a byte
int_          Default integer type (same as C long; normally either int64 or int32)
intc          Identical to C int (normally int32 or int64)
intp          Integer used for indexing (same as C ssize_t; normally either int32 or int64)
int8          Byte (-128 to 127)
int16         Integer (-32768 to 32767)
int32         Integer (-2147483648 to 2147483647)
int64         Integer (-9223372036854775808 to 9223372036854775807)
uint8         Unsigned integer (0 to 255)
uint16        Unsigned integer (0 to 65535)
uint32        Unsigned integer (0 to 4294967295)
uint64        Unsigned integer (0 to 18446744073709551615)
float_        Shorthand for float64
float16       Half-precision float: sign bit, 5 bits exponent, 10 bits mantissa
float32       Single-precision float: sign bit, 8 bits exponent, 23 bits mantissa
float64       Double-precision float: sign bit, 11 bits exponent, 52 bits mantissa
complex_      Shorthand for complex128
complex64     Complex number, represented by two 32-bit floats
complex128    Complex number, represented by two 64-bit floats
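The ranges and sizes listed in Table 2-1 can be queried from NumPy itself rather than memorized. The sketch below uses np.iinfo for integer limits and the itemsize attribute of np.dtype for storage size; the choice of types and the variable names are illustrative.

```python
import numpy as np

# Integer limits follow directly from the number of bits in the type
for int_type in (np.int8, np.int16, np.uint8):
    info = np.iinfo(int_type)
    print(info.dtype, info.min, info.max)

# itemsize reports the storage per element in bytes
float32_bytes = np.dtype('float32').itemsize        # 4
complex128_bytes = np.dtype('complex128').itemsize  # 16
print(float32_bytes, complex128_bytes)
```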

The Basics of NumPy Arrays

A few categories of basic array manipulations:

Attributes of arrays
Determining the size, shape, memory consumption, and data types of arrays

Indexing of arrays
Getting and setting the value of individual array elements

Slicing of arrays
Getting and setting smaller subarrays within a larger array

Reshaping of arrays
Changing the shape of a given array

Joining and splitting of arrays
Combining multiple arrays into one, and splitting one array into many

NumPy Array Attributes

First let’s discuss some useful array attributes. We’ll start by defining three
random arrays: a one-dimensional, two-dimensional, and three-dimensional
array. We’ll use NumPy’s random number generator, which we will seed with a
set value in order to ensure that the same random arrays are generated each
time this code is run:

In[1]: import numpy as np
np.random.seed(0)  # seed for reproducibility

x1 = np.random.randint(10, size=6)  # One-dimensional array
x2 = np.random.randint(10, size=(3, 4))  # Two-dimensional array
x3 = np.random.randint(10, size=(3, 4, 5))  # Three-dimensional array

Each array has attributes ndim (the number of dimensions), shape (the size of each
dimension), and size (the total size of the array):

In[2]: print("x3 ndim: ", x3.ndim)
print("x3 shape:", x3.shape)
print("x3 size: ", x3.size)

x3 ndim:  3
x3 shape: (3, 4, 5)
x3 size:  60

Another useful attribute is the dtype, the data type of the array.

In[3]: print("dtype:", x3.dtype)

dtype: int64
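Memory consumption, mentioned among the attribute categories earlier, can be read from two further attributes: itemsize (bytes per element) and nbytes (bytes for the whole array). The sketch below rebuilds x3 as above; no output is shown because the default integer type, and therefore the byte counts, depend on the platform.

```python
import numpy as np

np.random.seed(0)  # same seed as in the text
x3 = np.random.randint(10, size=(3, 4, 5))

# itemsize: size of one element; nbytes: size of the whole data buffer
print("itemsize:", x3.itemsize, "bytes")
print("nbytes:  ", x3.nbytes, "bytes")

# nbytes is always itemsize multiplied by the number of elements
print(x3.nbytes == x3.itemsize * x3.size)   # True
```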
Array Indexing: Accessing Single Elements

If you are familiar with Python’s standard list indexing, indexing in NumPy will
feel quite familiar. In a one-dimensional array, you can access the ith
value (counting from zero) by specifying the desired index in square brackets, just
as with Python lists:

In[5]:x1
Out[5]:array([5,0,3,3,7,9])

In[6]:x1[0]
Out[6]:5

In[7]:x1[4]
Out[7]:7

To index from the end of the array, you can use negative indices:

In[8]:x1[-1]
Out[8]:9

In[9]:x1[-2]
Out[9]:7

In a multidimensional array, you access items using a comma-separated tuple
of indices:

In[10]: x2

Out[10]: array([[3, 5, 2, 4],
                [7, 6, 8, 8],
                [1, 6, 7, 7]])

In[11]:x2[0,0]
Out[11]:3

In[12]:x2[2,0]
Out[12]:1

In[13]:x2[2,-1]
Out[13]:7
You can also modify values using any of the above index notation:

In[14]: x2[0, 0] = 12
x2

Out[14]: array([[12,  5,  2,  4],
                [ 7,  6,  8,  8],
                [ 1,  6,  7,  7]])
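Because NumPy arrays have a fixed type, assigning a floating-point value into an integer array silently truncates it. The sketch below illustrates this with the values of x1 shown earlier, entered by hand so that the example does not depend on the random seed.

```python
import numpy as np

x1 = np.array([5, 0, 3, 3, 7, 9])   # integer array

# The float is truncated to fit the array's integer dtype; no warning is raised
x1[0] = 3.14159
print(x1)   # [3 0 3 3 7 9]
```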

Array Slicing: Accessing Subarrays

Just as we can use square brackets to access individual array elements, we can
also use them to access subarrays with the slice notation, marked by the colon
(:) character. The NumPy slicing syntax follows that of the standard Python list;
to access a slice of an array x, use this:

x[start:stop:step]

If any of these are unspecified, they default to the values start=0,
stop=size of dimension, step=1. We’ll take a look at accessing subarrays in one
dimension and in multiple dimensions.

One-dimensional subarrays

In[16]: x = np.arange(10)
x

Out[16]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In[17]: x[:5]  # first five elements

Out[17]: array([0, 1, 2, 3, 4])

In[18]: x[5:]  # elements from index 5 onward

Out[18]: array([5, 6, 7, 8, 9])

In[19]: x[4:7]  # middle subarray

Out[19]: array([4, 5, 6])

In[20]: x[::2]  # every other element

Out[20]: array([0, 2, 4, 6, 8])

In[21]: x[1::2]  # every other element, starting at index 1

Out[21]: array([1, 3, 5, 7, 9])

A potentially confusing case is when the step value is negative. In this
case, the defaults for start and stop are swapped. This becomes a convenient way
to reverse an array:

In[22]: x[::-1]  # all elements, reversed

Out[22]: array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])

In[23]: x[5::-2]  # reversed every other from index 5

Out[23]: array([5, 3, 1])
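With a negative step, an explicit start must be larger than the stop, and the stop index is still excluded. The sketch below shows this; the bounds 7 and 2 and the variable names are chosen only for illustration.

```python
import numpy as np

x = np.arange(10)

# Omitted start and stop with a negative step: run from the last element
# back through the first
reversed_all = x[::-1]

# Explicit bounds: start at index 7, step backward, stop before index 2
middle_reversed = x[7:2:-1]
print(middle_reversed)   # [7 6 5 4 3]
```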

Multidimensional subarrays

Multidimensional slices work in the same way, with multiple slices separated by
commas. For example:

In[24]: x2

Out[24]: array([[12,  5,  2,  4],
                [ 7,  6,  8,  8],
                [ 1,  6,  7,  7]])

In[25]: x2[:2, :3]  # two rows, three columns

Out[25]: array([[12,  5,  2],
                [ 7,  6,  8]])

In[26]: x2[:3, ::2]  # all rows, every other column

Out[26]: array([[12,  2],
                [ 7,  8],
                [ 1,  7]])

Finally, subarray dimensions can even be reversed together:

In[27]: x2[::-1, ::-1]

Out[27]: array([[ 7,  7,  6,  1],
                [ 8,  8,  6,  7],
                [ 4,  2,  5, 12]])

Accessing array rows and columns

One commonly needed routine is accessing single rows or columns of an
array. You can do this by combining indexing and slicing, using an empty slice marked
by a single colon (:):

In[28]: print(x2[:, 0])  # first column of x2

[12  7  1]

In[29]: print(x2[0, :])  # first row of x2

[12  5  2  4]

In the case of row access, the empty slice can be omitted for a more compact
syntax:

In[30]: print(x2[0])  # equivalent to x2[0, :]

[12  5  2  4]

Subarrays as no-copy views

One important—and extremely useful—thing to know about array slices is that
they return views rather than copies of the array data. This is one
area in which NumPy array slicing differs from Python list slicing: in lists, slices will be
copies. Consider our two-dimensional array from before:

In[31]: print(x2)

[[12  5  2  4]
 [ 7  6  8  8]
 [ 1  6  7  7]]

Let’s extract a 2×2 subarray from this:

In[32]: x2_sub = x2[:2, :2]
print(x2_sub)

[[12  5]
 [ 7  6]]

Now if we modify this subarray, we’ll see that the original array is changed! Observe:

In[33]: x2_sub[0, 0] = 99
print(x2_sub)

[[99  5]
 [ 7  6]]

In[34]: print(x2)

[[99  5  2  4]
 [ 7  6  8  8]
 [ 1  6  7  7]]

This default behavior is actually quite useful: it means that when we work with
large datasets, we can access and process pieces of these datasets without the
need to copy the underlying data buffer.
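The contrast with Python lists can be seen side by side. In this small sketch (all names are illustrative), modifying a list slice leaves the original list untouched, while modifying an array slice changes the original array.

```python
import numpy as np

# List slicing makes a copy: the original list is unaffected
py_list = [0, 1, 2, 3]
list_slice = py_list[:2]
list_slice[0] = 99
print(py_list)   # [0, 1, 2, 3]

# Array slicing makes a view: the original array sees the change
arr = np.array([0, 1, 2, 3])
arr_slice = arr[:2]
arr_slice[0] = 99
print(arr)       # [99  1  2  3]
```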

Creating copies of arrays

Despite the nice features of array views, it is sometimes useful to instead
explicitly copy the data within an array or a subarray. This can be most easily
done with the copy() method:

In[35]: x2_sub_copy = x2[:2, :2].copy()
print(x2_sub_copy)

[[99  5]
 [ 7  6]]

If we now modify this subarray, the original array is not touched:

In[36]: x2_sub_copy[0, 0] = 42
print(x2_sub_copy)

[[42  5]
 [ 7  6]]

In[37]: print(x2)

[[99  5  2  4]
 [ 7  6  8  8]
 [ 1  6  7  7]]
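To check whether a subarray is a view or a copy without modifying anything, one option is np.shares_memory together with the base attribute. Neither is used in the text above, and the array here is entered by hand for illustration.

```python
import numpy as np

x2 = np.array([[12, 5, 2, 4],
               [7, 6, 8, 8],
               [1, 6, 7, 7]])

view = x2[:2, :2]            # slice: a view onto x2's data
copied = x2[:2, :2].copy()   # explicit copy: its own data buffer

print(np.shares_memory(x2, view))     # True
print(np.shares_memory(x2, copied))   # False
print(view.base is x2)                # True: the view refers back to x2
```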

Reshaping of Arrays

Another useful type of operation is reshaping of arrays. The most flexible way of
doing this is with the reshape() method. For example, if you want to put the
numbers 1 through 9 in a 3×3 grid, you can do the following:

In[38]: grid = np.arange(1, 10).reshape((3, 3))
print(grid)

[[1 2 3]
 [4 5 6]
 [7 8 9]]

Note that for this to work, the size of the initial array must match the size
of the reshaped array. Where possible, the reshape method will use a no-copy view of
the initial array, but with noncontiguous memory buffers this is not always the
case.
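The size requirement can be seen by attempting a reshape that does not fit. The sketch below also uses -1 as a placeholder dimension that NumPy infers from the array size; that feature is not covered in the text above and is shown only as a convenience.

```python
import numpy as np

values = np.arange(1, 10)          # nine elements

grid = values.reshape((3, 3))      # 3 * 3 == 9, so this works
auto = values.reshape((3, -1))     # -1 is inferred as 3

# 2 * 5 == 10 does not match the nine elements, so NumPy raises ValueError
try:
    values.reshape((2, 5))
except ValueError as err:
    print("reshape failed:", err)
```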

Another common reshaping pattern is the conversion of a one-dimensional
array into a two-dimensional row or column matrix. You can do this with
the reshape method, or more easily by making use of the newaxis keyword
within a slice operation:

In[39]: x = np.array([1, 2, 3])

# row vector via reshape
x.reshape((1, 3))

Out[39]: array([[1, 2, 3]])

In[40]: # row vector via newaxis
x[np.newaxis, :]

Out[40]: array([[1, 2, 3]])

In[41]: # column vector via reshape
x.reshape((3, 1))

Out[41]: array([[1],
                [2],
                [3]])

In[42]: # column vector via newaxis
x[:, np.newaxis]

Out[42]: array([[1],
                [2],
                [3]])

We will see this type of transformation often throughout the remainder of the
book.

Array Concatenation and Splitting

All of the preceding routines worked on single arrays. It’s also possible to combine
multiple arrays into one, and to conversely split a single array into multiple
arrays. We’ll take a look at those operations here.

Concatenation of arrays

Concatenation, or joining of two arrays in NumPy, is primarily accomplished
through the routines np.concatenate, np.vstack, and
np.hstack. np.concatenate takes a tuple or list of arrays as its first argument, as we
can see here:

In[43]: x = np.array([1, 2, 3])
y = np.array([3, 2, 1])
np.concatenate([x, y])

Out[43]: array([1, 2, 3, 3, 2, 1])

You can also concatenate more than two arrays at once:

In[44]: z = [99, 99, 99]
print(np.concatenate([x, y, z]))

[ 1  2  3  3  2  1 99 99 99]

np.concatenate can also be used for two-dimensional arrays:

In[45]: grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

In[46]: # concatenate along the first axis
np.concatenate([grid, grid])

Out[46]: array([[1, 2, 3],
                [4, 5, 6],
                [1, 2, 3],
                [4, 5, 6]])

In[47]: # concatenate along the second axis (zero-indexed)
np.concatenate([grid, grid], axis=1)

Out[47]: array([[1, 2, 3, 1, 2, 3],
                [4, 5, 6, 4, 5, 6]])

For working with arrays of mixed dimensions, it can be clearer to use the
np.vstack (vertical stack) and np.hstack (horizontal stack) functions:

In[48]: x = np.array([1, 2, 3])
grid = np.array([[9, 8, 7],
                 [6, 5, 4]])

# vertically stack the arrays
np.vstack([x, grid])

Out[48]: array([[1, 2, 3],
                [9, 8, 7],
                [6, 5, 4]])

In[49]: # horizontally stack the arrays
y = np.array([[99],
              [99]])
np.hstack([grid, y])

Out[49]: array([[ 9,  8,  7, 99],
                [ 6,  5,  4, 99]])

Similarly, np.dstack will stack arrays along the third axis.
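The effect of each stacking function is easiest to see in the shape of the result. This sketch stacks the same 2×3 array with itself in all three directions; the variable names are illustrative.

```python
import numpy as np

grid = np.array([[9, 8, 7],
                 [6, 5, 4]])   # shape (2, 3)

vertical = np.vstack([grid, grid])     # rows are appended
horizontal = np.hstack([grid, grid])   # columns are appended
depth = np.dstack([grid, grid])        # a new third axis is created

print(vertical.shape)     # (4, 3)
print(horizontal.shape)   # (2, 6)
print(depth.shape)        # (2, 3, 2)
```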

Splitting of arrays

The opposite of concatenation is splitting, which is implemented by the
functions np.split, np.hsplit, and np.vsplit. For each of these, we can pass a
list of indices giving the split points:

In[50]: x = [1, 2, 3, 99, 99, 3, 2, 1]
x1, x2, x3 = np.split(x, [3, 5])
print(x1, x2, x3)

[1 2 3] [99 99] [3 2 1]

Notice that N split points lead to N + 1 subarrays. The related functions
np.hsplit and np.vsplit are similar:

In[51]: grid = np.arange(16).reshape((4, 4))
grid

Out[51]: array([[ 0,  1,  2,  3],
                [ 4,  5,  6,  7],
                [ 8,  9, 10, 11],
                [12, 13, 14, 15]])

In[52]: upper, lower = np.vsplit(grid, [2])
print(upper)
print(lower)

[[0 1 2 3]
 [4 5 6 7]]
[[ 8  9 10 11]
 [12 13 14 15]]

In[53]: left, right = np.hsplit(grid, [2])
print(left)
print(right)

[[ 0  1]
 [ 4  5]
 [ 8  9]
 [12 13]]
[[ 2  3]
 [ 6  7]
 [10 11]
 [14 15]]

Similarly, np.dsplit will split arrays along the third axis.
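Besides a list of split points, np.split also accepts a single integer giving the number of equal-sized sections. This form is not shown in the text above, so the sketch below is an addition; the array and names are illustrative.

```python
import numpy as np

x = np.arange(8)

# An integer asks for that many equal sections; the length must divide evenly
parts = np.split(x, 4)
print([p.tolist() for p in parts])   # [[0, 1], [2, 3], [4, 5], [6, 7]]

# Splitting and concatenating are inverse operations
restored = np.concatenate(parts)
print(np.array_equal(restored, x))   # True
```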

Computation on NumPy Arrays: Universal Functions

Up until now, we have been discussing some of the basic nuts and bolts of
NumPy; in the next few sections, we will dive into the reasons that NumPy is so
important in the Python data science world. Namely, it provides an easy and
flexible interface to optimized computation with arrays of data.

Computation on NumPy arrays can be very fast, or it can be very slow. The key
to making it fast is to use vectorized operations, generally implemented through
NumPy’s universal functions (ufuncs). This section motivates the need for
NumPy’s ufuncs, which can be used to make repeated calculations on array
elements much more efficient. It then introduces many of the most common and
useful arithmetic ufuncs available in the NumPy package.
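The difference between an element-by-element loop and a vectorized operation can be sketched as follows. The helper reciprocals_loop and the sample values are hypothetical, written only for this illustration; both approaches give the same result, but the vectorized form pushes the loop into NumPy's compiled code.

```python
import numpy as np

def reciprocals_loop(values):
    """Compute 1/value for each element with an explicit Python loop."""
    output = np.empty(len(values))
    for i in range(len(values)):
        output[i] = 1.0 / values[i]
    return output

values = np.array([1, 2, 4, 5, 8])

looped = reciprocals_loop(values)   # one Python-level operation per element
vectorized = 1.0 / values           # a single ufunc call over the whole array

print(np.allclose(looped, vectorized))   # True
```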

Table 2-2 lists the arithmetic operators implemented in NumPy.

Table 2-2. Arithmetic operators implemented in NumPy

Operator    Equivalent ufunc    Description

+           np.add              Addition (e.g., 1 + 1 = 2)
-           np.subtract         Subtraction (e.g., 3 - 2 = 1)
-           np.negative         Unary negation (e.g., -2)
*           np.multiply         Multiplication (e.g., 2 * 3 = 6)
/           np.divide           Division (e.g., 3 / 2 = 1.5)
//          np.floor_divide     Floor division (e.g., 3 // 2 = 1)
**          np.power            Exponentiation (e.g., 2 ** 3 = 8)
%           np.mod              Modulus/remainder (e.g., 9 % 4 = 1)

Additionally there are Boolean/bitwise operators; we will explore these in
“Comparisons, Masks, and Boolean Logic”.
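The correspondence between operators and ufuncs in Table 2-2 can be checked directly: each operator applied to an array gives the same result as calling the named ufunc. The array x below is chosen only for illustration.

```python
import numpy as np

x = np.arange(4)   # array([0, 1, 2, 3])

# Each comparison prints True: the operator is a wrapper around the ufunc
print(np.array_equal(x + 2, np.add(x, 2)))
print(np.array_equal(x // 2, np.floor_divide(x, 2)))
print(np.array_equal(x ** 2, np.power(x, 2)))
print(np.array_equal(-x, np.negative(x)))
```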

Absolute value

Just as NumPy understands Python’s built-in arithmetic operators, it also understands
Python’s built-in absolute value function:

In[11]: x = np.array([-2, -1, 0, 1, 2])
abs(x)

Out[11]: array([2, 1, 0, 1, 2])

The corresponding NumPy ufunc is np.absolute, which is also available under
the alias np.abs:

In[12]: np.absolute(x)

Out[12]: array([2, 1, 0, 1, 2])

In[13]: np.abs(x)

Out[13]: array([2, 1, 0, 1, 2])

This ufunc can also handle complex data, in which the absolute value returns
the magnitude:

In[14]: x = np.array([3 - 4j, 4 - 3j, 2 + 0j, 0 + 1j])
np.abs(x)

Out[14]: array([5., 5., 2., 1.])

Out[12]:array([0.8967576,0.99196818,0.6687194])

Other aggregation functions

NumPy provides many other aggregation functions, but we won’t discuss them in
detail here. Additionally, most aggregates have a NaN-safe counterpart that
computes the result while ignoring missing values, which are marked by the
special IEEE floating-point NaN value (for a fuller discussion of missing data, see
“Handling Missing Data”). Some of these NaN-safe functions were not added until
NumPy 1.8, so they will not be available in older NumPy versions.

Table 2-3 provides a list of useful aggregation functions available in NumPy.

Table 2-3. Aggregation functions available in NumPy

Function name    NaN-safe version    Description

np.sum           np.nansum           Compute sum of elements
np.prod          np.nanprod          Compute product of elements
np.mean          np.nanmean          Compute mean of elements
np.std           np.nanstd           Compute standard deviation
np.var           np.nanvar           Compute variance
np.min           np.nanmin           Find minimum value
np.max           np.nanmax           Find maximum value
np.argmin        np.nanargmin        Find index of minimum value
np.argmax        np.nanargmax        Find index of maximum value
np.median        np.nanmedian        Compute median of elements
np.percentile    np.nanpercentile    Compute rank-based statistics of elements
np.any           N/A                 Evaluate whether any elements are true
np.all           N/A                 Evaluate whether all elements are true

We will see these aggregates often throughout the rest of the book.
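The difference between an aggregate and its NaN-safe counterpart shows up as soon as the data contains a missing value. The sample array below is made up for illustration.

```python
import numpy as np

data = np.array([1.0, 2.0, np.nan, 4.0])

# A single NaN propagates through the ordinary aggregate
print(np.sum(data))       # nan

# The NaN-safe version ignores the missing value
print(np.nansum(data))    # 7.0
print(np.nanmax(data))    # 4.0
```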

In[37]: A = np.array([1, 0, 1, 0, 1, 0], dtype=bool)
B = np.array([1, 1, 1, 0, 1, 1], dtype=bool)
A | B

Out[37]: array([ True,  True,  True, False,  True,  True], dtype=bool)

Using or on these arrays will try to evaluate the truth or falsehood of the entire array
object, which is not a well-defined value:

In[38]: A or B

ValueError                                Traceback (most recent call last)

<ipython-input-38-5d8e4f2e21c0> in <module>()
----> 1 A or B

ValueError: The truth value of an array with more than one element is...
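The two behaviors can be reproduced together in one short script. The sketch below uses the same A and B as above, shows that the | operator matches the np.logical_or ufunc, and catches the error raised by the or keyword instead of letting it stop the program.

```python
import numpy as np

A = np.array([1, 0, 1, 0, 1, 0], dtype=bool)
B = np.array([1, 1, 1, 0, 1, 1], dtype=bool)

# Element-wise OR: the | operator and np.logical_or agree on Boolean arrays
elementwise = A | B
same = np.logical_or(A, B)
print(np.array_equal(elementwise, same))   # True

# The or keyword asks for the truth value of the whole array, which fails
try:
    A or B
except ValueError as err:
    print("ValueError:", err)
```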
