Data Science Python

This document provides an introduction to data science and Python. It defines data science as using data to draw meaningful conclusions. It discusses the data science workflow of data collection, preparation, exploration/visualization, and experimentation/prediction. It also covers common data science tools and applications, including machine learning, IoT, and fraud detection. Finally, it introduces Python as an interpreted programming language commonly used for data science tasks like data analysis, machine learning, and web development.

Uploaded by

ZADOD YASSINE
Copyright
© All Rights Reserved

INTRODUCTION TO

Zakaria KERKAOU
Zakaria.kerkaou@e-polytechnique.ma
What is Data Science ?
• It's a set of methodologies for taking in thousands of forms of data
that are available to us today, and using them to draw meaningful
conclusions.
• Data is being collected all around us. Every like, click, email, credit
card swipe, or tweet is a new piece of data that can be used to
better describe the present or better predict the future.
What can data do ?
• Data can describe our current state.
• It can help detect anomalous events.
• Data can also diagnose the causes of observed events and
behaviours.
• Finally, Data can predict future events.
Why now ?
• We're collecting more data than ever before.
Data science workflow
• In data science, we generally have four steps to any project.

Data collection

Data Preparation

Data exploration and visualization

Experimentation and predictions on the data.
Application of Data science
• Some areas of data science :

Machine learning (ML).

Internet Of Things (IoT).

Deep Learning.
Application of Data science
• Example Machine Learning :

Fraud detection
Application of Data science
• A data science problem begins with a well-defined question:

What is the probability that this transaction is fraudulent?
• A set of example data:

Transactions (from a database) labelled as "valid" or "fraudulent".
• A new set of data to use our algorithm on:

A new transaction.
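The three ingredients above (a question, labelled examples, a new transaction) can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions: the transaction amounts are made up, the single feature (the amount) is hypothetical, and the k-nearest-neighbours estimate used here is one simple technique chosen for brevity, not the method used in any real fraud system.

```python
# Hypothetical labelled history: (amount, label). The values are invented.
labelled = [
    (12.0, "valid"), (25.5, "valid"), (40.0, "valid"), (60.0, "valid"),
    (75.0, "valid"), (900.0, "fraudulent"), (1200.0, "fraudulent"),
    (1500.0, "fraudulent"),
]

def fraud_probability(amount, examples, k=3):
    """Estimate P(fraud) as the share of fraudulent labels among
    the k labelled transactions whose amount is closest."""
    nearest = sorted(examples, key=lambda ex: abs(ex[0] - amount))[:k]
    frauds = sum(1 for _, label in nearest if label == "fraudulent")
    return frauds / k

# New transactions to score
p_small = fraud_probability(30.0, labelled)
p_large = fraud_probability(1000.0, labelled)
print(p_small, p_large)
```

A real project would use many more features and a library such as scikit-learn, but the structure (labelled examples in, probability out) stays the same.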
Application of Data science
Another example: IoT
Monitor and auto-detect different activities.
Application of Data science
Your smart watch is part of a fast growing field called "the Internet
of Things", also known as IoT, which is often combined with Data
Science.
IoT refers to gadgets that are not standard computers, but still have
the ability to transmit data.

Smart watches.

Internet connected home security systems.

Electronic toll collection systems.

Building energy management systems.

Much more !
IoT data is a great resource for data science projects
Data science roles.
Generally, there are four categories of jobs in data science:
• Data Engineer,
• Data Analyst,
• Data Scientist,
• Machine Learning Scientist.
Data science roles: Data Engineer.
Data engineers control the flow of data:
• They build custom data pipelines and storage systems.
• Information architects.
• Maintain Data access.
Data science roles: Data Engineer.
Data engineers' tools:
• SQL.

To store and organize data.
• Java, Scala or Python.

Programming languages to process data
• Shell.

Command lines to automate and run tasks
• Cloud computing.

AWS, Azure, Google cloud platform.
Data science roles: Data Analyst.
Data analysts describe the present via data:
• Perform simpler analyses that describe data.
• Create reports and dashboards to summarize data.
• Clean data for analysis.
Data science roles: Data Analyst.
Data analysts' tools:
• SQL.

Retrieve and aggregate data.
• Spreadsheets (Excel, Google Sheets).

Simple analysis.
• BI tools (Power BI, Tableau, Looker).

Dashboard and visualisation.
• Sometimes Python.
Data science roles: Data Scientist.
Data Scientists have a strong background in statistics, enabling
them to find new insights from data.
• Versed in statistical methods.
• Run experiments and analyses for insights.
• Traditional machine learning.
Data science roles: Data Scientist.
Data scientists' tools:
• SQL.

Retrieve and aggregate data.
• Python and/or R.

Data science libraries, for example pandas (Python) and tidyverse (R).

Machine learning libraries such as scikit-learn.
Data science roles: Machine Learning Scientist.
Machine learning scientists are similar to data scientists, but with a
machine learning specialization.
• Predictions and extrapolations.
• Classification.
• Deep learning:

Image processing, computer vision.

Natural language processing.
Data science roles: Machine Learning Scientist.
Machine learning scientists' tools:
• Python and/or R.

Data science libraries, for example pandas (Python) and tidyverse (R).

Machine learning and deep learning libraries: scikit-learn, TensorFlow, Spark.
Data science roles.
Introduction to
What is Python? And What can Python do?
• Python is an interpreted, high-level, general-purpose programming
language often used to build websites and software, automate tasks,
and conduct data analysis.
• Because it is general-purpose, it can be used to create a variety of
different programs and isn't specialized for any specific problem.
• Its versatility and beginner-friendliness have made it one of the
most-used programming languages today.
• The current major version of Python is Python 3.
• Python 2 reached end of life in January 2020 and no longer receives
updates, including security fixes, although legacy Python 2 code is
still in use.
What is Python? And What can Python do?
• Python can be used for:
  • Web development (server-side).
  • Software development.
  • Mathematics.
  • System scripting.
  • Game development.
  • Scientific computing and data science.
  • AI and machine learning.
Google Colaboratory

• Colaboratory, or Colab for short, is a Google Research product
which allows developers to write and execute Python code through
their browser.
• There are several reasons to use Google Colab instead of plain
local files:
  • Pre-installed libraries.
  • Notebooks saved in the cloud.
  • Collaboration.
  • Free GPU and TPU use.
Python Syntax
• Python code can be executed by:
• Typing it directly into the Python interpreter from the command line.

• Creating a Python file with the .py file extension and running
it from the command line.

• Or running it in a Jupyter notebook (or Google Colab).


Indentation

Indentation refers to the spaces at the beginning of a code line.

Indentation in Python is very important. Python uses
indentation to indicate a block of code.
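A minimal sketch of how indentation groups lines into a block. The variable `temperature` and its value are made up for illustration.

```python
temperature = 31  # hypothetical value

if temperature > 30:
    # These two lines share the same indentation, so they form one block
    advice = "hot"
    print("It is hot today")
else:
    advice = "mild"
    print("It is mild today")

# Back at the left margin: this line runs whatever the condition was
print("Advice:", advice)
```

Removing the indentation under `if` would raise an `IndentationError`, because Python would no longer know which lines belong to the block.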
Comments

Comments can be used to explain Python code.

Comments can be used to make the code more readable.

Python does not really have a syntax for multi-line comments.
However, since Python ignores string literals that are not
assigned to a variable, you can add a multi-line string (triple
quotes) in your code and place your comment inside it.
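The two comment styles described above might look like this. The names `radius` and `area` are illustrative only.

```python
# A single-line comment starts with a hash sign
radius = 2  # a comment can also follow code on the same line

"""
This triple-quoted string is not assigned to a variable,
so Python evaluates it and then discards it.
It can therefore serve as a multi-line comment.
"""
area = 3.14159 * radius ** 2
print(area)
```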
Variables


In Python there is no command for
declaring a variable. A variable is created
the moment you first assign a value to it.

Variables do not need to be declared with
any particular type, and can even change
type after they have been set.

However, if you want to specify the data
type of a variable, this can be done with
casting.
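A short sketch of the three points above: creation on first assignment, changing type, and casting. The variable names are arbitrary.

```python
x = 5                            # x is created here and holds an int
kind_before = type(x).__name__
x = "five"                       # the same name can later hold a str
kind_after = type(x).__name__

# Casting: build a value of a chosen type from another value
a = str(3)     # the string '3'
b = int("3")   # the integer 3
c = float(3)   # the float 3.0

print(kind_before, kind_after, a, b, c)
```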
Variable names


In Python a variable name :

Must start with a letter or the underscore character

Cannot start with a number

Can only contain alpha-numeric characters and underscores
(A-Z, a-z, 0-9, and _)

Is case-sensitive (age, Age and AGE are three different
variables)
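These naming rules can be checked with the built-in string method `str.isidentifier()`. The candidate names below are made up. Note that `isidentifier()` is slightly more permissive than the rules listed here (it also accepts non-ASCII letters) and does not reject reserved keywords such as `for`.

```python
candidates = ["age", "_total", "user_2", "2fast", "my-var", "my var"]

# Print each candidate with True (usable as a name) or False
for name in candidates:
    print(name, "->", name.isidentifier())

valid = [name for name in candidates if name.isidentifier()]

# Names are case-sensitive: these are three different variables
age = 30
Age = 31
AGE = 32
print(age, Age, AGE)
```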
Built-in Data Types

Python has the following data types built-in by default, in
these categories:

Text Type: str

Numeric Types: int, float, complex

Sequence Types: list, tuple, range

Mapping Type: dict

Set Types: set, frozenset

Boolean Type: bool

Binary Types: bytes, bytearray, memoryview

To get the datatype of any object you can use the function
type()
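As a quick illustration, the sketch below builds one sample value for each built-in type listed above and asks `type()` for its name. The sample values themselves are arbitrary.

```python
samples = [
    "hello", 42, 3.14, 2 + 3j,           # str, int, float, complex
    [1, 2], (1, 2), range(3),            # list, tuple, range
    {"a": 1},                            # dict
    {1, 2}, frozenset({1}),              # set, frozenset
    True,                                # bool
    b"hi", bytearray(2), memoryview(b"hi"),  # bytes, bytearray, memoryview
]

type_names = [type(value).__name__ for value in samples]
print(type_names)
```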
Input / output

Python provides built-in functions for performing I/O tasks.

The input() function lets the user enter data at run time, and the print()
function displays the result of the program on the screen.

In Python 3, input() reads a line of text and returns it as a string. Python 2 used
raw_input() for the same purpose.
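A hedged sketch of reading input and printing output. The function `describe_user` and its `read` parameter are hypothetical helpers introduced here so that the example can run without a keyboard: by default `read` is the real `input()`, but below it is replaced by scripted answers.

```python
def describe_user(read=input):
    # input() always returns a string, so the age must be cast to int
    name = read("Your name: ")
    age = int(read("Your age: "))
    return f"{name} will be {age + 1} next year."

# Scripted answers stand in for keyboard input so the sketch runs unattended
answers = iter(["Ada", "36"])
message = describe_user(read=lambda prompt: next(answers))
print(message)
```

In an interactive session, calling `describe_user()` with no argument would prompt the user through the real `input()` function.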
Lists

Lists are dynamically sized arrays, similar to vector in C++ and ArrayList in
Java.

List items are ordered, changeable, and allow duplicate values.

In Python, a list is created by placing elements inside square brackets [], separated
by commas.

We can use the index operator [] to access an item in a list. Indices start at 0.

Python allows negative indexing for its sequences. The index of -1 refers to the last
item.

To determine how many items a list has, use the len() function.
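The list operations described above, in a minimal sketch. The list contents are arbitrary.

```python
fruits = ["apple", "banana", "cherry", "banana"]  # duplicates are allowed

first = fruits[0]     # indexing starts at 0
last = fruits[-1]     # -1 refers to the last item
count = len(fruits)   # number of items

fruits[1] = "kiwi"        # lists are changeable
fruits.append("mango")    # and they grow dynamically

print(first, last, count, fruits)
```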
Arithmetic operators

+ Add two operands or unary plus

- Subtract right operand from the left or unary minus

* Multiply two operands

/ Divide left operand by the right one (always results in a float)

% Modulus - remainder of the division of left operand by the right

// Floor division

** Exponent - left operand raised to the power of right
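A small sketch applying each arithmetic operator to the same pair of operands. The values 7 and 2 are arbitrary.

```python
a, b = 7, 2

results = {
    "a + b": a + b,     # addition
    "a - b": a - b,     # subtraction
    "a * b": a * b,     # multiplication
    "a / b": a / b,     # true division, always a float
    "a % b": a % b,     # remainder
    "a // b": a // b,   # floor division
    "a ** b": a ** b,   # exponent
    "-a": -a,           # unary minus
}

for expression, value in results.items():
    print(expression, "=", value)

# Even when the division is exact, / returns a float
exact = 6 / 2
print(exact, type(exact).__name__)
```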
Comparison operators

> Greater than - True if left operand is greater than the right x > y

< Less than - True if left operand is less than the right x < y

== Equal to - True if both operands are equal x == y

!= Not equal to - True if operands are not equal x != y

>= Greater than or equal to - True if left operand is greater than or equal to the right x >= y

<= Less than or equal to - True if left operand is less than or equal to the right x <= y
Logical operators

and True if both the operands are true x and y

or True if either of the operands is true x or y

not True if operand is false (complements the operand) not x
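The comparison and logical operators above can be tried together. The values of `x` and `y` are arbitrary.

```python
x, y = 5, 8

comparisons = {
    "x > y": x > y,
    "x < y": x < y,
    "x == y": x == y,
    "x != y": x != y,
    "x >= y": x >= y,
    "x <= y": x <= y,
}
print(comparisons)

# Logical operators combine boolean results
in_range = x > 0 and x < 10    # both conditions must hold
either_big = x > 6 or y > 6    # at least one condition must hold
different = not (x == y)       # inverts the result
print(in_range, either_big, different)
```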
Bitwise operators

In the examples below, x = 10 (0000 1010) and y = 4 (0000 0100).

& Bitwise AND x & y = 0 (0000 0000)

| Bitwise OR x | y = 14 (0000 1110)

~ Bitwise NOT ~x = -11 (1111 0101)

^ Bitwise XOR x ^ y = 14 (0000 1110)

>> Bitwise right shift x >> 2 = 2 (0000 0010)

<< Bitwise left shift x << 2 = 40 (0010 1000)
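The results listed above correspond to x = 10 and y = 4, which the sketch below uses. The helper `bits` is a hypothetical function introduced here to show the lowest 8 bits of a result, so that the negative value of `~x` prints as 11110101.

```python
x = 10   # 0000 1010
y = 4    # 0000 0100

def bits(value):
    # Keep the lowest 8 bits and format them as a binary string
    return format(value & 0xFF, "08b")

bitwise = {
    "x & y": x & y,
    "x | y": x | y,
    "~x": ~x,
    "x ^ y": x ^ y,
    "x >> 2": x >> 2,
    "x << 2": x << 2,
}

for expression, value in bitwise.items():
    print(expression, "=", value, bits(value))
```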
Assignment operators

Operator   Example    Equivalent to
=          x = 5      x = 5
+=         x += 5     x = x + 5
-=         x -= 5     x = x - 5
*=         x *= 5     x = x * 5
/=         x /= 5     x = x / 5
%=         x %= 5     x = x % 5
//=        x //= 5    x = x // 5
**=        x **= 5    x = x ** 5
&=         x &= 5     x = x & 5
|=         x |= 5     x = x | 5
^=         x ^= 5     x = x ^ 5
>>=        x >>= 5    x = x >> 5
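A short sketch chaining the assignment operators on one variable. The right-hand values are arbitrary and chosen so that each step is easy to follow. The list `steps` is only there to record the intermediate values.

```python
steps = []

x = 5
steps.append(x)
x += 5;  steps.append(x)   # x = x + 5
x -= 3;  steps.append(x)   # x = x - 3
x *= 2;  steps.append(x)   # x = x * 2
x //= 4; steps.append(x)   # x = x // 4
x **= 3; steps.append(x)   # x = x ** 3
x %= 5;  steps.append(x)   # x = x % 5
x |= 4;  steps.append(x)   # x = x | 4
x &= 3;  steps.append(x)   # x = x & 3
x ^= 7;  steps.append(x)   # x = x ^ 7
x >>= 1; steps.append(x)   # x = x >> 1
x /= 4;  steps.append(x)   # x = x / 4, the result is a float

print(steps)
```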
Special operators

is True if the operands are identical (refer to the same object) x is True

is not True if the operands are not identical x is not True

Membership operators

in True if value/variable is found in the sequence 5 in x

not in True if value/variable is not found in the sequence 5 not in x
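A minimal sketch contrasting identity (`is`) with equality (`==`), followed by the membership operators. All names and values are arbitrary.

```python
a = [1, 2, 3]
b = a            # b refers to the same list object as a
c = [1, 2, 3]    # c is a different object with equal contents

same_object = a is b
equal_but_distinct = (a == c) and (a is not c)

x = [1, 5, 9]
found = 5 in x
missing = 7 not in x
substring = "py" in "python"   # membership also works on strings

print(same_object, equal_but_distinct, found, missing, substring)
```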
Conditions and If statements

Conditional statements are required when we want to execute a code only if a certain
condition is satisfied.

The if…elif…else statement is used in Python for decision making.


Example: we check whether a number is positive, negative or zero and display an
appropriate message.

PS: Python relies on indentation (whitespace at the beginning of a line) to define
scope in the code. Other programming languages often use curly brackets for this
purpose.
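The example described above might be written as follows. Wrapping the test in a function named `sign_message` is a choice made here so that several numbers can be checked; the function name and the message texts are assumptions.

```python
def sign_message(number):
    if number > 0:
        return "Positive number"
    elif number == 0:
        return "Zero"
    else:
        return "Negative number"

print(sign_message(3.4))
print(sign_message(0))
print(sign_message(-7))
```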
Python for Loop

The for loop in Python is used to iterate over a sequence (list, tuple, string) or other
iterable objects. Iterating over a sequence is called traversal.

The body of for loop is separated from the rest of the code using indentation.

We can generate a sequence of numbers using range() function. range(10) will
generate numbers from 0 to 9 (10 numbers).

Example: program to find the sum of all numbers stored in a list.
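A sketch of the summing program, followed by a look at what range(10) produces. The numbers in the list are made up.

```python
numbers = [6, 5, 3, 8, 4, 2, 5, 4, 11]

total = 0
for value in numbers:
    # The indented body runs once per item in the list
    total += value

print("The sum is", total)

# range(10) yields the integers 0 to 9
first_ten = list(range(10))
print(first_ten)
```

The built-in `sum(numbers)` gives the same result in one call; the explicit loop is shown to illustrate traversal.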
Python while loop

The while loop in Python is used to iterate over a block of code as long as the test
expression (condition) is true.

This loop is generally used when we don't know the number of times to iterate
beforehand.


Example: program to add the natural numbers up to n:

sum = 1 + 2 + 3 + ... + n
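A sketch of this program with a while loop. The value of `n` is arbitrary, and the accumulator is named `total` rather than `sum` to avoid hiding Python's built-in `sum()` function.

```python
n = 10       # hypothetical upper bound

total = 0
i = 1
while i <= n:
    total += i
    i += 1   # without this update the condition would never become false

print("The sum is", total)
```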
Python break and continue

In Python, break and continue statements can change the normal flow of a loop.

The break statement terminates the loop containing it.

The continue statement is used to skip the rest of the code inside a loop for the
current iteration only.
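The difference between the two statements can be seen by running the same loop twice over the string "string" (an arbitrary choice), once with break and once with continue.

```python
# break: stop the whole loop as soon as "i" is reached
before_i = []
for letter in "string":
    if letter == "i":
        break
    before_i.append(letter)

# continue: skip only the iteration where the letter is "i"
without_i = []
for letter in "string":
    if letter == "i":
        continue
    without_i.append(letter)

print(before_i)
print(without_i)
```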
