
Name: Akshansh Aswal - Course: B.Tech 3rd Year - Section: 'B' - Roll No: 07 - Graphic Era Hill University Dehradun Campus

This document provides details about a student named Akshansh Aswal enrolled in a 9-week introduction to data science course. The course commenced on August 4th 2020 and ends on October 5th 2020. It is being conducted online through the Udemy platform. The document then outlines topics to be covered in the course, including data all around us, what data science is, examples of data types and amounts of data collected, and what can be done with large datasets through techniques like data mining and machine learning.

Uploaded by

Ashish Bhatt

INTRODUCTION

• NAME: AKSHANSH ASWAL
• COURSE: B.TECH 3rd Year
• SECTION: ‘B’
• ROLL NO: 07
• GRAPHIC ERA HILL UNIVERSITY, DEHRADUN CAMPUS
Course Details:

• The name of my topic is “Introduction to Data Science”.
• Registered through the online platform Udemy.
• Duration of the course is 9 weeks.
• The course commenced on 4 August 2020 and ends on 5 October 2020.
• The faculty in charge is Prof. Hadelin de Ponteves.
Topics Outline:

• Data
– Data All Around
• Data Science
– Introduction
• What to do with that data


Data All Around

• Lots of data is being collected and warehoused
– Web data, e-commerce
– Financial transactions, bank/credit transactions
– Online trading and purchasing
– Social networks
How Much Data Do We Have?

• Google processes 20 PB a day (2008)
• Facebook has 60 TB of daily logs
• eBay has 6.5 PB of user data + 50 TB/day (5/2009)
• 1000 Genomes Project: 200 TB
• Cost of 1 TB of disk: $35
• Time to read 1 TB disk: 3 hrs (100 MB/s)
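The "3 hrs" figure can be checked with a short calculation. The sketch below assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and a constant sequential read rate. The function name `read_time_hours` is made up for this illustration and does not come from the course material.

```python
# Back-of-the-envelope check of the "3 hrs to read 1 TB" figure.
# Assumption: decimal units (1 TB = 10**12 bytes, 1 MB = 10**6 bytes)
# and a constant sequential throughput.

MB = 10**6
TB = 10**12
PB = 10**15


def read_time_hours(size_bytes: float, throughput_bytes_per_s: float) -> float:
    """Hours needed to scan size_bytes once at a fixed throughput."""
    seconds = size_bytes / throughput_bytes_per_s
    return seconds / 3600


# One 1 TB disk read at 100 MB/s
one_tb_hours = read_time_hours(1 * TB, 100 * MB)
print(f"1 TB at 100 MB/s: {one_tb_hours:.2f} hours")  # 2.78 hours, i.e. about 3 hrs

# The same single-disk rate applied to the quoted 20 PB/day
twenty_pb_days = read_time_hours(20 * PB, 100 * MB) / 24
print(f"20 PB at 100 MB/s: {twenty_pb_days:.0f} days")  # 2315 days
```

The second number shows why data at this scale has to be read in parallel across many disks: a single disk at 100 MB/s would need more than six years to scan one day of the quoted 20 PB.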
What is Data Science?

• An area that manages, manipulates, extracts, and interprets knowledge from tremendous amounts of data
• Data science (DS) is a multidisciplinary field of study whose goal is to address the challenges of big data
• Data science principles apply to all data – big and small
• Data science uses techniques such as machine learning and artificial intelligence.

https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century/
Types of Data We Have

• Relational Data (Tables/Transaction/Legacy Data)
• Text Data (Web)
• Semi-structured Data (XML)
• Graph Data
– Social Network, Semantic Web (RDF), …
• Streaming Data
– You can afford to scan the data once
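The "scan the data once" constraint on streaming data can be illustrated with a single-pass statistic. The sketch below uses Welford's online algorithm to compute the mean and population variance without storing the stream. It is one possible illustration, not a method taken from the course, and the `sensor_readings` generator and its values are invented for the example.

```python
from typing import Iterable, Iterator, Tuple


def running_mean_variance(stream: Iterable[float]) -> Tuple[int, float, float]:
    """Single-pass mean and population variance (Welford's online algorithm).

    Each value is seen exactly once and then discarded, so memory use
    stays constant no matter how long the stream is.
    """
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in stream:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    variance = m2 / count if count else 0.0
    return count, mean, variance


def sensor_readings() -> Iterator[float]:
    """Hypothetical stream; a generator can only be consumed once."""
    for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
        yield value


count, mean, variance = running_mean_variance(sensor_readings())
print(count, round(mean, 6), round(variance, 6))
```

For these eight values the mean is 5 and the population variance is 4, which matches what a two-pass computation over a stored list would give.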
Data Science
Why is it sexy?
• Gartner’s 2014 Hype Cycle
What To Do With These Data?

• Aggregation and Statistics
– Data warehousing and OLAP
• Knowledge discovery
– Data Mining
– Statistical Modeling
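As a rough illustration of the "Aggregation and Statistics" item, the sketch below performs an OLAP-style roll-up: the same transactions are summarized along two different dimensions. The `sales` records and the `roll_up` helper are hypothetical and use only the Python standard library. A real data warehouse would run this kind of query in SQL or a dedicated OLAP engine.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical transaction records: (region, product, amount)
sales = [
    ("North", "laptop", 1200.0),
    ("North", "phone", 800.0),
    ("South", "laptop", 1000.0),
    ("South", "phone", 600.0),
    ("South", "phone", 400.0),
]


def roll_up(records, key_index):
    """Group records by one dimension and summarize the amounts."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key_index]].append(record[2])
    return {
        key: {"count": len(amounts), "total": sum(amounts), "mean": mean(amounts)}
        for key, amounts in groups.items()
    }


by_region = roll_up(sales, 0)   # roll up along the region dimension
by_product = roll_up(sales, 1)  # roll up along the product dimension

print(by_region["North"])   # {'count': 2, 'total': 2000.0, 'mean': 1000.0}
print(by_product["phone"])  # {'count': 3, 'total': 1800.0, 'mean': 600.0}
```

Aggregation of this kind answers questions that were asked in advance. Knowledge discovery (data mining, statistical modeling) looks for patterns nobody specified beforehand.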
Real Life Examples

• Companies learn your secrets, shopping patterns, and preferences
– For example, can we know if a woman is pregnant, even if she doesn’t want us to know? (Target case study)
• Data science and elections (2008, 2012)
– 1 million people installed the Obama Facebook app that gave access to info on “friends”
Concentration in Data Science
• Mathematics and Applied Mathematics
• Applied Statistics/Data Analysis
• Solid Programming Skills (R, Python, Julia, SQL)
• Data Mining
• Database Storage and Management
• Machine Learning and discovery
Thank You
