Big Data Introduction

The document provides an overview of various data types including structured, semi-structured, quasi-structured, and unstructured data, along with their characteristics, merits, and demerits. It also discusses the concept of Big Data, its importance, and the 10 V's associated with it, as well as data processing and analytics tools like Hadoop, Hive, MADlib, and Pig. The document emphasizes the significance of understanding these data types for effective data management and analysis in organizations.

Uploaded by

qygsjxmc8k

Specialized Topics in Computer (CSC 411)
Introduction to Big Data and Data Types
STRUCTURED DATA, SEMI-STRUCTURED DATA, QUASI-STRUCTURED DATA AND UNSTRUCTURED DATA
Learning Objectives
• What is Data and Information?
• What is Digital Data?
• What is Big Data?
• 10 V’s of Big Data
• Data and Information Processing
• Structured Data
• Semi-structured Data
• Quasi-structured Data
• Unstructured Data
• Data Analytics Tools
• Conclusion: The Future of Data
What is Data and Information?
• The terms Data and Information are closely connected and it’s common for the two
terms to be used interchangeably.
• However, it’s vital to grasp the distinction between Data and Information.
• Data can be described as the quantities, characters, or symbols on which operations
are performed by a computer, being stored and transmitted in the form of electrical
signals and recorded on magnetic, optical, or mechanical recording media.
• Data is a collection of discrete values that convey information, describing quantity,
quality, fact, statistics, other basic units of meaning, or simply sequences of symbols
that may be further interpreted.
• Data is a statement of fact about an entity.
What is Data and Information?
• Information is data that has been processed in such a way as to be meaningful to the person who receives it.
• It is processed data with a purpose.
• It is anything that is communicated.
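The distinction can be illustrated with a minimal Python sketch. The temperature readings, the `summarize` function, and the fever threshold are invented for illustration and do not come from the course material.

```python
from statistics import mean

# Raw data: discrete values with no context (hypothetical temperature readings in degrees Celsius).
readings = [36.4, 36.9, 38.2, 39.1, 37.0]


def summarize(values, fever_threshold=38.0):
    """Process raw values into information a person can act on."""
    return {
        "average": round(mean(values), 2),
        "highest": max(values),
        "fever_readings": sum(1 for v in values if v >= fever_threshold),
    }


info = summarize(readings)
print(info)
```

The list of numbers is data; the summary is information because it has been processed for a purpose (here, spotting fever).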
Big Data
What is Big Data?
• The term big data started to show up sparingly in
the early 1990s, and its prevalence and importance
increased exponentially as years passed.
• Nowadays big data is often seen as integral to a
company's data strategy.
• Big Data is a broad term used to refer to the huge
volume of digital information generated by
various businesses.
• This big data is generated not only by traditional information exchange and software, but also by sensors of various types embedded in a variety of environments, such as hospitals, metro stations, and markets, and by virtually every electrical device that produces data.
What is Big Data?
• Big Data puts an inordinate focus on the issue of information volume. It exceeds the capacity of traditional data management technologies, creating the need for new tools and technologies to handle the extremely large volume.
• The challenge lies not only in storing large volumes of data but also in building new capabilities for analyzing it.
10 V’s of Big Data
• Volume
• Velocity
• Variety
• Variability
• Veracity
• Value
• Validity
• Vulnerability
• Volatility
• Visualization
What is Digital Data?
• In the computing world, digital data is considered a collection of facts that is transmitted and saved in an electronic format and processed through software systems.
• Digital data is generated by various devices, like desktops, laptops, tablets, mobile phones, and electronic sensors.
• Digital data is stored as strings of binary values (0s and 1s) on a storage medium that is either internal or external to the devices generating or accessing the information.
What is Digital Data?
• The storage devices can be of various types, like magnetic, optical, or solid-state storage devices.
• Examples of digital data are electronic
documents, text files, e-mails, e-books, digital
pictures, digital audio, and digital video.
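As a small illustration of data being stored as strings of binary values, the Python sketch below encodes a piece of text as bits and decodes it again. The helper names `to_bits` and `from_bits` are hypothetical, and UTF-8 is assumed as the text encoding.

```python
def to_bits(text: str) -> str:
    """Encode text as UTF-8 and render each byte as 8 binary digits."""
    return " ".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def from_bits(bits: str) -> str:
    """Reverse of to_bits: parse the groups of 0s and 1s back into text."""
    return bytes(int(group, 2) for group in bits.split()).decode("utf-8")


encoded = to_bits("Hi")
print(encoded)             # 01001000 01101001
print(from_bits(encoded))  # Hi
```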
Data and Information Processing
• Processing and analyzing information is significant and critical to any
organization.
• It allows organizations to derive value from information to take intelligent
decisions and improve organizational effectiveness.
• It is easier to analyze structured data because it is stored in an organised format.
• On the other hand, processing non-structured data and extracting value from it using traditional applications is difficult, time-consuming, and requires more hardware resources.
• New architectures, technologies, and techniques have emerged for storing, managing, analyzing, and deriving value from unstructured information coming from various sources.
Data Types
Structured Data Type
• Structured data is stored in relational databases such as Microsoft SQL Server and Oracle, where data is organised in rows and columns within named tables.
• It is highly specific and is stored in a predefined format.
• Structured data also adheres to predefined rules for formatting and labeling information.
• It consists of clearly defined data types with patterns that make them easily searchable.
• It usually resides in relational database management systems (RDBMS). Fields store length-delimited data like phone numbers, Social Security numbers, or ZIP codes, and records can also contain text strings of variable length like names, making it a simple matter to search.
Structured Data
• Data may be human- or machine-generated, as long as the data is created
within an RDB structure.
• This format is eminently searchable, both with human-generated queries
and via algorithms using types of data and field names, such as
alphabetical or numeric, currency, or date.
• Common relational database applications with structured data include
airline reservation systems, inventory control, sales transactions, and ATM
activity.
• Structured Query Language (SQL) enables queries on this type of
structured data within relational databases.
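A minimal sketch of querying structured data with SQL, using Python's built-in `sqlite3` module. The `sales` table, its columns, and the sample rows are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    "CREATE TABLE sales ("
    "id INTEGER PRIMARY KEY, customer TEXT NOT NULL, "
    "amount REAL NOT NULL, sold_on TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO sales (customer, amount, sold_on) VALUES (?, ?, ?)",
    [
        ("Ada", 120.50, "2024-01-03"),
        ("Bayo", 75.00, "2024-01-03"),
        ("Ada", 30.25, "2024-01-04"),
    ],
)

# Every row has the same named columns, so a query can filter and
# aggregate by field name without inspecting individual records.
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM sales GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
```

The same pattern applies to the sales, inventory, and reservation systems listed above: the fixed columns are what make the query possible.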
Characteristics of Structured Data
• The structured data conform to a data model with a predefined
structure.
• Data is organized into entities such as tables, and these tables
are linked together using relationships.
• All data stored in a table column have similar attributes. For
example, if a table contains the [FirstName] column as string data,
it will always store the string data for all records in the column.
• It does not allow dynamic structure change for a specific record.
Merits of Structured Data
• The fixed and well-defined schema simplifies management, reduces storage requirements, and eases access to the data.
• The data can be indexed based on its attributes. Indexing helps to read data from a database quickly.
• Data security can be implemented at a granular level, i.e., row, column, or table.
• Structured data can be consumed easily by machine learning algorithms, so you can quickly do data manipulation and calculations.
• You can perform Business Intelligence operations with access to a wider range of tools.
• Structured data enables users to understand and analyze different data relationships quickly.
Demerits of Structured Data
• You need to define the schema well in advance, covering all data requirements. If you need an additional column, the structure must be modified for all records in the table. Therefore, structured data is less flexible.
• It can be used only for its intended purpose, which limits the business use cases.
• Limitations on use: Due to the organization style of structured data, it is more difficult to achieve flexibility or varied use cases.
• Limited storage: Structured data is stored in specific spaces of data warehouses. While accessing the data is easy, scalability can be difficult, and changes within data warehouses can become hard to manage. Using cloud data centers helps with the storage problems.
• High overhead: Data centers or other storage for structured data can become expensive and add to the burden of managing structured data. Again, cloud data centers are recommended, but the storage can still require significant work to keep the data properly maintained.
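The inflexibility of a fixed schema can be sketched with Python's built-in `sqlite3` module. The `students` table and its columns are hypothetical; other database systems behave similarly in principle but differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, first_name TEXT NOT NULL)")
conn.executemany("INSERT INTO students (first_name) VALUES (?)", [("Amina",), ("Chidi",)])

# A single record cannot carry an extra field on its own: the insert is rejected
# because the table has no such column.
try:
    conn.execute("INSERT INTO students (first_name, nickname) VALUES (?, ?)", ("Eze", "E"))
    rejected = False
except sqlite3.OperationalError:
    rejected = True

# Only one student has a phone number, but the column has to be added to the
# table schema, so every existing row gains it (as NULL).
conn.execute("ALTER TABLE students ADD COLUMN phone TEXT")
conn.execute("UPDATE students SET phone = ? WHERE first_name = ?", ("555-0100", "Amina"))

rows = conn.execute("SELECT id, first_name, phone FROM students ORDER BY id").fetchall()
print(rejected, rows)
```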
Examples of Structured Data
• Spreadsheets
• Relational databases such as Microsoft SQL Server, Oracle
• Online Transaction Processing (OLTP) systems
• Sales transactions
• ZIP codes
• Phone numbers
• Email addresses
• ATM activity
• Inventory control
• Student fee payment databases
• Airline reservation and ticketing
Semi-structured Data
• This type of data does not conform to a standard data model, but it has clear self-describing patterns and structure.
• These structural properties allow quick data analysis.
• It can be considered as a combined version of Structured and
Unstructured Data.
• Examples of semi-structured data are Excel spreadsheets that have
a row and column structure and XML files that are defined by an
XML schema.
Examples of Semi-structured Data
• Emails: Emails are an excellent example of semi-structured data. They have different fields for sender, recipients, date, subject, and importance, and can be easily categorized into different folders such as Inbox, Sent, Spam, and Promotions.
• The markup language XML has a set of document encoding rules for defining formats that are both human- and machine-readable.
• JavaScript Object Notation (JSON) offers a semi-structured data interchange format. It can be used for transmitting data between web servers and applications. It is widely popular for data exchange and is supported by various relational and non-relational databases.
• NoSQL databases (MongoDB, DocumentDB, Couchbase) use a flexible data model that can be used with semi-structured data for storing, importing, and exporting.
[Figure: semi-structured data containing student records in JSON format.]
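A minimal Python sketch of semi-structured student records in JSON. The records and field names here are invented for illustration and are not the ones from the original slide image.

```python
import json

# Hypothetical student records; the field names are illustrative only.
raw = """
[
  {"id": 1, "name": "Amina", "courses": ["CSC 411", "CSC 405"]},
  {"id": 2, "name": "Chidi", "courses": ["CSC 411"], "email": "chidi@example.com"}
]
"""

students = json.loads(raw)

# Keys describe the values (self-describing), but the records
# do not have to share exactly the same set of fields.
for student in students:
    email = student.get("email", "n/a")
    print(student["name"], len(student["courses"]), email)

field_sets = [sorted(s.keys()) for s in students]
```

The second record carries an `email` field the first one lacks, which a fixed relational table would not allow without a schema change.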
Quasi-structured Data
• Quasi-structured data consists of textual content with erratic data formats. It can be formatted, but only with effort, software tools, and time.
• This data type includes web clickstream data, such as Google searches.
• An example of quasi-structured data is the record of which webpages a user visited and in what order.
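As a sketch of the effort involved, the Python snippet below turns a made-up clickstream into uniform records using the standard `urllib.parse` module. The URLs and the `to_record` helper are assumptions for illustration.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical clickstream: URLs a user visited, in order.
clicks = [
    "https://www.google.com/search?q=big+data&hl=en",
    "https://example.com/articles/hadoop?ref=search",
    "https://example.com/articles/hive",
]


def to_record(position, url):
    """Pull consistent fields out of an inconsistently shaped URL."""
    parts = urlparse(url)
    query = parse_qs(parts.query)
    return {
        "order": position,
        "host": parts.netloc,
        "path": parts.path,
        "search_terms": query.get("q", [None])[0],  # present only on search URLs
    }


records = [to_record(i, url) for i, url in enumerate(clicks, start=1)]
for record in records:
    print(record)
```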
Unstructured Data
• This type of data doesn’t have an information model and isn’t organized in any specific format.
• It does not contain a predefined schema structure and does not belong to a data model.
• Therefore, we cannot store it in relational databases.
• We can use non-relational databases such as MongoDB, Couchbase, Apache Cassandra, Redis, DocumentDB for storing unstructured data.
• It might have internal structural elements, but it does not store information in a predefined schema table format.
• It allows dynamic data generation and storage.
• Some samples of unstructured data are e-mails, displays, images, text documents, PDF files and videos.
• Approx 90% of the digital data generated these days is non-structured data, which is either semi-, quasi-, or unstructured data.
Characteristics of Unstructured Data
• It works with data that does not have a specific format or sequence.
• You do not define a specific schema or structure for data storage.
• It allows dynamic data storage for individual records.
• Data is portable and scalable.
Merits of Unstructured Data
• As unstructured data does not have predefined rules, you can use it for more than one intended purpose.
• It is quick to adapt the unstructured data because it uses dynamic schema, and you do not need to edit all records for updating a single record.
• It can work efficiently with the heterogeneity of sources.
Disadvantages of Unstructured Data
• You need more experienced people, such as data analysts and data scientists, to work with unstructured data and draw value from it.
• You need specific data management tools for data analysis.
• Indexing unstructured data is complex and prone to error due to its flexible structure and lack of predefined attributes.
• Its storage cost is high compared to structured data.
Examples of Unstructured Data
• According to a recent report, 80% to 90% of enterprise data is unstructured.
• This underlines the importance and criticality of working with unstructured data.
• Emails: The email body or message is a popular form of unstructured data that we use daily for email communication.
• Documents: Word files, spreadsheets, PDF, PowerPoint presentations.
• Websites: YouTube, Facebook, Instagram, and LinkedIn content can contain unstructured data such as social media messages.
• Media files: All sorts of media files such as images, audio, video.
• Communication: Mobile communication data, SMS messages, location data, live chat, IM, collaboration software.
• Books, magazines, articles, blogs, press releases, medical records (X-rays, ECG or imaging data).
• Scientific research data.
• Satellite imagery and sensor data.
Differences Between Structured and Unstructured Data
• Structured data is highly specific in comparison to unstructured data.
• Structured data is stored in a predefined schema or format, whereas unstructured data is a conglomeration of many different types of information.
• Structured data has a fixed schema and is referred to as organized data.
• The information can usually be searched for and processed easily in a database.
• However, if any information does not comply with the schema requirements, it cannot be stored in the database.
• Unstructured data offers flexibility and scalability without defining a fixed schema before working with any document.
• It allows storing data in various formats.
• However, it is somewhat more challenging to work with than structured data.
Structured vs. Unstructured Data: Comparison Table
The following table summarizes the difference between structured and
unstructured data.
How to Convert Unstructured Data into
Structured Data
• The data conversion process is time-consuming and requires experienced personnel.
• It might involve the following phases:
• Define your structured data requirements.
• Data cleansing: removing duplicates and cleaning up columns.
• Refine the data.
• The data conversion might use machine learning models with Python or R services, or third-party tools such as Azure Data Factory, log parser tools, Cogito Semantic Technology, Zoho Analytics, SAS Viya, TextMiner, and RapidMiner.
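The phases above can be sketched in plain Python for a simple case. The log lines, their format, and the `convert` helper are assumptions for illustration; real conversions usually need far more cleansing than this.

```python
import re

# Hypothetical free-text log lines; the format is an assumption for this sketch.
raw_lines = [
    "2024-01-03 10:15:02 ERROR disk full on node7",
    "2024-01-03 10:15:02 ERROR disk full on node7",  # duplicate
    "  2024-01-03 10:16:40 warn  high latency on node2  ",
    "unparseable line",
]

# Phase 1: define the target structure (date, time, level, message).
PATTERN = re.compile(r"^(\d{4}-\d{2}-\d{2})\s+(\d{2}:\d{2}:\d{2})\s+(\w+)\s+(.*)$")
FIELDS = ("date", "time", "level", "message")


def convert(lines):
    records, seen = [], set()
    for line in lines:
        match = PATTERN.match(line.strip())
        if match is None:
            continue  # the line cannot be mapped to the target structure
        date, time, level, message = match.groups()
        # Phases 2 and 3: cleanse and refine (normalize case, drop duplicates).
        record = (date, time, level.upper(), message.strip())
        if record not in seen:
            seen.add(record)
            records.append(dict(zip(FIELDS, record)))
    return records


structured = convert(raw_lines)
for row in structured:
    print(row)
```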
Hadoop Data Analytics Tool
• It is an open-source framework for distributed storage and
processing of large sets of data on commodity hardware.
• It enables businesses to quickly gain insight from massive
amounts of structured and unstructured data.
• It is designed to scale from a single server to thousands of
machines, with a very high degree of fault tolerance.
• Rather than relying on high-end hardware, the resiliency of
these clusters comes from the software’s ability to detect and
handle failures at the application layer.
Benefits of Hadoop Data Analytics Tool
• Hadoop enables a computing solution that is:
• Scalable: New nodes can be added as needed without changing the data formats,
how data is loaded, how jobs are written, or the applications on the top.
• Cost-effective: Hadoop brings massively parallel computing to commodity
servers. The result is a sizeable decrease in the cost per terabyte of storage,
which in turn makes it affordable to model all your data.
• Flexible: Hadoop is schema-less and can absorb any type of data from any
number of sources. Data from multiple sources can be joined and aggregated in
arbitrary ways enabling deeper analyses than any one system can provide.
• Fault tolerant: When a node is lost, the system redirects work to another
location of the data and continues processing without missing a beat.
Hive Data Analytics Tool
• Hive data warehouse software facilitates querying and
managing large datasets residing in distributed storage.
• It provides a mechanism to project structure onto this
data and query the data using a SQL-like language called
HiveQL.
• At the same time, this language also allows traditional
map/reduce programmers to plug in their custom mappers and reducers.
MADlib Data Analytics Tool
• MADlib is an open-source library for scalable in-
database analytics that can help improve data analysis
efficiency and accuracy.
• It provides data parallel implementations of
mathematical, statistical, and machine-learning
methods for structured and unstructured data.
• These SQL-based algorithms for machine learning, data
mining, and statistics run at speed and scale.
Pig Data Analytics Tool
• Pig is a platform for analysing large data sets that consists of a
high-level language for expressing data analysis programs,
coupled with infrastructure for evaluating these programs.
• The salient property of Pig programs is that their structure is
amenable to substantial parallelization, which in turn enables
them to handle very large data sets.
• At the present time, Pig's infrastructure layer consists of a
compiler that produces sequences of MapReduce programs, for
which large-scale parallel implementations already exist (e.g. the
Hadoop subproject).
Pig Data Analytics Tool
• Pig's language layer currently consists of a textual language called Pig
Latin, which has the following key properties:
• Ease of programming: It is trivial to achieve parallel execution of simple,
"embarrassingly parallel" data analysis tasks. Complex tasks comprised of
multiple interrelated data transformations are explicitly encoded as data
flow sequences, making them easy to write, understand, and maintain.
• Optimization opportunities: The way tasks are encoded permits the system to
optimize their execution automatically, allowing the user to focus on semantics
rather than efficiency.
• Extensibility: Users can create their own functions to do special-purpose
processing.
MapReduce Data Analytics Tool
• MapReduce is a software framework that allows developers to write programs that process massive amounts of unstructured data in parallel across a distributed cluster of processors or standalone computers.
• The framework is divided into two parts:
• Map: A function that parcels out work to different nodes in the distributed cluster.
• Reduce: A function that collates the work and resolves the results into a single value.
• The map job takes a set of data and converts it into another set of data, where individual elements are broken down into tuples (key/value pairs).
MapReduce Data Analytics Tool
• The reduce job takes the output from a map as input and
combines those data tuples into a smaller set of tuples.
• As the sequence of the name MapReduce implies, the
reduce job is always performed after the map job.
• MapReduce is important because it allows ordinary
developers to use MapReduce library routines to create
parallel programs without having to worry about
programming for intra-cluster communication, task
monitoring, or failure handling.
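The map and reduce jobs described above can be sketched as a single-process word count in plain Python. This is only a simulation of the idea: it does not use the Hadoop API, the function names are hypothetical, and a real framework would run the phases in parallel across cluster nodes.

```python
from collections import defaultdict


def map_phase(document):
    """Map: break a document into (key, value) tuples, here (word, 1)."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group values by key, as the framework would do between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a smaller result."""
    return key, sum(values)


documents = ["big data is big", "data is everywhere"]
mapped = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
print(counts)
```

The developer writes only `map_phase` and `reduce_phase`; grouping by key, distribution, and failure handling are what the framework takes care of.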
Conclusion: The Future of Data
• Data is at the heart of business in today’s digital world, whether we are business professionals or consumers.
• Data is collected at every moment, and it forms the basis of many of our decisions.
• In the future, data may take on a more significant role in our lives, and it will likely be used in new ways.
• Each organization holds structured, semi-structured, and unstructured data.
• You might convert between data formats for import and export, or consume the data in a standard format.
References
• https://blog.skyvia.com/structured-vs-unstructured-data/
• https://www.datamation.com/big-data/structured-vs-unstructured-data/
• https://www.mycloudwiki.com/san/data-and-information-basics/
• The 10 Vs of Big Data | Transforming Data with Intelligence (tdwi.org)
