NoSQL Column-Family Stores

Column-family stores are NoSQL databases that map each row key to a set of columns grouped into column families. Cassandra is one example: it replicates data across multiple nodes for high availability and scalability, uses a column-family data model, and lets each request tune its consistency level. Cassandra suits applications that need high write performance and the ability to scale out by adding nodes.

Uploaded by

nguyentthai96

Ministry of Education and Training

University of Science, Ho Chi Minh City

Faculty of Information Technology

NoSQL
Column-Family Stores
Report for the course Advanced Database Systems
NoSQL - Not Only SQL
Supervisor: Dr. Nguyễn Trần Minh Thư
Group 07:
1. 19C11015 - Đỗ Huy Gia Cát
2. 21C12003 - Đào Thanh Danh
3. 21C11026 - Nguyễn Thành Thái
CONTENTS

• Column-Family Stores NoSQL
o Overview
o Column-Family Databases
• Cassandra's Structure and Features
• Comparing Column-Family Data Stores with Others
• Query Features
• Extended Analysis
• Scaling
• Comparing Cassandra and HBase
• Suitable Use Cases
Introduction

Wide Column / Column Family Database
• Column-family stores are databases that map each row key to values grouped into multiple column families, each of which is itself a map of data
• Terminology comparison between RDBMS and Cassandra

RDBMS                         Cassandra
Database instance             Cluster
Database                      Keyspace
Table                         Column Family
Row                           Row
Column (same for all rows)    Column (can differ per row)
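
The terminology mapping can be pictured as nested maps. The following is a minimal sketch in Python, not Cassandra's storage format; the names `cluster`, `users`, the row keys, and the column values are all made up for illustration.

```python
# Hypothetical data: keyspace -> column family -> row key -> {column name: value}.
cluster = {
    "videodb": {                      # keyspace (RDBMS: database)
        "users": {                    # column family (RDBMS: table)
            "alice": {"email": "alice@example.com", "city": "Hanoi"},
            "bob": {"email": "bob@example.com", "phone": "555-0100"},
        }
    }
}


def column_names(keyspace: str, column_family: str, row_key: str) -> set:
    """Return the set of column names stored for one row."""
    return set(cluster[keyspace][column_family][row_key])


# Unlike an RDBMS table, rows in one column family may carry different columns.
alice_cols = column_names("videodb", "users", "alice")
bob_cols = column_names("videodb", "users", "bob")
print(sorted(alice_cols ^ bob_cols))  # columns present in only one of the two rows
```

Both rows share `email`, while `city` and `phone` each exist in one row only, which is the "column can differ per row" entry of the comparison.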

Column Family Database

Column Family Unit Structure Storage
• Column: the basic storage unit. It consists of a name-value pair in which the name also acts as the key, and it is stored with a timestamp
• Super column: a column whose value is a map of columns
• Standard column family: a column family in which all columns are simple columns
• Super column family: a column family that contains at least one super column
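
These four definitions can be written down as small data types. This is an illustrative sketch only: the class names `Column` and `SuperColumn`, the helper `is_super_column_family`, and the sample rows are assumptions made for this example, not part of any Cassandra API.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Column:
    """Basic storage unit: a name-value pair plus a write timestamp."""
    name: str
    value: str
    timestamp: int


@dataclass
class SuperColumn:
    """A column whose value is a map of columns."""
    name: str
    columns: Dict[str, Column] = field(default_factory=dict)


ColumnFamilyRow = Dict[str, Union[Column, SuperColumn]]


def is_super_column_family(rows: Dict[str, ColumnFamilyRow]) -> bool:
    """True if at least one super column exists anywhere in the family."""
    return any(
        isinstance(col, SuperColumn)
        for row in rows.values()
        for col in row.values()
    )


# Hypothetical sample data for the two kinds of column family.
standard_cf = {"row1": {"title": Column("title", "Intro", 1)}}
super_cf = {
    "row1": {
        "address": SuperColumn("address", {"city": Column("city", "HCMC", 2)})
    }
}
print(is_super_column_family(standard_cf), is_super_column_family(super_cf))
# prints: False True
```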

Cassandra's Features
• Consistency
• Transactions
• Availability
• Scaling

Consistency
• Cassandra stores replicas on multiple nodes to ensure reliability
• Commonly used consistency levels include:
ONE: only one replica needs to respond to the request; good for high write performance requirements
QUORUM: a majority of the replicas must respond to the request
ALL: all replicas must respond to the request
• If a node is down, the data it missed is delivered later, when it comes back, via hints (hinted handoff) or the repair command.
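
As a rough illustration of the three levels, the number of replicas that must respond can be computed from the replication factor. The function name `required_acks` is hypothetical; the sketch only covers the levels named on this slide.

```python
def required_acks(level: str, replication_factor: int) -> int:
    """Number of replicas that must respond for a request to succeed."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1  # strict majority of the replicas
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {level}")


for level in ("ONE", "QUORUM", "ALL"):
    print(level, required_acks(level, replication_factor=3))
```

With three replicas, ONE needs 1 response, QUORUM needs 2, and ALL needs 3.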

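Transactions
• Cassandra does not offer transactions in the traditional sense; writes are atomic and isolated at the row level
• Atomic: inserting or updating columns in a row is treated as a single write operation, which either succeeds or fails as a whole
• Isolation: writes to a row are isolated to the client performing them and are not visible to other users until they complete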

Availability
• The trade-off between availability and consistency is governed by the formula
(R + W) > N
R: minimum number of nodes that must respond successfully to a read request; W: minimum number of nodes that must acknowledge a write request; N: number of replicas of the data
• Keyspaces should be set up according to your needs: higher availability for reads or for writes
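
A small worked check of the formula, as a sketch: when R + W > N, the set of replicas read always overlaps the set of replicas written, so a read reaches at least one replica holding the latest write. The function name `reads_see_latest_write` is an assumption for this example.

```python
def reads_see_latest_write(r: int, w: int, n: int) -> bool:
    """True when every read set overlaps every write set, i.e. R + W > N."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("R and W must be between 1 and N")
    return r + w > n


n = 3
print(reads_see_latest_write(r=2, w=2, n=n))  # quorum reads and quorum writes
print(reads_see_latest_write(r=1, w=1, n=n))  # fast, but a read may miss the newest write
print(reads_see_latest_write(r=1, w=3, n=n))  # cheap reads, writes need every replica
```

Lowering W keeps writes available when replicas are down, at the price of a larger R if reads must still see the latest data; lowering R does the opposite.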

Scaling
• Cassandra handles scaling by adding nodes to the cluster
• Clusters can be scaled on the fly without interrupting operations => maximum uptime
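
Cassandra places rows on a token ring using consistent hashing, which is why a node can join without reshuffling all data. The sketch below is a heavily simplified model under stated assumptions: one token per node, no replication, no virtual nodes, and MD5 as an illustrative hash rather than Cassandra's partitioner. The class `Ring` and the node and key names are hypothetical.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a position on the ring (MD5 is only an illustrative hash)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self, nodes):
        self._tokens = []   # sorted ring positions
        self._owners = {}   # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        t = token(node)
        bisect.insort(self._tokens, t)
        self._owners[t] = node

    def owner(self, key: str) -> str:
        """First node at or after the key's token, wrapping around the ring."""
        i = bisect.bisect_left(self._tokens, token(key))
        return self._owners[self._tokens[i % len(self._tokens)]]


keys = [f"video-{i}" for i in range(1000)]
ring = Ring(["node-a", "node-b", "node-c"])
before = {k: ring.owner(k) for k in keys}

ring.add_node("node-d")  # scale out while the ring stays in use
after = {k: ring.owner(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys changed owner")
```

Every key that changes owner moves to the new node; keys owned by the other nodes stay where they were, so only part of the data has to be streamed when the cluster grows.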

Comparing Cassandra and HBase

Database       Open-source NoSQL, column-family; stores non-relational data in a column-family model
Scalability    Scales out by adding nodes
Replication    Replicates data across multiple nodes

Criterion             Cassandra                                     HBase
Infrastructure        Independent design; can integrate with        Based on Hadoop; can integrate with
                      other DBMSs and with Storm, Hadoop            ZooKeeper, HBase master, HBase data node,
                                                                    name node
Ordered partitioning  Supported                                     Not supported
Nodes                 Multiple seed nodes in the cluster            A master node monitors and coordinates
                                                                    the other nodes
Query language        Cassandra Query Language (CQL);               Only supports the HBase shell
                      Cassandra Query Language Shell (CQLSH)

Basic Queries
Cassandra (CQL)
• Cassandra Query Language
• Cassandra Query Language Shell - CQLSH
HBase
• Only supports the HBase Shell
• Apache Phoenix -> query engine
https://data-flair.training/blogs/hbase-shell-commands/

Cassandra Query Language
• CREATE KEYSPACE <identifier> WITH <properties>
• CREATE KEYSPACE videodb WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
• Replication strategies: SimpleStrategy, NetworkTopologyStrategy

Cassandra Query Language
CREATE (TABLE | COLUMNFAMILY) <tablename>
('<column-definition>' , '<column-definition>')
(WITH <option> AND <option>)

USE videodb;

CREATE TABLE videos (
  videoid uuid,
  videoname varchar,
  username varchar,
  description varchar,
  location map<varchar,varchar>,
  tags set<varchar>,
  upload_date timestamp,
  PRIMARY KEY (videoid)
);

CREATE TABLE video_rating (
  videoid uuid,
  rating_counter counter,
  rating_total counter,
  PRIMARY KEY (videoid)
);

CREATE TABLE video_event (
  videoid uuid,
  username varchar,
  event varchar,
  event_timestamp timeuuid,
  video_timestamp bigint,
  PRIMARY KEY ((videoid, username), event_timestamp, event)
) WITH CLUSTERING ORDER BY (event_timestamp DESC, event ASC);
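
The primary key of `video_event` has two parts: the partition key `(videoid, username)` groups rows together, and the clustering columns `event_timestamp` and `event` order rows inside each group. The Python sketch below imitates that layout in memory; the sample events are invented, and plain integers stand in for the `timeuuid` column.

```python
from collections import defaultdict

# Hypothetical events: (videoid, username, event_timestamp, event).
# Plain integers stand in for the timeuuid column.
events = [
    ("v1", "alice", 100, "start"),
    ("v1", "alice", 300, "stop"),
    ("v1", "alice", 300, "pause"),
    ("v1", "bob", 200, "start"),
    ("v1", "alice", 200, "pause"),
]

# Partition key (videoid, username): rows sharing it are stored together.
partitions = defaultdict(list)
for videoid, username, ts, event in events:
    partitions[(videoid, username)].append((ts, event))

# Clustering order: event_timestamp DESC, then event ASC.
for rows in partitions.values():
    rows.sort(key=lambda row: (-row[0], row[1]))

print(partitions[("v1", "alice")])
# prints: [(300, 'pause'), (300, 'stop'), (200, 'pause'), (100, 'start')]
```

Reading the most recent events for one user and video then means reading the start of a single partition.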
Cassandra Query Language
• Built-In Data Types: boolean, int, bigint, varint, float, double, decimal, ascii, varchar, text, timestamp, blob, inet, timeuuid, uuid, ...
• Collection Data Types: LIST, SET, MAP
• User-Defined Data Types
Cassandra Query Language
• User-Defined Data Type
CREATE TYPE <keyspace>.<type name>
(<field name> <field type>, <field name> <field type>, ...)

CREATE TYPE records (
  name text,
  branch text,
  phone int,
  city text,
  id set<int>
);

Cassandra Query Language
SELECT Clause, WHERE Clause & ORDER BY

INSERT INTO <table name>
(<field name 1>, <field name 2>, <field name 3>, ...)
VALUES ('value 1', 'value 2', 'value 3', ...)
USING <update parameter>;

UPDATE <table name> USING <update parameter>
SET <field name 1> = <value 1>,
    <field name 2> = <value 2>,
    <field name 3> = <value 3>, ...
WHERE <field> = <value>;

Cassandra Query Language
DELETE FROM <table name>
USING <update parameter>
WHERE <identifier>;

BEGIN BATCH
// data manipulation statements: INSERT, UPDATE, DELETE
APPLY BATCH;

Cassandra Query Language
• Advanced Queries and Indexing
• CREATE INDEX <index name> ON <table name> (<field name>)
Secondary indexes are maintained locally on each node and perform well for low-cardinality column values.
• Other statements: USE, CREATE, ALTER, DROP, TRUNCATE, ...

Storing Data (Write Path)
• In memory: memtable
• On disk: SSTable

Retrieving Data (Read Path)
• In memory: memtable
• On disk: SSTable
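
The interplay of memtable and SSTables on both paths can be sketched as a toy store. This is a simplified model, not Cassandra's implementation: the class `TinyStore` and its `memtable_limit` parameter are hypothetical, SSTables are kept as in-memory tuples instead of files, and real mechanisms such as the commit log, bloom filters, and compaction are left out.

```python
class TinyStore:
    """Toy write/read path: a mutable memtable plus immutable SSTables."""

    def __init__(self, memtable_limit: int = 2):
        self.memtable = {}     # in memory, absorbs every write
        self.sstables = []     # flushed tables, oldest first
        self.memtable_limit = memtable_limit

    def write(self, key: str, value: str) -> None:
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self) -> None:
        """Persist the memtable as a sorted, read-only table and start fresh."""
        if self.memtable:
            self.sstables.append(tuple(sorted(self.memtable.items())))
            self.memtable = {}

    def read(self, key: str):
        """Check the memtable first, then SSTables from newest to oldest."""
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            for k, v in table:
                if k == key:
                    return v
        return None


store = TinyStore(memtable_limit=2)
store.write("a", "1")
store.write("b", "2")      # reaches the limit, so the memtable is flushed
store.write("a", "3")      # the newer value lives only in the memtable
print(store.read("a"), store.read("b"), len(store.sstables))
# prints: 3 2 1
```

Because a read consults the memtable before any SSTable, the newer value of `a` hides the older one that was already flushed.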

Suitable Use Cases
• Event logging: a great choice for storing event information, such as application state or errors encountered by the application
• Content Management Systems, Blogging Platforms
=> store blog entries with tags, categories, links, and trackbacks
• Counters: count and categorize visitors of a page to calculate analytics
• Expiring usage: data that is needed only for a specific time, such as ad banners on a website
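
The time-limited use case relies on columns that expire on their own; in CQL this is usually requested with `USING TTL <seconds>` on a write. The sketch below imitates that behaviour in plain Python. The class `ExpiringColumn`, its `now` parameter (used to keep the example deterministic), and the banner value are assumptions for illustration only.

```python
import time
from typing import Optional


class ExpiringColumn:
    """A value that stops being readable once its time-to-live has elapsed."""

    def __init__(self, value: str, ttl_seconds: float, now: Optional[float] = None):
        written_at = time.time() if now is None else now
        self.value = value
        self.expires_at = written_at + ttl_seconds

    def read(self, now: Optional[float] = None) -> Optional[str]:
        current = time.time() if now is None else now
        return self.value if current < self.expires_at else None


# Hypothetical ad banner that should disappear one hour after it is written.
banner = ExpiringColumn("spring-sale.png", ttl_seconds=3600, now=0)
print(banner.read(now=1800))   # still within its lifetime
print(banner.read(now=3600))   # expired, nothing is returned
```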
When Not to Use
• Systems that require ACID transactions for writes and reads
• Workloads that need the database to aggregate data using queries (such as SUM or AVG)
• Early product prototypes or initial tech spikes

Conclusion