NoSQL Column-Family Stores
Report for the course Advanced Database Systems
NoSQL - Not Only SQL
Instructor: Dr. Nguyễn Trần Minh Thư
Group 07:
1. 19C11015 - Đỗ Huy Gia Cát
2. 21C12003 - Đào Thanh Danh
3. 21C11026 - Nguyễn Thành Thái
Wide Column / Column Family Database
• Column-family stores are databases that store data as key-value
mappings, where the values are grouped into multiple column families,
each of which is a map of data
• Keyword comparison between RDBMS and Cassandra
RDBMS                         Cassandra
Database instance             Cluster
Database                      Keyspace
Table                         Column Family
Row                           Row
Column (same for all rows)    Column (can be different per row)
Column Family Database
Column Family Unit Structure Storage
• Column: the basic storage unit. It
consists of a name-value pair in which
the name also acts as the key, and it is
stored together with a timestamp value
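The column described above can be sketched in Python as a small immutable record. This is only an illustration: the class name `Column`, the helper `newer`, and the sample values are hypothetical, and using the timestamp to keep the most recent version reflects the usual last-write-wins behaviour rather than anything stated on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """One column: a name (which doubles as the key), a value, and a write timestamp."""
    name: str
    value: str
    timestamp: int  # assumed unit: microseconds since the epoch


def newer(a: Column, b: Column) -> Column:
    """Return whichever version of the same column was written last (hypothetical helper)."""
    if a.name != b.name:
        raise ValueError("can only compare two versions of the same column")
    return a if a.timestamp >= b.timestamp else b


# Two versions of the same column; the one with the larger timestamp is kept.
old = Column("email", "user@example.com", timestamp=1_000)
new = Column("email", "user@example.org", timestamp=2_000)
latest = newer(old, new)
print(latest.value)
```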
Column Family Unit Structure Storage
• Standard column family: a column
family in which all columns are
simple columns
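One way to picture a standard column family is as a map from row key to a map of simple columns. The sketch below assumes that picture; the `users` data, the type aliases, and the helper functions are all made up for illustration. It also shows that two rows of the same column family need not have the same columns.

```python
from typing import Dict, List, Optional, Tuple

SimpleColumn = Tuple[str, int]        # (value, timestamp)
Row = Dict[str, SimpleColumn]         # column name -> simple column
ColumnFamily = Dict[str, Row]         # row key -> row

# Hypothetical column family: the two rows carry different columns.
users: ColumnFamily = {
    "user-1": {"name": ("Alice", 1), "city": ("Hanoi", 1)},
    "user-2": {"name": ("Bob", 2), "email": ("bob@example.com", 2)},
}


def get_value(cf: ColumnFamily, row_key: str, column_name: str) -> Optional[str]:
    """Look up one column value, or None if the row or column is absent."""
    column = cf.get(row_key, {}).get(column_name)
    return None if column is None else column[0]


def column_names(cf: ColumnFamily, row_key: str) -> List[str]:
    """List the column names actually present in one row."""
    return sorted(cf.get(row_key, {}))


print(column_names(users, "user-1"))
print(column_names(users, "user-2"))
```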
Cassandra's Features
• Consistency
• Transactions
• Availability
• Scaling
Consistency
• Cassandra stores replicas on multiple nodes to ensure reliability
• Commonly used consistency levels in Cassandra include:
ONE: only one replica needs to respond to the request; good for
high write performance requirements
QUORUM: a majority of the replicas must respond to the request
ALL: all replicas must respond to the request
• If a node is down, the data is written to it later, once it comes back,
via hints (hinted handoff) or the repair command.
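The consistency levels and hinted handoff described above can be simulated in a few lines of Python. This is a toy model under stated assumptions, not Cassandra's implementation: `Replica`, `write`, `replay_hints`, and the in-memory `hints` dictionary are hypothetical names, and a quorum is taken to be `replication_factor // 2 + 1`.

```python
from typing import Dict, List, Tuple


def required_acks(level: str, replication_factor: int) -> int:
    """Number of replica acknowledgements each consistency level waits for."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1   # strict majority
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unknown consistency level: {level}")


class Replica:
    def __init__(self, name: str) -> None:
        self.name = name
        self.up = True
        self.data: Dict[str, str] = {}


Hints = Dict[str, List[Tuple[str, str]]]   # replica name -> missed writes


def write(replicas: List[Replica], key: str, value: str, level: str, hints: Hints) -> bool:
    """Write to every live replica and queue a hint for each replica that is down.

    Returns True when enough replicas acknowledged for the requested level.
    Live replicas keep the value even when the level is not met (no rollback).
    """
    acks = 0
    for replica in replicas:
        if replica.up:
            replica.data[key] = value
            acks += 1
        else:
            hints.setdefault(replica.name, []).append((key, value))
    return acks >= required_acks(level, len(replicas))


def replay_hints(replica: Replica, hints: Hints) -> None:
    """Bring a replica back up and apply the writes it missed, in order."""
    replica.up = True
    for key, value in hints.pop(replica.name, []):
        replica.data[key] = value


replicas = [Replica("a"), Replica("b"), Replica("c")]
replicas[2].up = False                      # one of three replicas is down
hints: Hints = {}
ok_quorum = write(replicas, "k", "v1", "QUORUM", hints)  # 2 of 3 acks suffice
ok_all = write(replicas, "k", "v2", "ALL", hints)        # needs 3 acks, gets 2
replay_hints(replicas[2], hints)            # node "c" catches up via hints
print(ok_quorum, ok_all, replicas[2].data)
```

With three replicas and one node down, the QUORUM write succeeds while the ALL write does not, and the recovered node ends up with the latest value once its hints are replayed.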
Transactions
• In Cassandra, writes are atomic and isolated at the row level
• Atomic: inserting or updating columns in a row is treated as a single
write operation that either succeeds or fails as a whole
• Isolation: writes to a row are isolated to the client performing them
and are not visible to other users until they complete
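The two properties above can be illustrated with a toy in-memory row. This sketch says nothing about how Cassandra implements them internally; the `Row` class, its lock, and the copy-then-swap approach are assumptions chosen only to show "all columns or none" and "no partially applied write is visible".

```python
import threading
from typing import Dict


class Row:
    """Toy row whose multi-column writes are all-or-nothing (hypothetical model)."""

    def __init__(self) -> None:
        self._columns: Dict[str, str] = {}
        self._lock = threading.Lock()

    def write(self, updates: Dict[str, str]) -> None:
        """Apply every column update as one operation, or none of them."""
        with self._lock:
            staged = dict(self._columns)          # private copy, invisible to readers
            for name, value in updates.items():
                if not isinstance(value, str):
                    raise TypeError(f"column {name!r} needs a string value")
                staged[name] = value
            self._columns = staged                # publish with a single reference swap

    def read(self) -> Dict[str, str]:
        """Readers only ever see a fully applied version of the row."""
        return dict(self._columns)


row = Row()
row.write({"title": "Intro", "rating": "5"})
try:
    row.write({"title": "Changed", "rating": None})  # invalid second column
except TypeError:
    pass                                             # the whole write is rejected
print(row.read())
```

Because the failed write never reaches the swap, the row still holds the values from the first write.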
Availability
• Availability is governed by the formula
(R + W) > N
R, W: the minimum number of nodes that must respond successfully to a
read / write request; N: the number of replicas of the data
• Keyspaces should be configured according to your needs: higher
availability for reads or for writes
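The formula can be checked directly for a few settings. The sketch below assumes N = 3 replicas; the function name `read_write_overlap` is hypothetical. When R + W > N, every set of R replicas contacted by a read shares at least one node with every set of W replicas that acknowledged a write.

```python
def read_write_overlap(r: int, w: int, n: int) -> bool:
    """True when any R replicas read must include one of the W replicas written."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("R and W must be between 1 and N")
    return r + w > n


N = 3  # assumed replication factor
for r, w in [(1, 1), (2, 2), (1, 3), (3, 1)]:
    print(f"R={r} W={w} N={N} -> overlap: {read_write_overlap(r, w, N)}")
```

With N = 3, the setting R = 1, W = 3 keeps reads available while only one replica is reachable but requires every replica for a write; R = 3, W = 1 makes the opposite trade.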
Scaling
• Cassandra scales by adding nodes to the cluster
• Clusters can be scaled on the fly, without interrupting operations =>
maximum uptime
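Why adding a node does not disturb the rest of the cluster can be illustrated with a consistent-hashing ring. The sketch is a simplification under stated assumptions: one token per node, MD5 as the hash, and hypothetical node and row names. It is not Cassandra's partitioner, only the general idea behind it.

```python
import bisect
import hashlib
from typing import Dict, List


def token(value: str) -> int:
    """Map a string onto the ring (MD5 is used here purely for illustration)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Each node owns the keys whose tokens fall just before its own token."""

    def __init__(self, nodes: List[str]) -> None:
        self._tokens: List[int] = []
        self._owner: Dict[int, str] = {}
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        t = token(node)
        bisect.insort(self._tokens, t)
        self._owner[t] = node

    def node_for(self, key: str) -> str:
        # First node token at or after the key's token, wrapping around the ring.
        i = bisect.bisect_left(self._tokens, token(key)) % len(self._tokens)
        return self._owner[self._tokens[i]]


keys = [f"row-{i}" for i in range(1000)]
ring = Ring(["node-1", "node-2", "node-3"])
before = {k: ring.node_for(k) for k in keys}

ring.add_node("node-4")                      # scale out while the ring stays in use
after = {k: ring.node_for(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys changed owner")
```

Every key that changes owner moves to the new node; keys owned by the other nodes stay where they were, so the existing nodes keep serving them during the expansion.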
Database - Open-source NoSQL column-family store
- Stores non-relational data in the column-family model
Scalability - Scales by adding nodes
Replication - Replicates data across multiple nodes
Infrastructure
- Cassandra: independent design; can integrate with other DBMSs and with Storm and Hadoop
- HBase: based on Hadoop; can integrate with Zookeeper, HBase master, HBase data node, name node
Basic Queries
Cassandra:
• CQL - Cassandra Query Language
• CQLSH - Cassandra Query Language Shell
HBase:
• Supports only the HBase Shell
• Apache Phoenix -> query engine
https://data-flair.training/blogs/hbase-shell-commands/
Cassandra Query Language
• CREATE KEYSPACE <identifier> WITH <properties>
• CREATE KEYSPACE videodb WITH REPLICATION =
{ 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
• Replication strategy classes: SimpleStrategy, NetworkTopologyStrategy
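The statement above can be generated for either replication strategy with a small helper. This is a sketch: `create_keyspace_cql` is a hypothetical function that only builds the CQL text, and the datacenter names `dc1` and `dc2` are placeholders. With NetworkTopologyStrategy, the replication map lists a replica count per datacenter instead of a single `replication_factor`.

```python
from typing import Dict, Union


def _literal(value: Union[str, int]) -> str:
    """Render a Python value as a CQL literal (integers bare, strings quoted)."""
    if isinstance(value, int):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def create_keyspace_cql(name: str, replication: Dict[str, Union[str, int]]) -> str:
    """Build a CREATE KEYSPACE statement from a replication map."""
    options = ", ".join(f"'{key}' : {_literal(value)}" for key, value in replication.items())
    return f"CREATE KEYSPACE {name} WITH REPLICATION = {{ {options} }};"


# Same keyspace as on the slide: one replica, single-datacenter strategy.
simple = create_keyspace_cql(
    "videodb", {"class": "SimpleStrategy", "replication_factor": 1}
)

# Hypothetical multi-datacenter variant: 3 replicas in dc1 and 2 in dc2.
multi_dc = create_keyspace_cql(
    "videodb", {"class": "NetworkTopologyStrategy", "dc1": 3, "dc2": 2}
)

print(simple)
print(multi_dc)
```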
Cassandra Query Language
CREATE TABLE video_rating
CREATE (TABLE | COLUMNFAMILY)
<tablename> (
Cassandra Query Language
• User-Defined Data Type
CREATE TYPE <keyspace>.<type name>
(<field name> <data type>, <field name> <data type>)
Cassandra Query Language
SELECT clause, WHERE clause, and ORDER BY
Cassandra Query Language
DELETE FROM <table name>
USING <update parameter>
WHERE <identifier>
BEGIN BATCH
// data manipulation statements: INSERT, UPDATE, DELETE
APPLY BATCH;
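A batch wraps several data manipulation statements between BEGIN BATCH and APPLY BATCH. The helper below is a sketch that only assembles the CQL text: `build_batch` is a hypothetical function, and the table `ratings_demo` with its columns `id` and `score` is invented for the example and is not the schema used in this deck.

```python
from typing import List


def build_batch(statements: List[str]) -> str:
    """Wrap INSERT/UPDATE/DELETE statements in a single CQL batch."""
    if not statements:
        raise ValueError("a batch needs at least one statement")
    lines = ["BEGIN BATCH"]
    for statement in statements:
        # Normalise each statement so that it ends with exactly one semicolon.
        lines.append("  " + statement.strip().rstrip(";") + ";")
    lines.append("APPLY BATCH;")
    return "\n".join(lines)


batch = build_batch([
    "INSERT INTO ratings_demo (id, score) VALUES (1, 5)",
    "UPDATE ratings_demo SET score = 4 WHERE id = 1;",
    "DELETE FROM ratings_demo WHERE id = 2",
])
print(batch)
```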
Cassandra Query Language
• Advanced Queries and Indexing
Storing Data (Writes)
Retrieving Data (Reads)
Suitable Use Cases
• Column-family stores are a good choice for storing event information, such as
application state or errors encountered by the application
Conclusion