UPPSC AE ICT by Swati Ma'Am
This Chapter deals with hardware, software, data representation and Boolean logic.
Let us discuss them one by one.
1. Hardware
• Computer hardware includes the physical
parts of a computer, such as the central
processing unit (CPU), random access
memory (RAM), motherboard, computer
data storage, graphics card, sound card,
and computer case.
• It includes external devices such as a monitor, mouse, keyboard, and speakers.
CPU
• A CPU (Central Processing Unit) is the primary component of any computer or
electronic device.
• It is responsible for carrying out the instructions of the programs that the user runs.
• The CPU acts as the “brain” of the computer. It reads and interprets commands
from software programs and uses them to control other components within the
machine.
a. Elements of a CPU
• Control Unit (CU)
• Arithmetic Logic Unit (ALU)
• Registers
• Cache
• Bus Interface Unit (BIU)
• Instruction Decoder
b. Functions of a CPU
• Fetching Instructions
• Decoding Instructions
• Executing Instructions
• Managing Registers
• Controlling Program Flow
• Handling Interrupts
• Managing Caches
• Coordinating with Other System Components
• Arithmetic and Logic Operations
• Virtual Memory Management
• I/O Operations
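The fetch, decode, and execute functions listed above can be illustrated with a toy simulation. This is only a sketch: the three-instruction set (LOAD, ADD, HALT), the single accumulator register, and the function name run are hypothetical and do not correspond to any real CPU.

```python
# Toy fetch-decode-execute loop.
# The opcodes LOAD, ADD, and HALT are invented for illustration only.

def run(program):
    acc = 0   # accumulator register holding the working value
    pc = 0    # program counter: index of the next instruction
    while True:
        instruction = program[pc]       # fetch the instruction
        opcode, operand = instruction   # decode it into opcode and operand
        pc += 1                         # control program flow: move to the next one
        if opcode == "LOAD":            # execute
            acc = operand
        elif opcode == "ADD":
            acc += operand
        elif opcode == "HALT":
            return acc
        else:
            raise ValueError(f"unknown opcode: {opcode}")


result = run([("LOAD", 2), ("ADD", 3), ("HALT", None)])
print(result)  # prints 5
```

A real CPU performs the same cycle in hardware, with the Control Unit sequencing the steps and the ALU doing the arithmetic.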
Computer Memory
• To save data and instructions, memory is required.
• Memory is divided into a large number of small parts called cells, which together make
up the storage space of the computer.
• Every cell has its unique location/address.
• Memory is essential to a computer, much as it is to the human brain.
• The following are the types of memory.
RAM
• RAM stands for Random Access Memory, which is a type of computer memory
that stores data temporarily for quick access by the CPU.
• RAM is essential for a computer to run applications and open files.
• It is a form of electronic computer memory that can be read and changed in any
order, typically used to store working data and machine code.
• A random-access memory device allows data items to be read or written in almost
the same amount of time irrespective of the physical location of data inside.
• RAM is further divided into two types, SRAM: Static Random Access Memory
and DRAM: Dynamic Random Access Memory.
SRAM (Static Random Access memory)
• The function of SRAM is that it provides a direct interface with the Central
Processing Unit at higher speeds.
• SRAM is used as the Cache memory inside the computer.
• SRAM is the fastest type of RAM.
• SRAM is costlier.
• SRAM has a lower density (number of memory cells per unit area).
• The power consumption of SRAM is lower, but when it is operated at higher
frequencies, its power consumption is comparable to that of DRAM.
SRAM | DRAM
Stands for Static Random Access Memory | Stands for Dynamic Random Access Memory
More power is required | Less power is required
More expensive | Less expensive
Faster | Slower
ROM
• ROM, which stands for read only memory, is a memory device or storage medium
that stores information permanently.
• It is also the primary memory unit of a computer along with the random access
memory (RAM).
• It is called read only memory as we can only read the programs and data stored on
it but cannot write on it.
• It is restricted to reading words that are permanently stored within the unit.
• The manufacturer of ROM fills the programs into the ROM at the time of
manufacturing the ROM. After this, the content of the ROM can't be altered.
Storage Devices:
HDD
• A computer hard disk drive (HDD) is a non-volatile data storage device.
• Non-volatile refers to storage devices that maintain stored data when turned off.
• All computers need a storage device, and HDDs are just one example of a type of
storage device.
• HDDs are usually installed inside desktop computers, mobile devices, consumer
electronics and enterprise storage arrays in data centers.
• They can store operating systems, software programs and other files using magnetic
disks.
• The disk is divided into tracks.
• Each track is further divided into sectors.
• Note that the outer tracks are physically longer than the inner tracks, but they
contain the same number of sectors and have equal storage capacity.
• The read-write (R-W) head moves over the rotating hard disk.
• It is this Read-Write head that performs all the read and write operations on the disk
and hence, the position of the R-W head is a major concern.
• To perform a read or write operation on a memory location, we need to place the R-W
head over that position.
SSD
• An SSD, or solid-state drive, is a type of storage device used in computers.
• This non-volatile storage media stores persistent data on solid-state flash memory.
• Unlike traditional hard disk drives (HDDs), SSDs have no moving parts, allowing them
to deliver faster data access speeds, reduced latency, increased resistance to physical
shock, lower power consumption, and silent operation.
• Solid state drives (SSDs) use a combination of NAND flash memory technology and
advanced controller algorithms.
• NAND flash memory is the primary storage component, divided into blocks and
pages.
• An SSD contains a controller chip that manages data storage, retrieval, and
optimization.
Input / Output Devices
Input Device
• An input device is a computer device or hardware that allows the user to provide
data, input, and instructions to the computer system.
• Examples of input devices include
keyboards, computer mice, scanners,
cameras, joysticks, and microphones.
Output Device
• An output device is a hardware component of a computer system that displays
information to users.
• Some common examples of output devices are:
Monitor/Display, Printer, Speakers,
Headphones/Earphones, Projectors,
Plotters, Braille Display, Digital
Signage Display, Touch-screens, etc.
2. Software
• An operating system (OS) is the program that, after being initially loaded into the
computer by a boot program, manages all the other application programs in a
computer.
• It is a program that acts as an interface between the system hardware and the user.
• Some of the types of Operating System are:
• Batch OS – A set of similar jobs are stored in the main memory for execution. A
job gets assigned to the CPU, only when the execution of the previous job
completes.
• Multiprogramming OS – The main memory consists of jobs waiting for CPU time.
The OS selects one of the processes and assigns it to the CPU. Whenever the
executing process needs to wait for any other operation (like I/O), the OS selects
another process from the job queue and assigns it to the CPU. This way, the CPU is
never kept idle and the user gets the impression that multiple tasks are being done at once.
• Multitasking OS – Multitasking OS combines the benefits of Multiprogramming OS
and CPU scheduling to perform quick switches between jobs. The switch is so quick
that the user can interact with each program as it runs.
• Time Sharing OS – Time-sharing systems require interaction with the user to
instruct the OS to perform various tasks. The OS responds with an output. The
instructions are usually given through an input device like the keyboard.
• Real Time OS – Real-Time OS are usually built for dedicated systems to
accomplish a specific set of tasks within deadlines.
• Examples of Operating Systems are Windows, Linux, macOS, Android, iOS, etc.
Windows
• The Windows Operating System (OS) is one of the most popular and widely used
operating systems in the world.
• Windows Operating System (OS) is a graphical user interface (GUI) based
operating system developed by Microsoft Corporation.
• It is designed to provide users with a user-friendly interface to interact with their
computers.
• The first version of the Windows Operating System was introduced in 1985, and
since then, it has undergone many updates and upgrades.
• Examples of Windows operating systems are as follows.
Windows 95, Windows 98, Windows XP, Windows Vista, Windows 7, Windows 8, and
Windows 10.
Linux
• Linux is a free, open-source operating system (OS) that is used for computers,
servers, mobile devices, and more.
• It is known for its compatibility with many hardware and software platforms, and
is a popular choice for servers. It was released in 1991.
• The Linux Operating System is a type of operating system that is similar to Unix,
and it is built upon the Linux Kernel.
• The Linux Kernel is like the brain of the operating system because it manages how
the computer interacts with its hardware and resources.
• It makes sure everything works smoothly and efficiently.
macOS
• macOS is Apple's operating system for Mac computers.
• It's the second most used desktop operating system in the world, after Microsoft
Windows.
• macOS is designed to work with the hardware of Apple devices, and it includes a
suite of apps, iCloud integration, and privacy and security features.
• It has been developed and marketed by Apple since 2001.
• Some of the features of macOS include:
• User-Friendly Interface: The windows and menus that make up the macOS interface
make it simple to navigate and find the information you need.
• Built-in Apps: Many apps come pre-installed on macOS, such as Mail for email,
Photos for picture management, and Safari for web browsing.
System Software
• System software refers to the low-level software that manages and controls a
computer’s hardware and provides basic services to higher-level software.
• There are two main types of software: systems software and application software.
• Systems software includes the programs that are dedicated to managing the
computer itself, such as the operating system, file management utilities, and disk
operating system (or DOS).
Application Software
• The term “application software” refers to software that performs specific
functions for a user.
• When a user interacts directly with a piece of software, it is called application
software.
• The sole purpose of application software is to assist the user in doing specified
tasks.
• Microsoft Word and Excel, as well as popular web browsers like Firefox and
Google Chrome, are examples of application software.
Data Representation
Binary Number System
• Binary Number System uses two digits, 0 and 1, and is the foundation for all
modern computing.
• The word binary is derived from the word “bi” which means two.
• Binary Number System is the number system in which we use two digits “0” and “1”
to perform all the necessary operations.
• In the Binary Number System, we have a base of 2.
• The base of the Binary Number System is also called the radix of the number
system.
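As a small illustration of base 2, the sketch below converts between decimal and binary by hand. The function names to_binary and from_binary are chosen here for illustration; Python's built-in bin() and int(text, 2) do the same job.

```python
def to_binary(n):
    """Convert a non-negative integer to a binary string by repeated division by 2."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 2))  # the remainder is the next binary digit
        n //= 2
    return "".join(reversed(digits))  # remainders come out least significant first


def from_binary(bits):
    """Convert a binary string back to an integer using place values of 2."""
    value = 0
    for bit in bits:
        value = value * 2 + int(bit)
    return value


print(to_binary(13))        # prints 1101
print(from_binary("1101"))  # prints 13
```

Here 1101 means 1×8 + 1×4 + 0×2 + 1×1 = 13, because each position is a power of the radix 2.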
Bits and Bytes
• A bit (short for binary digit) is the smallest unit of data that a computer can
process and store.
• It is the fundamental building block of all digital computing and communication
systems.
• It is used for storing information and has a value of true/false or on/off.
• An individual bit has a value of either 0 or 1.
• A group of 8 bits is called a byte.
• A group of 4 bits is called a nibble.
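The relationship between bits, nibbles, and bytes can be checked with a few lines of code. This is a minimal sketch; the helper name split_nibbles is an assumption made for this example.

```python
def split_nibbles(byte):
    """Split one byte (0 to 255) into its high and low 4-bit nibbles."""
    if not 0 <= byte <= 255:
        raise ValueError("a byte must be in the range 0 to 255")
    high = byte >> 4    # upper 4 bits
    low = byte & 0x0F   # lower 4 bits
    return high, low


values_per_nibble = 2 ** 4  # 4 bits give 16 distinct values
values_per_byte = 2 ** 8    # 8 bits give 256 distinct values

print(split_nibbles(0b10101011))  # prints (10, 11)
```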
Data encoding
• Data encoding is the process of converting data into a format that can be processed
by computers and other systems.
• Decoding is the reverse process of encoding which is to extract the information from
the converted format.
a. ASCII:
• ASCII (American Standard Code for Information Interchange) is the most common
character encoding format for text data in computers and on the internet.
• In standard ASCII-encoded data, there are unique values for 128 alphabetic,
numeric or special additional characters and control codes.
• It is a 7-bit character code in which each 7-bit pattern represents a unique character.
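The mapping between characters and 7-bit ASCII codes can be inspected with Python's built-in ord() and chr(). The helper name ascii_bits below is chosen only for this sketch.

```python
def ascii_bits(ch):
    """Return the 7-bit ASCII pattern of a single character as a string."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not a standard ASCII character")
    return format(code, "07b")  # pad to 7 binary digits


for ch in "Hi!":
    print(ch, ord(ch), ascii_bits(ch))
# H 72 1001000
# i 105 1101001
# ! 33 0100001
```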
b. Unicode:
• Unicode is an international character encoding standard that provides a unique
number for every character across languages and scripts.
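• Unicode defines three encoding forms: UTF-8 (8-bit code units), UTF-16 (16-bit code
units), and UTF-32 (32-bit code units).
• In UTF-16, the most common characters are 16 bits (2 bytes) wide, while less common
characters take two 16-bit units.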
• It is basically a text encoding standard maintained by the Unicode Consortium
designed to support the use of text in all of the world's writing systems that can be
digitized.
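To see how one Unicode code point per character turns into bytes, the sketch below encodes a short sample string in the 8-bit (UTF-8) and 16-bit (UTF-16) forms. The sample text is an arbitrary choice for illustration.

```python
# Sample text: "A", "e with acute accent" (U+00E9), and the euro sign (U+20AC).
text = "A\u00e9\u20ac"

code_points = [ord(c) for c in text]  # the unique number of each character
utf8 = text.encode("utf-8")           # 1 to 4 bytes per character
utf16 = text.encode("utf-16-le")      # 2 bytes for each of these characters

print(code_points)  # prints [65, 233, 8364]
print(len(utf8))    # prints 6 (1 + 2 + 3 bytes)
print(len(utf16))   # prints 6 (2 + 2 + 2 bytes)
```

Note that the character A has the same code (65) in both ASCII and Unicode.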
Boolean logic
Logic Gates
• Logic Gates are the fundamental building blocks in digital electronics.
• By combining different logic gates, complex operations are performed and circuits
like flip-flops, counters, and processors are designed.
• Logic Gates are designed by using electrical components like diodes, transistors,
resistors, and more.
Truth Tables
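A truth table lists the output of a gate for every combination of inputs. As a minimal sketch, the standard two-input gates can be modelled with bitwise operators; the names GATES and truth_table are chosen for this example.

```python
from itertools import product

# Each gate maps two input bits (0 or 1) to one output bit.
GATES = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "NAND": lambda a, b: 1 - (a & b),
    "NOR": lambda a, b: 1 - (a | b),
    "XOR": lambda a, b: a ^ b,
}


def truth_table(gate):
    """Return the rows (A, B, output) for all four input combinations."""
    return [(a, b, GATES[gate](a, b)) for a, b in product((0, 1), repeat=2)]


for name in GATES:
    print(name, truth_table(name))
```

For example, the AND gate outputs 1 only for the row A = 1, B = 1, while NAND outputs 0 only for that row.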
Applications of Logic Gates
• Logic gates are used in digital computers to perform arithmetic, logical, and control
functions.
• In memory devices, logic gates are used to implement memory cells to store digital
data in the form of bits.
• Logic gates are used in manufacturing microprocessors and microcontrollers.
• In digital signal processing systems, logic gates play an important role in performing
various operations such as modulation, filtering, and algorithm execution.
• Logic gates are also used in digital communication systems to perform functions
like encoding, decoding, signal processing, etc.
• In control systems, logic gates are used to manage and control the operations of
machinery.
• Logic gates are also used to implement automated operation of security systems.
Computer Networks
• A computer network is a system that connects many independent computers to
share information (data) and resources.
• The integration of computers and other different devices allows users to
communicate more easily.
• A computer network is a collection of two or more computer systems that are linked
together.
• A network connection can be established using either cable or wireless media.
• Hardware and software are used to connect computers and tools in any network.
Network Basics
Types of Networks
There are mainly five types of Computer Networks
Network Topologies
Mesh Topology
• In a mesh topology, every device is connected to every other device via a dedicated
channel.
• These channels are known as links.
• In Mesh Topology, the protocols used are AHCP (Ad Hoc Configuration Protocols),
DHCP (Dynamic Host Configuration Protocol), etc.
Star Topology
• In Star Topology, all the devices are connected to a single hub through a cable.
• This hub is the central node and all other nodes are connected to the central node.
• Coaxial cables or twisted-pair cables with RJ-45 connectors are used to connect the
computers.
• In Star Topology, many popular Ethernet LAN protocols are used, such as CSMA/CD
(Carrier Sense Multiple Access with Collision Detection).
Bus Topology
• Bus Topology is a network type in which every computer and network device is
connected to a single cable.
• It is bi-directional.
• It is a multi-point connection and a non-robust topology because if the backbone
fails the topology crashes.
• In Bus Topology, various MAC (Media Access Control) protocols are followed by
LAN ethernet connections like TDMA, Pure Aloha, CDMA, Slotted Aloha, etc.
Ring Topology
• In a Ring Topology, the devices form a ring in which each device is connected to
exactly two neighboring devices.
• A number of repeaters are used for Ring topology with a large number of nodes,
because if someone wants to send some data to the last node in the ring
topology with 100 nodes, then the data will have to pass through 99 nodes to
reach the 100th node.
• Hence to prevent data loss repeaters are used in the network.
• The data flows in one direction, i.e. it is unidirectional, but it can be made
bidirectional by having two connections between each pair of neighboring nodes; this
is called Dual Ring Topology.
• In Ring Topology, the token passing protocol is used by the workstations to
transmit the data.
Tree Topology
• Tree topology is the variation of the Star topology.
• This topology has a hierarchical flow of data.
• In Tree Topology, protocols like DHCP and SAC (Standard Automatic
Configuration) are used.
Hybrid Topology
• Hybrid Topology is a combination of two or more of the topologies we have
studied above.
• Hybrid Topology is used when the nodes are free to take any form.
• This means each part of the network can be an individual topology, such as Ring or
Star, or a combination of the various topologies seen above.
• Each individual topology uses the protocol that has been discussed earlier.
Network Protocols
• A network protocol is a set of rules that govern data communication between
different devices in the network.
• It determines what is being communicated, how it is being communicated, and when
it is being communicated.
• It permits connected devices to communicate with each other, irrespective of
internal and structural differences. Let us discuss some of the Network
Protocols.
TCP/IP
The TCP/IP reference model is divided into four layers.
• Application layer
• Transport layer
• Internet or Network layer
• Network Access Layer/Link layer
Each layer has different responsibilities, and all work together for successful
communication across networks.
Layer 1: Application layer
• The application layer deals with communication between the sender and the
receiver.
• It ensures that the data reaches the receiver error-free and protects the upper-layer
applications from data complexity.
• UDP stands for User Datagram Protocol, which belongs to the transport layer. It
provides a delivery service for datagrams, a type of packet used by some devices.
Devices use UDP when they need to send a small amount of data, because it skips the
step of establishing and validating a connection between the devices.
HTTP
• HTTP (Hypertext Transfer Protocol) is a fundamental protocol of the Internet,
enabling the transfer of data between a client and a server.
• It is the foundation of data communication for the World Wide Web.
• HTTP provides a standard between a web browser and a web server to establish
communication.
• It is a set of rules for transferring data from one computer to another.
• Data such as text, images, and other multimedia files are shared on the World Wide
Web.
Key Points
• Basic Structure: HTTP forms the foundation of the web, enabling data
communication and file sharing.
• Web Browsing: Most websites use HTTP, so when you click on a link or download a
file, HTTP is at work.
• Client-Server Model: HTTP works on a request-response system. Your browser
(client) asks for information, and the website’s server responds with the data.
• Application Layer Protocol: HTTP operates within the Internet Protocol Suite,
managing how data is transmitted and received.
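The request-response exchange can be made concrete by looking at the text that travels between client and server. The sketch below only builds and parses messages as strings and does not open a network connection; the function names build_get_request and parse_status_line and the sample response are assumptions made for illustration.

```python
def build_get_request(host, path="/"):
    """Build the text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, path, version
        f"Host: {host}",         # header naming the server
        "Connection: close",
        "",                      # blank line ends the headers
        "",
    ]
    return "\r\n".join(lines)


def parse_status_line(response_text):
    """Extract (version, status code, reason) from a response's first line."""
    status_line = response_text.split("\r\n", 1)[0]
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason


request = build_get_request("www.example.com")

# Hand-written sample of what a server might send back.
sample_response = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"
print(parse_status_line(sample_response))  # prints ('HTTP/1.1', 200, 'OK')
```

In practice a browser or a library such as Python's urllib.request sends these messages over a TCP connection to the server.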
FTP
• FTP or File Transfer Protocol is said to be one of the earliest and also the most
common forms of transferring files on the internet.
• Located in the application layer of the OSI model, FTP is a basic system that helps
in transferring files between a client and a server.
• FTP is especially useful for:
➢Transferring Large Files: FTP can transfer large files in one shot; thus applicable
when hosting websites, backing up servers, or sharing files in large quantities.
➢Remote File Management: Files on a remote server can be uploaded, downloaded,
deleted, renamed, and copied according to the users’ choices.
➢Automating File Transfers: FTP works well for running file transfers from predefined
scripts and scheduled jobs.
➢Accessing Public Files: Anonymous FTP allows anyone, regardless of identity, to
download certain files without needing an account.
SMTP
• Simple Mail Transfer Protocol (SMTP) is a mechanism for exchanging email
messages between servers.
• It is an essential component of the email communication process and operates at the
application layer of the TCP/IP protocol stack.
• SMTP is a protocol for transmitting and receiving email messages.
• SMTP is an application layer protocol.
• The client who wants to send the mail opens a TCP connection to the SMTP server
and then sends the mail across the connection.
• The SMTP server is always in listening mode.
• As soon as it detects a TCP connection request from a client, the SMTP process
accepts the connection on port 25.
• After successfully establishing a TCP connection the client process sends the mail
instantly.
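The client side of this exchange can be sketched with Python's standard email and smtplib modules. The addresses and the server name mail.example.com are hypothetical, so the sending step is left commented out; only the message construction actually runs.

```python
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Assemble an email message with the usual headers and a text body."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


# Hypothetical addresses, used only for illustration.
msg = build_message("alice@example.com", "bob@example.com", "Hello", "Test message.")
print(msg["To"])

# Sending needs a reachable SMTP server; the host below is a placeholder.
# import smtplib
# with smtplib.SMTP("mail.example.com", 25) as server:  # opens the TCP connection
#     server.send_message(msg)                          # transfers the mail
```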
DNS
• The Domain Name System (DNS) is like the internet’s phone book.
• It helps you find websites by translating easy-to-remember names (like www.example.com)
into the numerical IP addresses (like 192.0.2.1) that computers use to locate each other on
the internet.
• Without DNS, you would have to remember long strings of numbers to visit your favorite
websites.
• Every host is identified by its IP address, but remembering numbers is very difficult for
people. Also, IP addresses are not static, so a mapping is required to translate the
domain name to the IP address.
• So DNS is used to convert the domain name of the websites to their numerical IP address.
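The phone-book idea can be sketched as a simple lookup table. This is a toy model, not how a real resolver works: the RECORDS table holds only the example name and address used above, and the function name resolve is an assumption for this sketch.

```python
# Toy name-to-address table (hypothetical data taken from the example above).
RECORDS = {"www.example.com": "192.0.2.1"}


def resolve(name):
    """Translate a domain name into its IP address using the toy table."""
    key = name.lower().rstrip(".")  # domain names are case-insensitive
    try:
        return RECORDS[key]
    except KeyError:
        raise LookupError(f"no record for {name}") from None


print(resolve("WWW.Example.com"))  # prints 192.0.2.1
```

A real lookup, for example with Python's socket.gethostbyname(), asks DNS servers over the network instead of reading a local table.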
Internet Technologies
WWW
The World Wide Web (WWW), often called the Web, is a system of interconnected
webpages and information that you can access using the Internet.
• It was created to help people share and find information easily, using links that
connect different pages together.
• The Web allows us to browse websites, watch videos, shop online, and connect with
others around the world through our computers and phones.
• WWW stands for World Wide Web and is commonly known as the Web.
• The WWW was started by CERN in 1989.
• WWW is defined as the collection of different websites around the world, containing
different information shared via local servers (or computers).
EMAIL
• Electronic mail, commonly known as email, is a method of exchanging messages
over the internet.
• Electronic Mail (e-mail) is one of the most widely used services of the Internet.
• This service allows an Internet user to send a message in a formatted manner (mail) to
another Internet user in any part of the world.
• A mail message can contain not only text but also images, audio, and video
data.
• The person who sends the mail is called the sender, and the person who receives it is
called the recipient.
• It is just like postal mail service.
Here are the basics of email:
1. An email address: This is a unique identifier for each user, typically in the format
username@domain.
2. An email client: This is a software program used to send, receive and manage
emails, such as Gmail, Outlook, or Apple Mail.
3. An email server: This is a computer system responsible for storing and forwarding
emails to their intended recipients.
To send an email:
1. Compose a new message in your email client.
2. Enter the recipient’s email address in the “To” field.
3. Add a subject line to summarize the content of the message.
4. Write the body of the message.
5. Attach any relevant files if needed.
6. Click “Send” to deliver the message to the recipient’s email server.
7. Emails can also include features such as cc (carbon copy) and bcc (blind carbon
copy) to send copies of the message to multiple recipients, and reply, reply all, and
forward options to manage the conversation.
Services provided by E-mail system:
• Composition – Composition refers to the process of creating messages and
replies. Any kind of text editor can be used for composition.
• Transfer – Transfer means the procedure of sending mail from the sender to the
recipient.
• Reporting – Reporting refers to confirmation of mail delivery. It helps users
check whether their mail was delivered, lost, or rejected.
• Displaying – It refers to presenting mail in a form that the user can understand.
• Disposition – This step concerns what the recipient does after receiving the
mail, i.e. save it, delete it before reading, or delete it after reading.
SEARCH ENGINES
• Search engines are programs that allow users to search and retrieve information
from the vast amount of content available on the internet.
• They use algorithms to index and rank web pages based on relevance to a user’s
query, providing a list of results for users to explore.
• Popular search engines include Google, Bing, and Yahoo.
Cloud Computing
• Cloud Computing means storing and accessing the data and programs on remote servers
that are hosted on the internet instead of the computer’s hard drive or local server.
• Cloud computing is also referred to as Internet-based computing, it is a technology where
the resource is provided as a service through the Internet to the user.
• The data that is stored can be files, images, documents, or any other storable document.
• The following are some of the operations that can be performed with Cloud Computing:
•Storage, backup, and recovery of data
•Delivery of software on demand
•Development of new applications and services
•Streaming videos and audio
Network Security
Firewalls:
• A firewall is a network security device, either hardware or software-based, which
monitors all incoming and outgoing traffic and based on a defined set of security rules
accepts, rejects, or drops that specific traffic.
• It filters incoming and outgoing network traffic with security policies that have previously
been set up inside an organization.
• A firewall is essentially the wall that separates a private internal network from the open
Internet at its very basic level.
• Before Firewalls, network security was performed by Access Control Lists (ACLs)
residing on routers. ACLs are rules that determine whether network access should be
granted or denied to a specific IP address.
• But ACLs cannot determine the nature of the packets they are blocking.
• Also, ACLs alone do not have the capacity to keep threats out of the network.
• Hence, the Firewall was introduced.
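The idea of checking traffic against a defined set of security rules can be sketched as a first-match rule table. The rules, networks, and ports below are hypothetical, and a real firewall inspects far more than the source address and destination port.

```python
import ipaddress

# Hypothetical rule set: (action, source network, destination port or None for any port).
RULES = [
    ("accept", "10.0.0.0/8", 443),     # allow HTTPS from the internal network
    ("reject", "10.0.0.0/8", 23),      # refuse Telnet from the internal network
    ("accept", "192.0.2.0/24", None),  # allow everything from this network
]
DEFAULT_ACTION = "drop"  # traffic matching no rule is silently discarded


def decide(src_ip, dst_port):
    """Return the action of the first rule that matches the packet."""
    addr = ipaddress.ip_address(src_ip)
    for action, network, port in RULES:
        if addr in ipaddress.ip_network(network) and port in (None, dst_port):
            return action
    return DEFAULT_ACTION


print(decide("10.1.2.3", 443))     # prints accept
print(decide("10.1.2.3", 23))      # prints reject
print(decide("203.0.113.5", 443))  # prints drop
```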
RDBMS
• In common databases, data is modeled in tables, which makes querying and processing
efficient.
• A Relational Database Management System (RDBMS) is a program that allows us to
create, delete, and update a relational database.
• A Relational Database is a database system that stores and retrieves data in a
tabular format organized in the form of rows and columns.
• It is a subset of DBMS, based on the relational model proposed by E. F. Codd in the 1970s.
• Major DBMSs like SQL Server, MySQL, and Oracle are all based on the principles
of relational DBMS.
Features of RDBMS
•Data must be stored in tabular form in DB file, that is, it should be organized in the
form of rows and columns.
•Each row of the table is called a record/tuple. The number of records in a table is
known as the cardinality of the table.
•Each column of the table is called an attribute/field. The number of columns is
called the arity (or degree) of the table.
•No two records of the DB table can be the same. Data duplication is therefore avoided by
using a candidate key. A Candidate Key is a minimal set of attributes required to identify
each record uniquely.
•Tables are related to each other with the help of foreign keys.
•Database tables also allow NULL values: if the value of any element of the table is
not filled in or is missing, it becomes a NULL value, which is not equivalent
to zero. (NOTE: Primary key cannot have a NULL value).
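These features can be tried out with SQLite, which ships with Python as the sqlite3 module. This is a minimal sketch; the table and column names (department, student, roll_no, and so on) and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")        # temporary in-memory database
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE student ("
    " roll_no INTEGER PRIMARY KEY,"   # key that identifies each record uniquely
    " name TEXT NOT NULL,"
    " phone TEXT,"                    # optional column, may hold NULL
    " dept_id INTEGER REFERENCES department(dept_id))"  # foreign key
)

conn.execute("INSERT INTO department VALUES (1, 'Civil')")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(101, "Asha", None, 1), (102, "Ravi", "555-0100", 1)],  # None is stored as NULL
)

# Cardinality is the number of rows; arity is the number of columns.
cardinality = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
arity = len(conn.execute("SELECT * FROM student").description)
print(cardinality, arity)  # prints 2 4
```

Inserting a second student with roll_no 101, or a student whose dept_id has no matching department, raises sqlite3.IntegrityError.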
SQL
•SQL, or Structured Query Language, is a programming language used to store,
retrieve, and manipulate data in relational databases.
•It is a standard Database language that is used to create, maintain, and retrieve the
relational database.
•SQL is a powerful language that can be used to carry out a wide range of operations
like insert, delete, and update.
SQL is mainly divided into four main categories:
1. Data definition language
2. Data manipulation language
3. Transaction control language
4. Data query language
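One statement from each of the four categories is shown in the sketch below, again using Python's built-in sqlite3 module. The account table and its values are hypothetical, and the transaction control commands are issued through the connection's commit() and rollback() methods.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition language (DDL): defines the structure.
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")

# Data manipulation language (DML): changes the stored data.
conn.execute("INSERT INTO account VALUES (1, 500)")
conn.execute("UPDATE account SET balance = balance - 100 WHERE id = 1")

# Transaction control language (TCL): COMMIT makes the changes permanent.
conn.commit()

conn.execute("DELETE FROM account WHERE id = 1")  # DML again
conn.rollback()  # TCL: ROLLBACK undoes the uncommitted DELETE

# Data query language (DQL): reads the data.
rows = conn.execute("SELECT id, balance FROM account").fetchall()
print(rows)  # prints [(1, 400)]
```

The row survives because the DELETE was rolled back before it was committed.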
Uses of SQL
•Data storage: SQL is used to store information in a database in tabular form, with
rows and columns representing different data attributes.
•Data retrieval: SQL is used to retrieve specific data items or a range of items from a
database.
•Data manipulation: SQL is used to add new data, remove or modify existing data.
•Access control: SQL is used to restrict a user's ability to retrieve, add, and modify
data.
•Data sharing: SQL is used to coordinate data sharing by concurrent users.
Components of a SQL System
A SQL system consists of several key components that work together to enable
efficient data storage, retrieval, and manipulation.
Some of the Key components of a SQL System are:
•Databases: Databases are structured collections of data organized into tables, rows,
and columns.
•Tables: Tables are the fundamental building blocks of a database, consisting of rows
(records) and columns (attributes or fields).
•Queries: Queries are SQL commands used to interact with databases.
Characteristics of SQL
•User-Friendly and Accessible, Declarative Language
•Efficient Database Management, Standardized Language
•SQL does not require a continuation character for multi-line queries, allowing
flexibility in writing commands across one or multiple lines.
•Queries are executed using a termination character (e.g., a semicolon ;), enabling
immediate and accurate command processing.
•SQL includes a rich set of built-in functions for data manipulation, aggregation, and
formatting, empowering users to handle diverse data-processing needs effectively.
Characteristics of a DBMS:
•Data storage: DBMS uses a digital repository on a server to store and manage
information.
•Data manipulation: DBMS provides a logical and clear view of the process that
manipulates data.
•Data security: DBMS provides data security.
•Data backup and recovery: DBMS contains automatic backup and recovery
procedures.
•ACID properties: DBMS supports the ACID properties (Atomicity, Consistency, Isolation,
Durability), which keep data in a consistent state in case of failure.
•Concurrency control: In multi-user environments, DBMS manages concurrent access
to the database to prevent conflicts and ensure data consistency.
Types of DBMS
•Relational Database Management System RDBMS
•NoSQL DBMS
•Object-oriented database management system (OODBMS): This DBMS is based on
the principles of object-oriented programming.
•Hierarchical DBMS: This type of DBMS stores data as nodes in a parent-child
relationship.
•Multi-model DBMS: This system supports more than one database model.
Application of Database
•Banking: Manages accounts, transactions, and financial records.
•Airlines: Handles bookings, schedules, and availability.
•E-commerce: Supports catalogs, orders, and secure transactions.
•Healthcare: Stores patient records and billing.
•Education: Manages student data and course enrollments.
•Telecom: Tracks call records and billing.
•Government: Maintains census and taxation data.
•Social Media: Stores user profiles and posts efficiently.
Database Languages
•Data Definition Language
•Data Manipulation Language
•Data Control Language
•Transactional Control Language
Disadvantages of DBMS
•Complexity: DBMS can be complex to set up and maintain, requiring specialized
knowledge and skills.
•Performance overhead: The use of a DBMS can add overhead to the performance of
an application, especially in cases where high levels of concurrency are required.
•Scalability: The use of a DBMS can limit the scalability of an application, since it
requires the use of locking and other synchronization mechanisms to ensure data
consistency.
•Cost: The cost of purchasing, maintaining and upgrading a DBMS can be high,
especially for large or complex systems.
•Limited Use Cases: Not all use cases are suitable for a DBMS, some solutions don’t
need high reliability, consistency or security and may be better served by other types of
data storage.
Data Structures
Some of the important data structures are as under:
Arrays
Array is a collection of items of the same variable type that are stored at contiguous
memory locations.
It is one of the most popular and simple data structures used in programming.
Types of Arrays
Arrays can be classified in two ways:
•On the basis of Size
•On the basis of Dimensions
•One-dimensional Array (1-D Array): You can imagine a 1-d array as a row, where
elements are stored one after another.
•Multi-dimensional Array: A multi-dimensional array is an array with more than one
dimension.
We can use multidimensional array to store complex data in the form of tables, etc. We
can have 2-D arrays, 3-D arrays, 4-D arrays and so on.
•Two-Dimensional Array (2-D Array or Matrix): 2-D Multidimensional arrays can be
considered as an array of arrays or as a matrix consisting of rows and columns.
•Three-Dimensional Array (3-D Array): A 3-D Multidimensional array contains three
dimensions, so it can be considered an array of two-dimensional arrays.
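The dimensions described above can be sketched in Python as follows. This is only an illustration: nested Python lists are used here as a stand-in for arrays, and the variable names (row, matrix, cube) are assumptions made for the example.

```python
# 1-D array: a single row of elements stored one after another
row = [10, 20, 30]

# 2-D array (matrix): an array of arrays, organised as rows and columns
matrix = [
    [1, 2, 3],
    [4, 5, 6],
]

# 3-D array: an array of two-dimensional arrays
cube = [
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
]

print(row[1])         # one index for a 1-D array
print(matrix[1][2])   # row index, then column index
print(cube[1][0][1])  # block index, row index, column index
```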
Linked Lists
A linked list is a fundamental data structure in computer science.
It mainly allows efficient insertion and deletion operations compared to arrays.
Like arrays, it is also used to implement other data structures like stack, queue and
deque.
Comparison between Linked Lists and Arrays
Features of Linked List:
•Data Structure: Non-contiguous
•Memory Allocation: Typically allocated one by one to individual elements
•Insertion/Deletion: Efficient
•Access: Sequential
Features of Array:
•Data Structure: Contiguous
•Memory Allocation: Typically allocated to the whole array
•Insertion/Deletion: Inefficient
•Access: Random
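A minimal sketch of a singly linked list, to illustrate why insertion and deletion are efficient while access is sequential. The class and method names (Node, LinkedList, push_front, delete, to_list) are assumptions chosen for this example.

```python
class Node:
    def __init__(self, data, next=None):
        self.data = data
        self.next = next  # link to the next node (non-contiguous storage)


class LinkedList:
    def __init__(self):
        self.head = None

    def push_front(self, data):
        # Insertion at the front: only the head link changes, nothing is shifted
        self.head = Node(data, self.head)

    def delete(self, data):
        # Sequential access: walk the links until the value is found, then unlink it
        prev, cur = None, self.head
        while cur is not None:
            if cur.data == data:
                if prev is None:
                    self.head = cur.next
                else:
                    prev.next = cur.next
                return True
            prev, cur = cur, cur.next
        return False

    def to_list(self):
        # Collect the values in order by following the links
        out, cur = [], self.head
        while cur is not None:
            out.append(cur.data)
            cur = cur.next
        return out
```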
Stacks
•A Stack is a linear data structure that follows a particular order in which the
operations are performed.
•The order may be LIFO(Last In First Out) or FILO(First In Last Out).
•LIFO implies that the element that is inserted last, comes out first and FILO implies
that the element that is inserted first, comes out last.
•It behaves like a stack of plates, where the last plate added is the first one to be
removed.
1.Pushing an element onto the stack is like adding a new plate on top.
2.Popping an element removes the top plate from the stack.
Types of Stack
Fixed Size Stack : As the name suggests, a fixed size stack has a fixed size and
cannot grow or shrink dynamically.
•If the stack is full and an attempt is made to add an element to it, an overflow error
occurs.
•If the stack is empty and an attempt is made to remove an element from it, an
underflow error occurs.
Dynamic Size Stack : A dynamic size stack can grow or shrink dynamically.
•When the stack is full, it automatically increases its size to accommodate the new
element, and when elements are removed, it can decrease its size.
•This type of stack is usually implemented using a linked list, as it allows for easy resizing of
the stack.
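The push and pop operations, together with the overflow and underflow conditions of a fixed size stack, can be sketched as below. The class name FixedStack and the choice of exception types are assumptions for this illustration.

```python
class FixedStack:
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []

    def push(self, item):
        # Overflow: the stack is full and cannot grow
        if len(self.items) == self.capacity:
            raise OverflowError("stack overflow")
        self.items.append(item)  # add a new plate on top

    def pop(self):
        # Underflow: there is nothing to remove
        if not self.items:
            raise IndexError("stack underflow")
        return self.items.pop()  # remove the top plate (LIFO)
```

A dynamic size stack would simply omit the capacity check in push.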
Types of Queues:
There are five different types of queues that are used in different scenarios. They are:
1.Input Restricted Queue (this is a Simple Queue)
2.Output Restricted Queue (this is also a Simple Queue)
3.Circular Queue
4.Double Ended Queue (Deque)
5.Priority Queue
•Ascending Priority Queue
•Descending Priority Queue
3.Circular Queue
•Circular Queue is a linear data structure in which the operations are performed based
on FIFO (First In First Out) principle and the last position is connected back to the first
position to make a circle.
•It is also called ‘Ring Buffer’.
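A minimal sketch of a circular queue built on a fixed-size list. The wrap-around is done with the modulo operator; the class name CircularQueue and its attribute names are assumptions for this example.

```python
class CircularQueue:
    def __init__(self, capacity):
        self.buf = [None] * capacity
        self.capacity = capacity
        self.front = 0   # index of the oldest element
        self.size = 0    # number of stored elements

    def enqueue(self, item):
        if self.size == self.capacity:
            raise OverflowError("queue is full")
        # The rear position wraps back to index 0 after the last slot
        rear = (self.front + self.size) % self.capacity
        self.buf[rear] = item
        self.size += 1

    def dequeue(self):
        if self.size == 0:
            raise IndexError("queue is empty")
        item = self.buf[self.front]
        self.buf[self.front] = None
        self.front = (self.front + 1) % self.capacity
        self.size -= 1
        return item  # FIFO: the oldest element leaves first
```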
5.Priority Queue
•A priority queue is a special type of queue in which each element is associated with a
priority and is served according to its priority.
•There are two types of Priority Queues. They are:
a)Ascending Priority Queue: Element can be inserted arbitrarily but only smallest
element can be removed. For example, suppose there is an array having elements 4, 2,
8 in the same order. So, while inserting the elements, the insertion will be in the same
sequence but while deleting, the order will be 2, 4, 8.
b)Descending priority Queue: Element can be inserted arbitrarily but only the largest
element can be removed first from the given Queue. For example, suppose there is an
array having elements 4, 2, 8 in the same order. So, while inserting the elements, the
insertion will be in the same sequence but while deleting, the order will be 8, 4, 2.
Insertion and deletion in a priority queue implemented with a binary heap each take O(log n) time.
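The 4, 2, 8 example above can be reproduced with Python's standard heapq module, which provides a min-heap. This is a sketch: the function names are assumptions, and the descending version relies on negating the values, which works for numbers only.

```python
import heapq


def ascending_removal_order(items):
    # Min-heap: the smallest element is always removed first
    heap = list(items)
    heapq.heapify(heap)
    return [heapq.heappop(heap) for _ in range(len(heap))]


def descending_removal_order(items):
    # Negate the values so that the min-heap behaves like a max-heap
    heap = [-x for x in items]
    heapq.heapify(heap)
    return [-heapq.heappop(heap) for _ in range(len(heap))]


print(ascending_removal_order([4, 2, 8]))   # [2, 4, 8]
print(descending_removal_order([4, 2, 8]))  # [8, 4, 2]
```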
Trees
•Tree data structure is a hierarchical structure that is used to represent and organize
data in the form of parent child relationship.
•The following are some real world situations which are naturally a tree.
a)Folder structure in an operating system.
b)Tag structure in an HTML (with the html tag as the root tag) or XML document.
•The topmost node of the tree is called the root, and the nodes below it are called the
child nodes.
•Each node can have multiple child nodes, and these child nodes can also have their
own child nodes, forming a recursive structure.
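The recursive parent-child structure can be sketched as follows, using the HTML tag example. The names TreeNode, add, count_nodes and depth are assumptions for this illustration.

```python
class TreeNode:
    def __init__(self, name):
        self.name = name
        self.children = []  # each node can have multiple child nodes

    def add(self, child):
        self.children.append(child)
        return child


def count_nodes(node):
    # A tree is recursive: a node plus the subtrees rooted at its children
    return 1 + sum(count_nodes(c) for c in node.children)


def depth(node):
    # Number of levels from this node down to its deepest descendant
    return 1 + max((depth(c) for c in node.children), default=0)


# Tag structure of a small HTML document, with html as the root
root = TreeNode("html")
root.add(TreeNode("head"))
body = root.add(TreeNode("body"))
body.add(TreeNode("p"))
```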
Unstable sorting:
If a sorting algorithm, after sorting the contents, changes the relative order in which
equal elements appear, it is called unstable sorting.
Adaptive and Non-Adaptive Sorting Algorithm
Adaptive Algorithm:
A sorting algorithm is said to be adaptive, if it takes advantage of already 'sorted'
elements in the list that is to be sorted.
That is, while sorting if the source list has some element already sorted, adaptive
algorithms will take this into account and will try not to re-order.
Non-adaptive algorithm:
A non-adaptive algorithm is one which does not take into account the elements which
are already sorted. They try to force every single element to be re-ordered to confirm
their sortedness.
Important Terms:
a) Increasing Order
A sequence of values is said to be in increasing order, if the successive element is
greater than the previous one. For example, 1, 3, 4, 6, 8, 9 are in increasing order, as
every next element is greater than the previous element.
b) Decreasing Order
A sequence of values is said to be in decreasing order, if the successive element is less
than the current one. For example, 9, 8, 6, 4, 3, 1 are in decreasing order, as every
next element is less than the previous element.
c) Non-Increasing Order
A sequence of values is said to be in non-increasing order, if the successive element is
less than or equal to its previous element in the sequence. This order occurs when the
sequence contains duplicate values. For example, 9, 8, 6, 3, 3, 1 are in non-increasing
order, as every next element is less than or equal to (in case of 3) but not greater than
any previous element.
d) Non-Decreasing Order
A sequence of values is said to be in non-decreasing order, if the successive element is
greater than or equal to its previous element in the sequence. This order occurs when
the sequence contains duplicate values. For example, 1, 3, 3, 6, 8, 9 are in non-
decreasing order, as every next element is greater than or equal to (in case of 3) but
not less than the previous one.
Various Sorting Algorithms:
Bubble Sort
•Bubble sort is a simple sorting algorithm.
•This sorting algorithm is comparison-based algorithm in which each pair of adjacent
elements is compared and the elements are swapped if they are not in order.
•This algorithm is not suitable for large data sets as its average and worst case
complexity are of O(n²) where n is the number of items.
•Bubble Sort is an elementary sorting algorithm, which works by repeatedly
exchanging adjacent elements, if necessary.
•When no exchanges are required, the file is sorted.
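A minimal sketch of bubble sort as described above, for ascending order. The function name bubble_sort is an assumption; the function works on a copy so the input list is left unchanged.

```python
def bubble_sort(items):
    a = list(items)
    for end in range(len(a) - 1, 0, -1):
        swapped = False
        for i in range(end):
            # Compare each pair of adjacent elements and swap if out of order
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
        if not swapped:
            break  # no exchanges were required, so the list is sorted
    return a
```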
Insertion Sort
•Insertion sort is a very simple method to sort numbers in an ascending or descending
order.
•This method follows the incremental method.
•It can be compared with the technique how cards are sorted at the time of playing a
game.
•This is an in-place comparison-based sorting algorithm.
•Here, a sub-list is maintained which is always sorted.
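Insertion sort can be sketched as below for ascending order. The part of the list to the left of index i plays the role of the sorted sub-list; the function name insertion_sort is an assumption.

```python
def insertion_sort(items):
    a = list(items)
    for i in range(1, len(a)):
        key = a[i]          # next card to place
        j = i - 1
        # Shift larger elements of the sorted sub-list one position right
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key      # insert the card at its correct position
    return a
```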
Selection sort
•Selection sort is a simple sorting algorithm.
•This sorting algorithm, like insertion sort, is an in-place comparison-based algorithm
in which the list is divided into two parts, the sorted part at the left end and the
unsorted part at the right end.
•Initially, the sorted part is empty and the unsorted part is the entire list.
•The smallest element is selected from the unsorted array and swapped with the
leftmost element, and that element becomes a part of the sorted array.
•This process continues moving unsorted array boundaries by one element to the
right.
•This algorithm is not suitable for large data sets as its average and worst case
complexities are of O(n²), where n is the number of items.
•This type of sorting is called Selection Sort as it works by repeatedly selecting
the smallest remaining element.
•That is we first find the smallest value in the array and exchange it with the element in
the first position, then find the second smallest element and exchange it with the
element in the second position, and we continue the process in this way until the entire
array is sorted.
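The selection procedure described above can be sketched as follows. The function name selection_sort is an assumption for this example.

```python
def selection_sort(items):
    a = list(items)
    for i in range(len(a) - 1):
        # Find the smallest element in the unsorted part a[i:]
        smallest = i
        for j in range(i + 1, len(a)):
            if a[j] < a[smallest]:
                smallest = j
        # Swap it with the leftmost unsorted element; a[:i + 1] is now sorted
        a[i], a[smallest] = a[smallest], a[i]
    return a
```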
Merge sort
•Merge sort is a sorting technique based on divide and conquer technique.
•With a worst-case time complexity of O(n log n), it is one of the most widely used
sorting algorithms.
•Merge sort first divides the array into equal halves and then combines them in a sorted
manner.
•Merge sort keeps on dividing the list into equal halves until it can no more be divided.
•By definition, if it is only one element in the list, it is considered sorted.
•Then, merge sort combines the smaller sorted lists keeping the new list sorted too.
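A minimal sketch of the divide and combine steps. The function name merge_sort is an assumption, and this version returns a new list rather than sorting in place.

```python
def merge_sort(items):
    # A list with zero or one element is considered sorted
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid])    # divide: sort the left half
    right = merge_sort(items[mid:])   # divide: sort the right half
    # Combine: merge the two sorted halves, keeping the result sorted
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```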
Shell sort
•Shell sort is a highly efficient sorting algorithm and is based on insertion sort algorithm.
•This algorithm avoids large shifts as in case of insertion sort, if the smaller value is to the
far right and has to be moved to the far left.
•This algorithm uses insertion sort on a widely spread elements, first to sort them and
then sorts the less widely spaced elements. This spacing is termed as interval.
•This algorithm is quite efficient for medium-sized data sets. Its complexity depends on the
interval sequence used; with the simple halving sequence the worst case is O(n²), where n is
the number of items.
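Shell sort can be sketched as an insertion sort applied to elements that are an interval apart. The halving interval sequence used here (n/2, n/4, ..., 1) is one common choice and is an assumption of this example, as is the function name shell_sort.

```python
def shell_sort(items):
    a = list(items)
    gap = len(a) // 2  # start with widely spaced elements
    while gap > 0:
        # Insertion sort over elements that are `gap` positions apart
        for i in range(gap, len(a)):
            key = a[i]
            j = i
            while j >= gap and a[j - gap] > key:
                a[j] = a[j - gap]
                j -= gap
            a[j] = key
        gap //= 2  # then sort the less widely spaced elements
    return a
```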
Heap Sort
•Heap Sort is an efficient sorting technique based on the heap data structure.
•The heap is a nearly-complete binary tree where the parent node could either be
minimum or maximum.
•The heap with minimum root node is called min-heap and the heap with maximum
root node is called max-heap.
•The elements in the input data of the heap sort algorithm are processed using these two
methods.
•The time complexity of the heap sort algorithm is O(nlogn), similar to merge sort.
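A minimal sketch of heap sort using a max-heap stored inside the list itself (the children of index i are at 2i + 1 and 2i + 2). The function names heap_sort and sift_down are assumptions for this example.

```python
def heap_sort(items):
    a = list(items)
    n = len(a)

    def sift_down(root, end):
        # Move a[root] down until the max-heap property holds within a[:end]
        while True:
            child = 2 * root + 1
            if child >= end:
                return
            if child + 1 < end and a[child + 1] > a[child]:
                child += 1  # pick the larger of the two children
            if a[root] >= a[child]:
                return
            a[root], a[child] = a[child], a[root]
            root = child

    # Build the max-heap from the last parent node up to the root
    for start in range(n // 2 - 1, -1, -1):
        sift_down(start, n)
    # Repeatedly move the maximum (root) to the end and restore the heap
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(0, end)
    return a
```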
Quick sort
•Quick sort is a highly efficient sorting algorithm and is based on partitioning of array of
data into smaller arrays.
•A large array is partitioned into two arrays one of which holds values smaller than the
specified value, say pivot, based on which the partition is made and another array holds
values greater than the pivot value.
•Quick sort partitions an array and then calls itself recursively twice to sort the two
resulting subarrays.
•The worst case complexity of Quick-Sort algorithm is O(n²). However, using this
technique, in average cases generally we get the output in O(n log n) time.
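The partition-and-recurse idea can be sketched as below. This is a simplified version that builds new lists instead of partitioning in place, and taking the middle element as pivot is an assumption of the example; the function name quick_sort is also illustrative.

```python
def quick_sort(items):
    if len(items) <= 1:
        return list(items)
    pivot = items[len(items) // 2]
    # Partition around the pivot value
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    # Two recursive calls, one for each resulting sub-array
    return quick_sort(smaller) + equal + quick_sort(larger)
```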
Searching Algorithms
•Searching is a process of finding a particular record, which can be a single element or
a small chunk, within a huge amount of data.
•The data can be in various forms: arrays, linked lists, trees, heaps, and graphs etc.
•With the increasing amount of data nowadays, there are multiple techniques to perform
the searching operation.
•Searching Algorithms in Data Structures
•Various searching techniques can be applied on the data structures to retrieve certain
data.
•A search operation is said to be successful only if it returns the desired element or
data; otherwise, the searching method is unsuccessful.
There are two categories of searching techniques.
Sequential Searching
Interval Searching
Sequential Searching
•As the name suggests, the sequential searching operation traverses through each
element of the data sequentially to look for the desired data.
•The data need not be in a sorted manner for this type of search.
Example − Linear Search
Interval Searching
•Unlike sequential searching, the interval searching operation requires the data to be in
a sorted manner.
•This method usually searches the data in intervals; it could be done by either dividing
the data into multiple sub-parts or jumping through the indices to search for an element.
Example − Binary Search, Jump Search etc.
Various Searching Algorithms
a)Linear Search Algorithm
•Linear search is a type of sequential searching algorithm.
•In this method, every element within the input array is traversed and compared with the
key element to be found.
•If a match is found in the array the search is said to be successful; if there is no match
found the search is said to be unsuccessful and gives the worst-case time complexity.
•The algorithm for linear search is relatively simple.
•Linear search traverses through every element sequentially therefore, the best case is
when the element is found in the very first iteration.
•The best-case time complexity would be O(1).
•However, the worst case of the linear search method would be an unsuccessful search
that does not find the key value in the array, it performs n iterations.
•Therefore, the worst-case time complexity of the linear search algorithm would be
O(n).
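A minimal sketch of linear search. The function name linear_search and the convention of returning -1 for an unsuccessful search are assumptions of this example.

```python
def linear_search(items, key):
    # Traverse every element and compare it with the key
    for index, value in enumerate(items):
        if value == key:
            return index  # successful search
    return -1             # unsuccessful search after n comparisons
```

For example, linear_search([34, 10, 66, 27], 66) returns 2, and searching the same list for 5 returns -1.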
b) Binary Search Algorithm
•Binary search is a fast search algorithm with run-time complexity of Ο(log n).
•This search algorithm works on the principle of divide and conquer, since it divides the
array into half before searching.
•For this algorithm to work properly, the data collection should be in the sorted form.
•Binary search looks for a particular key value by comparing the middle most item of
the collection.
•If a match occurs, then the index of item is returned.
•But if the middle item has a value greater than the key value, the left sub-array of the
middle item is searched.
•Otherwise, the right sub-array is searched.
•This process continues recursively until the size of a subarray reduces to zero.
•Binary Search algorithm is an interval searching method that performs the searching
in intervals only.
•The input taken by the binary search algorithm must always be a sorted array, since it
divides the array into subarrays based on the greater or lower values.
•The time complexity of the binary search algorithm is O(log n).
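A minimal iterative sketch of binary search on a list sorted in ascending order. The function name binary_search and the -1 return value for a missing key are assumptions of this example.

```python
def binary_search(sorted_items, key):
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == key:
            return mid            # match: return the index of the item
        if sorted_items[mid] > key:
            high = mid - 1        # key can only be in the left sub-array
        else:
            low = mid + 1         # key can only be in the right sub-array
    return -1                     # the sub-array shrank to size zero
```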
c) Interpolation Search Algorithm
•Interpolation search is an improved variant of binary search.
•This search algorithm works on the probing position of the required value.
•For this algorithm to work properly, the data collection should be in a sorted form and
uniformly distributed.
•Runtime complexity of interpolation search algorithm is O(log (log n)) as compared to
O(log n) of binary search in favorable situations.
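The probing position can be sketched as follows for a sorted list of integers. The probe formula is the usual linear interpolation between the two ends of the current range; the function name interpolation_search is an assumption.

```python
def interpolation_search(a, key):
    low, high = 0, len(a) - 1
    while low <= high and a[low] <= key <= a[high]:
        if a[low] == a[high]:
            # All values in the range are equal; avoid dividing by zero
            return low if a[low] == key else -1
        # Probe where the key would sit if the values were evenly spread
        pos = low + (key - a[low]) * (high - low) // (a[high] - a[low])
        if a[pos] == key:
            return pos
        if a[pos] < key:
            low = pos + 1
        else:
            high = pos - 1
    return -1
```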
d) Jump Search Algorithm
•Jump Search algorithm is a slightly modified version of the linear search algorithm.
•The main idea behind this algorithm is to reduce the time complexity by comparing fewer
elements than the linear search algorithm.
•The input array is hence sorted and divided into blocks to perform searching while
jumping through these blocks.
•The time complexity of the jump search technique is O(√n) and space complexity is O(1).
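A minimal sketch of jump search on a sorted list, using a block size of about √n. The function name jump_search is an assumption; math.isqrt requires Python 3.8 or later.

```python
import math


def jump_search(a, key):
    n = len(a)
    if n == 0:
        return -1
    step = max(1, math.isqrt(n))  # block size of about sqrt(n)
    start = 0
    # Jump block by block while the last element of the block is below the key
    while start < n and a[min(start + step, n) - 1] < key:
        start += step
    # Linear search inside the block that may contain the key
    for i in range(start, min(start + step, n)):
        if a[i] == key:
            return i
    return -1
```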
e) Exponential Search Algorithm
•Exponential search algorithm targets a range of an input array in which it assumes that
the required element must be present in and performs a binary search on that particular
small range.
•This algorithm is also known as doubling search or galloping search.
•It is similar to jump search in dividing the sorted input into multiple blocks and conducting
a smaller scale search.
•However, the difference occurs while performing computations to divide the blocks and
the type of smaller scale search applied (jump search applies linear search and
exponential search applies binary search).
•Hence, this algorithm jumps exponentially in the powers of 2.
•In simpler words, the search is performed on the blocks divided using pow(2, k) where k is
an integer greater than or equal to 0.
•Once the element at position pow(2, k) is greater than the key element, binary search is
performed on the current block.
•Even though it is called Exponential search it does not perform searching in exponential
time complexity.
•But as we know, in this search algorithm, the basic search being performed is binary
search.
•Therefore, the time complexity of the exponential search algorithm will be the same as
the binary search algorithm’s, O(log n).
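Exponential search can be sketched as below. The binary search step uses bisect_left from Python's standard bisect module; the function name exponential_search and the variable names are assumptions of this example.

```python
from bisect import bisect_left


def exponential_search(a, key):
    n = len(a)
    if n == 0:
        return -1
    # Double the bound (1, 2, 4, 8, ...) until it passes the key or the end
    bound = 1
    while bound < n and a[bound] < key:
        bound *= 2
    # The key, if present, lies between bound // 2 and bound
    lo, hi = bound // 2, min(bound + 1, n)
    i = bisect_left(a, key, lo, hi)  # binary search on that block only
    return i if i < hi and a[i] == key else -1
```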
f) Fibonacci Search Algorithm
•As the name suggests, the Fibonacci Search Algorithm uses Fibonacci numbers to
search for an element in a sorted input array.
•Fibonacci Series is a series of numbers that have two primitive numbers 0 and 1.
•The successive numbers are the sum of preceding two numbers in the series.
•This is an infinite constant series, therefore, the numbers in it are fixed.
•The main idea behind Fibonacci search is also to eliminate the places where the element
cannot be found.
•In a way, it acts like a divide & conquer algorithm (logic being the closest to binary
search algorithm).
•This algorithm, like jump search and exponential search, also skips through the indices
of the input array in order to perform searching.
•The Fibonacci Search Algorithm makes use of the Fibonacci Series to diminish the
range of an array on which the searching is set to be performed.
•With every iteration, the search range decreases making it easier to locate the element
in the array.
•The Fibonacci Search algorithm takes logarithmic time complexity to search for an
element.
•Since it is based on a divide and conquer approach and is similar to the idea of binary
search, the time taken by this algorithm in the worst case is O(log n).
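A sketch of one common formulation of Fibonacci search on a sorted list. The variable names (fib, fib1, fib2 for three consecutive Fibonacci numbers, offset for the part already eliminated) are assumptions of this example.

```python
def fibonacci_search(a, key):
    n = len(a)
    # Find the smallest Fibonacci number fib that is >= n
    fib2, fib1 = 0, 1
    fib = fib2 + fib1
    while fib < n:
        fib2, fib1 = fib1, fib
        fib = fib2 + fib1
    offset = -1  # everything up to this index has been eliminated
    while fib > 1:
        i = min(offset + fib2, n - 1)
        if a[i] < key:
            # Discard the left part; step one Fibonacci number down
            fib, fib1, fib2 = fib1, fib2, fib1 - fib2
            offset = i
        elif a[i] > key:
            # Discard the right part; step two Fibonacci numbers down
            fib, fib1, fib2 = fib2, fib1 - fib2, fib2 - (fib1 - fib2)
        else:
            return i
    # One element may remain unchecked
    if fib1 and offset + 1 < n and a[offset + 1] == key:
        return offset + 1
    return -1
```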
g) Sublist Search Algorithm
•Until now we have only seen how to search for one element in a sequential order of
elements.
•But the sublist search algorithm provides a procedure to search for a linked list in
another linked list.
•It works like any simple pattern matching algorithm where the aim is to determine
whether one list is present in the other list or not.
•The algorithm walks through the linked list where the first element of one list is
compared with the first element of the second list; if a match is not found, the second
element of the first list is compared with the first element of the second list.
•This process continues until a match is found or it reaches the end of a list.
•The main aim of this algorithm is to prove that one linked list is a sub-list of another
list.
•Searching in this process is done linearly, checking each element of the linked list one
by one; if the output returns true, then it is proven that the second list is a sub-list of
the first linked list.
•The time complexity of the sublist search depends on the number of elements present
in both linked lists involved.
•The worst case time taken by the algorithm to be executed is O(m*n) where m is the
number of elements present in the first linked list and n is the number of elements
present in the second linked list.
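A minimal sketch of sublist search on two singly linked lists. The names ListNode, from_values and is_sublist are assumptions of this example; here the first argument is the list being looked for and the second is the list being searched.

```python
class ListNode:
    def __init__(self, data, next=None):
        self.data = data
        self.next = next


def from_values(values):
    # Helper for the example: build a linked list from a Python list
    head = None
    for v in reversed(values):
        head = ListNode(v, head)
    return head


def is_sublist(pattern, text):
    if pattern is None:
        return True
    start = text
    while start is not None:
        # Try to match the whole pattern beginning at this node
        p, t = pattern, start
        while p is not None and t is not None and p.data == t.data:
            p, t = p.next, t.next
        if p is None:
            return True       # every pattern node was matched
        start = start.next    # no match here: move one node forward
    return False
```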
Emerging Technologies
Machine Learning
Machine learning is a type of artificial intelligence (AI) that allows computers to learn
and improve from data without being explicitly programmed.
It uses algorithms to analyze large amounts of data, identify patterns, and make
predictions.
Types of Machine Learning
Machine learning can be broadly
categorized into three types:
• Supervised Learning:
Trains models on labeled data to
predict or classify new, unseen data.
• Unsupervised Learning:
Finds patterns or groups in unlabeled
data, like clustering or dimensionality reduction.
• Reinforcement Learning:
Learns through trial and error to maximize rewards, ideal for decision-making tasks.
Deep Learning Applications:
Computer vision
The first Deep Learning application is Computer vision.
In computer vision, Deep learning AI models can enable machines to identify and
understand visual data.
Some of the main applications of deep learning in computer vision include:
• Object detection and recognition: Deep learning models can be used to identify and
locate objects within images and videos, which supports applications such as
self-driving cars, surveillance, and robotics.
• Image classification: Deep learning models can be used to classify images into
categories such as animals, plants, and buildings. This is used in applications such
as medical imaging, quality control, and image retrieval.
• Image segmentation: Deep learning models can be used for image segmentation into
different regions, making it possible to identify specific features within images.
Reinforcement learning:
In reinforcement learning, deep learning is used to train agents to take actions in an
environment to maximize a reward. Some of the main applications of deep learning in
reinforcement learning include:
• Game playing: Deep reinforcement learning models have been able to beat human
experts at games such as Chess.
• Robotics: Deep reinforcement learning models can be used to train robots to
perform complex tasks such as grasping objects, navigation, and manipulation.
• Control systems: Deep reinforcement learning models can be used to control
complex systems such as power grids, traffic management, and supply chain
optimization.
Cyber attacks
• A cyber attack is an intentional attempt to access a computer system, network,
or device to steal, alter, or destroy data.
• A cyber attack is any intentional effort to steal, expose, alter, disable, or
destroy data, applications, or other assets through unauthorized access to a
network, computer system or digital device.
Common types of cyber attacks
a) Malware
Malware is a term used to describe malicious software, including spyware,
ransomware, viruses, and worms.
Malware breaches a network through a vulnerability, typically when a user clicks a
dangerous link or email attachment that then installs risky software.
Once inside the system, malware can do the following:
• Blocks access to key components of the network (ransomware)
• Installs malware or additional harmful software
• Covertly obtains information by transmitting data from the hard drive (spyware)
• Disrupts certain components and renders the system inoperable
b) Phishing
Phishing is the practice of sending fraudulent communications that appear to come
from a reputable source, usually through email.
The goal is to steal sensitive data like credit card and login information or to install
malware on the victim’s machine.
Phishing is an increasingly common cyberthreat.
c) Man-in-the-middle attack
Man-in-the-middle (MitM) attacks, also known as eavesdropping attacks, occur when
attackers insert themselves into a two-party transaction.
Once the attackers interrupt the traffic, they can filter and steal data.
Two common points of entry for MitM attacks are:
1. On unsecure public Wi-Fi, attackers can insert themselves between a visitor’s device
and the network. Without knowing, the visitor passes all information through the
attacker.
2. Once malware has breached a device, an attacker can install software to process
all of the victim’s information.
d) Denial-of-service attack
A denial-of-service attack floods systems, servers, or networks with traffic to exhaust
resources and bandwidth.
As a result, the system is unable to fulfill legitimate requests.
Attackers can also use multiple compromised devices to launch this attack.
This is known as a distributed-denial-of-service (DDoS) attack.
e) SQL injection
A Structured Query Language (SQL) injection occurs when an attacker inserts
malicious code into a server that uses SQL and forces the server to reveal information
it normally would not.
An attacker could carry out a SQL injection simply by submitting malicious code into a
vulnerable website search box.
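The search box scenario can be illustrated with Python's built-in sqlite3 module and an in-memory database. The table, column and function names here are hypothetical, invented only for this sketch. The first function builds the query by string concatenation and is vulnerable; the second uses a parameterized query, which is one common defence.

```python
import sqlite3

# Hypothetical in-memory database used only for this illustration
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)


def search_unsafe(term):
    # Vulnerable: user input is pasted directly into the SQL text
    query = "SELECT name FROM users WHERE name = '" + term + "'"
    return conn.execute(query).fetchall()


def search_safe(term):
    # Parameterized query: the input is passed as data, not as SQL code
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (term,)
    ).fetchall()


payload = "x' OR '1'='1"
print(len(search_unsafe(payload)))  # every row is returned
print(len(search_safe(payload)))    # no user has this literal name
```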
f) Zero-day exploit
A zero-day exploit hits after a network vulnerability is announced but before a patch or
solution is implemented.
Attackers target the disclosed vulnerability during this window of time.
Zero-day vulnerability threat detection requires constant awareness.
g) DNS Tunneling
• DNS tunneling utilizes the DNS protocol to communicate non-DNS traffic over port
53.
• It sends HTTP and other protocol traffic over DNS.
• There are various, legitimate reasons to utilize DNS tunneling.
• However, there are also malicious reasons to use DNS Tunneling VPN services.
• They can be used to disguise outbound traffic as DNS, concealing data that is
typically shared through an internet connection.
• For malicious use, DNS requests are manipulated to exfiltrate data from a
compromised system to the attacker’s infrastructure.
• It can also be used for command and control callbacks from the attacker’s
infrastructure to a compromised system.
Mitigation Strategies
Cyber risk mitigation is the application of policies, technologies and procedures to
reduce the likelihood and impact of a successful cyber attack.
It is a critical practice to help guide decision-making around risk control and mitigation
and allows your organization to stay protected and achieve its business goals.
Cybersecurity mitigation strategies are actions that can be taken to reduce the impact
of cyber attacks. These strategies include:
• Risk assessment: Evaluate your organization's level of risk and identify areas for
improvement
• Network access controls: Authenticate and authorize users who request access to
data or systems
• Incident response plan: Create a plan that describes how to respond to an attack to
minimize delays.
• Security patches and updates: Regularly update software, operating systems, and
applications to fix known vulnerabilities
• Network traffic monitoring: Continuously monitor network traffic to detect and
respond to cyber attacks
• Security awareness training: Educate employees about cybersecurity risks and how
to avoid them
• Multifactor authentication: Require users to provide at least two forms of ID
verification
• Firewall and threat detection software: Use software to detect and block threats
• Physical security: Review your organization's physical security measures
• Minimize attack surface: Reduce the number of ways that attackers can gain
access to your systems
Big Data
Data Mining
• Data mining is the process of sorting through large data sets to identify patterns and
relationships that can help solve business problems through data analysis.
• Data mining techniques and tools help enterprises to predict future trends and make
more informed business decisions.
• Data mining is the process of extracting knowledge or insights from large amounts
of data using various statistical and computational techniques.
• The data can be structured, semi-structured or unstructured, and can be stored in
various forms such as databases, data warehouses, and data lakes.
• The primary goal of data mining is to discover hidden patterns and relationships in
the data that can be used to make informed decisions or predictions.
• This involves exploring the data using various techniques such as clustering,
classification, regression analysis, association rule mining, and anomaly detection.
Data Mining Architecture
Data mining architecture refers to the overall design and structure of a data mining
system.
A data mining architecture typically includes several key components, which work
together to perform data mining tasks and extract useful insights and information from
data.
Some of the key components of a typical data mining architecture include:
• Data Sources: Data sources are the sources of data that are used in data mining.
These can include structured and unstructured data from databases, files, sensors,
and other sources. Data sources provide the raw data that is used in data mining and
can be processed, cleaned, and transformed to create a usable data set for analysis.
• Data Visualization: Data visualization is the process of presenting data and insights
in a clear and effective manner, typically using charts, graphs, and other
visualizations. Data visualization is an important part of data mining, as it allows data
miners to communicate their findings and insights to others in a way that is easy to
understand and interpret.
Types of Data Mining
There are many different types of data mining, but they can generally be grouped into
three broad categories: descriptive, predictive, and prescriptive.
• Descriptive data mining involves summarizing and describing the characteristics of
a data set. This type of data mining is often used to explore and understand the data,
identify patterns and trends, and summarize the data in a meaningful way.
• Predictive data mining involves using existing data to build models that forecast
future events or outcomes. This type of data mining is often used to identify relationships
between variables and to make predictions based on those relationships.
• Prescriptive data mining involves using data and models to make recommendations
or suggestions about actions or decisions. This type of data mining is often used to
optimize processes, allocate resources, or make other decisions that can help
organizations achieve their goals.
Data Warehousing
• A Data Warehouse is separate from a DBMS. It stores a huge amount of data, which
is typically collected from multiple heterogeneous sources like files, DBMS, etc.
• The goal is to produce statistical results that may help in decision-making.
• An ordinary Database can store MBs to GBs of data and that too for a specific
purpose.
• For storing data of TB size, the storage shifted to the Data Warehouse.
• Besides this, a transactional database doesn’t offer itself to analytics.
• To effectively perform analytics, an organization keeps a central Data Warehouse
to closely study its business by organizing, understanding, and using its historical data
for making strategic decisions and analyzing trends.
Benefits of Data Warehouse
• Better business analytics: A data warehouse plays an important role in every
business by storing all the past data and records of the company for analysis, which
can further increase the company's understanding of its data.
• Faster Queries: The data warehouse is designed to handle large analytical queries, so it
runs them faster than a transactional database.
• Improved data Quality: The data warehouse stores and analyzes the data gathered from
different sources without altering it or adding data by itself, so the quality of the data
is maintained. If any data quality issue arises, the data warehouse team can resolve it.
• Historical Insight: The warehouse stores all your historical data which contains
details about the business so that one can analyze it at any time and extract insights
from it.
Features of Data Warehousing
• Centralized Data Repository: Data warehousing provides a centralized repository
for all enterprise data from various sources, such as transactional databases,
operational systems, and external sources. This enables organizations to have a
comprehensive view of their data, which can help in making informed business
decisions.
• Data Integration: Data warehousing integrates data from different sources into a
single, unified view, which can help in eliminating data silos and reducing data
inconsistencies.
• Historical Data Storage: Data warehousing stores historical data, which enables
organizations to analyze data trends over time. This can help in identifying patterns
and anomalies in the data, which can be used to improve business performance.
• Query and Analysis: Data warehousing provides powerful query and analysis
capabilities that enable users to explore and analyze data in different ways. This can
help in identifying patterns and trends, and can also help in making informed business
decisions.
• Data Transformation: Data warehousing includes a process of data transformation,
which involves cleaning, filtering, and formatting data from various sources to make it
consistent and usable. This can help in improving data quality and reducing data
inconsistencies.
• Data Mining: Data warehousing provides data mining capabilities, which enable
organizations to discover hidden patterns and relationships in their data. This can help
in identifying new opportunities, predicting future trends, and mitigating risks.
• Data Security: Data warehousing provides robust data security features, such as
access controls, data encryption, and data backups, which ensure that the data is
secure and protected from unauthorized access.
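As a rough illustration of the Data Integration and Data Transformation features above,
the following Python sketch cleans, filters, and formats records from two sources into one
unified list and then runs a simple query on it. The source names (sales_db, web_orders),
the field names, and the sample values are hypothetical and chosen only for illustration;
a real warehouse would normally rely on dedicated ETL tools rather than hand-written
functions.

```python
from datetime import datetime

# Hypothetical raw records from two source systems (illustrative only).
sales_db = [
    {"customer": " Asha ", "amount": "1200.50", "date": "2024-01-05"},
    {"customer": "Ravi", "amount": "", "date": "2024-01-06"},  # missing amount
]
web_orders = [
    {"name": "asha", "total": 300, "ordered_on": "05/02/2024"},
]


def clean_sales(row):
    """Clean one sales_db row; return None if the row cannot be used."""
    if not row["amount"]:
        return None  # filtering: drop rows that have no amount
    return {
        "customer": row["customer"].strip().title(),  # cleaning
        "amount": float(row["amount"]),               # formatting
        "date": datetime.strptime(row["date"], "%Y-%m-%d").date().isoformat(),
        "source": "sales_db",
    }


def clean_web(row):
    """Map one web_orders row onto the same unified layout."""
    return {
        "customer": row["name"].strip().title(),
        "amount": float(row["total"]),
        "date": datetime.strptime(row["ordered_on"], "%d/%m/%Y").date().isoformat(),
        "source": "web_orders",
    }


def build_warehouse(sales_rows, web_rows):
    """Integrate both sources into a single list with one consistent format."""
    unified = [clean_sales(r) for r in sales_rows] + [clean_web(r) for r in web_rows]
    return [r for r in unified if r is not None]


def total_by_customer(rows):
    """A simple analysis query: total amount per customer."""
    totals = {}
    for r in rows:
        totals[r["customer"]] = totals.get(r["customer"], 0.0) + r["amount"]
    return totals


warehouse = build_warehouse(sales_db, web_orders)
print(total_by_customer(warehouse))
```

Because both sources are mapped to the same field names and date format before loading,
the query at the end does not need to know which system a record came from.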
Data Visualization
Big Data Visualization refers to the techniques and tools used to graphically represent
large and complex datasets in a way that is easy to understand and interpret.
Given the volume, variety, and velocity of big data, traditional visualization methods
often fall short, requiring more sophisticated approaches to make sense of such vast
amounts of information.
Big Data Visualization is characterized by its ability to handle:
• Volume: The sheer amount of data requires robust visualization tools that can
summarize and present data effectively without losing critical information.
• Variety: Data comes in various formats (structured, unstructured, semi-structured)
from diverse sources. Effective visualizations must accommodate this diversity.
• Velocity: Data streams in at high speeds, especially from sources like social media
and IoT devices. Real-time visualization is often necessary to keep pace with the influx.
Importance of Big Data Visualization
1. Enhanced Decision-Making
Big data visualization provides a clear and immediate understanding of data, enabling
businesses and organizations to make informed decisions quickly.
2. Improved Data Comprehension
It aids in transforming raw numbers into meaningful insights, making it easier for
analysts and non-technical users to understand the underlying trends and
relationships.
3. Increased Engagement
Interactive visualizations can engage users more effectively than static reports.
Dashboards and real-time visual data representations allow users to interact with the
data, drilling down into specifics and exploring different scenarios, leading to deeper
insights and more meaningful engagement with the information.
Network Diagrams
Network diagrams are particularly valuable for visualizing complex interconnections
between entities, making them essential in fields such as social network analysis,
internet infrastructure, and bioinformatics.
• These diagrams are composed of nodes (representing entities) and edges (representing
relationships).
• Ideal for visualizing relationships and interactions in data, such as social
connections or data flows.
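The node-and-edge structure behind a network diagram can be sketched in a few lines of
Python. This is only a minimal illustration: the names in the edge list are made up, and a
drawing library would still be needed to render the diagram itself.

```python
from collections import defaultdict

# Each edge is a relationship between two entities (hypothetical social connections).
edges = [("Asha", "Ravi"), ("Asha", "Meena"), ("Ravi", "Meena"), ("Meena", "Kiran")]


def build_adjacency(edge_list):
    """Turn a list of edges into a mapping: node -> set of neighbouring nodes."""
    adjacency = defaultdict(set)
    for a, b in edge_list:
        adjacency[a].add(b)
        adjacency[b].add(a)  # relationships are treated as two-way here
    return adjacency


def degrees(adjacency):
    """Number of connections per node, often used to size nodes in the drawing."""
    return {node: len(neighbours) for node, neighbours in adjacency.items()}


graph = build_adjacency(edges)
degree = degrees(graph)
most_connected = max(degree, key=degree.get)
print(degree, most_connected)
```

The degree of a node is a simple measure of how connected an entity is, which is one of
the things a network diagram makes visible at a glance.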
Geospatial Maps
Geospatial maps integrate geographical data with analytical data, providing a spatial
dimension to the visualization. This type of visualization is crucial for any data with a
geographic component, ranging from global economic trends to local event planning.
They combine large-scale geographical data with traditional data sets to support spatial
analysis.
Tree Maps
Tree maps display hierarchical data as a set of nested rectangles, where each branch of
the tree is given a rectangle, which is then tiled with smaller rectangles representing
sub-branches. This method is useful for visualizing part-to-whole relationships within a
dataset.
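The nesting idea can be sketched with a simple "slice" layout, in which a rectangle is cut
into strips whose areas are proportional to the values. This is an assumed, simplified
layout rule for illustration (real tools often use more refined tiling), and the budget
figures are hypothetical.

```python
def slice_layout(values, x, y, width, height, vertical=True):
    """Split one rectangle into strips with areas proportional to the given values.

    Returns a mapping: name -> (x, y, width, height).
    """
    total = sum(values.values())
    rects = {}
    offset = 0.0
    for name, value in values.items():
        share = value / total
        if vertical:  # cut the rectangle left to right
            rects[name] = (x + offset, y, width * share, height)
            offset += width * share
        else:         # cut the rectangle top to bottom
            rects[name] = (x, y + offset, width, height * share)
            offset += height * share
    return rects


# Top-level branches of a hypothetical budget, drawn in a 100 x 60 area.
budget = {"Hardware": 50, "Software": 30, "Training": 20}
top = slice_layout(budget, 0, 0, 100, 60)

# Sub-branches are tiled inside their parent's rectangle, in the other direction.
hx, hy, hw, hh = top["Hardware"]
inner = slice_layout({"Laptops": 30, "Servers": 20}, hx, hy, hw, hh, vertical=False)
print(top)
print(inner)
```

Each rectangle's area is its share of the parent, which is what lets a tree map show
part-to-whole relationships at every level of the hierarchy.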
Stream Graphs
Stream graphs are a type of stacked area graph that is ideal for representing changes in
data over time. They are often used to display volume fluctuations in streams of data and
are particularly effective in showing the presence of patterns and trends across different
categories.
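The stacking behind a stream graph can be sketched as follows. The sketch assumes one
common choice, centring the stack around a horizontal axis at each time step; the category
names and values are hypothetical, and plotting the resulting bands is left to a charting
library.

```python
def stream_layers(series):
    """Compute the lower and upper edge of each category's band at every time step.

    series: mapping name -> list of non-negative values, one per time step.
    Returns a mapping name -> list of (bottom, top) pairs.
    """
    names = list(series)
    steps = len(series[names[0]])
    layers = {name: [] for name in names}
    for t in range(steps):
        total = sum(series[name][t] for name in names)
        bottom = -total / 2  # centre the whole stack around zero
        for name in names:
            top = bottom + series[name][t]
            layers[name].append((bottom, top))
            bottom = top  # the next band starts where this one ends
    return layers


# Hypothetical listening volumes for three categories over two time steps.
plays = {"Rock": [2, 4], "Pop": [4, 4], "Jazz": [2, 0]}
bands = stream_layers(plays)
print(bands)
```

The thickness of each band (top minus bottom) equals the category's value at that time, so
the changing shape of the bands shows how volumes rise and fall.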
Parallel Coordinates
A parallel coordinates plot is used for plotting individual data elements across multiple
dimensions. It is particularly useful for visualizing and analyzing multivariate data,
allowing users to see correlations and patterns across several interconnected variables.
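Because each dimension usually has its own unit and range, the values are typically
rescaled before they are placed on the parallel axes. The sketch below assumes simple
min-max scaling to the range 0 to 1; the records and field names are hypothetical.

```python
def to_polylines(rows, dimensions):
    """Rescale every dimension to 0..1 and return one list of axis positions per row."""
    ranges = {
        d: (min(r[d] for r in rows), max(r[d] for r in rows)) for d in dimensions
    }
    polylines = []
    for r in rows:
        line = []
        for d in dimensions:
            lo, hi = ranges[d]
            # If all values are equal, place the point in the middle of the axis.
            line.append(0.5 if hi == lo else (r[d] - lo) / (hi - lo))
        polylines.append(line)
    return polylines


# Hypothetical multivariate records: one polyline per record, one axis per field.
cars = [
    {"price": 10, "mileage": 20, "power": 100},
    {"price": 20, "mileage": 15, "power": 150},
    {"price": 30, "mileage": 10, "power": 200},
]
lines = to_polylines(cars, ["price", "mileage", "power"])
print(lines)
```

In this made-up data, the lines between the price and mileage axes cross each other, which
is how a parallel coordinates plot reveals a negative relationship between two variables.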
Chord Diagrams
Chord diagrams are used to show inter-relationships between data points in a circle.
The arcs connect data points, and their thickness or color can indicate the weight or
value of the relationship, useful for representing connectivity or flows within a system.
These diagrams help identify patterns, clusters, or trends within complex datasets,
aiding in decision-making and analysis.
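As a minimal sketch of the arithmetic behind a chord diagram, the code below gives each
entity a slice of the circle proportional to its total outgoing flow. The flow values and
entity names are hypothetical, and the rule of sizing arcs by outgoing totals is an
assumption made for illustration.

```python
def arc_spans(flows):
    """Assign each entity a (start, end) angle in degrees, proportional to its total flow.

    flows: mapping source -> {target: weight}.
    """
    totals = {name: sum(row.values()) for name, row in flows.items()}
    grand_total = sum(totals.values())
    spans = {}
    start = 0.0
    for name, total in totals.items():
        sweep = 360.0 * total / grand_total
        spans[name] = (start, start + sweep)
        start += sweep
    return spans


# Hypothetical flows between three entities (for example, goods moved between regions).
flows = {
    "A": {"B": 10, "C": 20},
    "B": {"A": 5, "C": 15},
    "C": {"A": 30, "B": 10},
}
spans = arc_spans(flows)
print(spans)
```

The individual weights (such as the 30 units from C to A) would then set the thickness of
each chord drawn between the arcs.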
Applications of Big Data Visualization
1. Healthcare
2. Finance
3. Retail
4. Manufacturing
5. Marketing
6. Transport and logistics
Information Technology Applications
1. E-commerce
• E-commerce, or electronic commerce, is the buying and selling of goods and services,
or the transfer of funds or data, over an electronic network, primarily the internet.
• E-commerce draws on technologies such as mobile commerce, electronic funds
transfer, supply chain management, Internet marketing, online transaction processing,
electronic data interchange (EDI), inventory management systems, and automated
data collection systems.
Online Shopping
• Online shopping is the act of buying goods or services over the internet using a
computer, smartphone, or tablet.
• Online shopping is a form of electronic commerce, also known as "e-commerce".
E-commerce businesses can be online-only or have a physical presence as well.
• Some advantages of online shopping include safety, convenience, better prices,
variety, authenticity, no-pressure shopping, and time savings.
• Online shopping is becoming increasingly widespread and accepted because of these
conveniences.
Online Business
• An online business is a business that conducts its primary activities over the
internet, such as buying, selling, or providing services online.
• Online businesses can be started by anyone and can be inexpensive to get off the
ground.
• Here are some examples of online businesses:
• E-commerce: A business that sells products or services online, or uses the internet to
generate sales leads. E-commerce businesses can be run from a single website or
through multiple online channels.
• Selling handmade goods: You can sell handmade products on sites like Etsy, Amazon,
and eBay.
• Selling art: You can sell your art as prints, canvases, framed posters, or digital
downloads. You can also use a print-on-demand service to have your artwork printed
on mugs, T-shirts, or other goods.
• Affiliate marketing: You can promote other businesses' products through affiliate
marketing.
• Software development: You can develop software solutions.
E-Learning
• E-learning, or online learning, is a way of learning that uses the internet and other
digital technologies to deliver instruction.
• It's an alternative to traditional classroom learning.
• E-learning courses can be accessed through electronic devices like computers,
tablets, and cell phones.
• E-learning courses can be live, pre-recorded, or a combination of both.
• E-learning courses can include videos, quizzes, games, and other interactive
elements.
• Learning management systems can help facilitate e-learning by storing courses,
assessments, and grades.
• Benefits of e-learning include:
• Flexibility: Students can learn at their own pace and from any location.
• Cost-effective: There's no need to spend money on travel, seminars, and hotel rooms.
• Wider coverage: E-learning can reach a large number of people in many different
locations.
• Personalized: E-learning can be tailored to the needs of individual learners.
Online Education
• Online education, also known as distance learning, e-learning, or remote learning, is
a way to learn and teach that uses the internet.
• Online education can be used in all sectors of education, from elementary schools to
higher education. It can be an effective alternative to traditional classroom education.
• It can include watching videos, reading articles, taking online courses, interacting
with teachers and other students online, and submitting work electronically.
• Online education can be flexible and allow students to study at their own pace. There
are several types of online learning, including:
• Asynchronous: Students complete coursework and exams within a given time frame, and
interaction usually takes place through discussion boards, blogs, and wikis.
• Synchronous: Students and the instructor interact online simultaneously through text,
video, or audio chat.
Distance Learning: Distance learning refers to a way of learning that does not
require you to be physically present at the university or institution.
• Learning materials and lectures are available online.
• Learners can stay at their homes while taking the course from an online university or
other institution.
• They will usually also have the opportunity to attend in-person workshops, residencies,
or other learning components, but the material is primarily taught through online
courses.
2. BharatNet:
High-speed broadband connectivity to over 2.5 lakh Gram Panchayats.
4. Digital Locker:
A secure cloud-based platform for storing and sharing government-issued documents.
10. eCourts:
Digitization of judiciary for faster case resolutions.
Impact of Digital India
• Increased Connectivity: Internet penetration has improved significantly in rural areas.
• Digital Payments: Widespread adoption of digital payment methods like UPI and mobile
wallets.
• Transparency: Reduction in corruption due to direct benefit transfers and
e-governance.
• Job Creation: Enhanced opportunities in IT, electronics manufacturing, and
e-services.
• Inclusive Growth: Bridging the digital divide between urban and rural areas.
The Digital India Mission continues to evolve, empowering citizens with digital tools,
enhancing infrastructure, and fostering innovation across the country.