
LINUX UNIT II_231102_091614

UNIT II

LINUX FILE STRUCTURE

LINUX FILE STRUCTURE: Introduction to LINUX file system and Structures, inode (Index Node), file
descriptors, system calls and device drivers. System Calls for File Management – create, open, close,
read, write, lseek, link, symlink, unlink, stat, fstat, lstat, chmod, chown, Directory API – opendir,
readdir, closedir, mkdir, rmdir, umask. File links – hard and soft links. Environment and path setting.
The /etc/passwd and /etc/shadow files. Add, modify and delete users.

1. Introduction to LINUX FILE SYSTEM

In the Linux file structure, files are grouped according to purpose, for example commands, data
files, and documentation. Parts of a Unix directory tree are listed below. All directories are
grouped under the root entry "/"; that part of the directory tree is left out of the diagram below.

A Linux file system is a structured collection of files on a disk drive or a partition. A partition is a
segment of a storage device (not of main memory) that holds a specific set of data. A machine can
have several partitions on its disks, and generally every partition contains a file system.
A general-purpose computer system needs to store data systematically so that files can be
accessed quickly. It stores the data on hard disks (HDD) or an equivalent storage type.
The reasons for maintaining a file system include:

• The computer works on data in RAM, which is volatile: the data is lost when the machine is
turned off. Non-volatile storage such as flash memory and SSDs keeps the data after a
power interruption.
• Hard drives are preferred for bulk data storage because RAM costs more than disk space,
and hard disk prices keep dropping relative to RAM.

The Linux file system contains the following sections:

• The root directory (/)
• A specific data storage format (EXT3, EXT4, BTRFS, XFS and so on)
• A partition or logical volume that holds a particular file system

What is the Linux File System?

i. The Linux file system is a built-in layer of the Linux operating system that handles
data management on storage. It arranges files on the disk and keeps track of the file
name, file size, creation date, and other information about each file.
ii. Linux File System Structure

The Linux file system has a hierarchical file structure: it contains a root directory and its
subdirectories, and all other directories can be accessed from the root directory. A partition
usually has only one file system, although a disk may hold several partitions and therefore
several file systems.

iii. A file system is designed to manage and provide space for data on non-volatile
storage. Every file system requires a namespace, that is, a naming and organizational
methodology. The namespace defines the naming rules, the maximum length of a file name,
and the set of characters that can be used in a file name. It also defines the logical structure
of files on a storage segment, such as the use of directories for organizing files.
Once a namespace is described, a metadata description must be defined for each
file.
iv. The data structure needs to support a hierarchical directory structure; this structure is
used to describe the available and used disk space for a particular block. It also holds
other details about the files, such as file size and the dates and times of creation and last
modification.
v. Also, it stores advanced information about the section of the disk, such as partitions and
volumes.
vi. The advanced data and the structures that it represents contain the information about the
file system stored on the drive; it is distinct and independent of the file system metadata.
vii. Linux file system contains two-part file system software implementation architecture.
Consider the below image:

• The file system requires an API (Application programming interface) to access the function
calls to interact with file system components like files and directories. API facilitates tasks
such as creating, deleting, and copying the files. It facilitates an algorithm that defines the
arrangement of files on a file system.

The first two parts of the file system are together called the Linux virtual file system. It provides a
single set of commands for the kernel and developers to access the file system. The virtual file
system requires a specific driver for each file system type to provide an interface to that file system.

2. Linux File System Features

In Linux, the file system forms a tree structure. All the files are arranged as a tree and its branches.
The topmost directory is called the root (/) directory. All other directories in Linux can be accessed
from the root directory.

Some key features of Linux file system are as following:

• Specifying paths: Linux does not use the backslash (\) to separate path components; it uses
the forward slash (/) instead. For example, data that Windows stores in
C:\My Documents\Work would be stored in /home/My Documents/Work in Linux.

• Partitions, Directories, and Drives: Linux does not use drive letters to organize drives
as Windows does. From a path alone, we cannot tell whether we are addressing a partition,
a network device, or an "ordinary" directory.

• Case Sensitivity: The Linux file system is case sensitive. It distinguishes between lowercase
and uppercase file names. For example, test.txt and Test.txt are different files in
Linux. This rule also applies to directories and Linux commands.

• File Extensions: In Linux, a file may have an extension such as '.txt', but an extension is
not required. When working in the shell, this can make it hard for beginners to tell files
from directories. A graphical file manager shows different icons for files and folders.

• Hidden files: Linux distinguishes between standard files and hidden files; most hidden
files are configuration files. Usually, we don't need to access or read the
hidden files. A hidden file has a dot (.) at the start of its name (e.g.,
.ignore). To see hidden files, we need to change the view in the file manager or use
a specific command in the shell, such as ls -a.

3. Types of Linux File System

When we install the Linux operating system, Linux offers many file systems such as Ext, Ext2, Ext3,
Ext4, JFS, ReiserFS, XFS, btrfs, and swap.

a) Ext, Ext2, Ext3 and Ext4 file system

The file system name Ext stands for Extended File System. It was the first file system created
specifically for Linux, designed to overcome limitations of the MINIX file system. Ext is an older
version and is no longer used because of its limitations.

Ext2 is the first Linux file system that allows managing two terabytes of data. Ext3 was developed
from Ext2; it is an upgraded version that adds journaling and remains backward compatible with
Ext2. The major drawback of Ext3 is that it is poorly suited to servers, because it does not support
file recovery or disk snapshots.

Ext4 is the fastest of the Ext file systems. It works well with SSDs (solid-state drives), and it is
the default file system in many Linux distributions.

b) JFS File System

JFS stands for Journaled File System. It was developed by IBM for AIX Unix and is an alternative to
the Ext file systems. It can also be used in place of Ext4 where stability is needed with few
resources. It is a handy file system when CPU power is limited.

c) ReiserFS File System

ReiserFS is an alternative to the Ext3 file system, with improved performance and advanced
features. ReiserFS was once the default file system in SUSE Linux, but SUSE later changed its
policy and returned to Ext3. This file system dynamically supports
the file extension, but it has some drawbacks in performance.

d) XFS File System

The XFS file system was regarded as a high-speed JFS and was developed for parallel I/O processing.
NASA still uses this file system on its high-capacity storage servers (300+ terabytes).

e) Btrfs File System

Btrfs stands for B-tree file system. It is used for fault tolerance, repair, easy
administration, extensive storage configuration, and more. It is not regarded as a good fit for
production systems.

f) Swap File System

Swap space is used for memory paging in the Linux operating system and to hold the contents of
RAM during system hibernation. A system that uses hibernation is required to have swap space
at least equal to its RAM size.

4. Linux Inodes

An inode number is a number that uniquely identifies a file within a file system, in Linux and all
Unix-type systems.

When a file is created on a system, a file name and an inode number are assigned to it.

Generally, a user accesses a file by its name, but internally the file name is first mapped to the
corresponding inode number stored in a table.

Note: Inode doesn't contain the file name. Reason for this is to maintain hard-links for the files.
When all the other information is separated from the file name then only we can have various file
names pointing to the same Inode.

i. Inode Contents

An Inode is a data structure containing metadata about the files.

Following contents are stored in the Inode from a file:

• User ID of file
• Group ID of file
• Device ID
• File size
• Date of creation
• Permission
• Owner of the file
• File protection flag

• Link counter to determine the number of hard links

Example:
ls -ld new1

ii. Inode Table

The inode table contains all the inodes and is created when the file system is created. The df -i
command can be used to check how many inodes are free and unused in the file system.

iii. Inode Number

Each Inode has a unique number and Inode number can be seen with the help of ls -li command.

In the snapshot above, directory Disk1 has three files, and each file has a different inode
number.

Note: The Inode doesn't contain file content, instead it has a pointer to that data.

5. File Descriptors

A file descriptor is a number that uniquely identifies an open file in a computer's operating
system. It describes a data resource, and how that resource may be accessed.

When a program asks to open a file — or another data resource, like a network socket —
the kernel:

1. Grants access.

2. Creates an entry in the global file table.

3. Provides the software with the location of that entry.

The descriptor is identified by a unique non-negative integer, such as 0, 12, or 567. At least one file
descriptor exists for every open file on the system.

A file descriptor is the Unix abstraction for an open input/output stream: a file, a network
connection, a pipe (a communication channel between processes), a terminal, etc.

A Unix file descriptor thus fills a similar niche as a stdio FILE*. However, whereas
a FILE* (like stdin or stdout) is a pointer to some object structure, a file descriptor is just an
integer. For example, 0, 1, and 2 are the file descriptor versions of stdin, stdout, and stderr,
respectively.

(Integers are used because they’re easier for the operating system kernel to verify than arbitrary
pointers. Although the kernel has objects somewhat similar to FILE*s, it doesn’t give applications
direct access to those objects. Instead, a per-process array called the file descriptor table stores
references to such objects. The file descriptors that applications manipulate are indexes into this
table. It’s very easy to check that an integer is in bounds.)

Logically, a file descriptor comprises a file reference, which represents the underlying data (such
as /home/kohler/grades.txt), and a file position, which is an offset into the file. There can be many
file descriptors simultaneously open for the same file reference, each with a different position. For
disk files, the position can be explicitly changed: a process can rewind and re-read part of a file, for
example, or skip around, as we saw with strided I/O patterns. These files are called seekable.
However, not all types of file descriptor are seekable. Most communication channels between
processes aren’t, and neither are network channels.

A. File descriptor system calls

a. open
int open(const char* pathname, int flags, [mode_t mode])

Open the file pathname according to flags, which is a set of flags containing exactly one
of O_RDONLY (open for reading), O_WRONLY (open for writing), and O_RDWR (open for both
reading and writing), as well as other optional flags. Returns a file descriptor for the open file, or
-1 on error.

Other important flags include:

• O_CREAT: Create the file if it does not exist, using the mode argument to set the file’s initial
permissions. (Typically the mode argument will be 0660 or S_IRUSR | S_IWUSR | S_IRGRP |
S_IWGRP, which allows the current user and group to read or write the file.)

• O_CREAT | O_EXCL: Create the file and fail if the file already exists.

• O_APPEND: Open the file in append mode: every write automatically jumps to the end of
the file and makes the file longer.

• O_TRUNC: Truncate the file to length 0.

b. read

ssize_t read(int fd, void* buf, size_t sz)

Read at most sz bytes from file descriptor fd into buffer buf. Returns the number of bytes read, if
any. Returns 0 at end of file and -1 on error.

Normally, read returns sz, but it can return less. For instance, there might be just sz - 2 bytes left
in the file, or there might only be sz - 10 bytes available to read at the moment. A read that returns
less than the requested number of bytes is called a short read.

If sz > 0, then the return value 0 is a reliable end-of-file indicator. For instance, when reading a
pipe, 0 means the other end of the pipe has closed. Other short reads are not reliable end-of-file
indicators. For instance, when reading from the terminal, a read of 1024 bytes might return 1 byte
because the user has only typed 1 byte so far; the user might still type more bytes in the future.

The return value -1 indicates an error and means that no bytes were read. If any bytes were read,
the return value will be greater than 0. However, not all errors are equally serious.

Permanent errors

The read and write system calls, as well as some other system calls, are so-called “slow” system
calls that can return different classes of error.

Some errors are serious, indicating problems with the underlying file. For instance, the EIO error
indicates disk corruption, and ENOSPC indicates that the disk is full. These errors, which we’ll
call permanent errors, should be returned to the user.

Other, restartable errors indicate a temporary blip, and retrying the slow system call will likely
succeed. These errors are EINTR and EAGAIN. The kernel returns these errors for special reasons
we’ll discuss in unit 5. I/O libraries must sometimes mask these errors by retrying until the errors
go away. For example, the stdio library’s fflush function retries its writes until restartable errors
go away; and in pset 4, your io61_flush function must do the same.

Error codes like EIO and EINTR are defined in #include <errno.h>. When a system call returns an
error, it generally returns -1; the error code is returned in a special global variable called errno.

Each system call’s manual page lists all errors that can occur for that system call. Read the page
for read by looking at read(2). (That notation means “the page for read in section 2 of the manual”;
run man 2 read.)

c. write

ssize_t write(int fd, const void* buf, size_t sz)

Write at most sz bytes to file descriptor fd from buffer buf. Returns the number of bytes written,
if any. Returns -1 on error.

Normally, write returns sz, but as with read, it might return less: a short write. For instance, there
might only be room for sz - 2 bytes on the disk, or the process reading from a pipe might be behind,
or a signal might interrupt the write partway through.

The return value -1 indicates an error and means that no bytes were written. If any bytes were
written, the return value will be greater than 0.

d. lseek
off_t lseek(int fd, off_t pos, int whence)

Change file descriptor fd’s position and return the resulting position relative to the beginning of
the file. There are three important values for whence:

• SEEK_SET: Set the file position to pos. pos == 0 sets the position to the beginning of the
file, pos == 1 sets it one byte in, and so forth.

• SEEK_CUR: Change the file position relative to the current position. pos == 0 leaves the
position unchanged, pos == 10 skips over the next 10 bytes, and so forth.

• SEEK_END: Set the file position relative to the file size. pos == 0 sets the position to the end
of the file, pos == -1 sets it to the last byte in the file, and so forth.

So lseek(fd, 0, SEEK_CUR) returns the current position without changing it.

Returns -1 on error, which can happen, for example, if the file is not seekable or the new file
position is out of range for the file.

e. close

int close(int fd)

Close the file descriptor.


Understanding errors

The Unix error convention is that system calls return -1 on error. A global variable, int errno, is
then set so the program can tell what kind of error occurred. The <errno.h> header file defines
symbolic names for specific error conditions. Each name starts with E. For example, the system
calls above “return EBADF if fd is not an open file descriptor.” This actually means that the system
call returns the value -1 (cast to the appropriate type), and the global errno variable is set to the
constant EBADF.

The char* strerror(int errnum) library function returns a textual string describing an error
constant. For instance, strerror(EINVAL) returns "Invalid argument". This might be useful for
debugging.

A system call’s manual page will list the errors it might return.

6. Additional system calls

The following system calls might also be useful for problem set 4, depending on your
implementation strategy. Read their manual pages, consult CS:APP3e or our handout code, or
contact Piazza for more.

• void* mmap(void* addr, size_t len, int prot, int flags, int fd, off_t offset)
Memory-map a portion of a file, returning the mapped address. Returns MAP_FAILED ==
(void*) -1 on error. Doesn’t work for all file types.

• int munmap(void* addr, size_t len)
Unmap a previously-mapped memory region.

• int madvise(void* addr, size_t len, int advice)
Provide prefetching advice for a portion of a memory-mapped region.

• int posix_fadvise(int fd, off_t pos, off_t len, int advice)
Provide prefetching advice for a portion of a file descriptor.

7. Linux system call

A system call is a procedure that provides the interface between a process and the operating
system. It is the way by which a computer program requests a service from the kernel of the
operating system.

Different operating systems execute different system calls.

In Linux, making a system call involves transferring control from unprivileged user mode to
privileged kernel mode; the details of this transfer vary from architecture to architecture. The
libraries take care of collecting the system-call arguments and, if necessary, arranging those
arguments in the special form necessary to make the system call.

How are system calls made?

When a program needs to access the operating system's kernel, it makes a system call.
System calls are the API through which the operating system exposes its services to user programs,
and they are the only way for a user program to enter the kernel. All programs or processes that
require resources for execution must use system calls, as they serve as an interface between the
operating system and user programs.

Below are some examples of how a system call varies from a user function.

1. A system call function may create and use kernel processes to execute the asynchronous
processing.

2. A system call has greater authority than a standard subroutine. A system call with kernel-
mode privilege executes in the kernel protection domain.

3. System calls are not permitted to use shared libraries or any symbols that are not present
in the kernel protection domain.

4. The code and data for system calls are stored in global kernel memory.

Why do you need system calls in Operating System?

System calls are required in many situations, including the following:

1. Creating or deleting a file in a file system.

2. Sending and receiving data packets over network connections.

3. Reading or writing a file.

4. Accessing hardware devices such as a printer or scanner.

5. Creating and managing new processes.

How System Calls Work

Applications run in an area of memory known as user space. A system call connects to the
operating system's kernel, which executes in kernel space. When an application makes a system
call, it cannot jump into kernel code directly; it executes a special trap instruction (a software
interrupt), which pauses the application's code and transfers control to the kernel.

If the request is permitted, the kernel performs the requested action, like creating or deleting a
file. When the operation is finished, the kernel copies any results from kernel space to user space
and returns to the application, which resumes execution with the kernel's output as its input.

A simple system call, like retrieving the system date and time, may take well under a microsecond.
A more complicated system call, such as connecting to a network device, may take
a few seconds. The kernel does not normally start a separate thread for each system call; the call
runs in kernel mode on behalf of the calling process. Modern operating systems are multi-threaded,
which means they can handle various system calls at the same time.

System calls are divided into 5 categories mainly:

i. Process Control
ii. File Management
iii. Device Management
iv. Information Maintenance
v. Communication

i. Process Control

These system calls perform process creation, process termination, and so on.

The Linux system calls in this category include fork(), exit(), and exec().

• fork()

• A new process is created by the fork() system call.

• A new process may be created with fork() without a new program being
run: the new sub-process simply continues to execute exactly the same
program that the first (parent) process was running.

• It is one of the most widely used system calls under process management.

• exit()

• The exit() system call is used by a program to terminate its execution.

• The operating system reclaims resources that were used by the process
after the exit() system call.

• exec()

• A new program will start executing after a call to exec().

• Running a new program does not require that a new process be created
first: any process may call exec() at any time. The currently running
program is replaced, and the new program starts executing
in the context of the existing process.

ii. File Management

File management system calls handle file manipulation jobs like creating, reading, and
writing files. The Linux system calls in this category include open(), read(), write(), and close().

a. open():

• It is the system call to open a file.

• This system call only opens the file; to read or write it, we need to
execute other system calls.

b. read():

• This system call reads data from an open file into a buffer; the file must
have been opened for reading.

• It does not modify the file's contents.

• Multiple processes can execute the read() system call on the same file
simultaneously.

c. write():

• This system call writes data from a buffer to an open file; the file must
have been opened for writing.

• We can edit the files with this system call.

• Multiple processes can execute write() on the same file simultaneously, but
the kernel does not coordinate them; without file locking or O_APPEND,
their output may overwrite or interleave.

d. close():

• This system call closes the opened file.

e. lseek()

• reposition read/write file offset

f. link()
• make a new name for a file

link() creates a new hard link named newpath to the existing file oldpath. If newpath exists it
will not be overwritten.

This new name may be used exactly as the old one for any operation; both names refer to the same
file (and so have the same permissions and ownership) and it is impossible to tell which name was
the 'original'.

g. symlink()

• make a new name for a file

symlink() creates a symbolic link named newpath which contains the string oldpath.

Symbolic links are interpreted at run-time as if the contents of the link had been substituted into
the path being followed to find a file or directory.

Symbolic links may contain .. path components, which (if used at the start of the link) refer to the
parent directories of that in which the link resides.

A symbolic link (also known as a soft link) may point to an existing file or to a nonexistent one; the
latter case is known as a dangling link.

The permissions of a symbolic link are irrelevant; the ownership is ignored when following the
link, but is checked when removal or renaming of the link is requested and the link is in a directory
with the sticky bit (S_ISVTX) set.

If newpath exists it will not be overwritten.

h. unlink()

• To delete a link (a path) in a directory we can use the unlink system call. Its syntax
is:

#include <unistd.h>

int unlink(const char* path);

The function returns 0 on success and -1 otherwise. It deletes the directory entry for path and
decrements the hard link counter in the file's i-node. If the number of links becomes 0 and no
process still has the file open, the space occupied by the file and its i-node is freed. On Linux,
unlink() cannot remove a directory; rmdir() is used for that.

System calls STAT, LSTAT and FSTAT

• In order to obtain more details about a file the following system call can be
used: stat, lstat or fstat.

#include <sys/types.h>

#include <sys/stat.h>

int stat(const char* path, struct stat* buf);

int lstat(const char* path, struct stat* buf);

int fstat(int fd, struct stat* buf);

These three functions return 0 on success and -1 on error. The first two take a file name as
input and fill in the structure pointed to by buf with information read from the file's i-node.
The fstat function is similar, but it works on a file that is already open and whose file
descriptor is known. The difference between stat and lstat is that, for a symbolic link, stat
returns information about the linked (referred) file, while lstat returns information about the
symbolic link itself. The struct stat structure is described in the sys/stat.h header and has the
following fields:

struct stat {
    mode_t st_mode;     /* file type & rights */
    ino_t st_ino;       /* i-node */
    dev_t st_dev;       /* device number (file system) */
    nlink_t st_nlink;   /* nr of links */
    uid_t st_uid;       /* owner ID */
    gid_t st_gid;       /* group ID */
    off_t st_size;      /* ordinary file size */
    time_t st_atime;    /* last time it was accessed */
    time_t st_mtime;    /* last time it was modified */
    time_t st_ctime;    /* last time settings were changed */
    dev_t st_rdev;      /* device number for special files */
    long st_blksize;    /* optimal size of the I/O block */
    long st_blocks;     /* nr of 512 byte blocks allocated */
};

The Linux command that uses this function most frequently is ls. Type declarations for the
members of this structure can be found in the sys/stat.h header. The type and access rights of the
file are encoded in the st_mode field; the file type can be tested with macros such as S_ISREG
(regular file), S_ISDIR (directory), S_ISLNK (symbolic link), S_ISCHR (character device), and
S_ISBLK (block device).

a) System call CHMOD

To modify the access rights for an existing file we use:

#include <sys/types.h>

#include <sys/stat.h>

int chmod(const char* path, mode_t mod);

The function returns 0 in case of a success and -1 otherwise. The chmod call modifies the access
rights of the file specified by the path depending on the access rights specified by
the mod argument. To be able to modify the access rights the effective UID of the process has to be
identical to the owner of the file or the process must have root rights.

The mod argument is built from the symbolic constants defined in
the sys/stat.h header; several rights are combined with a bitwise OR operation:

Bit masks for testing the access rights of a file

Mode      Description
S_ISUID   Sets the suid bit.
S_ISGID   Sets the sgid bit.
S_ISVTX   Sets the sticky bit.
S_IRWXU   Read, write, execute rights for the owner, obtained from: S_IRUSR | S_IWUSR | S_IXUSR
S_IRWXG   Read, write, execute rights for the group, obtained from: S_IRGRP | S_IWGRP | S_IXGRP
S_IRWXO   Read, write, execute rights for others, obtained from: S_IROTH | S_IWOTH | S_IXOTH

b) System call CHOWN

This system call is used to modify the owner (UID) and the group (GID) that a certain file
belongs to. The syntax of the function is:

#include <sys/types.h>

#include <unistd.h>

int chown(const char* path, uid_t owner, gid_t grp);

The function returns 0 on success and -1 on error. Calling this function changes
the owner and the group of the file specified by the argument path to the values specified by the
arguments owner and grp. Only the root user can change the owner of a file; ordinary users cannot,
even for their own files, but they can change the GID of their own files to that of any group they
belong to.

iii. Device Management

Device management system calls handle device manipulation such as reading from device buffers
and writing into device buffers. The Linux system call in this category is ioctl().

• ioctl()

• ioctl() stands for Input and Output Control.

• ioctl is a system call for device-specific input/output operations and other
operations which cannot be expressed by regular system calls.

iv. Information Maintenance

It handles information and its transfer between the OS and the user program. In addition, OS
keeps the information about all its processes and system calls are used to access this
information. The System calls under this are getpid(), alarm(), sleep().

• getpid()

• getpid stands for Get the Process ID.

• The getpid() function shall return the process ID of the calling process.

• The getpid() function shall always be successful and no return value is
reserved to indicate an error.

• alarm()

• This system call sets an alarm clock for the delivery of a signal after a
given number of seconds.

• It arranges for the signal (SIGALRM) to be delivered to the calling process.

• sleep()

• This call suspends the execution of the calling process
for some interval of time.

• During this interval, other processes are given a chance to execute.

v. Communication

These types of system calls are specially used for inter-process communications.

Two models are used for inter-process communication

1. Message Passing (processes exchange messages with one another)

2. Shared memory (processes share memory region to communicate)

The system calls under this are pipe() , shmget() ,mmap().

• pipe()

• The pipe() system call is used to communicate between different Linux


processes.

• It is mainly used for inter-process communication.

• The pipe() system function is used to open file descriptors.

• shmget()

• shmget stands for shared memory segment.

• It is mainly used for Shared memory communication.

• This system call returns the identifier of the shared memory segment associated with a
given key, creating the segment if requested. Processes then communicate by reading and
writing the shared segment.

• mmap()

• This system call is used to map files or devices into memory; the companion call
munmap() removes a mapping.

• The mmap() system call is responsible for mapping the content of the file
to the virtual memory space of the process.
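
The message-passing model can be sketched with pipe() and fork(): a child process writes a message into the pipe and the parent reads it. This is a minimal sketch, not a complete IPC design; the function name pipe_from_child and its parameters are hypothetical.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Sends msg from a child process to the parent through a pipe.
 * Stores what the parent received in buf (NUL-terminated) and returns
 * the number of bytes read, or -1 on error.
 */
ssize_t pipe_from_child(const char *msg, char *buf, size_t bufsize)
{
    int fds[2];                     /* fds[0] = read end, fds[1] = write end */

    if (bufsize == 0 || pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                 /* child: writer */
        close(fds[0]);
        ssize_t written = write(fds[1], msg, strlen(msg));
        close(fds[1]);
        _exit(written == (ssize_t)strlen(msg) ? 0 : 1);
    }

    /* parent: reader */
    close(fds[1]);                  /* so read() reports EOF once the child is done */
    size_t total = 0;
    ssize_t n = 0;
    while (total < bufsize - 1 &&
           (n = read(fds[0], buf + total, bufsize - 1 - total)) > 0)
        total += (size_t)n;
    buf[total] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);          /* reap the child */

    return n < 0 ? -1 : (ssize_t)total;
}
```

Each process closes the end of the pipe it does not use; otherwise the reader would never see end-of-file.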

• Examples of Windows and Unix system calls

• Examples of Windows and Unix system calls in each category are listed in the table below:

Category                  Windows                          Unix

Process Control           CreateProcess()                  fork()
                          ExitProcess()                    exit()
                          WaitForSingleObject()            wait()

File Manipulation         CreateFile()                     open()
                          ReadFile()                       read()
                          WriteFile()                      write()
                          CloseHandle()                    close()

Device Management         SetConsoleMode()                 ioctl()
                          ReadConsole()                    read()
                          WriteConsole()                   write()

Information Maintenance   GetCurrentProcessId()            getpid()
                          SetTimer()                       alarm()
                          Sleep()                          sleep()

Communication             CreatePipe()                     pipe()
                          CreateFileMapping()              shmget()
                          MapViewOfFile()                  mmap()

Protection                SetFileSecurity()                chmod()
                          InitializeSecurityDescriptor()   umask()
                          SetSecurityDescriptorGroup()     chown()

8. Device Drivers in Linux

Drivers help hardware devices interact with the operating system. In Windows,
all the devices and drivers are grouped together in a single console called Device Manager. In
Linux, even the hardware devices are treated like ordinary files, which makes it easier for
software to interact with the device drivers. When a device is connected to the system, a device
file is created in the /dev directory.

Most Common types of devices in Linux

1. Character devices – These devices transfer data character by character, for example
a mouse or a keyboard.

2. Block devices – These devices transfer data in fixed-size units called blocks, for
example USB drives, hard drives, and CD-ROMs.

To list all the device files, use the below command.

ls -l /dev

In the output of this command, the first character of each line gives the file type: b marks a
block device and c marks a character device. Some entries have names such as /dev/sda or /dev/sdb.
In Linux, disk names are assigned alphabetically. For example, /dev/sda is the first hard drive,
/dev/sdb is the second hard drive, and so on. These are mass storage devices such as memory
sticks and hard drives. Hence, sda means that this device was detected by the computer first.

Examples of character devices are /dev/console and /dev/ttyS0. These devices are accessed as a
stream of bytes. An example of a block device is /dev/sdXN, where X is the drive letter and N is
the partition number (for instance /dev/sda1). Linux lets a program read and write a block device
in transfers of any size, although the device itself moves data in whole blocks.

Pseudo devices act as device drivers without an actual device. Examples of pseudo devices
are /dev/null, /dev/zero, /dev/random, etc.
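
The same distinction that ls -l shows with b and c can be checked from a program. The sketch below uses stat() with the S_ISCHR and S_ISBLK macros; the function name device_kind is hypothetical.

```c
#include <stddef.h>
#include <sys/stat.h>

/*
 * Classifies a path as a character device, a block device, or something
 * else. Returns NULL if the path cannot be examined.
 */
const char *device_kind(const char *path)
{
    struct stat st;

    if (stat(path, &st) == -1)
        return NULL;
    if (S_ISCHR(st.st_mode))
        return "character";
    if (S_ISBLK(st.st_mode))
        return "block";
    return "other";
}
```

On a typical Linux system, device_kind("/dev/null") should report a character device, while a path such as /dev/sda (if present) should report a block device.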

9. Disk and Driver Commands

i. fdisk – Commonly expanded as "format disk". With the -l option, this command displays
the partitions on each disk and other details such as their size and type.

sudo fdisk -l

ii. sfdisk – This command displays the partitions on the disk, the size of each partition in
MB, etc.

iii. parted – This command helps list and modify the partitions of the disk.

sudo parted -l

iv. df – Displays the disk space usage of mounted file systems. Filtering the output with grep
keeps only the entries backed by real devices under /dev.

df -h | grep ^/dev

v. lsblk – List details about the block devices.

lsblk

vi. inxi – Lists details about the hardware components of the system; the -D option shows
disk information.

inxi -D -xx
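
The kind of figures that df reports can also be read from a program. The sketch below is an illustration using the POSIX statvfs() call, which these notes do not otherwise cover; the function name fs_usage and its parameters are hypothetical.

```c
#include <sys/statvfs.h>

/*
 * Reports the total size and the space available to unprivileged users
 * (both in bytes) of the file system containing path.
 * Returns 0 on success, -1 on error.
 */
int fs_usage(const char *path, unsigned long long *total,
             unsigned long long *avail)
{
    struct statvfs vfs;

    if (statvfs(path, &vfs) == -1)
        return -1;

    /* Block counts are expressed in units of f_frsize bytes. */
    *total = (unsigned long long)vfs.f_blocks * vfs.f_frsize;
    *avail = (unsigned long long)vfs.f_bavail * vfs.f_frsize;
    return 0;
}
```

Calling fs_usage("/", &total, &avail) fills in the figures for the root file system.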

10. Directory API

1. opendir()

• The opendir() function shall open a directory stream corresponding to the directory named
by the dirname argument. The directory stream is positioned at the first entry. If the type
DIR is implemented using a file descriptor, applications shall only be able to open up to a
total of {OPEN_MAX} files and directories.

2. readdir()

readdir, readdir_r - read a directory

SYNOPSIS

#include <dirent.h>
struct dirent *readdir(DIR *dirp);

int readdir_r(DIR *restrict dirp, struct dirent *restrict entry,
              struct dirent **restrict result);

DESCRIPTION

The type DIR, which is defined in the <dirent.h> header, represents a directory stream, which is an
ordered sequence of all the directory entries in a particular directory. Directory entries represent
files; files may be removed from a directory or added to a directory asynchronously to the
operation of readdir().

The readdir() function shall return a pointer to a structure representing the directory entry at the
current position in the directory stream specified by the argument dirp, and position the directory
stream at the next entry. It shall return a null pointer upon reaching the end of the directory
stream. The structure dirent defined in the <dirent.h> header describes a directory entry.

The readdir() function shall not return directory entries containing empty names. If entries for dot
or dot-dot exist, one entry shall be returned for dot and one entry shall be returned for dot-dot;
otherwise, they shall not be returned.

The pointer returned by readdir() points to data which may be overwritten by another call
to readdir() on the same directory stream. This data is not overwritten by another call to readdir()
on a different directory stream.

If a file is removed from or added to the directory after the most recent call
to opendir() or rewinddir(), whether a subsequent call to readdir() returns an entry for that file is
unspecified.

The readdir() function may buffer several directory entries per actual read operation; readdir()
shall mark for update the st_atime field of the directory each time the directory is actually read.

After a call to fork(), either the parent or child (but not both) may continue processing the
directory stream using readdir(), rewinddir(), or seekdir(). If both the parent and child
processes use these functions, the result is undefined.

If the entry names a symbolic link, the value of the d_ino member is unspecified.

The readdir() function need not be reentrant. A function that is not required to be reentrant is not
required to be thread-safe.

The readdir_r() function shall initialize the dirent structure referenced by entry to represent the
directory entry at the current position in the directory stream referred to by dirp, store a pointer
to this structure at the location referenced by result, and position the directory stream at the next
entry.

The storage pointed to by entry shall be large enough for a dirent with an array
of char d_name members containing at least {NAME_MAX}+1 elements.

Upon successful return, the pointer returned at *result shall have the same value as the
argument entry. Upon reaching the end of the directory stream, this pointer shall have the value
NULL.

The readdir_r() function shall not return directory entries containing empty names.

If a file is removed from or added to the directory after the most recent call
to opendir() or rewinddir(), whether a subsequent call to readdir_r() returns an entry for that file
is unspecified.

The readdir_r() function may buffer several directory entries per actual read operation;
the readdir_r() function shall mark for update the st_atime field of the directory each time the
directory is actually read.

Applications wishing to check for error situations should set errno to 0 before calling readdir().
If errno is set to non-zero on return, an error occurred.

3. closedir()

• closedir - close a directory stream

SYNOPSIS

#include <dirent.h>
int closedir(DIR *dirp);

DESCRIPTION

The closedir() function shall close the directory stream referred to by the argument dirp. Upon
return, the value of dirp may no longer point to an accessible object of the type DIR. If a file
descriptor is used to implement type DIR, that file descriptor shall be closed.

RETURN VALUE

Upon successful completion, closedir() shall return 0; otherwise, -1 shall be returned and errno set
to indicate the error.

ERRORS

The closedir() function may fail if:

[EBADF]

The dirp argument does not refer to an open directory stream.

[EINTR]

The closedir() function was interrupted by a signal.
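
The three calls described above are normally used together. The following is a minimal sketch that opens a directory, reads every entry, and closes the stream, setting errno to 0 before each readdir() call as recommended above. The function name list_directory is hypothetical, and skipping the dot and dot-dot entries is a choice made for this example.

```c
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Prints every entry of the directory except "." and "..".
 * Returns the number of entries printed, or -1 on error (errno is set).
 */
int list_directory(const char *dirname)
{
    DIR *dirp = opendir(dirname);
    if (dirp == NULL)
        return -1;

    int count = 0;
    struct dirent *entry;

    for (;;) {
        errno = 0;                  /* tell end of stream apart from an error */
        entry = readdir(dirp);
        if (entry == NULL)
            break;
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0)
            continue;
        printf("%s\n", entry->d_name);
        count++;
    }

    int saved = errno;              /* non-zero only if readdir() failed */
    if (closedir(dirp) == -1 && saved == 0)
        saved = errno;
    if (saved != 0) {
        errno = saved;
        return -1;
    }
    return count;
}
```

The order in which the entries are printed depends on the file system, so a program that needs sorted output has to sort the names itself.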

4. umask()

The umask command in Linux is used to set the default permissions for files and directories the
user creates. The umask() system call does the same for the calling process.

How does the umask command work?

• The umask specifies the permissions that the user does not want to be given
to a newly created file or directory.

• The permissions of a new file or directory are the requested permissions bitwise ANDed
with the bitwise complement of the umask (where the bits are inverted, i.e. 1 becomes 0
and 0 becomes 1).

• The bits which are set in the umask value refer to the permissions which are not
assigned by default, as these bits are removed from the maximum permission for
files/directories (666 for files, 777 for directories). For example, with a umask of 022,
a new file gets 666 & ~022 = 644 and a new directory gets 777 & ~022 = 755.

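How to calculate umask value?

Syntax:

$ umask

Run without arguments, the command prints the current mask (for example 0022). Passing an
octal value, as in umask 027, sets a new mask.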
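
The bitwise rule can be checked with a short sketch: one function applies the formula directly and another creates a real file under a given mask and reports the permissions it received. The names effective_mode and create_with_umask are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Permission bits a new file ends up with: requested AND NOT mask. */
mode_t effective_mode(mode_t requested, mode_t mask)
{
    return requested & ~mask & 0777;
}

/*
 * Creates path with open(..., requested) while the process umask is mask,
 * restores the previous umask, and returns the resulting permission bits,
 * or (mode_t)-1 on error.
 */
mode_t create_with_umask(const char *path, mode_t requested, mode_t mask)
{
    mode_t old = umask(mask);       /* umask() returns the previous mask */
    int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, requested);
    umask(old);
    if (fd == -1)
        return (mode_t)-1;

    struct stat st;
    int rc = fstat(fd, &st);
    close(fd);
    if (rc == -1)
        return (mode_t)-1;
    return st.st_mode & 0777;
}
```

With a mask of 022, effective_mode(0666, 022) evaluates to 0644, matching what create_with_umask() reports for a file created with mode 0666 on an ordinary file system.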

11. File Links

A. Soft and Hard links in Unix/Linux

A link in UNIX is a pointer to a file. Like pointers in a programming language, links in UNIX
point to a file or a directory. Creating a link is a kind of shortcut for accessing a file.
Links allow more than one file name, possibly in different directories, to refer to the same file.

There are two types of links:

1. Soft Link or Symbolic links

2. Hard Links

These links behave differently when the source of the link (what is being linked to) is moved or
removed. Symbolic links are not updated (they merely contain a string which is the path name
of their target); hard links always refer to the source, even if it is moved or removed.

For example, suppose we have a file a.txt. If we create a hard link to the file and then delete
the file, we can still access its contents through the hard link. But if we create a soft link to
the file and then delete the file, we can no longer access it through the soft link, and the soft
link becomes dangling. Basically, a hard link increases the reference count of a location, while
a soft link works as a shortcut (like in Windows).

1. Hard Links
Each hard-linked file is assigned the same inode value as the original, therefore they reference
the same physical file location. Hard links are more flexible and remain linked even if the
original or linked files are moved throughout the file system, although hard links are unable to
cross different file systems.

• The ls -l command shows the number of links to each file in its link-count column.

• Every hard link refers to the actual file contents.

• Removing any link just reduces the link count; it doesn’t affect the other links.

• Even if we change the filename of the original file, the hard links still work
properly.

• We cannot create a hard link to a directory; this avoids recursive loops.

• If the original file is removed, the link will still show the content of the file.

• The size of any hard link is the same as that of the original file, and if we change the
content through any of the hard links, the size shown for all of them is updated.

• The disadvantage of hard links is that they cannot be created for files on different file
systems and they cannot be created for directories.

• Command to create a hard link is:

$ ln [original filename] [link name]
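
The same operation is available to programs through the link() system call. The sketch below creates a hard link and confirms that both names share one inode before reporting the link count; the function name hard_link_count is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/stat.h>
#include <unistd.h>

/*
 * Creates the hard link linkpath for the existing file target.
 * Returns the link count of the file afterwards, or -1 on error
 * (including the unexpected case where the two names differ in inode).
 */
long hard_link_count(const char *target, const char *linkpath)
{
    struct stat a, b;

    if (link(target, linkpath) == -1)
        return -1;
    if (stat(target, &a) == -1 || stat(linkpath, &b) == -1)
        return -1;

    /* Both names must refer to the same inode on the same device. */
    if (a.st_ino != b.st_ino || a.st_dev != b.st_dev)
        return -1;
    return (long)b.st_nlink;
}
```

For a file that had a single name, the function should return 2. Calling unlink() on the original name afterwards reduces the count to 1 and leaves the contents reachable through the link.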

2. Soft Links

• A soft link is similar to the file shortcut feature of Windows operating
systems. Each soft link has its own inode, separate from that of the original file,
and stores the path of the original file. As with hard links, changes made to the data
through either name are reflected in the other. Soft links can span different file
systems, although if the original file is deleted or moved, the soft link will not
work correctly (it becomes a dangling link).

• The ls -l command shows each soft link with l as the first character of the first
column, followed by the link name, an arrow (->), and the original file it points to.

• A soft link contains the path of the original file and not the contents.

• Removing a soft link doesn’t affect anything, but if the original file is removed, the
link becomes a “dangling” link which points to a nonexistent file.

• A soft link can link to a directory.

• The size of a soft link is equal to the length of the path of the original file we gave.
E.g. if we create a link with ln -s /tmp/hello.txt /tmp/link.txt, then the size of the link
will be 14 bytes, which is equal to the length of “/tmp/hello.txt”.

• If we change the name of the original file, then all the soft links to that file become
dangling, i.e. they no longer work.

• Link across file systems: If you want to link files across file systems, you can only
use symlinks/soft links.

• Command to create a Soft link is:

$ ln -s [original filename] [link name]
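
Programs create soft links with the symlink() system call. The sketch below shows two of the properties listed above: the size of a link equals the length of the stored path, and a link becomes dangling once its target is gone. The names symlink_size and is_dangling are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Creates the symbolic link linkpath whose content is the string target.
 * Returns the size of the link itself as reported by lstat(), or -1 on error.
 */
long symlink_size(const char *target, const char *linkpath)
{
    struct stat st;

    if (symlink(target, linkpath) == -1)
        return -1;
    if (lstat(linkpath, &st) == -1)     /* lstat() examines the link itself */
        return -1;
    return (long)st.st_size;
}

/* Returns 1 if path is a symbolic link whose target does not exist. */
int is_dangling(const char *path)
{
    struct stat st;

    if (lstat(path, &st) == -1 || !S_ISLNK(st.st_mode))
        return 0;
    /* stat() follows the link, so it fails with ENOENT if the target is gone. */
    return stat(path, &st) == -1 && errno == ENOENT;
}
```

For the link created by ln -s /tmp/hello.txt /tmp/link.txt, lstat() should report a size of 14, and is_dangling("/tmp/link.txt") should become 1 after /tmp/hello.txt is removed.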
