Linux Fundamentals
Linux is a free and open-source operating system (OS) based
on Unix. It acts as a bridge between computer hardware and
software applications. Unlike proprietary systems like Windows or
macOS,
Linux is developed collaboratively and distributed under the GNU
General Public License (GPL).
It is a kernel-based system used in servers, desktops, mobile devices, and embedded systems, with community-driven development.
Components of a Linux System
1. Kernel: The core component that manages hardware, memory,
processes, and system calls.
2. System Libraries: Provide functions for programs to interact
with the kernel.
3. System Utilities: Basic tools for managing the system (e.g., file
handling, user management).
4. Shell: Command-line interface (e.g., Bash) for user interaction.
5. Applications: Software that runs on top of the OS (e.g.,
browsers, editors).
Linux File System Hierarchy
Hierarchical structure starting from / (root)
Key directories:
/bin: Essential binaries
/etc: Configuration files
/home: User directories
/var: Variable data (logs, mail)
/usr: User programs, libraries, and read-only data
/dev: Device files
/proc: Kernel and process info
Basic Linux Commands
Command Description
ls List directory contents
cd Change directory
pwd Print working directory
cp Copy files/directories
mv Move/rename files
rm Remove files/directories
mkdir Create directory
touch Create empty file
chown Change ownership
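For example, a short session combining these commands might look like this (the file and directory names are placeholders):
mkdir project            # create a directory
cd project               # move into it
touch notes.txt          # create an empty file
cp notes.txt backup.txt  # copy it
mv backup.txt old.txt    # rename the copy
pwd                      # print the current path
ls                       # list contents: notes.txt old.txt
rm old.txt               # remove the copy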
File Permissions
Three types of users:
Owner
Group
Others
Permission types:
Read (r)
Write (w)
Execute (x)
Example: -rwxr-xr--
Owner: read, write, execute
Group: read, execute
Others: read
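Permissions are typically changed with chmod; a minimal sketch matching the example above (the file name is hypothetical):
ls -l script.sh            # -rw-r--r-- ... script.sh (before)
chmod 754 script.sh        # 7 = rwx (owner), 5 = r-x (group), 4 = r-- (others)
ls -l script.sh            # -rwxr-xr-- ... script.sh (after)
chown alice:dev script.sh  # change owner and group (needs sufficient privileges)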
Networking
A network is a group of interconnected devices that can
communicate and share data.
Types of Networks
LAN (Local Area Network)
WAN (Wide Area Network)
MAN (Metropolitan Area Network)
PAN (Personal Area Network)
WLAN (Wireless LAN)
VPN (Virtual Private Network)
Common topologies include Star, Bus, Ring, Mesh, Tree, and Hybrid.
Network components: This includes hardware like routers, switches,
and wireless access points, as well as software like operating
systems and applications.
Protocols: Protocols are sets of rules that govern how data is
exchanged over a network. Examples include TCP/IP, DNS, and
DHCP.
Network security: This includes measures to protect the network
from unauthorized access and data breaches.
IP addressing: IP addresses are numerical labels that identify
devices on a network.
OSI model: A layered model that describes how data is transmitted
over a network.
Network topology: The physical or logical arrangement of devices in
a network.
Firewalls: Network security devices that monitor and filter incoming and outgoing network traffic.
Routers: Devices that forward data packets between computer networks.
Switches: Devices that connect devices within a network and use
MAC addresses to forward data to the correct destination.
Subnetting: Dividing a large network into smaller subnetworks to improve efficiency; for example, one /24 network of 256 addresses can be split into four /26 subnets of 64 addresses each (see the sketch after this list).
Ethernet: A common networking technology for wired connections.
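A minimal sketch of that subnetting arithmetic using Python's standard ipaddress module (the 192.168.1.0/24 block is just an example):
import ipaddress

# Split a /24 network (256 addresses) into four /26 subnets (64 addresses each)
network = ipaddress.ip_network("192.168.1.0/24")
for subnet in network.subnets(new_prefix=26):
    print(subnet, "-", subnet.num_addresses, "addresses")
# 192.168.1.0/26 - 64 addresses
# 192.168.1.64/26 - 64 addresses ... and so on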
Git and GitHub
What are Git and GitHub?
Git is a distributed version control system that helps developers
track changes in source code during software development. It
allows multiple people to work on a project simultaneously
without overwriting each other's work.
GitHub is a cloud-based platform that hosts Git repositories and
provides tools for collaboration, code review, and project
management.
Some of its features are remote repositories, pull requests and code reviews, issues and project boards, and Actions for CI/CD.
Concept            Description
Repository (repo)  A directory tracked by Git
Commit             A snapshot of changes with a message
Clone              Copy a remote repo to your local machine
Push               Upload local changes to a remote repo
Pull               Download changes from a remote repo
Status             Shows current changes
Log                Shows commit history
Branching in Git
Branching allows you to diverge from the main line of development
and work on features or fixes independently.
Common Branches:
- main or master: Production-ready code
- feature/xyz: New feature development
- bugfix/abc: Fixing a bug
- dev: Development integration
Commands:
git branch feature-login # Create a branch
git checkout feature-login # Switch to the branch
git switch -c feature-login # Create and switch (newer syntax)
Merging in Git
Merging integrates changes from one branch into another.
Types of Merges:
- Fast-forward: Linear history
- Three-way merge: When branches diverge
Commands:
git checkout main
git merge feature-login
If there are conflicts, Git will prompt you to resolve them manually.
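A sketch of that resolution flow (the file name login.py is illustrative):
git merge feature-login   # CONFLICT (content): Merge conflict in login.py
# Edit login.py to remove the <<<<<<< / ======= / >>>>>>> markers,
# keeping the lines you want, then:
git add login.py
git commit                # completes the merge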
Pull Request in Git
A Pull Request is a way to propose changes to a repository. It allows
team members to review, discuss, and approve changes before
merging.
PR Workflow:
1. Create a branch
2. Make changes and commit
3. Push to GitHub
4. Open a pull request
5. Review and discuss
6. Merge into the main branch
Example workflow
# Clone a repo
git clone https://github.com/user/repo.git
# Create a new branch
git checkout -b feature-xyz
# Make changes and commit
git add .
git commit -m "Add feature xyz"
# Push to GitHub
git push origin feature-xyz
# Open a pull request on GitHub
Bash Scripting
Bash (Bourne Again SHell) is a Unix shell and command language. It
is the default shell on most Linux distributions and macOS.
A Bash script is a plain text file containing a series of commands that
the shell can execute.
To run a Bash script, save it as script.sh, make it executable with chmod +x script.sh, and run it with ./script.sh.
Why Use Bash Scripting?
Automate repetitive tasks
Manage system operations
Schedule jobs (with cron)
Create custom command-line tools
Simplify complex command sequences
Syntax
#!/bin/bash            # Shebang line: tells the system to use Bash
# This is a comment
echo "Hello, World!"   # echo prints text to the terminal
Variables
name="Alice"
echo "Hello, $name“
Note: Use $ to reference variables
Control Structures
If-Else
if [ "$name" == "Alice" ]; then
echo "Welcome, Alice!"
else
echo "Access denied."
fi
For
for i in 1 2 3; do
echo "Number $i"
done
While
count=1
while [ $count -le 5 ]; do
echo "Count is $count"
((count++))
done
Read Input:
read -p "Enter your name: " username
echo "Hello, $username"
Write Output:
echo "Log entry" >> log.txt # Append to file
Functions
greet() {
echo "Hello, $1"
}
greet "Bob"
Commands
Command Purpose
ls List files
cp Copy files
mv Move/rename files
rm Delete files
grep Search text
awk Pattern scanning
sed Stream editing
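A few one-liners showing these commands in use (the file names are placeholders):
grep "ERROR" app.log          # print lines containing ERROR
grep -c "ERROR" app.log       # count matching lines
awk '{print $1}' access.log   # print the first field of each line
sed 's/foo/bar/g' config.txt  # replace every foo with bar (writes to stdout)
ls -l | grep "^d"             # combine commands: show only directories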
Python File Handling
Python offers built-in functions for file manipulation. Key operations include:
• Opening files: Using open(), with modes like 'r' for reading, 'w' for writing (overwrites existing content), 'a' for appending, and 'r+' for reading and writing.
• Reading files: Methods like read(), readline(), and readlines() to access file content.
• Writing files: Using write() to add content.
• Closing files: The close() method or the with statement ensures proper resource management.
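A minimal sketch of these operations (the file name is hypothetical):
# Write, append, then read a file; 'with' closes the file automatically
with open("notes.txt", "w") as f:    # 'w' overwrites existing content
    f.write("first line\n")
with open("notes.txt", "a") as f:    # 'a' appends to the end
    f.write("second line\n")
with open("notes.txt", "r") as f:    # 'r' reads
    for line in f.readlines():
        print(line.strip())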
Subprocess
The subprocess module enables running external commands and
managing processes.
subprocess.run(): Executes a command and waits for completion,
returning a CompletedProcess object. It is the recommended method
for most use cases.
Code:
import subprocess

result = subprocess.run(['ls', '-l'], capture_output=True, text=True)
print(result.stdout)
subprocess.Popen(): Offers more control over process creation and
management, allowing interaction with input/output streams.
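A minimal sketch of Popen interacting with a process's streams (the grep pattern and input are illustrative):
import subprocess

# Start grep and feed it input through stdin, reading matches from stdout
proc = subprocess.Popen(['grep', 'error'],
                        stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE,
                        text=True)
out, _ = proc.communicate("no problem\nerror: disk full\n")
print(out)   # prints: error: disk full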
Automation
Python is widely used for automation due to its versatility and libraries.
File management: Automating tasks such as copying, moving, renaming, and deleting files using the os and shutil modules (see the sketch after this list).
Task scheduling: Using tools like cron (Linux/macOS) or Task Scheduler
(Windows) to run scripts at specific times.
System administration: Automating backups, service management,
and other system tasks through subprocess.
Workflow automation: Integrating different tools and services to
streamline processes.
Web scraping: Automating data extraction from websites.
Data processing: Automating data cleaning, transformation, and
analysis.
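A minimal file-management sketch with os and shutil (all paths are hypothetical):
import os
import shutil

# Create a backup directory if it does not already exist
os.makedirs("backup", exist_ok=True)

# Copy every .log file in the current directory into backup/
for name in os.listdir("."):
    if name.endswith(".log"):
        shutil.copy(name, os.path.join("backup", name))

# Rename (move) a file if it exists
if os.path.exists("old_report.txt"):
    shutil.move("old_report.txt", "report.txt")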
Docker Architecture
Docker is a platform for developing, shipping, and running applications. It
uses containerization to provide a lightweight, portable, and consistent
runtime environment.
Key Components:
Docker Engine: Core component that runs and manages containers.
Docker Daemon: Background service that manages Docker containers.
Docker Client: Command-line interface to interact with Docker Daemon.
Docker Registry: Repository for Docker images (e.g., Docker Hub).
Docker Compose: Tool for defining and running multi-container Docker
applications.
Building Docker Images
Docker images are the blueprints for containers. They contain everything
needed to run an application.
Steps to Build an Image:
1. Write a Dockerfile: Define the image's contents and instructions.
2. Build the Image: Use `docker build` command to create the image.
3. Tag the Image: Assign a name and version to the image.
4. Push to Registry: Upload the image to a Docker registry for sharing.
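The steps above might map onto these commands (the image and registry names are placeholders):
docker build -t myapp .            # 2. build from the Dockerfile in the current directory
docker tag myapp myuser/myapp:1.0  # 3. assign a name and version
docker push myuser/myapp:1.0       # 4. upload to a registry (after docker login)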
Running a Docker Container
Docker containers are instances of Docker images. They run
applications in isolated environments.
Steps to Run a Container:
1. Pull the Image: Download the image from a registry using
`docker pull`.
2. Run the Container: Use `docker run` command to start the
container.
3. Manage Containers: Use commands like `docker ps`, `docker
stop`, `docker rm` to manage containers.
4. Access Container: Use `docker exec` to run commands inside the
container.
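The steps above might look like this for the official nginx image (the container name and ports are illustrative):
docker pull nginx                          # 1. download the image
docker run -d -p 8080:80 --name web nginx  # 2. start detached, map host port 8080 to 80
docker ps                                  # 3. list running containers
docker exec -it web /bin/sh                # 4. open a shell inside the container
docker stop web && docker rm web           # stop and remove when finished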
Writing a Dockerfile
A Dockerfile is a text file that contains instructions for building a
Docker image.
Example Dockerfile:
FROM python:3.9-slim                 # base image
WORKDIR /app                         # set the working directory
COPY . .                             # copy project files into the image
RUN pip install -r requirements.txt  # install dependencies
CMD ["python", "app.py"]             # default command when the container starts
AWS-EC2
AWS, or Amazon Web Services, is a comprehensive cloud computing platform
offered by Amazon. It provides a wide range of services, including compute,
storage, databases, and analytics, on a pay-as-you-go basis. AWS is used by
businesses, startups, and government agencies to lower costs, become more
agile, and innovate faster.
Its services include EC2, VPC, IAM, S3, and many more.
EC2 (Elastic Compute Cloud) provides scalable computing capacity in the AWS cloud.
Key features include a variety of instance types, Auto Scaling, Elastic Load Balancing, and integration with other AWS services.
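With the AWS CLI, launching an instance might look like this (the AMI ID and key name are placeholders):
aws ec2 run-instances \
    --image-id ami-0abcdef1234567890 \
    --instance-type t2.micro \
    --key-name my-key
aws ec2 describe-instances   # list instances and their state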
AWS-S3: Simple Storage Service
AWS S3 is an object storage service that offers industry-leading scalability,
data availability, security, and performance.
Amazon S3 is widely used in the cloud because of its robust features for scaling and securing data.
Use Cases of S3 include backup and restore, data archiving, big data analytics
and content storage and distribution.
Amazon S3 organizes data into uniquely named buckets, which can be customized with access controls.
Key features include unlimited storage, versioning, lifecycle management, and cross-region replication.
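Typical AWS CLI operations against a bucket (the bucket name is a placeholder):
aws s3 mb s3://my-example-bucket                 # make a bucket
aws s3 cp report.csv s3://my-example-bucket/     # upload a file
aws s3 ls s3://my-example-bucket/                # list the bucket's contents
aws s3 sync ./logs s3://my-example-bucket/logs   # sync a local directory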
AWS-IAM
Identity and Access Management (IAM) is a web service that
helps you securely control access to AWS resources. It allows you
to manage users, groups, roles, and policies, ensuring that the
right people and applications have the necessary permissions.
AWS IAM enables you to manage access to AWS services and
resources securely.
Key features include fine-grained permissions, multi-factor authentication (MFA), temporary security credentials, and integration with other AWS services.
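A sketch of common IAM CLI calls (the user name is a placeholder; ReadOnlyAccess is an AWS-managed policy):
aws iam create-user --user-name dev-user
aws iam attach-user-policy \
    --user-name dev-user \
    --policy-arn arn:aws:iam::aws:policy/ReadOnlyAccess
aws iam list-attached-user-policies --user-name dev-user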
AWS-VPC
An AWS Virtual Private Cloud (VPC) is a logically isolated section
of the AWS cloud where you can launch AWS resources in a
virtual network you define. You have complete control over your
networking environment within the VPC, including IP address
ranges, subnets, and route tables. This allows you to create a
network that closely resembles your own data center with the
benefits of AWS's scalable infrastructure.
Key features include subnets, route tables, internet gateways, security groups, and network ACLs.
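Creating a VPC and a subnet from the CLI might look like this (the CIDR blocks are examples; the VPC ID is a placeholder for the value returned by the first command):
aws ec2 create-vpc --cidr-block 10.0.0.0/16
aws ec2 create-subnet --vpc-id vpc-0abc123 --cidr-block 10.0.1.0/24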
Jenkins
Jenkins is a tool that helps software development teams automate the
process of making their applications ready for release (building, testing, and
deploying).
CI (Continuous Integration):
This focuses on integrating code changes frequently into a shared repository
and automatically running tests to catch any issues early on.
CD (Continuous Delivery):
This extends CI by automating the building, packaging, and deployment of
code to various environments (like test, staging, and production).
Benefits of using CI/CD with Jenkins:
Faster development cycles
Improved code quality
Reduced deployment errors
Increased collaboration
Easier maintenance
To create a simple CI/CD pipeline in Jenkins, you first need to install and
configure Jenkins. Then, you can create a new pipeline project within Jenkins,
define the stages of your pipeline in a Jenkinsfile, and configure the project to
trigger on changes to your code repository.
Installing Jenkins
1. Install Java Development Kit (JDK)
2. Download Jenkins from the official website
3. Run Jenkins installer
4. Access Jenkins through http://localhost:8080
5. Complete the setup wizard
Creating Pipeline
1. Open Jenkins dashboard
2. Click on 'New Item'
3. Enter a name and select 'Pipeline'
4. Configure the pipeline script
5. Save and run the pipeline
pipeline {
    agent any
    stages {
        stage('Build') {
            steps {
                echo 'Building...'
            }
        }
        stage('Test') {
            steps {
                echo 'Testing...'
            }
        }
        stage('Deploy') {
            steps {
                echo 'Deploying...'
            }
        }
    }
}
Terraform
Terraform is a popular Infrastructure as Code (IaC) tool that allows you to
define and manage your infrastructure resources in a declarative manner,
using code.
Instead of manually configuring infrastructure through interfaces, you define
your desired state in configuration files, which Terraform then uses to
provision and manage your resources.
What is Infrastructure as Code (IaC)?
IaC is the practice of managing and provisioning computer data centers
through machine-readable definition files rather than physical hardware
configuration or interactive configuration tools. It allows you to treat your
infrastructure like software, with version control, repeatability, and scalability.
How Does Terraform Work
1. Define Your Desired State:
You write configuration files in Terraform's HashiCorp Configuration Language (HCL) to describe the infrastructure resources you want to create, such as virtual machines, networks, and storage.
2. Plan the Changes:
Terraform analyzes your configuration files and compares them to your existing
infrastructure, generating a plan of changes required to reach your desired state.
3. Apply the Changes:
Terraform executes the plan, provisioning and managing your infrastructure resources
to match your configuration.
4. State Management:
Terraform keeps track of your infrastructure resources in a state file, which is used to
determine the changes required to maintain consistency between your configuration
and the actual infrastructure.
5. Versioning and Collaboration:
You can version control your Terraform configuration files, allowing you to track
changes, collaborate with others, and roll back to previous versions if needed.
HCL (HashiCorp Configuration Language) is the syntax used by Terraform, an
infrastructure-as-code tool, to define and manage your cloud infrastructure. It's
designed to be human-readable and allows you to express the desired state of
your infrastructure resources in a concise and declarative way. HCL is a key
component of Terraform, enabling you to automate the provisioning and
management of resources across multiple cloud providers and on-premises
environments.
Key aspects of HCL in Terraform include its declarative style, human readability, its role in infrastructure as code, and core constructs such as blocks, attributes, and variables.
HCL is the language that powers Terraform, enabling you to define and manage
your infrastructure as code in a human-readable and maintainable way. It plays
a crucial role in the automation and orchestration of cloud resources.
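As a minimal sketch, assuming the AWS provider and an illustrative, globally unique bucket name (not a definitive setup), a configuration might look like this:
# main.tf -- illustrative only; the provider and resource are assumptions
provider "aws" {
  region = "us-east-1"
}

resource "aws_s3_bucket" "example" {
  bucket = "my-example-bucket-12345"   # bucket names must be globally unique
}
Running the workflow from the steps above against this file:
terraform init      # download providers and initialize the working directory
terraform plan      # show the changes needed to reach the desired state
terraform apply     # provision the resources
terraform destroy   # tear the resources down when no longer needed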
Kubernetes
Kubernetes is an open-source container orchestration platform
that automates the deployment, management, and scaling of
containerized applications.
It essentially acts as an orchestrator, managing and coordinating
containers across a cluster of servers, making it easier to run
distributed applications at scale.
Key aspects of Kubernetes are container orchestration, scalability, automated management, portability, open-source development, and Docker integration.
In Kubernetes, pods are the fundamental units of deployment, representing a single
instance of an application. Deployments manage pods, ensuring a specified number
of replicas are running and allowing for updates and scaling. Services provide a
stable network address to a set of pods, enabling communication and load
balancing.
1. Pods:
Basic Unit:
A pod is the smallest deployable unit in Kubernetes, encapsulating one or more
containers.
Resource Sharing:
Pods share resources like storage (volumes), network IP, and a shared context for
containers within them.
Ephemeral:
Pods are typically ephemeral, meaning they can be created and destroyed, and
their IP addresses can change.
Not Directly Managed:
Pods are usually not directly managed in production, but rather through higher-level
abstractions like Deployments.
2. Deployments:
Management: Deployments manage pods, ensuring that the desired number of replicas
are always running.
Scalability and Updates: They provide features for scaling applications up or down, and
for rolling out new versions of an application with minimal downtime.
ReplicaSets: Deployments use ReplicaSets internally to manage pods and ensure the
desired number of replicas are running.
Desired State: Deployments define the desired state of the application, and Kubernetes
works to maintain that state.
3. Services:
Network Abstraction: Services provide a stable IP address and DNS name for a set of
pods, enabling communication between pods and external clients.
Load Balancing: Services can act as a load balancer, distributing traffic across multiple
pods.
Stable Communication: Services address the ephemeral nature of pods by providing a
persistent endpoint for communication.
Types: Kubernetes offers various service types, including ClusterIP, LoadBalancer,
NodePort, and ExternalName, each with different exposure strategies.
In essence:
Pods are the running instances of your application.
Deployments ensure that you have the right number of pods
running and handle updates and scaling.
Services provide a way for external clients and other pods to
communicate with your application pods, regardless of their
individual IP addresses.
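As a hedged sketch tying these three objects together (the deployment name and image are illustrative):
kubectl create deployment web --image=nginx                # Deployment creates and manages pods
kubectl scale deployment web --replicas=3                  # desired state: three pod replicas
kubectl get pods                                           # list the pods the Deployment created
kubectl expose deployment web --port=80 --type=ClusterIP   # Service: a stable endpoint for the pods
kubectl get service web                                    # show the Service's cluster IP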