Technical Seminar Report

The document discusses the prediction of flight delays in airlines using machine learning algorithms. It introduces machine learning and algorithms such as logistic regression, decision trees, and random forests that can be used for prediction. It then discusses the need to analyze and predict flight delays to reduce the large costs they impose on the aviation industry.


VISVESVARAYA TECHNOLOGICAL UNIVERSITY

“Jnana Sangama”, Belagavi-590018, Karnataka

A Technical Seminar Report on

“PREDICTION DELAY IN AIRLINES”


Submitted in partial fulfillment of the requirement for the award of degree of

Bachelor of Engineering
In
Computer Science & Engineering
Submitted by

(1SP20CS028) LIPIKA

Under the Guidance of


Prof.JAYAKUMAR B L
Assistant Professor
Dept. Of CSE,
SEACET Bangalore-560049

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

S.E.A COLLEGE OF ENGINEERING & TECHNOLOGY


Ekta Nagar, K.R.Puram, Bengaluru-560049

2023-2024
VISVESVARAYA TECHNOLOGICAL UNIVERSITY
“Jnana Sangama”, Belagavi-590018, Karnataka

S.E.A COLLEGE OF ENGINEERING AND TECHNOLOGY


Ekta Nagar, K.R.Puram, Bengaluru-560049

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

Certificate

This is to certify that the Technical Seminar work entitled “Prediction delay in airlines” was carried out by
Ms. Lipika, bearing USN 1SP20CS028, a bonafide student of VIII semester B.E, in partial
fulfillment of the requirements for the Bachelor’s Degree in Computer Science & Engineering of
VISVESVARAYA TECHNOLOGICAL UNIVERSITY, during the year 2023-2024. It is certified that
all corrections/suggestions indicated for Internal Assessment have been incorporated in the report
deposited in the department library. The Technical Seminar report has been approved as it satisfies
the academic requirements in respect of the Technical Seminar work prescribed for the said degree.

Signature of the Guide Signature of the HOD Signature of the principal


Prof. Jayakumar B L Dr. Krishna Kumar P R Dr. B. Venkata Narayana
ACKNOWLEDGEMENT

Firstly, I thank the Management, late Shri. A KRISHNAPPA, Chairman of SEA College of
Engineering and Technology, for providing the necessary infrastructure and creating a good environment.

I would like to thank our respected DR BHAGAVANT K DESHPANDE, Director of SEA
College Of Engineering And Technology, for the encouragement and support given by him.

I express my deep sense of gratitude to S.E.A COLLEGE OF ENGINEERING &
TECHNOLOGY, BANGALORE, which provided me an opportunity to fulfil my most cherished
desire of reaching the goal.

I am thankful to our principal Dr. B.VENKATA NARAYANA, who is responsible for creating such
a pleasant environment and appreciating my talent in both academic and extracurricular activities.

I am thankful to Dr. KRISHNA KUMAR P R, HOD of Computer Science, who inspired me in my
work and stood by me in much of it.

I whole-heartedly express my gratitude to my guide Prof. JAYAKUMAR B L, who is responsible
for creating such a pleasant environment and appreciating my talent in both academic and
extracurricular activities.

Lastly, I thank my parents, friends, lecturers and staff who provided me the much-needed moral
support while pursuing this project.

LIPIKA
(1SP20CS028)
CONTENTS

CHAPTER TOPICS

1 Introduction
1.1 Machine Learning
1.2 Logistic Regression Algorithm
1.3 Decision Tree Algorithm
1.4 Random Forest Algorithm
2 Literature Survey
3 Problem Statement
3.1 Existing System
3.1.1 Disadvantages
4 Development Process
4.1 Requirement Analysis
4.2 Resource Requirements
4.3 System Design
4.4 System Architecture
4.5 Module Description
5 System Study
5.1 Feasibility Study
5.1.1 Economic Feasibility
5.1.2 Technology Feasibility
5.1.3 Social Feasibility
6 Testing
6.1 Types of Tests
6.1.1 Unit Testing
6.1.2 Integration Testing
6.1.3 Function Test
6.1.4 System Test

Result

Conclusion And Future Works

References
CHAPTER-1

INTRODUCTION
Flight delay has been studied extensively in recent years. The growing
demand for air travel has led to an increase in flight delays. According to the Federal
Aviation Administration (FAA), the aviation industry loses more than $3 billion a year
due to flight delays and, as per the Bureau of Transportation Statistics (BTS), there were
860,646 arrival delays in 2016. The reasons for the delay of commercial scheduled flights
include air traffic congestion, passenger numbers increasing every year, maintenance and
safety problems, adverse weather conditions, and the late arrival of the aircraft to be used
for the next flight. In the United States, the FAA considers a flight delayed when its
scheduled and actual arrival times differ by more than 15 minutes. Since delays have
become a serious problem in the United States, the analysis and prediction of flight
delays are being studied to reduce these large costs.
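
The 15-minute rule above is what turns raw arrival times into a label that a classifier can learn. The following is a minimal sketch in Python, assuming that only late arrivals count as delays; the function names and the sample timestamps are hypothetical and are used only for illustration.

```python
from datetime import datetime

# Threshold quoted in the text: more than 15 minutes late counts as delayed.
DELAY_THRESHOLD_MINUTES = 15


def arrival_delay_minutes(scheduled, actual):
    """Return how many minutes after the scheduled time the flight arrived.

    The value is negative when the flight arrived early.
    """
    return (actual - scheduled).total_seconds() / 60.0


def is_delayed(scheduled, actual, threshold=DELAY_THRESHOLD_MINUTES):
    """Label a flight as delayed when it arrives more than `threshold` minutes late."""
    return arrival_delay_minutes(scheduled, actual) > threshold


# Hypothetical example flight: scheduled 10:00, arrived 10:20.
scheduled = datetime(2016, 1, 1, 10, 0)
actual = datetime(2016, 1, 1, 10, 20)
print(is_delayed(scheduled, actual))
```

A label computed this way can serve as the categorical target (delayed or not delayed) for the classification algorithms introduced in the rest of this chapter.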

1.1 Machine Learning


Machine learning is a growing technology which enables computers to learn automatically
from past data. Machine learning uses various algorithms for building mathematical models
and making predictions using historical data or information. Currently, it is being used for
various tasks such as image recognition, speech recognition, email filtering, Facebook
auto-tagging, recommender systems, and many more.
Machine learning is a subset of artificial intelligence that is mainly concerned with
the development of algorithms which allow a computer to learn from data and past
experiences on its own. The term machine learning was first introduced by Arthur
Samuel in 1959. We can define it in a summarized way as: “Machine learning enables a
machine to automatically learn from data, improve performance from experiences, and
predict things without being explicitly programmed”.
A machine learning system learns from historical data, builds prediction models, and
whenever it receives new data, predicts the output for it. The accuracy of the predicted
output depends in part on the amount of data, since a larger amount of data generally helps
to build a better model which predicts the output more accurately.
Suppose we have a complex problem in which we need to make some predictions. Instead
of writing code for it by hand, we just need to feed the data to generic algorithms, and with
the help of these algorithms, the machine builds the logic from the data and predicts the
output. Machine learning has changed our way of thinking about such problems. The
block diagram below explains the working of a machine learning algorithm:

Fig 1.1: working of Machine Learning algorithm

1.1.1. Features of Machine Learning:

• Machine learning uses data to detect various patterns in a given dataset.
• It can learn from past data and improve automatically.
• It is a data-driven technology.
• Machine learning is quite similar to data mining, as it also deals with huge
amounts of data.

1.1.2. Classification of Machine Learning


At a broad level, machine learning can be classified into three types:

1. Supervised learning
2. Unsupervised learning
3. Reinforcement learning

1. Supervised Learning
Supervised learning is a type of machine learning in which we provide sample
labeled data to the machine learning system in order to train it, and on that basis it predicts
the output.
The system creates a model using the labeled data to understand the dataset and learn about
each data point. Once training and processing are done, we test the model by providing
sample data to check whether it predicts the correct output.
The goal of supervised learning is to map input data to output data. Supervised
learning is based on supervision, in the same way that a student learns under the
supervision of a teacher. An example of supervised learning is spam filtering.
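
The train-then-test cycle described above can be illustrated with a very small example. This is only a sketch under stated assumptions: it uses a 1-nearest-neighbour rule as a stand-in for any supervised algorithm, and the data (departure delay in minutes paired with a label, 1 for a late arrival and 0 otherwise) is invented for illustration.

```python
def train(examples):
    """'Train' a 1-nearest-neighbour model, which simply stores the labeled examples."""
    return list(examples)


def predict(model, x):
    """Predict the label of x as the label of the closest stored example."""
    nearest = min(model, key=lambda example: abs(example[0] - x))
    return nearest[1]


def accuracy(model, test_set):
    """Fraction of test examples whose predicted label matches the true label."""
    correct = sum(1 for x, y in test_set if predict(model, x) == y)
    return correct / len(test_set)


# Hypothetical labeled data: (departure delay in minutes, arrived late?).
training_set = [(0, 0), (5, 0), (10, 0), (30, 1), (45, 1), (60, 1)]
test_set = [(2, 0), (8, 0), (40, 1), (55, 1)]

model = train(training_set)
print("test accuracy:", accuracy(model, test_set))
```

The test set is kept separate from the training set so that the accuracy figure reflects how the model behaves on data it has not seen.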


Supervised learning can be further grouped into two categories of algorithms:

• Classification
• Regression

2. Unsupervised Learning
Unsupervised learning is a learning method in which a machine learns without any
supervision. The machine is trained on a set of data that has not been labeled,
classified, or categorized, and the algorithm needs to act on that data without any
supervision. The goal of unsupervised learning is to restructure the input data into new
features or groups of objects with similar patterns.
In unsupervised learning, we don't have a predetermined result. The machine tries to find
useful insights from a huge amount of data.
It can be further classified into two categories of algorithms:

a. Clustering
b. Association
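
As an illustration of clustering, the sketch below groups unlabeled values with a one-dimensional k-means procedure. The choice of k-means, the starting centroids, and the sample delay values are all assumptions made for this example; the text above does not prescribe a particular clustering algorithm.

```python
def k_means_1d(values, centroids, iterations=10):
    """Group 1-D values around the given starting centroids (simple k-means)."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            idx = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[idx].append(v)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical unlabeled arrival delays in minutes.
delays = [2, 4, 5, 48, 52, 55]
centroids, clusters = k_means_1d(delays, centroids=[0.0, 10.0])
print(clusters)
```

No labels are supplied here; the two groups (short and long delays) emerge only from the similarity of the values.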

1.2. LOGISTIC REGRESSION ALGORITHM


Logistic regression is one of the most popular machine learning algorithms and comes
under the supervised learning technique. It is used for predicting a categorical
dependent variable from a given set of independent variables. The outcome must therefore
be a categorical or discrete value. It can be either Yes or No, 0 or 1, True or False, etc.,
but instead of giving the exact value as 0 or 1, it gives probabilistic values which lie
between 0 and 1.

Logistic Regression is very similar to Linear Regression except in how they are
used. Linear Regression is used for solving regression problems, whereas Logistic
Regression is used for solving classification problems. In Logistic Regression, instead of
fitting a regression line, we fit an "S" shaped logistic function, whose output is bounded
by the two limiting values 0 and 1. The curve from the logistic function indicates the
likelihood of something, such as whether cells are cancerous or not, or whether a mouse
is obese or not based on its weight.
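
To make the idea of fitting an "S" shaped curve concrete, the sketch below fits a one-feature logistic regression with plain batch gradient descent. It is an illustrative sketch only: the learning rate, the number of epochs, and the data (delay in hours against a 0/1 label) are assumptions, and a real project would more likely rely on a library implementation.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large negative z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(xs, ys, learning_rate=0.5, epochs=2000):
    """Fit weight w and bias b by batch gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y  # predicted probability minus label
            grad_w += error * x
            grad_b += error
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Estimated probability that the label is 1 for input x."""
    return sigmoid(w * x + b)


# Hypothetical data: departure delay in hours and whether the flight arrived late.
xs = [0.0, 0.1, 0.2, 0.8, 0.9, 1.0]
ys = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(xs, ys)
print(predict_proba(w, b, 0.1), predict_proba(w, b, 0.9))
```

The fitted model returns a probability between 0 and 1 for any input instead of a hard 0 or 1, which is the behaviour described above.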

Logistic Regression is a significant machine learning algorithm because it can
provide probabilities and classify new data using continuous and discrete datasets. Logistic
Regression can be used to classify observations using different types of data and can
help identify the most effective variables for the classification. The image below
shows the logistic function:

Fig 1.2: logistic function

1.2.1. Logistic Function (Sigmoid Function):

• The sigmoid function is a mathematical function used to map the predicted
values to probabilities.
• It maps any real value into another value within the range of 0 to 1.
• The output of logistic regression must lie between 0 and 1 and cannot go
beyond this limit, so it forms an "S" shaped curve. The S-form curve is
called the Sigmoid function or the logistic function.
• In logistic regression, we use the concept of a threshold value, which decides
whether the predicted class is 0 or 1. Values above the threshold are assigned
to class 1, and values below the threshold are assigned to class 0.
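
The sigmoid function is commonly written as sigmoid(z) = 1 / (1 + e^(-z)). The short sketch below evaluates it and applies a threshold; the threshold of 0.5 is an assumed, commonly used default, and the function name `classify` is hypothetical.

```python
import math


def sigmoid(z):
    """Map any real number z to a value between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # avoids overflow when z is a large negative number
    return ez / (1.0 + ez)


def classify(z, threshold=0.5):
    """Return class 1 when the sigmoid output reaches the threshold, else class 0."""
    return 1 if sigmoid(z) >= threshold else 0


for z in (-4, -1, 0, 1, 4):
    print(z, round(sigmoid(z), 3), classify(z))
```

Large negative inputs give outputs close to 0 and large positive inputs give outputs close to 1, which produces the "S" shape.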


1.3. DECISION TREE ALGORITHM

Decision Tree is a supervised learning technique that can be used for both classification and
regression problems, but it is mostly preferred for solving classification problems. It is a
tree-structured classifier, where internal nodes represent the features of a dataset, branches
represent the decision rules and each leaf node represents the outcome.
In a decision tree, there are two types of nodes: the Decision Node and the Leaf Node. Decision
nodes are used to make a decision and have multiple branches, whereas Leaf nodes are the
output of those decisions and do not contain any further branches.
In order to build a tree, we use the CART algorithm, which stands for Classification and
Regression Tree algorithm.
A decision tree simply asks a question and, based on the answer (Yes/No), further splits the
tree into subtrees.

The diagram below explains the general structure of a decision tree:

Fig 1.3: general structure of a decision tree

There are various algorithms in machine learning, so choosing the best algorithm for the
given dataset and problem is the main point to remember while creating a machine learning
model. Below are two reasons for using the Decision tree:

a. Decision Trees usually mimic human thinking while making a decision,
so they are easy to understand.
b. The logic behind the decision tree can be easily understood because it shows a
tree-like structure.

In a decision tree, to predict the class of a given record, the algorithm starts from the
root node of the tree. It compares the value of the root attribute with the corresponding
attribute of the record (real dataset) and, based on the comparison, follows the branch and
jumps to the next node. At the next node, the algorithm again compares the attribute value
with the sub-nodes and moves further. It continues this process until it reaches a leaf node
of the tree. The complete process can be better understood using the algorithm below:

• Step-1: Begin the tree with the root node, say S, which contains the complete dataset.
• Step-2: Find the best attribute in the dataset using an Attribute Selection Measure (ASM).
• Step-3: Divide S into subsets that contain the possible values of the best attribute.
• Step-4: Generate the decision tree node, which contains the best attribute.
• Step-5: Recursively make new decision trees using the subsets of the dataset created in
Step-3. Continue this process until a stage is reached where the nodes cannot be classified
further; such a final node is called a leaf node.
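
The steps above can be sketched in a few lines of Python. This is a simplified illustration and not a full CART implementation: it assumes numeric features, binary "less than or equal" splits, and Gini impurity as the attribute selection measure (the text does not say which measure is used). The data and all names are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Step-2: return the (feature, threshold) that lowers weighted Gini the most, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= t]
            right = [y for row, y in zip(rows, labels) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(rows, labels):
    """Steps 1, 3, 4 and 5: split recursively until no split improves purity."""
    split = best_split(rows, labels)
    if split is None:  # leaf node: predict the most common label
        return {"leaf": Counter(labels).most_common(1)[0][0]}
    f, t = split
    left = [i for i, row in enumerate(rows) if row[f] <= t]
    right = [i for i, row in enumerate(rows) if row[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([rows[i] for i in left], [labels[i] for i in left]),
        "right": build_tree([rows[i] for i in right], [labels[i] for i in right]),
    }


def predict(tree, row):
    """Walk from the root to a leaf, comparing one attribute at each decision node."""
    while "leaf" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["leaf"]


# Hypothetical records: (departure delay in minutes, bad weather flag) -> arrived late?
rows = [(0, 0), (5, 0), (10, 1), (25, 0), (40, 1), (60, 0)]
labels = [0, 0, 0, 1, 1, 1]
tree = build_tree(rows, labels)
print(tree)
print(predict(tree, (3, 1)), predict(tree, (50, 0)))
```

The `predict` function mirrors the traversal described earlier: it starts at the root, follows one branch per comparison, and stops at a leaf.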

1.4 RANDOM FOREST ALGORITHM

Random Forest is a popular machine learning algorithm that belongs to the supervised
learning technique. It can be used for both classification and regression problems in ML.
It is based on the concept of ensemble learning, which is a process of combining multiple
classifiers to solve a complex problem and to improve the performance of the model.
As the name suggests, "Random Forest is a classifier that contains a number of decision
trees on various subsets of the given dataset and takes the average to improve the
predictive accuracy of that dataset." Instead of relying on one decision tree, the random
forest takes the prediction from each tree and predicts the final output based on the
majority vote of those predictions.

A greater number of trees in the forest generally leads to higher accuracy and reduces the
risk of overfitting.

Below are some points that explain why we should use the Random Forest algorithm:

• It often takes less training time than many other algorithms.
• It predicts output with high accuracy, and it runs efficiently even on large datasets.
• It can also maintain accuracy when a large proportion of data is missing.
Random Forest works in two phases: the first is to create the random forest by
combining N decision trees, and the second is to make predictions with each tree created
in the first phase.

The working process can be explained in the steps and diagram below:

• Step-1: Select random K data points from the training set.
• Step-2: Build the decision trees associated with the selected data points (Subsets).
• Step-3: Choose the number N for decision trees that you want to build.
• Step-4: Repeat Step 1 & 2.
• Step-5: For new data points, find the predictions of each decision tree, and
assign the new data points to the category that wins the majority votes.
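
The working process above can be sketched as follows. To keep the example short, each "tree" is reduced to a one-split decision stump on a single feature, and the K data points are drawn with replacement; both are simplifying assumptions, as are the seed, the data, and the function names. A real random forest grows full trees and also samples features.

```python
import random
from collections import Counter


def majority(labels):
    """Return the most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def train_stump(points):
    """Step-2 (simplified): fit a one-split tree on (x, label) pairs and return a predictor."""
    xs = sorted({x for x, _ in points})
    default = majority([y for _, y in points])
    best = None  # (errors, threshold, left_label, right_label)
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2  # candidate threshold halfway between neighbouring values
        left = [y for x, y in points if x <= t]
        right = [y for x, y in points if x > t]
        left_label, right_label = majority(left), majority(right)
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, t, left_label, right_label)
    if best is None:  # the sample held a single distinct x value
        return lambda x: default
    _, t, left_label, right_label = best
    return lambda x: left_label if x <= t else right_label


def train_forest(data, n_trees, k, seed=0):
    """Steps 1 to 4: build n_trees stumps, each on k randomly selected data points."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(data) for _ in range(k)]  # Step-1: random K points
        forest.append(train_stump(sample))
    return forest


def forest_predict(forest, x):
    """Step-5: collect one vote per tree and return the majority category."""
    return majority([tree(x) for tree in forest])


# Hypothetical training set: (departure delay in minutes, arrived late?).
data = [(0, 0), (5, 0), (10, 0), (40, 1), (50, 1), (60, 1)]
forest = train_forest(data, n_trees=25, k=6)
print(forest_predict(forest, 5), forest_predict(forest, 55))
```

An odd number of trees is used here so that a two-class vote cannot end in a tie.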
