
Project Report on

OBJECT DETECTION USING CNN

Submitted in partial fulfillment of the requirements


of the degree of
Bachelor in Engineering

By

Zain Shaikh BE-D 53


Saad Shaikh BE-D 16
Prajesh Waghela BE-D 58

Under the guidance of


Prof. Shilpa Kalantri

DEPARTMENT OF COMPUTER ENGINEERING


SHAH AND ANCHOR KUTCHHI ENGINEERING COLLEGE
CHEMBUR, MUMBAI – 400088.
2019 – 2020
Certificate
This is to certify that the report of the project entitled

Object Detection Using CNN


is a bonafide work of

Zain Shaikh BE-D 53


Saad Shaikh BE-D 16
Prajesh Waghela BE-D 58

submitted to the
UNIVERSITY OF MUMBAI
during semester VIII in partial fulfilment of the requirement for the
award of the degree of

BACHELOR OF ENGINEERING
in
COMPUTER ENGINEERING.

(Prof. Shilpa Kalantri)

Guide

(Prof. Uday Bhave) (Dr. Bhavesh Patel)


Computer Head of Department Principal
Mahavir Education Trust's
SHAH & ANCHOR KUTCHHI ENGINEERING COLLEGE
Chembur, Mumbai - 400 088
UG Program in Computer Engineering

Attendance Certificate

Date
To,
The Principal
Shah and Anchor Kutchhi Engineering College,
Chembur, Mumbai-88

Subject: Confirmation of Attendance

Respected Sir,

This is to certify that the final year (BE) students

Zain Shaikh BE-D 53
Saad Shaikh BE-D 16
Prajesh Waghela BE-D 58

have duly attended the sessions on the days allotted to them during the period from 2019 to
2020 for performing the project titled Object Detection Using CNN.

They were punctual and regular in their attendance. Following is the detailed record of
the students' attendance.

Attendance Record:

Date Prajesh Waghela Zain Shaikh Saad Shaikh


15/01/2020 Present Present Present
22/01/2020 Present Present Present
29/01/2020 Present Present Present
05/02/2020 Present Present Present
04/03/2020 Present Present Present
18/03/2020 Present Present Present

Signature and Name of Internal Guide


Approval for Project Report for B. E. Semester VIII

This project report entitled OBJECT DETECTION USING CNN by Zain Shaikh,
Saad Shaikh, Prajesh Waghela is approved for semester VIII in partial fulfilment
of the requirements for the award of the degree of Bachelor of Engineering.

Examiners

1.

2.

Guide

1.

2.

Date:

Place:
Declaration

We declare that this written submission represents our ideas in our own words and where others'
ideas or words have been included, we have adequately cited and referenced the original sources.
We also declare that we have adhered to all principles of academic honesty and integrity and have
not misrepresented or fabricated or falsified any idea/data/fact/source in our submission. We
understand that any violation of the above will be cause for disciplinary action by the Institute and
can also evoke penal action from the sources which have thus not been properly cited or from
whom proper permission has not been taken when needed.

Zain Shaikh BE-D 53


Saad Shaikh BE-D 16
Prajesh Waghela BE-D 58

Date:

Place:
Abstract

Object detection based on deep learning is an important application of deep learning technology, characterized by a strong capability for feature learning and feature representation compared with traditional object detection methods. This work first introduces the classical methods in object detection and expounds the relation and difference between the classical methods and the deep learning methods. It then introduces the emergence of object detection methods based on deep learning and elaborates on the most typical methods in use today. In describing the methods, this work focuses on the framework design and the working principle of the models and analyzes model performance in terms of real-time capability and detection accuracy. Finally, it discusses the challenges in object detection based on deep learning and offers some solutions for reference.

Keywords: deep learning, object detection.


Acknowledgement

We would like to express our deep gratitude to the management of Shah and Anchor Kutchhi Engineering College, our Principal Dr. Bhavesh Patel, Vice Principal Dr. Vinit Kotak, Head of Department Prof. Uday Bhave, and our guide Prof. Shilpa Kalantri for providing us with valuable guidance, advice and suggestions. We were fortunate to have met such supervisors. We would like to thank our guide for giving us the opportunity to do this project, Object Detection using CNN, which led us to do a lot of research and try new things. We acknowledge with a deep sense of gratitude the encouragement and support received from faculty members and colleagues.
Table of Contents

1. Introduction

2. Literature Survey

2.1 Convolutional Neural Network

2.2 Problem Definition and Objectives

3. Proposed System

3.1 Block Diagram & Flowchart

3.2 Software and Hardware Requirement

4. Implementation Details

4.1 Modules and Design

4.2 Snapshots

5. Testing

6. Result and Analysis

7. Conclusion

8. Future Scope

9. References

List of Figures

2.1.1 Convolution neural network

3.1.1 Block diagram

3.1.2 Flowchart

List of Tables

3.2.1 Hardware specifications


Chapter 1

Introduction

Object detection means detecting instances of objects of particular classes in an image. The goal of object detection is to detect all instances of objects from known classes, such as faces, people, things or cars, in an image. Well-researched domains of object detection include pedestrian detection and face detection. An object detection system constructs a model for an object class from a set of training examples. Object detection methods fall into two categories: generative and discriminative.

Object detection is widely used in computer vision tasks such as face recognition, face detection and video object analysis. Object detection is used for tracking objects, for example tracking a ball during a cricket match, tracking the movement of a bat, or tracking a person in videos or photos. Every object class has its own special features that help in classifying the class, and object class detection uses these special features. To get a complete understanding of an image, we should not only concentrate on classifying different images, but also try to precisely estimate the concepts and locations of the objects contained in each image. This task is referred to as object detection, which usually consists of different subtasks such as face detection.

Chapter 2

Literature Survey

2.1-CONVOLUTIONAL NEURAL NETWORK

Object detection is a problem in computer vision in which the task is to recognize objects: specifically, which objects are inside a given image and where they are in the image. Object detection is more complex than classification, which can also recognize objects but does not indicate where the object is located in the image. In addition, classification does not work on images containing more than one object.

YOLOv3 is a very popular object detection algorithm because it achieves high accuracy while being able to run in real time. YOLOv3 trains on full images and directly optimizes detection performance. The YOLOv3 model has a number of advantages over other object detection algorithms, chief among them that it is considerably faster ([4] Azizpour).

Fig 2.1.1

Existing system:
The existing system performs object detection using the YOLOv3 (You Only Look Once) algorithm, which is initially applied to an aligned frame pair. The algorithm "only looks once" at the image in the sense that it requires only one forward propagation pass through the neural network to make predictions. After non-max suppression (which makes sure the object detection algorithm detects each object only once), it outputs the recognized objects together with their bounding boxes.
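
The non-max suppression step mentioned above can be illustrated with a minimal sketch. It assumes boxes are given as (x1, y1, x2, y2) corners with one confidence score each. The function names `iou` and `non_max_suppression` and the overlap threshold of 0.5 are illustrative choices, not taken from the YOLOv3 implementation used in the project.

```python
import numpy as np

def iou(box, boxes):
    """Intersection over union of one box against an array of boxes (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    # Clip at zero so that non-overlapping boxes give an empty intersection.
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes to keep, highest score first."""
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores)[::-1]      # candidates sorted by descending score
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        if rest.size == 0:
            break
        # Drop every remaining box that overlaps the kept box too strongly.
        overlaps = iou(boxes[best], boxes[rest])
        order = rest[overlaps < iou_threshold]
    return keep

# Two heavily overlapping boxes around one object, plus one separate box.
boxes = [[10, 10, 50, 50], [12, 12, 52, 52], [100, 100, 140, 140]]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores)
print(kept)
```

In this example the second box overlaps the first with an IoU of about 0.82, so only the higher-scoring of the two is kept and each object is reported once.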

ResNet

To train the network model more effectively, we adopt the same strategy as that used for DSSD (the residual network performs better than the VGG network). The goal is to improve accuracy. The first modification is to replace the VGG network used in the original SSD with ResNet. We also add a series of convolutional feature layers at the end of the underlying network ([12] Correa). These feature layers gradually decrease in size, which allows detection results to be predicted at multiple scales. For input sizes of 300 and 320, although ResNet-101 is deeper than VGG-16, it is known experimentally that replacing the SSD's underlying convolutional network with a residual network does not, by itself, improve accuracy but rather decreases it.

R-CNN

To circumvent the problem of selecting a huge number of regions, Ross Girshick et al. proposed a method that uses selective search to extract just 2000 regions from the image, which they called region proposals ([11] Cadena). Therefore, instead of trying to classify a huge number of regions, you can work with just 2000 regions. These 2000 region proposals are generated using the selective search algorithm described below.

Selective Search:

1. Generate an initial sub-segmentation to produce many candidate regions

2. Use a greedy algorithm to recursively combine similar regions into larger ones

3. Use the generated regions to produce the final candidate region proposals
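
The three steps above can be illustrated with a toy sketch. It is not the real selective search, which starts from a graph-based segmentation and compares colour, texture, size and fill. Here each region is a hypothetical tuple of (bounding box, mean intensity, pixel count), the initial regions are given by hand, and similarity is simply the difference in mean intensity. All names are illustrative.

```python
from itertools import combinations

def boxes_touch(a, b):
    """True if two (x1, y1, x2, y2) boxes overlap or share an edge."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def merge(r, s):
    """Merge two regions: union bounding box, size-weighted mean intensity."""
    (ba, ma, na), (bb, mb, nb) = r, s
    box = (min(ba[0], bb[0]), min(ba[1], bb[1]),
           max(ba[2], bb[2]), max(ba[3], bb[3]))
    n = na + nb
    return (box, (ma * na + mb * nb) / n, n)

def greedy_region_proposals(regions):
    """Step 2 and 3: repeatedly merge the most similar touching pair,
    collecting every box seen along the way as a proposal."""
    regions = list(regions)
    proposals = [r[0] for r in regions]          # step 1 output is the input here
    while True:
        pairs = [(abs(r[1] - s[1]), i, j)
                 for (i, r), (j, s) in combinations(enumerate(regions), 2)
                 if boxes_touch(r[0], s[0])]
        if not pairs:
            break
        _, i, j = min(pairs)                     # smallest intensity difference
        merged = merge(regions[i], regions[j])
        regions = [r for k, r in enumerate(regions) if k not in (i, j)] + [merged]
        proposals.append(merged[0])
    return proposals

# Hand-made initial sub-segmentation: two similar dark regions and one bright one.
regions = [((0, 0, 10, 10), 0.10, 100),
           ((10, 0, 20, 10), 0.15, 100),
           ((0, 10, 20, 20), 0.90, 200)]
proposals = greedy_region_proposals(regions)
print(proposals)
```

The two dark regions are merged first because they are most similar, and the resulting box is then merged with the bright region, so the proposals cover several scales.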

Fast R-CNN

The approach is similar to the R-CNN algorithm. But instead of feeding the region proposals to the CNN, we feed the input image to the CNN to generate a convolutional feature map. From the convolutional feature map, we identify the regions of the proposals and warp them into squares, and by using an RoI pooling layer we reshape them into a fixed size so that they can be fed into a fully connected layer ([11] Cadena). From the RoI feature vector, we use a softmax layer to predict the class of the proposed region and also the offset values for the bounding box.
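
The RoI pooling idea can be sketched as follows: a region of any size on the feature map is divided into a fixed grid of cells and the maximum of each cell is taken, so the output always has the same shape. This is a simplified single-channel sketch in NumPy. The function name `roi_max_pool`, the 2 x 2 output size and the assumption that the region is at least as large as the output grid are ours.

```python
import numpy as np

def roi_max_pool(feature_map, roi, output_size=(2, 2)):
    """Max-pool the region roi=(x1, y1, x2, y2) (end-exclusive) of a 2-D
    feature map down to a fixed output size.
    Assumes the region has at least output_size rows and columns."""
    x1, y1, x2, y2 = roi
    region = feature_map[y1:y2, x1:x2]
    out_h, out_w = output_size
    # Bin edges split the region into out_h x out_w roughly equal cells.
    row_edges = np.linspace(0, region.shape[0], out_h + 1).astype(int)
    col_edges = np.linspace(0, region.shape[1], out_w + 1).astype(int)
    pooled = np.empty(output_size, dtype=feature_map.dtype)
    for i in range(out_h):
        for j in range(out_w):
            cell = region[row_edges[i]:row_edges[i + 1],
                          col_edges[j]:col_edges[j + 1]]
            pooled[i, j] = cell.max()
    return pooled

feature_map = np.arange(36).reshape(6, 6)      # stand-in for one CNN feature channel
pooled = roi_max_pool(feature_map, roi=(1, 0, 5, 6))
print(pooled)
```

Regions of different sizes, for example a 6 x 4 and a 5 x 3 region, both come out as 2 x 2 arrays, which is what allows them to be fed into a fully connected layer.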

Problems with R-CNN

1- It still takes a huge amount of time to train the network, as 2000 region proposals have to be classified per image.

2- It cannot be run in real time, as it takes around 47 seconds for each test image.

Limitation of existing system:

It struggles with small objects that appear in groups, such as flocks of birds.

It struggles to generalize to objects in new or unusual aspect ratios or configurations.

The architecture in the previous system is not able to achieve states.

YOLO — You Only Look Once
All the previous object detection algorithms use regions to localize the object within the image. The network does not look at the complete image; instead, it looks at parts of the image that have a high probability of containing the object ([4] Azizpour). YOLO, or You Only Look Once, is an object detection algorithm that is very different from the region-based algorithms seen above. In YOLO, a single convolutional network predicts the bounding boxes and the class probabilities for these boxes.

To deal with these limitations, we apply an optimized YOLOv3 algorithm to detect objects in a live feed or an image. The working of this optimized YOLOv3 is simple, as YOLOv3 is based on regression. Unlike region-based CNNs, which select interesting parts of an image, YOLOv3 predicts the classes and bounding boxes for the whole image in one run of the algorithm. To apply this algorithm, we need to know what we are going to predict, i.e. the objects we are likely to be interested in, so that we can train the algorithm to look for the classes of the objects and the bounding boxes specifying the object locations.

YOLO works by taking an image and splitting it into an S x S grid; within each grid cell we take m bounding boxes. For each bounding box, the network outputs a class probability and offset values for the bounding box ([12] Correa). The bounding boxes that have a class probability above a threshold value are selected and used to locate the object within the image.
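
The grid-based selection described above can be sketched in a few lines. The tensor layout is an assumption made for illustration: the prediction has shape (S, S, m, 4 + C), where the first four values are (x, y, w, h), with x and y as offsets inside the cell and w and h as fractions of the image, followed by C class probabilities. Real YOLO outputs also carry a per-box confidence, which is omitted here.

```python
import numpy as np

def decode_grid(pred, threshold=0.5):
    """Keep boxes whose best class probability exceeds the threshold.
    Returns tuples (cx, cy, w, h, class_index, score) in image-relative units."""
    S = pred.shape[0]
    detections = []
    for row in range(S):
        for col in range(S):
            for box in pred[row, col]:           # the m boxes of this cell
                x, y, w, h = box[:4]
                class_probs = box[4:]
                cls = int(np.argmax(class_probs))
                score = float(class_probs[cls])
                if score > threshold:
                    # Convert the in-cell offset to a position in the whole image.
                    cx = float((col + x) / S)
                    cy = float((row + y) / S)
                    detections.append((cx, cy, float(w), float(h), cls, score))
    return detections

S, m, C = 3, 2, 2                                # hypothetical sizes for the example
pred = np.zeros((S, S, m, 4 + C))
pred[1, 2, 0] = [0.5, 0.5, 0.2, 0.4, 0.1, 0.9]   # one confident box in row 1, column 2
detections = decode_grid(pred)
print(detections)
```

Only the single box with class probability 0.9 passes the threshold; all other cells predict nothing.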

2.2: Problem Definition and Objectives

Problem Definition

❖ We propose an end-to-end approach for object detection and pose estimation by formulating bounding boxes containing the targeted objects and their poses.

Objectives

❖ Produce bounding boxes on the training images.
❖ Generate the pose coordinates of each object in the scene.
❖ Detect and localize simultaneously each object present in the image during the testing steps.

Chapter 3

Proposed System

Block Diagram

Fig 3.1.1

Flowchart

Fig 3.1.2

3.2 Details of Hardware And Software

Software:-
PYTHON PROGRAMMING
Python is a general-purpose, interpreted, high-level programming language. Python supports multiple programming paradigms, including object-oriented programming. Python emphasizes code readability and uses English keywords where other languages use punctuation. It does not require curly brackets to delimit code blocks and does not need a semicolon after statements. It supports multiple platforms, so the same code can run on many different platforms without recompilation. Python includes a robust standard library from which we can easily choose modules.

SOFTWARE SPECIFICATION:
PYTHON : Python 3 Release – 3.7.2 for PC windows.
ANACONDA-3.7
Windows x86-64 executable installer.

Hardware Specification

Processor          Intel Core i5-9400F CPU
RAM                8.00 GB
Operating System   Windows 10
Windows Version    1803, x64-based system

Table No 3.2.1

Chapter 4

Implementation Details

Modules and Design:-


Set up Python and the required libraries on your computer system:-

1. Install ImageAI and its dependencies such as TensorFlow, NumPy, OpenCV, etc.

2. Download the object detection model file (RetinaNet).

Steps to be followed :-
1) Download and install Python version 3 from the official Python language website

https://python.org

2) Install the following dependencies via pip:

i. Tensorflow:
TensorFlow is an open-source software library for dataflow and differentiable programming across a range of tasks. It is a symbolic math library and is also used for machine learning applications such as neural networks. It is used for both research and production at Google. TensorFlow was developed by the Google Brain team for internal Google use and is Google Brain's second-generation system. It was released under the Apache License 2.0 on November 9, 2015, and version 1.0.0 was released on February 11, 2017. While the reference implementation runs on single devices, TensorFlow can run on multiple CPUs and GPUs (with optional CUDA and SYCL extensions for general-purpose computing on graphics processing units). TensorFlow is available on various platforms such as 64-bit Linux, macOS, Windows, and mobile computing platforms including Android and iOS.
pip install tensorflow

ii. Numpy:
NumPy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays. NumPy was created by incorporating features of the competing Numarray into Numeric, with extensive modifications. NumPy is open-source software and has many contributors.
pip install numpy

iii. SciPy:

SciPy contains modules for optimization, linear algebra, integration, interpolation, special functions, FFT, signal and image processing, ODE solvers and other tasks common in engineering. SciPy builds on the NumPy array object and is part of the NumPy stack, which includes tools like Matplotlib, pandas and SymPy, and an expanding set of scientific computing libraries. This NumPy stack has similar uses to other applications such as MATLAB, Octave, and Scilab. The NumPy stack is also sometimes referred to as the SciPy stack. The SciPy library is distributed under the BSD license, and its development is sponsored and supported by an open community of developers. It is also supported by NumFOCUS, a community foundation for supporting reproducible and accessible science.

pip install scipy

iv. OpenCV:

OpenCV is a library of programming functions mainly aimed at real-time computer vision. Originally developed by Intel, it was later supported by Willow Garage and then Itseez. The library is cross-platform and free to use under the open-source BSD license.

pip install opencv-python

v. Pillow:

Python Imaging Library is a free library for the Python programming language that provides support for opening, editing and saving several different image file formats. It is available for Windows, Mac OS X and Linux.

pip install pillow

vi. Matplotlib:

Matplotlib is a plotting library for the Python programming language and its numerical mathematics extension NumPy. It provides an object-oriented API for embedding plots into applications using general-purpose GUI toolkits such as Tkinter, wxPython, Qt, or GTK+.

pip install matplotlib

vii. H5py:

pip install h5py

viii. Keras:

Keras is an open-source neural-network library written in Python. It is capable of running on top of TensorFlow, Microsoft Cognitive Toolkit, Theano, or PlaidML. Designed to enable fast experimentation with deep neural networks, it focuses on being user-friendly, modular, and extensible.

pip install keras

ix. ImageAI:

ImageAI provides an API to recognize 1000 different objects in a picture using pre-trained models that were trained on the ImageNet-1000 dataset. The model implementations provided are SqueezeNet, ResNet, InceptionV3 and DenseNet.

Syntax:

pip3 install imageai --upgrade
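
After running the pip commands above, it can be useful to confirm that every dependency is importable before starting the detection code. The following is a small sketch using only the Python standard library. The helper name `check_dependencies` is ours, and the mapping reflects the fact that some packages are imported under a different name than they are installed with (opencv-python as `cv2`, pillow as `PIL`).

```python
import importlib.util

# pip package name -> module name used in "import ..."
REQUIRED = {
    "tensorflow": "tensorflow",
    "numpy": "numpy",
    "scipy": "scipy",
    "opencv-python": "cv2",
    "pillow": "PIL",
    "matplotlib": "matplotlib",
    "h5py": "h5py",
    "keras": "keras",
    "imageai": "imageai",
}

def check_dependencies(required=REQUIRED):
    """Return {pip_name: True/False} depending on whether the module can be found.
    The modules are only located, not imported, so the check is fast."""
    return {pip_name: importlib.util.find_spec(module) is not None
            for pip_name, module in required.items()}

status = check_dependencies()
for pip_name, installed in status.items():
    print(f"{pip_name:15s} {'installed' if installed else 'MISSING'}")
```

Any package reported as MISSING can then be installed with the corresponding pip command from the list above.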

Snapshots

Screenshot 4.2.1

Screenshot 4.2.2

Screenshot 4.2.3

Screenshot 4.2.4

Chapter 5

Testing

Test Case ID: 1
Objective: Displaying rectangle boxes around the object
Steps / Description: Output should have an appropriate rectangular box around the object
Input: Image or the data set
Expected Output: Object would be detected in rectangular form
Actual Output: The system will be displaying output with rectangular box
Result: Satisfactory
Remark: Test case passed

Test Case ID: 2
Objective: Capture images
Steps / Description: As soon as the webcam is opened it will capture objects
Input: Data about object
Expected Output: Image will be stored
Actual Output: The images will be captured and stored
Result: Satisfactory
Remark: Test case passed

Test Case ID: 3
Objective: Train the image datasets
Steps / Description: The images are trained using the algorithm and stored as datasets
Input: Create different pixel values and store them in various datasets
Expected Output: Different pixel values are created and stored in various datasets
Actual Output: Different pixel values are created and stored in various datasets
Result: Satisfactory
Remark: Test case passed
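
Test case 1 can also be expressed as an automated check. The sketch below is an illustration only: it draws a one-pixel rectangle outline on a blank NumPy image and verifies that exactly the outline pixels were changed. The helper `draw_box` is hypothetical and stands in for whatever drawing routine the detection code uses.

```python
import numpy as np

def draw_box(image, box, value=255):
    """Return a copy of the image with a 1-pixel rectangle outline.
    box = (x1, y1, x2, y2) with inclusive corners."""
    x1, y1, x2, y2 = box
    out = image.copy()
    out[y1, x1:x2 + 1] = value    # top edge
    out[y2, x1:x2 + 1] = value    # bottom edge
    out[y1:y2 + 1, x1] = value    # left edge
    out[y1:y2 + 1, x2] = value    # right edge
    return out

image = np.zeros((10, 10), dtype=np.uint8)
boxed = draw_box(image, (2, 3, 6, 8))

# The outline of a 5 x 6 pixel box has 2*5 + 2*6 - 4 = 18 pixels.
assert int((boxed == 255).sum()) == 18
# The interior of the box and the original image stay untouched.
assert not boxed[4:8, 3:6].any()
assert not image.any()
```

A check of this kind makes the "rectangular box around the object" expectation repeatable instead of relying on visual inspection alone.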

Chapter 6

Results
The test cases above show that the project works as expected when actual input is given. The output of the project is as expected: as soon as the user runs the program, the system displays the image with a rectangular box drawn around each detected object, together with the name of the object.

Analysis

Moving object detection is the basic step for further analysis of video. Every tracking method requires an object detection mechanism either in every frame or when the object first appears in the video. It handles segmentation of moving objects from stationary background objects. This narrows the focus of higher-level processing and also decreases computation time. Environmental conditions such as illumination changes and shadows make object segmentation a difficult and significant problem. A common approach to object detection is to use the information in a single frame. However, some object detection methods make use of temporal information computed from a sequence of frames to reduce the number of false detections. This temporal information is usually in the form of frame differencing, which highlights regions that change dynamically in consecutive frames. Given the object regions in the image, it is then the tracker's task to perform object correspondence from one frame to the next to generate the tracks. This section reviews three moving object detection methods: background subtraction with alpha parameter, temporal difference, and a statistical method, Eigen background subtraction.
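
Two of the methods named above, temporal difference and background subtraction, can be sketched with NumPy on grey-scale frames. We assume here that the alpha parameter is the learning rate of a running-average background model, B <- alpha * frame + (1 - alpha) * B. The threshold of 25 grey levels and the function names are illustrative choices.

```python
import numpy as np

def temporal_difference(prev_frame, frame, threshold=25):
    """Foreground mask: pixels whose absolute change between two
    consecutive frames exceeds the threshold."""
    # Cast to a signed type so that the subtraction cannot wrap around.
    diff = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > threshold

def update_background(background, frame, alpha=0.05):
    """Running-average background model (assumed meaning of the alpha parameter)."""
    return alpha * frame.astype(float) + (1.0 - alpha) * background

def background_subtraction(background, frame, threshold=25):
    """Foreground mask: pixels that differ strongly from the background model."""
    return np.abs(frame.astype(float) - background) > threshold

prev_frame = np.zeros((4, 4), dtype=np.uint8)
frame = prev_frame.copy()
frame[1:3, 1:3] = 200                        # a bright 2 x 2 "object" appears

mask = temporal_difference(prev_frame, frame)
background = update_background(prev_frame.astype(float), frame, alpha=0.1)
print(int(mask.sum()), background[1, 1])
```

A small alpha makes the background adapt slowly, so a briefly moving object stays in the foreground mask, while a large alpha absorbs changes into the background quickly.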

Chapter 7

Conclusion

The entire project has been developed and deployed as per the requirements of the final year project, which were planned well in advance, and it was found to be bug free as per the testing standards that were implemented. Any errors that remain untraced will be addressed in coming versions, which are planned to be developed in the near future.
Finally, we would like to conclude that we put all our efforts into the development of our project and tried to fulfil most of the planned requirements. The object detection system is a project that will help to identify various objects in an environment. It will largely help to keep track of every object and can be further upgraded for waste segregation or any other system.

Chapter 8

Future Scope

This project can be further enhanced by adding several features to support its current working method and make it more accurate at recognizing different patterns and identifying various objects. To make the system fully automatic and to overcome the limitations discussed below, multi-view tracking can be implemented using multiple cameras. Multi-view tracking has an obvious advantage over single-view tracking because of its wide coverage range with different viewing angles for the objects to be tracked. For night-time visual tracking, night vision mode should be available as an inbuilt feature in the CCTV camera.
As a scope for future enhancement,

• The features used for recognition, either local or global, can be increased to improve the efficiency of the object recognition system.
• Geometric properties of the image can be included in the feature vector for recognition.
• An unsupervised classifier can be used instead of a supervised classifier for recognition of the object.
• The proposed object recognition system uses grey-scale images and discards the colour information. The colour information in the image can be used for recognition of the object. Colour-based object recognition plays a vital role in robotics.

Although the visual tracking algorithm proposed here is robust in many conditions, it can be made more robust by eliminating some of the limitations listed below:

• In single visual tracking, the size of the template remains fixed for tracking. If the size of the object reduces with time, the background becomes more dominant than the object being tracked. In this case the object may not be tracked.
• A fully occluded object cannot be tracked and is considered a new object in the next frame.
• Foreground object extraction depends on binary segmentation, which is carried out by applying threshold techniques. So blob extraction and tracking depend on the threshold value.

Chapter 9

References

[1] V. Gajjar, A. Gurnani and Y. Khandhedia, "Human Detection and Tracking for Video Surveillance: A Cognitive Science Approach," in 2017 IEEE International Conference on Computer Vision Workshops, 2017.

[2] M. Adel, A. Moussaoui, M. Rasigni, S. Bourennane and L. Hamami, "Statistical-Based Tracking Technique for Linear Structures Detection: Application to Vessel Segmentation in Medical Images," IEEE Signal Processing Letters, vol. 17, no. 6, pp. 555-558, June 2010.

[3] Aloimonos, J., Weiss, I., and Bandyopadhyay, A. (1988). Active vision. Int. J. Comput. Vis. 1, 333–356. doi:10.1007/BF00133571

[4] Azizpour, H., and Laptev, I. (2012). "Object detection using strongly-supervised deformable part models," in Computer Vision-ECCV 2012 (Florence: Springer), 836–849.

[5] Azzopardi, G., and Petkov, N. (2013). Trainable cosfire filters for keypoint detection and pattern recognition. IEEE Trans. Pattern Anal. Mach. Intell. 35, 490–503. doi:10.1109/TPAMI.2012.106

[6] Azzopardi, G., and Petkov, N. (2014). Ventral-stream-like shape representation: from pixel intensity values to trainable object-selective cosfire models. Front. Comput. Neurosci. 8:80. doi:10.3389/fncom.2014.00080

[7] Benbouzid, D., Busa-Fekete, R., and Kegl, B. (2012). "Fast classification using sparse decision dags," in Proceedings of the 29th International Conference on Machine Learning (ICML-12), ICML '12, eds J. Langford and J. Pineau (New York, NY: Omnipress), 951–958.

[8] Bourdev, L. D., Maji, S., Brox, T., and Malik, J. (2010). "Detecting people using mutually consistent poselet activations," in Computer Vision – ECCV 2010 – 11th European Conference on Computer Vision, Heraklion, Crete, Greece, September 5-11, 2010, Proceedings, Part VI, Volume 6316 Lecture Notes in Computer Science, eds K. Daniilidis, P. Maragos, and N. Paragios

[9] Bourdev, L. D., and Malik, J. (2009). "Poselets: body part detectors trained using 3d human pose annotations," in IEEE 12th International Conference on Computer Vision, ICCV 2009, Kyoto, Japan, September 27 – October 4, 2009 (Kyoto: IEEE), 1365–1372.

[11] Cadena, C., Dick, A., and Reid, I. (2015). "A fast, modular scene understanding system using context-aware object detection," in Robotics and Automation (ICRA), 2015 IEEE International Conference on (Seattle, WA).

[12] Correa, M., Hermosilla, G., Verschae, R., and Ruiz-del-Solar, J. (2012). Human detection and identification by robots using thermal and visual information in domestic environments. J. Intell. Robot Syst. 66, 223–243. doi:10.1007/s10846-011-9612-2

URKUND

Urkund Analysis Result


Analysed Document: IEEE (YOLO) (2) (1).docx (D68174014)
Submitted: 4/15/2020 7:42:00 AM
Submitted By: [email protected]
Significance: 65 %

Sources included in the report:

IEEE (YOLO).docx (D65358356)

Instances where selected sources appear:

31
URKUND IEEE (YOLO) (2) (1).docx (D68174014)

Object Detection

Saad Shaikh Computers BED-38 Shah and Anchor Kutchhi Engineering College. Mumbai, India [email protected]

Zain Shaikh Computers BED-37 Shah and Anchor Kutchhi Engineering College. Mumbai, India [email protected]

Prajesh Waghela Computers BED-46 Shah and Anchor Kutchhi Engineering College. Mumbai, India [email protected]

Abstract:

Object detection is the process of detecting objects of particular classes in images or videos. Object detection comes under the concept of deep learning, or convolutional neural networks, which refers to artificial neural networks (ANNs) with multiple layers.

Many different algorithms can be used for detecting objects, and these algorithms can handle large data sets or collections of images at a time. One of the many object detection techniques is YOLO. You Only Look Once is a real-time object detection algorithm that predicts bounding boxes by scanning the whole image in one run of the algorithm. YOLO processes images quickly and efficiently.

Keywords:- Deep Learning, you only look once, object detection.

I. INTRODUCTION

Object detection is the concept of detecting instances of objects. The major goal of object detection is to detect all instances of objects from known classes, which may consist of faces, people, things or cars in an image. Of the many well-researched domains of object detection, face detection is the most common in the market. Object detection can be used in many day-to-day activities, for example tracking the movement of a ball during a cricket match, tracking the movement of a bat, checking the number of students present in a classroom or granting access to verified people on certain campuses. One major application of object detection is airport security, where police can clearly see what objects are inside an individual's bags.

Every object class has its own special features that help in classifying and identifying the class, and object class detection uses these special features. To get a complete understanding of an image, we should not only concentrate on classifying different images, but also try to precisely estimate the concepts and locations of the objects contained in each image. This task is referred to as object detection, which usually consists of different subtasks such as training the system.

The problem definition of object detection is to determine and recognize the image, or to compare the image against the training dataset for that particular image. With the help of object detection, one can easily classify and predict images. As soon as the object detection model runs, the relevant outputs are displayed within rectangular boxes.

0: IEEE (YOLO).docx 88%

II. YOU LOOK ONLY ONCE(YOLO)

Object detection is a concept in computer vision where you are work to recognize objects,
specifically what object are inside a given images and also where they are in the image. The

problems in

0: IEEE (YOLO).docx 99%

the object detection is more complex than classification, which also can recognize objects but
doesn’t indicate where the object is located in the images. In addition, classification doesn’t
work on image containing more than one object.

YOLO is very popular algorithm of object detection because it achieves high accuracy while
being able to run in real time. YOLO trains the system on full images and directly optimize
detection performance. YOLO model has a number of advantages over other object detection
algorithms:-YOLO is extremely faster than other algorithms. Object

Locations are

0: IEEE (YOLO).docx 100%

3
URKUND IEEE (YOLO) (2) (1).docx (D68174014)

made from one single network. You Only Look Once use relatively little pre-processing
compared to the other image classification algorithms.

A training set for YOLO consists of a series of images; each image comes with a text file indicating the coordinates and the category of each object in the image. The YOLO model processes images in real time at 45 frames per second. Fast YOLO processes 155 frames per second.
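The per-image text file mentioned above can be read with a few lines of code. The sketch below assumes the commonly used Darknet-style layout, in which each line holds `class_id x_center y_center width height` with coordinates normalized to the range 0 to 1; the report does not state the exact file layout, so this format, the `Box` type and the function name `parse_label_line` are illustrative assumptions.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Pixel-space bounding box with its class id (hypothetical helper type)."""
    class_id: int
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def parse_label_line(line: str, img_w: int, img_h: int) -> Box:
    """Convert one normalized annotation line to pixel corner coordinates.

    Assumed line format: "class_id x_center y_center width height".
    """
    class_id, cx, cy, w, h = line.split()
    # Scale the normalized center and size to pixels.
    cx, w = float(cx) * img_w, float(w) * img_w
    cy, h = float(cy) * img_h, float(h) * img_h
    # Convert center/size form to corner form.
    return Box(int(class_id), cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
```

For a 640 x 480 image, the line `0 0.5 0.5 0.25 0.5` describes a box of class 0 centered in the image, 160 pixels wide and 240 pixels high.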

III. WORKING OF YOLO ALGORITHM

First, an image is taken for detection and the YOLO algorithm is applied. In our example, the image is divided into a 3 x 3 grid. We can divide the image into any number of grid cells for detection, depending on the complexity of the image. Once the image is divided, each grid cell undergoes classification and localization of the object.
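To make the grid division concrete, the following sketch maps a point in the image (for example, the center of an object) to the grid cell that contains it. The function name `grid_cell` and its parameters are hypothetical; the default of 3 matches the 3 x 3 example in the text.

```python
from typing import Tuple


def grid_cell(cx: float, cy: float, img_w: int, img_h: int, s: int = 3) -> Tuple[int, int]:
    """Return (row, col) of the s x s grid cell containing pixel point (cx, cy)."""
    # Scale the coordinate to grid units and truncate to get the cell index.
    # min() keeps points on the right or bottom border inside the last cell.
    col = min(int(cx / img_w * s), s - 1)
    row = min(int(cy / img_h * s), s - 1)
    return row, col
```

With a 640 x 480 image, the point (320, 240) falls in the middle cell, `(1, 1)`, and a point near the top-right corner falls in cell `(0, 2)`.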

If no object is found in a grid cell, then the objectness and bounding box values of that cell will be zero. If an object is found in the grid cell, then the objectness will be 1 and the bounding box values will be the corresponding bounding values of the found object.

The bounding box prediction is explained as follows.

The YOLO algorithm is used for predicting accurate bounding boxes from the image. The image is divided into an S x S grid, and bounding boxes and class probabilities are predicted for each grid cell. Both image classification and object localization techniques are applied to each grid cell of the image, and each grid cell is assigned a label. The algorithm then checks each grid cell separately, marks the labels of the cells that contain an object, and also marks their bounding boxes. The labels of the grid cells without an object are marked as zero.

An image is taken and divided into a 3 x 3 grid. Each grid cell is labelled, and each grid cell undergoes both image classification and object localization techniques. The label is denoted Y. Y consists of 8 values.
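The labelling of grid cells can be sketched in code. The report states only that Y has 8 values; the layout used here, `[objectness, b_x, b_y, b_h, b_w, c_1, c_2, c_3]` (one objectness flag, four bounding box values and three class indicators), is an assumption chosen because it adds up to 8. The function name `make_labels` is likewise hypothetical.

```python
from typing import List, Tuple

S = 3            # grid size used in the text's example
NUM_CLASSES = 3  # assumption: 1 objectness + 4 box values + 3 classes = 8


def make_labels(
    objects: List[Tuple[int, float, float, float, float]]
) -> List[List[List[float]]]:
    """Build an S x S grid of 8-value labels from (class_id, cx, cy, w, h) boxes.

    All coordinates are normalized to [0, 1] relative to the whole image.
    Cells without an object keep an all-zero label.
    """
    y = [[[0.0] * (5 + NUM_CLASSES) for _ in range(S)] for _ in range(S)]
    for class_id, cx, cy, w, h in objects:
        # The cell containing the object's center is responsible for it.
        col = min(int(cx * S), S - 1)
        row = min(int(cy * S), S - 1)
        cell = y[row][col]
        cell[0] = 1.0                     # objectness
        cell[1:5] = [cx, cy, h, w]        # bounding box values
        cell[5:] = [0.0] * NUM_CLASSES    # a later object in the same cell overwrites
        cell[5 + class_id] = 1.0          # one-hot class indicator
    return y
```

A single object of class 1 centered in the image sets the label of the middle cell to `[1.0, 0.5, 0.5, h, w, 0.0, 1.0, 0.0]`, while the other eight cells remain all zeros.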

The YOLO algorithm is used for detecting objects using a single neural network. The algorithm generalizes well: it outperforms other strategies when generalizing from natural pictures to other domains. The YOLO algorithm is very easy to implement and can be trained directly on complete images for detection. Region proposal strategies limit the classifier to a particular region, whereas YOLO sees the entire image when making predictions. It also produces fewer false positives in background areas. Compared with other classifier algorithms, YOLO is much more efficient and fast enough to use in real time for object detection.

BLOCK DIAGRAM

FLOWCHART

Capture Image: Capturing an image is a process where a device such as a scanner is used to create a digital representation of an image, which is then stored in a dataset.

Image Acquisition: It is defined as the action of retrieving an image from a source, usually a hardware-based source. The retrieved image acts as the input for processing.

Image Preprocessing: It is a common name for operations on images at the lowest level of abstraction, where both the input and the output are intensity images.
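As one illustration of an operation whose input and output are both intensity images, the sketch below applies linear contrast stretching to a grayscale image stored as a list of rows. This is an illustrative choice only; the report does not say which preprocessing operations the project uses, and the function name `stretch_contrast` is hypothetical.

```python
from typing import List


def stretch_contrast(image: List[List[int]]) -> List[List[int]]:
    """Linearly rescale pixel intensities so they span the full 0-255 range."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:
        # Flat image: there is no contrast to stretch, so return a copy.
        return [row[:] for row in image]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in image]
```

The darkest pixel of the input maps to 0 and the brightest to 255, with the values in between spaced proportionally.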

Image Segmentation: It is the process of partitioning a digital image into multiple segments so that the image is easier to analyze.

Feature Extraction: It is one of the most widely researched areas in the field of image analysis, as it is a prime requirement for representing an object.

Classification: It refers to the task of extracting object information classes from a multiband raster image.

ALGORITHM

Step 1: Start
Step 2: Set up the GPU and Anaconda virtual environment
Step 3: Gather and label pictures
Step 4: Generate training data
Step 5: Create label map and configure training
Step 6: Train object detector
Step 7: Object detected
Step 8: Finish

IV. EXPERIMENT SETUP

To implement object detection we use several algorithms, among them a CNN with the TensorFlow Object Detection API. This is an open-source framework for constructing, training and deploying object detection models, which helps in classifying and detecting objects.

The dataset used for this research was limited to persons and things only. Training images were extracted from personal video, while testing images were captured. A high-end device is needed to run this algorithm, as it requires high computation time and a large amount of data.
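The label map created in Step 5 is a small text file that assigns an integer id to each class name. The sketch below generates such text in the `item { id, name }` layout used by the TensorFlow Object Detection API, where ids start at 1 because 0 is reserved for the background. The helper name `make_label_map` and the class names in the usage note are illustrative assumptions, not the project's actual configuration.

```python
from typing import List


def make_label_map(class_names: List[str]) -> str:
    """Render class names as label-map text, numbering classes from 1."""
    items = []
    for idx, name in enumerate(class_names, start=1):
        items.append(f"item {{\n  id: {idx}\n  name: '{name}'\n}}\n")
    return "\n".join(items)
```

Calling `make_label_map(["person", "thing"])` yields two `item` entries with ids 1 and 2, which could then be saved to a file and referenced from the training configuration.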

V. CONCLUSION

Due to its powerful learning ability and its advantages in dealing with various types of images, scale transformations and background switches, deep learning based object detection has been a research hotspot in recent years. This paper provides a detailed review of deep learning based object detection frameworks, which handle different sub-problems such as clutter and low resolution.


