
Kotebe Education University

Digital Image Processing

Getahun G 1
Chapter Three

Image Analysis / Computer Vision
2
Objectives

 After completing this chapter, you will understand the
concepts of:

Computer vision

Edge detection

Image segmentation

3
Computer Vision

 Computer vision is a field of artificial intelligence


(AI) that enables computers and systems to
derive meaningful information from digital
images, videos and other visual inputs and take
actions or make recommendations based on that
information.
 If AI enables computers to think, computer vision
enables them to see, observe and understand.

4
Cont.
 Computer vision needs lots of data. It runs analyses of the
data over and over until it can distinguish differences and
ultimately recognize images.
 For example, to train a computer to recognize automobile tires, it
needs to be fed vast quantities of tire images and tire-related items
to learn the differences and recognize a tire, especially one with no
defects.

 Two essential technologies are used to accomplish the


tasks of computer vision
 A type of machine learning called deep learning and a convolutional
neural network (CNN).

5
Cont.
 Machine learning uses algorithmic models that enable a computer
to teach itself about the context of visual data.
 If enough data is fed through the model, the computer will “look” at the data
and teach itself to tell one image from another.
 Algorithms enable the machine to learn by itself, rather than someone
programming it to recognize an image.

 A CNN helps a machine learning or deep learning model “look” by


breaking images down into pixels that are given tags or labels.
 It uses the labels to perform convolutions (a mathematical operation on two
functions to produce a third function) and makes predictions about what it is
“seeing.”
 The neural network runs convolutions and checks the accuracy of its
predictions in a series of iterations until the predictions start to come true. It is
then recognizing or seeing images in a way similar to humans.
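As a rough illustration of the convolution operation itself, here is a minimal MATLAB sketch (the 3x3 averaging kernel and the cameraman.tif test image are illustrative choices, not part of any CNN):

% Convolve an example grayscale image with a 3x3 kernel: at every pixel,
% multiply the kernel by the underlying neighborhood and sum the products.
im  = im2double(imread('cameraman.tif'));  % sample image shipped with the Image Processing Toolbox
k   = ones(3,3) / 9;                       % example kernel: simple 3x3 average
out = conv2(im, k, 'same');                % the resulting "third function": a filtered image
figure; imshowpair(im, out, 'montage');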
6
Cont.
 Much like a human making out an image at a distance, a CNN
first discerns hard edges and simple shapes, then fills in
information as it runs iterations of its predictions.
 A CNN is used to understand single images.
A recurrent neural network (RNN) is used in a similar way for video
applications to help computers understand how pictures in a series of
frames are related to one another.

7
Why computer vision matters

Safety, Health, Security, Comfort, Fun, Access


Ridiculously brief history of computer vision
• 1966: Minsky assigns computer vision
as an undergrad summer project
• 1960’s: interpretation of synthetic
worlds
Guzman ‘68
• 1970’s: some progress on interpreting
selected images
• 1980’s: ANNs come and go; shift toward
geometry and increased mathematical
rigor
• 1990’s: face recognition; statistical
analysis in vogue Ohta Kanade ‘78
• 2000’s: broader recognition; large
annotated datasets available; video
processing starts; vision & graphics;
vision for HCI; internet vision, etc.

Turk and Pentland ‘91


How vision is used now

 Examples of state-of-the-art
Optical character recognition (OCR)
Technology to convert scanned docs to text
• If you have a scanner, it probably came with OCR
software

Digit recognition, AT&T labs: http://www.research.att.com/~yann/
License plate readers: http://en.wikipedia.org/wiki/Automatic_number_plate_recognition
Face detection

 Many new digital cameras now detect faces


 Canon, Sony, Fuji, …
Smile detection

Sony Cyber-shot® T70 Digital Still Camera


Object recognition (in supermarkets)

“A smart camera is flush-mounted in the checkout lane, continuously


watching for items. When an item is detected and recognized, the
cashier verifies the quantity of items that were found under the
basket, and continues to close the transaction. The item can remain
under the basket, and with Lane Hawk, you are assured to get paid for
it… “
Vision-based biometrics

“How the Afghan Girl was Identified by Her Iris Patterns”

Login without a password…
 Fingerprint scanners on many new laptops and other devices
 Face recognition systems now beginning to appear more widely
Object recognition (in mobile phones)
Special effects: shape capture

The Matrix movies, ESC Entertainment, XYZRGB,


NRC
Special effects: motion capture

Pirates of the Caribbean, Industrial Light and Magic


Sports

Sportvision first-down line


Smart cars

 Vision systems currently in many car models


Google cars

p://www.nytimes.com/2010/10/10/science/10google.html?ref=artificialintellige
Interactive Games: Kinect

 Object Recognition: http://www.youtube.com/watch?feature=iv&v=fQ59dXOo63o
 Mario: http://www.youtube.com/watch?v=8CTJL5lUjHg
 3D: http://www.youtube.com/watch?v=7QrnwoO1-8A
 Robot: http://www.youtube.com/watch?v=w8BmgtMKFbY
 3D tracking, reconstruction, and interaction:
http://research.microsoft.com/en-us/projects/surfacerecon/default.aspx
Vision in space

Vision systems (JPL) used for several tasks


• Panorama stitching
• 3D terrain modeling
• Obstacle detection, position tracking
• For more, read “Computer Vision on Mars” by Matthies et al.
Industrial robots

Vision-guided robots position nut runners on wheels


Mobile robots

NASA’s Mars Spirit Rover


RoboCup

Saxena et al. 2008


STAIR at Stanford
Medical imaging

Image-guided surgery (Grimson et al., MIT)
3D imaging: MRI, CT
Vision as a source of semantic information
Object categorization
(example street scene with labeled regions: sky, building, flag, face, banner, wall, street lamp, buses, cars)
Scene and context categorization
• outdoor
• city
• traffic
•…
Qualitative spatial information
(example scene labels: slanted and vertical surfaces, horizontal ground, rigid and non-rigid moving objects)
Challenges: viewpoint variation

Michelangelo 1475-1564
Challenges: illumination
Challenges: scale

slide credit: Fei-Fei, Fergus & Torralba


Challenges: deformation

Xu, Beihong 1943


Challenges: occlusion

Magritte, 1957
Challenges: background clutter
Challenges: object intra-class variation

slide credit: Fei-Fei, Fergus & Torralba


Challenges: local ambiguity

slide credit: Fei-Fei, Fergus & Torralba


Challenges or opportunities?
• Images are confusing, but they also reveal the structure of the
world through numerous cues
• Our job is to interpret the cues!

Image source: J. Koenderink


Bottom line
• Perception is an inherently ambiguous problem
– Many different 3D scenes could have given rise to a
particular 2D picture

• Possible solutions
– Bring in more constraints ( or more images)
– Use prior knowledge about the structure of the world
• Need both exact measurements and statistical inference!
Computer Vision vs. Graphics

 3D → 2D implies information loss
    graphics: 3D scene description → 2D image
    vision: 2D image → 3D interpretation (the inverse problem)
  sensitivity to errors
  need for models
Image Filtering and Enhancing

 Linear Filters and Convolution
 Image Smoothing
 Edge Detection
Region Segmentation
Texture
Image Restoration / Noise Removal

Perceptual Organization
Shape Analysis
Computer Vision Applications
 Real-world applications demonstrate how important computer vision
is to endeavors in business, entertainment, transportation,
healthcare and everyday life.
 Here are a few examples of established computer vision tasks:
 Image classification sees an image and can classify it (a dog, an apple, a person’s face).
More precisely, it is able to accurately predict that a given image belongs to a certain class. For
example, a social media company might want to use it to automatically identify and segregate
objectionable images uploaded by users.

 Object detection can use image classification to identify a certain class of image and then
detect and tabulate their appearance in an image or video. Examples include detecting damages
on an assembly line or identifying machinery that requires maintenance.

 Object tracking follows or tracks an object once it is detected. This task is often executed
with images captured in sequence or real-time video feeds. Autonomous vehicles, for example,
need to not only classify and detect objects such as pedestrians, other cars and road
infrastructure, they need to track them in motion to avoid collisions and obey traffic laws. (7)

 Content-based image retrieval uses computer vision to browse, search and retrieve
images from large data stores, based on the content of the images rather than metadata tags
associated with them.

50
Edge Detection
 Edges are significant local changes of intensity
in an image.
 Edges typically occur on the boundary
between two different regions in an image.

51
What Causes Intensity Changes?
 Geometric events
 surface orientation (boundary) discontinuities
 depth discontinuities
 color and texture discontinuities
 Non-geometric events
 illumination changes
 shadows
 inter-reflections
surface normal discontinuity

depth discontinuity

color discontinuity

illumination discontinuity

52
Edge Detection
 Edge detection is used to detect the location and presence of
edges by measuring changes in the intensity of an image.
 Different operations are used in image processing to detect
edges.
 Edge detection responds to variations in grey level, but it also
responds quickly to noise.
 In image processing, edge detection is a very important task.
 Edge detection is a main tool in pattern recognition, image
segmentation and scene analysis.
 It is a type of filter applied to extract the edge points in an
image.
 Sudden changes in brightness occur along the contours (edges)
of objects in the image.
 In image processing, edges are interpreted as a single class of
singularity.
 In a function, a singularity is characterized as a discontinuity
at which the gradient approaches infinity.
53
Cont.
 Edge detection is mostly used for the measurement,
detection and location of changes in the gray level of
an image.
 Edges are the basic features of an image.
 In an object, the clearest parts are the edges and lines.
 With the help of edges and lines, an object's structure is
known.
 So, extracting the edges is a very important technique
in graphics processing and feature extraction.
 The basic idea behind edge detection is:
 Highlight local edges using an edge-enhancement
operator.
 Define the edge strength and set the edge points.
 NOTE: edge detection cannot be performed reliably when
the image is noisy or blurred.
54
Goal of Edge Detection
 Produce a line “drawing” of a scene from an image of that
scene.

55
Why is Edge Detection Useful?
 Important features can be extracted from the
edges of an image (e.g., corners, lines,
curves).
 These features are used by higher-level
computer vision algorithms (e.g., recognition).

56
Edge Descriptors

 Edge normal: unit vector in the direction of


maximum intensity change.
 Edge direction: perpendicular to the direction of
maximum intensity change (i.e., edge normal)
 Edge strength: related to the local image contrast
along the normal.
 Edge position: the image position at which the
edge is located.

57
Modeling Intensity Changes
 Edges can be modeled according to their
intensity profiles
 Step edge: the image intensity abruptly
changes from one value on one side of the
discontinuity to a different value on the
opposite side.

58
Cont.

 Ramp edge: a step edge where the intensity
change is not instantaneous but occurs over a
finite distance.

59
Cont.
 Ridge edge: the image intensity abruptly
changes value but then returns to the starting
value within some short distance (i.e., usually
generated by lines).

60
Cont.

 Roof edge: a ridge edge where the intensity
change is not instantaneous but occurs over a
finite distance (i.e., usually generated by the
intersection of two surfaces).

61
Main Steps in Edge Detection

1. Smoothing: suppress as much noise as possible,


without destroying true edges.
2. Enhancement: apply differentiation to enhance
the quality of edges (i.e., sharpening).
3. Thresholding/Detection: determine which
edge pixels should be discarded as noise and
which should be retained (i.e., threshold edge
magnitude).
4. Localization: determine the exact edge location.
 sub-pixel resolution might be required for some applications to estimate
the location of an edge to better than the spacing between pixels.
 Edge thinning and linking are usually required in this step.
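A minimal MATLAB sketch of these four steps (assuming the Image Processing Toolbox; the image file name, Gaussian size and threshold are arbitrary example values):

im = im2double(rgb2gray(imread('Ethiopian.jpg')));   % example image used in the sample code later
% 1. Smoothing: suppress noise with a small Gaussian filter
smoothed = imfilter(im, fspecial('gaussian', 5, 1), 'symmetric');
% 2. Enhancement: differentiate to emphasize intensity changes
[mag, ~] = imgradient(smoothed, 'sobel');
% 3. Thresholding/Detection: keep only sufficiently strong responses
edges = mag > 0.2 * max(mag(:));                     % arbitrary example threshold
% 4. Localization: thin the retained edges toward single-pixel width
edges = bwmorph(edges, 'thin', Inf);
imshow(edges);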

62
Edge Detection Methods

 Many are implemented with convolution masks
and are based on discrete approximations to
differential operators.
 Differential operations measure the rate of
change in the image brightness function.
 Some operators return orientation information.
 Others only return information about the
existence of an edge at each point.

63
Sobel Operator
 Looks for edges in both the horizontal and vertical
directions, then combines the information into a single
metric.
 The masks are as follows (rows separated by semicolons):

    x = [ -1  0  1 ; -2  0  2 ; -1  0  1 ]
    y = [  1  2  1 ;  0  0  0 ; -1 -2 -1 ]

 Edge Magnitude = sqrt(x^2 + y^2)
 Edge Direction = tan^-1(y / x)
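A hedged sketch of applying the Sobel masks directly with filter2 (note that filter2 performs correlation rather than convolution, which does not affect the magnitude; the image file name is the example used later in this chapter):

im = im2double(rgb2gray(imread('Ethiopian.jpg')));
sx = [-1 0 1; -2 0 2; -1 0 1];     % x mask (responds to horizontal intensity changes)
sy = [ 1 2 1;  0 0 0; -1 -2 -1];   % y mask (responds to vertical intensity changes)
gx  = filter2(sx, im);             % response to the x mask
gy  = filter2(sy, im);             % response to the y mask
mag = sqrt(gx.^2 + gy.^2);         % edge magnitude
dir = atan2(gy, gx);               % edge direction (radians)
imshow(mat2gray(mag));

The Prewitt operator on the next slide works the same way; only the mask coefficients change.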
Prewitt Operator
 Similar to the Sobel, with different mask coefficients:

    x = [ -1  0  1 ; -1  0  1 ; -1  0  1 ]
    y = [ -1 -1 -1 ;  0  0  0 ;  1  1  1 ]

 Edge Magnitude = sqrt(x^2 + y^2)
 Edge Direction = tan^-1(y / x)
Roberts Operator
 Marks edge points only
 No information about edge orientation
 Works best with binary images
 Primary disadvantage:
 High sensitivity to noise
 Few pixels are used to approximate the gradient

First form of Roberts Operator:
    sqrt( [I(r,c) - I(r+1,c+1)]^2 + [I(r,c+1) - I(r+1,c)]^2 )

Second form of Roberts Operator:
    |I(r,c) - I(r+1,c+1)| + |I(r,c+1) - I(r+1,c)|

The two masks:
    h1 = [ 1  0 ;  0 -1 ]        h2 = [ 0  1 ; -1  0 ]

66
Kirsch Compass Masks

 Taking a single mask and rotating it to 8 major


compass orientations: N, NW, W, SW, S, SE, E,
and NE.
 The edge magnitude = The maximum value
found by the convolution of each mask with
the image.
 The edge direction is defined by the mask that
produces the maximum magnitude.
Kirsch Compass Masks (Cont.)
 The Kirsch masks are defined as follows (rows separated by semicolons):

    N  = [ -3 -3  5 ; -3  0  5 ; -3 -3  5 ]     W  = [ -3  5  5 ; -3  0  5 ; -3 -3 -3 ]
    S  = [  5  5  5 ; -3  0 -3 ; -3 -3 -3 ]     E  = [  5  5 -3 ;  5  0 -3 ; -3 -3 -3 ]
    NW = [  5 -3 -3 ;  5  0 -3 ;  5 -3 -3 ]     SW = [ -3 -3 -3 ;  5  0 -3 ;  5  5 -3 ]
    SE = [ -3 -3 -3 ; -3  0 -3 ;  5  5  5 ]     NE = [ -3 -3  5 ; -3  0  5 ; -3  5  5 ]

 EX: If NE produces the maximum value,


then the edge direction is Northeast
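A sketch of the compass idea in MATLAB: filter with all eight masks, take the per-pixel maximum response as the edge magnitude, and remember which mask produced it as the edge direction (the image name is the example used elsewhere in this chapter):

im = im2double(rgb2gray(imread('Ethiopian.jpg')));
masks = { ...
    [-3 -3  5; -3 0  5; -3 -3  5], [-3  5  5; -3 0  5; -3 -3 -3], ...  % N, W
    [ 5  5  5; -3 0 -3; -3 -3 -3], [ 5  5 -3;  5 0 -3; -3 -3 -3], ...  % S, E
    [ 5 -3 -3;  5 0 -3;  5 -3 -3], [-3 -3 -3;  5 0 -3;  5  5 -3], ...  % NW, SW
    [-3 -3 -3; -3 0 -3;  5  5  5], [-3 -3  5; -3 0  5; -3  5  5]};     % SE, NE
responses = zeros(size(im,1), size(im,2), 8);
for i = 1:8
    responses(:,:,i) = filter2(masks{i}, im);   % response to each compass mask
end
[mag, dirIdx] = max(responses, [], 3);          % magnitude and index of the winning mask
imshow(mat2gray(mag));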
Robinson Compass Masks

 Similar to the Kirsch masks, with mask


coefficients of 0, 1, and 2 (rows separated by semicolons):

    N  = [ -1  0  1 ; -2  0  2 ; -1  0  1 ]     W  = [  0  1  2 ; -1  0  1 ; -2 -1  0 ]
    S  = [  1  2  1 ;  0  0  0 ; -1 -2 -1 ]     E  = [  2  1  0 ;  1  0 -1 ;  0 -1 -2 ]
    NW = [  1  0 -1 ;  2  0 -2 ;  1  0 -1 ]     SW = [  0 -1 -2 ;  1  0 -1 ;  2  1  0 ]
    SE = [ -1 -2 -1 ;  0  0  0 ;  1  2  1 ]     NE = [ -2 -1  0 ; -1  0  1 ;  0  1  2 ]
Laplacian Operators
 Edge magnitude is approximated in digital
images by a convolution sum.
 The sign of the result (+ or -) from two
adjacent pixels provides the edge orientation and
tells us which side of the edge is brighter.
Laplacian Operators (Cont.)
 Masks for 4 and 8 neighborhoods (rows separated by semicolons):

    4-neighborhood: [  0 -1  0 ; -1  4 -1 ;  0 -1  0 ]
    8-neighborhood: [ -1 -1 -1 ; -1  8 -1 ; -1 -1 -1 ]

 Masks with stressed significance of the central pixel or its
neighborhood:

    [  1 -2  1 ; -2  4 -2 ;  1 -2  1 ]        [ -2  1 -2 ;  1  4  1 ; -2  1 -2 ]
Performance

 Sobel and Prewitt methods are very effective,
providing good edge maps.
 Kirsch and Robinson methods require more
time for calculation and their results are not
better than the ones produced by Sobel and
Prewitt methods.
 Roberts and Laplacian methods do not perform as
well as expected.
73
Sample code
%read color image and convert it to gray level image
%display the original image
im=imread('Ethiopian.jpg');
im2=rgb2gray(im);
subplot(2,4,1);
imshow(im2);title('Original image');

%apply Sobel Operator


%display on horizontal edges
sobelhz=edge(im2,'sobel','horizontal');
subplot(2,4,2);
imshow(sobelhz); title('sobel horizontal edge');

74
Cont.
%display on Vertical edges
sobelhz=edge(im2,'sobel','vertical');
subplot(2,4,3);
imshow(sobelhz); title('sobel vertical edge');

%display both horizontal and vertical edges using the sobel operator
sobelverhz=edge(im2,'sobel','both');
subplot(2,4,4);
imshow(sobelverhz,[]);title('sobel-All edges');

75
Cont.
%apply Roberts Operator
%Display both horizontal and vertical edges
roberted=edge(im2,'roberts');
subplot(2,4,5);
imshow(roberted,[]);title('Roberts-Edge');

% apply Prewitt Operator


% Display both horizontal and Vertical Edges
prewitted=edge(im2,'prewitt');
subplot(2,4,6);
imshow(prewitted,[]);title('Prewitt-Edge');

76
Cont.

% apply Laplacian Filter


f=fspecial('laplacian');
lepedg=imfilter(im2,f,'symmetric');
subplot(2,4,7);
imshow(lepedg,[]);title('Laplacian Filter');

%apply Canny Edge Detection


cannyedg=edge(im2,'canny');
subplot(2,4,8);
imshow(cannyedg,[]);title('Canny Edge');

77
Edge Map In Matlab Program(Home Work)

 Implement all methods in this presentation


 Set up edge detection mask(s)
 Use convolution method (filter2 function)
 Calculate edge magnitude
 Show the result of edge map
 No calculation of edge direction
Summary of Edge Detection
 Matlab’s Image Processing Toolbox provides the edge
function to find edges in an image.
 The edge function supports six different edge-finding
methods: Sobel, Prewitt, Roberts, Laplacian of Gaussian,
Zero-cross, and Canny.
 edge is a powerful, general-purpose edge-detection function.
Image Segmentation
 Image segmentation is the operation of
partitioning an image into a collection of
connected sets of pixels.
1. into regions, which usually cover the image
2. into linear structures, such as
 line segments
 curve segments
3. into 2D shapes, such as
 circles
 ellipses
 ribbons (long, symmetric regions)
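As a minimal illustration of partitioning an image into connected sets of pixels (a sketch assuming the Image Processing Toolbox; thresholding is just one simple way to obtain a binary image first, and the file name is the example used earlier):

im = rgb2gray(imread('Ethiopian.jpg'));
bw = imbinarize(im);                  % foreground/background via a global (Otsu) threshold
[labels, n] = bwlabel(bw);            % label each connected set of foreground pixels
imshow(label2rgb(labels));            % display every connected region in its own color
fprintf('%d connected regions found\n', n);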

80
Cont.
 The purpose of image segmentation is to partition an
image into meaningful regions with respect to a
particular application
 The segmentation is based on measurements taken from
the image and might be grey level, colour, texture, depth
or motion.
 Usually image segmentation is an initial and vital step in
a series of processes aimed at overall image
understanding
 Applications of image segmentation include
 Identifying objects in a scene for object-based
measurements such as size and shape
 Identifying objects in a moving scene for object-based video
compression (MPEG4)
 Identifying objects which are at different distances from a
sensor using depth measurements from a laser range finder,
enabling path planning for a mobile robot
81
Example

 Segmentation based on grey scale


 Very simple ‘model’ of grey scale leads to inaccuracies
in object labelling
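A hedged sketch of such a grey-level segmentation, using Otsu's method (graythresh) to pick a single global threshold automatically; this very simple model is exactly what produces the labelling inaccuracies mentioned above:

im = rgb2gray(imread('Ethiopian.jpg'));   % example image used elsewhere in the chapter
t  = graythresh(im);                      % Otsu's method: choose one global threshold in [0,1]
bw = imbinarize(im, t);                   % on older MATLAB releases: im2bw(im, t)
subplot(1,2,1); imshow(im); title('Input');
subplot(1,2,2); imshow(bw); title('Grey-level segmentation');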

82
Example 2

 Segmentation based on texture


 Enables object surfaces with varying patterns of grey to
be segmented

83
Example 3
 Segmentation based on motion
 The main difficulty of motion segmentation is that an intermediate
step is required to (either implicitly or explicitly) estimate an optical
flow field
 The segmentation must be based on this estimate and not, in general,
the true flow

84
Applications of Image Segmentation
 Image segmentation is an important step in artificial vision.
Machines need to divide visual data into segments for
segment-specific processing to take place.
 Image segmentation thus finds its way in prominent fields
like Robotics, Medical Imaging, Autonomous Vehicles, and
Intelligent Video Analytics.
 Apart from these applications, Image segmentation is also
used by satellites on aerial imagery for segmenting out
roads, buildings, and trees.

85
Medical imaging
 Medical Imaging is an important domain of computer vision that
focuses on the diagnosis of diseases from visual data, both in the
form of simple visual data and biomedical scans.
 Segmentation plays an important role in medical imaging as it helps
doctors identify possible malignant features in images in a fast and
accurate manner.
 X-Ray segmentation
 CT scan organ segmentation
 Dental instance segmentation
 Digital pathology cell segmentation
 Surgical video annotation

86
Reading Assignment
 Read more about image segmentation techniques.

87
Thank you !!!!

88
