Kotebe Education University
Digital Image Processing
Getahun G
Chapter Three
Image Analysis / Computer Vision
Objectives
After completing this chapter, you will be able to explain the concepts of:
Computer vision
Edge detection
Image segmentation
Computer Vision
Computer vision is a field of artificial intelligence
(AI) that enables computers and systems to
derive meaningful information from digital
images, videos and other visual inputs and take
actions or make recommendations based on that
information.
If AI enables computers to think, computer vision
enables them to see, observe and understand.
Cont.
Computer vision needs lots of data. It runs analyses of the
data over and over until it can distinguish differences and
ultimately recognize images.
For example, to train a computer to recognize automobile tires, it
needs to be fed vast quantities of tire images and tire-related items
to learn the differences and recognize a tire, especially one with no
defects.
Two essential technologies are used to accomplish the
tasks of computer vision:
a type of machine learning called deep learning, and a convolutional
neural network (CNN).
Cont.
Machine learning uses algorithmic models that enable a computer
to teach itself about the context of visual data.
If enough data is fed through the model, the computer will “look” at the data
and teach itself to tell one image from another.
Algorithms enable the machine to learn by itself, rather than someone
programming it to recognize an image.
A CNN helps a machine learning or deep learning model “look” by
breaking images down into pixels that are given tags or labels.
It uses the labels to perform convolutions (a mathematical operation on two
functions to produce a third function) and makes predictions about what it is
“seeing.”
The neural network runs convolutions and checks the accuracy of its
predictions in a series of iterations until the predictions start to come true. It is
then recognizing or seeing images in a way similar to humans.
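To make the convolution step concrete, here is a minimal MATLAB sketch (MATLAB is the language of the sample code later in this chapter). It applies one fixed 3×3 averaging kernel with conv2; a CNN performs this same basic operation many times over with learned kernels. The test image cameraman.tif ships with the Image Processing Toolbox.

%a single convolution: the basic operation a CNN repeats
%many times with learned kernels
im=im2double(imread('cameraman.tif')); %toolbox test image
k=ones(3)/9;                           %3x3 averaging kernel
smoothed=conv2(im,k,'same');           %convolve, keep image size
subplot(1,2,1); imshow(im); title('Input');
subplot(1,2,2); imshow(smoothed); title('After one convolution');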
Cont.
Much like a human making out an image at a distance, a CNN
first discerns hard edges and simple shapes, then fills in
information as it runs iterations of its predictions.
A CNN is used to understand single images.
A recurrent neural network (RNN) is used in a similar way for video
applications to help computers understand how pictures in a series of
frames are related to one another.
Why computer vision matters
Safety · Health · Security · Comfort · Fun · Access
Ridiculously brief history of computer vision
• 1966: Minsky assigns computer vision as an undergrad summer project
• 1960s: interpretation of synthetic worlds (Guzman ’68)
• 1970s: some progress on interpreting selected images (Ohta & Kanade ’78)
• 1980s: ANNs come and go; shift toward geometry and increased mathematical rigor
• 1990s: face recognition; statistical analysis in vogue (Turk and Pentland ’91)
• 2000s: broader recognition; large annotated datasets available; video processing starts; vision & graphics; vision for HCI; internet vision, etc.
How vision is used now
Examples of state-of-the-art
Optical character recognition (OCR)
Technology to convert scanned docs to text
• If you have a scanner, it probably came with OCR software
Digit recognition (AT&T Labs): http://www.research.att.com/~yann/
License plate readers: http://en.wikipedia.org/wiki/Automatic_number_plate_recognition
Face detection
Many new digital cameras now detect faces
Canon, Sony, Fuji, …
Smile detection
Sony Cyber-shot® T70 Digital Still Camera
Object recognition (in supermarkets)
“A smart camera is flush-mounted in the checkout lane, continuously
watching for items. When an item is detected and recognized, the
cashier verifies the quantity of items that were found under the
basket, and continues to close the transaction. The item can remain
under the basket, and with Lane Hawk, you are assured to get paid for
it…”
Vision-based biometrics
“How the Afghan Girl was Identified by Her Iris Patterns”
Login without a password…
Fingerprint scanners on many new laptops and other devices
Face recognition systems now beginning to appear more widely
Object recognition (in mobile phones)
Special effects: shape capture
The Matrix movies, ESC Entertainment, XYZRGB, NRC
Special effects: motion capture
Pirates of the Caribbean, Industrial Light and Magic
Sports
Sportvision first-down line system
Smart cars
Vision systems currently in many car models
Google cars
http://www.nytimes.com/2010/10/10/science/10google.html?ref=artificialintelligence
Interactive Games: Kinect
Object Recognition: http://www.youtube.com/watch?feature=iv&v=fQ59dXOo63o
Mario: http://www.youtube.com/watch?v=8CTJL5lUjHg
3D: http://www.youtube.com/watch?v=7QrnwoO1-8A
Robot: http://www.youtube.com/watch?v=w8BmgtMKFbY
3D tracking, reconstruction, and interaction:
http://research.microsoft.com/en-us/projects/surfacerecon/default.aspx
Vision in space
Vision systems (JPL) used for several tasks
• Panorama stitching
• 3D terrain modeling
• Obstacle detection, position tracking
• For more, read “Computer Vision on Mars” by Matthies et al.
Industrial robots
Vision-guided robots position nut runners on wheels
Mobile robots
NASA’s Mars Spirit Rover
RoboCup
STAIR at Stanford (Saxena et al. 2008)
Medical imaging
Image-guided surgery (Grimson et al., MIT)
3D imaging: MRI, CT
Vision as a source of semantic information
Object categorization
(Figure: a street scene with regions labelled sky, building, flag, face, banner, wall, street lamp, buses, cars.)
Scene and context categorization
• outdoor
• city
• traffic
•…
Qualitative spatial information
(Figure: a scene annotated with slanted, vertical and horizontal surfaces, and with rigid and non-rigid moving objects.)
Challenges: viewpoint variation
(Michelangelo, 1475–1564)
Challenges: illumination
Challenges: scale
slide credit: Fei-Fei, Fergus & Torralba
Challenges: deformation
(Xu Beihong, 1943)
Challenges: occlusion
(Magritte, 1957)
Challenges: background clutter
Challenges: object intra-class variation
slide credit: Fei-Fei, Fergus & Torralba
Challenges: local ambiguity
slide credit: Fei-Fei, Fergus & Torralba
Challenges or opportunities?
• Images are confusing, but they also reveal the structure of the
world through numerous cues
• Our job is to interpret the cues!
Image source: J. Koenderink
Bottom line
• Perception is an inherently ambiguous problem
– Many different 3D scenes could have given rise to a
particular 2D picture
• Possible solutions
– Bring in more constraints (or more images)
– Use prior knowledge about the structure of the world
• Need both exact measurements and statistical inference!
Computer Vision vs. Graphics
Graphics goes from 3D models to 2D images; vision goes from 2D images back to 3D descriptions.
The 3D-to-2D projection implies information loss, which makes vision sensitive to errors and creates the need for models.
Image Filtering and Enhancing
Linear Filters and Convolution
Image Smoothing
Edge Detection
Region Segmentation
Texture
Image Restoration / Noise Removal
Perceptual Organization
Shape Analysis
Computer Vision Applications
Real-world applications demonstrate how important computer vision
is to endeavors in business, entertainment, transportation,
healthcare and everyday life.
Here are a few examples of established computer vision tasks:
Image classification sees an image and can classify it (a dog, an apple, a person’s face).
More precisely, it is able to accurately predict that a given image belongs to a certain class. For
example, a social media company might want to use it to automatically identify and segregate
objectionable images uploaded by users.
Object detection can use image classification to identify a certain class of image and then
detect and tabulate their appearance in an image or video. Examples include detecting damages
on an assembly line or identifying machinery that requires maintenance.
Object tracking follows or tracks an object once it is detected. This task is often executed
with images captured in sequence or real-time video feeds. Autonomous vehicles, for example,
need to not only classify and detect objects such as pedestrians, other cars and road
infrastructure, they need to track them in motion to avoid collisions and obey traffic laws.
Content-based image retrieval uses computer vision to browse, search and retrieve
images from large data stores, based on the content of the images rather than metadata tags
associated with them.
Edge Detection
Edges are significant local changes of intensity
in an image.
Edges typically occur on the boundary
between two different regions in an image.
What Causes Intensity Changes?
Geometric events
surface orientation (boundary) discontinuities
depth discontinuities
color and texture discontinuities
Non-geometric events
illumination changes
shadows
inter-reflections
Edge Detection
Edge detection is used to detect the presence and location of
edges by measuring changes in the intensity of an image.
Different operators are used in image processing to detect
edges.
Edge detectors respond to variations in grey level, but they also
respond quickly to noise.
In image processing, edge detection is a very important task.
Edge detection is the main tool in pattern recognition, image
segmentation and scene analysis.
It is a type of filter which is applied to extract the edge points in an
image.
Sudden changes in brightness occur where an edge contour crosses
the image.
In image processing, edges are interpreted as a single class of
singularity.
In a function, a singularity is characterized as a discontinuity at
which the gradient approaches infinity.
Cont.
Edge detection is mostly used for the measurement,
detection and localization of changes in an image's
gray level.
Edges are the basic feature of an image.
In an object, the clearest parts are the edges and lines.
With the help of edges and lines, an object's structure is
known.
So, extracting the edges is a very important technique
in graphics processing and feature extraction.
The basic idea behind edge detection is:
Highlight local edges by applying an edge-enhancement
operator.
Define the edge strength and set the edge points.
NOTE: edge detection performs poorly when the image
is noisy or blurred.
Goal of Edge Detection
Produce a line “drawing” of a scene from an image of that
scene.
Why is Edge Detection Useful?
Important features can be extracted from the
edges of an image (e.g., corners, lines,
curves).
These features are used by higher-level
computer vision algorithms (e.g., recognition).
Edge Descriptors
Edge normal: unit vector in the direction of
maximum intensity change.
Edge direction: perpendicular to the direction of
maximum intensity change (i.e., edge normal)
Edge strength: related to the local image contrast
along the normal.
Edge position: the image position at which the
edge is located.
Modeling Intensity Changes
Edges can be modeled according to their
intensity profiles.
Step edge: the image intensity abruptly
changes from one value on one side of the
discontinuity to a different value on the
opposite side.
Cont.
Ramp edge: a step edge where the intensity
change is not instantaneous but occurs over a
finite distance.
Cont.
Ridge edge: the image intensity abruptly
changes value but then returns to the starting
value within some short distance (i.e., usually
generated by lines).
Cont.
Roof edge: a ridge edge where the intensity
change is not instantaneous but occurs over a
finite distance (i.e., usually generated by the
intersection of two surfaces).
Main Steps in Edge Detection
1. Smoothing: suppress as much noise as possible,
without destroying true edges.
2. Enhancement: apply differentiation to enhance
the quality of edges (i.e., sharpening).
3. Thresholding/Detection: determine which
edge pixels should be discarded as noise and
which should be retained (i.e., threshold edge
magnitude).
4. Localization: determine the exact edge location.
Sub-pixel resolution might be required for some applications to estimate
the location of an edge to better than the spacing between pixels.
Edge thinning and linking are usually required in this step.
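As a minimal MATLAB sketch of steps 1–3 (assuming the Image Processing Toolbox functions imgaussfilt and imgradient; the image name follows the sample code later in this chapter, and the 0.3 threshold factor is an arbitrary illustration):

%main steps of edge detection on a grayscale image
im2=rgb2gray(imread('Ethiopian.jpg'));  %image from the sample code
g=imgaussfilt(im2,1);                   %1. smoothing (Gaussian, sigma 1)
gmag=imgradient(g,'sobel');             %2. enhancement: gradient magnitude
edges=gmag>0.3*max(gmag(:));            %3. thresholding (arbitrary factor)
imshow(edges); title('Thresholded edge map'); %4. thinning/linking omitted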
Edge Detection Methods
Many are implemented with convolution masks
and based on discrete approximations to
differential operators.
Differential operations measure the rate of
change in the image brightness function.
Some operators return orientation information.
Others only return information about the
existence of an edge at each point.
Sobel Operator
Looks for edges in both horizontal and vertical
directions, then combines the information into a single
metric.
The masks are as follows:

    x:  -1  0  1        y:   1  2  1
        -2  0  2             0  0  0
        -1  0  1            -1 -2 -1

Edge Magnitude = √(x² + y²)
Edge Direction = tan⁻¹(y/x)
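As a sketch of how these masks are applied, the following MATLAB code computes the Sobel magnitude and direction directly with conv2 (the image name follows the sample code later in this chapter):

%apply the Sobel masks directly with conv2
im=im2double(rgb2gray(imread('Ethiopian.jpg')));
sx=[-1 0 1; -2 0 2; -1 0 1];   %x (horizontal change) mask
sy=[1 2 1; 0 0 0; -1 -2 -1];   %y (vertical change) mask
gx=conv2(im,sx,'same');
gy=conv2(im,sy,'same');
mag=sqrt(gx.^2+gy.^2);         %edge magnitude
gdir=atan2(gy,gx);             %edge direction (radians)
imshow(mag,[]); title('Sobel edge magnitude');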
Prewitt Operator
Similar to the Sobel, with different mask coefficients:

    x:  -1  0  1        y:   1  1  1
        -1  0  1             0  0  0
        -1  0  1            -1 -1 -1

Edge Magnitude = √(x² + y²)
Edge Direction = tan⁻¹(y/x)
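The same conv2 procedure from the Sobel sketch applies here; only the mask coefficients change:

%Prewitt masks; 'im' as in the Sobel sketch above
px=[-1 0 1; -1 0 1; -1 0 1];
py=[1 1 1; 0 0 0; -1 -1 -1];
mag=sqrt(conv2(im,px,'same').^2+conv2(im,py,'same').^2);
imshow(mag,[]); title('Prewitt edge magnitude');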
Roberts Operator
Marks edge points only
No information about edge orientation
Works best with binary images
Primary disadvantage:
High sensitivity to noise
Few pixels are used to approximate the gradient
First form of Roberts Operator:

    √( [I(r,c) − I(r−1,c−1)]² + [I(r,c−1) − I(r−1,c)]² )

Second form of Roberts Operator:

    |I(r,c) − I(r−1,c−1)| + |I(r,c−1) − I(r−1,c)|

    h1:  1  0        h2:   0  1
         0 -1             -1  0
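A minimal sketch of the second (absolute-value) form, again with conv2 and the two 2×2 masks:

%Roberts operator, second (absolute-value) form
im=im2double(rgb2gray(imread('Ethiopian.jpg')));
h1=[1 0; 0 -1];
h2=[0 1; -1 0];
mag=abs(conv2(im,h1,'same'))+abs(conv2(im,h2,'same'));
imshow(mag,[]); title('Roberts edge magnitude');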
Kirsch Compass Masks
Taking a single mask and rotating it to 8 major
compass orientations: N, NW, W, SW, S, SE, E,
and NE.
The edge magnitude is the maximum value
found by convolving each of the eight masks
with the image.
The edge direction is defined by the mask that
produces the maximum magnitude.
Kirsch Compass Masks (Cont.)
The Kirsch masks are defined as follows:

    N:  -3 -3  5    W:  -3  5  5    S:   5  5  5    E:   5  5 -3
        -3  0  5        -3  0  5        -3  0 -3         5  0 -3
        -3 -3  5        -3 -3 -3        -3 -3 -3        -3 -3 -3

    NW:  5 -3 -3    SW: -3 -3 -3    SE: -3 -3 -3    NE: -3 -3 -3
         5  0 -3         5  0 -3        -3  0 -3        -3  0  5
         5 -3 -3         5  5 -3         5  5  5        -3  5  5

EX: If NE produces the maximum value,
then the edge direction is Northeast
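A sketch of the compass procedure using the masks above: convolve with all eight masks, keep the per-pixel maximum response as the edge magnitude, and record which mask won as the edge direction. The same loop works unchanged for the Robinson masks on the next slide.

%Kirsch compass masks (order: N W S E NW SW SE NE, as above)
im=im2double(rgb2gray(imread('Ethiopian.jpg')));
masks={[-3 -3 5; -3 0 5; -3 -3 5], [-3 5 5; -3 0 5; -3 -3 -3], ...
       [5 5 5; -3 0 -3; -3 -3 -3], [5 5 -3; 5 0 -3; -3 -3 -3], ...
       [5 -3 -3; 5 0 -3; 5 -3 -3], [-3 -3 -3; 5 0 -3; 5 5 -3], ...
       [-3 -3 -3; -3 0 -3; 5 5 5], [-3 -3 -3; -3 0 5; -3 5 5]};
mag=-inf(size(im));
gdir=zeros(size(im));            %index of the winning mask
for k=1:numel(masks)
    r=conv2(im,masks{k},'same');
    win=r>mag;
    mag(win)=r(win);             %edge magnitude = max response
    gdir(win)=k;                 %edge direction = mask that produced it
end
imshow(mag,[]); title('Kirsch edge magnitude');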
Robinson Compass Masks
Similar to the Kirsch masks, with mask
coefficients of 0, ±1, and ±2:

    N:  -1  0  1    W:   0  1  2    S:   1  2  1    E:   2  1  0
        -2  0  2        -1  0  1         0  0  0         1  0 -1
        -1  0  1        -2 -1  0        -1 -2 -1         0 -1 -2

    NW:  1  0 -1    SW:  0 -1 -2    SE: -1 -2 -1    NE: -2 -1  0
         2  0 -2         1  0 -1         0  0  0        -1  0  1
         1  0 -1         2  1  0         1  2  1         0  1  2
Laplacian Operators
Edge magnitude is approximated in digital
images by a convolution sum.
The sign of the result (+ or −) from two
adjacent pixels provides edge orientation and
tells us which side of the edge is brighter.
Laplacian Operators (Cont.)
Masks for 4 and 8 neighborhoods:

     0 -1  0        -1 -1 -1
    -1  4 -1        -1  8 -1
     0 -1  0        -1 -1 -1

Masks with stressed significance of the central
pixel or its neighborhood:

    -1  2 -1         2 -1  2
     2 -4  2        -1 -4 -1
    -1  2 -1         2 -1  2
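A sketch applying the 4-neighborhood mask above with conv2; the response is signed, which is how the operator indicates which side of an edge is brighter:

%apply the 4-neighborhood Laplacian mask
im=im2double(rgb2gray(imread('Ethiopian.jpg')));
lap4=[0 -1 0; -1 4 -1; 0 -1 0];
resp=conv2(im,lap4,'same');   %signed response; zero crossings mark edges
imshow(resp,[]); title('Laplacian response');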
Performance
Sobel and Prewitt methods are very effective,
providing good edge maps.
Kirsch and Robinson methods require more
time for calculation and their results are not
better than the ones produced by the Sobel and
Prewitt methods.
Roberts and Laplacian methods do not perform
as well as expected.
Sample code
%read color image and convert it to gray level image
%display the original image
im=imread('Ethiopian.jpg');
im2=rgb2gray(im);
subplot(2,4,1);
imshow(im2);title('Original image');
%apply Sobel Operator
%display horizontal edges
sobelhz=edge(im2,'sobel','horizontal');
subplot(2,4,2);
imshow(sobelhz); title('sobel horizontal edge');
Cont.
%display vertical edges
sobelvt=edge(im2,'sobel','vertical');
subplot(2,4,3);
imshow(sobelvt); title('sobel vertical edge');
%display both horizontal and vertical using sobel operator
sobelverhz=edge(im2,'sobel','both');
subplot(2,4,4);
imshow(sobelverhz,[]);title('sobel-All edges');
Cont.
%apply Roberts Operator
%Display both horizontal and vertical edges
roberted=edge(im2,'roberts');
subplot(2,4,5);
imshow(roberted,[]);title('Roberts-Edge');
% apply Prewitt Operator
% Display both horizontal and Vertical Edges
prewitted=edge(im2,'prewitt');
subplot(2,4,6);
imshow(prewitted,[]);title('Prewitt-Edge');
Cont.
% apply Laplacian Filter
f=fspecial('laplacian');
lepedg=imfilter(im2,f,'symmetric');
subplot(2,4,7);
imshow(lepedg,[]);title('Laplacian Filter');
%apply Canny Edge Detection
cannyedg=edge(im2,'canny');
subplot(2,4,8);
imshow(cannyedg,[]);title('Canny Edge');
Edge Map in Matlab Program (Homework)
Implement all methods in this presentation
Set up edge detection mask(s)
Use convolution method (filter2 function)
Calculate edge magnitude
Show the result of edge map
No calculation of edge direction
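A starting-point skeleton (not a full solution), shown with the Sobel mask pair only; note that filter2 performs correlation rather than convolution, which for these masks only flips signs that the magnitude ignores:

%skeleton: edge map via filter2 with one mask pair (Sobel shown)
im=im2double(rgb2gray(imread('Ethiopian.jpg')));
sx=[-1 0 1; -2 0 2; -1 0 1];   %set up edge detection masks
sy=[1 2 1; 0 0 0; -1 -2 -1];
gx=filter2(sx,im);             %filtering step
gy=filter2(sy,im);
mag=sqrt(gx.^2+gy.^2);         %edge magnitude (no direction needed)
imshow(mag,[]); title('Edge map via filter2');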
Summary of Edge Detection
Matlab’s image processing toolbox provides the
edge function to find edges in an image.
The edge function supports six different edge-finding
methods: Sobel, Prewitt, Roberts, Laplacian of
Gaussian, Zero-cross, and Canny.
Canny is the most powerful of these edge-detection methods.
Image Segmentation
Image segmentation is the operation of
partitioning an image into a collection of
connected sets of pixels.
1. into regions, which usually cover the image
2. into linear structures, such as
line segments
curve segments
3. into 2D shapes, such as
circles
ellipses
ribbons (long, symmetric regions)
Cont.
The purpose of image segmentation is to partition an
image into meaningful regions with respect to a
particular application
The segmentation is based on measurements taken from
the image and might be grey level, colour, texture, depth
or motion.
Usually image segmentation is an initial and vital step in
a series of processes aimed at overall image
understanding
Applications of image segmentation include
Identifying objects in a scene for object-based
measurements such as size and shape
Identifying objects in a moving scene for object-based video
compression (MPEG4)
Identifying objects which are at different distances from a
sensor using depth measurements from a laser range finder,
enabling path planning for mobile robots
Example
Segmentation based on grey scale
Very simple ‘model’ of grey scale leads to inaccuracies
in object labelling
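A minimal sketch of such a grey-level segmentation using Otsu's global threshold (graythresh, imbinarize, bwlabel and label2rgb are Image Processing Toolbox functions; the image name follows the sample code earlier in this chapter). Its very simplicity is what produces the labelling inaccuracies noted above.

%grey-level segmentation by global (Otsu) thresholding
im=rgb2gray(imread('Ethiopian.jpg'));
t=graythresh(im);              %Otsu threshold in [0,1]
bw=imbinarize(im,t);           %two-region segmentation
lbl=bwlabel(bw);               %label connected regions
imshow(label2rgb(lbl)); title('Grey-level segmentation');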
Example 2
Segmentation based on texture
Enables object surfaces with varying patterns of grey to
be segmented
Example 3
Segmentation based on motion
The main difficulty of motion segmentation is that an intermediate
step is required to (either implicitly or explicitly) estimate an optical
flow field
The segmentation must be based on this estimate and not, in general,
the true flow
Applications of Image Segmentation
Image segmentation is an important step in artificial vision.
Machines need to divide visual data into segments for
segment-specific processing to take place.
Image segmentation thus finds its way in prominent fields
like Robotics, Medical Imaging, Autonomous Vehicles, and
Intelligent Video Analytics.
Apart from these applications, Image segmentation is also
used by satellites on aerial imagery for segmenting out
roads, buildings, and trees.
Medical imaging
Medical Imaging is an important domain of computer vision that
focuses on the diagnosis of diseases from visual data, both in the
form of simple visual data and biomedical scans.
Segmentation forms an important role in medical imaging as it helps
doctors identify possible malignant features in images in a fast and
accurate manner.
X-Ray segmentation
CT scan organ segmentation
Dental instance segmentation
Digital pathology cell segmentation
Surgical video annotation
Reading Assignment
Read more about image segmentation techniques.
Thank you !!!!