Computer Vision Intro

Computer vision is a field of AI that enables computers to interpret images and videos, aiming for recognition, measurement, and interaction with visual data. It faces challenges such as incomplete information and high-dimensional data, while offering applications in various sectors like healthcare, security, and entertainment. The study of computer vision is crucial due to the vast data available and its significance in automation and human-computer interaction.

Uploaded by

Leela mutyala

Computer Vision – Introduction

Definition

• Field of AI enabling computers to understand and interpret images & videos like humans.
• Goals: Recognition, measurement, search, interaction with visual data.
• Definitions:
  o Ballard & Brown (1982) – Build explicit, meaningful descriptions of physical objects from images.
  o Trucco & Verri (1998) – Compute 3D properties from one or more images.
  o Stockman & Shapiro (2001) – Make useful decisions about objects/scenes from images.

Scope of Computer Vision

1. Perception & Interpretation – Identify objects, people, scenes, activities.
2. Measurement – Extract 3D world properties.
3. Search & Organization – Manage and retrieve visual content.

Human vs. Computer Vision

• Computers: Good at simple, repetitive tasks, given large amounts of data.
• Humans: Superior in complex, context-based tasks.
• Trend: AI is improving rapidly, changing what counts as “hard”.

Why is Vision Hard?

1. Incomplete Information – Unknown camera settings, lighting, object shape.
2. Inverse Problem – The real-world 3D scene must be inferred from 2D image data.
3. High-Dimensional Data – Images contain many values, so processing is computationally expensive.
4. No Complete Model – The human visual system is not fully understood.
5. Context Dependency – Deciding what is important in an image.
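
Points 2 and 3 can be made concrete with a small sketch. It assumes an idealized pinhole camera (focal length 1, no lens distortion), and the names `project` and `image_dimension` are illustrative, not taken from any library.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a 3D point (X, Y, Z) onto the image plane.

    Assumption: ideal pinhole camera at the origin looking along +Z.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (focal_length * x / z, focal_length * y / z)


def image_dimension(width, height, channels=3):
    """Number of raw values in one image (one dimension per value)."""
    return width * height * channels


# Inverse problem: two different 3D points on the same viewing ray
# land on the same 2D image location, so depth is lost.
near = (1.0, 2.0, 4.0)
far = (2.0, 4.0, 8.0)  # same direction, twice as far away
print(project(near), project(far))

# High-dimensional data: even a small RGB image has many values.
print(image_dimension(640, 480))
```

Both points project to the same image coordinates, which is why a single image cannot determine depth without extra information. A 640×480 RGB image already has 921,600 values.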

Importance

• Visual cortex occupies ~50% of the macaque brain; ~1/3 of the human brain is dedicated to vision.
• Billions of images/videos captured daily (Flickr, Facebook, Instagram, YouTube).

Applications

• Consumer Electronics: QR codes, panorama, night mode, FaceID.
• Healthcare: Medical imaging, diagnosis, surgery planning.
• Security: Biometric authentication, CCTV monitoring.
• Industry: Robotics, automated inspection, OCR, number plate recognition.
• Transportation: Self-driving cars, driver monitoring.
• Space: Mars rovers, telescopes.
• Entertainment: AR/VR, film VFX, virtual sports replay.
• Retail: Cashier-less checkout, theft detection.
• Creative: Image/video generation (GANs), photo editing.

Key Research Areas

• Reconstruction: 3D/4D scene recovery from photos or depth cameras.
• Recognition: Object detection, face recognition, classification.
• Generation: Image-to-image translation, GANs.
• Visual Search: Content-based retrieval.
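
As a rough illustration of content-based retrieval, the sketch below ranks images by how similar their intensity histograms are to a query. This is only one simple approach, chosen here for brevity. The images are toy lists of grayscale values, and the names `histogram`, `intersection`, and `search` are hypothetical helpers, not a library API.

```python
def histogram(pixels, bins=4, max_value=256):
    """Normalized intensity histogram of a list of grayscale pixels."""
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // max_value] += 1
    total = len(pixels)
    return [c / total for c in counts]


def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def search(query_pixels, database, bins=4):
    """Return image names ordered from most to least similar to the query."""
    q = histogram(query_pixels, bins)
    scored = [
        (intersection(q, histogram(pixels, bins)), name)
        for name, pixels in database.items()
    ]
    return [name for _, name in sorted(scored, reverse=True)]


# Toy "images": flat lists of grayscale values in the range 0-255.
database = {
    "dark": [10, 20, 30, 40],
    "bright": [200, 220, 240, 250],
    "mixed": [10, 100, 150, 250],
}
query = [5, 15, 25, 60]  # a mostly dark query image
print(search(query, database))
```

Real systems usually compare learned feature vectors rather than raw histograms, but the retrieval loop has the same shape: describe each image, score it against the query, and sort.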

Why Study Computer Vision?

• Huge data availability from internet and sensors.
• Broad application across multiple domains.
• Critical in automation, AI, and human-computer interaction.
