Computers That Can See

Your phone unlocks by recognising your face. A hospital AI can spot patterns in an X-ray. These systems are powerful — but they are pattern-matching on numbers, not truly seeing. Understanding how they work is the first step to knowing when to trust them.

By the end of this lesson you will be able to:

Explain how digital images are represented as grids of numbers (pixels)
Describe how image recognition AI finds patterns in pixel data
Give real examples of computer vision in daily life
Explain why AI image recognition can be fooled, and what this means for trust

Let's Learn

What you will learn today

Understand how computers perceive images through pixels, and how AI learns to recognise objects — and where it can still fail.

🔁

What Is an Image to a Computer?

Hold your phone up and take a photo. To you, it is a memory, a scene, a face. To your phone, it is millions of numbers.

Every digital image is made of pixels — tiny squares of colour, each represented by three numbers: one for how red it is (0–255), one for green (0–255), and one for blue (0–255). A 12-megapixel photo has 12 million pixels — that is 36 million numbers.

This is what a computer 'sees' before AI gets involved: a giant grid of numbers.
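
To make the "grid of numbers" idea concrete, here is a minimal sketch in Python. The 2x2 image, the helper name `count_numbers`, and the 4000 x 3000 sensor size are illustrative assumptions, not details from any particular phone.

```python
# A tiny 2x2 "photo": each pixel is (red, green, blue), each value 0-255.
image = [
    [(255, 0, 0), (0, 255, 0)],      # red pixel, green pixel
    [(0, 0, 255), (255, 255, 255)],  # blue pixel, white pixel
]

def count_numbers(width, height, channels=3):
    """Return how many numbers are needed to store an image."""
    return width * height * channels

rows = len(image)
cols = len(image[0])
print(count_numbers(cols, rows))   # 2 * 2 * 3 = 12

# A 12-megapixel photo, assuming a 4000 x 3000 sensor (an illustrative size).
print(count_numbers(4000, 3000))   # 36000000
```

Even the tiny 2x2 image already needs 12 numbers. The computer has nothing else to work with.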

From Pixels to Understanding — The Challenge

Recognising an object from pixel numbers is extraordinarily hard to program by hand. Think about what changes between different photos of cats:

• Different lighting (bright sun vs dim lamp)
• Different angle (front, side, above)
• Partial occlusion (tail hidden behind a chair)
• Different breeds and coats (Persian, tabby, black)
• Motion blur if the cat is moving

Writing manual rules to handle all these variations is practically impossible. This is why machine learning — which learns from millions of examples — was the breakthrough that made computer vision work.
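
A small sketch shows how fragile a hand-written rule can be. Everything here is made up for illustration: the pixel values, the `brighten` helper, and the rule "fur is any pixel darker than 60".

```python
# One row of a greyscale image (0 = black, 255 = white):
# three dark "fur" pixels followed by two bright "wall" pixels.
dim_photo = [40, 42, 41, 200, 205]

def brighten(pixels, amount):
    """Simulate stronger lighting by adding to every pixel, capped at 255."""
    return [min(255, p + amount) for p in pixels]

def naive_is_dark_fur(pixel):
    """A hand-written rule: 'fur' means a pixel value below 60."""
    return pixel < 60

sunny_photo = brighten(dim_photo, 50)

print([naive_is_dark_fur(p) for p in dim_photo])    # [True, True, True, False, False]
print([naive_is_dark_fur(p) for p in sunny_photo])  # [False, False, False, False, False]
```

The cat has not changed, only the lighting has, yet the rule no longer finds any fur. Fixing this one case by hand would still leave angle, occlusion, and blur to deal with.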

How Convolutional Neural Networks See

The type of AI most often used for image recognition is called a Convolutional Neural Network (CNN). Here is a simplified picture of how it builds understanding layer by layer:

Layer 1 — Edge detectors: notices horizontal lines, vertical lines, diagonal edges
Layer 2 — Shape detectors: combines edges into curves, corners, and simple shapes
Layer 3 — Part detectors: combines shapes into eyes, ears, wheels, wings
Layer 4 — Object detectors: combines parts into 'cat face', 'car door', 'bird wing'
Final layer — Classification: makes a decision: 'This is a cat (89% confident)'

Each layer is automatically learned from data — not hand-programmed.
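
The sketch below illustrates what a single edge detector in the first layer might compute. It is a simplified, hand-written example: the filter weights in `kernel` are chosen by hand here, whereas a real CNN would learn them from data, and the `convolve` helper is a plain-Python stand-in for what deep learning libraries do much faster.

```python
# A 4x4 greyscale image: dark on the left (0), bright on the right (255).
image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
]

# A hand-picked 2x2 filter that responds to a dark-to-bright vertical edge.
# In a trained CNN these weights would be learned, not chosen by hand.
kernel = [
    [-1, 1],
    [-1, 1],
]

def convolve(img, k):
    """Slide the filter over the image and record its response at each position."""
    kh, kw = len(k), len(k[0])
    out = []
    for r in range(len(img) - kh + 1):
        row = []
        for c in range(len(img[0]) - kw + 1):
            total = sum(img[r + i][c + j] * k[i][j]
                        for i in range(kh) for j in range(kw))
            row.append(total)
        out.append(row)
    return out

for row in convolve(image, kernel):
    print(row)   # [0, 510, 0] on each of the three rows
```

The output is large only in the middle column, exactly where dark meets bright. Later layers would take maps like this one as their input and combine them into shapes and parts.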

📐 Real Applications of Computer Vision

Computer vision powers many things you use:

• Face ID: your phone's camera builds a 3D map of your face and checks it against the stored model every time you unlock
• Medical imaging: AI can flag possible tumours in X-rays and MRI scans, and in some studies it has matched or beaten specialist doctors on specific tasks
• Self-driving cars: cameras identify pedestrians, traffic lights, road markings, and other vehicles in real time
• Quality control in factories: cameras spot defects in products moving along assembly lines at high speed
• Accessibility: camera apps describe images aloud for visually impaired users
• Photo search: Google Photos finds 'beach photos' or 'photos with grandma' without manual tags

💡

AI Confidence Scores

AI image classifiers do not say 'this is a cat'. They say 'this is a cat with 87% confidence; a fox with 9% confidence; a dog with 4% confidence.'

This is important! AI is never certain. When AI drives a car and identifies a shape on the road, the confidence score can decide whether the car slows down. A well-designed system takes cautious action when confidence is low. A high score is still not a guarantee: an AI can be confident and wrong.
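
One common way to produce scores like these is a function called softmax, which turns a network's raw outputs into positive numbers that add up to 1. The sketch below is illustrative only: the raw scores are made up, and the 90% cut-off named `CAUTION_THRESHOLD` is a hypothetical value, not a standard.

```python
import math

def softmax(scores):
    """Turn raw scores into positive numbers that sum to 1."""
    largest = max(scores)  # subtracting the max keeps exp() from overflowing
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

labels = ["cat", "fox", "dog"]
raw_scores = [3.1, 0.8, 0.0]   # made-up outputs from a hypothetical network

confidences = softmax(raw_scores)
for label, conf in zip(labels, confidences):
    print(f"{label}: {conf:.0%}")   # cat: 87%, fox: 9%, dog: 4%

CAUTION_THRESHOLD = 0.90  # hypothetical cut-off chosen for this sketch
best = max(confidences)
action = "proceed" if best >= CAUTION_THRESHOLD else "take cautious action"
print(action)   # take cautious action
```

Here the top answer is "cat" at about 87%, which is below the chosen cut-off, so the sketch falls back to the cautious option.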

🔍

Adversarial Examples — How Easy It Is to Fool Computer Vision

Researchers discovered something disturbing: tiny changes to an image — invisible to humans — can completely fool a computer vision AI.

For example: adding carefully calculated noise (random-looking speckles that the human eye cannot notice) to a photo of a panda caused an AI to classify it as a gibbon with 99.3% confidence.

This reveals that AI is not seeing the way we see. It is matching statistical patterns, not understanding objects. A human cannot be fooled this way because we understand what a panda IS — we have touched, smelled, and heard pandas (or at least have a rich concept of them). The AI has only seen pixel statistics.
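
The toy example below gives a feel for why tiny changes can add up. It is a sketch under strong assumptions: a made-up linear classifier over 100 "pixels" with hand-set weights, not the deep network used in the panda experiment. The names `weights`, `score`, and `EPSILON` are all invented for this illustration.

```python
# A toy linear classifier over 100 "pixels", each a value between 0 and 1.
# score > 0 means "panda", otherwise "gibbon". Weights are made up.
N = 100
weights = [1.0 if i % 2 == 0 else -1.0 for i in range(N)]
image = [0.51 if i % 2 == 0 else 0.49 for i in range(N)]

def score(pixels):
    """Weighted sum of all pixel values."""
    return sum(w * p for w, p in zip(weights, pixels))

def classify(pixels):
    return "panda" if score(pixels) > 0 else "gibbon"

# Nudge every pixel by a tiny amount in the direction that lowers the score.
EPSILON = 0.02
adversarial = [p - EPSILON * (1 if w > 0 else -1)
               for p, w in zip(image, weights)]

print(classify(image))        # panda
print(classify(adversarial))  # gibbon

# The largest change to any single pixel is about 0.02 on a 0-1 scale.
print(max(abs(a - b) for a, b in zip(image, adversarial)))
```

No single pixel moves by more than about 2% of its range, but because every pixel is pushed in the least helpful direction, the small nudges add up and flip the answer. Real images have millions of numbers, so there is far more room for this effect.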

Human Vision vs Computer Vision

Human vision strengths:
• Can learn a new object from just 2–3 examples (sample efficient)
• Understands context and meaning
• Is not fooled by invisible pixel changes
• Often works even when most of the object is hidden

Computer vision strengths:
• Can process thousands of images per second on suitable hardware
• Works 24 hours without fatigue
• Can be more accurate than humans in specific tasks (e.g. medical X-rays)
• Same accuracy every time (no tired days)

Computer vision weaknesses:
• Usually needs thousands to millions of labelled examples
• Fooled by adversarial attacks
• Struggles with extreme variations
• Has no concept of what it is looking at

⚡

Challenge Round

Think About the Risk

A hospital deploys an AI to detect cancer in chest X-rays. It was trained on 500,000 X-rays from patients in the United States. The hospital is in Chennai, India.

Think about:
1. What could go wrong?
2. Why might the AI perform differently in Chennai than in the US?
3. What should the hospital do before fully trusting the AI's diagnoses?

What You Learned

Images are grids of numbers to a computer. CNNs build understanding layer by layer from edges to objects. Computer vision enables face recognition, medical diagnosis, and self-driving cars. AI can be fooled by adversarial attacks — because it matches patterns, not meaning. And training data diversity matters enormously for real-world accuracy.

🌟

You now understand how computers 'see' — and more importantly, why they sometimes see wrong.

↪ Next lesson: computers that understand language — how does AI read and write?

★

Key Points

✓ Computers see images as grids of numbered pixels, not meaningful shapes
✓ CNNs (Convolutional Neural Networks) learn features layer by layer: edges, then shapes, then objects
✓ A confidence score tells you how sure the AI is, but 95% is not the same as certain
✓ Adversarial examples are tiny changes that fool AI but not humans
✓ Biased training data leads to biased recognition, so diversity in data matters

Glossary

Pixel: படவொளி
Image recognition: படம் அடையாளம்
Computer vision: கணினி பார்வை
Pattern matching: வடிவ பொருத்தம்
Confidence score: நம்பகத்தன்மை மதிப்பெண்
