Data Representation Formula

The Formula

E: D \to \{0,1\}^*

When to use: Turning real-world things (text, images, sound) into numbers a computer can process.

Quick Example

Letter 'A' = 65. Color red = RGB(255, 0, 0). Sound = waveform samples.

What This Formula Means

Data representation is the way information (numbers, text, images, and sound) is encoded as binary digits (0s and 1s) inside a computer. Different encoding schemes map real-world data to binary patterns: ASCII or Unicode for text, RGB for colors, and sampling for audio.

Formal View

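Data representation defines an encoding function E: D \to \{0,1\}^* that maps values from a data domain D to binary strings, along with a decoding function E^{-1} that recovers the original data. For E^{-1} to be well defined, E must be injective: distinct values in D must map to distinct binary strings.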
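As a minimal sketch of such a pair of functions, the Python below takes D to be text strings and uses UTF-8 as the encoding. Both are assumptions made for illustration, and the names `encode` and `decode` are hypothetical stand-ins for E and E^{-1}.

```python
def encode(text: str) -> str:
    """E: map a string to a binary string via its UTF-8 bytes."""
    return "".join(format(byte, "08b") for byte in text.encode("utf-8"))


def decode(bits: str) -> str:
    """E^{-1}: split the bit string into 8-bit groups and rebuild the text."""
    if len(bits) % 8 != 0:
        raise ValueError("bit string length must be a multiple of 8")
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


bits = encode("Hi")
print(bits)          # 0100100001101001
print(decode(bits))  # Hi
```

Decoding the output of `encode` returns the original string, which is the round-trip property the formal definition asks for.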

Worked Examples

Example 1

Difficulty: easy
A computer stores the character 'A' as the number 65 (ASCII). Explain why computers use numbers to represent characters.

Solution

  1. Computers can only store binary numbers (sequences of 0s and 1s).
  2. To store text, each character is assigned a unique number using an encoding scheme like ASCII (A=65, B=66, etc.).
  3. The binary for 65 is 01000001, which is what the computer actually stores. The encoding scheme maps between human-readable characters and binary.
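The three steps can be checked directly in Python. This is a small sketch using only built-in functions; the variable names are illustrative.

```python
code = ord("A")              # character -> number assigned by the encoding
bits = format(code, "08b")   # number -> 8-bit binary string
print(code, bits)            # 65 01000001

# Going the other way: binary string -> number -> character
print(chr(int(bits, 2)))     # A
```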

Answer

Computers only store binary numbers. Characters are represented by assigning each a unique number via an encoding scheme like ASCII.
Character encoding is a fundamental concept in data representation. ASCII uses 7 bits (128 characters), while Unicode extends this to represent characters from all writing systems worldwide.
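To illustrate the 7-bit limit, the sketch below checks whether a character's code point fits in the ASCII range and how many bytes it needs in UTF-8. The helper name `fits_in_ascii` is hypothetical, and UTF-8 is chosen here as one common Unicode encoding.

```python
def fits_in_ascii(ch: str) -> bool:
    """Return True if the character's code point is in the 7-bit range 0..127."""
    return ord(ch) < 2 ** 7


# 'A', 'é' (U+00E9), and the euro sign (U+20AC)
for ch in ["A", "\u00e9", "\u20ac"]:
    utf8_length = len(ch.encode("utf-8"))
    print(ord(ch), fits_in_ascii(ch), utf8_length)
# 65 True 1
# 233 False 2
# 8364 False 3
```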

Example 2

Difficulty: medium
Explain how a computer represents a colour image using binary. What are pixels and colour depth?
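One way to make the question concrete is to compute with it. The sketch below assumes an uncompressed image in which every pixel is stored with the same number of bits (the colour depth). The function name `raw_image_bytes` and the 1920 by 1080 resolution are illustrative choices, not part of the question.

```python
def raw_image_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Size of an uncompressed image: bits_per_pixel bits for each pixel."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


# A single red pixel at 24-bit colour depth: 8 bits each for R, G, B.
red_pixel = (255, 0, 0)
pixel_bits = "".join(format(channel, "08b") for channel in red_pixel)
print(pixel_bits)                        # 111111110000000000000000

print(2 ** 24)                           # 16777216 distinct colours at 24 bits
print(raw_image_bytes(1920, 1080, 24))   # 6220800
```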

Common Mistakes

  • Assuming all text uses the same encoding—ASCII, UTF-8, and UTF-16 represent characters differently
  • Forgetting that higher-quality representations (more bits per sample) produce larger files
  • Confusing the data itself with its representation—the same image can be stored as PNG, JPEG, or BMP with different trade-offs
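The first two mistakes can be seen in a few lines of Python. This sketch assumes uncompressed audio, and the helper `raw_audio_bytes` and the sample values (60 seconds, 44,100 samples per second, stereo) are illustrative.

```python
# Same text, different bytes depending on the encoding.
text = "A\u00e9"  # 'A' followed by 'é'
print(text.encode("utf-8").hex(" "))      # 41 c3 a9
print(text.encode("utf-16-le").hex(" "))  # 41 00 e9 00


def raw_audio_bytes(seconds: int, sample_rate: int,
                    bits_per_sample: int, channels: int) -> int:
    """Size of uncompressed audio in bytes."""
    return seconds * sample_rate * bits_per_sample * channels // 8


# Doubling the bits per sample doubles the file size.
print(raw_audio_bytes(60, 44_100, 8, 2))   # 5292000
print(raw_audio_bytes(60, 44_100, 16, 2))  # 10584000
```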

Why This Formula Matters

Understanding how data is stored enables better design and debugging. It explains why images have file sizes, why audio quality varies, and why text can look different across systems. Data representation is the bridge between the physical world and digital computing.
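A short sketch of why text can look different across systems: the same bytes are read back under two different encodings. The word and the choice of Latin-1 as the mismatched encoding are assumptions made for illustration.

```python
original = "caf\u00e9"                # 'café'
stored = original.encode("utf-8")     # bytes written by one system
misread = stored.decode("latin-1")    # another system assumes Latin-1

print(stored)                  # b'caf\xc3\xa9'
print(misread)                 # cafÃ©
print(stored.decode("utf-8"))  # café
```

The bytes never change; only the interpretation does, which is why knowing the encoding matters when debugging garbled text.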

Frequently Asked Questions

What is the Data Representation formula?

Data representation is the way information (numbers, text, images, and sound) is encoded as binary digits (0s and 1s) inside a computer. Different encoding schemes map real-world data to binary patterns: ASCII or Unicode for text, RGB for colors, and sampling for audio.

How do you use the Data Representation formula?

Use it whenever you need to turn real-world things (text, images, sound) into numbers a computer can process.

Why is the Data Representation formula important in CS Thinking?

Understanding how data is stored enables better design and debugging. It explains why images have file sizes, why audio quality varies, and why text can look different across systems. Data representation is the bridge between the physical world and digital computing.

What do students get wrong about Data Representation?

Students often overlook that different representations have trade-offs, such as quality versus file size.

What should I learn before the Data Representation formula?

Before studying the Data Representation formula, you should understand two topics: binary, and bits and bytes.