Data Compression Examples in CS Thinking

Start with the recap, study the fully worked examples, then use the practice problems to check your understanding of Data Compression.

This page combines explanation, solved examples, and follow-up practice so you can move from recognition to confident problem-solving in CS Thinking.

Concept Recap

Data compression is the process of reducing the number of bits needed to store or transmit information. Some compression is lossless, meaning the original data can be recovered exactly, while some is lossy, meaning some detail is discarded to save more space.

Compression is packing information more tightly so files take less space or move faster across a network.

Read the full concept explanation โ†’

How to Use These Examples

  • Read the first worked example with the solution open so the structure is clear.
  • Try the practice problems before revealing each solution.
  • Use the related concepts and background knowledge badges if you feel stuck.

What to Focus On

Core idea: Compression trades storage and transfer speed against exactness or quality.

Common stuck point: Smaller is not always better. You must decide whether exact recovery matters.

Sense of Study hint: Ask two questions: Do you need the exact original back, and how much size reduction do you need? If exact recovery matters, choose lossless compression.

Common Mistakes to Watch For

Before you work through the examples, skim the mistake guide so you know which shortcuts and sign errors to avoid.

Worked Examples

Example 1

medium
Apply RLE to `XYZXYZXYZ`. Is the result shorter? Explain.

Answer

1X1Y1Z1X1Y1Z1X1Y1Zย (longer)1X1Y1Z1X1Y1Z1X1Y1Z\text{ (longer)}

First step

1
Each character forms its own run of length 1.

See the full worked solution + why-it-works coaching

SetupKey insightWhy it worksCommon pitfallConnection

Unlock answer keys One Family plan โ€” every worked solution, all subjects

Example 2

medium
In Huffman coding, symbol A has frequency 0.50.5, B has 0.250.25, C and D each 0.1250.125. Assign reasonable code lengths in bits.

Example 3

medium
You must transmit medical X-ray images. Which compression type is appropriate and why?

Example 4

hard
A 300300 MB video must stream in real time over a 22 MB/s connection for 6060 s of content. What compression ratio minimum is required?

Example 5

hard
Compare RLE on `AAAAABBBAAAA` vs the alphabet shuffled `ABABABAB`. Give compressed lengths in characters.

Example 6

challenge
You receive a 1010 MB file in 44 seconds on a 11 MB/s link. Explain how this is possible.

Practice Problems

Try these problems on your own first, then open the solution to compare your method.

Example 1

easy
A file is 1000 KB before compression and 250 KB after. What is the compression ratio?

Example 2

easy
Lossless compression of a text file is later decompressed. How does the result compare to the original?

Example 3

easy
JPEG discards some image detail to shrink files. Is JPEG lossy or lossless?

Example 4

easy
Run-length encoding turns `AAAAA` (5 A's) into `5A`. How many characters does the compressed form use?

Example 5

easy
A 12 MB video compresses to 3 MB. What is the compression ratio?

Example 6

easy
Which data type generally requires LOSSLESS compression: a program's source code or a vacation photo?

Example 7

easy
If a file does not compress at all (ratio 1:1), what does that suggest about its data?

Example 8

easy
RLE encodes `WWWWWWWWWW` (10 W's). Write the compressed form.

Example 9

medium
A 50 MB file compresses to 8 MB. What percentage of the original size was saved?

Example 10

medium
RLE on `AAABBBBCC` produces what string, and is it shorter than the original?

Example 11

medium
RLE on `ABABAB` produces `1A1B1A1B1A1B`. Why did compression FAIL here?

Example 12

medium
A photo is JPEG-compressed, then that JPEG is re-saved as JPEG several more times. What happens to quality and why?

Example 13

medium
Two files are both '5 MB compressed,' but one was lossy and one lossless from a 20 MB original. Can you compare their compression ratios fairly?

Example 14

medium
Huffman coding gives frequent symbols short codes. If 'E' appears 40% of the time and gets a 1-bit code while rare symbols get 4-bit codes, what principle reduces total size?

Example 15

medium
A 100 MB dataset must transfer over a link in under 10 seconds, and the link does 5 MB/s. What minimum compression ratio is required?

Example 16

medium
A scheme claims to losslessly compress ANY file by at least 1 byte. Why is this impossible?

Example 17

challenge
A 1920x1080 24-bit frame is 6.22 MB raw. A codec stores it at 0.5 MB. Compute the compression ratio and classify the likely type for video.

Example 18

challenge
Text `ABABABAB...` (the pair AB repeated 1000 times, 2000 chars) compresses poorly under RLE but extremely well under a pattern/dictionary method. Explain why, and estimate the dictionary-coded size conceptually.

Example 19

challenge
A 16-color image (242^4 palette) is 100ร—100100 \times 100 pixels. Compare its size to the same image at 24 bpp (in bits), and give the compression ratio from palette reduction.

Example 20

medium
A backup is 80 GB raw and compresses at 5:1. What is the compressed size, and how much space is saved?

Example 21

easy
A song file is 8 MB and compresses to 2 MB. What is the compression ratio?

Example 22

easy
Apply RLE to the string `BBBBBB` (6 B's). Give the compressed form.

Example 23

easy
A file shrinks from 200 KB to 50 KB. What is the compression ratio?

Example 24

easy
A 1 GB archive compresses to 250 MB. What is the compression ratio?

Example 25

easy
Apply RLE to `CCCCDDDD`. What is the output?

Example 26

medium
A 240 MB file compresses at 6:16\text{:}1. What is the compressed size?

Example 27

medium
A 500 MB file compresses to 125 MB. What percentage was saved?

Example 28

medium
A photo is saved as JPEG at 80% quality, then opened and re-saved as JPEG at 80% ten times. What happens?

Example 29

medium
A 5 GB video must download in 100 seconds over a 25 MB/s link. What minimum compression ratio is needed?

Example 30

medium
A 60 MB log file compresses to 6 MB. What is the percent reduction?

Example 31

medium
RLE encodes `AAAAAAABBBBC` (7 A's, 4 B's, 1 C). Give the encoded form and original/compressed character counts.

Example 32

medium
A backup grows from 8080 GB to 200200 GB, but the compressed size only grows from 1616 GB to 2525 GB. Which compression ratio improved, original โ†’\to compressed?

Example 33

hard
A 640ร—480640\times 480 8-bit grayscale image compresses to 3030 KB. Compute the compression ratio (raw bytes รท\div compressed bytes).

Example 34

hard
Huffman codes for symbols with probabilities {0.4,0.3,0.2,0.1}\{0.4,0.3,0.2,0.1\} have lengths {1,2,3,3}\{1,2,3,3\}. Compute the expected bits per symbol.

Example 35

hard
A 44K frame raw is 2424 MB. A codec produces 0.30.3 MB per frame at 3030 fps. Compute the compression ratio per frame.

Example 36

hard
A text file of 2,000,0002{,}000{,}000 chars uses 88 bits per char raw. After Huffman coding with avg 4.54.5 bits/char, what is the compressed size in MB and the ratio?

Example 37

challenge
An information source has entropy H=2.5H = 2.5 bits/symbol. What is the BEST possible average code length, and what does this imply for compression ratio against 8-bit ASCII?

Example 38

challenge
A โ€‰2\,2-color image is 200ร—200200\times 200 pixels. Stored at 11 bit/pixel vs 2424 bits/pixel, compute the compression ratio achieved by bit-depth reduction.

Background Knowledge

These ideas may be useful before you work through the harder examples.

bits bytesdata representation