Start with the recap, study the fully worked examples, then use the practice problems to
check your understanding of Data Compression.
This page combines explanation, solved examples, and follow-up practice so you can move
from recognition to confident problem-solving in CS Thinking.
Concept Recap
Data compression is the process of reducing the number of bits needed to store or transmit information. Some compression is lossless, meaning the original data can be recovered exactly, while some is lossy, meaning some detail is discarded to save more space.
Compression is packing information more tightly so files take less space or move faster across a network.
Read the first worked example with the solution open so the structure is clear.
Try the practice problems before revealing each solution.
Use the related concepts and background knowledge badges if you feel stuck.
What to Focus On
Core idea:Compression trades storage and transfer speed against exactness or quality.
Common stuck point:Smaller is not always better. You must decide whether exact recovery matters.
Sense of Study hint:Ask two questions: Do you need the exact original back, and how much size reduction do you need? If exact recovery matters, choose lossless compression.
Common Mistakes to Watch For
Before you work through the examples, skim the mistake guide so you know which shortcuts and
sign errors to avoid.
Apply RLE to `XYZXYZXYZ`. Is the result shorter? Explain.
Answer
1X1Y1Z1X1Y1Z1X1Y1Zย (longer)
First step
1
Each character forms its own run of length 1.
See the full worked solution + why-it-works coaching
SetupยทKey insightยทWhy it worksยทCommon pitfallยทConnection
Unlock answer keysOne Family plan โ every worked solution, all subjects
Example 2
medium
In Huffman coding, symbol A has frequency 0.5, B has 0.25, C and D each 0.125. Assign reasonable code lengths in bits.
Example 3
medium
You must transmit medical X-ray images. Which compression type is appropriate and why?
Example 4
hard
A 300 MB video must stream in real time over a 2 MB/s connection for 60 s of content. What compression ratio minimum is required?
Example 5
hard
Compare RLE on `AAAAABBBAAAA` vs the alphabet shuffled `ABABABAB`. Give compressed lengths in characters.
Example 6
challenge
You receive a 10 MB file in 4 seconds on a 1 MB/s link. Explain how this is possible.
Practice Problems
Try these problems on your own first, then open the solution to compare your method.
Example 1
easy
A file is 1000 KB before compression and 250 KB after. What is the compression ratio?
Example 2
easy
Lossless compression of a text file is later decompressed. How does the result compare to the original?
Example 3
easy
JPEG discards some image detail to shrink files. Is JPEG lossy or lossless?
Example 4
easy
Run-length encoding turns `AAAAA` (5 A's) into `5A`. How many characters does the compressed form use?
Example 5
easy
A 12 MB video compresses to 3 MB. What is the compression ratio?
Example 6
easy
Which data type generally requires LOSSLESS compression: a program's source code or a vacation photo?
Example 7
easy
If a file does not compress at all (ratio 1:1), what does that suggest about its data?
Example 8
easy
RLE encodes `WWWWWWWWWW` (10 W's). Write the compressed form.
Example 9
medium
A 50 MB file compresses to 8 MB. What percentage of the original size was saved?
Example 10
medium
RLE on `AAABBBBCC` produces what string, and is it shorter than the original?
Example 11
medium
RLE on `ABABAB` produces `1A1B1A1B1A1B`. Why did compression FAIL here?
Example 12
medium
A photo is JPEG-compressed, then that JPEG is re-saved as JPEG several more times. What happens to quality and why?
Example 13
medium
Two files are both '5 MB compressed,' but one was lossy and one lossless from a 20 MB original. Can you compare their compression ratios fairly?
Example 14
medium
Huffman coding gives frequent symbols short codes. If 'E' appears 40% of the time and gets a 1-bit code while rare symbols get 4-bit codes, what principle reduces total size?
Example 15
medium
A 100 MB dataset must transfer over a link in under 10 seconds, and the link does 5 MB/s. What minimum compression ratio is required?
Example 16
medium
A scheme claims to losslessly compress ANY file by at least 1 byte. Why is this impossible?
Example 17
challenge
A 1920x1080 24-bit frame is 6.22 MB raw. A codec stores it at 0.5 MB. Compute the compression ratio and classify the likely type for video.
Example 18
challenge
Text `ABABABAB...` (the pair AB repeated 1000 times, 2000 chars) compresses poorly under RLE but extremely well under a pattern/dictionary method. Explain why, and estimate the dictionary-coded size conceptually.
Example 19
challenge
A 16-color image (24 palette) is 100ร100 pixels. Compare its size to the same image at 24 bpp (in bits), and give the compression ratio from palette reduction.
Example 20
medium
A backup is 80 GB raw and compresses at 5:1. What is the compressed size, and how much space is saved?
Example 21
easy
A song file is 8 MB and compresses to 2 MB. What is the compression ratio?
Example 22
easy
Apply RLE to the string `BBBBBB` (6 B's). Give the compressed form.
Example 23
easy
A file shrinks from 200 KB to 50 KB. What is the compression ratio?
Example 24
easy
A 1 GB archive compresses to 250 MB. What is the compression ratio?
Example 25
easy
Apply RLE to `CCCCDDDD`. What is the output?
Example 26
medium
A 240 MB file compresses at 6:1. What is the compressed size?
Example 27
medium
A 500 MB file compresses to 125 MB. What percentage was saved?
Example 28
medium
A photo is saved as JPEG at 80% quality, then opened and re-saved as JPEG at 80% ten times. What happens?
Example 29
medium
A 5 GB video must download in 100 seconds over a 25 MB/s link. What minimum compression ratio is needed?
Example 30
medium
A 60 MB log file compresses to 6 MB. What is the percent reduction?
Example 31
medium
RLE encodes `AAAAAAABBBBC` (7 A's, 4 B's, 1 C). Give the encoded form and original/compressed character counts.
Example 32
medium
A backup grows from 80 GB to 200 GB, but the compressed size only grows from 16 GB to 25 GB. Which compression ratio improved, original โ compressed?
Example 33
hard
A 640ร480 8-bit grayscale image compresses to 30 KB. Compute the compression ratio (raw bytes รท compressed bytes).
Example 34
hard
Huffman codes for symbols with probabilities {0.4,0.3,0.2,0.1} have lengths {1,2,3,3}. Compute the expected bits per symbol.
Example 35
hard
A 4K frame raw is 24 MB. A codec produces 0.3 MB per frame at 30 fps. Compute the compression ratio per frame.
Example 36
hard
A text file of 2,000,000 chars uses 8 bits per char raw. After Huffman coding with avg 4.5 bits/char, what is the compressed size in MB and the ratio?
Example 37
challenge
An information source has entropy H=2.5 bits/symbol. What is the BEST possible average code length, and what does this imply for compression ratio against 8-bit ASCII?
Example 38
challenge
A 2-color image is 200ร200 pixels. Stored at 1 bit/pixel vs 24 bits/pixel, compute the compression ratio achieved by bit-depth reduction.