unix-foundations Lesson 5 22 min read

How Do Input/Output Redirection and Pipes Work?

Sending output to files, chaining commands together, and the Unix philosophy in action

Reading: Linux Text: Ch. 6–7, pp. 156–235

After this lesson, you will be able to:

  • Redirect command output to files with > (overwrite) and >> (append)
  • Redirect input from files with <
  • Pipe the output of one command into another with |
  • Combine multiple commands into pipelines (sort | uniq | wc -l)
  • Use grep to search file contents and filter pipeline output
  • Redirect error output separately with 2>

What If You Could Wire Programs Together?

Every command you’ve run so far has printed its output to the screen. That’s fine for ls or cal, but what about a command that produces 10,000 lines of output? Or what if you want to save compiler errors to review later?

Here’s the insight that makes Unix special: every program has three standard streams, and the shell lets you rewire them. You can send output to a file instead of the screen. You can feed a file into a program instead of typing. And — most powerful of all — you can connect the output of one program directly to the input of another, like snapping LEGO bricks together.

This is the Unix philosophy in action: small programs, connected by pipes, solving complex problems.


Streams, Redirection, and Pipes

The Three Standard Streams

Every running program automatically gets three channels:

Stream   File Descriptor   Default Destination   Purpose
stdin    0                 Keyboard              Input
stdout   1                 Screen                Normal output
stderr   2                 Screen                Error messages

From Java: You already know these! System.out.println() writes to stdout. System.err.println() writes to stderr. new Scanner(System.in) reads from stdin. In C, printf() writes to stdout, fprintf(stderr, ...) writes to stderr, and scanf() reads from stdin. The streams are the same — Unix just lets you redirect them.

Output Redirection: > and >>

Let’s start with a running example. Create a file using output redirection:

echo "banana" > fruits.txt

This sends the output of echo to fruits.txt instead of the screen. The > operator means “redirect stdout to this file, overwriting it if it exists.”

Add more entries:

echo "apple" >> fruits.txt
echo "cherry" >> fruits.txt
echo "banana" >> fruits.txt
echo "date" >> fruits.txt
echo "apple" >> fruits.txt

The >> operator means “append to the file.” Check what we’ve built:

cat fruits.txt
banana
apple
cherry
banana
date
apple

Common Pitfall: > is destructive. It overwrites the target file immediately, before the command even runs. The classic disaster: sort fruits.txt > fruits.txt — this empties fruits.txt before sort can read it! Always use a temporary file: sort fruits.txt > temp.txt && mv temp.txt fruits.txt.
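
You can watch both the disaster and the safe pattern on a throwaway file (scratch.txt and temp.txt are just illustrative names):

```shell
# Demonstrate the disaster on a throwaway file, not real data:
printf 'banana\napple\n' > scratch.txt
sort scratch.txt > scratch.txt     # the shell truncates scratch.txt first...
wc -l scratch.txt                  # 0 scratch.txt -- the data is gone

# The safe pattern: sort into a temp file, then rename over the original
printf 'banana\napple\n' > scratch.txt
sort scratch.txt > temp.txt && mv temp.txt scratch.txt
cat scratch.txt                    # apple, then banana
```

The && matters: temp.txt only replaces the original if sort succeeded.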

Input Redirection: <

You can also redirect a file into a program’s stdin:

sort < fruits.txt

Output (to screen):

apple
apple
banana
banana
cherry
date

Combine input and output redirection:

sort < fruits.txt > sorted.txt
cat sorted.txt

Now sorted.txt contains the sorted version.

Error Redirection: 2>

Stderr has its own file descriptor (2), so it gets its own redirect:

ls /nonexistent 2> errors.txt
cat errors.txt
ls: cannot access '/nonexistent': No such file or directory

Separate stdout and stderr into different files:

ls /bin /nonexistent > files.txt 2> errors.txt

Now files.txt has the successful listing and errors.txt has the error messages.

Redirect both to the same file:

ls /bin /nonexistent > everything.txt 2>&1

Key Insight: The 2>&1 syntax means “redirect stderr (2) to wherever stdout (1) is currently going.” Order matters! > file 2>&1 works because stdout is redirected to the file first, then stderr follows. 2>&1 > file would send stderr to the screen (original stdout) and only stdout to the file.
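
A small experiment makes the ordering concrete (both.txt and only_stdout.txt are illustrative names; /nonexistent is assumed not to exist):

```shell
# Correct order: stdout is pointed at the file first, then stderr follows it,
# so both the listing and the error message land in both.txt.
ls /bin /nonexistent > both.txt 2>&1

# Wrong order: stderr is duplicated onto the screen (stdout's ORIGINAL
# destination) BEFORE stdout moves to the file, so the error stays on the
# screen and only_stdout.txt receives only the listing.
ls /bin /nonexistent 2>&1 > only_stdout.txt
```

Afterward, grep nonexistent both.txt finds the error message, while the same search in only_stdout.txt comes up empty.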

Check Your Understanding
What's the difference between echo "hello" > file.txt and echo "hello" >> file.txt?
A > overwrites the file, >> appends to the end of the file
B >> creates the file if it doesn't exist, > does not
C > writes to stdout, >> writes to stderr
D They do the same thing; >> is just an older syntax
Answer: A. > truncates the file to zero bytes before writing — anything that was there is gone. >> opens the file for appending and adds new content after the existing content. Both create the file if it doesn't exist. Mixing these up is how people accidentally destroy data: sort data.txt > data.txt empties the file before sort reads it.
Why does this matter?

In C, fopen("file", "w") truncates the file just like >, and fopen("file", "a") appends just like >>. Understanding shell redirection now means you’ll instantly recognize those C file modes in Week 8 when you start file I/O.

The /dev/null Black Hole

Sometimes you want to discard output entirely:

ls /nonexistent 2>/dev/null        # Suppress error messages
command > /dev/null 2>&1           # Suppress everything

/dev/null is a special file that discards anything written to it. It’s Unix’s trash can for output.
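
One detail worth knowing: /dev/null hides a command’s output but not its exit status, so silent success/failure checks still work in scripts. A quick sketch:

```shell
# Output is discarded, but the success/failure result survives:
if ls /bin > /dev/null 2>&1; then
    echo "/bin exists"
fi

if ! ls /nonexistent > /dev/null 2>&1; then
    echo "/nonexistent does not"
fi
```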

The Pipe Operator: |

Now for the real power. The pipe | connects stdout of one command to stdin of the next:

cat fruits.txt | sort

This feeds fruits.txt through sort. But we can keep going:

cat fruits.txt | sort | uniq

Output:

apple
banana
cherry
date

uniq removes adjacent duplicate lines. Since we sorted first, identical lines are adjacent, so all duplicates are removed.
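
To convince yourself that the sort step is doing the real work, compare uniq with and without it (the snippet recreates fruits.txt so it stands alone):

```shell
# Recreate the running example
printf 'banana\napple\ncherry\nbanana\ndate\napple\n' > fruits.txt

uniq fruits.txt | wc -l          # 6 -- no two duplicates are adjacent
sort fruits.txt | uniq | wc -l   # 4 -- sorting made duplicates adjacent
```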

Add a count:

cat fruits.txt | sort | uniq -c
      2 apple
      2 banana
      1 cherry
      1 date

Sort by count (most common first):

cat fruits.txt | sort | uniq -c | sort -rn
      2 banana
      2 apple
      1 date
      1 cherry

We just built a frequency analysis pipeline from four simple commands. No programming language needed.

Key Insight: Pipes embody the Unix philosophy. Instead of one giant program that reads, sorts, deduplicates, and counts, you chain four small programs. Each does one thing well; pipes connect them. This is why Unix has hundreds of small commands instead of a few big ones.

Check Your Understanding
You run ls > sort instead of ls | sort. What happens?
A The directory listing is sorted and printed to the screen
B You get an error because sort is a command, not a file
C A file named sort is created containing the unsorted directory listing
D The sort command's binary is overwritten with the listing
Answer: C. > is file redirection — it creates (or overwrites) a regular file with that name. The shell doesn't care that a command named sort also exists. You get a plain file called sort in your current directory containing the raw ls output. The actual sort program in /usr/bin is unaffected: a redirection target is just a path (here, ./sort in the current directory), and the shell never searches PATH for it.
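
If you’d like to see this safely, try it in a scratch directory (mktemp -d creates a fresh, empty one):

```shell
cd "$(mktemp -d)"       # throwaway directory so nothing gets clobbered
ls > sort               # creates a regular FILE named "sort" right here
cat sort                # prints "sort" -- the file lists itself, since the
                        # shell created it before ls ran
ls | sort               # the real sort program in your PATH still works
```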

Essential Filter Commands

These commands are designed to work in pipelines:

sort — sort lines

sort fruits.txt                    # Alphabetical (default)
sort -n numbers.txt                # Numerical
sort -r fruits.txt                 # Reverse order
sort -k2 data.txt                  # Sort by 2nd field
sort -t',' -k2 -n grades.csv       # Sort CSV by 2nd column numerically

Common Pitfall: sort without -n sorts alphabetically. This means 10 comes before 2 (because 1 < 2 in character comparison). Always use -n for numerical data.
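
Here’s the pitfall in miniature, using a throwaway numbers.txt:

```shell
printf '10\n2\n1\n' > numbers.txt

sort numbers.txt     # character order: 1, 10, 2  ("10" sorts before "2")
sort -n numbers.txt  # numeric order:   1, 2, 10
```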

uniq — remove adjacent duplicates (sort first!)

sort fruits.txt | uniq             # Remove duplicates
sort fruits.txt | uniq -c          # Count occurrences
sort fruits.txt | uniq -d          # Show ONLY duplicates

wc — count lines, words, bytes

wc -l fruits.txt                   # Count lines
ls /bin | wc -l                    # Count files in /bin

head and tail — first/last N lines

head -5 bigfile.txt                # First 5 lines
tail -3 bigfile.txt                # Last 3 lines
ls -lS | head -6                   # 5 largest files (+header)
history | tail -10                 # Last 10 commands

grep — filter lines matching a pattern

grep "apple" fruits.txt            # Lines containing "apple"
ls -l | grep "\.txt"                # Only .txt files (\. = literal dot)
grep -i "APPLE" fruits.txt         # Case-insensitive
grep -c "apple" fruits.txt         # Count matching lines

(We’ll cover grep in much more detail in Series 2, Lesson 2.1.)

cut — extract fields from delimited data

echo "Alice,90,A" | cut -d',' -f1     # Alice
echo "Alice,90,A" | cut -d',' -f2     # 90

tr — translate/replace characters

echo "hello world" | tr 'a-z' 'A-Z'   # HELLO WORLD
echo "hello   world" | tr -s ' '      # hello world (squeeze spaces)

tee — split the stream (save to file AND pass through)

ls /bin | tee filelist.txt | wc -l     # Save listing AND count it

Building Pipelines Incrementally

The Trick: Build pipelines one stage at a time. Run the first command alone, check the output. Add the next pipe, verify. When something goes wrong, you know exactly which stage caused it.

Real-world example — find the most common login shell on the system:

# Step 1: Look at the raw data
cat /etc/passwd | head -3

# Step 2: Extract just the shell field (field 7, colon-delimited)
cat /etc/passwd | cut -d: -f7 | head -5

# Step 3: Sort and count
cat /etc/passwd | cut -d: -f7 | sort | uniq -c

# Step 4: Sort by count, highest first
cat /etc/passwd | cut -d: -f7 | sort | uniq -c | sort -rn | head -5

Each step builds on the last. You verify at every stage.

Why does this matter?

Debugging C programs means reading compiler output, filtering log files, and analyzing test results. A quick pipeline like gcc *.c 2>&1 | grep error pulls out just the error lines from a wall of compiler output. You’ll use these pipelines daily — they’re not academic exercises.

Combining Pipes with Redirection

sort < data.txt | uniq > results.txt          # Input from file, pipe, output to file
cat fruits.txt | sort | uniq -c > summary.txt # Pipeline output to file

Common Pitfall: Don’t confuse > and |. ls > sort creates a file named “sort” and writes the listing into it. ls | sort sends the listing to the command sort. One character difference, completely different behavior.

Check Your Understanding
Given a file names.txt containing duplicate names in random order, which pipeline counts how many times each name appears?
A cat names.txt | uniq -c
B sort names.txt | uniq -c
C uniq -c names.txt | sort
D grep -c names.txt
Answer: B. uniq only removes adjacent duplicates, so the data must be sorted first. Option A skips sorting, so non-adjacent duplicates won't be counted together. Option C runs uniq on unsorted data, then sorts the (wrong) counts. Option D uses grep -c, which counts lines matching a pattern — not duplicate frequencies.
Quick Check: What's wrong with sort data.txt > data.txt?

The > operator truncates (empties) data.txt before sort reads it. The result is an empty file. Use a temporary file instead: sort data.txt > temp.txt && mv temp.txt data.txt.

Quick Check: Why does uniq require sorted input?

uniq only removes adjacent duplicate lines. If duplicates aren’t next to each other (because the data isn’t sorted), uniq won’t detect them. Always pipe through sort first: sort | uniq.

Quick Check: What does ls /bin | grep zip | wc -l do?

Lists all files in /bin, filters to only those containing “zip” in the name, then counts how many there are. It answers: “How many programs in /bin have ‘zip’ in their name?”

Quick Check: Where does stderr go when you use a pipe?

Directly to the screen. Pipes only connect stdout. Error messages (stderr) bypass the pipe. This is usually what you want — you see errors immediately while normal output flows through the pipeline.
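
A quick way to see this: run a failing pipeline and watch where the error lands.

```shell
# The error message prints to your screen; only stdout reaches wc,
# so the count covers just the successful listing.
ls /bin /nonexistent | wc -l

# To push stderr through the pipe too, merge it into stdout first:
ls /bin /nonexistent 2>&1 | wc -l   # now the error line is counted as well
```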


You Just Learned the Unix Superpower

Pipes and redirection are arguably the most important Unix concept in this entire series. They’re what let you solve complex data processing problems without writing a single line of code. Professional developers use them daily — filtering log files, analyzing data, processing build output.

And here’s the connection to C that you’ll see in a few weeks: when you write C programs, they automatically get the same three streams (stdin, stdout, stderr). A C program that reads from stdin and writes to stdout can be dropped into any pipeline. That’s by design.

In the final lesson of this series, you’ll learn to customize your shell — setting up environment variables, modifying your PATH, and creating aliases that make your daily workflow faster.

Big Picture: The pipeline concept shows up everywhere in software engineering: data pipelines, CI/CD pipelines, streaming architectures. The | operator in your shell is the simplest version of a pattern you’ll encounter throughout your career.