How Do Input/Output Redirection and Pipes Work?
Sending output to files, chaining commands together, and the Unix philosophy in action
After this lesson, you will be able to:
- Redirect command output to files with `>` (overwrite) and `>>` (append)
- Redirect input from files with `<`
- Pipe the output of one command into another with `|`
- Combine multiple commands into pipelines (`sort | uniq | wc -l`)
- Use `grep` to search file contents and filter pipeline output
- Redirect error output separately with `2>`
What If You Could Wire Programs Together?
Every command you’ve run so far has printed its output to the screen. That’s fine for ls or cal, but what about a command that produces 10,000 lines of output? Or what if you want to save compiler errors to review later?
Here’s the insight that makes Unix special: every program has three standard streams, and the shell lets you rewire them. You can send output to a file instead of the screen. You can feed a file into a program instead of typing. And — most powerful of all — you can connect the output of one program directly to the input of another, like snapping LEGO bricks together.
This is the Unix philosophy in action: small programs, connected by pipes, solving complex problems.
Streams, Redirection, and Pipes
The Three Standard Streams
Every running program automatically gets three channels:
| Stream | File Descriptor | Default Destination | Purpose |
|---|---|---|---|
| stdin | 0 | Keyboard | Input |
| stdout | 1 | Screen | Normal output |
| stderr | 2 | Screen | Error messages |
From Java: You already know these!
`System.out.println()` writes to stdout. `System.err.println()` writes to stderr. `new Scanner(System.in)` reads from stdin. In C, `printf()` writes to stdout, `fprintf(stderr, ...)` writes to stderr, and `scanf()` reads from stdin. The streams are the same — Unix just lets you redirect them.
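You can see the two output streams directly in the shell itself. The `>&2` redirection below sends a line to stderr; the filenames are just illustrative:

```shell
echo "normal output"          # written to stdout (fd 1)
echo "error message" >&2      # written to stderr (fd 2)

# Both appear on screen by default. Redirect stdout to a file
# and only the stderr line remains visible on the screen:
{ echo "normal output"; echo "error message" >&2; } > out.txt
cat out.txt                   # contains only "normal output"
```

Two streams, two destinations — even though both normally land on the same screen.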
Output Redirection: > and >>
Let’s start with a running example. Create a file using output redirection:
echo "banana" > fruits.txt
This sends the output of echo to fruits.txt instead of the screen. The > operator means “redirect stdout to this file, overwriting it if it exists.”
Add more entries:
echo "apple" >> fruits.txt
echo "cherry" >> fruits.txt
echo "banana" >> fruits.txt
echo "date" >> fruits.txt
echo "apple" >> fruits.txt
The >> operator means “append to the file.” Check what we’ve built:
cat fruits.txt
banana
apple
cherry
banana
date
apple
Common Pitfall: `>` is destructive. It overwrites the target file immediately, before the command even runs. The classic disaster: `sort fruits.txt > fruits.txt` — this empties `fruits.txt` before `sort` can read it! Always use a temporary file: `sort fruits.txt > temp.txt && mv temp.txt fruits.txt`.
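A second escape hatch worth knowing: `sort` has a `-o` option that names the output file explicitly. Because `sort` reads all of its input before it opens the `-o` file, sorting a file onto itself is safe this way:

```shell
# Safe in-place sort: with -o, the output file is written only
# after the input has been fully read.
sort -o fruits.txt fruits.txt
```

This only works because `sort` implements it deliberately; the temp-file pattern is the general-purpose fix for any command.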
Input Redirection: <
You can also redirect a file into a program’s stdin:
sort < fruits.txt
Output (to screen):
apple
apple
banana
banana
cherry
date
Combine input and output redirection:
sort < fruits.txt > sorted.txt
cat sorted.txt
Now sorted.txt contains the sorted version.
Error Redirection: 2>
Stderr has its own file descriptor (2), so it gets its own redirect:
ls /nonexistent 2> errors.txt
cat errors.txt
ls: cannot access '/nonexistent': No such file or directory
Separate stdout and stderr into different files:
ls /bin /nonexistent > files.txt 2> errors.txt
Now files.txt has the successful listing and errors.txt has the error messages.
Redirect both to the same file:
ls /bin /nonexistent > everything.txt 2>&1
Key Insight: The `2>&1` syntax means “redirect stderr (2) to wherever stdout (1) is currently going.” Order matters! `> file 2>&1` works because stdout is redirected to the file first, then stderr follows. `2>&1 > file` would send stderr to the screen (original stdout) and only stdout to the file.
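You can watch the order rule in action with a quick experiment (filenames here are just illustrative):

```shell
# Correct order: stdout goes to the file first, then stderr
# follows it — both streams end up in both.txt.
ls /bin /nonexistent > both.txt 2>&1

# Wrong order: stderr is pointed at the screen (stdout's original
# target) BEFORE stdout is redirected, so the error still prints.
ls /bin /nonexistent 2>&1 > stdout_only.txt
```

After the second command, `stdout_only.txt` holds only the listing; the error message appeared on your terminal.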
Quick Check: What's the difference between echo "hello" > file.txt and echo "hello" >> file.txt?
> truncates the file to zero bytes before writing — anything that was there is gone. >> opens the file for appending and adds new content after the existing content. Both create the file if it doesn't exist. Mixing these up is how people accidentally destroy data: sort data.txt > data.txt empties the file before sort reads it.
Why does this matter?
In C, fopen("file", "w") and fopen("file", "a") do exactly the same thing as > and >>. Understanding shell redirection now means you’ll instantly know what those C file modes do in Week 8 when you start file I/O.
The /dev/null Black Hole
Sometimes you want to discard output entirely:
ls /nonexistent 2>/dev/null # Suppress error messages
command > /dev/null 2>&1 # Suppress everything
/dev/null is a special file that discards anything written to it. It’s Unix’s trash can for output.
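One everyday use of this pattern: testing whether a program is installed without printing anything. `command -v` prints a program's path if it exists; here we throw the output away and keep only the success/failure status. This is a sketch assuming a POSIX shell:

```shell
# Discard both streams; only the exit status survives.
if command -v git > /dev/null 2>&1; then
    echo "git is installed"
else
    echo "git is not installed"
fi
```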
The Pipe Operator: |
Now for the real power. The pipe | connects stdout of one command to stdin of the next:
cat fruits.txt | sort
This feeds fruits.txt through sort. But we can keep going:
cat fruits.txt | sort | uniq
Output:
apple
banana
cherry
date
uniq removes adjacent duplicate lines. Since we sorted first, identical lines are adjacent, so all duplicates are removed.
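To see why the sort matters, try `uniq` without it. In our `fruits.txt`, no two duplicate lines sit next to each other, so `uniq` removes nothing:

```shell
cat fruits.txt | uniq | wc -l    # 6 — duplicates aren't adjacent
sort fruits.txt | uniq | wc -l   # 4 — sorting made them adjacent
```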
Add a count:
cat fruits.txt | sort | uniq -c
2 apple
2 banana
1 cherry
1 date
Sort by count (most common first):
cat fruits.txt | sort | uniq -c | sort -rn
2 banana
2 apple
1 date
1 cherry
We just built a frequency analysis pipeline from four simple commands. No programming language needed.
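A small style note: the leading `cat` isn't strictly needed, since `sort` can read a filename directly. The same frequency pipeline, one process shorter:

```shell
# Equivalent to the version above — sort reads the file itself.
sort fruits.txt | uniq -c | sort -rn
```

Both forms are fine; starting with `cat` can make it easier to swap in a different data source later.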
Key Insight: Pipes embody the Unix philosophy. Instead of one giant program that reads, sorts, deduplicates, and counts, you chain four small programs. Each does one thing well; pipes connect them. This is why Unix has hundreds of small commands instead of a few big ones.
Quick Check: Suppose you type ls > sort instead of ls | sort. What happens?
> is file redirection — it creates (or overwrites) a regular file with that name. The shell doesn't care that a command named sort also exists. You get a plain file called sort in your current directory containing the raw ls output. The actual sort program in /usr/bin is unaffected because > writes to the current directory, not to your PATH.
Essential Filter Commands
These commands are designed to work in pipelines:
sort — sort lines
sort fruits.txt # Alphabetical (default)
sort -n numbers.txt # Numerical
sort -r fruits.txt # Reverse order
sort -k2 data.txt # Sort by 2nd field
sort -t',' -k2 -n grades.csv # Sort CSV by 2nd column numerically
Common Pitfall: `sort` without `-n` sorts alphabetically. This means `10` comes before `2` (because the character `1` < `2` in character comparison). Always use `-n` for numerical data.
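A quick demonstration of the difference:

```shell
printf '10\n2\n1\n' | sort      # alphabetical: 1, 10, 2
printf '10\n2\n1\n' | sort -n   # numerical:    1, 2, 10
```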
uniq — remove adjacent duplicates (sort first!)
sort fruits.txt | uniq # Remove duplicates
sort fruits.txt | uniq -c # Count occurrences
sort fruits.txt | uniq -d # Show ONLY duplicates
wc — count lines, words, bytes
wc -l fruits.txt # Count lines
ls /bin | wc -l # Count files in /bin
head and tail — first/last N lines
head -5 bigfile.txt # First 5 lines
tail -3 bigfile.txt # Last 3 lines
ls -lS | head -6 # 5 largest files (+header)
history | tail -10 # Last 10 commands
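`head` and `tail` also combine to slice out the middle of a file — take the first N lines, then keep only the last few of those:

```shell
head -20 bigfile.txt | tail -5   # lines 16 through 20
```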
grep — filter lines matching a pattern
grep "apple" fruits.txt # Lines containing "apple"
ls -l | grep "\.txt" # Only .txt files (\. = literal dot)
grep -i "APPLE" fruits.txt # Case-insensitive
grep -c "apple" fruits.txt # Count matching lines
(We’ll cover grep in much more detail in Series 2, Lesson 2.1.)
cut — extract fields from delimited data
echo "Alice,90,A" | cut -d',' -f1 # Alice
echo "Alice,90,A" | cut -d',' -f2 # 90
tr — translate/replace characters
echo "hello world" | tr 'a-z' 'A-Z' # HELLO WORLD
echo "hello world" | tr -s ' ' # hello world (squeeze spaces)
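`tr` is also the classic way to turn running text into one word per line, which makes it countable with the `sort | uniq -c` pattern from earlier:

```shell
# Split on spaces, then count word frequencies:
echo "the quick fox and the lazy dog and the cat" \
    | tr ' ' '\n' | sort | uniq -c | sort -rn
# Top lines: 3 the, 2 and, then the singletons.
```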
tee — split the stream (save to file AND pass through)
ls /bin | tee filelist.txt | wc -l # Save listing AND count it
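`tee` is also handy for inspecting a pipeline mid-stream: snapshot an intermediate stage to a file while the data keeps flowing. The filename `shells.txt` here is just illustrative:

```shell
# Save the extracted shell fields for later inspection while
# the pipeline continues on to count them:
cut -d: -f7 /etc/passwd | tee shells.txt | sort | uniq -c
```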
Building Pipelines Incrementally
The Trick: Build pipelines one stage at a time. Run the first command alone, check the output. Add the next pipe, verify. When something goes wrong, you know exactly which stage caused it.
Real-world example — find the most common login shell on the system:
# Step 1: Look at the raw data
cat /etc/passwd | head -3
# Step 2: Extract just the shell field (field 7, colon-delimited)
cat /etc/passwd | cut -d: -f7 | head -5
# Step 3: Sort and count
cat /etc/passwd | cut -d: -f7 | sort | uniq -c
# Step 4: Sort by count, highest first
cat /etc/passwd | cut -d: -f7 | sort | uniq -c | sort -rn | head -5
Each step builds on the last. You verify at every stage.
Why does this matter?
Debugging C programs means reading compiler output, filtering log files, and analyzing test results. A quick pipeline like gcc *.c 2>&1 | grep error pulls out just the error lines from a wall of compiler output. You’ll use these pipelines daily — they’re not academic exercises.
Combining Pipes with Redirection
sort < data.txt | uniq > results.txt # Input from file, pipe, output to file
cat fruits.txt | sort | uniq -c > summary.txt # Pipeline output to file
Common Pitfall: Don’t confuse `>` and `|`. `ls > sort` creates a file named “sort” and writes the listing into it. `ls | sort` sends the listing to the command `sort`. One character difference, completely different behavior.
Quick Check: Given a file names.txt containing duplicate names in random order, which pipeline counts how many times each name appears?
sort names.txt | uniq -c. uniq only removes adjacent duplicates, so the data must be sorted first: skipping the sort means non-adjacent duplicates won't be counted together, and running uniq before sort counts duplicates in unsorted data, then sorts the (wrong) counts. grep -c is no substitute either — it counts lines matching a pattern, not duplicate frequencies.
Quick Check: What's wrong with sort data.txt > data.txt?
The > operator truncates (empties) data.txt before sort reads it. The result is an empty file. Use a temporary file instead: sort data.txt > temp.txt && mv temp.txt data.txt.
Quick Check: Why does uniq require sorted input?
uniq only removes adjacent duplicate lines. If duplicates aren’t next to each other (because the data isn’t sorted), uniq won’t detect them. Always pipe through sort first: sort | uniq.
Quick Check: What does ls /bin | grep zip | wc -l do?
Lists all files in /bin, filters to only those containing “zip” in the name, then counts how many there are. It answers: “How many programs in /bin have ‘zip’ in their name?”
Quick Check: Where does stderr go when you use a pipe?
Directly to the screen. Pipes only connect stdout. Error messages (stderr) bypass the pipe. This is usually what you want — you see errors immediately while normal output flows through the pipeline.
You Just Learned the Unix Superpower
Pipes and redirection are arguably the most important Unix concept in this entire series. They’re what let you solve complex data processing problems without writing a single line of code. Professional developers use them daily — filtering log files, analyzing data, processing build output.
And here’s the connection to C that you’ll see in a few weeks: when you write C programs, they automatically get the same three streams (stdin, stdout, stderr). A C program that reads from stdin and writes to stdout can be dropped into any pipeline. That’s by design.
In the final lesson of this series, you’ll learn to customize your shell — setting up environment variables, modifying your PATH, and creating aliases that make your daily workflow faster.
Big Picture: The pipeline concept shows up everywhere in software engineering: data pipelines, CI/CD pipelines, streaming architectures. The `|` operator in your shell is the simplest version of a pattern you’ll encounter throughout your career.