Deep Dive: C Memory Safety and the Lab 1 CWEs
The five CWEs behind Lab 1's reflection questions, with concrete exploit mechanics and famous real-world incidents
Based on content from Dr. Stu Steiner, Eastern Washington University.
This page is optional reading. It takes the five CWEs from the Lab 1 reflection and shows you, in concrete terms, what goes wrong at the machine level and what historic incident the pattern produced. The main lesson (Variables & I/O) has the programming rules. This page has the exploit mechanics.
If you want the underlying machine model first, read Deep Dive: The C Machine Model.
CWE-481: assignment in a condition (= vs ==)
The pattern
if (available_credit = 100) {
    approve_purchase();
}
The programmer meant ==. With =, the line assigns 100 to available_credit, and the expression available_credit = 100 evaluates to 100, which is non-zero, which C treats as true. So approve_purchase() always runs. The else branch (if any) never runs.
Why it compiles
C deliberately makes assignment an expression, not a statement. This is what lets you write:
while ((ch = getc(fp)) != EOF) { ... }
and other useful patterns. The price is that = in a condition is syntactically legal.
Defense
- Turn on -Wall. GCC warns on assignment-in-condition unless you wrap it in double parens (if ((x = f())...)) to signal “yes, I meant it.”
- “Yoda conditions” put the constant on the left: if (100 == available_credit). A typo, if (100 = ...), is a compile error because you cannot assign to a literal. Not universally loved, but effective.
CWE-597: using == to compare strings
The pattern
char stored[64];
char submitted[64];
/* ... load stored from DB, read submitted from user ... */
if (submitted == stored) {
    grant_account_access();
}
In C, an array name used in an expression like this one decays to a pointer to the array’s first element. submitted == stored therefore compares two addresses, not the two strings’ contents. stored lives in one place in memory, submitted lives in another. Their addresses are never equal, so the condition is always false and grant_account_access() never runs.
That sounds like a safe failure mode. It is not. When the same pattern appears with inverted logic, the bug flips to silently granting access.
Why a C programmer writes this
They wrote it in Java last quarter. Java’s == on String objects also compares references, not content; the Java idiom is submitted.equals(stored). C has no .equals(...), so reaching for == as the replacement feels natural. It is wrong.
Defense
if (strcmp(submitted, stored) == 0) {
    grant_account_access();
}
strcmp walks both strings byte by byte and returns 0 when they match. Zero means equal: memorize it.
One more trap
Do not check strcmp(a, b) == 1 or == -1. The C standard only promises the sign of the non-zero return. strcmp("apple", "apricot") can return -1, -2, -14, or any negative value depending on the libc. Portable code uses < 0, > 0, or == 0.
CWE-483: missing braces on a multi-statement if body
The pattern (real code)
/* Apple's SSL/TLS handshake signature verification (server key exchange), simplified. CVE-2014-1266. */
if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
    goto fail;
    goto fail;
if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0)
    goto fail;
/* more verification checks below, all skipped */
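The second goto fail; is indented as if it is part of the if. It is not. Without braces, only the one statement immediately after the if is conditional. The second goto fail; runs every time, unconditionally, skipping every check below it. Because err is still 0 at that point (the update call succeeded), the function jumps to its cleanup label and returns success without ever verifying the signature.
The code verified the server’s signature on the TLS key exchange in Apple’s system TLS library. The bug shipped in iOS 6 and 7 and in OS X 10.9, and on iOS it went unpatched for well over a year before the February 2014 fix. During that window, an attacker on the same Wi-Fi as the victim could impersonate an HTTPS site to apps that used the system library: the certificate chain was still checked, but the OS never confirmed that the server actually held the matching private key.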
Why indentation is not enough
Human eyes read indentation; the compiler reads braces. As long as the grammar accepts the code, the compiler produces exactly what the parser derived, regardless of whitespace.
Defense
- Always use braces, even for a single statement.
- gcc -Wmisleading-indentation (on via -Wall since GCC 6) catches the exact pattern above.
- Static analyzers (clang-tidy, Coverity, the Apple post-mortem analyzer) flag unreachable-code-after-goto-fail patterns.
CWE-134: format-string vulnerability
The pattern
char name[64];
fgets(name, sizeof(name), stdin);
printf(name); /* BUG */
The programmer passed user input as the format string itself, not as an argument to a format string. Normally you write:
printf("%s", name); /* safe */
With the safe form, %s is your literal and name is the data.
With the buggy form, the user supplies the format string. If the user types %s %s %s %x %x, printf treats those as conversion specifiers and fetches five arguments that were never passed. Depending on the calling convention, those values come from whatever is left in the argument registers and then from stack slots above printf’s frame, which leaks stack contents (pointers, return addresses, sometimes secrets) to stdout. Each %s goes further and dereferences the fetched value as a pointer, which can crash the program or print memory far from the stack.
Worse: %n tells printf to write the number of bytes printed so far into the int that a pointer argument points to. When the format string is attacker-controlled, %n becomes a write-what-where primitive. Combined with the read primitives of %s and %x, an attacker can leak stack addresses, compute offsets, and then write arbitrary values to arbitrary memory. This is the mechanism behind a class of exploits from the late 1990s and early 2000s that compromised wu-ftpd, rpc.statd, and many others.
Defense
Never pass user input as the format string. Always:
printf("%s", user_input); /* data is the argument, not the format */
Modern GCC warns on printf(variable) when the format is not a string literal and no arguments follow it: -Wformat-security (on via -Wformat=2).
CWE-120: classic buffer overflow (no length cap on input)
The pattern
char name[30];
int boarding_zone = 4; /* assume it sits right after name on the stack */
scanf("%s", name); /* reads one "word" with no upper bound */
With scanf("%s", name) there is no length cap. A "word" is “as many non-whitespace bytes as the user provides.” Handing the program 100 non-whitespace bytes writes 101 bytes (the 100 input bytes plus the terminating '\0') starting at name, so offsets 30 through 100 fall outside name’s 30 bytes.
Assuming the stack layout above, which the compiler is free to change, boarding_zone lives just past name (often at offset 32 rather than 30, because an int is usually aligned to 4 bytes). The attacker picks exactly which bytes land there: whatever they want boarding_zone to become. Carefully chosen bytes can also overwrite the saved return address further up the frame, which on classic (non-stack-protected) systems redirects execution to attacker-controlled code or existing code gadgets.
Historic examples
- Morris Worm (1988): exploited gets() in fingerd to overflow a stack buffer and execute shellcode on VAX systems. First internet-scale worm.
- CVE-2021-3156 “Baron Samedit”: heap buffer overflow in sudo. An off-by-one in the command-line parsing allowed writing past a heap allocation, eventually letting a local user gain root on most Linux distributions.
- CVE-2019-11931: a stack-based buffer overflow in WhatsApp’s handling of MP4 files, triggered by sending a crafted MP4. It could lead to denial of service or remote code execution.
Defense
The fix is always “bound the read”:
/* Option 1: width specifier on scanf */
scanf("%29s", name); /* at most 29 chars + '\0' in a 30-byte buffer */
/* Option 2: fgets for a whole line */
char line[100];
fgets(line, sizeof(line), stdin); /* at most 99 chars + '\0' */
And never gets(). It was removed from the C standard in C11 because it fundamentally cannot be used safely.
The habit that defeats all five
The five CWEs have five different fixes, but one shared habit underwrites all of them: compile with warnings on and treat warnings as errors.
gcc lab1.c -Wall -Wextra -Wformat=2 -pedantic -std=c90 -o lab1
- -Wall -Wextra catches the = vs == mistake, the implicit declarations, the unused variables.
- -Wformat=2 catches printf(user_input).
- -pedantic -std=c90 flags non-standard extensions so your code behaves the same on every C90 compiler.
- -Wmisleading-indentation (in -Wall on GCC 6+) catches the Apple “goto fail” shape.
- Adding -Werror turns every warning into a build failure, which is the “treat warnings as errors” half of the habit.
None of these flags would have caught every historic CVE listed above, but each one catches a category. The combination, applied consistently, is one reason modern systems are not as exploitable as 1988 systems. Your Lab 1 autograder runs essentially this command.
Where to go next
- Back to Variables & I/O
- Back to If/Else, Switch & Loops (has the goto fail walkthrough)
- For the machine-level mechanics, see Deep Dive: The C Machine Model.