Deep Dive: C Memory Safety and the Lab 1 CWEs

This page is optional reading. It takes the five CWEs from the Lab 1 reflection and shows you, in concrete terms, what goes wrong at the machine level and, where there is one, which historic incident the pattern produced. The main lesson (Variables & I/O) has the programming rules. This page has the exploit mechanics.

If you want the underlying machine model first, read Deep Dive: The C Machine Model.

CWE-481: assignment in a condition (`=` vs `==`)

The pattern

if (available_credit = 100) {
    approve_purchase();
}

The programmer meant ==. With =, the line assigns 100 to available_credit, and the expression available_credit = 100 evaluates to 100, which is non-zero, which C treats as true. So approve_purchase() always runs. The else branch (if any) never runs.

Why it compiles

C deliberately makes assignment an expression, not a statement. This is what lets you write:

while ((ch = getc(fp)) != EOF) { ... }

and other useful patterns. The price is that = in a condition is syntactically legal.

Defense

Turn on -Wall. GCC warns on assignment-in-condition unless you wrap it in double parens (if ((x = f())...)) to signal “yes, I meant it.”
“Yoda conditions” put the constant on the left: if (100 == available_credit). A typo if (100 = ...) is a compile error because you cannot assign to a literal. Not universally loved, but effective.

CWE-597: using `==` to compare strings

The pattern

char stored[64];
char submitted[64];
/* ... load stored from DB, read submitted from user ... */
if (submitted == stored) {
    grant_account_access();
}

In this expression each array name decays to a pointer to its first element, so submitted == stored compares two addresses, not the two strings’ contents. stored and submitted are separate objects in memory. Their addresses are never equal, so the condition is always false and grant_account_access() never runs.

That sounds like a safe failure mode. It is not. When the same pattern appears with inverted logic, the bug flips to silently granting access.

Why a C programmer writes this

They wrote it in Java last quarter. Java’s == on String objects also compares references, not content, which is why the Java idiom is submitted.equals(stored). C has no .equals(...), so reaching for == in its place feels natural. It is wrong.

Defense

if (strcmp(submitted, stored) == 0) {
    grant_account_access();
}

strcmp walks both strings byte by byte and returns 0 when they match. Zero means equal: memorize it.

One more trap

Do not check strcmp(a, b) == 1 or == -1. The C standard only promises the sign of the non-zero return. strcmp("apple", "apricot") can return -1, -2, -14, or any negative value depending on the libc. Portable code uses < 0, > 0, or == 0.

CWE-483: missing braces on a multi-statement `if` body

The pattern (real code)

/* Apple's SSL/TLS certificate verification, simplified. CVE-2014-1266. */
if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
    goto fail;
    goto fail;
if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0)
    goto fail;
/* more verification checks below, all skipped */

The second goto fail; is indented as if it is part of the if. It is not. Without braces, only the one statement immediately after the if is conditional. The second goto fail; runs every time, unconditionally, skipping every check below it.

The code verified signatures during the SSL/TLS handshake on iOS and macOS. For roughly 18 months, affected iPhones and Macs skipped that check and accepted handshakes that should have been rejected. An attacker on the same Wi-Fi as the victim could impersonate an HTTPS site and the OS would accept the forged handshake.

Why indentation is not enough

Human eyes read indentation; the compiler reads braces. As long as the grammar accepts the code, the compiler produces exactly what the parser derived, regardless of whitespace.

Defense

Always use braces, even for a single statement.
gcc -Wmisleading-indentation (on via -Wall since GCC 6) catches the exact pattern above.
Static analyzers (clang-tidy, Coverity, the Apple post-mortem analyzer) flag unreachable-code-after-goto-fail patterns.

CWE-134: format-string vulnerability

The pattern

char name[64];
fgets(name, sizeof(name), stdin);
printf(name);                       /* BUG */

The programmer passed user input as the format string itself, not as an argument to a format string. Normally you write:

printf("%s", name);                 /* safe */

With the safe form, %s is your literal and name is the data.

With the buggy form, the user supplies the format string. If the user types %s %s %s %x %x, printf treats those as conversion specifiers and reads five arguments that were never passed. Depending on the calling convention, the values come from argument registers or from stack slots above printf’s frame, which leaks stack contents (pointers, return addresses, sometimes secrets) to stdout. A %s can also crash the program, because printf dereferences whatever garbage value it finds as if it were a string pointer.

Worse: %n tells printf to store the number of bytes printed so far into the int that the corresponding pointer argument points to. When the format string is attacker-controlled, %n becomes a write-what-where primitive. Combined with the read primitives of %s and %x, an attacker can leak stack addresses, compute offsets, and then write arbitrary values to arbitrary memory. This is the mechanism behind a class of exploits from the late 1990s and early 2000s that compromised wu-ftpd, rpc.statd, and many others.

Defense

Never pass user input as the format string. Always:

printf("%s", user_input);           /* data is the argument, not the format */

Modern GCC warns on printf(variable) when the format is not a string literal and no arguments follow it: -Wformat-security (on via -Wformat=2).

CWE-120: classic buffer overflow (no length cap on input)

The pattern

char name[30];
int  boarding_zone = 4;        /* assume the compiler places this right after name on the stack */

scanf("%s", name);             /* reads one "word" with no upper bound */

With scanf("%s", name) there is no length cap. A "word" is “as many non-whitespace bytes as the user provides.” Handing the program 100 non-whitespace bytes writes 101 bytes (the 100 input bytes plus the terminating '\0') starting at name, and the bytes at offsets 30 through 100 fall outside name’s 30-byte space.

On the stack layout above, boarding_zone lives right after name. The attacker picks exactly which bytes appear at offset 30: whatever they want boarding_zone to become. Carefully chosen bytes can also overwrite the saved return address further up the frame, which on classic (non-stack-protected) systems redirects execution to attacker-controlled code or existing code gadgets.

Historic examples

Morris Worm (1988): exploited gets() in fingerd to overflow a stack buffer and execute shellcode on VAX systems. First internet-scale worm.
CVE-2021-3156 “Baron Samedit”: heap buffer overflow in sudo. An off-by-one in the command-line parsing allowed writing past a heap allocation, eventually letting a local user gain root on most Linux distributions.
CVE-2019-11931: a stack-based buffer overflow in WhatsApp’s MP4 parsing, triggered by sending a crafted MP4 file. It could lead to denial of service or remote code execution.

Defense

The fix is always “bound the read”:

/* Option 1: width specifier on scanf */
scanf("%29s", name);                   /* at most 29 chars + '\0' in a 30-byte buffer */

/* Option 2: fgets for a whole line */
char line[100];
fgets(line, sizeof(line), stdin);      /* at most 99 chars + '\0' */

And never use gets(). It was removed from the C standard in C11 because it fundamentally cannot be used safely.

The habit that defeats all five

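The five CWEs have five different fixes, but one shared habit underwrites all of them: compile with warnings on and treat every warning as an error to fix. Adding -Werror to the command below makes GCC enforce that by refusing to build while any warning remains.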

gcc lab1.c -Wall -Wextra -Wformat=2 -pedantic -std=c90 -o lab1

-Wall -Wextra catch the = vs == mistake, implicit declarations, and unused variables.
-Wformat=2 catches printf(user_input).
-pedantic -std=c90 warn about non-standard extensions so your code behaves the same on every C90 compiler.
-Wmisleading-indentation (included in -Wall on GCC 6+) catches the Apple “goto fail” shape.

No single flag would have caught every historic CVE listed above, but each one catches a category. The combination, applied consistently, is one reason modern systems are harder to exploit than 1988 systems. Your Lab 1 autograder runs essentially this command.

Where to go next

Back to Variables & I/O
Back to If/Else, Switch & Loops (has the goto fail walkthrough)
For the machine-level mechanics, see Deep Dive: The C Machine Model.
