Deep Dive: C Standards, String Design, and Why C90 Discipline Still Matters
How C became C, why strings are null-terminated, where C90 code still runs, and what each standard added
Based on content from Dr. Stu Steiner, Eastern Washington University.
This page is optional reading. The main lessons tell you what rules to follow in C90 and what changes in C99. This page tells you why the language looks this way, and why the C90 dialect this course opens with still matters in 2026.
How C got here (very short version)
- 1969–1971, Bell Labs. Ken Thompson writes B, a stripped-down BCPL, to rewrite parts of Unix. B is typeless; every cell is an integer on a PDP-7. Dennis Ritchie modifies B to add types and renames it C for the PDP-11.
- 1978. Kernighan and Ritchie publish The C Programming Language. This “K&R C” is the de-facto standard for a decade.
- 1989. ANSI ratifies C89. ISO adopts it as ISO/IEC 9899:1990 the next year. The two define the same language, so “C89” and “C90” are used interchangeably; this course calls it C90.
- 1999. C99 adds // comments, inline declarations in for, variable-length arrays, long long, <stdbool.h>, <stdint.h>, compound literals, and designated initializers.
- 2011. C11 makes VLAs optional, adds _Static_assert, _Generic, atomics, threads, _Noreturn, and anonymous structs. Removes gets() entirely.
- 2017. C17 is a bug-fix release with no new features.
- 2023. C23 adds bool/true/false as keywords, nullptr, two’s-complement as required, [[attributes]], binary literals, typeof, improved enum.
The two tectonic shifts are C89 (the language got a formal grammar) and C99 (the language got things modern programmers expect). Everything after C99 is refinements.
Why null-terminated strings
A string in C is a char array whose last byte is '\0'. Every string library function scans forward until it hits that byte. Modern languages almost universally prefix the string with its length instead. Why did C go the other way?
Two reasons, both historical.
Register pressure on the PDP-11. The PDP-11 had eight general-purpose registers, two of which served as the stack pointer and program counter. A string operation that carried both a pointer and a length needed two registers; one that carried only a pointer needed one. On a machine where registers were that scarce, halving the register cost of string operations was a real win. Length-prefixed strings would have been more expensive to pass through function calls.
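B heritage. BCPL, C’s grandparent, was entirely typeless and stored a string with its character count in the first byte. B, C’s direct parent, dropped the count and ended each string with a special terminator character instead, partly to escape the length limit that a one-byte count imposes and partly because its authors found a terminator more convenient to work with. C inherited the terminator convention from B along with much of its syntax, and settled on the zero byte as the terminator.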
The trade-off C accepted is permanent. Every strlen call is O(n). A large share of the buffer-overflow bugs since the 1980s trace at least part of their root cause to “the function had no way to know how long the buffer was.” Length-prefixed strings (Pascal, most modern languages) make overflow structurally harder; null termination makes it structurally easy.
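The cost difference can be sketched in a few lines of C. This is an illustration only: struct counted_str, scan_length, counted_length, and bounded_copy are hypothetical names invented for this page, not part of the standard library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical length-prefixed string, for comparison only.
   Nothing like this exists in the C standard library. */
struct counted_str {
    size_t len;        /* number of bytes in data */
    const char *data;  /* not required to end in '\0' */
};

/* What strlen has to do: walk forward until the zero byte. O(n). */
size_t scan_length(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    while (s[n] != '\0') {
        n++;
    }
    return n;
}

/* With a stored length, the answer is a single field read. O(1). */
size_t counted_length(const struct counted_str *s)
{
    assert(s != NULL);
    return s->len;
}

/* A bounded copy works only because the caller passes dst_size;
   the dst pointer alone carries no size information.
   Returns 0 on success, -1 if src plus its terminator does not fit. */
int bounded_copy(char *dst, size_t dst_size, const char *src)
{
    size_t n;

    assert(dst != NULL && src != NULL);
    n = scan_length(src);
    if (n + 1 > dst_size) {
        return -1;
    }
    memcpy(dst, src, n + 1);  /* n bytes of text plus the '\0' */
    return 0;
}
```

The point of bounded_copy is the extra dst_size parameter: with null-terminated strings, the buffer size has to travel separately from the pointer, and every function that forgets to ask for it is a potential overflow.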
Rob Pike’s UTF-8 history page is relevant here: UTF-8 was deliberately designed so that no byte of a multi-byte character is zero or collides with an ASCII byte, which is why existing null-terminated C string code keeps working on UTF-8 data.
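A small sketch of what that compatibility means in practice, assuming the input is valid UTF-8. The function names utf8_code_points and utf8_extra_bytes are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count code points in a valid UTF-8 string by counting every byte
   that is NOT a continuation byte (continuation bytes look like 10xxxxxx).
   The loop still stops at '\0', exactly like any other C string loop,
   because no byte of a multi-byte UTF-8 character is ever zero. */
size_t utf8_code_points(const char *s)
{
    size_t count = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80) {
            count++;
        }
    }
    return count;
}

/* strlen keeps working on UTF-8 data, but it counts bytes, not characters.
   The difference is the number of continuation bytes. */
size_t utf8_extra_bytes(const char *s)
{
    return strlen(s) - utf8_code_points(s);
}
```

For the string "h\xC3\xA9llo" (the word "héllo" with é encoded as two bytes), strlen reports 6 bytes while utf8_code_points reports 5 characters. Byte-oriented functions such as strlen, strcpy, and strchr with an ASCII argument behave correctly; anything that assumes one byte per character does not.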
What each standard added (programmer’s view)
C89 / C90
The baseline this course uses for week 4. Everything K&R plus:
- A formal grammar and semantics.
- Function prototypes (int add(int, int);). Before C89, parameter types were declared K&R-style, in a separate list between the parameter names and the function body. Many C90 compilers still accept that dialect for legacy code.
- Standard library headers as we know them (<stdio.h>, <stdlib.h>, <string.h>, <math.h>, <time.h>, and others).
- void as a keyword, including void *.
- Signed vs unsigned types as distinct categories.
- Trigraph sequences (??= for #, and so on) that almost no one uses now.
What is not in C90, and that you will feel the absence of:
- No // comments.
- No declarations in the middle of a block.
- No for (int i = ...) counter declaration.
- No long long.
- No bool.
- No VLAs, no designated initializers, no compound literals.
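As a rough illustration of how those absences shape the code, here is a sketch written to stay inside C90. The helpers count_char and contains_char are hypothetical, invented for this page.

```c
#include <assert.h>
#include <stddef.h>

/* Complete prototypes, as C90 allows and this course requires. */
int count_char(const char *s, char target);
int contains_char(const char *s, char target);

/* Count how many times target appears in s. */
int count_char(const char *s, char target)
{
    size_t i;     /* every declaration comes before the first statement */
    int count;

    assert(s != NULL);
    count = 0;
    for (i = 0; s[i] != '\0'; i++) {  /* counter declared above, not in the for */
        if (s[i] == target) {
            count++;
        }
    }
    return count;
}

/* Returns 1 if target appears in s, 0 otherwise.
   The result is an int because C90 has no bool. */
int contains_char(const char *s, char target)
{
    return count_char(s, target) > 0;
}
```

Only /* ... */ comments appear, all variables are declared at the top of the block, and truth values are plain int. Compiling with GCC or Clang using -std=c90 -pedantic is one way to have the compiler flag anything that strays outside the dialect.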
C99
The big upgrade this course switches to in week 5:
- // single-line comments.
- Declarations anywhere in a block, including for (int i = 0; ...).
- long long (guaranteed at least 64 bits).
- <stdbool.h> with bool, true, false.
- <stdint.h> with int32_t, int64_t, and friends (fixed-width integers you actually want when writing portable code).
- Variable-length arrays (int arr[runtime_n];). Optional in C11 and later.
- Compound literals: (struct Point){ .x = 3, .y = 4 }.
- Designated initializers: int days[12] = { [0] = 31, [1] = 28, ... };.
- inline functions, restrict pointers, variadic macros, _Bool, _Complex.
C99 is the dialect most open-source C projects target today, often with a few C11 features backfilled by the toolchain.
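A short sketch that exercises several of the C99 additions listed above in one place. The names struct point, days_in, days_in_year, manhattan, and distance_from_origin are illustrative assumptions, and the month table ignores leap years except through the explicit flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Hypothetical type for illustration only.
struct point {
    int x;
    int y;
};

// Designated initializers: each value is tied to its index explicitly.
static const int days_in_month[12] = {
    [0] = 31, [1] = 28, [2] = 31, [3] = 30, [4] = 31,  [5] = 30,
    [6] = 31, [7] = 31, [8] = 30, [9] = 31, [10] = 30, [11] = 31
};

// Days in a month of a non-leap year; month_index runs 0..11.
int days_in(int month_index)
{
    assert(month_index >= 0 && month_index < 12);
    return days_in_month[month_index];
}

// bool from <stdbool.h>, int64_t from <stdint.h>, counter declared in the for.
int64_t days_in_year(bool leap)
{
    int64_t total = 0;
    for (int i = 0; i < 12; i++) {
        total += days_in(i);
    }
    return leap ? total + 1 : total;
}

// Manhattan distance between two points.
int manhattan(struct point a, struct point b)
{
    int dx = a.x > b.x ? a.x - b.x : b.x - a.x;  // declared at first use
    int dy = a.y > b.y ? a.y - b.y : b.y - a.y;
    return dx + dy;
}

int distance_from_origin(struct point p)
{
    // Compound literal: an unnamed struct point built inside the expression.
    return manhattan(p, (struct point){ .x = 0, .y = 0 });
}
```

None of this compiles as C90: the // comments, the for (int i ...) counter, the designated initializers, and the compound literal are each rejected by a strict C90 compiler.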
C11
- _Static_assert (compile-time assertions).
- _Generic (type-based macro dispatch, the closest C gets to overloading).
- Optional atomics and threading (<stdatomic.h>, <threads.h>).
- Removed gets() from the standard library entirely.
- Anonymous unions and structs.
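A minimal sketch of three of these C11 features, assuming a C11-capable compiler. ABS_OF, abs_int, abs_long, abs_double, struct value, and make_whole are hypothetical names chosen for the example.

```c
#include <assert.h>   // also provides the static_assert spelling in C11
#include <limits.h>

// Compile-time assertions: a false condition stops the build with the message.
_Static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");
static_assert(sizeof(long long) * CHAR_BIT >= 64, "long long is at least 64 bits");

// Anonymous union: whole and real are accessed as if they were
// direct members of struct value.
struct value {
    int is_real;
    union {
        long whole;
        double real;
    };
};

struct value make_whole(long n)
{
    struct value v;
    v.is_real = 0;
    v.whole = n;   // no intermediate union member name needed
    return v;
}

// One function per type; none of these handles the most negative integer.
int abs_int(int v) { return v < 0 ? -v : v; }
long abs_long(long v) { return v < 0 ? -v : v; }
double abs_double(double v) { return v < 0 ? -v : v; }

// _Generic selects a function from the static type of x, then calls it.
#define ABS_OF(x) _Generic((x), \
    int: abs_int,               \
    long: abs_long,             \
    double: abs_double)(x)
```

ABS_OF(-3) resolves to abs_int, ABS_OF(-3L) to abs_long, and ABS_OF(-2.5) to abs_double, all at compile time. A type that is not listed, such as float, is a compile error because the selection has no default branch.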
C17 / C23
C17 is a bug-fix release: no new language features.
C23 pulls several long-overdue ergonomics into the language itself: bool, true, false, and nullptr are keywords; [[nodiscard]], [[deprecated]], [[maybe_unused]] come from C++; binary literals (0b1010); #embed for embedding binary assets; typeof for type expressions. Few toolchains have full C23 support as of this writing.
Why C90 discipline still matters
You might wonder why this course opens in a 36-year-old dialect. Three concrete reasons.
Kernel and embedded code. The Linux kernel was built as gnu89 (C89 plus GNU extensions) until 2022 and now builds as gnu11, with many C89 idioms retained. FreeBSD and OpenBSD kernels are C89 / C99. Most microcontroller SDKs, and the CMSIS standard for ARM Cortex-M, are C89. The AUTOSAR standard for automotive software requires MISRA C (which subsets C90 or C99). If your career goes anywhere near an OS, a driver, a microcontroller, an avionics system, or a car, you will read code in this dialect.
Portable code. If you want your C code to compile under every C compiler still in maintenance (GCC, Clang, MSVC, Tiny C, Intel, IBM XL, Oracle Studio, Green Hills, Keil, IAR, Watcom), the intersection of their feature sets is roughly C90 with a handful of C99 features everyone supports. Compiler vendors are slow because their customers are slow.
The discipline transfers. C90 makes you declare variables before statements, which makes you think about what variables a function actually needs. It makes you write complete prototypes. It makes you treat const as opt-in rather than the default. These are the habits that show up in every C code review from every company. C99 syntax is nicer; the C90 habits are more valuable.
Once you have the C90 discipline, you switch on C99 features one at a time as the problem calls for them, instead of reaching for for (int i = 0; ...) by reflex and losing the ability to read older code.
MISRA C, just so you have heard the name
MISRA C is a style and correctness guideline maintained by the Motor Industry Software Reliability Association. It defines roughly 150 rules (most of them “required,” the rest “advisory”) that subset C to make it safer for safety-critical use: restrictions on recursion, dynamic memory, goto, and implicit conversions; mandatory braces on every block; limits on side effects within an expression; and many more. Every MISRA-compliant program is a C90 (or C99) program; not every C90 program is MISRA-compliant.
You will not be graded on MISRA in this course. But if you end up writing embedded or automotive C after graduation, MISRA is almost certainly what you will be held to, and the habits of this course (braces, declarations at top, warnings as errors, no gets) are direct ancestors of MISRA requirements.
Where to go next
- Back to Introduction to C
- Back to Arrays & Strings
- For the security consequences of legacy string design, see Deep Dive: C Memory Safety CVEs.