Arrays & Strings
Fixed-size arrays, null-terminated strings, and bounded I/O
Based on content from Dr. Stu Steiner, Eastern Washington University.
In a nutshell
Java’s int[] knew its own length. Java’s String was an object with methods. C does neither. A C array is a fixed-size, contiguous block of memory that does not carry its length and does not check its bounds. A C string is a char array whose last used byte is '\0'; every string function scans forward until it finds that byte. If the byte is missing, the function keeps reading. This lesson covers the habits that make arrays and strings safe to use without the safety net.
Practice this topic: Arrays or Strings drills, or browse the practice gallery.
After this lesson, you will be able to:
- Declare and initialize C arrays, including zero-filling with = {0}
- Compute element count with sizeof(arr) / sizeof(arr[0]) and say why it fails inside functions
- Say why a C string is a char array with '\0' at the end
- Use strlen, strcmp, strcpy, and strcat from <string.h>
- Read input with fgets and say why gets is never an acceptable substitute
- Cap a scanf("%s", ...) read with a width specifier
Quick reference
| Goal | C code | Notes |
|---|---|---|
| Fixed-size array | int scores[5]; | uninitialized; holds garbage |
| Zero-filled array | int zeros[100] = {0}; | first element set to 0; the rest are zero-initialized |
| Partially initialized | int g[5] = {90, 85}; | remaining slots get 0 |
| Element count (where declared) | sizeof(arr) / sizeof(arr[0]) | does not work on function parameters |
| String literal | char s[] = "hi"; | 3 bytes including '\0' |
| String length | strlen(s) | does not count the '\0' |
| Compare strings for equality | strcmp(a, b) == 0 | zero means equal |
| Read a word, bounded | scanf("%49s", name) for char name[50] | leaves room for '\0' |
| Read a line, bounded | fgets(buf, sizeof(buf), stdin) | reads at most size - 1 chars |
| Dangerous input function | gets(buf) | never; removed in C11 |
Coming from CSCD 210
In Java you called arr.length, s.length(), s.equals(other), and the language refused out-of-bounds access. In C you do none of that. A “string” is a plain char array with a sentinel byte at the end. Every <string.h> function you will meet exists because the language does not have string methods. The techniques below are the defensive habits for working without the safety net.
Style note for this lesson
This lesson appears in week 5. The course has moved from C90 to C99, so for (int i = 0; ...) is legal. For consistency with the earlier lessons, examples below still declare the loop counter at the top and use for (i = 0; ...). That style compiles under both C90 and C99.
Arrays without a safety net
In Java, arr[100] on a size-10 array throws ArrayIndexOutOfBoundsException. In C, the same access is undefined behavior, and nothing warns you. Maybe it reads garbage. Maybe it overwrites another variable. Maybe it crashes. C does not check array bounds, ever.
Declaring arrays
int scores[5]; /* 5 ints, UNINITIALIZED (garbage) */
int grades[5] = {90, 85, 77, 92, 88}; /* fully initialized */
int zeros[100] = {0}; /* first element 0, rest auto-filled with 0 */
int partial[5] = {90, 85}; /* rest auto-filled with 0 */
Always initialize. Either with values or with = {0} to zero-fill. An uninitialized array holds whatever bytes were last in that memory.
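The initialization rules above can be checked with a small sketch. The helper name count_nonzero and the two demo functions are hypothetical, introduced only for illustration.

```c
/* Hypothetical helper: counts how many of the first n elements are nonzero. */
int count_nonzero(const int arr[], int n)
{
    int i;
    int count = 0;
    for (i = 0; i < n; i++) {
        if (arr[i] != 0) {
            count++;
        }
    }
    return count;
}

/* Partially initialized: only the two listed values are nonzero. */
int demo_partial(void)
{
    int partial[5] = {90, 85};          /* slots 2..4 are zero-initialized */
    return count_nonzero(partial, 5);
}

/* Zero-filled: no element is nonzero. */
int demo_zeros(void)
{
    int zeros[100] = {0};
    return count_nonzero(zeros, 100);
}
```

Here demo_partial returns 2 and demo_zeros returns 0. Dropping the initializer from either array would make the result unpredictable.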
Size must be a compile-time constant
#define MAX_SCORES 10
int scores[MAX_SCORES]; /* fine */
int n = 100;
int buf[n]; /* C99 VLA; not in C90; avoid */
A literal, a #define that expands to one, or an enum constant all qualify. A const int variable does not qualify in C (it does in C++). Variable-length arrays (VLAs) are legal in C99 but optional in C11 and poorly supported across toolchains. For a runtime-sized buffer, use malloc (covered this week).
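As a minimal sketch of the two portable ways to name an array size, the block below sizes one array with a #define and one with an enum constant. MAX_NAMES and both function names are hypothetical.

```c
#include <stddef.h>

#define MAX_SCORES 10           /* preprocessor constant */
enum { MAX_NAMES = 4 };         /* enum constant; hypothetical name */

/* Element count of an array sized by a #define. */
size_t scores_capacity(void)
{
    int scores[MAX_SCORES] = {0};
    return sizeof(scores) / sizeof(scores[0]);
}

/* Element count of an array sized by an enum constant. */
size_t names_capacity(void)
{
    int name_lengths[MAX_NAMES] = {0};
    return sizeof(name_lengths) / sizeof(name_lengths[0]);
}
```

Both sizes are fixed at compile time, so neither array is a VLA.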
No .length property
C arrays do not know their own size. You track it yourself:
int grades[5] = {90, 85, 77, 92, 88};
int size = sizeof(grades) / sizeof(grades[0]); /* 20 / 4 = 5 */
int i;
for (i = 0; i < size; i++) {
printf("%d\n", grades[i]);
}
The sizeof trick works only in the scope where the array was declared. It does not work inside a function that receives the array as a parameter: there, sizeof(arr) gives the size of a pointer (typically 8 bytes on a 64-bit system), not the size of the array.
Passing arrays to functions
When you pass an array to a function, the function receives a pointer to the original storage, not a copy. You must also pass the size.
void print_array(const int arr[], int size)
{
int i;
for (i = 0; i < size; i++) {
printf("%d ", arr[i]);
}
printf("\n");
}
const documents “I promise not to write to this” and the compiler enforces it. Without const, modifying array elements inside the function changes the caller’s array, because both names refer to the same storage.
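To see that a function works on the caller's storage, here is a sketch of a function that writes through its array parameter. The name add_bonus is hypothetical.

```c
/* Hypothetical helper: adds bonus to every element, in place.
 * No const here, because the function is meant to modify the array. */
void add_bonus(int arr[], int size, int bonus)
{
    int i;
    for (i = 0; i < size; i++) {
        arr[i] += bonus;    /* writes to the caller's array, not a copy */
    }
}
```

After calling add_bonus(grades, 3, 5) on an array holding 90, 85, 77, the caller's grades holds 95, 90, 82. Adding const to the parameter would turn the assignment into a compile error.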
Check your understanding (what is the bug?)
void total(int arr[])
{
int i;
int sum = 0;
int n = sizeof(arr) / sizeof(arr[0]);
for (i = 0; i < n; i++) {
sum += arr[i];
}
printf("sum = %d\n", sum);
}
The caller passes a 10-element array. total prints a sum much smaller than expected. What is wrong?
Answer
Inside total, arr is a pointer to the first element, not a 10-element array. On 64-bit systems sizeof(arr) is 8 and sizeof(arr[0]) is 4, so n is 2, and only the first two elements get summed.
Fix: pass the size as a separate parameter.
void total(const int arr[], int n) { ... }
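A fuller sketch of the fix follows. It is one possible shape, not the only one: this version returns the sum instead of printing it so the caller can check the result, and demo_total is a hypothetical caller.

```c
/* Sums the first n elements; the caller supplies n. */
int total(const int arr[], int n)
{
    int i;
    int sum = 0;
    for (i = 0; i < n; i++) {
        sum += arr[i];
    }
    return sum;
}

/* Hypothetical caller: sizeof works here because values is a real array. */
int demo_total(void)
{
    int values[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    int n = sizeof(values) / sizeof(values[0]);   /* 10 */
    return total(values, n);
}
```

The element count is computed once, in the scope that owns the array, and then travels with the pointer.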
Strings as character arrays
There is no String class in C. A string is a char array ending with '\0' (one byte, value 0). Every string function walks forward until it hits that byte.
char greeting[] = "Hello";
In memory:
Index: [0] [1] [2] [3] [4] [5]
Value: 'H' 'e' 'l' 'l' 'o' '\0'
The array is 6 bytes: 5 characters plus the terminator. If the terminator is missing, a string function keeps reading past the end of the array.
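The "walk forward until '\0'" rule can be sketched as a hand-written length function. The name my_strlen is hypothetical; in real code you would call strlen.

```c
#include <stddef.h>

/* Hypothetical re-implementation of strlen for illustration:
 * counts bytes until the terminator, which is not counted itself. */
size_t my_strlen(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}
```

For char greeting[] = "Hello", my_strlen(greeting) is 5 while sizeof(greeting) is 6. The loop has no other stopping condition, which is why a missing terminator sends it past the array.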
For the historical reason C strings look this way (register pressure on the PDP-11, BCPL heritage), see the deep dive on C standards and string history.
<string.h> functions
#include <string.h> to use these.
char name[] = "Alice";
printf("%zu\n", strlen(name)); /* 5, not 6 */
if (strcmp(name, "Alice") == 0) { /* equal */ }
if (strcmp(name, "Bob") < 0) { /* name sorts before "Bob" */ }
char dest[50];
strcpy(dest, "Hello"); /* dest is now "Hello\0" */
char greeting[50] = "Hello, ";
strcat(greeting, "world!"); /* greeting is now "Hello, world!" */
strlen returns size_t. Print it with %zu, or cast it to int and print with %d.
strcmp returns 0 when equal, a negative value when the first sorts before the second, a positive value when it sorts after. Zero means equal: memorize it. The idiom is strcmp(a, b) == 0 for equality; writing strcmp(a, b) == 1 is non-portable (the standard only promises the sign).
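strcpy and strcat do not check destination size. Copying a 100-character string into a 50-character buffer overflows silently. Use bounded variants (strncpy, strncat, or POSIX strlcpy, strlcat) when the source is not a known literal. Be aware that strncpy does not write a terminating '\0' when the source is at least as long as the count, so terminate the buffer yourself.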
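One way to wrap strncpy so the destination is always terminated is sketched below. The helper copy_bounded and its return convention are assumptions for illustration, not a standard function.

```c
#include <string.h>

/* Hypothetical helper: copies src into dest, which holds dest_size bytes.
 * Always leaves dest terminated. Returns 1 if the whole string fit,
 * 0 if it was truncated or dest_size is 0. */
int copy_bounded(char *dest, size_t dest_size, const char *src)
{
    if (dest_size == 0) {
        return 0;
    }
    strncpy(dest, src, dest_size - 1);   /* copy at most dest_size - 1 chars */
    dest[dest_size - 1] = '\0';          /* strncpy may not terminate */
    return strlen(src) < dest_size;
}
```

With char d[6], copy_bounded(d, sizeof(d), "Hello, world") stores "Hello" and returns 0, so the caller can tell that truncation happened.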
Never use == on strings
if (a == b) /* compares two ADDRESSES, not contents */
Because an array name used in an expression becomes the address of its first element, a == b asks “do both names refer to the same memory location?” not “do they hold the same text?” Two distinct arrays holding identical text compare as not equal. This is CWE-597.
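The difference between the two questions can be sketched with a pair of small functions. The names same_address and same_text are hypothetical.

```c
#include <string.h>

/* Hypothetical helper: what == actually tests for two strings. */
int same_address(const char *a, const char *b)
{
    return a == b;              /* compares addresses only */
}

/* Hypothetical helper: what you almost always mean. */
int same_text(const char *a, const char *b)
{
    return strcmp(a, b) == 0;   /* compares contents */
}
```

Given char a[] = "hello" and char b[] = "hello", same_address(a, b) is 0 because the arrays live in different places, while same_text(a, b) is 1.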
Check your understanding (predict the output)
#include <stdio.h>
#include <string.h>
int main(void)
{
char a[] = "apple";
char b[] = "apricot";
printf("%d\n", strcmp(a, b) == 0);
printf("%d\n", strcmp(a, b) == -1);
printf("%d\n", strcmp(a, b) < 0);
return 0;
}
Answer
0
0 or 1 (depends on libc)
1
- strcmp(a, b) == 0 is false. Prints 0.
- strcmp(a, b) == -1 is the anti-pattern. It prints 1 only where strcmp returns exactly -1 for “apple” vs “apricot”. Many implementations instead return the difference of the first mismatched bytes, here 'p' - 'r' = -2, and the line prints 0. Do not write this.
- strcmp(a, b) < 0 is the portable form of “a sorts before b.” Prints 1.
Reading strings safely
scanf("%s", ...) stops at whitespace and has no upper bound on how many characters it writes unless you give it one. There are two bounded options.
Width specifier for a single word
char name[50];
scanf("%49s", name); /* at most 49 chars, leave room for '\0' */
The cap is one less than the array size so the null terminator fits. Lab 1 does not require this cap, but the reflection on CWE-120 walks through what happens without it.
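The effect of the width cap can be sketched without typing at a terminal by using sscanf, which applies the same conversion rules to a string instead of stdin. The helper first_word_len and its 10-byte buffer are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads the first word of input into a 10-byte
 * buffer and returns how many characters were stored, or -1 if the
 * input contains no word. */
int first_word_len(const char *input)
{
    char word[10];
    if (sscanf(input, "%9s", word) != 1) {   /* 9 = sizeof(word) - 1 */
        return -1;
    }
    return (int)strlen(word);
}
```

A 20-character word is cut off at 9 characters, and the tenth byte holds the terminator. The unread characters stay in the input, so a real scanf loop would see them on the next read.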
fgets for a whole line, including spaces
char line[100];
if (fgets(line, sizeof(line), stdin) != NULL) { /* reads at most 99 chars plus '\0' */
    line[strcspn(line, "\n")] = '\0'; /* strip the trailing newline if present */
}
fgets(buf, N, stream) reads at most N - 1 characters (the -1 leaves room for '\0'), stops after a newline or at EOF, and writes '\0' after the last character it stores. If the newline arrives within the limit it is stored in buf; the strcspn trick finds and overwrites it. fgets returns NULL at EOF with nothing read or on error, so check the return value before using the buffer.
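The fgets pattern can be packaged as a small function. This is a sketch under assumptions: the name read_line and its 1/0 return convention are hypothetical, and it takes the stream as a parameter so it works on files as well as stdin.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line into buf (capacity size bytes)
 * and strips the trailing newline if one was stored.
 * Returns 1 on success, 0 on EOF or error. */
int read_line(char *buf, int size, FILE *stream)
{
    if (fgets(buf, size, stream) == NULL) {
        return 0;                        /* nothing read; buf is not usable */
    }
    buf[strcspn(buf, "\n")] = '\0';      /* no-op when there is no newline */
    return 1;
}
```

A typical call is read_line(line, sizeof(line), stdin). When a line is longer than the buffer, the remainder stays in the stream and comes back on the next call.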
Never use gets
char buf[100];
gets(buf); /* NEVER. Removed from C11. */
gets reads until a newline with no upper bound. The 1988 Morris Worm, the first internet-scale worm, exploited a gets call in fingerd. C11 removed gets from the standard entirely. If you see it in legacy code, file a ticket. For the exploit mechanics (attacker-controlled bytes overwriting boarding_zone or a return address), see the CWE-120 walkthrough in the memory-safety deep-dive.
Java vs. C, and what comes next
| Operation | Java | C |
|---|---|---|
| Declare | String s = "hello"; | char s[50] = "hello"; |
| Length | s.length() | strlen(s) |
| Compare | s.equals("hello") | strcmp(s, "hello") == 0 |
| Copy | s2 = s; | strcpy(s2, s); |
| Concatenate | s + " world" | strcat(s, " world"); (pre-sized buffer) |
| Read line | scanner.nextLine() | fgets(s, sizeof(s), stdin); |
Java’s String is an immutable object with methods. A C string is a mutable char array plus a pointer convention. You manage the buffer, you check the bounds, you track the null terminator.
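"You check the bounds" can be made concrete for concatenation with a sketch that measures before it appends. The helper append_checked is hypothetical and assumes dest already holds a terminated string.

```c
#include <string.h>

/* Hypothetical helper: appends src to the string in dest only if the
 * result, including '\0', fits in dest_size bytes.
 * Returns 1 if appended, 0 if dest was left unchanged. */
int append_checked(char *dest, size_t dest_size, const char *src)
{
    size_t used = strlen(dest);
    size_t need = strlen(src);
    if (used + need + 1 > dest_size) {   /* +1 for the terminator */
        return 0;
    }
    strcat(dest, src);                   /* safe: the space was checked */
    return 1;
}
```

With char s[12] = "hello", appending " world" succeeds because 5 + 6 + 1 is exactly 12. Appending one more character is refused and s keeps its contents.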
- B is wrong: inside a function, sizeof(arr) gives the size of a pointer. Pass the length separately.
- E is wrong: gets was removed from the C standard in C11. Trusting the user does not make an unbounded read safe.
Next, Headers, Makefiles & CLI Args covers splitting a program across multiple files with headers, how extern shares a declaration, and how argc / argv work.