How Do I Work with Strings as Character Arrays?
Null terminators, string.h, and the biggest difference from Java
After this lesson, you will be able to:
- Explain that C strings are char arrays terminated by '\0'
- Use strlen, strcmp, strcpy, and strcat from <string.h>
- Explain why == compares addresses for strings and use strcmp instead
- Use fgets for safe string input and strip the trailing newline
- Identify buffer overflow risks with strcpy, strcat, and scanf("%s")
- Use <ctype.h> functions to test and convert individual characters
There Is No String Type
In Java, you write:
String name = "Alice";
if (name.equals("Alice")) { ... }
String greeting = "Hello, " + name;
Clean, safe, and object-oriented. Now here’s the C equivalent:
char name[50] = "Alice";
if (strcmp(name, "Alice") == 0) { ... }
// String concatenation? It's... complicated.
C has no String class. A “string” in C is just an array of char that ends with a special byte: '\0' (the null terminator). Every string function — strlen, strcmp, printf("%s") — works by scanning forward until it hits that '\0'. If it’s missing, your program reads off into random memory.
Character Arrays and String Functions
How C Strings Work
When you write:
char greeting[] = "Hello";
C stores this in memory:
Index: [0] [1] [2] [3] [4] [5]
Value: 'H' 'e' 'l' 'l' 'o' '\0'
The array is 6 bytes — 5 characters plus the null terminator. The '\0' is ASCII value 0, and it marks the end of the string.
Key Insight: A C string is just an array of characters ending with '\0' (the null terminator, ASCII value 0). Every string function (strlen, strcmp, printf("%s")) scans forward until it finds that '\0'. Without it, functions read past the end of your array into garbage memory.
String Library Functions (<string.h>)
strlen — string length (not counting '\0'):
char name[] = "Alice";
printf("%zu\n", strlen(name)); // 5 (not 6!)
strcmp — compare two strings:
if (strcmp(name, "Alice") == 0) // Equal
if (strcmp(name, "Bob") < 0) // name sorts before "Bob" (compared byte by byte, so 'Z' sorts before 'a')
if (strcmp(name, "Aaron") > 0) // name comes after "Aaron"
Common Pitfall: NEVER use == to compare strings in C. if (str1 == str2) compares memory addresses, not contents. Two identical strings stored in different locations would compare as “not equal.” Always use strcmp(str1, str2) == 0.
What does strlen("Hello") return?
strlen counts characters until it hits '\0', but does not count the null terminator itself. The string "Hello" occupies 6 bytes in memory (H-e-l-l-o-\0), but strlen returns 5. This distinction matters when allocating buffer space: you always need strlen + 1 bytes to store a string.
strcpy — copy one string to another:
char dest[50];
strcpy(dest, "Hello"); // dest is now "Hello"
strcat — concatenate (append) strings:
char greeting[50] = "Hello, ";
strcat(greeting, "world!"); // greeting is now "Hello, world!"
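Common Pitfall: Neither strcpy nor strcat checks whether the destination buffer is large enough. Copying a 100-character string into a 50-byte buffer writes past the end of the array with no warning. Always ensure your destination is big enough, or use the bounded strncpy/strncat variants, keeping in mind that strncpy does not write a '\0' when the source fills the whole limit, so you must terminate the result yourself.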
Reading Strings from Users
scanf("%s") — reads one word (stops at whitespace):
char word[50];
scanf("%s", word); // Note: no & needed (array name IS an address)
Common Pitfall: scanf("%s") has no way to know how large your buffer is. If the user types 1000 characters into a 50-byte buffer, scanf writes past the end, causing a buffer overflow. Use fgets for safe input.
fgets — reads a full line (safe):
char line[100];
fgets(line, sizeof(line), stdin);
fgets reads at most sizeof(line) - 1 characters, always adds '\0', and includes the newline if the line fits. But that trailing \n is often unwanted:
// Strip the trailing newline
line[strcspn(line, "\n")] = '\0';
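The Trick: Use fgets + newline stripping for safe line reading. scanf("%s") stops at whitespace and has no overflow protection. The strcspn approach cleanly removes the trailing newline: strcspn returns the index of the first '\n', or the length of the string if there is none, so the assignment is safe either way.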
Consider the declaration char name[5] = "Alice"; What's wrong with this?
"Alice" requires 6 bytes: 5 for the characters plus 1 for the null terminator '\0'. A char[5] array can only hold 5 bytes. C accepts the declaration, but it silently drops the null terminator, so name holds the five letters and nothing marks where the string ends. Without the null terminator, string functions like strlen and printf("%s") will read past the array into garbage memory. Use char name[6] or char name[] = "Alice"; (the compiler figures out the size).
Why does this matter?
Off-by-one errors with string buffers are one of the most common C bugs. Every string needs one extra byte for the null terminator. When you allocate buffers for user input, always account for it: a 50-character input needs a char[51] at minimum.
Deep dive: Why gets() was removed from C — a real security disaster
C once had a function called gets() that read a line from stdin:
char buffer[100];
gets(buffer); // NEVER USE THIS — removed from C11
The problem: gets() had no way to specify buffer size. A user could type 10,000 characters and overflow any buffer. This was so dangerous that it was the attack vector for the 1988 Morris Worm — one of the first internet worms, which exploited gets() in the Unix fingerd daemon to gain unauthorized access to thousands of machines.
The function was deprecated in C99, removed entirely in C11, and is classified as CWE-242 (Use of Inherently Dangerous Function).
Always use fgets() instead:
char buffer[100];
fgets(buffer, sizeof(buffer), stdin); // Safe — reads at most 99 chars + '\0'
The general rule: any function that writes to a buffer without a size limit (gets, strcpy, sprintf, scanf("%s")) is a potential buffer overflow. Prefer the bounded versions (fgets, strncpy, snprintf, scanf("%49s")).
| Dangerous | Safe alternative | Why |
|---|---|---|
| gets(buf) | fgets(buf, size, stdin) | No size limit vs. bounded read |
| strcpy(d, s) | strncpy(d, s, size) | No overflow check vs. bounded copy |
| sprintf(d, ...) | snprintf(d, size, ...) | No overflow check vs. bounded write |
| scanf("%s", buf) | scanf("%49s", buf) or fgets | No size limit vs. field width limit |
Character Functions (<ctype.h>)
Test and convert individual characters:
#include <ctype.h>
char ch = 'a';
if (isalpha(ch)) { ... } // Is it a letter?
if (isdigit(ch)) { ... } // Is it a digit?
if (isupper(ch)) { ... } // Is it uppercase?
char upper = toupper(ch); // 'A'
char lower = tolower('B'); // 'b'
Java vs. C String Comparison
| Operation | Java | C |
|---|---|---|
| Declare | String s = "hello"; | char s[50] = "hello"; |
| Length | s.length() | strlen(s) |
| Compare | s.equals("hello") | strcmp(s, "hello") == 0 |
| Copy | s2 = s; (reference copy) | strcpy(s2, s); |
| Concatenate | s + " world" | strcat(s, " world"); |
| Read line | scanner.nextLine() | fgets(s, size, stdin); |
| Char at index | s.charAt(i) | s[i] |
From Java: Java’s String is an immutable object with methods. C’s “string” is a mutable char array with library functions. There’s no garbage collection, no automatic resizing, no bounds checking. You manage the buffer, you check the bounds, you track the null terminator.
What does if (str1 == str2) actually compare when str1 and str2 are char arrays?
str1 == str2 checks whether both arrays start at the same memory location, which is almost never what you want. Two arrays containing "Hello" at different locations would compare as "not equal." Use strcmp(str1, str2) == 0 to compare actual string contents.
Complete Example: Name Processor
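#include <stdio.h>
#include <string.h>
#include <ctype.h>
// Uppercase the first character of str, if it has one.
void capitalize(char str[])
{
    if (str[0] != '\0')
    {
        // Cast to unsigned char: passing a negative char to toupper is undefined.
        str[0] = (char)toupper((unsigned char)str[0]);
    }
}
int main(void)
{
    char first[50];
    char last[50];
    // 102 bytes: enough for the worst case of 49 + 2 + 49 characters plus '\0'.
    char full[sizeof(first) + sizeof(last) + 2];
    printf("Enter first name: ");
    if (fgets(first, sizeof(first), stdin) == NULL)
    {
        return 1; // end of input or read error: first was never filled in
    }
    first[strcspn(first, "\n")] = '\0';
    printf("Enter last name: ");
    if (fgets(last, sizeof(last), stdin) == NULL)
    {
        return 1;
    }
    last[strcspn(last, "\n")] = '\0';
    capitalize(first);
    capitalize(last);
    strcpy(full, last); // safe only because full is sized for the worst case
    strcat(full, ", ");
    strcat(full, first);
    printf("Full name: %s\n", full);
    printf("Name length: %zu characters\n", strlen(full));
    return 0;
}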
Quick Check: What does the null terminator '\0' do?
It marks the end of a string. Every string function (strlen, strcmp, printf("%s")) scans characters until it encounters '\0'. Without it, these functions read past the array into undefined memory.
Quick Check: Why is if (str1 == str2) wrong for comparing strings?
== compares memory addresses, not string contents. Two identical strings stored at different memory locations would compare as “not equal.” Use strcmp(str1, str2) == 0 to compare contents.
Quick Check: Why is fgets safer than scanf("%s")?
fgets takes a size parameter and stores at most size - 1 characters plus the '\0', so it cannot write past the end of the buffer. scanf("%s") with no field width has no size limit: it reads until whitespace, potentially writing far past the end of your buffer.
Strings Are Where C Gets Real
The gap between Java and C is widest with strings. Java gives you an immutable, bounds-checked, garbage-collected String object. C gives you a mutable array of bytes with a null terminator and no safety net. This is harder, but it’s also how strings actually work at the hardware level — and understanding this makes you a stronger programmer in any language.
Next: sorting and searching. You’ll implement selection sort on arrays and use it to find medians — building the algorithms by hand that Java’s Arrays.sort() hides from you.
Big Picture: Strings are where many C security vulnerabilities originate. Buffer overflows from strcpy, strcat, and scanf("%s") have caused some of the worst security breaches in computing history. Understanding null terminators and buffer sizes isn’t academic; it’s the foundation of writing secure C code. Every string function you use from here on should make you ask: “Is my buffer big enough?”