Pointer Basics
Addresses, the & and * operators, and the three states every pointer is in
Based on content from Dr. Stu Steiner, Eastern Washington University.
In a nutshell
In CSCD 210 you wrote Scanner sc = new Scanner(System.in); and sc secretly held a reference, an invisible address that pointed at the Scanner object on the heap. Java never let you see that address. C does. A pointer is a variable that holds the address of another object. You get an address with the unary & operator, you declare storage for an address with T *p, and you reach through the address back to the object with unary *. By the end of this lesson you can read int *p = &x; *p = 42; out loud: “p is a pointer to an int, initialized to the address of x; writing through p stores 42 at that address, so x is now 42.”
After this lesson, you will be able to:
- Explain what an address is and why every variable has one
- Use & to read an address and * to follow one
- Declare pointer variables of different types and say what each row of the type table controls
- Distinguish declaration * from expression *, and & from bitwise &
- Identify whether a pointer is valid, NULL, or indeterminate, and know which of those is safe to dereference
Quick reference
| Operator / construct | Read as | Example |
|---|---|---|
| &x | address of x | &x has type int * when x is int |
| T *p | declaration: p is a pointer to T | int *p; reserves storage for one address |
| *p | expression: the object p points to | *p = 42; writes 42 into that object |
| NULL | the “points to no object” value | int *p = NULL; safe default |
| %p | printf conversion for an address | printf("%p\n", (void *)&x); |
Coming from CSCD 210
Java gave you references to objects and primitives by value, and refused to let you mix the two. You could not take the address of an int, print the address of a String, or write a method that reassigned a caller’s variable. C lifts all three restrictions. The cost: you now have to keep track of whether a variable is an object, an address, or the object reached through an address, because the compiler will not stop you from mixing them up.
Every variable has an address
Every variable lives in memory, at a specific address. An address is just a number that identifies a byte in your process’s memory. The unary address-of operator & reads that number for you:
#include <stdio.h>
int main(void)
{
int x = 42;
printf("x = %d\n", x);
printf("&x = %p\n", (void *)&x);
return 0;
}
x = 42
&x = 0x7ffd1a2b4c50
Run it twice and the &x line changes. Linux randomizes the stack's location at process startup (Address Space Layout Randomization), so the exact number differs from run to run. Inside a single run the layout is stable: &x stays the same for as long as x is alive, and if the compiler places int y; right next to int x;, &y is four bytes away from &x every time.
Size, bytes, and the %p convention
On the lab machines (Linux x86-64):
| Type | Size |
|---|---|
| char | 1 byte |
| int | 4 bytes |
| double | 8 bytes |
| Any pointer T * | 8 bytes |
An int occupies four contiguous bytes. The address of a variable is the address of its first byte, and the type tells the compiler how many bytes to read. When you print an address with printf, use %p and cast the argument to void *:
printf("%p\n", (void *)&x);
%p is the portable pointer-printing conversion (ISO/IEC 9899:2018 §7.21.6.1). It expects a void * argument, which is why the cast is there. gcc reports a concrete pointer type passed without the cast when -Wpedantic is added to -Wall.
Three rules for &
- & applies only to named storage (variables, array elements, struct members). &x is legal; &42 and &(x + 1) are not, because those are computed values that never had an address.
- If x has type T, then &x has type T *. & adds one level of indirection to the type.
- The unary & is unrelated to the binary & (bitwise AND). Same character, different operator, disambiguated by context: a & b is bitwise AND, &a is address-of.
Check your understanding (fill in the blank)
int n = 7;
int *p = &n;
The type of &n is _______. The type of p is _______. The value stored in p is the _______ of n.
Reveal answer
&n has type int *. p has type int *. The value stored in p is the address of n. “Address” is not itself a type; the type is int * (“pointer to int”), and the value is the particular address where n lives.
Declaring and dereferencing pointers
T *p means “*p is a T”
The declaration
int *p;
is read with K&R’s mnemonic: “*p is an int.” The * binds to the variable name, not to the type. One common trap:
int *p, q; /* p is int *, q is plain int. NOT two pointers. */
To declare two pointers, write int *p, *q;, or put each on its own line.
Every pointer variable carries two independent facts: the 8-byte address it holds, and the type of what that address points at. The type controls three things.
| Pointer type | *p reads | p + 1 advances by | Assign to / from |
|---|---|---|---|
| char * | 1 byte | 1 byte | other char * only |
| int * | 4 bytes | 4 bytes | other int * only |
| double * | 8 bytes | 8 bytes | other double * only |
| int ** | 8 bytes (as an int *) | 8 bytes | other int ** only |
| void * | illegal to dereference | illegal arithmetic | any object pointer |
void * is the generic address type. It is what malloc returns, and it is the one pointer type that converts to and from other object pointer types without a cast. It cannot be dereferenced directly, because the compiler has no type, and therefore no size, to read.
Reaching the object: unary *
Unary * is the dereference (or indirection) operator. Apply it to a value of type T * and you get the T at that address:
int x = 10;
int *p = &x;
int y = *p; /* READ through p. y becomes 10. */
*p = 42; /* WRITE through p. x is now 42. */
*p += 1; /* READ-MODIFY-WRITE. x is now 43. */
*p is an lvalue: it names storage, so it can appear on either side of =. On the right it reads; on the left it writes. That read/write duality is what makes the next lesson’s pass-by-pointer pattern work.
The two faces of *
The * character means different things in different grammatical positions.
| Position | Meaning | Example |
|---|---|---|
| Declaration | “pointer to” (type syntax) | int *p; reserves a pointer variable |
| Expression | dereference (runtime operator) | *p = 42; writes through the pointer |
| Arithmetic | multiplication | n * sizeof(int) |
One line can carry all three:
int *p = malloc(n * sizeof(*p));
The first * is declaration syntax, the * inside sizeof(*p) is expression dereference (unevaluated, used only for its type), and the n * is multiplication. Three different operators sharing one character.
Check your understanding (predict the output)
#include <stdio.h>
int main(void)
{
int a = 3;
int b = 5;
int *p = &a;
*p = *p + b;
p = &b;
*p = *p * 2;
printf("a = %d, b = %d\n", a, b);
return 0;
}
Reveal answer
Prints a = 8, b = 10.
- *p = *p + b; writes through p into a: a becomes 3 + 5 = 8.
- p = &b; retargets p to point at b instead. a and b are unchanged.
- *p = *p * 2; writes through the new p: b becomes 5 * 2 = 10.
The pointer variable p and the object *p are separate things. Changing p (retargeting) is not the same as changing *p (writing through it).
Every & on the outside of an expression adds one * to its type. Every * strips one off. That one rule turns every pointer question into a two-column problem (type, value). See the pointer type algebra deep dive for a full worked-through set of examples.
For the full machine-level picture of where x lives on the stack, what &x resolves to in the ELF layout, and why the same program prints different addresses each run, see the machine model deep dive.
Valid, NULL, indeterminate: the three states of a pointer
At any moment, a pointer variable is in one of three states. Only one of them is safe to dereference.
1. Valid. Holds the address of a live object of the matching type. Dereference is defined and does exactly what you expect.
int x = 10;
int *p = &x; /* valid */
*p = 42; /* safe */
2. NULL. Holds the null pointer value. Dereference is undefined behavior. On Linux, the vm.mmap_min_addr setting (/proc/sys/vm/mmap_min_addr) keeps the zero page unmapped, so in practice a NULL dereference traps with SIGSEGV instead of silently corrupting data.
int *p = NULL;
if (p != NULL) {
*p = 42; /* not reached */
}
3. Indeterminate. Declared inside a function without an initializer. The bytes of p are whatever happened to be in that stack slot (ISO/IEC 9899:2018 §6.7.9). Not NULL. Not zero. Dereferencing is undefined behavior, and the compiler is entitled to assume no execution path reaches it and optimize the code accordingly.
void f(void)
{
int *p; /* INDETERMINATE, not NULL */
*p = 5; /* undefined behavior */
}
Defensive style
- Initialize every automatic-storage pointer on the line that declares it. Either to NULL or to a real address.
- Check library returns that can fail (malloc, calloc, realloc, fopen, getenv) against NULL before you use them.
- Write NULL, not 0 or (void *)0, for pointer context. It compiles to the same thing but reads as a pointer.
- Enable -Wall -Wextra and fix the warnings. gcc’s -Wuninitialized catches the common cases.
int *p = malloc(n * sizeof(*p));
if (p == NULL) {
fprintf(stderr, "out of memory\n");
return 1;
}
/* from here on, p is valid; dereference is safe */
p[0] = 42;
Why this is not a Java NullPointerException
Java zero-initializes object fields and throws a clean NullPointerException when you dereference null. C does neither. An uninitialized pointer contains garbage; dereferencing it may crash immediately, may silently corrupt an unrelated variable, or may appear to work until ten function calls later. There is no JVM to catch the mistake. The discipline that replaces Java’s runtime check is the four bullets above.
Check your understanding (what is wrong?)
#include <stdio.h>
int *make_five(void)
{
int x = 5;
return &x;
}
int main(void)
{
int *p = make_five();
printf("%d\n", *p);
return 0;
}
Reveal answer
x is a local variable inside make_five. It lives on make_five’s stack frame, which is reclaimed the moment make_five returns. The returned address still looks valid, but the storage underneath it is no longer alive, and any later function call is free to reuse those bytes. Dereferencing *p in main is undefined behavior: sometimes it prints 5, sometimes garbage, sometimes crashes under optimization.
The fix is either to return by value (int make_five(void) { return 5; }) or to allocate on the heap with malloc and return that address; that is Thursday’s lecture.
What comes next
- C is wrong: the * binds to the name, so q is a plain int. Write int *p, *q; for two pointers.
- E is wrong: C has no exceptions. A NULL dereference on Linux raises SIGSEGV because the zero page is unmapped, but the standard calls it undefined behavior, not an exception.
Next, Pass-by-Pointer uses the mechanism you just learned to fix the broken swap function and to explain the & in scanf("%d", &x). Drill this page with the Pointers skill card, or browse the practice gallery.