Pointer Arithmetic & Arrays
p + n scales by sizeof(*p), arr[i] is *(arr + i), and array parameters decay
Based on content from Dr. Stu Steiner, Eastern Washington University.
In a nutshell
When you write p + 1 in C, the address does not move by one byte. It moves by one element: sizeof(*p) bytes. For an int * on x86-64, that is four bytes. This scaling rule is the reason arrays and pointers fit together so cleanly. The C standard defines a[i] as *(a + i), so a subscript is pointer arithmetic in disguise. In most expressions an array name decays into a pointer to its first element, which is why you can pass arr to a function and index through the pointer on the other side. That same arithmetic is unchecked, which is what made Heartbleed (CVE-2014-0160) possible: C does not bounds-check pointer arithmetic, so an attacker-controlled length walks right past the end of your buffer.
Practice this topic: Pointer Arithmetic drill, or browse the practice gallery.
After this lesson, you will be able to:
- Compute the byte-accurate result of p + n for any pointer type
- Explain why a[i] and *(a + i) are defined as the same expression
- Walk an array with either a subscript or a pointer, and translate between the two
- Say why sizeof(arr) inside a function returns 8, not the array’s total bytes
- Parse *p + 1, *(p + 1), *p++, and (*p)++ correctly, and know why they differ
Quick reference
| Expression | Parses as | Type | Effect |
|---|---|---|---|
| p + n | p + n | T * | advance by n * sizeof(T) bytes |
| a[i] | *(a + i) | T | element at index i (language definition) |
| *p + 1 | (*p) + 1 | T | integer add on the dereferenced value |
| *(p + 1) | *(p + 1) | T | pointer arithmetic, then dereference |
| *p++ | *(p++) | T | read *p, then advance p |
| (*p)++ | (*p)++ | T | post-increment the pointed-to object |
Coming from CSCD 210
In Java an int[] is an object. It carries its length in arr.length, and every access goes through a bounds check that throws ArrayIndexOutOfBoundsException on overflow. In C an array is a run of bytes with a name. The name decays into a pointer in almost every expression, and once it has decayed there is no length attached. No bounds check happens at runtime, so arr[100] on a size-10 array reads whatever is 400 bytes past the start. Every function that takes an array has to also take a length parameter, because the pointer alone cannot tell the callee how many elements are there.
The scaling rule
Memory is byte-addressable. Every byte has its own address. An int on the lab machines (x86-64 Linux) occupies four bytes. So an int array at address 0x7540 lays out like this:
| Element | Address | Bytes |
|---|---|---|
| arr[0] | 0x7540 | 0x7540–0x7543 |
| arr[1] | 0x7544 | 0x7544–0x7547 |
| arr[2] | 0x7548 | 0x7548–0x754B |
| arr[3] | 0x754C | 0x754C–0x754F |
The gap between elements is four bytes, not one. C’s pointer arithmetic accounts for that automatically: for any pointer p of type T *,
p + n == (byte address of p) + n * sizeof(T)
So if arr lives at 0x7540:
arr + 0 -> 0x7540 /* type int * */
arr + 1 -> 0x7544 /* 0x7540 + 1*4 */
arr + 2 -> 0x7548 /* 0x7540 + 2*4 */
arr + 3 -> 0x754C /* 0x7540 + 3*4 */
arr + 4 -> 0x7550 /* legal to form; illegal to dereference */
The last line is the one-past-the-end pointer. The C standard guarantees you can form it and compare against it; dereferencing it is undefined behavior. That one-past-the-end pointer is what powers the idiomatic for (int *p = arr; p < arr + n; p++) loop.
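As a minimal sketch of that idiom, the two functions below walk an array with a pointer that stops at the one-past-the-end address. The names sum_ints and index_of are hypothetical helpers invented for this illustration; they are not part of the course materials.

```c
#include <stddef.h>

/* Sum n ints by walking a pointer up to the one-past-the-end address. */
long sum_ints(const int *arr, size_t n)
{
    const int *end = arr + n;   /* legal to form, never dereferenced */
    long total = 0;
    for (const int *p = arr; p < end; p++) {
        total += *p;            /* p < end guarantees p is in bounds */
    }
    return total;
}

/* Return the index of the first element equal to key, or n if absent. */
size_t index_of(const int *arr, size_t n, int key)
{
    const int *p = arr;
    while (p < arr + n && *p != key) {
        p++;
    }
    return (size_t)(p - arr);   /* pointer difference counts elements, not bytes */
}
```

For int a[4] = {10, 20, 30, 40}, sum_ints(a, 4) yields 100 and index_of(a, 4, 99) yields 4, the one-past-the-end index, without ever dereferencing a + 4.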
Canonical sizes on the lab machines
| Type | sizeof |
|---|---|
| char | 1 |
| int | 4 |
| long | 8 |
| double | 8 |
| any T * | 8 |
These are not the ISO C guarantees (the standard only promises minimum ranges); they are what gcc on Linux x86-64 produces, which is what every CSCD 240 quiz will assume. When you trace pointer arithmetic on paper, always show the scale factor:
base + n * sizeof(T) = result
If the pointer is int *, the scale is 4. If it is double *, long *, or any pointer-to-pointer, the scale is 8. void * cannot be incremented in standard C because it has no element size to scale by.
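One way to check the scale factor on a real machine is to measure how many bytes p + 1 actually moves by viewing both addresses as char *. The sketch below assumes nothing beyond standard C; the DEFINE_STEP macro and the step_* function names are hypothetical, introduced only for this example.

```c
#include <stddef.h>

/* Hypothetical helper: define a function that reports how many bytes
   p + 1 moves when p has type T *. */
#define DEFINE_STEP(T, name)                              \
    size_t name(void)                                     \
    {                                                     \
        T pair[2];          /* room for p and p + 1 */    \
        T *p = pair;                                      \
        return (size_t)((char *)(p + 1) - (char *)p);     \
    }

DEFINE_STEP(char, step_char)        /* 1 on every platform */
DEFINE_STEP(int, step_int)          /* 4 on the lab machines */
DEFINE_STEP(long, step_long)        /* 8 on the lab machines */
DEFINE_STEP(double, step_double)    /* 8 on the lab machines */
DEFINE_STEP(int *, step_int_ptr)    /* 8 on the lab machines */
```

Each function returns exactly sizeof(T) for its element type, which is the scaling rule stated in code. The numbers in the comments are the x86-64 Linux values from the table above, not ISO C guarantees.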
Check your understanding (fill in the blank)
long data[4]; /* lives at 0xA000 */
long *p = data;
p + 2 has type ______ and value 0x______. p + 4 has type ______ and value 0x______; dereferencing it is __.
Reveal answer
p + 2 has type long * and value 0xA010 (0xA000 + 2 * sizeof(long) = 0xA000 + 16 = 0xA010). p + 4 has type long * and value 0xA020 (0xA000 + 4 * 8 = 0xA020). It is the one-past-the-end pointer; forming it is legal, dereferencing it is undefined behavior.
Arrays and pointers are the same expression
a[i] is defined as *(a + i)
This is the language definition, not an implementation detail. ISO/IEC 9899:2018 §6.5.2.1 paragraph 2: “The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))).” Three consequences:
- a[i] and *(a + i) produce the same value, the same type, and the same machine code. You can mix the notations freely.
- Because addition commutes, a[i] is the same as i[a]. 5[arr] is a legal (if ugly) way to write arr[5]. Do not write it in production, but know why it is legal: [] is pointer arithmetic.
- Indexing works identically on stack arrays, malloc-ed blocks, and function parameters, because the compiler emits the same arithmetic regardless of where the storage lives.
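The three consequences above can be seen side by side in a short sketch. The function names are hypothetical; each one reads the same element using a different spelling of the same expression.

```c
#include <stddef.h>

/* Ordinary subscript. */
int read_subscript(const int *a, size_t i) { return a[i]; }

/* Explicit pointer arithmetic: what a[i] is defined to mean. */
int read_pointer(const int *a, size_t i)   { return *(a + i); }

/* Swapped operands: legal because a + i and i + a are the same sum.
   Shown only to illustrate the definition; do not write this in real code. */
int read_swapped(const int *a, size_t i)   { return i[a]; }
```

All three return the same value for any in-bounds index, whether a points into a stack array, a malloc-ed block, or a function parameter.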
Array-name decay
When you declare int arr[5], the name arr has type “array of 5 int.” In almost every expression, C silently converts it to int *, pointing at &arr[0]. This is array-to-pointer decay (§6.3.2.1 paragraph 3).
Three contexts where decay does not happen:
- sizeof(arr) returns the size of the whole array, not the size of a pointer.
- &arr yields a pointer of type int (*)[5], the address of the array as a whole object.
- A string literal used to initialize an array: char s[] = "hello"; copies the literal rather than decaying it.
Everywhere else (arr + 1, arr[i], printf("%p", arr), foo(arr), arr == other) the name arr is an int * pointing at the first element.
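A rough way to observe the difference between the decayed pointer and &arr is to compare how far + 1 moves each of them. This is a sketch with hypothetical function names; it relies only on the fact that forming a one-past-the-end address is legal.

```c
#include <stddef.h>

/* arr decays to int *, so arr + 1 moves by one element. */
size_t step_of_decayed(void)
{
    int arr[5];
    return (size_t)((char *)(arr + 1) - (char *)arr);
}

/* &arr has type int (*)[5], so &arr + 1 moves by the whole array. */
size_t step_of_whole_array(void)
{
    int arr[5];
    return (size_t)((char *)(&arr + 1) - (char *)&arr);
}

/* char s[] = "hello" copies the literal into a writable 6-byte array. */
int literal_copy_is_writable(void)
{
    char s[] = "hello";
    s[0] = 'j';                               /* legal: s is our own copy */
    return s[0] == 'j' && sizeof(s) == 6;     /* five letters plus '\0' */
}
```

On the lab machines the first function returns 4 and the second returns 20: same starting address, different pointer types, different scale factors.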
sizeof(arr) versus sizeof(p)
Inside the scope that declared the array, sizeof tells the truth:
int arr[10];
size_t count = sizeof(arr) / sizeof(arr[0]); /* 40 / 4 == 10 */
Cross a function boundary, and the story changes. ISO §6.7.6.3 paragraph 7 says: “A declaration of a parameter as ‘array of type’ shall be adjusted to ‘qualified pointer to type.’” All three of the following declare the same function:
void f(int arr[]); /* looks like an array parameter */
void f(int arr[10]); /* the 10 is ignored by the compiler */
void f(int *arr); /* what the compiler actually sees */
The [10] is decoration. Inside f, the parameter is int *, and sizeof(arr) is 8, not 40.
void print_all(int arr[])
{
size_t n = sizeof(arr) / sizeof(arr[0]); /* 8 / 4 == 2. WRONG. */
for (size_t i = 0; i < n; i++) {
printf("%d\n", arr[i]);
}
}
The sizeof(arr) / sizeof(arr[0]) idiom only works inside the scope that declared the array. Across a function boundary, pass the length as a separate parameter.
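One common pattern, sketched here under the assumption that the caller still holds a real array, is to compute the element count at the call site and hand it to the callee. ARRAY_LEN, max_of, and max_of_scores are hypothetical names chosen for this illustration.

```c
#include <stddef.h>

/* Hypothetical helper macro: valid only where arr still has array type. */
#define ARRAY_LEN(arr) (sizeof(arr) / sizeof((arr)[0]))

/* The callee sees only a pointer, so it receives the length explicitly.
   Assumes n >= 1. */
int max_of(const int *arr, size_t n)
{
    int best = arr[0];
    for (size_t i = 1; i < n; i++) {
        if (arr[i] > best) {
            best = arr[i];
        }
    }
    return best;
}

/* The caller owns the array, so sizeof still reports its full size here. */
int max_of_scores(void)
{
    int scores[] = {72, 95, 88, 61};
    return max_of(scores, ARRAY_LEN(scores));   /* length computed before decay */
}
```

The macro silently gives the wrong answer if it is ever applied to a pointer parameter, which is why it belongs in the declaring scope only.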
Check your understanding (predict the output)
#include <stdio.h>
void inspect(int arr[20])
{
printf("inside: %zu\n", sizeof(arr));
}
int main(void)
{
int a[20];
printf("outside: %zu\n", sizeof(a));
inspect(a);
return 0;
}
Reveal answer
Prints:
outside: 80
inside: 8
Inside main, a still has array type, so sizeof(a) is 20 * sizeof(int) = 80. Inside inspect, the int arr[20] parameter has already been adjusted to int *arr; the [20] is cosmetic. sizeof(arr) is sizeof(int *) = 8.
The fix is to pass the length: void inspect(int *arr, size_t n).
For the full machine-level story of array layout, &arr vs arr, and how a compiler lowers arr[i] to an address plus offset, see the machine model deep dive.
Precedence traps
The precedence table has 15 levels. Today you need four rows.
| Precedence | Operators | Associativity |
|---|---|---|
| Highest (postfix) | ++ -- (postfix), [], () | left-to-right |
| Next (prefix/unary) | ++ -- (prefix), unary *, & | right-to-left |
| Additive | binary +, binary - | left-to-right |
| Relational | < > <= >= (equality == != sits one level lower) | left-to-right |
Two facts fall out. Unary operators bind tighter than binary +, so *p + 1 is (*p) + 1, not *(p + 1). Postfix binds tighter than prefix, so *p++ is *(p++), not (*p)++.
*p + 1 versus *(p + 1)
With int arr[] = {10, 20, 30, 40} at 0x7540 and int *p = arr;:
| Expression | Parse | Type | Value |
|---|---|---|---|
| *p + 1 | (*p) + 1 | int | 10 + 1 = 11 |
| *(p + 1) | *(p + 1) | int | *0x7544 = 20 |
Same operators, different parentheses, different type-value pair. *p + 1 dereferences first, then does integer addition. *(p + 1) does pointer arithmetic first (scaled by sizeof(int)), then dereferences.
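The two parses can be checked directly with a pair of one-line functions. This is only a sketch; the function names are hypothetical.

```c
/* (*p) + 1: dereference first, then integer addition. */
int deref_then_add(const int *p)
{
    return *p + 1;
}

/* *(p + 1): pointer arithmetic first, then dereference.
   The caller must guarantee that p + 1 is still inside the array. */
int step_then_deref(const int *p)
{
    return *(p + 1);
}
```

With int arr[] = {10, 20, 30, 40} and p pointing at arr[0], the first returns 11 and the second returns 20, matching the table.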
*p++ versus (*p)++
| Expression | Parse | What it does |
|---|---|---|
| *p++ | *(p++) | read *p, then advance p by one element |
| *++p | *(++p) | advance p first, then read *p |
| (*p)++ | (*p)++ | post-increment the object p points to; p unchanged |
| ++*p | ++(*p) | pre-increment the object p points to; p unchanged |
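To make the four rows concrete, the sketch below applies each expression to a pointer that starts at arr[0] and reports three things: the value the expression produced, how far p moved, and what arr[0] holds afterwards. The struct and function names are hypothetical, and every function assumes arr has at least two elements.

```c
#include <stddef.h>

struct step_result {
    int value;         /* value the expression produced */
    ptrdiff_t moved;   /* how many elements p advanced */
    int first;         /* arr[0] after the expression ran */
};

struct step_result read_then_advance(int *arr)      /* *p++ */
{
    int *p = arr;
    int v = *p++;
    return (struct step_result){ v, p - arr, arr[0] };
}

struct step_result advance_then_read(int *arr)      /* *++p */
{
    int *p = arr;
    int v = *++p;
    return (struct step_result){ v, p - arr, arr[0] };
}

struct step_result post_increment_object(int *arr)  /* (*p)++ */
{
    int *p = arr;
    int v = (*p)++;
    return (struct step_result){ v, p - arr, arr[0] };
}

struct step_result pre_increment_object(int *arr)   /* ++*p */
{
    int *p = arr;
    int v = ++*p;
    return (struct step_result){ v, p - arr, arr[0] };
}
```

Starting from a fresh {10, 20, 30} each time, the results are (10, 1, 10), (20, 1, 10), (10, 0, 11), and (11, 0, 11) respectively: the first two move the pointer, the last two modify the object.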
K&R §5.5 uses *p++ in the canonical string copy:
while ((*dst++ = *src++) != '\0') { }
Every iteration reads one byte from src, writes it to dst, and advances both pointers. The assignment happens before the comparison, so when the byte read is the \0 terminator it is still copied to dst; the result of the assignment is then \0, the != '\0' test is false, and the loop exits.
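Wrapped in a function, the loop looks like the sketch below. copy_string is a hypothetical name used to avoid clashing with the standard strcpy; like strcpy, it does no bounds checking, so the caller must guarantee that dst has room for the string and its terminator.

```c
/* Copy the string at src, including its '\0', into dst; returns dst.
   No bounds check: the caller must ensure dst is large enough. */
char *copy_string(char *dst, const char *src)
{
    char *start = dst;                       /* remember where dst began */
    while ((*dst++ = *src++) != '\0') {
        /* all the work happens in the loop condition */
    }
    return start;
}
```

Copying "abc" into an 8-byte buffer writes four bytes: 'a', 'b', 'c', and the terminator.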
Defensive habit: parenthesize when operators cross classes
CERT C EXP00-C puts it plainly: when an expression mixes operators from different precedence classes, parenthesize. The classic traps:
- x & 0xFF == y parses as x & (0xFF == y) because == binds tighter than &. Write (x & 0xFF) == y.
- *p + 1 vs *(p + 1): different type-value pair. Parenthesize the one you mean.
- *p++ vs (*p)++: different semantics. If you want “increment the object p points to,” write (*p)++.
If you cannot draw the parse tree for an expression in under ten seconds, add parentheses. The compiler optimizes redundant parentheses away; the reader of your code keeps their sanity.
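The mask-and-compare trap is easy to reproduce. In the sketch below, both function names are hypothetical, and the second function spells out with explicit parentheses what the unparenthesized x & 0xFF == y actually computes, so the example compiles without precedence warnings.

```c
/* Intended meaning: is the low byte of x equal to y? */
int low_byte_equals(unsigned x, unsigned y)
{
    return (x & 0xFFu) == y;
}

/* What x & 0xFF == y really means: the comparison runs first and yields
   0 or 1, and that result is then ANDed with x. */
int as_the_compiler_parses_it(unsigned x, unsigned y)
{
    return (int)(x & (unsigned)(0xFFu == y));
}
```

For x = 0x1234 and y = 0x34 the intended version returns 1 while the other returns 0. For x = 0x1235 and y = 0xFF the results flip to 0 and 1.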
Check your understanding (what-is-wrong)
A student writes this to print every element of a 10-element array:
void print_all(int arr[])
{
for (int i = 0; i <= sizeof(arr) / sizeof(arr[0]); i++) {
printf("%d\n", arr + i);
}
}
Three bugs hide in four lines. Find all three.
Reveal answer
- sizeof(arr) inside the function returns 8 (the size of the pointer), not 40. sizeof(arr) / sizeof(arr[0]) is 2, not 10, so the loop is bounded by 2 and never reaches most of the array. Fix: take the length as a parameter (int *arr, size_t n).
- i <= n (when n is the element count) accesses one past the last element, which is undefined behavior for a read through the subscript. Use i < n.
- printf("%d\n", arr + i) prints the address, not the element. %d with an int * argument is a format-specifier mismatch (undefined behavior). Write arr[i] or *(arr + i).
Corrected:
void print_all(const int *arr, size_t n)
{
for (size_t i = 0; i < n; i++) {
printf("%d\n", arr[i]);
}
}
What comes next
Given int arr[] = {10, 20, 30, 40}; stored at address 0x7540, and int *p = arr;, which rows are correct?
- B is wrong: *p + 1 parses as (*p) + 1, which is 10 + 1 = 11, type int. The type is int, not int *.
- E is wrong: the [4] in the parameter is adjusted to int *, so sizeof(a) is 8, not 16.
In C, every time you do pointer arithmetic on an untrusted index, you are the bounds check. The memory-safety deep dive walks through Heartbleed and the other named exploits built on exactly this omission.
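A minimal sketch of "being the bounds check" is to compare an untrusted length against the real buffer sizes before any copy happens. checked_copy and its parameter names are hypothetical; this illustrates the general idea only and is not the actual OpenSSL fix for Heartbleed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Copy claimed_len bytes from src to dst, but only if that length fits
   both the real source size and the destination capacity. */
bool checked_copy(unsigned char *dst, size_t dst_cap,
                  const unsigned char *src, size_t src_len,
                  size_t claimed_len)
{
    if (claimed_len > src_len || claimed_len > dst_cap) {
        return false;               /* refuse: the claim exceeds a real buffer */
    }
    memcpy(dst, src, claimed_len);  /* every byte touched is now in bounds */
    return true;
}
```

A request that claims 64 bytes from a 5-byte source is rejected before any memory is read, instead of returning whatever happens to sit after the buffer.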
Next, Double Pointers introduces int **, the level of indirection you need when a function must reassign the caller’s pointer (not just write through it). Drill this page with the Pointer Arithmetic skill card, or browse the practice gallery.