Searching Arrays
Linear search walks the row; binary search halves the space (sorted required)
In a nutshell
You have an array. You want to know whether some target value is in it, and if so, at what index. There are two algorithms for this question, and the choice between them is the first algorithmic decision in this course where the shape of your data matters.
- Linear search walks the array from
0upward, comparing each slot against the target. It works on any array, sorted or not. In the worst case it inspects every slot. - Binary search jumps to the middle, decides whether the target is to the left or to the right, and halves the search space each step. It works only on a sorted array. In the worst case it inspects about $\log_2 n$ slots.
Both algorithms answer the same question and use the same return convention: return the index where the target was found, or return -1 if it was not. Today you will write both, trace a binary search step by step, and learn when each is the right tool.
Today in three sentences. Linear search is the everyday default and it works on any array. Binary search is faster but requires a sorted array. Both return
-1for “not found” and the index for “found.”
After this lesson, you will be able to:
- Write linear search and binary search as static methods that return the index of the target (or
-1). - Trace a binary search by hand using a
lo/mid/hitable for a small input. - State the precondition that binary search relies on, and what happens if you call it on an unsorted array.
- Compare the two algorithms by counting comparisons in the worst case.
From CSCD 110. Python’s
nums.index(target)is linear search under the hood, and it raisesValueErrorif the target is missing. Java conventionally returns-1instead of throwing. The standard library’sArrays.binarySearchdoes the binary version, and it returns a negative number with a slightly different meaning (the negated insertion point) — but for everything you write this week, “found → index, not found → −1” is the convention.
Linear search and the -1 sentinel
The simplest possible array search. Walk every slot. As soon as the slot’s value equals the target, return the index. If you finish the loop without finding it, return -1.
public static int indexOf(int[] arr, int target) {
for (int i = 0; i < arr.length; i++) {
if (arr[i] == target) {
return i;
}
}
return -1; // walked the whole array, no match
}
That is the entire algorithm. Read the body and notice three things.
The early return matters. As soon as the loop finds a match, it returns the index right there. The rest of the array is never inspected. This is what makes linear search “fast in the best case” — if the target happens to be at the front, you stop after one comparison.
The final return -1 matters too. If the loop finishes naturally (the index i reached arr.length), no match was found anywhere. The compiler will refuse to compile a method missing this final return: it cannot prove that the loop will always find a match, so it cannot let you fall off the end of the method.
-1 is a sentinel. It is a value chosen specifically because it can never be a valid array index (which always run 0..length-1). If a caller writes int idx = indexOf(arr, target); if (idx == -1) { ... }, the check is unambiguous. Using 0 as the sentinel would be a bug, because 0 is a perfectly valid index for “the target is in the first slot.”
Common pitfall: returning
0for “not found”. Some students reach for0because it feels like “nothing.” It is not.0is a valid index. IfindexOf(arr, target)returns0, the caller cannot tell whether the target was found at the front or was missing. Always use-1as the missing-element sentinel. Document the convention in the Javadoc.
For non-numeric arrays you compare with .equals(), not ==:
public static int indexOf(String[] arr, String target) {
for (int i = 0; i < arr.length; i++) {
if (arr[i] != null && arr[i].equals(target)) {
return i;
}
}
return -1;
}
The arr[i] != null guard handles the awkward case where some slots in the array are null (which they will be by default for any reference-type array; see lesson 6a). Calling .equals() on null throws NullPointerException.
Check your understanding. What does
indexOfreturn for these calls, givenint[] arr = {3, 7, 2, 9, 7}?A.
indexOf(arr, 9)B.indexOf(arr, 7)C.indexOf(arr, 100)Reveal answer
A.
3. The value9is in slot 3. B.1. The value7first appears at index 1; the loop returns immediately on the first match, so the second7(at index 4) is never reached. (If you wanted the last index, you would walk the array right to left.) C.-1. The loop runs to completion without finding a match.
Check your understanding. Spot the bug.
public static int indexOf(int[] arr, int target) { for (int i = 0; i < arr.length; i++) { if (arr[i] == target) { return i; } else { return -1; // ??? } } return -1; }Reveal answer
The
return -1inside theelsebranch fires on the first iteration wheneverarr[0]does not equal the target. The method only ever inspects slot0. The loop never reaches slot1. The fix is to delete the entireelsebranch (the trailingreturn -1after the loop already handles the not-found case correctly).
Binary search: halving the search space
Linear search inspects up to n slots in an n-element array. Binary search inspects up to about $\log_2 n$ slots — but it has a precondition: the array must already be sorted.
The idea, in English: keep track of a region of the array that might contain the target (call it lo to hi, inclusive). Look at the middle slot. If it matches, you are done. If the target is smaller than the middle, the answer (if any) must be in the left half: shrink hi to one less than the middle. If the target is larger, shrink lo to one more than the middle. Repeat. When lo passes hi, the region is empty, and the target is not in the array.
public static int binarySearch(int[] arr, int target) {
int lo = 0;
int hi = arr.length - 1;
while (lo <= hi) {
int mid = (lo + hi) / 2;
if (arr[mid] == target) {
return mid;
} else if (arr[mid] < target) {
lo = mid + 1; // target must be to the right
} else {
hi = mid - 1; // target must be to the left
}
}
return -1; // region became empty
}
Three subtle pieces.
The loop condition is lo <= hi. Both endpoints are inclusive. The region lo == hi still has one slot in it (the slot at index lo). The loop should run that final iteration. Off-by-one alert: writing lo < hi makes the loop quit one step too early.
mid = (lo + hi) / 2 rounds down. If lo = 0 and hi = 7, then mid = 3. If lo = 4 and hi = 7, then mid = 5. The bias does not matter for correctness; it just means you tend to look slightly left of true center. (Industrial code uses lo + (hi - lo) / 2 to avoid integer overflow when lo + hi exceeds Integer.MAX_VALUE. For Reges-style CS1 arrays you will never hit that.)
The mid + 1 and mid - 1 adjustments matter. When you decide the target is to the right, the new region must exclude the slot you just looked at. Otherwise you risk looping forever, repeatedly examining the same mid. Off-by-one alert: writing lo = mid instead of lo = mid + 1 is a classic infinite-loop bug.
If the array is not sorted, the algorithm still runs and still terminates, but its answer is meaningless. It might return -1 for a target that is in the array. It might return the index of one occurrence and miss another. Treat “the array is sorted” as a precondition that the caller is responsible for honoring; document it in the Javadoc.
Common pitfall: writing
lo < hifor the loop condition. With<, the loop stops whenlo == hi, which means the single-element region never gets inspected. The target can be in the array and the search will return-1. The correct condition is<=.
Common pitfall: writing
lo = midinstead oflo = mid + 1. If the target is greater thanarr[mid], the new region should exclude indexmid(you already know it does not match). Forgetting the+ 1keepsmidin the region, and the very next iteration may pick the samemid, looping forever. The same applies tohi = midvshi = mid - 1.
Tracing a binary search by hand
Look up target = 41 in this sorted array of length 8:
| Index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|
| Value | 3 | 8 | 12 | 19 | 25 | 33 | 41 | 58 |
Build a table. Each row is one iteration. Columns: lo, hi, mid, arr[mid], action.
| Iter | lo |
hi |
mid = (lo+hi)/2 |
arr[mid] |
Compare to target (41) | Action |
|---|---|---|---|---|---|---|
| 1 | 0 | 7 | 3 | 19 | 19 < 41 | lo = 4 |
| 2 | 4 | 7 | 5 | 33 | 33 < 41 | lo = 6 |
| 3 | 6 | 7 | 6 | 41 | 41 == 41 | return 6 |
Three iterations. (Linear search would have inspected 7 slots before reaching the match.)
Now trace a miss. Look up target = 30 in the same array.
| Iter | lo |
hi |
mid |
arr[mid] |
Compare | Action |
|---|---|---|---|---|---|---|
| 1 | 0 | 7 | 3 | 19 | 19 < 30 | lo = 4 |
| 2 | 4 | 7 | 5 | 33 | 33 > 30 | hi = 4 |
| 3 | 4 | 4 | 4 | 25 | 25 < 30 | lo = 5 |
| 4 | 5 | 4 | lo > hi, exit |
return -1 |
Four iterations to be confident the target is missing. The region shrunk from 8 slots → 4 → 2 → 1 → 0, and once it was empty the loop exited.
The mechanical rhythm — mid, compare, narrow — is the whole algorithm. Practice on small inputs by hand until it is second nature. Trace tables are also exactly how you should solve binary-search exam questions: write the table, do not try to hold all of lo, hi, mid in your head at once.
Check your understanding. Trace
binarySearch(arr, 12)on the same array{3, 8, 12, 19, 25, 33, 41, 58}. How many iterations? What does the method return?Reveal answer
Iter lohimidarr[mid]Compare Action 1 0 7 3 19 19 > 12 hi = 22 0 2 1 8 8 < 12 lo = 23 2 2 2 12 12 == 12 return 2Three iterations. Returns
2.
Check your understanding. What does
binarySearch({3, 8, 12, 19}, 8)return?Reveal answer
Iter lohimidarr[mid]Compare Action 1 0 3 1 8 8 == 8 return 1One iteration. The integer division
(0 + 3) / 2 = 1lands on the match immediately.
When to use which
A small table.
| Linear search | Binary search | |
|---|---|---|
| Precondition | None | Array is sorted |
| Worst-case comparisons | n |
about $\log_2 n$ |
| Best-case comparisons | 1 | 1 |
| Code complexity | About 4 lines | About 12 lines |
| Easy to write correctly | Yes | Off-by-one prone |
Works on String[] |
Yes (with .equals and a comparison rule) |
Yes (with .compareTo and a sorted array) |
Pick linear search when the array is unsorted, when you are searching only a few times, or when you need every match (in which case you want the linear walk anyway, because binary search returns one index and forgets the rest). Pick binary search when the array is already sorted and you are going to search it many times — the cost of sorting once amortizes across many fast lookups.
A concrete number to internalize. For an array of one million elements:
- Linear search worst case: one million comparisons.
- Binary search worst case: about 20 comparisons.
The gap grows fast. For a billion elements, linear is a billion and binary is about 30. This is one of the few cases in CS1 where the algorithm choice matters more than the language, the loop syntax, or the variable names. Sort once, then search billions of times in 30 comparisons each.
Common pitfall: calling binary search on an unsorted array. The method runs and returns something, but the result is wrong: a target that is in the array can come back as
-1. The standard library’sArrays.binarySearchdocuments this with a hard “the results are undefined if the array is not sorted.” Treat your own binary search the same way: precondition it.
Check your understanding. You have an array
int[] scoresthat students enter in arbitrary order. The program needs to answer “is85in this array?” exactly once. Which search should you use?Reveal answer
Linear search. The array is unsorted, so binary search would either give wrong answers or you would have to sort first. Sorting an
n-element array costs at leastn log ncomparisons (much more than thenof a single linear walk). For a single search on an unsorted array, linear wins.
Check your understanding. A program reads 10,000 sorted student IDs into an array, and over the course of the day handles 50,000 lookup queries: “is ID X in the array?” Which search should it use, and roughly how many comparisons total?
Reveal answer
Binary search. Each lookup costs at most about $\log_2 10\,000 \approx 14$ comparisons. Across 50,000 queries, that is roughly 700,000 comparisons. Linear search would average 5,000 comparisons per lookup (worst case 10,000), totaling 250 million to 500 million. Same answer; very different runtime.
Wrap up and what’s next
Recap.
- Linear search walks every slot until it finds a match. Works on any array. Returns the index, or
-1for not found. - Binary search keeps a
[lo, hi]region, halves it each iteration by comparing againstarr[mid], and returns the index or-1. Works only on sorted arrays. - Use
-1as the not-found sentinel. Never use0. - The binary-search shape —
lo <= hi,mid = (lo + hi) / 2,lo = mid + 1andhi = mid - 1— is off-by-one prone. Trace small examples on paper until the pattern is automatic. - For a few searches on an unsorted array, prefer linear. For many searches on data that can be kept sorted, binary.
What you can do now. Write either search method from a blank page. Trace a binary search by hand on a small array. Decide which algorithm fits a given problem, and explain the trade-off in one sentence.
Next up: Sorting an Array. Two ways to put an array in order. Selection sort scans for the min and swaps it to the front. Insertion sort grows a sorted prefix one element at a time. Both rest on the swap pattern from this week, both are O(n²), and the choice between them turns out to depend on how nearly sorted the input already is.
Related resources
- Reges & Stepp, Building Java Programs, Chapter 13 sections 13.1 (linear) and 13.2 (binary). The Reges treatment is the source of the
-1-sentinel convention. - FAQ entries on infinite loops in binary search and on
Arrays.binarySearch’s slightly different return convention.