Sorting an Array
Selection, insertion, the swap pattern, and the parallel-array anti-pattern
In a nutshell
You have an array of values in arbitrary order. You want them in nondecreasing order so you can binary-search them, render them as a leaderboard, or compute a median. You will write two classic sorting algorithms today, and you will see why neither is the algorithm modern Java actually uses for Arrays.sort — but both are the right way to learn what “in place” sorting really means.
- Selection sort. For each position
i(from0tolength - 1), find the smallest value in the remaining unsorted portion of the array and swap it into positioni. Done inlength - 1outer iterations. - Insertion sort. Grow a sorted prefix one element at a time. For each new element, slide it leftward through the prefix until it sits in the correct place.
Both algorithms rely on the swap pattern (which you have already seen): exchange two slots’ values using a temporary variable. Both are O(n^2) in the worst case, which means they get slow on big arrays but are perfectly fine on the hundred-element arrays you will see in this course.
The lesson closes with a design lesson that comes up the moment you have multiple arrays representing related data: parallel arrays are an anti-pattern. The two-arrays-that-must-stay-in-sync problem is the natural lead-in to classes in Week 8.
Today in three sentences. Selection sort: find the min, swap to position
i, repeat. Insertion sort: grow a sorted prefix, slide each new element back. Two arrays that must stay synchronized are a class waiting to happen.
After this lesson, you will be able to:
- Write selection sort and insertion sort as static
voidmethods that mutate an array in place. - Trace one outer-loop pass of either sort on a small array, on paper.
- Pick between selection and insertion sort given a description of the input.
- Recognize parallel arrays in code and explain why the design fails the moment you sort or filter.
From CSCD 110. Python’s
nums.sort()does not use either of these algorithms; it uses Timsort, a hybrid of merge sort and insertion sort tuned for real-world data. Both selection sort and insertion sort do show up inside Timsort for small subarrays. The reason CS1 starts with these two is pedagogical: they are short, you can trace them by hand, and they teach the in-place mutation patterns you will need for any later sort.
The swap pattern
Both sorts use swap as a primitive. You cannot exchange two variable values without a temporary.
int a = 1, b = 2;
// Wrong:
a = b; // a is now 2, b is still 2 -- the original a is gone
b = a; // b is now 2 (still!) -- a was already overwritten
// Right:
int temp = a; // save the original a
a = b; // copy b into a
b = temp; // copy the saved a into b
You met this in lesson 6c as swap(int[] arr, int i, int j):
public static void swap(int[] arr, int i, int j) {
int temp = arr[i];
arr[i] = arr[j];
arr[j] = temp;
}
Both selection sort and insertion sort will call this (or inline it) repeatedly. The temp is non-negotiable. Try writing a swap without a temp once, on paper, to convince yourself the obvious pair a = b; b = a; clobbers a.
Common pitfall: forgetting the temp. Some students remember the shape of swap (three lines, three assignments) but write the assignments in the wrong order. The mnemonic: save first. Stash the value you are about to overwrite into a temp, then do the two reassignments.
Check your understanding. What is the array after this code runs?
int[] arr = {10, 20, 30, 40}; int temp = arr[0]; arr[0] = arr[3]; arr[3] = temp;Reveal answer
{40, 20, 30, 10}. The temp pattern swapped the values at indices0and3. The values at indices1and2are untouched.
Selection sort: find the min, swap to front
The English description, then the code.
Walk through positions
i = 0, 1, 2, ..., length - 2. At each position, scan the unsorted portion (from indexitolength - 1) for the smallest value. Swap that smallest value into positioni. The slot at positioniis now correctly placed; move on.
Each outer-loop iteration places exactly one element into its final sorted position. After the outer loop runs length - 1 times, the array is sorted.
public static void selectionSort(int[] arr) {
for (int i = 0; i < arr.length - 1; i++) {
// find the index of the smallest value in arr[i..length-1]
int minIndex = i;
for (int j = i + 1; j < arr.length; j++) {
if (arr[j] < arr[minIndex]) {
minIndex = j;
}
}
// swap that smallest value into position i
int temp = arr[i];
arr[i] = arr[minIndex];
arr[minIndex] = temp;
}
}
Trace one pass. Start with arr = {5, 3, 8, 1, 4} and i = 0:
- Scan
jfrom1to4. The smallest value is1, at index3. SominIndex = 3. - Swap
arr[0]andarr[3]. Array becomes{1, 3, 8, 5, 4}.
Move on to i = 1:
- Scan from index
2to4. The smallest value is3(already at index 1) — actually,minIndexstarts ati = 1, and the values at indices2..4are8, 5, 4, none smaller than3. SominIndexstays at1. - Swap
arr[1]with itself. (Harmless, but wasteful — see the optimization below.) Array unchanged:{1, 3, 8, 5, 4}.
Continue with i = 2, i = 3. After i = 3, the array is {1, 3, 4, 5, 8} and the loop exits (the outer condition was i < arr.length - 1, so i = 4 is not visited; the last slot is automatically correct once the rest are).
A small optimization: skip the swap when minIndex == i. It is a no-op write, but on hardware where writes are slower than reads, it can matter:
if (minIndex != i) {
int temp = arr[i];
arr[i] = arr[minIndex];
arr[minIndex] = temp;
}
Common pitfall: starting the inner loop at
j = iinstead ofj = i + 1. Both work, but starting ati + 1is cleaner. The slot atiis the initialminIndex; comparing it to itself is wasted work. Startingjone step ahead avoids it.
Check your understanding. Trace one pass of selection sort on
arr = {6, 2, 4, 1, 5}withi = 0. What isminIndexat the end of the inner loop, and what is the array after the outer iteration completes?Reveal answer
Inner loop: starting
minIndex = 0(arr[0] = 6). Comparearr[1] = 2witharr[0] = 6: smaller, sominIndex = 1. Comparearr[2] = 4witharr[1] = 2: not smaller, leaveminIndex = 1. Comparearr[3] = 1witharr[1] = 2: smaller, sominIndex = 3. Comparearr[4] = 5witharr[3] = 1: not smaller, leaveminIndex = 3. End of inner loop:minIndex = 3. Swaparr[0]witharr[3]. Array after the outer iteration:{1, 2, 4, 6, 5}.
Insertion sort: grow a sorted prefix
A different shape. The English description, then the code.
Treat the slot at index
0as a sorted prefix of length 1. For eachi = 1, 2, 3, ..., length - 1, take the value atarr[i]and slide it leftward through the sorted prefix, swapping with each smaller-than-it neighbor, until it sits in its correct position. The prefix is now sorted up through indexi.
After the outer loop runs length - 1 times, the entire array is sorted.
public static void insertionSort(int[] arr) {
for (int i = 1; i < arr.length; i++) {
// slide arr[i] left until it is in the right place
for (int j = i; j > 0 && arr[j] < arr[j - 1]; j--) {
int temp = arr[j];
arr[j] = arr[j - 1];
arr[j - 1] = temp;
}
}
}
Trace it on {5, 3, 8, 1, 4}. Outer loop visits i = 1, 2, 3, 4.
i = 1. Element to insert: arr[1] = 3. Prefix {5} is sorted. Slide left: arr[1] = 3 vs arr[0] = 5. 3 < 5, swap. Array: {3, 5, 8, 1, 4}. j is now 0; loop condition j > 0 fails, stop.
i = 2. Element to insert: arr[2] = 8. Slide left: arr[2] = 8 vs arr[1] = 5. 8 < 5 is false, stop. Array unchanged: {3, 5, 8, 1, 4}.
i = 3. Element to insert: arr[3] = 1. Slide left: 1 vs 8 swap → {3, 5, 1, 8, 4}. 1 vs 5 swap → {3, 1, 5, 8, 4}. 1 vs 3 swap → {1, 3, 5, 8, 4}. j is now 0; stop.
i = 4. Element to insert: arr[4] = 4. Slide left: 4 vs 8 swap → {1, 3, 5, 4, 8}. 4 vs 5 swap → {1, 3, 4, 5, 8}. 4 vs 3 is false, stop.
Final: {1, 3, 4, 5, 8}. Sorted.
A subtlety in the inner loop. The condition j > 0 && arr[j] < arr[j - 1] uses short-circuit evaluation: Java evaluates j > 0 first, and only checks the array comparison if j > 0 is true. If you swapped the order (arr[j] < arr[j - 1] && j > 0), the loop would attempt to read arr[-1] once j reaches 0, throwing ArrayIndexOutOfBoundsException. Order matters.
Common pitfall: writing the condition as
arr[j] < arr[j - 1] && j > 0. With&&, Java evaluates left to right and short-circuits — but evaluatingarr[j - 1]forj = 0readsarr[-1]and throws. The bound check has to come first:j > 0 && arr[j] < arr[j - 1].
Check your understanding. Trace insertion sort on
{4, 1, 3, 2}. Show the array after each iteration of the outer loop.Reveal answer
Start:
{4, 1, 3, 2}.
i = 1: insert1. Slide left past4:{1, 4, 3, 2}.
i = 2: insert3. Slide past4:{1, 3, 4, 2}.3 < 1false, stop.
i = 3: insert2. Slide past4:{1, 3, 2, 4}. Slide past3:{1, 2, 3, 4}.2 < 1false, stop.Final:
{1, 2, 3, 4}.
Choosing between selection and insertion
Both algorithms are O(n^2) in the worst case. They differ in two practical ways.
| Property | Selection sort | Insertion sort |
|---|---|---|
| Best case (sorted input) | O(n^2) (still scans for min every pass) |
O(n) (inner loop never runs) |
| Number of swaps | At most n - 1 |
Up to about n^2 / 2 |
| Comparisons | About n^2 / 2 (always) |
About n^2 / 2 (worst case) |
| Adaptive to “almost sorted” data? | No | Yes |
The takeaway: insertion sort is faster on data that is already nearly sorted, because most elements only need to slide a few positions. Selection sort always does the full scan-for-min work regardless of input order. Selection sort uses fewer swaps (which can matter if a swap is expensive — say, swapping large objects). Insertion sort uses fewer comparisons on near-sorted data.
For unsorted random data of any nontrivial size, both are slow. The library’s Arrays.sort uses dual-pivot quicksort or Timsort, both O(n log n). You will write neither of those in CS1; the point of these two algorithms is to learn the loop-and-swap shape that every sort uses underneath.
Check your understanding. A program reads in 50 student names already in alphabetical order, then appends 5 new names at the end. Which sort would finish faster, and why?
Reveal answer
Insertion sort. The input is already 91% sorted. The first 50 outer iterations do almost no work (each new element is already in place). Only the last 5 outer iterations do real sliding. Selection sort would do the full inner scan on every outer iteration regardless of input order, costing about
50 * 50 / 2 = 1250comparisons; insertion sort would do roughly50 + 5 * 50 = 300. The “adaptive” property of insertion sort is what makes it the right choice here.
Parallel arrays as anti-pattern
Once you have arrays, you will be tempted to represent related data with two arrays of the same length, one slot per “thing.” A roster of 30 students might look like:
String[] names = new String[30];
int[] scores = new int[30];
names[0] = "Ada"; scores[0] = 95;
names[1] = "Linus"; scores[1] = 82;
// ...
The convention: names[i] and scores[i] describe the same student. Walk both with the same index. Most of the time it works.
The problem appears the moment you want to sort the data. Sort scores with selection sort, and scores is now in increasing order — but names is unchanged, so the relationships are broken. Student names[0] is still “Ada”, but scores[0] is now the lowest score in the whole class, which is not Ada’s score. The arrays have drifted apart.
selectionSort(scores); // scores now sorted; names is still in original order
// names[0] is "Ada" but scores[0] is the lowest score, not Ada's
The fix is to sort both arrays together, swapping names[i] and names[j] whenever you swap scores[i] and scores[j]. That works for sorting, but every other operation that touches the data has the same risk: filter scores, copy names separately, build a roster from a CSV in the wrong order, and the arrays drift.
The deeper problem is that the design is trying to represent something — a Student with both a name and a score — but using two separate arrays to do it. The right answer is to introduce a Student class with name and score fields, and use one Student[] array. Then sorting reorders whole Student objects; there is nothing to drift.
Classes are Week 8. For Week 6, the immediate response is one of two things: either commit to the two-array discipline and pair every operation that touches one array with the matching operation on the other, or rewrite the data layout once you can.
Common pitfall: a “small” change that desyncs parallel arrays. It always starts with one operation: a
removeAtonscoresto drop a withdrawn student, and a forgotten matchingremoveAtonnames. The arrays are now off by one for every subsequent index. The bug is rarely caught immediately because everything compiles and runs; the symptoms appear in unrelated output later.
Check your understanding. Why is the following call sequence buggy?
String[] names = {"Ada", "Linus", "Margaret", "Donald"}; int[] scores = {95, 82, 100, 70}; selectionSort(scores); System.out.println(names[0] + " scored " + scores[0]);Reveal answer
After
selectionSort(scores), the arrayscoresis{70, 82, 95, 100}. Butnamesis unchanged: still{"Ada", "Linus", "Margaret", "Donald"}. So the program printsAda scored 70, which is false (Ada’s actual score was 95; 70 was Donald’s). The fix is to sort both arrays together by swappingnames[i]withnames[j]whenever you swapscores[i]withscores[j]. The deeper fix (Week 8) is to use aStudentclass so the name and score travel together.
Wrap up and what’s next
Recap.
- The swap pattern needs a temp variable.
a = b; b = a;clobbersa. - Selection sort: for each position
i, scan the unsorted portion for the smallest value, swap intoi. AlwaysO(n^2). Few swaps. - Insertion sort: grow a sorted prefix. For each new element, slide left through the prefix.
O(n)on already-sorted input,O(n^2)worst case. Many swaps. - Pick insertion sort for nearly-sorted data; pick selection sort when swap cost matters.
- Parallel arrays drift apart the moment you sort or filter one without the other. The right answer is a class (Week 8); the immediate workaround is paired operations on every touch.
What you can do now. Write either sort from a blank page. Trace one outer pass on a five-element array. Decide between the two given a description of the input. Spot a parallel-array design and predict where it will break.
Next up: Array Tracing & Mystery Strategy. Before file I/O takes over, one more lesson — a strategy reference for reading array code on paper, building trace tables, and surviving the live-array writes and aliasing puzzles that show up on the quiz. After that, the unit closes and the next chapter is file I/O: reading lines and tokens from text files into arrays, then writing structured output back. After that comes the foundational shift of the course: classes, which replace the parallel-arrays anti-pattern with a single array of objects whose fields travel together.
You can revisit the Arrays & Algorithms series index whenever you want to jump back to a specific lesson.
Related resources
- Reges & Stepp, Building Java Programs, Chapter 13 covers selection and insertion sort with the same Reges-style code.
- FAQ entries on
ArrayIndexOutOfBoundsExceptionin insertion sort and on the parallel-array anti-pattern.