How Do I Sort and Search Collections?

After this lesson, you will be able to:

Implement selection sort on an integer array using nested loops and the swap pattern
Trace through selection sort pass-by-pass on a small array
Implement linear search that returns the index or -1 if not found
Compute the median by sorting a copy of the array
Explain why selection sort is O(n^2) and when that complexity is acceptable

Why Build Your Own Sort?

Java has Arrays.sort(). Python has sorted(). Why are we implementing sorting from scratch?

Because sorting is the algorithm that teaches you how algorithms work. When you implement selection sort, you practice array traversal, comparison, swapping, and loop design — all at once. And you’ll understand exactly what those library functions do when you call them later.

In this course, we use selection sort — not because it’s the fastest (it isn’t), but because it’s the clearest to understand and trace.

Sorting and Searching

The Swap Pattern

Before sorting, you need to swap two elements. This requires a temporary variable:

void swap(int *a, int *b)
{
    int temp = *a;
    *a = *b;
    *b = temp;
}

(We’re using pointers here — you’ll fully understand the * syntax in Series 3. For now, just know this function swaps two values.)

Without pointers, you can swap array elements directly:

int temp = arr[i];
arr[i] = arr[j];
arr[j] = temp;

The Trick: Swapping requires a temporary variable. a = b; b = a; doesn’t work — after the first assignment, the original value of a is lost.

Selection Sort

The idea: find the smallest element and put it first. Then find the next smallest and put it second. Repeat until sorted.

void selection_sort(int arr[], int size)
{
    for (int i = 0; i < size - 1; i++)
    {
        int min_index = i;

        for (int j = i + 1; j < size; j++)
        {
            if (arr[j] < arr[min_index])
            {
                min_index = j;
            }
        }

        if (min_index != i)
        {
            int temp = arr[i];
            arr[i] = arr[min_index];
            arr[min_index] = temp;
        }
    }
}

Tracing Through an Example

Starting array: {64, 25, 12, 22, 11}

Pass 1 (i=0): Find minimum in entire array → 11 at index 4. Swap with index 0.

{11, 25, 12, 22, 64}

Pass 2 (i=1): Find minimum in indices 1–4 → 12 at index 2. Swap with index 1.

{11, 12, 25, 22, 64}

Pass 3 (i=2): Find minimum in indices 2–4 → 22 at index 3. Swap with index 2.

{11, 12, 22, 25, 64}

Pass 4 (i=3): Find minimum in indices 3–4 → 25 at index 3. Already in place.

{11, 12, 22, 25, 64}    ← Sorted!

During selection sort on {64, 25, 12, 22, 11}, which element gets placed at index 0 after the first pass?

A 11 — the smallest element in the entire array

B 64 — it's already at index 0

C 25 — the next smallest after 64

D 12 — the first small number found

Answer: A. Selection sort finds the minimum in the unsorted portion and swaps it into position. On the first pass, it scans the entire array, finds 11 (at index 4), and swaps it with the element at index 0. After pass 1: {11, 25, 12, 22, 64}.

Time Complexity

Selection sort is O(n²) — the outer loop runs n times, and for each iteration, the inner loop scans the remaining elements. For small arrays (Lab 5 sizes), this is perfectly fine. For large datasets, you’d use faster algorithms like quicksort or mergesort (covered in later courses).

Linear Search

Find an element and return its index (or -1 if not found):

int linear_search(const int arr[], int size, int target)
{
    for (int i = 0; i < size; i++)
    {
        if (arr[i] == target)
        {
            return i;
        }
    }
    return -1;    // Not found
}

From Java: This is equivalent to iterating through an ArrayList with indexOf(). The logic is identical — the difference is that C has no built-in indexOf method, so you write it yourself. This is the C philosophy: simple tools you build and combine.

Why does linear_search return -1 when the target isn't found, rather than 0?

A It's arbitrary — any negative number would work the same way

B Because 0 is a valid array index — returning 0 would be ambiguous with "found at index 0"

C Because -1 is the standard error code in C for all functions

D Because C arrays can't use index 0

Answer: B. Array indices start at 0, so returning 0 would mean "found at the first position." You need a value that can't be confused with a valid index. Since array indices are never negative, -1 is the convention for "not found." This pattern shows up everywhere in C — and in Java's indexOf() too.

Why does this matter?

The “return -1 for not found” pattern is universal in systems programming. You’ll see it in Unix system calls (open() returns -1 on failure), C library functions, and even Java’s String.indexOf(). Recognizing this convention helps you read and write code in any language.

Finding the Median

The median requires a sorted array. But you might need to keep the original order — so sort a copy:

double find_median(const int arr[], int size)
{
    int sorted[MAX_SIZE];

    // Copy the array
    for (int i = 0; i < size; i++)
    {
        sorted[i] = arr[i];
    }

    selection_sort(sorted, size);

    if (size % 2 == 1)
    {
        return sorted[size / 2];           // Odd: middle element
    }
    else
    {
        return (sorted[size / 2 - 1] + sorted[size / 2]) / 2.0;  // Even: average of two middle
    }
}

The Trick: When computing the median, sort a copy of the array, not the original. The caller might need the data in its original order. This “non-destructive sort” pattern comes up frequently.

Sorting Doubles

Selection sort works on any type — just change the array type and comparison:

void selection_sort_double(double arr[], int size)
{
    for (int i = 0; i < size - 1; i++)
    {
        int min_index = i;
        for (int j = i + 1; j < size; j++)
        {
            if (arr[j] < arr[min_index])
            {
                min_index = j;
            }
        }
        if (min_index != i)
        {
            double temp = arr[i];
            arr[i] = arr[min_index];
            arr[min_index] = temp;
        }
    }
}

In Series 4, you’ll learn how to write a single sort function that works on any type using void pointers and function pointers.

You need to find the median of a dataset, but you also need to display the data in its original order afterward. What should you do?

A Sort the array, find the median, then "unsort" it back to the original order

B Find the median without sorting by scanning for the middle value

C Copy the array, sort the copy, and find the median from the sorted copy

D Use const to prevent the sort from modifying the original

Answer: C. You can't "unsort" an array (option A) — the original order is lost once you sort. Finding the median without sorting (option B) requires a more advanced algorithm. And const (option D) would prevent the sort from working at all, since sorting requires modifying the array. The standard approach: copy the data, sort the copy, extract the median.

Quick Check: Why does the outer loop in selection sort go to size - 1 instead of size?

When only one element remains (the last one), it’s already in its correct position — there’s nothing left to compare it with. So the last pass (where i = size - 1) would be unnecessary.

Quick Check: Why sort a copy when finding the median?

Sorting modifies the array in place. If the caller needs the original order preserved (for display or other calculations), sorting the original destroys that order. Making a copy and sorting the copy is a non-destructive approach.

Quick Check: What's the time complexity of linear search?

O(n) — in the worst case, you check every element. On average, you check half the elements. For sorted data, binary search (O(log n)) would be faster, but linear search works on unsorted data too.

Algorithms Are Transferable

Selection sort is O(n²) and you wouldn’t use it on a million elements. But the thinking behind it — break the problem into passes, maintain an invariant (elements before i are sorted), use a helper pattern (swap) — that’s how you approach any algorithm.

Next: command-line arguments. You’ll learn argc and argv, connecting your C programs to the Unix commands you’ve been using all quarter. Every ls -la, every grep pattern file — they all use argc and argv.

Big Picture: You’ve now built algorithms by hand that Java hides behind method calls. Selection sort is slow (O(n^2)), but the thinking behind it — loop invariants, finding minimums, swapping elements — is the same thinking you’ll use for every algorithm in your career. In Series 4, you’ll see how C’s qsort() uses function pointers to sort any data type with a single function.