From File to Array: the Count-Allocate-Fill Pattern
Two passes over the file: count the values, allocate the right-size array, fill it
In a nutshell
Java arrays have a fixed size, decided when you call new int[count]. To allocate the right size, you have to know the count before you allocate. But the count of values in a file isn’t usually written down anywhere — it’s implicit in how many lines the file has.
The solution is the same one used everywhere this problem appears: walk the file twice.
- Pass 1: count. Open a Scanner, walk the file once counting lines (or tokens), close.
- Allocate. int[] arr = new int[count];
- Pass 2: fill. Open a fresh Scanner on the same file, walk it again, write each value into arr[i].
This is the count-allocate-fill pattern. It is the dominant file-to-array shape for this course and for the APE. The rest of the lesson walks through it carefully and ends with an end-to-end worked example: read an unsorted file, sort it, write the sorted result to a new file.
Today in three sentences. Arrays need a size at allocation time. Files don’t tell you the size up front, so you walk the file twice — once to count, once to fill. Scanner does not rewind, so the second pass needs a brand-new Scanner.
After this lesson, you will be able to:
- Write a countLines (or countValues) helper method that returns an int.
- Write a fillArray helper method that returns a freshly allocated, freshly filled int[].
- Combine reading, array work, and writing into one end-to-end program.
- Recognize the APE four-step pattern (validate, operate, handle errors, update state) and apply it to a file-I/O problem.
From CSCD 110. Python sidesteps this whole exercise: nums = [int(line) for line in open("data.txt")] returns a list of unknown size, because Python lists grow on demand. Java arrays don’t grow, so you do the counting yourself. (ArrayList does grow, but that’s CSCD 211.) The two-pass pattern is what gets you there with what Java gives you in CS1.
Why two passes
Re-read the rule from Week 6: a Java array’s size is fixed at creation. You write new int[5] and you get exactly five slots — no more, no less. There is no arr.add(...) in CS1.
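As a quick illustration of the fixed-size rule, the sketch below allocates five slots and then tries to write to a sixth. The class name FixedSizeDemo and the helper writeFails are made up for this example, and the try/catch is only there so the program can report the failure instead of crashing.

```java
public class FixedSizeDemo {
    // Hypothetical helper: returns true if writing to arr[index] is rejected
    // by Java's array bounds check, false if the write succeeds.
    public static boolean writeFails(int[] arr, int index) {
        try {
            arr[index] = 0;
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int[] arr = new int[5];
        System.out.println(arr.length);          // 5
        System.out.println(writeFails(arr, 4));  // false: index 4 is the last valid slot
        System.out.println(writeFails(arr, 5));  // true: there is no sixth slot
    }
}
```

The length is set once by new int[5] and never changes afterwards, which is why the count has to be known first.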
So here’s the constraint, stated plainly:
Before you can write new int[count], you have to know count.
When the array’s data is hard-coded, this is easy: int[] arr = new int[5]; arr[0] = .... When the data comes from user input where you ask “how many values?” first, it’s also easy: int n = sc.nextInt(); int[] arr = new int[n];.
But when the data lives in a file with one value per line and no header that says how many, you have a chicken-and-egg problem. You can’t allocate without the count, and you can’t get the count without walking the file.
The fix is to walk it twice. Pass 1 counts, then closes. Allocate. Pass 2 reads, fills, closes.
Common pitfall: trying to “rewind” the Scanner. Some students think there must be a way to send the Scanner back to the start of the file. There isn’t, in CS1. (Scanner does have a reset() method, but it only restores settings such as the delimiter, locale, and radix; it does not move the read position back.) The simplest, idiomatic move is to throw away the first Scanner and construct a brand-new one for the second pass. Same File, two Scanners.
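To see the "no rewind" behavior concretely, here is a minimal sketch. It counts the ints with one Scanner, asks that same Scanner for more, and then counts again with a fresh Scanner. The class name NoRewindDemo, the method tryTwoPasses, and the temporary scratch file are all assumptions made for this illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.PrintStream;
import java.util.Scanner;

public class NoRewindDemo {
    // Returns {count, leftInOld, readByFresh}:
    //   count       = ints read by the first Scanner
    //   leftInOld   = ints the SAME Scanner can still read afterwards
    //   readByFresh = ints read by a brand-new Scanner on the same File
    public static int[] tryTwoPasses(File f) throws IOException {
        Scanner first = new Scanner(f);
        int count = 0;
        while (first.hasNextInt()) {
            first.nextInt();
            count++;
        }
        // The first Scanner is now at the end of the file and stays there.
        int leftInOld = 0;
        while (first.hasNextInt()) {
            first.nextInt();
            leftInOld++;
        }
        first.close();

        Scanner second = new Scanner(f); // fresh Scanner starts at the top again
        int readByFresh = 0;
        while (second.hasNextInt()) {
            second.nextInt();
            readByFresh++;
        }
        second.close();
        return new int[] { count, leftInOld, readByFresh };
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("scores", ".txt"); // hypothetical scratch file
        f.deleteOnExit();
        PrintStream out = new PrintStream(f);
        out.println(78);
        out.println(62);
        out.println(85);
        out.close();

        int[] r = tryTwoPasses(f);
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // 3 0 3
    }
}
```

The middle number is the point: once the first Scanner reaches the end, it has nothing left to give, while a new Scanner on the same File sees every value again.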
Common pitfall: trying to allocate first, then “grow” the array. Java arrays are fixed-size. There is no arr.length = 10. If you find yourself reaching for “make the array bigger,” you’re either (a) supposed to do count-allocate-fill, (b) supposed to build a new array and copy (lesson 6c), or (c) about to use ArrayList, which is CSCD 211.
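For option (b), a minimal sketch of "build a new array and copy" might look like the following. The class name GrowByCopy and the method grow are hypothetical names chosen here; the version shown in lesson 6c may differ in its details.

```java
public class GrowByCopy {
    // Returns a NEW array of length newLength whose first old.length slots
    // hold the old values in order. Assumes newLength >= old.length.
    // The original array is not modified.
    public static int[] grow(int[] old, int newLength) {
        int[] bigger = new int[newLength];
        for (int i = 0; i < old.length; i++) {
            bigger[i] = old[i];
        }
        return bigger; // extra slots keep the default value 0
    }

    public static void main(String[] args) {
        int[] small = { 78, 62, 85 };
        int[] big = grow(small, 5);
        System.out.println(small.length + " -> " + big.length); // 3 -> 5
        System.out.println(big[2] + " " + big[3]);              // 85 0
    }
}
```

Nothing actually grows: the old array keeps its length, and the caller switches to the new one. When the data is sitting in a file, counting first avoids this copying entirely.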
Check your understanding. Why can’t you do this in one pass?
int[] arr = new int[?]; // ??? what size?
Scanner sc = new Scanner(new File("data.txt"));
int i = 0;
while (sc.hasNextInt()) {
    arr[i] = sc.nextInt();
    i++;
}
Answer. The new int[?] line has no answer. You need to know how many ints are in the file before this line, but the only way to find out is to read the file, which is what you’re trying to do here. The deadlock is exactly why the two-pass pattern exists. Pass 1 (which doesn’t store anything, just counts) escapes the deadlock.
The pattern: count, allocate, fill
The pattern in three small pieces. Below, each piece is a static method; the driver main calls them in order.
Pass 1: countValues
Open a Scanner, walk every value, increment a counter, close. Return the count.
public static int countValues(String filename) throws FileNotFoundException {
    Scanner sc = new Scanner(new File(filename));
    int count = 0;
    while (sc.hasNextInt()) {
        sc.nextInt(); // read and discard; we only care about the count
        count++;
    }
    sc.close();
    return count;
}
Notice the sc.nextInt(); with no assignment. The point of pass 1 is to advance the cursor; we don’t store the value because we don’t have anywhere to put it yet. The counter is the only thing we keep.
If the file is line-based (one value per line, but they could be doubles or strings), the same shape works with hasNextLine() / nextLine():
public static int countLines(String filename) throws FileNotFoundException {
    Scanner sc = new Scanner(new File(filename));
    int count = 0;
    while (sc.hasNextLine()) {
        sc.nextLine();
        count++;
    }
    sc.close();
    return count;
}
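The two counters agree only when the file really has one int per line. The sketch below shows two inputs where they disagree. To keep it short, the Scanners read from a String instead of a File (an assumption for this illustration; the hasNextInt and hasNextLine loops have the same shape as the file versions above), and the class name CountCompare is made up.

```java
import java.util.Scanner;

public class CountCompare {
    // Counts int tokens, stopping at the first token that is not an int.
    public static int countInts(String text) {
        Scanner sc = new Scanner(text);
        int count = 0;
        while (sc.hasNextInt()) {
            sc.nextInt();
            count++;
        }
        sc.close();
        return count;
    }

    // Counts lines, blank ones included.
    public static int countLines(String text) {
        Scanner sc = new Scanner(text);
        int count = 0;
        while (sc.hasNextLine()) {
            sc.nextLine();
            count++;
        }
        sc.close();
        return count;
    }

    public static void main(String[] args) {
        String spread = "10 20 30\n\n40\n"; // several values on a line, one blank line
        String mixed = "5\n6\noops\n7\n";   // a non-int token in the middle

        System.out.println(countInts(spread) + " ints, " + countLines(spread) + " lines"); // 4 ints, 3 lines
        System.out.println(countInts(mixed) + " ints, " + countLines(mixed) + " lines");   // 2 ints, 4 lines
    }
}
```

Whichever counter you pick for pass 1, pass 2 should read the same way (ints with ints, lines with lines), or the count and the fill will not match.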
Allocate
One line. Use the count from pass 1.
int[] arr = new int[count];
Pass 2: fillArray
Open a new Scanner on the same file. Walk it. Write into arr[i]. Return the array.
public static int[] fillArray(String filename, int count) throws FileNotFoundException {
    Scanner sc = new Scanner(new File(filename));
    int[] arr = new int[count];
    for (int i = 0; i < count; i++) {
        arr[i] = sc.nextInt();
    }
    sc.close();
    return arr;
}
The for (int i = 0; i < count; i++) form is appropriate here because we know exactly how many values to read (we just counted them). You could also write while (sc.hasNextInt()) with an index declared before the loop (int i = 0;) and incremented inside it (i++;). Both work; the counted for loop is slightly cleaner when the count is already known.
Putting it together in main
public static void main(String[] args) throws FileNotFoundException {
    int count = countValues("scores.txt");
    int[] scores = fillArray("scores.txt", count);
    // now `scores` is fully loaded; do array work here
}
Three lines: two method calls load the array, and the third is where your array work goes. The shape is generic. Every file-to-array program in this course (and on the APE) follows it.
Common pitfall: forgetting to close the first Scanner before opening the second. It will mostly work — the operating system will let you open the same file twice — but it’s sloppy and on some systems will eventually run you out of file handles. Close pass 1’s Scanner before opening pass 2’s. Each Scanner has a clear job, a clear start, and a clear close.
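If you would rather not rely on remembering sc.close(), Java's try-with-resources statement closes the Scanner for you when the block ends. This goes a little beyond what the lesson requires, so treat it as an optional sketch. The class name AutoCloseCount and the temporary scratch file are assumptions made for the example.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.PrintStream;
import java.util.Scanner;

public class AutoCloseCount {
    // Same counting loop as countValues, but the Scanner is declared in the
    // try (...) header, so Java closes it automatically when the block exits.
    public static int countValues(String filename) throws FileNotFoundException {
        try (Scanner sc = new Scanner(new File(filename))) {
            int count = 0;
            while (sc.hasNextInt()) {
                sc.nextInt();
                count++;
            }
            return count;
        } // sc is closed here, even though we returned from inside the block
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("scores", ".txt"); // hypothetical scratch file
        f.deleteOnExit();
        try (PrintStream out = new PrintStream(f)) {
            out.println("78 62 85");
        }
        System.out.println(countValues(f.getPath())); // 3
    }
}
```

The explicit sc.close() used in the rest of this lesson is equally correct; this form just makes the close impossible to forget.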
Check your understanding. What goes wrong if you write the second pass like this?
public static int[] fillArray(String filename, int count) throws FileNotFoundException {
    Scanner sc = new Scanner(new File(filename));
    int[] arr = new int[count];
    int i = 0;
    while (sc.hasNextInt()) {
        arr[i] = sc.nextInt();
        i++;
    }
    sc.close();
    return arr;
}
Answer. Nothing goes wrong, as long as the file’s contents have not changed between the two passes. Both forms (for (int i = 0; i < count; i++) and while (sc.hasNextInt())) read every value in the file. The difference is what happens if the file changed. The counted for loop reads exactly count values (good if pass 1’s count is the trusted source) and throws NoSuchElementException if the file now holds fewer. The while loop reads however many values are in the file now, and throws ArrayIndexOutOfBoundsException if there are more than count, because arr has only count slots. For CS1 problems, both work. Use whichever shape you find clearer.
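If the file might change between the two passes, one defensive variant checks both conditions in the loop header: stop when the array is full or when the file runs out, whichever comes first. This is only a sketch of that idea, not something the course requires. The names SafeFill and fillArraySafely are hypothetical.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.PrintStream;
import java.util.Arrays;
import java.util.Scanner;

public class SafeFill {
    // Fills at most count slots. Extra values in the file are ignored;
    // if the file has fewer values, the remaining slots stay 0.
    public static int[] fillArraySafely(String filename, int count) throws FileNotFoundException {
        Scanner sc = new Scanner(new File(filename));
        int[] arr = new int[count];
        int i = 0;
        while (i < arr.length && sc.hasNextInt()) {
            arr[i] = sc.nextInt();
            i++;
        }
        sc.close();
        return arr;
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("scores", ".txt"); // hypothetical scratch file
        f.deleteOnExit();
        PrintStream out = new PrintStream(f);
        out.println("78 62 85 91");
        out.close();

        System.out.println(Arrays.toString(fillArraySafely(f.getPath(), 3))); // [78, 62, 85]
        System.out.println(Arrays.toString(fillArraySafely(f.getPath(), 5))); // [78, 62, 85, 91, 0]
    }
}
```

The trade-off is that a mismatch no longer crashes, so it can go unnoticed. For lab work, where the file does not change mid-run, the plain counted for loop is enough.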
End-to-end: from an unsorted file to a sorted file
Here is the full program. It reads unsorted scores from scores.txt, sorts them with selection sort (Week 6 code, untouched), and writes the sorted scores to sorted.txt along with the median.
scores.txt:
78
62
85
91
70
45
99
The program:
import java.io.File;
import java.io.FileNotFoundException;
import java.io.PrintStream;
import java.util.Scanner;

public class SortedReport {

    // Pass 1: count the ints in the file.
    public static int countValues(String filename) throws FileNotFoundException {
        Scanner sc = new Scanner(new File(filename));
        int count = 0;
        while (sc.hasNextInt()) {
            sc.nextInt();
            count++;
        }
        sc.close();
        return count;
    }

    // Allocate + pass 2: build an array of exactly count slots and fill it.
    public static int[] fillArray(String filename, int count) throws FileNotFoundException {
        Scanner sc = new Scanner(new File(filename));
        int[] arr = new int[count];
        for (int i = 0; i < count; i++) {
            arr[i] = sc.nextInt();
        }
        sc.close();
        return arr;
    }

    // Sorts arr in place, smallest to largest.
    public static void selectionSort(int[] arr) {
        for (int i = 0; i < arr.length - 1; i++) {
            int minIndex = i;
            for (int j = i + 1; j < arr.length; j++) {
                if (arr[j] < arr[minIndex]) minIndex = j;
            }
            int temp = arr[i];
            arr[i] = arr[minIndex];
            arr[minIndex] = temp;
        }
    }

    // Requires a non-empty array that is already sorted.
    // For an even length, integer division rounds the average down.
    public static int median(int[] sorted) {
        int n = sorted.length;
        if (n % 2 == 1) return sorted[n / 2];
        return (sorted[n / 2 - 1] + sorted[n / 2]) / 2;
    }

    public static void main(String[] args) throws FileNotFoundException {
        // 1. READ the file into an array (count, allocate, fill).
        int count = countValues("scores.txt");
        int[] scores = fillArray("scores.txt", count);

        // 2. WORK on the array (Week-6 code, unchanged).
        selectionSort(scores);
        int med = median(scores);

        // 3. WRITE the result to a different file.
        PrintStream out = new PrintStream(new File("sorted.txt"));
        for (int i = 0; i < scores.length; i++) {
            out.println(scores[i]);
        }
        out.println("Median: " + med);
        out.close();
    }
}
After running, sorted.txt contains:
45
62
70
78
85
91
99
Median: 78
Re-read main. The structure is the same three blocks from lesson 7a — read, work, write — but now the read block is the count-allocate-fill pattern (two helper-method calls), and the file’s size is no longer hard-coded. This is the form your Week 7 lab and your APE problems will take.
Check your understanding. What single line would you change in main to make the output file contain the unsorted scores plus the median, instead of the sorted ones? (And would the median value change?)
Answer. Remove the selectionSort(scores); call. The output file would then list the values in their original file order. The median value would now be wrong, because median(...) requires a sorted array: on the unsorted sample data it returns the middle slot, scores[3] = 91, instead of the true median 78. To compute the median of unsorted data, you must sort first, but you can sort a copy if you also want to preserve the original order. Two-array pattern: copy the array with a manual loop, sort the copy (selectionSort, or Arrays.sort if you’ve covered it), pass the copy to median, and write the original array to the file.
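The two-array pattern can be sketched as follows. selectionSort and median are the same methods as in SortedReport. The class name MedianOfCopy and the helper medianOfCopy are hypothetical names for this illustration, and the data is hard-coded instead of read from scores.txt to keep the example short.

```java
import java.util.Arrays;

public class MedianOfCopy {
    public static void selectionSort(int[] arr) {
        for (int i = 0; i < arr.length - 1; i++) {
            int minIndex = i;
            for (int j = i + 1; j < arr.length; j++) {
                if (arr[j] < arr[minIndex]) minIndex = j;
            }
            int temp = arr[i];
            arr[i] = arr[minIndex];
            arr[minIndex] = temp;
        }
    }

    // Requires a non-empty, already sorted array.
    public static int median(int[] sorted) {
        int n = sorted.length;
        if (n % 2 == 1) return sorted[n / 2];
        return (sorted[n / 2 - 1] + sorted[n / 2]) / 2;
    }

    // Hypothetical helper: copies original with a manual loop, sorts the COPY,
    // and returns the copy's median. The original array is left untouched.
    public static int medianOfCopy(int[] original) {
        int[] copy = new int[original.length];
        for (int i = 0; i < original.length; i++) {
            copy[i] = original[i];
        }
        selectionSort(copy);
        return median(copy);
    }

    public static void main(String[] args) {
        int[] scores = { 78, 62, 85, 91, 70, 45, 99 }; // same values as scores.txt
        System.out.println("Median: " + medianOfCopy(scores)); // Median: 78
        System.out.println(Arrays.toString(scores));           // [78, 62, 85, 91, 70, 45, 99]
    }
}
```

Because only the copy is sorted, writing scores to the output file afterwards still produces the original file order.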
The APE four-step pattern
The APE File I/O exercises all follow a consistent four-step recipe. Once you see it, the problems become much less intimidating.
| Step | What you do | What it looks like in code |
|---|---|---|
| 1. Validate | Check that the inputs are usable. Is the filename non-null? Does the file exist? Is the array non-null? | if (filename == null) throw new IllegalArgumentException(...); Or rely on throws FileNotFoundException to fail loudly when the file is missing. |
| 2. Operate | Do the actual file work — read, count, fill, write, whatever the problem asks. | Scanner sc = new Scanner(new File(filename)); ... sc.close(); |
| 3. Handle errors | Decide what to return (or do) if the file or input is wrong. The APE often expects a specific sentinel (e.g., null, -1, or an empty array). | if (count == 0) return new int[0]; |
| 4. Update state | Update any class-level fields the problem says to maintain (e.g., numFilesProcessed++). Many APE problems track state in static fields between calls. | numFilesProcessed++; |
Not every problem has all four steps. Many have just steps 2 and 3 (operate, handle errors). Steps 1 and 4 show up most often in the more elaborate APE exercises that involve a class with state.
The most useful thing about naming the steps: when you read a problem statement, you can tag each sentence with its step. “The method should read the file and store its lines in an array” → step 2. “If the file does not exist, the method should return an empty array” → step 3. “After processing, the method should increment the totalFiles counter” → step 4.
A worked APE-style example. The problem:
Write a method int countNonBlankLines(String filename) that returns the number of non-blank lines in the named file. If the filename is null, throw IllegalArgumentException. If the file does not exist, return -1.
Tagging the sentences:
- “If the filename is null, throw IllegalArgumentException” → step 1 (validate).
- “Returns the number of non-blank lines” → step 2 (operate).
- “If the file does not exist, return -1” → step 3 (handle errors).
- (No step-4 work: there is no state to update.)
Translate each tag to code:
public static int countNonBlankLines(String filename) {
    // Step 1: validate.
    if (filename == null) {
        throw new IllegalArgumentException("filename cannot be null");
    }

    // Step 3 (early): handle the missing-file case before the main work.
    File f = new File(filename);
    if (!f.exists()) return -1;

    // Step 2: operate.
    int count = 0;
    try {
        Scanner sc = new Scanner(f);
        while (sc.hasNextLine()) {
            String line = sc.nextLine();
            // isBlank() needs Java 11 or later; line.trim().isEmpty() works on older versions.
            if (!line.isBlank()) count++;
        }
        sc.close();
    } catch (FileNotFoundException e) {
        return -1; // belt-and-suspenders; we already checked exists()
    }
    return count;
}
(This is one of the few CSCD 210 places where try/catch shows up — APE problems sometimes ask for sentinel return values that throws cannot satisfy. You will not have to write try/catch from scratch on the lab; recognize the shape and read it.)
Notice how each block of the method maps directly to a step. The four-step pattern is not just a study tool — it’s a way to read the problem statement and a way to organize the code you write. Use it.
Common pitfall: writing the operate step before the validate step. If filename is null, the line new File(filename) will throw NullPointerException before you ever get a chance to handle it cleanly. Always validate first, then operate.
Check your understanding. A problem says: “Write int sumValues(String filename) that reads ints from the named file and returns their sum. If the filename is null, throw IllegalArgumentException. If the file is missing, return 0. Increment the static field filesRead after a successful read.” Tag each sentence with its APE step.
Answer.
- “If the filename is null, throw IllegalArgumentException” → step 1 (validate).
- “Reads ints from the named file and returns their sum” → step 2 (operate).
- “If the file is missing, return 0” → step 3 (handle errors).
- “Increment the static field filesRead after a successful read” → step 4 (update state).
All four steps. The filesRead++ only fires on the success path, after step 2 completes, not on the early-return error path.
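One way the tagged sentences could turn into code is sketched below, following the same layout as countNonBlankLines. This is an illustration under assumptions, not an official APE solution: the class name SumValuesSketch and the temporary scratch file in main are made up for the example.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.PrintStream;
import java.util.Scanner;

public class SumValuesSketch {
    public static int filesRead = 0; // step-4 state, shared across calls

    public static int sumValues(String filename) {
        // Step 1: validate.
        if (filename == null) {
            throw new IllegalArgumentException("filename cannot be null");
        }
        // Step 3 (early): a missing file returns the sentinel 0.
        File f = new File(filename);
        if (!f.exists()) return 0;

        // Step 2: operate.
        int sum = 0;
        try {
            Scanner sc = new Scanner(f);
            while (sc.hasNextInt()) {
                sum += sc.nextInt();
            }
            sc.close();
        } catch (FileNotFoundException e) {
            return 0; // error path: filesRead is NOT incremented
        }

        // Step 4: update state, only after a successful read.
        filesRead++;
        return sum;
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("values", ".txt"); // hypothetical scratch file
        PrintStream out = new PrintStream(f);
        out.println("10 20 30");
        out.close();

        System.out.println(sumValues(f.getPath()) + " " + filesRead); // 60 1
        f.delete(); // now the file is missing
        System.out.println(sumValues(f.getPath()) + " " + filesRead); // 0 1
    }
}
```

Note that a return value of 0 is ambiguous here: a missing file and a file whose values sum to zero look the same to the caller. That is a property of the problem statement as given, not something the code can fix.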
Wrap up and what’s next
Recap.
- Java arrays are fixed-size. To allocate the right size from a file, you walk the file twice: pass 1 counts, pass 2 fills.
- Scanner does not rewind. The second pass needs a fresh new Scanner(new File(...)) on the same path.
- Three small helper methods carry the pattern: countValues(filename), fillArray(filename, count), and the array-work method (sort, search, stat) from Week 6.
- The end-to-end shape: count, allocate, fill, array work, write. This is the dominant CS1 file-I/O program shape and the APE’s preferred problem shape.
- The APE four-step pattern (validate, operate, handle errors, update state) maps each sentence in a problem statement to a block of code.
What you can do now. Read any “load this file into an array and do array work on it” problem and produce a working program with three helper methods and a tidy main. Read an APE problem statement and tag each sentence with its step before writing any code.
Next up: The I/O Class Recognition Guide. The unit continues with three more lessons before the chapter closes. 7f surveys the JDK’s full reader and writer family (BufferedReader, PrintWriter, FileWriter, etc.) at recognition level so you can read someone else’s code without panic. 7g covers the File class operations beyond exists (append mode, parent-directory creation, inspection, deletion). 7h is the capstone: a sentence-by-sentence walkthrough of all six APE Practice File I/O problems with the four-step pattern. After 7h the next chapter is classes and objects, where this week’s file-I/O patterns reappear inside the OOP shape (read a file, build an array of Student objects, do object work, write the report).
Related resources
- The full APE File I/O walkthrough lives in the deep-dive lecture (week8-deepdive-fileio.tex in the course materials). It works the Spring 2023 Practice APE’s File I/O section problem-by-problem using the four-step pattern.
- Reges & Stepp, Building Java Programs, Chapter 6 section 6.4 covers count-then-fill on a real example (the same approach with slightly different naming).
- The APE practice exam (login required, EWU SSO) — download the Spring 2023 Practice APE and try the File I/O section after this lesson.