A while ago at work we were confronted with the task of creating a directory listing in a Grails application. We tried a couple of approaches, one the Groovy way and one the Java way. Both performed poorly. A short search turned up a Stack Overflow thread addressing the issue of slow Java I/O performance with only one real solution: switch from Java 6 to 7. That’s not an option at work, but out of personal interest I gave it a try at home.
The Task
Write a simple Java application that scans a path passed on the command line, once the traditional way with File.list() and once with the new Java 7 features from the java.nio.file package. The application then prints how long it took to scan through it all and how many files and folders it encountered on its journey through the filesystem jungle.
File.list()
This is a straightforward approach. There’s one class that contains the main() method and the workhorse countRecursive(), which calls itself for each directory found. Time is measured in milliseconds.
```java
import java.io.File;
import java.util.Calendar;

public class IOPerformance {
    private int files = 0;
    private int folders = 0;

    // Counts files and folders below the given folder, descending into each directory.
    void countRecursive(File folder) {
        if (null == folder) {
            return;
        }
        String[] list = folder.list();
        // list() returns null if the path is not a directory or cannot be read
        if (null == list) {
            return;
        }
        for (String fileName : list) {
            String folderName = folder.getAbsolutePath();
            if (!folderName.endsWith(File.separator)) {
                folderName = folderName + File.separator;
            }
            File item = new File(folderName + fileName);
            if (item.isDirectory()) {
                folders++;
                countRecursive(item);
            } else {
                files++;
            }
        }
    }

    public void printResults() {
        System.out.println(String.format("Files found : %d", files));
        System.out.println(String.format("Folders found: %d", folders));
    }

    public static void main(String[] args) {
        IOPerformance tester = new IOPerformance();
        long start = Calendar.getInstance().getTimeInMillis();
        // The path to scan is the first command-line argument
        tester.countRecursive(new File(args[0]));
        long end = Calendar.getInstance().getTimeInMillis();
        System.out.println(String.format("Time passed : %d", end - start));
        tester.printResults();
    }
}
```
One interesting note: this code automatically follows symbolic links!
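If following links is not what you want, the recursion can skip them. The following is only a sketch and not part of the measured code: the class name NoLinkCounter is made up for illustration, and it relies on Files.isSymbolicLink() and File.toPath(), which exist only from Java 7 on, so it would not help on a Java 6 runtime.

```java
import java.io.File;
import java.nio.file.Files;

// Hypothetical variant of the recursive counter that ignores symbolic links.
class NoLinkCounter {
    int files = 0;
    int folders = 0;

    void countRecursive(File folder) {
        File[] items = folder.listFiles();
        // listFiles() returns null if the path is not a directory or cannot be read
        if (items == null) {
            return;
        }
        for (File item : items) {
            // Skip links entirely: they are neither counted nor followed
            if (Files.isSymbolicLink(item.toPath())) {
                continue;
            }
            if (item.isDirectory()) {
                folders++;
                countRecursive(item);
            } else {
                files++;
            }
        }
    }
}
```

Skipping links also avoids endless recursion when a link points back to one of its own parent folders.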
java.nio.file
This is a new package in Java 7 which is supposed to make working with filesystems easier. A recursive scan is actually much easier now since it is reduced to one method call, Files.walkFileTree(...). Well, that’d be too easy, so you’ll need to implement an interface, too. The whole process is explained in the Java Tutorials so I won’t repeat that. I’ll just show you my code.
```java
import java.io.IOException;
import java.nio.file.FileVisitOption;
import java.nio.file.FileVisitResult;
import java.nio.file.FileVisitor;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.Calendar;
import java.util.EnumSet;

// The type parameter <Path> is required; with the raw FileVisitor type
// the @Override methods below would not compile.
public class IOPerformance implements FileVisitor<Path> {
    private int files = 0;
    private int folders = 0;

    public void countWithVisitor(String source) throws IOException {
        Path p = Paths.get(source);
        // FOLLOW_LINKS mirrors the behaviour of the File.list() version
        Files.walkFileTree(
                p,
                EnumSet.of(FileVisitOption.FOLLOW_LINKS),
                Integer.MAX_VALUE,
                this);
    }

    @Override
    public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
        folders++;
        return FileVisitResult.CONTINUE;
    }

    @Override
    public FileVisitResult postVisitDirectory(Path dir, IOException exc) {
        return FileVisitResult.CONTINUE;
    }

    @Override
    public FileVisitResult visitFile(Path file, BasicFileAttributes attr) {
        files++;
        return FileVisitResult.CONTINUE;
    }

    @Override
    public FileVisitResult visitFileFailed(Path file, IOException exc) {
        System.out.println("Could not visit: " + file);
        return FileVisitResult.CONTINUE;
    }

    public void printResults() {
        System.out.println(String.format("Files found : %d", files));
        System.out.println(String.format("Folders found: %d", folders));
    }

    public static void main(String[] args) {
        try {
            IOPerformance tester = new IOPerformance();
            long start = Calendar.getInstance().getTimeInMillis();
            tester.countWithVisitor(args[0]);
            long end = Calendar.getInstance().getTimeInMillis();
            System.out.println(String.format("Time passed : %d", end - start));
            tester.printResults();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
```
If you prefer the easy way out then extend SimpleFileVisitor<Path> instead of implementing the whole interface. That way you only override the methods you actually care about instead of all four.
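For illustration, a minimal sketch of the SimpleFileVisitor<Path> route might look like this. It is not the code I measured; the class name SimpleCounter and the helper countFiles() are made up for this example, and it only counts files. Counting folders as well would need a second override of preVisitDirectory().

```java
import java.io.IOException;
import java.nio.file.FileVisitOption;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.EnumSet;

// Hypothetical file counter built on SimpleFileVisitor.
class SimpleCounter extends SimpleFileVisitor<Path> {
    int files = 0;

    // The only override: every other callback keeps its default behaviour.
    @Override
    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
        files++;
        return FileVisitResult.CONTINUE;
    }

    // Walks the tree below root (following links) and returns the file count.
    static int countFiles(Path root) throws IOException {
        SimpleCounter counter = new SimpleCounter();
        Files.walkFileTree(
                root,
                EnumSet.of(FileVisitOption.FOLLOW_LINKS),
                Integer.MAX_VALUE,
                counter);
        return counter.files;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Files found : " + countFiles(Paths.get(args[0])));
    }
}
```

One difference to keep in mind: the default visitFileFailed() of SimpleFileVisitor rethrows the IOException, so this sketch aborts on the first unreadable entry, whereas the full implementation above prints a message and continues.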
Results
So is it worth it? Has anything changed at all? Before I answer these questions, here is a short overview of what I tested.
- The traditional approach …
  - compiled with Java 6 running on Java 6
  - compiled with Java 6 running on Java 7
  - compiled with Java 7 running on Java 7
- The new approach (obviously compiled with and running on Java 7)
- Just for fun: the traditional approach with C++
I ran each test four times but only measured the last three. The first run counts as warm-up: the runtime and the application are still being loaded, which spoils the results. Example: running the Java 6 version for the first time took 12 seconds to finish. Subsequent runs were way faster.
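The warm-up scheme can be sketched as a small helper. This is an illustration rather than the harness I used: the name WarmupTimer is invented, and it assumes the reported figure is the mean of the measured runs. The workload in main() is a placeholder for the actual scan.

```java
import java.util.concurrent.Callable;

// Hypothetical timing helper: discards warm-up runs, averages the rest.
class WarmupTimer {

    // Runs the task (warmups + measured) times and returns the mean
    // duration of the measured runs in milliseconds.
    static long averageMillis(Callable<?> task, int warmups, int measured) throws Exception {
        if (measured <= 0) {
            throw new IllegalArgumentException("measured must be positive");
        }
        for (int i = 0; i < warmups; i++) {
            task.call(); // result and timing are thrown away
        }
        long totalNanos = 0;
        for (int i = 0; i < measured; i++) {
            long start = System.nanoTime();
            task.call();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / measured / 1_000_000L;
    }

    public static void main(String[] args) throws Exception {
        Callable<Void> scan = new Callable<Void>() {
            @Override
            public Void call() {
                // Placeholder workload; substitute the directory scan being measured
                new java.io.File(".").list();
                return null;
            }
        };
        // One warm-up run, three measured runs, as described above
        System.out.println("Average ms: " + averageMillis(scan, 1, 3));
    }
}
```

System.nanoTime() is used here instead of Calendar because it is meant for measuring elapsed time and is not affected by changes to the system clock.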
Technique | Compiler | Runtime | Result |
---|---|---|---|
Traditional | Java 6 | Java 6 | 4410 ms |
Traditional | Java 6 | Java 7 | 4813 ms |
Traditional | Java 7 | Java 7 | 4764 ms |
java.nio.file | Java 7 | Java 7 | 3610 ms |
C++ | Clang 3 | Native | 1526 ms |
Roughly a one-second advantage for java.nio.file. I scanned my home folder, which contains 332802 files and 81678 folders. It is faster but not blazingly fast. That one second will only add up to a real difference when scanning very large directory trees. It’s progress nevertheless and I appreciate that! However, C++ beats the crap out of Java in this regard.