Performance Iterating Directories Revisited

I have written about the performance of iterating directories before, in the context of Java and its switch from version 6 to 7 that brought with it the new Java NIO API. For whatever reason I felt the urge to do something similar again, but this time I wanted to compare two different approaches to recursively scanning a directory’s contents:

To make things more interesting, I implemented this in C++ using the Windows API and the Qt framework, in C# in combination with its buddy the .NET framework and, for good measure, I also threw in the old Java code from over a year ago.

Update (26.12.2014): I added additional data at the bottom of the article.

The real motivation was to find out if any of the two approaches has a performance advantage over the other. Before having done any measurements, my assumption was that recursion is slower than doing it the iterative way. Why? Every function call costs time. New variables have to be pushed on the stack (the function parameters) as well as the return address of the current location (for when the called function finishes that the program knows where it came from). And, of course, a stack-unwind has to happen when the called function finishes. Now imagine doing this several hundred or thousand times in quick succession. This adds up. The iterative approach on the other hand keeps a list of directories that have been found and revisits each one once the currently processed one has been read from top to bottom, processing one folder at a time and removing it from the list after it has been examined.

Since I can’t decide whether to start off with code or the actual results, I do it Sheldon style and roll the dices. Well, figuratively – I’ll toss a coin instead.

*toss*

The coin says to discuss the code first and since I am obedient to the laws of randomness: that’s what I’ll do. The biggest difference can be seen using C++ and the Windows API and therefore, this is what I will show you first. This satisfies the underlying curiosity from which this article emanated and allows me to then move on and show different approaches and how they behave.

C++ WinAPI

The following code presents the iterative approach and serves as a reference to all other iterative implementations. The recursive version is just a slight modification of that.

void
IterativeScanner::Scan(const std::wstring& p_path)
{
    std::queue directories;
    directories.push(p_path);

while (!directories.empty())
 {
 std::wstring dir = directories.front();
 directories.pop();

WIN32_FIND_DATA findData;
 std::wstring findString = dir + TEXT("\\*");
 HANDLE finder = FindFirstFile(findString.c_str(), &findData);

if (INVALID_HANDLE_VALUE == finder)
 {
 std::wcerr << ("First FindFirstFile failed.") << std::endl;
 return;
 }

do
 {
 if (findData.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY)
 {
 std::wstring found = findData.cFileName;
 if (TEXT(".") != found && TEXT("..") != found)
 {
 m_countDirs++;
 directories.push(dir + TEXT("\\") + found);
 }
 }
 else
 {
 m_countFiles++;

LARGE_INTEGER filesize;
 filesize.LowPart = findData.nFileSizeLow;
 filesize.HighPart = findData.nFileSizeHigh;
 m_totalSize += filesize.QuadPart;
 }
 } while (0 != FindNextFile(finder, &findData));

DWORD error = GetLastError();
 if (ERROR_NO_MORE_FILES != error)
 {
 std::wcerr << TEXT("Looping FindNextFile failed.") << std::endl;
 }

FindClose(finder);
 }
}

As can be seen, programming against the Windows API can be pretty messy. And this is just a very simple example of that. What I am doing in this series of experiments is to count the number of files and folders and also add up the individual file sizes in bytes of all the files found. These numbers are stored in member variables of a base class and their names are:

  • m_countFiles : int
  • m_countDirs : int
  • m_totalSize : __int64

I have used this design for Qt, C# and Java as well.

The only differences between this and the recursive way of doing things are as follows:

    • Remove the queue named directories and everything that relates to it.
      • The loop checking if the queue is empty.
      • Fetching a folder path from the directories queue.
    • Replace directories.push(dir + TEXT(“\\”) + found); with Scan(p_path + TEXT(“\\”) + found);

There you go. Want some numbers now? The numbers show the average of 11 runs with each method where the first pass is not counted because it only serves to thrash the hard drive and make Windows cache the information for subsequent executions. The other ten then only measure the execution time of the algorithm itself without seeking on the drive. The measurements do not include startup time and the tool is only run once per method. It performs all eleven runs in one go.

First, I tested on my music folder but there was no difference, so I scanned all of my data as well. Here are the basic stats:

  • Music folder
    • 3315 files
    • 467 folders
    • 27GB
  • Data folder
    • 56393 files
    • 7858 folders
    • 292GB
Music Folder (ms) Data Folder (ms)
Recursive 14 275
Iterative 14 293

Are you surprised? I am. It is not much, but recursive method calls win out the more data needs to be scanned. For me this just means that I will prefer writing the more concise code and also have the benefit of fast execution. But why is it faster? That’s a good question I ask myself and the only thing I can come up with is that the memory management of the queue that I have used requires more work than doing all those method calls. I could now try to artificially circumvent any memory management by allocating a big enough array up front and insert all the folder names that are found, but never delete them from that array. But that would increase the amount of code I’d have to write even more beyond what is already necessary with the Windows API and it is also very counter-intuitive to the easy to understand concept of the queue.

C++ Qt 5

Here comes the same project using the very powerful Qt 5 libraries. In fact, Qt provides two different methods of doing this. One, where you let the framework create a list and then iterate over it or, second, use a directory iterator, similar to FindFirstFile and FindNextFile on Windows. This saves some CPU cycles since you can count the numbers while going through the directory at the same time. There’s only one iteration to do.

First let me show the list-based approach. For the sake of brevity I use the recursive method from now on. The iterative approach requires the same steps as already shown in the WinAPI code.

void 
RecursiveScanner::Scan1(const QString& p_path)
{
    QDir dir(p_path);
    QFileInfoList found = dir.entryInfoList(QDir::AllEntries 
                                            | QDir::NoDotAndDotDot);
    for (const QFileInfo& item : found)
    {
        if (item.isDir())
        {
            m_countDirs++;
            Scan2(item.absoluteFilePath());
        }
        else
        {
            m_countFiles++;
            m_totalSize += item.size();
        }
    }
}

And now with the directory iterator.

void 
RecursiveScanner::Scan2(const QString& p_path)
{
    QDirIterator dir(p_path, QDir::AllEntries | QDir::NoDotAndDotDot);
    while (dir.hasNext())
    {
        dir.next();
        QFileInfo item = dir.fileInfo();
        if (item.isDir())
        {
            m_countDirs++;
            Scan2(item.absoluteFilePath());
        }
        else
        {
            m_countFiles++;
            m_totalSize += item.size();
        }
    }
}

Isn’t this code a thing of beauty? The only drawback here is the separation of class declaration and definition into header (*.h) and implementation (*.cpp) files. Here’s how long it takes.

Music Folder (ms) Data Folder (ms)
Recursive 29 481
Recursive (Iterator) 28 496
Iterative 32 548
Iterative (Iterator) 28 520

We see a similar behavior as with the Windows API, only that the Qt based code takes around 1.7 to 2 times longer. What surprises me is that using the directory iterator in recursive mode takes longer when a large amount of data is scanned compared to only testing the music folder.

Looking at the actual physical access through ProcessMonitor shows that both implementations (WinAPI and Qt) generate an almost identical output. In order to not be overwhelmed by the sheer amount of data that ProcessMonitor would generate, I chose a single directory with only one file inside for demonstration purposes. The only difference between Qt and WinAPI is the QueryOpen call made by Qt.

That means that all those additional milliseconds the Qt application takes longer is spent somewhere in the many abstractions that make your life as a programmer much easier. A worthy tradeoff if you ask me. Especially considering what I found with C# and Java.

C#

C# doesn’t come with the drawback of C++ in that the class is declared in a different file from where it is defined, it’s all in one file. And of course you can write the code as concise as with Qt. But not only that, you have a lot of different ways to scan a folders’ contents and each one impacts the performance in some way. Even more than you should be concerned about whether to write recursive or iterative code to save a few milliseconds.

The most efficient way is to use Directory.EnumerateDirectories to find all subfolders and then use DirectoryInfo to get the list of files (and meta data) of a specific directory.

public void Scan1(string p_path)
{
    foreach (string foundDir in Directory.EnumerateDirectories(p_path))
    {
        m_countDirs++;
        Scan1(foundDir);
    }

DirectoryInfo di = new DirectoryInfo(p_path);
 foreach (FileInfo foundFile in di.EnumerateFiles())
 {
 m_countFiles++;
 m_totalSize += foundFile.Length;
 }
}

Another option is to use DirectoryInfo to find folders and files inside of a directory but this additionally requires to create an instance of FileInfo to then get the file size.

public void Scan2(string p_path)
{
    DirectoryInfo di = new DirectoryInfo(p_path);
    foreach (FileSystemInfo entry in di.GetFileSystemInfos())
    {
        if (0 != (entry.Attributes & FileAttributes.Directory))
        {
            m_countDirs++;
            Scan2(entry.FullName);
        }
        else
        {
            FileInfo fi = new FileInfo(entry.FullName);
            m_countFiles++;
            m_totalSize += fi.Length;
        }
    }
}

Can you guess how this translates into execution time and file system access?

Music Folder (ms) Data Folder (ms)
Recursive (Directory) 59 1124
Recursive (DirectoryInfo) 100 1854

That’s a pretty significant difference and ProcessMonitor shows where it comes from.

The additional creation of FileInfo results in another access of the file system to fetch the file’s meta information, like its size as done in this test.

Moving on to the iterative methods and to answer the question whether recursive is faster than iterative in the .NET world as well. Recursive is faster but only by as much as in the WinAPI example. Since we’re now in the realm of a second to perform the scan, those 15ms saved by the recursive implementation don’t matter a single bit.

I have written three different ways to do an iterative scan and I’ll show you the two slow ones. The fast one looks like the recursive Scan1() method above, with a Queue to cache the found directories (just like the C++ WinAPI example).

public void Scan2(string p_path)
{
    var items = Directory.EnumerateFileSystemEntries(p_path, "*",
                                                     SearchOption.AllDirectories);
    foreach (string item in items)
    {
        FileAttributes fa = File.GetAttributes(item);
        if (FileAttributes.Directory == (fa & FileAttributes.Directory))
        {
            m_countDirs++;
        }
        else
        {
            FileInfo fi = new FileInfo(item);
            m_countFiles++;
            m_totalSize += fi.Length;
        }
    }
}

Armed with the knowledge gleaned from the recursive Scan2() method it is immediately obvious what is wrong with this code. We know that FileInfo results in a file system access, so what do you think will happen when File.GetAttributes() is called? Let the ProcessMonitor do the talking.

There are two recorded accesses to the files, one to find out if it is a directory and the second to get the meta information with FileInfo. That’s very bad for efficiency. The same algorithm can be implemented by using an instance of DirectoryInfo and call EnumerateFileSystemInfos on that instead of the similarly named static method on Directory. What that gives us is that we can omit the extra call to File.GetAttributes() in order to find out whether we’re dealing with a directory or a file and that also shows (rather: it does not show up) in ProcessMonitor. What is still necessary is an instance of FileInfo.

No matter how the code is twisted and turned, there seems to be no way you can have all the desired information in one class.

For completeness, here’s the code.

public void Scan3(string p_path)
{
    DirectoryInfo di = new DirectoryInfo(p_path);
    var items = di.EnumerateFileSystemInfos("*", SearchOption.AllDirectories);
    foreach (FileSystemInfo item in items)
    {
        if (FileAttributes.Directory == (item.Attributes & 
                                         FileAttributes.Directory))
        {
            m_countDirs++;
        }
        else
        {
            FileInfo fi = new FileInfo(item.FullName);
            m_countFiles++;
            m_totalSize += fi.Length;
        }
    }
}

And here are the execution times.

Music Folder (ms) Data Folder (ms)
Iterative (EnumerateDirectories) 60 1139
Iterative (EnumerateFileSystemEntries) 132 2341
Iterative (EnumerateFileSystemInfos) 113 2095

Java

Java provides two different approaches, different not in a way as they are different in C# where you have several methods that work similarly but return different data types. In Java’s case you can go the “traditional” route as I have shown so far or – if you’ve read my older post on this topic you already know – you can implement an interface and instead of actively pulling for data your code gets notified as data is found.

public class FileVisitorScanner extends AbstractScanner 
                                implements FileVisitor
{
    @Override
    public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs)
            throws IOException
    {
        m_countDirs++;
        return FileVisitResult.CONTINUE;
    }

@Override
 public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
 throws IOException
 {
 m_countFiles++;
 m_totalSize += attrs.size();
 return FileVisitResult.CONTINUE;
 }

@Override
 public FileVisitResult visitFileFailed(Path file, IOException exc) 
 throws IOException
 {
 System.out.println("Error accessing file: " + file);
 return FileVisitResult.CONTINUE;
 }

@Override
 public FileVisitResult postVisitDirectory(Path dir,
 IOException exc) throws IOException
 {
 return FileVisitResult.CONTINUE;
 }

public void scan(String p_path) throws IOException
 {
 Files.walkFileTree(
 Paths.get(p_path),
 EnumSet.of(FileVisitOption.FOLLOW_LINKS),
 Integer.MAX_VALUE,
 this);
 }
}

Compared to what I have shown in the previous two sections with Qt and C#, this is an equally verbose implementation as is the WinAPI example, only more object oriented. But, as you’ll see in a moment it has its advantages. In addition to showing the FileVisitor interface implementation, I have updated the recursive approach to use the Java 7 classes as well. It goes something like this.

public void scan(String p_path) throws IOException
{
    Path path = FileSystems.getDefault().getPath(p_path);
    try (DirectoryStream stream = Files.newDirectoryStream(path))
    {
        for (Path item : stream)
        {
            if (Files.isDirectory(item, LinkOption.NOFOLLOW_LINKS))
            {
                m_countDirs++;
                scan(item.toString());
            }
            else
            {
                m_countFiles++;
                m_totalSize += Files.size(item);
            }
        }
    }
    catch (DirectoryIteratorException e)
    {
        e.printStackTrace();
    }
}

I find it quite curious how much noise this piece of code generates in ProcessMonitor. Notice that each file is accessed four times.

Compare this to the FileVisitor implementation and you’ll ask yourself “why?”.

This is actually very close to the WinAPI and Qt examples. The execution times also speak for itself (but only if you stay on the Java island). As seen in all examples so far, recursive beats iterative. But as is the case with C#, these few milliseconds don’t really matter since the overall duration with Java is in the “it takes forever” category.

Music Folder (ms) Data Folder (ms)
Recursive 226 3564
Iterative 231 3595
FileVisitor 62 2395

Groovy

Last but not least, and for the fun of it, let me show you how little code you need to write to scan a whole tree of directories using Groovy.

int countFiles = 0
int countDirs  = 0
long totalSize = 0

file.eachFileRecurse(groovy.io.FileType.ANY) { item ->
 if (item.isDirectory())
 {
 countDirs++
 }
 else
 {
 countFiles++
 totalSize += item.size()
 }
}

Impressive, right? So are the execution times and the ProcessMonitor output.

Music Folder (ms) Data Folder (ms)
Groovy 362 5753

This is a lot of action for so little code. Ease of use kills efficiency.

Verdict

With the exception of Groovy, which is ridiculously slow no matter what, we can see that the native and managed implementations behave similarly. In every case the recursive approach wins out over the iterative one. Here comes the “But”: with the exception of the WinAPI the difference between the two is only marginal. Even with C++ it only saves a few milliseconds. In that case however, overall execution time is so fast that the ratio is much more distinct. Enter the managed world and efficiency goes down.

In doing so many different implementations one of course doesn’t get around of comparing the individual results as a whole. Let me show you a table of the fastest execution times of each runtime environment.

Music Folder (ms) Data Folder (ms)
WinAPI 14 275
Qt 28 481
C# 59 1124
Java 62 2395
Groovy 362 5753

The higher the abstraction, the longer it takes. Qt provides an equally user friendly class library as C# does and that already roughly doubles the execution time of the fastest code. Add a managed runtime and you can double or triple that again in case of C# and .NET or even quintuple it with Java in the worst case.

And although Java’s FileVisitor Interface is an interesting way of doing things, the amount of code you have to write doesn’t stand in any relation to how it performs. Granted, the file system access is as little as is needed, but so is Qt’s or even that of the fastest C# code. Unfortunately that doesn’t help if the runtime eats all of that saved time for lunch. Combine this with how much Java code you have to write (implement an interface!) and it’s obvious that Java’s strength lies elsewhere. On the other hand, consider the Groovy example and you can make a good case that the plus in programmer productivity can sometimes outweigh a negative runtime impact.

I’m curious to see how the same code would perform on Linux or OS X with Mono as the .NET runtime. Any volunteers?

Update 26.12.2014

Thanks to a tip from the comments I re-ran the C++ implementations against an even larger set of data and included the std::stack class for the WinAPI code and QStack for the Qt based version.

Since, originally, the whole point of this article was to find out if recursive or iterative code performs faster, I pass on C# and Java and concentrate on C++ for this update. Otherwise it’ll get out of hand.

The larger set of data not only includes my personal stuff (from the previous tests) but also all the code, documents and database files that I need for work. Unfortunately the WinAPI implementation finds a lot more files and folders than the Qt version. This makes a direct comparison a bit unfair for the WinAPI examples, but that’s not the point. I don’t even want to put in the effort to find the right flags that need to be set to make the numbers match. Because, that’s not the point.

Recursive vs. Iterative!

First off, the WinAPI numbers:

  • 554648 files
  • 45590 folders
  • 1113GB
Whole Data Drive (ms)
Recursive 1522
Iterative (std::queue) 1608
Iterative (std::stack) 1546

Now the Qt results:

  • 499767 files
  • 37676 folders
  • 1113GB
Whole Data Drive (ms)
Recursive 2616
Recursive (Iterator) 2600
Iterative (Iterator + QQueue) 2840
Iterative (Iterator + QStack) 2658
Iterative (Iterator + QStack resized) 2569

What can we learn from this? As suspected in the conclusion of the WinAPI results way up the article, using a more efficient data structure to hold the list of directories improves the performance of the iterative implementation. It seems that std::stack and QStack are exactly that and none of them loose the expressiveness of the queue counterpart.

In the case of the WinAPI implementation the recursive scan is still faster (although by a very slim margin). For Qt this would also be true if QStack didn’t provide a method to specify a size of the stack. If a large enough amount is pre-allocated this improves execution time by 100ms on my test data and that makes the iterative approach (a tiny bit) faster than recursion. Thanks to Matej for mentioning this in the comments.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s