File Handling in C++ 17 (Part-II) : Filesystem Path Operations

In C++17, the filesystem path object supports a rich set of operations. The path class lives in the namespace std::filesystem and represents an absolute or relative path to any kind of file. Here, “file” means not only a regular file but also a directory, link, character file, FIFO, and so on. Moreover, the target file does not need to exist (yet).

The path object supports creation, inspection, modification, comparison, and more. Many of these operations are purely lexical: they work on the text of the path without touching the actual filesystem, so they are OS-independent and cheap to use in code.

Creation of std::filesystem::path

The most common way to create a filesystem path is from a string. The path does not have to exist on disk for the path object to be created.

path p1 = "/project/code/file.txt";
std::string str = "abcd";
path p2{str};
path p3{u8path(u8"J\u00F6hn")}; //from u8 string

The second way is to call one of the following functions, which return a path based on OS settings. These calls are comparatively expensive because they call OS APIs internally.

path p1 = current_path();        //present working dir
path p2 = temp_directory_path(); //example, /tmp
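
As a small sketch of how the two creation styles combine, the helper below builds a path for a scratch file under the system temporary directory. The name scratchFile and the file name passed to it are hypothetical, chosen only for illustration. Only temp_directory_path() asks the OS; the join itself is a lexical operation, and the resulting file does not have to exist.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: builds (but does not create) a path for a scratch
// file inside the system temporary directory.
fs::path scratchFile(const std::string& name) {
    fs::path dir = fs::temp_directory_path(); // asks the OS, may throw on failure
    return dir / name;                        // lexical join, no disk access
}
```

Calling scratchFile("draft.txt") returns a path whose filename() is "draft.txt"; the directory part depends on the OS settings.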

Basic utilities : filesystem path operations

These utilities let code inspect a path and take it apart. The following code shows some of the important ones; most are self-explanatory from their names.

#include <iostream>   //main header
#include <filesystem> //for filesystem
using namespace std;  //for namespace std
using namespace std::filesystem;

int main()
{
    path p = "/project/code/file.txt";

    cout << "Path : " << p << endl;
    cout << "empty: \t" << p.empty() << endl;
    cout << "is_absolute:\t" << p.is_absolute() << endl;
    cout << "is_relative:\t" << p.is_relative() << endl;
    cout << endl;

    cout << "has_filename:\t" << p.has_filename() << endl;
    cout << "filename:\t" << p.filename() << endl;
    cout << endl;

    cout << "has_extension:\t" << p.has_extension() << endl;
    cout << "extension:\t" << p.extension() << endl;
    cout << endl;

    cout << "has_parent_path:\t" << p.has_parent_path() << endl;
    cout << "parent_path:\t" << p.parent_path() << endl;
    cout << endl;

    cout << "has_root_path:\t" << p.has_root_path() << endl;
    cout << "root_path:\t" << p.root_path() << endl;
    return 0;
}

Output

On Unix, the output is as follows.

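Path : "/project/code/file.txt"
empty:          0
is_absolute:    1
is_relative:    0

has_filename:   1
filename:       "file.txt"

has_extension:  1
extension:      ".txt"

has_parent_path:        1
parent_path:    "/project/code"

has_root_path:  1
root_path:      "/"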

The behavior may change for some specific paths according to OS rules.

For example, the library treats a path like “c:/mydir” as an absolute path on Windows. However, the same path is a relative path on Linux/Unix.

Secondly, the library treats the last element as the filename only when the path has no trailing slash ‘/’.
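
The two rules above can be checked with a short sketch. The helper names endsWithFilename and driveStylePathIsAbsolute are hypothetical and exist only for this illustration; they wrap the standard has_filename() and is_absolute() members.

```cpp
#include <filesystem>

namespace fs = std::filesystem;

// Returns true when the last element of the path is treated as a file name,
// that is, when the path does not end in a directory separator.
bool endsWithFilename(const fs::path& p) {
    return p.has_filename();
}

// Reports whether "c:/mydir" counts as absolute on the platform
// this code is compiled for (the answer is platform-dependent).
bool driveStylePathIsAbsolute() {
    return fs::path("c:/mydir").is_absolute();
}
```

With these helpers, endsWithFilename("/project/code") is true, while endsWithFilename("/project/code/") is false and filename() is empty for that path. The result of driveStylePathIsAbsolute() depends on where the program runs.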

Iteration on the path elements

The path object provides iterators (path.begin( ) and path.end( )) to iterate over its individual elements. These are bidirectional iterators, which means we can both increment and decrement them. Please note that the behavior may differ between Unix and Windows. In particular, absolute paths that are valid on Windows may not split as expected on Unix.

#include <iostream>
#include <filesystem>
using namespace std;
using namespace std::filesystem;

void iteratePath(const std::filesystem::path& p)
{
    cout << "Path : " << p << endl;
    for (auto pos = p.begin(); pos != p.end(); ++pos)
    {
        path elem = *pos;
        cout << elem << endl;
    }
    cout << endl;
}

int main()
{
    path p1 = "/project/code/file.txt";
    iteratePath(p1);

    path p2 = "~/project/code/file.txt";
    iteratePath(p2);

    path p3 = "c:\\project\\code\\file.txt"; //Windows path
    iteratePath(p3);
    return 0;
}

Output

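Path : "/project/code/file.txt"
"/"
"project"
"code"
"file.txt"

Path : "~/project/code/file.txt"
"~"
"project"
"code"
"file.txt"

Path : "c:\\project\\code\\file.txt"
"c:\\project\\code\\file.txt"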

Please Note

In the above code, the path “c:\\project\\code\\file.txt” was not broken down into individual elements. The reason is that, on Unix, the backslash “\\” is not a path separator; it is an ordinary character that may appear in a file name. Therefore, on Unix, the library treats the whole string as a single element. However, the library breaks the path down properly when the program runs on Windows. There, the call yields the following elements, with the root directory “\\” as an element of its own.

"c:" "\\" "project" "code" "file.txt"

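Because the iterators are bidirectional, a path can also be walked from its last element to its first. The following is a minimal sketch of that idea; the function name elementsReversed is hypothetical and not part of the library.

```cpp
#include <filesystem>
#include <vector>

namespace fs = std::filesystem;

// Collects the elements of a path from last to first by decrementing
// the iterator, which the bidirectional path iterators permit.
std::vector<fs::path> elementsReversed(const fs::path& p) {
    std::vector<fs::path> out;
    for (auto pos = p.end(); pos != p.begin(); ) {
        --pos;               // step back before dereferencing
        out.push_back(*pos); // copy the element out of the iterator
    }
    return out;
}
```

For the relative path "project/code/file.txt", this returns the elements "file.txt", "code" and "project", in that order.
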
Normalization of path

Normalization cleans up a path by resolving the “..” and “.” elements and removing extra separators. The function “path::lexically_normal( )” returns a new path, which may look completely different from the original. The following code demonstrates this.

#include <iostream>   //Main header
#include <filesystem> //for filesystem
using namespace std;  //for namespace std
using namespace std::filesystem;

int main()
{
    path p1 = "/project/code/../code1/./file.txt";
    cout << p1.string() << endl;
    cout << p1.lexically_normal() << endl;
    cout << endl;

    path p2 = "~/project/../mycode/./././code/file.txt";
    cout << p2.string() << endl;
    cout << p2.lexically_normal() << endl;
    return 0;
}

Output

/project/code/../code1/./file.txt
"/project/code1/file.txt"

~/project/../mycode/./././code/file.txt
"~/mycode/code/file.txt"

However, there are subtle differences in how paths behave on Unix and Windows. For example, if the input path is:

//hostname\mydir1/subdir2\/./\

On Unix   : "/hostname\\mydir1/subdir2\\/\\"
On Windows: "\\\\hostname\\mydir1\\subdir2\\"

On Unix, only the forward slash “/” is a valid separator; therefore, the leading “//” collapses to a single “/”, while the backslashes remain part of the element names. However, when the same program runs on Windows, the path normalizes properly because the backslash “\\” is also a separator there.
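
To see the clean-up of repeated separators and dot elements in isolation, here is a minimal sketch. The helper name tidy is hypothetical; it only forwards to lexically_normal(). Since the operation is lexical, nothing is looked up on disk.

```cpp
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper: returns the lexically normalized form of a path.
// No disk access happens; the clean-up works on the text of the path only.
fs::path tidy(const fs::path& p) {
    return p.lexically_normal();
}
```

For instance, tidy("/project//code///file.txt") gives /project/code/file.txt in generic format, and tidy("a/./b/..") gives a/ with the trailing separator kept.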

Relative path

The path utilities can also compute the relative path between 2 paths. The member function path1.lexically_relative( path2 ) does this computation. It returns the relative path that leads from path2 to path1.

If no relative path is possible, it returns an empty path.

#include <iostream>
#include <filesystem>
using namespace std;
using namespace std::filesystem;

int main()
{
    path p1 = "/project/code/mydir1";
    path p2 = "/project/code/code1/code2/mydir2";

    cout << p1.string() << endl;
    cout << p2.string() << endl;
    cout << p1.lexically_relative(p2) << endl;
    return 0;
}

Output

/project/code/mydir1
/project/code/code1/code2/mydir2
"../../../mydir1"
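
The empty-path case is easy to overlook, so the following sketch makes it explicit. The helper name relativeOrNote and the placeholder text it returns are hypothetical, chosen only for this illustration.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: describes how to get from `base` to `target`
// using only the text of the two paths.
std::string relativeOrNote(const fs::path& target, const fs::path& base) {
    fs::path rel = target.lexically_relative(base);
    if (rel.empty()) {
        return "<no relative path>"; // e.g. one path absolute, the other relative
    }
    return rel.generic_string();
}
```

For example, relativeOrNote("a/b/c", "a") gives "b/c", and relativeOrNote("a/b/c", "a/b/c") gives ".". Mixing a relative path with an absolute one, as in relativeOrNote("a/b", "/a/b"), gives the placeholder because lexically_relative( ) returns an empty path.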

Utilities for Modification of path

Joining 2 paths

There are 2 kinds of operations available to join 2 paths together.

  • concatenation of paths (using “concat( )” and the “+=” operator)
  • appending paths (using “append( )” and the “/=” or “/” operator)

The concatenate operation simply joins the text of the 2 paths, just as with strings. The append operation, however, joins them logically by inserting a directory separator, like adding one directory hierarchy below another. The following example demonstrates both operations.

#include <iostream>   //main header
#include <filesystem> //for filesystem
using namespace std;  //for namespace std
using namespace std::filesystem;

int main()
{
    path p = "/project/code/mydir1";

    p += "123";       // => .../mydir1123
    cout << p << endl;

    p /= "456";       // => .../mydir1123/456
    cout << p << endl;

    p.concat("789");  // => .../456789
    cout << p << endl;

    p.append("ABC");  // => .../456789/ABC
    cout << p << endl;
    return 0;
}

Output

"/project/code/mydir1123"
"/project/code/mydir1123/456"
"/project/code/mydir1123/456789"
"/project/code/mydir1123/456789/ABC"
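
The difference between the two kinds of join can also be wrapped in small functions. This is only a sketch; the names childOf and withSuffix are hypothetical.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Appends a child element: a separator is inserted between the two parts.
fs::path childOf(const fs::path& dir, const std::string& name) {
    return dir / name;    // operator/ returns a new path, dir is unchanged
}

// Concatenates text onto the last element: no separator is inserted.
fs::path withSuffix(fs::path p, const std::string& suffix) {
    p += suffix;          // operator+= modifies the copy taken by value
    return p;
}
```

Here childOf("/project/code", "src") gives /project/code/src, while withSuffix("/project/code", ".bak") gives /project/code.bak. One point to keep in mind: on POSIX systems, appending an absolute path replaces the left-hand side, so childOf("/project", "/etc") gives /etc.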

swap and clear

As the names suggest, the swap function exchanges the paths of 2 objects, and the clear function makes the path empty. The following code demonstrates this.

#include <iostream>   //main header
#include <filesystem> //for filesystem
using namespace std;
using namespace std::filesystem;

int main()
{
    path p1 = "/project/code1/mydir1";
    path p2 = "/project/code2/mydir2";
    cout << "p1=" << p1 << endl;
    cout << "p2=" << p2 << endl;

    p1.swap(p2);
    cout << "p1=" << p1 << endl;
    cout << "p2=" << p2 << endl;

    p1.clear();
    p2.clear();
    cout << "p1=" << p1 << endl;
    cout << "p2=" << p2 << endl;
    return 0;
}

Assignment operation on paths

Assignment is possible in either of 2 ways:

  • assign member function
  • “=” operator

#include <iostream>
#include <filesystem>
using namespace std;
using namespace std::filesystem;

int main()
{
    path p1 = "/project/code1/mydir1";
    path p2 = "/project";
    cout << "p1=" << p1 << endl;
    cout << "p2=" << p2 << endl;

    p2.assign(p1); //assign p1 to p2
    cout << "p1=" << p1 << endl;
    cout << "p2=" << p2 << endl;

    p1 = "/myproject/code"; //assign a new value to p1
    p2 = p1;                //assign p1 to p2
    cout << "p1=" << p1 << endl;
    cout << "p2=" << p2 << endl;
    return 0;
}

Comparison of 2 path objects

The comparison between paths is of 3 types:

  1. Raw comparison of values
  2. Lexically normal comparison
  3. Equivalence comparison

Raw Comparison

The path class supports the following operators and member function for comparing path values.

  1. Equality Operator “==”
  2. Greater than operator “>” or “>=”
  3. Smaller than operator “<” or “<=”
  4. Inequality operator “!=”
  5. Using compare( ) member function

The first 4 return a boolean result, true or false. The compare( ) member, however, returns an integer. It returns zero if both paths are equal, a negative value if the path is lexicographically less than the argument, and a positive value if it is greater. Only the sign is meaningful; the magnitude is not a count of differing elements and may vary between implementations.

The following code explains the use of these methods.

#include <iostream>   //main header
#include <filesystem> //for filesystem
using namespace std;
using namespace std::filesystem;

int main()
{
    path p1 = "/project/code1/mydir1";
    path p2 = "/project";
    cout << "p1=" << p1 << endl;
    cout << "p2=" << p2 << endl;

    if(p1 == p2)
        cout << "Both are equal" << endl;
    if(p1 < p2)
        cout << "p1 is smaller" << endl;
    if(p1 != p2)
        cout << "Both are unequal" << endl;

    cout << p1.compare(p2) << endl;
    cout << p2.compare(p1) << endl;
    return 0;
}

Output

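The program prints both paths followed by “Both are unequal”. The two compare( ) calls then print a positive and a negative integer, respectively; the exact values depend on the library implementation.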
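
Since only the sign of the compare( ) result carries meaning, code that needs a stable value can reduce it to -1, 0 or +1. The following is a minimal sketch; the function name compareSign is hypothetical.

```cpp
#include <filesystem>

namespace fs = std::filesystem;

// Reduces the result of path::compare() to -1, 0 or +1, since only the
// sign of the returned integer carries meaning.
int compareSign(const fs::path& a, const fs::path& b) {
    int r = a.compare(b);
    return (r > 0) - (r < 0);
}
```

For the two paths used above, compareSign("/project/code1/mydir1", "/project") is 1 and the call with swapped arguments is -1.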

Lexically normal comparison of paths

This comparison also uses the compare( ) member function, but applies it to the lexically normalized values of the 2 paths. Therefore, paths that the raw comparison reports as different can turn out to be equal after normalization. The following code demonstrates this.

#include <iostream>   //main header
#include <filesystem> //for filesystem
using namespace std;
using namespace std::filesystem;

int main()
{
    path p1 = "/project/code1/mydir1";
    path p2 = "/project/code2/../code1/././mydir1";
    //Both p1 & p2 are logically same
    cout << "p1=" << p1 << endl;
    cout << "p2=" << p2 << endl;

    //Raw Comparison of paths
    if(p1 == p2)
        cout << "Both are equal" << endl;
    if(p1 < p2)
        cout << "p1 is smaller" << endl;
    if(p1 != p2)
        cout << "Both are unequal" << endl;
    cout << p1.compare(p2) << endl;
    cout << p2.compare(p1) << endl;

    //Comparison with lexically normalized paths
    cout << p1.lexically_normal().compare(
            p2.lexically_normal()) << endl;
    cout << p2.lexically_normal().compare(
            p1.lexically_normal()) << endl;
    return 0;
}

Output

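The raw comparison prints “p1 is smaller” and “Both are unequal”, and the two raw compare( ) calls print a negative and a positive integer (the exact values depend on the library implementation). After normalization, both compare( ) calls print 0.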
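
When only a yes/no answer is needed, the normalized comparison can be wrapped in a small predicate. This is a sketch under the assumption that a purely lexical check is acceptable; the name lexicallySame is hypothetical.

```cpp
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper: true when two paths are equal after lexical
// normalization. Symbolic links are not resolved and the disk is not touched.
bool lexicallySame(const fs::path& a, const fs::path& b) {
    return a.lexically_normal() == b.lexically_normal();
}
```

With the paths from the example, lexicallySame("/project/code1/mydir1", "/project/code2/../code1/././mydir1") is true even though the raw comparison reports them as unequal.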

Checking if 2 paths are equivalent

The function equivalent( ) is the most accurate comparison tool because, unlike the operations above, it compares the real filesystem entries. This means the paths must physically exist; otherwise, the call reports an error, which for the overload used here means throwing an exception. This comparison also resolves symbolic links, if any exist in the hierarchy. On success, it returns a boolean value, true or false.

The only disadvantage of this method is that it is the most expensive operation. Internally, it calls OS APIs and checks the rules of the underlying filesystem.

The following code shows the use of equivalent( ).

Here, the two paths p1 and p2 are equal in the lexically normal comparison. However, in the given setup, the physical paths do not exist; therefore, the program throws an exception.

#include <iostream>   //main header
#include <filesystem> //for filesystem
using namespace std;
using namespace std::filesystem;

int main()
{
    path p1 = "/project/code1/mydir1";
    path p2 = "/project/code2/../code1/././mydir1";
    cout << "p1=" << p1 << endl;
    cout << "p2=" << p2 << endl;

    equivalent(p1, p2);
    return 0;
}

Output

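The program prints both paths and then terminates because of the uncaught exception of type std::filesystem::filesystem_error thrown by equivalent( ). The exact message depends on the platform and the library.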
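
If terminating the program is not desired, equivalent( ) also has an overload that takes a std::error_code and does not throw on filesystem errors. The following sketch wraps it; the helper name sameEntity and the choice of std::optional as the return type are assumptions made for this illustration.

```cpp
#include <filesystem>
#include <optional>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: wraps equivalent() so that a failure (for example,
// paths that do not exist) is reported as an empty optional instead of
// an exception.
std::optional<bool> sameEntity(const fs::path& a, const fs::path& b) {
    std::error_code ec;
    bool same = fs::equivalent(a, b, ec); // non-throwing overload
    if (ec) {
        return std::nullopt;              // the OS could not compare the paths
    }
    return same;
}
```

For an existing directory such as the current one, sameEntity(current_path(), current_path() / ".") holds true. For two paths that do not exist, the returned optional is empty.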

Main Funda: The filesystem path object provides good operations to develop portable code.
