3.7 Validating Filenames and Paths

3.7.1 Problem

You need to resolve the path of a file provided by a user to determine the actual file that it refers to on the filesystem.

3.7.2 Solution

On Unix systems, use the function realpath( ) to resolve the canonical name of a file or path. On Windows, use the function GetFullPathName( ) to resolve the canonical name of a file or path.

3.7.3 Discussion

You must be careful when making access decisions for a file. Taking relative pathnames and links into account, it is possible for multiple filenames to refer to the same file. Failure to take this into account when attempting to perform access checks based on filename can have severe consequences.

On the surface, resolving the canonical name of a file or path may appear to be a reasonably simple task to undertake. However, many programmers fail to consider symbolic and hard links. On Windows, links are possible, but they are not as serious an issue as they are on Unix because they are much less frequently used.

Fortunately, most modern Unix systems provide, as part of the standard C runtime, a function called realpath( ) that will properly resolve the canonical name of a file or path, taking relative paths and links into account. Be careful when using realpath( ) because the function is not thread-safe, and the resolved path is stored in a fixed-size buffer that must be at least MAXPATHLEN bytes in size.

The function realpath( ) is not thread-safe because it changes the current directory as it resolves the path. On Unix, a process has a single current directory, regardless of how many threads it has, so changing the current directory in one thread will affect all other threads within the process.

The signature for realpath( ) is:

char *realpath(const char *pathname, char resolved_path[MAXPATHLEN]);

This function has the following arguments:

pathname: Path to be resolved.
resolved_path: Buffer into which the resolved path will be written. It must be at least MAXPATHLEN bytes in size. realpath( ) will never write more than that into the buffer, including the NULL-terminating byte.

If the function fails for any reason, the return value will be NULL, and errno will contain an error code indicating the reason for the failure. If the function is successful, a pointer to resolved_path will be returned.

On Windows, there is an equivalent function to realpath( ) called GetFullPathName( ). It will resolve relative paths, link information, and even UNC (Microsoft's Universal Naming Convention) names. The function is more flexible than its Unix counterpart in that it is thread-safe and provides an interface to allow you to dynamically allocate enough memory to hold the resolved canonical path.

The signature for GetFullPathName( ) is:

DWORD GetFullPathName(LPCTSTR lpFileName, DWORD nBufferLength, LPTSTR lpBuffer,
                       LPTSTR *lpFilePath);

This function has the following arguments:

lpFileName: Path to be resolved.
nBufferLength: Size of the buffer, in characters, into which the resolved path will be written.
lpBuffer: Buffer into which the resolved path will be written.
lpFilePart: Pointer into lpBuffer that points to the filename portion of the resolved path. GetFullPathName( ) will set this pointer on return if it is successful in resolving the path.

When you initially call GetFullPathName( ), you should specifiy NULL for lpBuffer, and 0 for nBufferLength. When you do this, the return value from GetFullPathName( ) will be the number of characters required to hold the resolved path. After you allocate the necessary buffer space, call GetFullPathName( ) again with nBufferLength and lpBuffer filled in appropriately.

GetFullPathName( ) requires the length of the buffer to be specified in characters, not bytes. Likewise, the return value from the function will be in units of characters rather than bytes. When allocating memory for the buffer, be sure to multiply the number of characters by sizeof(TCHAR).

If an error occurs in resolving the path, GetFullPathName( ) will return 0, and you can call GetLastError( ) to determine the cause of the error; otherwise, it will return the number of characters written into lpBuffer.

In the following example, SpcResolvePath( ) demonstrates how to use GetFullPathName( ) properly. If it is successful, it will return a dynamically allocated buffer that contains the resolved path; otherwise, it will return NULL. The allocated buffer must be freed by calling LocalFree( ).

#include <windows.h>
   
LPTSTR SpcResolvePath(LPCTSTR lpFileName) {
  DWORD  dwLastError, nBufferLength;
  LPTSTR lpBuffer, lpFilePart;
   
  if (!(nBufferLength = GetFullPathName(lpFileName, 0, 0, &lpFilePart))) return 0;
  if (!(lpBuffer = (LPTSTR)LocalAlloc(LMEM_FIXED, sizeof(TCHAR) * nBufferLength)))
    return 0;
  if (!GetFullPathName(lpFileName, nBufferLength, lpBuffer, &lpFilePart)) {
    dwLastError = GetLastError(  );
    LocalFree(lpBuffer);
    SetLastError(dwLastError);
    return 0;
  }
   
  return lpBuffer;
}

[ Team LiB ]