fileutils - Filesystem helpers¶
Virtually every Python programmer has used Python for wrangling disk contents, and fileutils collects solutions to some of the most commonly-found gaps in the standard library.
Creating, Finding, and Copying¶
Creates a directory and any parent directories that may need to be created along the way, without raising errors for any existing directories. This function mimics the behavior of the mkdir -p command available in Linux/BSD environments, but also works on Windows.
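As a rough illustration of the mkdir -p behavior described above, here is a minimal standard-library sketch. The name make_dirs_sketch is hypothetical and is not part of fileutils; it assumes Python 3.2+ for the exist_ok argument.

```python
import os
import tempfile


def make_dirs_sketch(path):
    """Create path and any missing parents; do nothing if it already exists.

    Hypothetical helper for illustration only, not the fileutils function.
    """
    # exist_ok=True suppresses the error for an existing directory.
    # An existing non-directory at `path` still raises FileExistsError.
    os.makedirs(path, exist_ok=True)


base = tempfile.mkdtemp()
nested = os.path.join(base, "a", "b", "c")
make_dirs_sketch(nested)
make_dirs_sketch(nested)  # the second call is a no-op rather than an error
print(os.path.isdir(nested))
```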
iter_find_files(directory, patterns, ignored=None)¶
Returns a generator that yields file paths under a directory, matching patterns using glob syntax (e.g., *.txt). Also supports ignored patterns.
For example, finding Python files in the current directory:
>>> filenames = sorted(iter_find_files(_CUR_DIR, '*.py'))
>>> os.path.basename(filenames[-1])
'urlutils.py'
Or, Python files while ignoring emacs lockfiles:
>>> filenames = iter_find_files(_CUR_DIR, '*.py', ignored='.#*')
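To make the matching and ignoring behavior concrete, the following sketch approximates it with os.walk and fnmatch. The function find_files_sketch is a hypothetical stand-in, and matching against the file's basename is an assumption of this sketch rather than a statement about how iter_find_files is implemented.

```python
import fnmatch
import os
import tempfile


def find_files_sketch(directory, patterns, ignored=None):
    """Yield paths under directory whose basename matches any glob pattern.

    Hypothetical illustration; not the fileutils implementation.
    """
    # Accept either a single pattern string or a list of patterns.
    if isinstance(patterns, str):
        patterns = [patterns]
    if ignored is None:
        ignored = []
    elif isinstance(ignored, str):
        ignored = [ignored]

    for root, _dirs, files in os.walk(directory):
        for name in files:
            # Ignored patterns take priority over wanted patterns.
            if any(fnmatch.fnmatch(name, pat) for pat in ignored):
                continue
            if any(fnmatch.fnmatch(name, pat) for pat in patterns):
                yield os.path.join(root, name)


demo_dir = tempfile.mkdtemp()
for name in ("a.py", "b.txt", ".#a.py"):
    open(os.path.join(demo_dir, name), "w").close()

found = sorted(
    os.path.basename(p)
    for p in find_files_sketch(demo_dir, "*.py", ignored=".#*")
)
print(found)
```

Because the result is a generator, nothing is walked until it is iterated, which is why the example wraps it in sorted().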
copytree(src, dst, symlinks=False, ignore=None)¶
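The copy_tree function is an exact copy of the built-in shutil.copytree(), with one key difference: it will not raise an exception if part of the tree already exists. It achieves this by creating directories with the same mkdir -p-style behavior described above, which tolerates existing directories.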
- src (str) – Path of the source directory to copy.
- dst (str) – Destination path. Existing directories accepted.
- symlinks (bool) – If True, copy symlinks rather than their contents.
- ignore (callable) – A callable that takes a path and directory listing, returning the files within the listing to be ignored.
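For comparison, newer versions of the standard library offer a similar tolerance for existing directories. The sketch below assumes Python 3.8+, where shutil.copytree() accepts dirs_exist_ok. It shows the behavior only and is not how the function documented here is implemented.

```python
import os
import shutil
import tempfile

src = tempfile.mkdtemp()
dst = tempfile.mkdtemp()  # the destination directory already exists

# Build a small source tree: src/pkg/mod.py
os.makedirs(os.path.join(src, "pkg"))
with open(os.path.join(src, "pkg", "mod.py"), "w") as f:
    f.write("x = 1\n")

# Part of the tree already exists at the destination as well.
os.makedirs(os.path.join(dst, "pkg"))

# Without dirs_exist_ok=True this call would raise FileExistsError.
shutil.copytree(src, dst, dirs_exist_ok=True)

copied = os.path.join(dst, "pkg", "mod.py")
print(os.path.exists(copied))
```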
Atomic File Saving¶
Using the same API as a writable file, all output is saved to a temporary file, and when the file is closed, the old file is replaced by the new file in a single system call, portable across all major operating systems. No more partially-written or partially-overwritten files.
AtomicSaver is a configurable context manager that provides a writable file which will be moved into place as long as no exceptions are raised within the context manager’s block. These “part files” are created in the same directory as the destination path to ensure atomic move operations (i.e., no cross-filesystem moves occur).
- dest_path (str) – The path where the completed file will be written.
- overwrite (bool) – Whether to overwrite the destination file if
it exists at completion time. Defaults to
- file_perms (int) – Integer representation of file permissions for the newly-created file. Defaults are, when the destination path already exists, to copy the permissions from the previous file, or if the file did not exist, to respect the user’s configured umask, usually resulting in octal 0644 or 0664.
- part_file (str) – Name of the temporary part_file. Defaults to dest_path + .part. Note that this argument is just the filename, and not the full path of the part file. To guarantee atomic saves, part files are always created in the same directory as the destination path.
- overwrite_part (bool) – Whether to overwrite the part_file, should it exist at setup time. Defaults to False, which results in an OSError being raised on pre-existing part files. Be careful of setting this to True in situations when multiple threads or processes could be writing to the same part file.
- rm_part_on_exc (bool) – Remove part_file on exception cases. False can be useful for recovery in some cases. Note that resumption is not automatic and by default an OSError is raised if the part_file exists.
Practically, the AtomicSaver serves a few purposes:
- Avoiding overwriting an existing, valid file with a partially written one.
- Providing a reasonable guarantee that a part file only has one writer at a time.
- Optional recovery of partial data in failure cases.
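The general pattern behind these guarantees can be sketched with the standard library alone. The context manager below, atomic_save_sketch, is a hypothetical simplification: it writes to a part file next to the destination, refuses to reuse an existing part file, and only moves the part file into place when the block completes without an exception. It does not handle file permissions or the other options listed above.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def atomic_save_sketch(dest_path, rm_part_on_exc=True):
    """Write to dest_path + '.part', then move it into place on success.

    Hypothetical illustration only; not the AtomicSaver implementation.
    """
    # Same directory as dest_path, so the final move stays on one filesystem.
    part_path = dest_path + ".part"
    # Mode "x" raises FileExistsError (an OSError) if the part file exists,
    # which guards against two writers sharing one part file.
    part_file = open(part_path, "x")
    try:
        yield part_file
        part_file.flush()
        os.fsync(part_file.fileno())
    except BaseException:
        part_file.close()
        if rm_part_on_exc:
            os.unlink(part_path)
        raise
    else:
        part_file.close()
        # os.replace overwrites dest_path in a single rename operation.
        os.replace(part_path, dest_path)


target = os.path.join(tempfile.mkdtemp(), "config.txt")

with atomic_save_sketch(target) as f:
    f.write("v1")

try:
    with atomic_save_sketch(target) as f:
        f.write("partial")
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

with open(target) as f:
    content = f.read()
print(content)
```

After the simulated failure the destination still holds the first version, since the partially written data only ever lived in the part file.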
atomic_rename(src, dst, overwrite=False)¶
Rename src to dst, replacing dst if overwrite is True.
Like os.replace() in Python 3.3+, this function will atomically create or replace the file at path dst with the file at path src.
On Windows, this function uses the ReplaceFile API for maximum possible atomicity on a range of filesystems.
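One way to get rename semantics with an overwrite flag on POSIX systems is sketched below. The function atomic_rename_sketch is hypothetical, relies on hard links being supported by the filesystem, and does not attempt the Windows-specific handling mentioned above.

```python
import os
import tempfile


def atomic_rename_sketch(src, dst, overwrite=False):
    """Rename src to dst, replacing dst only if overwrite is True.

    Hypothetical POSIX-oriented illustration, not the fileutils function.
    """
    if overwrite:
        # Replaces dst in one step if it exists.
        os.replace(src, dst)
    else:
        # os.link fails with FileExistsError if dst already exists, so the
        # existence check and the creation of dst happen as one operation.
        # Both names exist briefly until src is unlinked.
        os.link(src, dst)
        os.unlink(src)


workdir = tempfile.mkdtemp()
new_file = os.path.join(workdir, "new.txt")
old_file = os.path.join(workdir, "old.txt")
with open(new_file, "w") as f:
    f.write("new")
with open(old_file, "w") as f:
    f.write("old")

try:
    atomic_rename_sketch(new_file, old_file)  # overwrite defaults to False
    refused = False
except FileExistsError:
    refused = True

atomic_rename_sketch(new_file, old_file, overwrite=True)
with open(old_file) as f:
    final_content = f.read()
print(refused, final_content)
```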
Linux, BSD, Mac OS, and other Unix-like operating systems all share a simple, foundational file permission structure that is commonly complicit in accidental access denial, as well as file leakage. FilePerms was built to increase clarity and cut down on permission-related accidents when working with files from Python code.
FilePerms(user='', group='', other='')¶
The FilePerms type is used to represent standard POSIX filesystem permissions (read, write, and execute) across three classes of user:
- Owning (u)ser
- Owner’s (g)roup
- Any (o)ther user
This class assists with computing new permissions, as well as working with numeric octal and rwx-style permissions. Currently it only considers the bottom 9 permission bits; it does not support sticky bits or more advanced permission systems.
- user (str) – A string in the ‘rwx’ format, omitting characters for which owning user’s permissions are not provided.
- group (str) – A string in the ‘rwx’ format, omitting characters for which owning group permissions are not provided.
- other (str) – A string in the ‘rwx’ format, omitting characters for which other/world permissions are not provided.
There are many ways to use FilePerms:
>>> FilePerms(user='rwx', group='xrw', other='wxr')  # note character order
FilePerms(user='rwx', group='rwx', other='rwx')
>>> int(FilePerms('r', 'r', ''))
288
>>> oct(288)[-3:]  # XXX Py3k
'440'
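The relationship between rwx strings and the integer shown above can be worked through with the constants in the standard stat module. The helper perms_to_int_sketch is hypothetical and only covers the bottom 9 permission bits. It is meant to explain the arithmetic, not to reproduce the FilePerms class.

```python
import stat

# Bit values for each class of user, taken from the stat module.
_BITS = {
    "user": {"r": stat.S_IRUSR, "w": stat.S_IWUSR, "x": stat.S_IXUSR},
    "group": {"r": stat.S_IRGRP, "w": stat.S_IWGRP, "x": stat.S_IXGRP},
    "other": {"r": stat.S_IROTH, "w": stat.S_IWOTH, "x": stat.S_IXOTH},
}


def perms_to_int_sketch(user="", group="", other=""):
    """Combine rwx-style strings into an integer mode.

    Hypothetical illustration only; not the FilePerms implementation.
    """
    mode = 0
    for cls, chars in (("user", user), ("group", group), ("other", other)):
        for ch in chars:
            # Bits are OR-ed together, so character order does not matter.
            # Anything other than r, w, or x raises KeyError.
            mode |= _BITS[cls][ch]
    return mode


print(perms_to_int_sketch("r", "r", ""))             # 288, i.e. octal 440
print(oct(perms_to_int_sketch("rwx", "rx", "rx")))   # 0o755
```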
See also the FilePerms.from_path() classmethod for a useful alternative way to construct FilePerms objects.