author | Alyssa Ross <hi@alyssa.is> | 2023-11-21 16:12:21 +0100 |
---|---|---|
committer | Alyssa Ross <hi@alyssa.is> | 2023-11-21 16:12:48 +0100 |
commit | 048a4cd441a59cbf89defb18bb45c9f0b4429b35 | |
tree | f8f5850ff05521ab82d65745894714a8796cbfb6 /lib/fileset | |
parent | 030c5028b07afcedce7c5956015c629486cc79d9 | |
parent | 4c2d05dd6435d449a3651a6dd314d9411b5f8146 | |
Signed-off-by: Alyssa Ross <hi@alyssa.is>
Diffstat (limited to 'lib/fileset')
-rw-r--r-- | lib/fileset/README.md | 94 | ||||
-rw-r--r-- | lib/fileset/default.nix | 485 | ||||
-rw-r--r-- | lib/fileset/internal.nix | 581 | ||||
-rwxr-xr-x | lib/fileset/tests.sh | 1237 |
4 files changed, 2171 insertions, 226 deletions
diff --git a/lib/fileset/README.md b/lib/fileset/README.md index 6e57f1f8f2b..14b6877a906 100644 --- a/lib/fileset/README.md +++ b/lib/fileset/README.md @@ -1,5 +1,10 @@ # File set library +This is the internal contributor documentation. +The user documentation is [in the Nixpkgs manual](https://nixos.org/manual/nixpkgs/unstable/#sec-fileset). + +## Goals + The main goal of the file set library is to be able to select local files that should be added to the Nix store. It should have the following properties: - Easy: @@ -41,12 +46,20 @@ An attribute set with these values: - `_type` (constant string `"fileset"`): Tag to indicate this value is a file set. -- `_internalVersion` (constant `2`, the current version): +- `_internalVersion` (constant `3`, the current version): Version of the representation. +- `_internalIsEmptyWithoutBase` (bool): + Whether this file set is the empty file set without a base path. + If `true`, `_internalBase*` and `_internalTree` are not set. + This is the only way to represent an empty file set without needing a base path. + + Such a value can be used as the identity element for `union` and the return value of `unions []` and co. + - `_internalBase` (path): Any files outside of this path cannot influence the set of files. - This is always a directory. + This is always a directory and should be as long as possible. + This is used by `lib.fileset.toSource` to check that all files are under the `root` argument - `_internalBaseRoot` (path): The filesystem root of `_internalBase`, same as `(lib.path.splitRoot _internalBase).root`. @@ -111,9 +124,57 @@ Arguments: - (+) This can be removed later, if we discover it's too restrictive - (-) It leads to errors when a sensible result could sometimes be returned, such as in the above example. +### Empty file set without a base + +There is a special representation for an empty file set without a base path. 
+This is used for return values that should be empty but when there's no base path that would makes sense. + +Arguments: +- Alternative: This could also be represented using `_internalBase = /.` and `_internalTree = null`. + - (+) Removes the need for a special representation. + - (-) Due to [influence tracking](#influence-tracking), + `union empty ./.` would have `/.` as the base path, + which would then prevent `toSource { root = ./.; fileset = union empty ./.; }` from working, + which is not as one would expect. + - (-) With the assumption that there can be multiple filesystem roots (as established with the [path library](../path/README.md)), + this would have to cause an error with `union empty pathWithAnotherFilesystemRoot`, + which is not as one would expect. +- Alternative: Do not have such a value and error when it would be needed as a return value + - (+) Removes the need for a special representation. + - (-) Leaves us with no identity element for `union` and no reasonable return value for `unions []`. + From a set theory perspective, which has a well-known notion of empty sets, this is unintuitive. + +### No intersection for lists + +While there is `intersection a b`, there is no function `intersections [ a b c ]`. 
+ +Arguments: +- (+) There is no known use case for such a function, it can be added later if a use case arises +- (+) There is no suitable return value for `intersections [ ]`, see also "Nullary intersections" [here](https://en.wikipedia.org/w/index.php?title=List_of_set_identities_and_relations&oldid=1177174035#Definitions) + - (-) Could throw an error for that case + - (-) Create a special value to represent "all the files" and return that + - (+) Such a value could then not be used with `fileFilter` unless the internal representation is changed considerably + - (-) Could return the empty file set + - (+) This would be wrong in set theory +- (-) Inconsistent with `union` and `unions` + +### Intersection base path + +The base path of the result of an `intersection` is the longest base path of the arguments. +E.g. the base path of `intersection ./foo ./foo/bar` is `./foo/bar`. +Meanwhile `intersection ./foo ./bar` returns the empty file set without a base path. + +Arguments: +- Alternative: Use the common prefix of all base paths as the resulting base path + - (-) This is unnecessarily strict, because the purpose of the base path is to track the directory under which files _could_ be in the file set. It should be as long as possible. + All files contained in `intersection ./foo ./foo/bar` will be under `./foo/bar` (never just under `./foo`), and `intersection ./foo ./bar` will never contain any files (never under `./.`). + This would lead to `toSource` having to unexpectedly throw errors for cases such as `toSource { root = ./foo; fileset = intersect ./foo base; }`, where `base` may be `./bar` or `./.`. + - (-) There is no benefit to the user, since base path is not directly exposed in the interface + ### Empty directories -File sets can only represent a _set_ of local files, directories on their own are not representable. +File sets can only represent a _set_ of local files. +Directories on their own are not representable. 
Arguments: - (+) There does not seem to be a sensible set of combinators when directories can be represented on their own. @@ -129,7 +190,7 @@ Arguments: - `./.` represents all files in `./.` _and_ the directory itself, but not its subdirectories, meaning that at least `./.` will be preserved even if it's empty. - In that case, `intersect ./. ./foo` should only include files and no directories themselves, since `./.` includes only `./.` as a directory, and same for `./foo`, so there's no overlap in directories. + In that case, `intersection ./. ./foo` should only include files and no directories themselves, since `./.` includes only `./.` as a directory, and same for `./foo`, so there's no overlap in directories. But intuitively this operation should result in the same as `./foo` – everything else is just confusing. - (+) This matches how Git only supports files, so developers should already be used to it. - (-) Empty directories (even if they contain nested directories) are neither representable nor preserved when coercing from paths. @@ -144,7 +205,7 @@ File sets do not support Nix store paths in strings such as `"/nix/store/...-sou Arguments: - (+) Such paths are usually produced by derivations, which means `toSource` would either: - - Require IFD if `builtins.path` is used as the underlying primitive + - Require [Import From Derivation](https://nixos.org/manual/nix/unstable/language/import-from-derivation) (IFD) if `builtins.path` is used as the underlying primitive - Require importing the entire `root` into the store such that derivations can be used to do the filtering - (+) The convenient path coercion like `union ./foo ./bar` wouldn't work for absolute paths, requiring more verbose alternate interfaces: - `let root = "/nix/store/...-source"; in union "${root}/foo" "${root}/bar"` @@ -164,6 +225,9 @@ Arguments: This use case makes little sense for files that are already in the store. This should be a separate abstraction as e.g. 
`pkgs.drvLayout` instead, which could have a similar interface but be specific to derivations. Additional capabilities could be supported that can't be done at evaluation time, such as renaming files, creating new directories, setting executable bits, etc. +- (+) An API for filtering/transforming Nix store paths could be much more powerful, + because it's not limited to just what is possible at evaluation time with `builtins.path`. + Operations such as moving and adding files would be supported. ### Single files @@ -174,12 +238,22 @@ Arguments: And it would be unclear how the library should behave if the one file wouldn't be added to the store: `toSource { root = ./file.nix; fileset = <empty>; }` has no reasonable result because returing an empty store path wouldn't match the file type, and there's no way to have an empty file store path, whatever that would mean. +### `fileFilter` takes a path + +The `fileFilter` function takes a path, and not a file set, as its second argument. + +- (-) Makes it harder to compose functions, since the file set type, the return value, can't be passed to the function itself like `fileFilter predicate fileset` + - (+) It's still possible to use `intersection` to filter on file sets: `intersection fileset (fileFilter predicate ./.)` + - (-) This does need an extra `./.` argument that's not obvious + - (+) This could always be `/.` or the project directory, `intersection` will make it lazy +- (+) In the future this will allow `fileFilter` to support a predicate property like `subpath` and/or `components` in a reproducible way. + This wouldn't be possible if it took a file set, because file sets don't have a predictable absolute path. + - (-) What about the base path? + - (+) That can change depending on which files are included, so if it's used for `fileFilter` + it would change the `subpath`/`components` value depending on which files are included. 
+- (+) If necessary, this restriction can be relaxed later, the opposite wouldn't be possible + ## To update in the future Here's a list of places in the library that need to be updated in the future: -- > The file set library is currently somewhat limited but is being expanded to include more functions over time. - - in [the manual](../../doc/functions/fileset.section.md) -- Once a tracing function exists, `__noEval` in [internal.nix](./internal.nix) should mention it -- If/Once a function to convert `lib.sources` values into file sets exists, the `_coerce` and `toSource` functions should be updated to mention that function in the error when such a value is passed - If/Once a function exists that can optionally include a path depending on whether it exists, the error message for the path not existing in `_coerce` should mention the new function diff --git a/lib/fileset/default.nix b/lib/fileset/default.nix index 88c8dcd1a70..15af0813eec 100644 --- a/lib/fileset/default.nix +++ b/lib/fileset/default.nix @@ -3,19 +3,31 @@ let inherit (import ./internal.nix { inherit lib; }) _coerce + _singleton _coerceMany _toSourceFilter + _fromSourceFilter _unionMany + _fileFilter + _printFileset + _intersection + _difference + _mirrorStorePath + _fetchGitSubmodulesMinver ; inherit (builtins) + isBool isList isPath pathExists + seq typeOf + nixVersion ; inherit (lib.lists) + elemAt imap0 ; @@ -26,6 +38,7 @@ let inherit (lib.strings) isStringLike + versionOlder ; inherit (lib.filesystem) @@ -37,7 +50,9 @@ let ; inherit (lib.trivial) + isFunction pipe + inPureEvalMode ; in { @@ -115,11 +130,10 @@ in { Paths in [strings](https://nixos.org/manual/nix/stable/language/values.html#type-string), including Nix store paths, cannot be passed as `root`. `root` has to be a directory. 
-<!-- Ignore the indentation here, this is a nixdoc rendering bug that needs to be fixed: https://github.com/nix-community/nixdoc/issues/75 --> -:::{.note} -Changing `root` only affects the directory structure of the resulting store path, it does not change which files are added to the store. -The only way to change which files get added to the store is by changing the `fileset` attribute. -::: + :::{.note} + Changing `root` only affects the directory structure of the resulting store path, it does not change which files are added to the store. + The only way to change which files get added to the store is by changing the `fileset` attribute. + ::: */ root, /* @@ -128,10 +142,9 @@ The only way to change which files get added to the store is by changing the `fi This argument can also be a path, which gets [implicitly coerced to a file set](#sec-fileset-path-coercion). -<!-- Ignore the indentation here, this is a nixdoc rendering bug that needs to be fixed: https://github.com/nix-community/nixdoc/issues/75 --> -:::{.note} -If a directory does not recursively contain any file, it is omitted from the store path contents. -::: + :::{.note} + If a directory does not recursively contain any file, it is omitted from the store path contents. + ::: */ fileset, @@ -147,36 +160,41 @@ If a directory does not recursively contain any file, it is omitted from the sto sourceFilter = _toSourceFilter fileset; in if ! isPath root then - if isStringLike root then + if root ? _isLibCleanSourceWith then throw '' - lib.fileset.toSource: `root` ("${toString root}") is a string-like value, but it should be a path instead. + lib.fileset.toSource: `root` is a `lib.sources`-based value, but it should be a path instead. + To use a `lib.sources`-based value, convert it to a file set using `lib.fileset.fromSource` and pass it as `fileset`. 
+ Note that this only works for sources created from paths.'' + else if isStringLike root then + throw '' + lib.fileset.toSource: `root` (${toString root}) is a string-like value, but it should be a path instead. Paths in strings are not supported by `lib.fileset`, use `lib.sources` or derivations instead.'' else throw '' lib.fileset.toSource: `root` is of type ${typeOf root}, but it should be a path instead.'' # Currently all Nix paths have the same filesystem root, but this could change in the future. # See also ../path/README.md - else if rootFilesystemRoot != filesetFilesystemRoot then + else if ! fileset._internalIsEmptyWithoutBase && rootFilesystemRoot != filesetFilesystemRoot then throw '' - lib.fileset.toSource: Filesystem roots are not the same for `fileset` and `root` ("${toString root}"): - `root`: root "${toString rootFilesystemRoot}" - `fileset`: root "${toString filesetFilesystemRoot}" - Different roots are not supported.'' + lib.fileset.toSource: Filesystem roots are not the same for `fileset` and `root` (${toString root}): + `root`: Filesystem root is "${toString rootFilesystemRoot}" + `fileset`: Filesystem root is "${toString filesetFilesystemRoot}" + Different filesystem roots are not supported.'' else if ! pathExists root then throw '' - lib.fileset.toSource: `root` (${toString root}) does not exist.'' + lib.fileset.toSource: `root` (${toString root}) is a path that does not exist.'' else if pathType root != "directory" then throw '' lib.fileset.toSource: `root` (${toString root}) is a file, but it should be a directory instead. Potential solutions: - If you want to import the file into the store _without_ a containing directory, use string interpolation or `builtins.path` instead of this function. - If you want to import the file into the store _with_ a containing directory, set `root` to the containing directory, such as ${toString (dirOf root)}, and set `fileset` to the file path.'' - else if ! 
hasPrefix root fileset._internalBase then + else if ! fileset._internalIsEmptyWithoutBase && ! hasPrefix root fileset._internalBase then throw '' lib.fileset.toSource: `fileset` could contain files in ${toString fileset._internalBase}, which is not under the `root` (${toString root}). Potential solutions: - Set `root` to ${toString fileset._internalBase} or any directory higher up. This changes the layout of the resulting store path. - Set `fileset` to a file set that cannot contain files outside the `root` (${toString root}). This could change the files included in the result.'' else - builtins.seq sourceFilter + seq sourceFilter cleanSourceWith { name = "source"; src = root; @@ -184,6 +202,75 @@ If a directory does not recursively contain any file, it is omitted from the sto }; /* + Create a file set with the same files as a `lib.sources`-based value. + This does not import any of the files into the store. + + This can be used to gradually migrate from `lib.sources`-based filtering to `lib.fileset`. + + A file set can be turned back into a source using [`toSource`](#function-library-lib.fileset.toSource). + + :::{.note} + File sets cannot represent empty directories. + Turning the result of this function back into a source using `toSource` will therefore not preserve empty directories. + ::: + + Type: + fromSource :: SourceLike -> FileSet + + Example: + # There's no cleanSource-like function for file sets yet, + # but we can just convert cleanSource to a file set and use it that way + toSource { + root = ./.; + fileset = fromSource (lib.sources.cleanSource ./.); + } + + # Keeping a previous sourceByRegex (which could be migrated to `lib.fileset.unions`), + # but removing a subdirectory using file set functions + difference + (fromSource (lib.sources.sourceByRegex ./. 
[ + "^README\.md$" + # This regex includes everything in ./doc + "^doc(/.*)?$" + ]) + ./doc/generated + + # Use cleanSource, but limit it to only include ./Makefile and files under ./src + intersection + (fromSource (lib.sources.cleanSource ./.)) + (unions [ + ./Makefile + ./src + ]); + */ + fromSource = source: + let + # This function uses `._isLibCleanSourceWith`, `.origSrc` and `.filter`, + # which are technically internal to lib.sources, + # but we'll allow this since both libraries are in the same code base + # and this function is a bridge between them. + isFiltered = source ? _isLibCleanSourceWith; + path = if isFiltered then source.origSrc else source; + in + # We can only support sources created from paths + if ! isPath path then + if isStringLike path then + throw '' + lib.fileset.fromSource: The source origin of the argument is a string-like value ("${toString path}"), but it should be a path instead. + Sources created from paths in strings cannot be turned into file sets, use `lib.sources` or derivations instead.'' + else + throw '' + lib.fileset.fromSource: The source origin of the argument is of type ${typeOf path}, but it should be a path instead.'' + else if ! pathExists path then + throw '' + lib.fileset.fromSource: The source origin (${toString path}) of the argument does not exist.'' + else if isFiltered then + _fromSourceFilter path source.filter + else + # If there's no filter, no need to run the expensive conversion, all subpaths will be included + _singleton path; + + /* The file set containing all files that are in either of two given file sets. This is the same as [`unions`](#function-library-lib.fileset.unions), but takes just two file sets instead of a list. 
@@ -216,11 +303,11 @@ If a directory does not recursively contain any file, it is omitted from the sto _unionMany (_coerceMany "lib.fileset.union" [ { - context = "first argument"; + context = "First argument"; value = fileset1; } { - context = "second argument"; + context = "Second argument"; value = fileset2; } ]); @@ -258,24 +345,368 @@ If a directory does not recursively contain any file, it is omitted from the sto */ unions = # A list of file sets. - # Must contain at least 1 element. # The elements can also be paths, # which get [implicitly coerced to file sets](#sec-fileset-path-coercion). filesets: if ! isList filesets then - throw "lib.fileset.unions: Expected argument to be a list, but got a ${typeOf filesets}." - else if filesets == [ ] then - # TODO: This could be supported, but requires an extra internal representation for the empty file set, which would be special for not having a base path. - throw "lib.fileset.unions: Expected argument to be a list with at least one element, but it contains no elements." + throw '' + lib.fileset.unions: Argument is of type ${typeOf filesets}, but it should be a list instead.'' else pipe filesets [ # Annotate the elements with context, used by _coerceMany for better errors (imap0 (i: el: { - context = "element ${toString i}"; + context = "Element ${toString i}"; value = el; })) (_coerceMany "lib.fileset.unions") _unionMany ]; + /* + Filter a file set to only contain files matching some predicate. + + Type: + fileFilter :: + ({ + name :: String, + type :: String, + ... + } -> Bool) + -> Path + -> FileSet + + Example: + # Include all regular `default.nix` files in the current directory + fileFilter (file: file.name == "default.nix") ./. + + # Include all non-Nix files from the current directory + fileFilter (file: ! hasSuffix ".nix" file.name) ./. + + # Include all files that start with a "." in the current directory + fileFilter (file: hasPrefix "." file.name) ./. 
+ + # Include all regular files (not symlinks or others) in the current directory + fileFilter (file: file.type == "regular") ./. + */ + fileFilter = + /* + The predicate function to call on all files contained in given file set. + A file is included in the resulting file set if this function returns true for it. + + This function is called with an attribute set containing these attributes: + + - `name` (String): The name of the file + + - `type` (String, one of `"regular"`, `"symlink"` or `"unknown"`): The type of the file. + This matches result of calling [`builtins.readFileType`](https://nixos.org/manual/nix/stable/language/builtins.html#builtins-readFileType) on the file's path. + + Other attributes may be added in the future. + */ + predicate: + # The path whose files to filter + path: + if ! isFunction predicate then + throw '' + lib.fileset.fileFilter: First argument is of type ${typeOf predicate}, but it should be a function instead.'' + else if ! isPath path then + if path._type or "" == "fileset" then + throw '' + lib.fileset.fileFilter: Second argument is a file set, but it should be a path instead. + If you need to filter files in a file set, use `intersection fileset (fileFilter pred ./.)` instead.'' + else + throw '' + lib.fileset.fileFilter: Second argument is of type ${typeOf path}, but it should be a path instead.'' + else if ! pathExists path then + throw '' + lib.fileset.fileFilter: Second argument (${toString path}) is a path that does not exist.'' + else + _fileFilter predicate path; + + /* + The file set containing all files that are in both of two given file sets. + See also [Intersection (set theory)](https://en.wikipedia.org/wiki/Intersection_(set_theory)). + + The given file sets are evaluated as lazily as possible, + with the first argument being evaluated first if needed. + + Type: + intersection :: FileSet -> FileSet -> FileSet + + Example: + # Limit the selected files to the ones in ./., so only ./src and ./Makefile + intersection ./. 
(unions [ ../LICENSE ./src ./Makefile ]) + */ + intersection = + # The first file set. + # This argument can also be a path, + # which gets [implicitly coerced to a file set](#sec-fileset-path-coercion). + fileset1: + # The second file set. + # This argument can also be a path, + # which gets [implicitly coerced to a file set](#sec-fileset-path-coercion). + fileset2: + let + filesets = _coerceMany "lib.fileset.intersection" [ + { + context = "First argument"; + value = fileset1; + } + { + context = "Second argument"; + value = fileset2; + } + ]; + in + _intersection + (elemAt filesets 0) + (elemAt filesets 1); + + /* + The file set containing all files from the first file set that are not in the second file set. + See also [Difference (set theory)](https://en.wikipedia.org/wiki/Complement_(set_theory)#Relative_complement). + + The given file sets are evaluated as lazily as possible, + with the first argument being evaluated first if needed. + + Type: + union :: FileSet -> FileSet -> FileSet + + Example: + # Create a file set containing all files from the current directory, + # except ones under ./tests + difference ./. ./tests + + let + # A set of Nix-related files + nixFiles = unions [ ./default.nix ./nix ./tests/default.nix ]; + in + # Create a file set containing all files under ./tests, except ones in `nixFiles`, + # meaning only without ./tests/default.nix + difference ./tests nixFiles + */ + difference = + # The positive file set. + # The result can only contain files that are also in this file set. + # + # This argument can also be a path, + # which gets [implicitly coerced to a file set](#sec-fileset-path-coercion). + positive: + # The negative file set. + # The result will never contain files that are also in this file set. + # + # This argument can also be a path, + # which gets [implicitly coerced to a file set](#sec-fileset-path-coercion). 
+ negative: + let + filesets = _coerceMany "lib.fileset.difference" [ + { + context = "First argument (positive set)"; + value = positive; + } + { + context = "Second argument (negative set)"; + value = negative; + } + ]; + in + _difference + (elemAt filesets 0) + (elemAt filesets 1); + + /* + Incrementally evaluate and trace a file set in a pretty way. + This function is only intended for debugging purposes. + The exact tracing format is unspecified and may change. + + This function takes a final argument to return. + In comparison, [`traceVal`](#function-library-lib.fileset.traceVal) returns + the given file set argument. + + This variant is useful for tracing file sets in the Nix repl. + + Type: + trace :: FileSet -> Any -> Any + + Example: + trace (unions [ ./Makefile ./src ./tests/run.sh ]) null + => + trace: /home/user/src/myProject + trace: - Makefile (regular) + trace: - src (all files in directory) + trace: - tests + trace: - run.sh (regular) + null + */ + trace = + /* + The file set to trace. + + This argument can also be a path, + which gets [implicitly coerced to a file set](#sec-fileset-path-coercion). + */ + fileset: + let + # "fileset" would be a better name, but that would clash with the argument name, + # and we cannot change that because of https://github.com/nix-community/nixdoc/issues/76 + actualFileset = _coerce "lib.fileset.trace: Argument" fileset; + in + seq + (_printFileset actualFileset) + (x: x); + + /* + Incrementally evaluate and trace a file set in a pretty way. + This function is only intended for debugging purposes. + The exact tracing format is unspecified and may change. + + This function returns the given file set. + In comparison, [`trace`](#function-library-lib.fileset.trace) takes another argument to return. + + This variant is useful for tracing file sets passed as arguments to other functions. 
+ + Type: + traceVal :: FileSet -> FileSet + + Example: + toSource { + root = ./.; + fileset = traceVal (unions [ + ./Makefile + ./src + ./tests/run.sh + ]); + } + => + trace: /home/user/src/myProject + trace: - Makefile (regular) + trace: - src (all files in directory) + trace: - tests + trace: - run.sh (regular) + "/nix/store/...-source" + */ + traceVal = + /* + The file set to trace and return. + + This argument can also be a path, + which gets [implicitly coerced to a file set](#sec-fileset-path-coercion). + */ + fileset: + let + # "fileset" would be a better name, but that would clash with the argument name, + # and we cannot change that because of https://github.com/nix-community/nixdoc/issues/76 + actualFileset = _coerce "lib.fileset.traceVal: Argument" fileset; + in + seq + (_printFileset actualFileset) + # We could also return the original fileset argument here, + # but that would then duplicate work for consumers of the fileset, because then they have to coerce it again + actualFileset; + + /* + Create a file set containing all [Git-tracked files](https://git-scm.com/book/en/v2/Git-Basics-Recording-Changes-to-the-Repository) in a repository. + + This function behaves like [`gitTrackedWith { }`](#function-library-lib.fileset.gitTrackedWith) - using the defaults. + + Type: + gitTracked :: Path -> FileSet + + Example: + # Include all files tracked by the Git repository in the current directory + gitTracked ./. + + # Include only files tracked by the Git repository in the parent directory + # that are also in the current directory + intersection ./. (gitTracked ../.) + */ + gitTracked = + /* + The [path](https://nixos.org/manual/nix/stable/language/values#type-path) to the working directory of a local Git repository. + This directory must contain a `.git` file or subdirectory. 
+ */ + path: + # See the gitTrackedWith implementation for more explanatory comments + let + fetchResult = builtins.fetchGit path; + in + if inPureEvalMode then + throw "lib.fileset.gitTracked: This function is currently not supported in pure evaluation mode, since it currently relies on `builtins.fetchGit`. See https://github.com/NixOS/nix/issues/9292." + else if ! isPath path then + throw "lib.fileset.gitTracked: Expected the argument to be a path, but it's a ${typeOf path} instead." + else if ! pathExists (path + "/.git") then + throw "lib.fileset.gitTracked: Expected the argument (${toString path}) to point to a local working tree of a Git repository, but it's not." + else + _mirrorStorePath path fetchResult.outPath; + + /* + Create a file set containing all [Git-tracked files](https://git-scm.com/book/en/v2/Git-Basics-Recording-Changes-to-the-Repository) in a repository. + The first argument allows configuration with an attribute set, + while the second argument is the path to the Git working tree. + If you don't need the configuration, + you can use [`gitTracked`](#function-library-lib.fileset.gitTracked) instead. + + This is equivalent to the result of [`unions`](#function-library-lib.fileset.unions) on all files returned by [`git ls-files`](https://git-scm.com/docs/git-ls-files) + (which uses [`--cached`](https://git-scm.com/docs/git-ls-files#Documentation/git-ls-files.txt--c) by default). + + :::{.warning} + Currently this function is based on [`builtins.fetchGit`](https://nixos.org/manual/nix/stable/language/builtins.html#builtins-fetchGit) + As such, this function causes all Git-tracked files to be unnecessarily added to the Nix store, + without being re-usable by [`toSource`](#function-library-lib.fileset.toSource). + + This may change in the future. + ::: + + Type: + gitTrackedWith :: { recurseSubmodules :: Bool ? 
false } -> Path -> FileSet + + Example: + # Include all files tracked by the Git repository in the current directory + # and any submodules under it + gitTracked { recurseSubmodules = true; } ./. + */ + gitTrackedWith = + { + /* + (optional, default: `false`) Whether to recurse into [Git submodules](https://git-scm.com/book/en/v2/Git-Tools-Submodules) to also include their tracked files. + + If `true`, this is equivalent to passing the [--recurse-submodules](https://git-scm.com/docs/git-ls-files#Documentation/git-ls-files.txt---recurse-submodules) flag to `git ls-files`. + */ + recurseSubmodules ? false, + }: + /* + The [path](https://nixos.org/manual/nix/stable/language/values#type-path) to the working directory of a local Git repository. + This directory must contain a `.git` file or subdirectory. + */ + path: + let + # This imports the files unnecessarily, which currently can't be avoided + # because `builtins.fetchGit` is the only function exposing which files are tracked by Git. + # With the [lazy trees PR](https://github.com/NixOS/nix/pull/6530), + # the unnecessarily import could be avoided. + # However a simpler alternative still would be [a builtins.gitLsFiles](https://github.com/NixOS/nix/issues/2944). + fetchResult = builtins.fetchGit { + url = path; + + # This is the only `fetchGit` parameter that makes sense in this context. + # We can't just pass `submodules = recurseSubmodules` here because + # this would fail for Nix versions that don't support `submodules`. + ${if recurseSubmodules then "submodules" else null} = true; + }; + in + if inPureEvalMode then + throw "lib.fileset.gitTrackedWith: This function is currently not supported in pure evaluation mode, since it currently relies on `builtins.fetchGit`. See https://github.com/NixOS/nix/issues/9292." + else if ! 
isBool recurseSubmodules then + throw "lib.fileset.gitTrackedWith: Expected the attribute `recurseSubmodules` of the first argument to be a boolean, but it's a ${typeOf recurseSubmodules} instead." + else if recurseSubmodules && versionOlder nixVersion _fetchGitSubmodulesMinver then + throw "lib.fileset.gitTrackedWith: Setting the attribute `recurseSubmodules` to `true` is only supported for Nix version ${_fetchGitSubmodulesMinver} and after, but Nix version ${nixVersion} is used." + else if ! isPath path then + throw "lib.fileset.gitTrackedWith: Expected the second argument to be a path, but it's a ${typeOf path} instead." + # We can identify local working directories by checking for .git, + # see https://git-scm.com/docs/gitrepository-layout#_description. + # Note that `builtins.fetchGit` _does_ work for bare repositories (where there's no `.git`), + # even though `git ls-files` wouldn't return any files in that case. + else if ! pathExists (path + "/.git") then + throw "lib.fileset.gitTrackedWith: Expected the second argument (${toString path}) to point to a local working tree of a Git repository, but it's not." + else + _mirrorStorePath path fetchResult.outPath; } diff --git a/lib/fileset/internal.nix b/lib/fileset/internal.nix index 2c329edb390..0769e654c8f 100644 --- a/lib/fileset/internal.nix +++ b/lib/fileset/internal.nix @@ -7,14 +7,15 @@ let isString pathExists readDir - typeOf split + trace + typeOf ; inherit (lib.attrsets) + attrNames attrValues mapAttrs - setAttrByPath zipAttrsWith ; @@ -25,9 +26,9 @@ let inherit (lib.lists) all commonPrefix - drop elemAt filter + findFirst findFirstIndex foldl' head @@ -64,7 +65,7 @@ rec { # - Increment this version # - Add an additional migration function below # - Update the description of the internal representation in ./README.md - _currentVersion = 2; + _currentVersion = 3; # Migrations between versions. 
The 0th element converts from v0 to v1, and so on migrations = [ @@ -89,8 +90,38 @@ rec { _internalVersion = 2; } ) + + # Convert v2 into v3: filesetTree's now have a representation for an empty file set without a base path + ( + filesetV2: + filesetV2 // { + # All v2 file sets are not the new empty file set + _internalIsEmptyWithoutBase = false; + _internalVersion = 3; + } + ) ]; + _noEvalMessage = '' + lib.fileset: Directly evaluating a file set is not supported. + To turn it into a usable source, use `lib.fileset.toSource`. + To pretty-print the contents, use `lib.fileset.trace` or `lib.fileset.traceVal`.''; + + # The empty file set without a base path + _emptyWithoutBase = { + _type = "fileset"; + + _internalVersion = _currentVersion; + + # The one and only! + _internalIsEmptyWithoutBase = true; + + # Due to alphabetical ordering, this is evaluated last, + # which makes the nix repl output nicer than if it would be ordered first. + # It also allows evaluating it strictly up to this error, which could be useful + _noEval = throw _noEvalMessage; + }; + # Create a fileset, see ./README.md#fileset # Type: path -> filesetTree -> fileset _create = base: tree: @@ -103,14 +134,17 @@ rec { _type = "fileset"; _internalVersion = _currentVersion; + + _internalIsEmptyWithoutBase = false; _internalBase = base; _internalBaseRoot = parts.root; _internalBaseComponents = components parts.subpath; _internalTree = tree; - # Double __ to make it be evaluated and ordered first - __noEval = throw '' - lib.fileset: Directly evaluating a file set is not supported. Use `lib.fileset.toSource` to turn it into a usable source instead.''; + # Due to alphabetical ordering, this is evaluated last, + # which makes the nix repl output nicer than if it would be ordered first. + # It also allows evaluating it strictly up to this error, which could be useful + _noEval = throw _noEvalMessage; }; # Coerce a value to a fileset, erroring when the value cannot be coerced. 
@@ -133,16 +167,21 @@ rec { else value else if ! isPath value then - if isStringLike value then + if value ? _isLibCleanSourceWith then + throw '' + ${context} is a `lib.sources`-based value, but it should be a file set or a path instead. + To convert a `lib.sources`-based value to a file set you can use `lib.fileset.fromSource`. + Note that this only works for sources created from paths.'' + else if isStringLike value then throw '' - ${context} ("${toString value}") is a string-like value, but it should be a path instead. + ${context} ("${toString value}") is a string-like value, but it should be a file set or a path instead. Paths represented as strings are not supported by `lib.fileset`, use `lib.sources` or derivations instead.'' else throw '' - ${context} is of type ${typeOf value}, but it should be a path instead.'' + ${context} is of type ${typeOf value}, but it should be a file set or a path instead.'' else if ! pathExists value then throw '' - ${context} (${toString value}) does not exist.'' + ${context} (${toString value}) is a path that does not exist.'' else _singleton value; @@ -155,19 +194,25 @@ rec { _coerce "${functionContext}: ${context}" value ) list; - firstBaseRoot = (head filesets)._internalBaseRoot; + # Find the first value with a base, there may be none! + firstWithBase = findFirst (fileset: ! fileset._internalIsEmptyWithoutBase) null filesets; + # This value is only accessed if first != null + firstBaseRoot = firstWithBase._internalBaseRoot; # Finds the first element with a filesystem root different than the first element, if any differentIndex = findFirstIndex (fileset: - firstBaseRoot != fileset._internalBaseRoot + # The empty value without a base doesn't have a base path + ! 
fileset._internalIsEmptyWithoutBase + && firstBaseRoot != fileset._internalBaseRoot ) null filesets; in - if differentIndex != null then + # Only evaluates `differentIndex` if there are any elements with a base + if firstWithBase != null && differentIndex != null then throw '' ${functionContext}: Filesystem roots are not the same: - ${(head list).context}: root "${toString firstBaseRoot}" - ${(elemAt list differentIndex).context}: root "${toString (elemAt filesets differentIndex)._internalBaseRoot}" - Different roots are not supported.'' + ${(head list).context}: Filesystem root is "${toString firstBaseRoot}" + ${(elemAt list differentIndex).context}: Filesystem root is "${toString (elemAt filesets differentIndex)._internalBaseRoot}" + Different filesystem roots are not supported.'' else filesets; @@ -203,22 +248,22 @@ rec { // value; /* - Simplify a filesetTree recursively: - Replace all directories that have no files with `null` + A normalisation of a filesetTree suitable for filtering with `builtins.path`: + - Replace all directories that have no files with `null`. This removes directories that would be empty - - Replace all directories with all files with `"directory"` + - Replace all directories with all files with `"directory"`. 
This speeds up the source filter function Note that this function is strict, it evaluates the entire tree Type: Path -> filesetTree -> filesetTree */ - _simplifyTree = path: tree: + _normaliseTreeFilter = path: tree: if tree == "directory" || isAttrs tree then let entries = _directoryEntries path tree; - simpleSubtrees = mapAttrs (name: _simplifyTree (path + "/${name}")) entries; - subtreeValues = attrValues simpleSubtrees; + normalisedSubtrees = mapAttrs (name: _normaliseTreeFilter (path + "/${name}")) entries; + subtreeValues = attrValues normalisedSubtrees; in # This triggers either when all files in a directory are filtered out # Or when the directory doesn't contain any files at all @@ -228,10 +273,112 @@ rec { else if all isString subtreeValues then "directory" else - simpleSubtrees + normalisedSubtrees + else + tree; + + /* + A minimal normalisation of a filesetTree, intended for pretty-printing: + - If all children of a path are recursively included or empty directories, the path itself is also recursively included + - If all children of a path are fully excluded or empty directories, the path itself is an empty directory + - Other empty directories are represented with the special "emptyDir" string + While these could be replaced with `null`, that would take another mapAttrs + + Note that this function is partially lazy. + + Type: Path -> filesetTree -> filesetTree (with "emptyDir"'s) + */ + _normaliseTreeMinimal = path: tree: + if tree == "directory" || isAttrs tree then + let + entries = _directoryEntries path tree; + normalisedSubtrees = mapAttrs (name: _normaliseTreeMinimal (path + "/${name}")) entries; + subtreeValues = attrValues normalisedSubtrees; + in + # If there are no entries, or all entries are empty directories, return "emptyDir". 
+ # After this branch we know that there's at least one file + if all (value: value == "emptyDir") subtreeValues then + "emptyDir" + + # If all subtrees are fully included or empty directories + # (both of which are coincidentally represented as strings), return "directory". + # This takes advantage of the fact that empty directories can be represented as included directories. + # Note that the tree == "directory" check allows avoiding recursion + else if tree == "directory" || all (value: isString value) subtreeValues then + "directory" + + # If all subtrees are fully excluded or empty directories, return null. + # This takes advantage of the fact that empty directories can be represented as excluded directories + else if all (value: isNull value || value == "emptyDir") subtreeValues then + null + + # Mix of included and excluded entries + else + normalisedSubtrees else tree; + # Trace a filesetTree in a pretty way when the resulting value is evaluated. + # This can handle both normal filesetTree's, and ones returned from _normaliseTreeMinimal + # Type: Path -> filesetTree (with "emptyDir"'s) -> Null + _printMinimalTree = base: tree: + let + treeSuffix = tree: + if isAttrs tree then + "" + else if tree == "directory" then + " (all files in directory)" + else + # This does "leak" the file type strings of the internal representation, + # but this is the main reason these file type strings even are in the representation! + # TODO: Consider removing that information from the internal representation for performance. 
+ # The file types can still be printed by querying them only during tracing + " (${tree})"; + + # Only for attribute set trees + traceTreeAttrs = prevLine: indent: tree: + foldl' (prevLine: name: + let + subtree = tree.${name}; + + # Evaluating this prints the line for this subtree + thisLine = + trace "${indent}- ${name}${treeSuffix subtree}" prevLine; + in + if subtree == null || subtree == "emptyDir" then + # Don't print anything at all if this subtree is empty + prevLine + else if isAttrs subtree then + # A directory with explicit entries + # Do print this node, but also recurse + traceTreeAttrs thisLine "${indent} " subtree + else + # Either a file, or a recursively included directory + # Do print this node but no further recursion needed + thisLine + ) prevLine (attrNames tree); + + # Evaluating this will print the first line + firstLine = + if tree == null || tree == "emptyDir" then + trace "(empty)" null + else + trace "${toString base}${treeSuffix tree}" null; + in + if isAttrs tree then + traceTreeAttrs firstLine "" tree + else + firstLine; + + # Pretty-print a file set when the resulting value is evaluated + # Type: fileset -> Null + _printFileset = fileset: + if fileset._internalIsEmptyWithoutBase then + trace "(empty)" null + else + _printMinimalTree fileset._internalBase + (_normaliseTreeMinimal fileset._internalBase fileset._internalTree); + # Turn a fileset into a source filter function suitable for `builtins.path` # Only directories recursively containing at least one file are recursed into # Type: Path -> fileset -> (String -> String -> Bool) @@ -239,7 +386,7 @@ rec { let # Simplify the tree, necessary to make sure all empty directories are null # which has the effect that they aren't included in the result - tree = _simplifyTree fileset._internalBase fileset._internalTree; + tree = _normaliseTreeFilter fileset._internalBase fileset._internalTree; # The base path as a string with a single trailing slash baseString = @@ -279,7 
+426,7 @@ rec { # Filter suited when there's some files # This can't be used for when there's no files, because the base directory is always included nonEmpty = - path: _: + path: type: let # Add a slash to the path string, turning "/foo" to "/foo/", # making sure to not have any false prefix matches below. @@ -288,40 +435,147 @@ rec { # meaning this function can never receive "/" as an argument pathSlash = path + "/"; in - # Same as `hasPrefix pathSlash baseString`, but more efficient. - # With base /foo/bar we need to include /foo: - # hasPrefix "/foo/" "/foo/bar/" - if substring 0 (stringLength pathSlash) baseString == pathSlash then - true - # Same as `! hasPrefix baseString pathSlash`, but more efficient. - # With base /foo/bar we need to exclude /baz - # ! hasPrefix "/baz/" "/foo/bar/" - else if substring 0 baseLength pathSlash != baseString then - false - else - # Same as `removePrefix baseString path`, but more efficient. - # From the above code we know that hasPrefix baseString pathSlash holds, so this is safe. - # We don't use pathSlash here because we only needed the trailing slash for the prefix matching. - # With base /foo and path /foo/bar/baz this gives - # inTree (split "/" (removePrefix "/foo/" "/foo/bar/baz")) - # == inTree (split "/" "bar/baz") - # == inTree [ "bar" "baz" ] - inTree (split "/" (substring baseLength (-1) path)); + ( + # Same as `hasPrefix pathSlash baseString`, but more efficient. + # With base /foo/bar we need to include /foo: + # hasPrefix "/foo/" "/foo/bar/" + if substring 0 (stringLength pathSlash) baseString == pathSlash then + true + # Same as `! hasPrefix baseString pathSlash`, but more efficient. + # With base /foo/bar we need to exclude /baz + # ! hasPrefix "/baz/" "/foo/bar/" + else if substring 0 baseLength pathSlash != baseString then + false + else + # Same as `removePrefix baseString path`, but more efficient. + # From the above code we know that hasPrefix baseString pathSlash holds, so this is safe. 
+ # We don't use pathSlash here because we only needed the trailing slash for the prefix matching. + # With base /foo and path /foo/bar/baz this gives + # inTree (split "/" (removePrefix "/foo/" "/foo/bar/baz")) + # == inTree (split "/" "bar/baz") + # == inTree [ "bar" "baz" ] + inTree (split "/" (substring baseLength (-1) path)) + ) + # This is a way to have an additional check in case the above is true without any significant performance cost + && ( + # This relies on the fact that Nix only distinguishes path types "directory", "regular", "symlink" and "unknown", + # so everything except "unknown" is allowed, seems reasonable to rely on that + type != "unknown" + || throw '' + lib.fileset.toSource: `fileset` contains a file that cannot be added to the store: ${path} + This file is neither a regular file nor a symlink, the only file types supported by the Nix store. + Therefore the file set cannot be added to the Nix store as is. Make sure to not include that file to avoid this error.'' + ); in # Special case because the code below assumes that the _internalBase is always included in the result # which shouldn't be done when we have no files at all in the base # This also forces the tree before returning the filter, leads to earlier error messages - if tree == null then + if fileset._internalIsEmptyWithoutBase || tree == null then empty else nonEmpty; + # Turn a builtins.filterSource-based source filter on a root path into a file set + # containing only files included by the filter. 
+ # The filter is lazily called as necessary to determine whether paths are included + # Type: Path -> (String -> String -> Bool) -> fileset + _fromSourceFilter = root: sourceFilter: + let + # During the recursion we need to track both: + # - The path value such that we can safely call `readDir` on it + # - The path string value such that we can correctly call the `filter` with it + # + # While we could just recurse with the path value, + # this would then require converting it to a path string for every path, + # which is a fairly expensive operation + + # Create a file set from a directory entry + fromDirEntry = path: pathString: type: + # The filter needs to run on the path as a string + if ! sourceFilter pathString type then + null + else if type == "directory" then + fromDir path pathString + else + type; + + # Create a file set from a directory + fromDir = path: pathString: + mapAttrs + # This looks a bit funny, but we need both the path-based and the path string-based values + (name: fromDirEntry (path + "/${name}") (pathString + "/${name}")) + # We need to readDir on the path value, because reading on a path string + # would be unspecified if there are multiple filesystem roots + (readDir path); + + rootPathType = pathType root; + + # We need to convert the path to a string to imitate what builtins.path calls the filter function with. + # We don't want to rely on `toString` for this though because it's not very well defined, see ../path/README.md + # So instead we use `lib.path.splitRoot` to safely deconstruct the path into its filesystem root and subpath + # We don't need the filesystem root though, builtins.path doesn't expose that in any way to the filter. + # So we only need the components, which we then turn into a string as one would expect. 
+ rootString = "/" + concatStringsSep "/" (components (splitRoot root).subpath); + in + if rootPathType == "directory" then + # We imitate builtins.path not calling the filter on the root path + _create root (fromDir root rootString) + else + # Direct files are always included by builtins.path without calling the filter + # But we need to lift up the base path to its parent to satisfy the base path invariant + _create (dirOf root) + { + ${baseNameOf root} = rootPathType; + }; + + # Transforms the filesetTree of a file set to a shorter base path, e.g. + # _shortenTreeBase [ "foo" ] (_create /foo/bar null) + # => { bar = null; } + _shortenTreeBase = targetBaseComponents: fileset: + let + recurse = index: + # If we haven't reached the required depth yet + if index < length fileset._internalBaseComponents then + # Create an attribute set and recurse as the value, this can be lazily evaluated this way + { ${elemAt fileset._internalBaseComponents index} = recurse (index + 1); } + else + # Otherwise we reached the appropriate depth, here's the original tree + fileset._internalTree; + in + recurse (length targetBaseComponents); + + # Transforms the filesetTree of a file set to a longer base path, e.g. 
+ # _lengthenTreeBase [ "foo" "bar" ] (_create /foo { bar.baz = "regular"; }) + # => { baz = "regular"; } + _lengthenTreeBase = targetBaseComponents: fileset: + let + recurse = index: tree: + # If the filesetTree is an attribute set and we haven't reached the required depth yet + if isAttrs tree && index < length targetBaseComponents then + # Recurse with the tree under the right component (which might not exist) + recurse (index + 1) (tree.${elemAt targetBaseComponents index} or null) + else + # For all values here we can just return the tree itself: + # tree == null -> the result is also null, everything is excluded + # tree == "directory" -> the result is also "directory", + # because the base path is always a directory and everything is included + # isAttrs tree -> the result is `tree` + # because we don't need to recurse any more since `index == length targetBaseComponents` + tree; + in + recurse (length fileset._internalBaseComponents) fileset._internalTree; + # Computes the union of a list of filesets. # The filesets must already be coerced and validated to be in the same filesystem root # Type: [ Fileset ] -> Fileset _unionMany = filesets: let - first = head filesets; + # All filesets that have a base, aka not the ones that are the empty value without a base + filesetsWithBase = filter (fileset: ! fileset._internalIsEmptyWithoutBase) filesets; + + # The first fileset that has a base. + # This value is only accessed if there are any at all. + firstWithBase = head filesetsWithBase; # To be able to union filesetTree's together, they need to have the same base path. # Base paths can be unioned by taking their common prefix, @@ -332,14 +586,14 @@ rec { # so this cannot cause a stack overflow due to a build-up of unevaluated thunks. 
commonBaseComponents = foldl' (components: el: commonPrefix components el._internalBaseComponents) - first._internalBaseComponents + firstWithBase._internalBaseComponents # We could also not do the `tail` here to avoid a list allocation, # but then we'd have to pay for a potentially expensive # but unnecessary `commonPrefix` call - (tail filesets); + (tail filesetsWithBase); # The common base path assembled from a filesystem root and the common components - commonBase = append first._internalBaseRoot (join commonBaseComponents); + commonBase = append firstWithBase._internalBaseRoot (join commonBaseComponents); # A list of filesetTree's that all have the same base path # This is achieved by nesting the trees into the components they have over the common base path @@ -347,18 +601,18 @@ rec { # So the tree under `/foo/bar` gets nested under `{ bar = ...; ... }`, # while the tree under `/foo/baz` gets nested under `{ baz = ...; ... }` # Therefore allowing combined operations over them. - trees = map (fileset: - setAttrByPath - (drop (length commonBaseComponents) fileset._internalBaseComponents) - fileset._internalTree - ) filesets; + trees = map (_shortenTreeBase commonBaseComponents) filesetsWithBase; # Folds all trees together into a single one using _unionTree # We do not use a fold here because it would cause a thunk build-up # which could cause a stack overflow for a large number of trees resultTree = _unionTrees trees; in - _create commonBase resultTree; + # If there are no values with a base, we have no files + if filesetsWithBase == [ ] then + _emptyWithoutBase + else + _create commonBase resultTree; # The union of multiple filesetTree's with the same base path. # Later elements are only evaluated if necessary. @@ -379,4 +633,219 @@ rec { # The non-null elements have to be attribute sets representing partial trees # We need to recurse into those zipAttrsWith (name: _unionTrees) withoutNull; + + # Computes the intersection of two filesets. 
+ # The filesets must already be coerced and validated to be in the same filesystem root + # Type: Fileset -> Fileset -> Fileset + _intersection = fileset1: fileset2: + let + # The common base components prefix, e.g. + # (/foo/bar, /foo/bar/baz) -> /foo/bar + # (/foo/bar, /foo/baz) -> /foo + commonBaseComponentsLength = + # TODO: Have a `lib.lists.commonPrefixLength` function such that we don't need the list allocation from commonPrefix here + length ( + commonPrefix + fileset1._internalBaseComponents + fileset2._internalBaseComponents + ); + + # To be able to intersect filesetTree's together, they need to have the same base path. + # Base paths can be intersected by taking the longest one (if any) + + # The fileset with the longest base, if any, e.g. + # (/foo/bar, /foo/bar/baz) -> /foo/bar/baz + # (/foo/bar, /foo/baz) -> null + longestBaseFileset = + if commonBaseComponentsLength == length fileset1._internalBaseComponents then + # The common prefix is the same as the first path, so the second path is equal or longer + fileset2 + else if commonBaseComponentsLength == length fileset2._internalBaseComponents then + # The common prefix is the same as the second path, so the first path is longer + fileset1 + else + # The common prefix is neither the first nor the second path + # This means there's no overlap between the two sets + null; + + # Whether the result should be the empty value without a base + resultIsEmptyWithoutBase = + # If either fileset is the empty fileset without a base, the intersection is too + fileset1._internalIsEmptyWithoutBase + || fileset2._internalIsEmptyWithoutBase + # If there is no overlap between the base paths + || longestBaseFileset == null; + + # Lengthen each fileset's tree to the longest base prefix + tree1 = _lengthenTreeBase longestBaseFileset._internalBaseComponents fileset1; + tree2 = _lengthenTreeBase longestBaseFileset._internalBaseComponents fileset2; + + # With two filesetTree's with the same base, we can compute their 
intersection + resultTree = _intersectTree tree1 tree2; + in + if resultIsEmptyWithoutBase then + _emptyWithoutBase + else + _create longestBaseFileset._internalBase resultTree; + + # The intersection of two filesetTree's with the same base path + # The second element is only evaluated as much as necessary. + # Type: filesetTree -> filesetTree -> filesetTree + _intersectTree = lhs: rhs: + if isAttrs lhs && isAttrs rhs then + # Both sides are attribute sets, we can recurse for the attributes existing on both sides + mapAttrs + (name: _intersectTree lhs.${name}) + (builtins.intersectAttrs lhs rhs) + else if lhs == null || isString rhs then + # If the lhs is null, the result should also be null + # And if the rhs is the identity element + # (a string, aka it includes everything), then it's also the lhs + lhs + else + # In all other cases it's the rhs + rhs; + + # Compute the set difference between two file sets. + # The filesets must already be coerced and validated to be in the same filesystem root. + # Type: Fileset -> Fileset -> Fileset + _difference = positive: negative: + let + # The common base components prefix, e.g. + # (/foo/bar, /foo/bar/baz) -> /foo/bar + # (/foo/bar, /foo/baz) -> /foo + commonBaseComponentsLength = + # TODO: Have a `lib.lists.commonPrefixLength` function such that we don't need the list allocation from commonPrefix here + length ( + commonPrefix + positive._internalBaseComponents + negative._internalBaseComponents + ); + + # We need filesetTree's with the same base to be able to compute the difference between them + # This here is the filesetTree from the negative file set, but for a base path that matches the positive file set. 
+ # Examples: + # For `difference /foo /foo/bar`, `negativeTreeWithPositiveBase = { bar = "directory"; }` + # because under the base path of `/foo`, only `bar` from the negative file set is included + # For `difference /foo/bar /foo`, `negativeTreeWithPositiveBase = "directory"` + # because under the base path of `/foo/bar`, everything from the negative file set is included + # For `difference /foo /bar`, `negativeTreeWithPositiveBase = null` + # because under the base path of `/foo`, nothing from the negative file set is included + negativeTreeWithPositiveBase = + if commonBaseComponentsLength == length positive._internalBaseComponents then + # The common prefix is the same as the positive base path, so the second path is equal or longer. + # We need to _shorten_ the negative filesetTree to the same base path as the positive one + # E.g. for `difference /foo /foo/bar` the common prefix is /foo, equal to the positive file set's base + # So we need to shorten the base of the tree for the negative argument from /foo/bar to just /foo + _shortenTreeBase positive._internalBaseComponents negative + else if commonBaseComponentsLength == length negative._internalBaseComponents then + # The common prefix is the same as the negative base path, so the first path is longer. + # We need to lengthen the negative filesetTree to the same base path as the positive one. + # E.g. for `difference /foo/bar /foo` the common prefix is /foo, equal to the negative file set's base + # So we need to lengthen the base of the tree for the negative argument from /foo to /foo/bar + _lengthenTreeBase positive._internalBaseComponents negative + else + # The common prefix is neither the first nor the second path. 
+ # This means there's no overlap between the two file sets, + # and nothing from the negative argument should get removed from the positive one + # E.g. for `difference /foo /bar`, we remove nothing to get the same as `/foo` + null; + + resultingTree = + _differenceTree + positive._internalBase + positive._internalTree + negativeTreeWithPositiveBase; + in + # If the first file set is empty, we can never have any files in the result + if positive._internalIsEmptyWithoutBase then + _emptyWithoutBase + # If the second file set is empty, nothing gets removed, so the result is just the first file set + else if negative._internalIsEmptyWithoutBase then + positive + else + # We use the positive file set base for the result, + # because only files from the positive side may be included, + # which is what the base path is for + _create positive._internalBase resultingTree; + + # Computes the set difference of two filesetTree's + # Type: Path -> filesetTree -> filesetTree -> filesetTree + _differenceTree = path: lhs: rhs: + # If the lhs doesn't have any files, or the right hand side includes all files + if lhs == null || isString rhs then + # The result will always be empty + null + # If the right hand side has no files + else if rhs == null then + # The result is always the left hand side, because nothing gets removed + lhs + else + # Otherwise we always have two attribute sets to recurse into + mapAttrs (name: lhsValue: + _differenceTree (path + "/${name}") lhsValue (rhs.${name} or null) + ) (_directoryEntries path lhs); + + # Filters all files in a path based on a predicate + # Type: ({ name, type, ... 
} -> Bool) -> Path -> FileSet + _fileFilter = predicate: root: + let + # Check the predicate for a single file + # Type: String -> String -> filesetTree + fromFile = name: type: + if + predicate { + inherit name type; + # To ensure forwards compatibility with more arguments being added in the future, + # adding an attribute which can't be deconstructed :) + "lib.fileset.fileFilter: The predicate function passed as the first argument must be able to handle extra attributes for future compatibility. If you're using `{ name, file }:`, use `{ name, file, ... }:` instead." = null; + } + then + type + else + null; + + # Check the predicate for all files in a directory + # Type: Path -> filesetTree + fromDir = path: + mapAttrs (name: type: + if type == "directory" then + fromDir (path + "/${name}") + else + fromFile name type + ) (readDir path); + + rootType = pathType root; + in + if rootType == "directory" then + _create root (fromDir root) + else + # Single files are turned into a directory containing that file or nothing. + _create (dirOf root) { + ${baseNameOf root} = + fromFile (baseNameOf root) rootType; + }; + + # Support for `builtins.fetchGit` with `submodules = true` was introduced in 2.4 + # https://github.com/NixOS/nix/commit/55cefd41d63368d4286568e2956afd535cb44018 + _fetchGitSubmodulesMinver = "2.4"; + + # Mirrors the contents of a Nix store path relative to a local path as a file set. + # Some notes: + # - The store path is read at evaluation time. + # - The store path must not include files that don't exist in the respective local path. 
+ # + # Type: Path -> String -> FileSet + _mirrorStorePath = localPath: storePath: + let + recurse = focusedStorePath: + mapAttrs (name: type: + if type == "directory" then + recurse (focusedStorePath + "/${name}") + else + type + ) (builtins.readDir focusedStorePath); + in + _create localPath + (recurse storePath); } diff --git a/lib/fileset/tests.sh b/lib/fileset/tests.sh index 0ea96859e7a..3c88ebdd055 100755 --- a/lib/fileset/tests.sh +++ b/lib/fileset/tests.sh @@ -1,5 +1,7 @@ #!/usr/bin/env bash # shellcheck disable=SC2016 +# shellcheck disable=SC2317 +# shellcheck disable=SC2192 # Tests lib.fileset # Run: @@ -41,15 +43,29 @@ crudeUnquoteJSON() { cut -d \" -f2 } -prefixExpression='let - lib = import <nixpkgs/lib>; - internal = import <nixpkgs/lib/fileset/internal.nix> { - inherit lib; - }; -in -with lib; -with internal; -with lib.fileset;' +prefixExpression() { + echo 'let + lib = + (import <nixpkgs/lib>) + ' + if [[ "${1:-}" == "--simulate-pure-eval" ]]; then + echo ' + .extend (final: prev: { + trivial = prev.trivial // { + inPureEvalMode = true; + }; + })' + fi + echo ' + ; + internal = import <nixpkgs/lib/fileset/internal.nix> { + inherit lib; + }; + in + with lib; + with internal; + with lib.fileset;' +} # Check that two nix expression successfully evaluate to the same value. # The expressions have `lib.fileset` in scope. @@ -57,18 +73,35 @@ with lib.fileset;' expectEqual() { local actualExpr=$1 local expectedExpr=$2 - if ! actualResult=$(nix-instantiate --eval --strict --show-trace \ - --expr "$prefixExpression ($actualExpr)"); then - die "$actualExpr failed to evaluate, but it was expected to succeed" + if actualResult=$(nix-instantiate --eval --strict --show-trace 2>"$tmp"/actualStderr \ + --expr "$(prefixExpression) ($actualExpr)"); then + actualExitCode=$? + else + actualExitCode=$? fi - if ! 
expectedResult=$(nix-instantiate --eval --strict --show-trace \ - --expr "$prefixExpression ($expectedExpr)"); then - die "$expectedExpr failed to evaluate, but it was expected to succeed" + actualStderr=$(< "$tmp"/actualStderr) + + if expectedResult=$(nix-instantiate --eval --strict --show-trace 2>"$tmp"/expectedStderr \ + --expr "$(prefixExpression) ($expectedExpr)"); then + expectedExitCode=$? + else + expectedExitCode=$? + fi + expectedStderr=$(< "$tmp"/expectedStderr) + + if [[ "$actualExitCode" != "$expectedExitCode" ]]; then + echo "$actualStderr" >&2 + echo "$actualResult" >&2 + die "$actualExpr should have exited with $expectedExitCode, but it exited with $actualExitCode" fi if [[ "$actualResult" != "$expectedResult" ]]; then die "$actualExpr should have evaluated to $expectedExpr:\n$expectedResult\n\nbut it evaluated to\n$actualResult" fi + + if [[ "$actualStderr" != "$expectedStderr" ]]; then + die "$actualExpr should have had this on stderr:\n$expectedStderr\n\nbut it was\n$actualStderr" + fi } # Check that a nix expression evaluates successfully to a store path and returns it (without quotes). @@ -76,23 +109,30 @@ expectEqual() { # Usage: expectStorePath NIX expectStorePath() { local expr=$1 - if ! result=$(nix-instantiate --eval --strict --json --read-write-mode --show-trace \ - --expr "$prefixExpression ($expr)"); then + if ! result=$(nix-instantiate --eval --strict --json --read-write-mode --show-trace 2>"$tmp"/stderr \ + --expr "$(prefixExpression) ($expr)"); then + cat "$tmp/stderr" >&2 die "$expr failed to evaluate, but it was expected to succeed" fi # This is safe because we assume to get back a store path in a string crudeUnquoteJSON <<< "$result" } -# Check that a nix expression fails to evaluate (strictly, coercing to json, read-write-mode). +# Check that a nix expression fails to evaluate (strictly, read-write-mode). # And check the received stderr against a regex # The expression has `lib.fileset` in scope. 
# Usage: expectFailure NIX REGEX expectFailure() { + if [[ "$1" == "--simulate-pure-eval" ]]; then + maybePure="--simulate-pure-eval" + shift + else + maybePure="" + fi local expr=$1 local expectedErrorRegex=$2 - if result=$(nix-instantiate --eval --strict --json --read-write-mode --show-trace 2>"$tmp/stderr" \ - --expr "$prefixExpression $expr"); then + if result=$(nix-instantiate --eval --strict --read-write-mode --show-trace 2>"$tmp/stderr" \ + --expr "$(prefixExpression $maybePure) $expr"); then die "$expr evaluated successfully to $result, but it was expected to fail" fi stderr=$(<"$tmp/stderr") @@ -101,33 +141,123 @@ expectFailure() { fi } -# We conditionally use inotifywait in checkFileset. +# Check that the traces of a Nix expression are as expected when evaluated. +# The expression has `lib.fileset` in scope. +# Usage: expectTrace NIX STR +expectTrace() { + local expr=$1 + local expectedTrace=$2 + + nix-instantiate --eval --show-trace >/dev/null 2>"$tmp"/stderrTrace \ + --expr "$(prefixExpression) trace ($expr)" || true + + actualTrace=$(sed -n 's/^trace: //p' "$tmp/stderrTrace") + + nix-instantiate --eval --show-trace >/dev/null 2>"$tmp"/stderrTraceVal \ + --expr "$(prefixExpression) traceVal ($expr)" || true + + actualTraceVal=$(sed -n 's/^trace: //p' "$tmp/stderrTraceVal") + + # Test that traceVal returns the same trace as trace + if [[ "$actualTrace" != "$actualTraceVal" ]]; then + cat "$tmp"/stderrTrace >&2 + die "$expr traced this for lib.fileset.trace:\n\n$actualTrace\n\nand something different for lib.fileset.traceVal:\n\n$actualTraceVal" + fi + + if [[ "$actualTrace" != "$expectedTrace" ]]; then + cat "$tmp"/stderrTrace >&2 + die "$expr should have traced this:\n\n$expectedTrace\n\nbut this was actually traced:\n\n$actualTrace" + fi +} + +# We conditionally use inotifywait in withFileMonitor. 
# Check early whether it's available # TODO: Darwin support, though not crucial since we have Linux CI if type inotifywait 2>/dev/null >/dev/null; then - canMonitorFiles=1 + canMonitor=1 else - echo "Warning: Not checking that excluded files don't get accessed since inotifywait is not available" >&2 - canMonitorFiles= + echo "Warning: Cannot check for paths not getting read since the inotifywait command (from the inotify-tools package) is not available" >&2 + canMonitor= fi -# Check whether a file set includes/excludes declared paths as expected, usage: +# Run a function while monitoring that it doesn't read certain paths +# Usage: withFileMonitor FUNNAME PATH... +# - FUNNAME should be a bash function that: +# - Performs some operation that should not read some paths +# - Delete the paths it shouldn't read without triggering any open events +# - PATH... are the paths that should not get read +# +# This function outputs the same as FUNNAME +withFileMonitor() { + local funName=$1 + shift + + # If we can't monitor files or have none to monitor, just run the function directly + if [[ -z "$canMonitor" ]] || (( "$#" == 0 )); then + "$funName" + else + + # Use a subshell to start the coprocess in and use a trap to kill it when exiting the subshell + ( + # Assigned by coproc, makes shellcheck happy + local watcher watcher_PID + + # Start inotifywait in the background to monitor all excluded paths + coproc watcher { + # inotifywait outputs a string on stderr when ready + # Redirect it to stdout so we can access it from the coproc's stdout fd + # exec so that the coprocess is inotify itself, making the kill below work correctly + # See below why we listen to both open and delete_self events + exec inotifywait --format='%e %w' --event open,delete_self --monitor "$@" 2>&1 + } + + # This will trigger when this subshell exits, no matter if successful or not + # After exiting the subshell, the parent shell will continue executing + trap 'kill "${watcher_PID}"' exit + + # 
Synchronously wait until inotifywait is ready + while read -r -u "${watcher[0]}" line && [[ "$line" != "Watches established." ]]; do + : + done + + # Call the function that should not read the given paths and delete them afterwards + "$funName" + + # Get the first event + read -r -u "${watcher[0]}" event file + + # With funName potentially reading files first before deleting them, + # there's only these two possible event timelines: + # - open*, ..., open*, delete_self, ..., delete_self: If some excluded paths were read + # - delete_self, ..., delete_self: If no excluded paths were read + # So by looking at the first event we can figure out which one it is! + # This also means we don't have to wait to collect all events. + case "$event" in + OPEN*) + die "$funName opened excluded file $file when it shouldn't have" + ;; + DELETE_SELF) + # Expected events + ;; + *) + die "During $funName, Unexpected event type '$event' on file $file that should be excluded" + ;; + esac + ) + fi +} + + +# Create the tree structure declared in the tree variable, usage: # # tree=( -# [a/b] =1 # Declare that file a/b should exist and expect it to be included in the store path -# [c/a] = # Declare that file c/a should exist and expect it to be excluded in the store path -# [c/d/]= # Declare that directory c/d/ should exist and expect it to be excluded in the store path +# [a/b] = # Declare that file a/b should exist +# [c/a] = # Declare that file c/a should exist +# [c/d/]= # Declare that directory c/d/ should exist # ) -# checkFileset './a' # Pass the fileset as the argument +# createTree declare -A tree -checkFileset() ( - # New subshell so that we can have a separate trap handler, see `trap` below - local fileset=$1 - - # Process the tree into separate arrays for included paths, excluded paths and excluded files. 
- local -a included=() - local -a excluded=() - local -a excludedFiles=() +createTree() { # Track which paths need to be created local -a dirsToCreate=() local -a filesToCreate=() @@ -135,24 +265,9 @@ checkFileset() ( # If keys end with a `/` we treat them as directories, otherwise files if [[ "$p" =~ /$ ]]; then dirsToCreate+=("$p") - isFile= else filesToCreate+=("$p") - isFile=1 fi - case "${tree[$p]}" in - 1) - included+=("$p") - ;; - 0) - excluded+=("$p") - if [[ -n "$isFile" ]]; then - excludedFiles+=("$p") - fi - ;; - *) - die "Unsupported tree value: ${tree[$p]}" - esac done # Create all the necessary paths. @@ -167,55 +282,59 @@ checkFileset() ( mkdir -p "${parentsToCreate[@]}" touch "${filesToCreate[@]}" fi +} - # Start inotifywait in the background to monitor all excluded files (if any) - if [[ -n "$canMonitorFiles" ]] && (( "${#excludedFiles[@]}" != 0 )); then - coproc watcher { - # inotifywait outputs a string on stderr when ready - # Redirect it to stdout so we can access it from the coproc's stdout fd - # exec so that the coprocess is inotify itself, making the kill below work correctly - # See below why we listen to both open and delete_self events - exec inotifywait --format='%e %w' --event open,delete_self --monitor "${excludedFiles[@]}" 2>&1 - } - # This will trigger when this subshell exits, no matter if successful or not - # After exiting the subshell, the parent shell will continue executing - # shellcheck disable=SC2154 - trap 'kill "${watcher_PID}"' exit - - # Synchronously wait until inotifywait is ready - while read -r -u "${watcher[0]}" line && [[ "$line" != "Watches established." 
]]; do - : - done - fi - - # Call toSource with the fileset, triggering open events for all files that are added to the store - expression="toSource { root = ./.; fileset = $fileset; }" - storePath=$(expectStorePath "$expression") +# Check whether a file set includes/excludes declared paths as expected, usage: +# +# tree=( +# [a/b] =1 # Declare that file a/b should exist and expect it to be included in the store path +# [c/a] = # Declare that file c/a should exist and expect it to be excluded in the store path +# [c/d/]= # Declare that directory c/d/ should exist and expect it to be excluded in the store path +# ) +# checkFileset './a' # Pass the fileset as the argument +checkFileset() { + # New subshell so that we can have a separate trap handler, see `trap` below + local fileset=$1 - # Remove all files immediately after, triggering delete_self events for all of them - rm -rf -- * + # Create the tree + createTree - # Only check for the inotify events if we actually started inotify earlier - if [[ -v watcher ]]; then - # Get the first event - read -r -u "${watcher[0]}" event file - - # There's only these two possible event timelines: - # - open, ..., open, delete_self, ..., delete_self: If some excluded files were read - # - delete_self, ..., delete_self: If no excluded files were read - # So by looking at the first event we can figure out which one it is! - case "$event" in - OPEN) - die "$expression opened excluded file $file when it shouldn't have" + # Process the tree into separate arrays for included paths, excluded paths and excluded files. + local -a included=() + local -a excluded=() + local -a excludedFiles=() + for p in "${!tree[@]}"; do + case "${tree[$p]}" in + 1) + included+=("$p") ;; - DELETE_SELF) - # Expected events + 0) + excluded+=("$p") + # If keys end with a `/` we treat them as directories, otherwise files + if [[ ! 
"$p" =~ /$ ]]; then + excludedFiles+=("$p") + fi ;; *) - die "Unexpected event type '$event' on file $file that should be excluded" - ;; + die "Unsupported tree value: ${tree[$p]}" esac - fi + done + + expression="toSource { root = ./.; fileset = $fileset; }" + + # We don't have lambda's in bash unfortunately, + # so we just define a function instead and then pass its name + # shellcheck disable=SC2317 + run() { + # Call toSource with the fileset, triggering open events for all files that are added to the store + expectStorePath "$expression" + if (( ${#excludedFiles[@]} != 0 )); then + rm "${excludedFiles[@]}" + fi + } + + # Runs the function while checking that the given excluded files aren't read + storePath=$(withFileMonitor run "${excludedFiles[@]}") # For each path that should be included, make sure it does occur in the resulting store path for p in "${included[@]}"; do @@ -230,15 +349,21 @@ checkFileset() ( die "$expression included path $p when it shouldn't have" fi done -) + + rm -rf -- * +} #### Error messages ##### # Absolute paths in strings cannot be passed as `root` -expectFailure 'toSource { root = "/nix/store/foobar"; fileset = ./.; }' 'lib.fileset.toSource: `root` \("/nix/store/foobar"\) is a string-like value, but it should be a path instead. +expectFailure 'toSource { root = "/nix/store/foobar"; fileset = ./.; }' 'lib.fileset.toSource: `root` \(/nix/store/foobar\) is a string-like value, but it should be a path instead. \s*Paths in strings are not supported by `lib.fileset`, use `lib.sources` or derivations instead.' +expectFailure 'toSource { root = cleanSourceWith { src = ./.; }; fileset = ./.; }' 'lib.fileset.toSource: `root` is a `lib.sources`-based value, but it should be a path instead. +\s*To use a `lib.sources`-based value, convert it to a file set using `lib.fileset.fromSource` and pass it as `fileset`. +\s*Note that this only works for sources created from paths.' 
+ # Only paths are accepted as `root` expectFailure 'toSource { root = 10; fileset = ./.; }' 'lib.fileset.toSource: `root` is of type int, but it should be a path instead.' @@ -246,21 +371,21 @@ expectFailure 'toSource { root = 10; fileset = ./.; }' 'lib.fileset.toSource: `r mkdir -p {foo,bar}/mock-root expectFailure 'with ((import <nixpkgs/lib>).extend (import <nixpkgs/lib/fileset/mock-splitRoot.nix>)).fileset; toSource { root = ./foo/mock-root; fileset = ./bar/mock-root; } -' 'lib.fileset.toSource: Filesystem roots are not the same for `fileset` and `root` \("'"$work"'/foo/mock-root"\): -\s*`root`: root "'"$work"'/foo/mock-root" -\s*`fileset`: root "'"$work"'/bar/mock-root" -\s*Different roots are not supported.' -rm -rf * +' 'lib.fileset.toSource: Filesystem roots are not the same for `fileset` and `root` \('"$work"'/foo/mock-root\): +\s*`root`: Filesystem root is "'"$work"'/foo/mock-root" +\s*`fileset`: Filesystem root is "'"$work"'/bar/mock-root" +\s*Different filesystem roots are not supported.' +rm -rf -- * # `root` needs to exist -expectFailure 'toSource { root = ./a; fileset = ./.; }' 'lib.fileset.toSource: `root` \('"$work"'/a\) does not exist.' +expectFailure 'toSource { root = ./a; fileset = ./.; }' 'lib.fileset.toSource: `root` \('"$work"'/a\) is a path that does not exist.' # `root` needs to be a file touch a expectFailure 'toSource { root = ./a; fileset = ./a; }' 'lib.fileset.toSource: `root` \('"$work"'/a\) is a file, but it should be a directory instead. Potential solutions: \s*- If you want to import the file into the store _without_ a containing directory, use string interpolation or `builtins.path` instead of this function. \s*- If you want to import the file into the store _with_ a containing directory, set `root` to the containing directory, such as '"$work"', and set `fileset` to the file path.' 
-rm -rf * +rm -rf -- * # The fileset argument should be evaluated, even if the directory is empty expectFailure 'toSource { root = ./.; fileset = abort "This should be evaluated"; }' 'evaluation aborted with the following error message: '\''This should be evaluated'\' @@ -270,36 +395,53 @@ mkdir a expectFailure 'toSource { root = ./a; fileset = ./.; }' 'lib.fileset.toSource: `fileset` could contain files in '"$work"', which is not under the `root` \('"$work"'/a\). Potential solutions: \s*- Set `root` to '"$work"' or any directory higher up. This changes the layout of the resulting store path. \s*- Set `fileset` to a file set that cannot contain files outside the `root` \('"$work"'/a\). This could change the files included in the result.' -rm -rf * +rm -rf -- * + +# non-regular and non-symlink files cannot be added to the Nix store +mkfifo a +expectFailure 'toSource { root = ./.; fileset = ./a; }' 'lib.fileset.toSource: `fileset` contains a file that cannot be added to the store: '"$work"'/a +\s*This file is neither a regular file nor a symlink, the only file types supported by the Nix store. +\s*Therefore the file set cannot be added to the Nix store as is. Make sure to not include that file to avoid this error.' +rm -rf -- * # Path coercion only works for paths -expectFailure 'toSource { root = ./.; fileset = 10; }' 'lib.fileset.toSource: `fileset` is of type int, but it should be a path instead.' -expectFailure 'toSource { root = ./.; fileset = "/some/path"; }' 'lib.fileset.toSource: `fileset` \("/some/path"\) is a string-like value, but it should be a path instead. +expectFailure 'toSource { root = ./.; fileset = 10; }' 'lib.fileset.toSource: `fileset` is of type int, but it should be a file set or a path instead.' +expectFailure 'toSource { root = ./.; fileset = "/some/path"; }' 'lib.fileset.toSource: `fileset` \("/some/path"\) is a string-like value, but it should be a file set or a path instead. 
\s*Paths represented as strings are not supported by `lib.fileset`, use `lib.sources` or derivations instead.' +expectFailure 'toSource { root = ./.; fileset = cleanSourceWith { src = ./.; }; }' 'lib.fileset.toSource: `fileset` is a `lib.sources`-based value, but it should be a file set or a path instead. +\s*To convert a `lib.sources`-based value to a file set you can use `lib.fileset.fromSource`. +\s*Note that this only works for sources created from paths.' # Path coercion errors for non-existent paths -expectFailure 'toSource { root = ./.; fileset = ./a; }' 'lib.fileset.toSource: `fileset` \('"$work"'/a\) does not exist.' +expectFailure 'toSource { root = ./.; fileset = ./a; }' 'lib.fileset.toSource: `fileset` \('"$work"'/a\) is a path that does not exist.' # File sets cannot be evaluated directly -expectFailure 'union ./. ./.' 'lib.fileset: Directly evaluating a file set is not supported. Use `lib.fileset.toSource` to turn it into a usable source instead.' +expectFailure 'union ./. ./.' 'lib.fileset: Directly evaluating a file set is not supported. +\s*To turn it into a usable source, use `lib.fileset.toSource`. +\s*To pretty-print the contents, use `lib.fileset.trace` or `lib.fileset.traceVal`.' +expectFailure '_emptyWithoutBase' 'lib.fileset: Directly evaluating a file set is not supported. +\s*To turn it into a usable source, use `lib.fileset.toSource`. +\s*To pretty-print the contents, use `lib.fileset.trace` or `lib.fileset.traceVal`.' 
# Past versions of the internal representation are supported expectEqual '_coerce "<tests>: value" { _type = "fileset"; _internalVersion = 0; _internalBase = ./.; }' \ - '{ _internalBase = ./.; _internalBaseComponents = path.subpath.components (path.splitRoot ./.).subpath; _internalBaseRoot = /.; _internalVersion = 2; _type = "fileset"; }' + '{ _internalBase = ./.; _internalBaseComponents = path.subpath.components (path.splitRoot ./.).subpath; _internalBaseRoot = /.; _internalIsEmptyWithoutBase = false; _internalVersion = 3; _type = "fileset"; }' expectEqual '_coerce "<tests>: value" { _type = "fileset"; _internalVersion = 1; }' \ - '{ _type = "fileset"; _internalVersion = 2; }' + '{ _type = "fileset"; _internalIsEmptyWithoutBase = false; _internalVersion = 3; }' +expectEqual '_coerce "<tests>: value" { _type = "fileset"; _internalVersion = 2; }' \ + '{ _type = "fileset"; _internalIsEmptyWithoutBase = false; _internalVersion = 3; }' # Future versions of the internal representation are unsupported -expectFailure '_coerce "<tests>: value" { _type = "fileset"; _internalVersion = 3; }' '<tests>: value is a file set created from a future version of the file set library with a different internal representation: -\s*- Internal version of the file set: 3 -\s*- Internal version of the library: 2 +expectFailure '_coerce "<tests>: value" { _type = "fileset"; _internalVersion = 4; }' '<tests>: value is a file set created from a future version of the file set library with a different internal representation: +\s*- Internal version of the file set: 4 +\s*- Internal version of the library: 3 \s*Make sure to update your Nixpkgs to have a newer version of `lib.fileset`.' # _create followed by _coerce should give the inputs back without any validation expectEqual '{ inherit (_coerce "<test>" (_create ./. 
"directory")) _internalVersion _internalBase _internalTree; -}' '{ _internalBase = ./.; _internalTree = "directory"; _internalVersion = 2; }' +}' '{ _internalBase = ./.; _internalTree = "directory"; _internalVersion = 3; }' #### Resulting store path #### @@ -311,6 +453,12 @@ tree=( ) checkFileset './.' +# The empty value without a base should also result in an empty result +tree=( + [a]=0 +) +checkFileset '_emptyWithoutBase' + # Directories recursively containing no files are not included tree=( [e/]=0 @@ -388,33 +536,50 @@ mkdir -p {foo,bar}/mock-root expectFailure 'with ((import <nixpkgs/lib>).extend (import <nixpkgs/lib/fileset/mock-splitRoot.nix>)).fileset; toSource { root = ./.; fileset = union ./foo/mock-root ./bar/mock-root; } ' 'lib.fileset.union: Filesystem roots are not the same: -\s*first argument: root "'"$work"'/foo/mock-root" -\s*second argument: root "'"$work"'/bar/mock-root" -\s*Different roots are not supported.' +\s*First argument: Filesystem root is "'"$work"'/foo/mock-root" +\s*Second argument: Filesystem root is "'"$work"'/bar/mock-root" +\s*Different filesystem roots are not supported.' expectFailure 'with ((import <nixpkgs/lib>).extend (import <nixpkgs/lib/fileset/mock-splitRoot.nix>)).fileset; toSource { root = ./.; fileset = unions [ ./foo/mock-root ./bar/mock-root ]; } ' 'lib.fileset.unions: Filesystem roots are not the same: -\s*element 0: root "'"$work"'/foo/mock-root" -\s*element 1: root "'"$work"'/bar/mock-root" -\s*Different roots are not supported.' -rm -rf * +\s*Element 0: Filesystem root is "'"$work"'/foo/mock-root" +\s*Element 1: Filesystem root is "'"$work"'/bar/mock-root" +\s*Different filesystem roots are not supported.' +rm -rf -- * # Coercion errors show the correct context -expectFailure 'toSource { root = ./.; fileset = union ./a ./.; }' 'lib.fileset.union: first argument \('"$work"'/a\) does not exist.' -expectFailure 'toSource { root = ./.; fileset = union ./. 
./b; }' 'lib.fileset.union: second argument \('"$work"'/b\) does not exist.' -expectFailure 'toSource { root = ./.; fileset = unions [ ./a ./. ]; }' 'lib.fileset.unions: element 0 \('"$work"'/a\) does not exist.' -expectFailure 'toSource { root = ./.; fileset = unions [ ./. ./b ]; }' 'lib.fileset.unions: element 1 \('"$work"'/b\) does not exist.' +expectFailure 'toSource { root = ./.; fileset = union ./a ./.; }' 'lib.fileset.union: First argument \('"$work"'/a\) is a path that does not exist.' +expectFailure 'toSource { root = ./.; fileset = union ./. ./b; }' 'lib.fileset.union: Second argument \('"$work"'/b\) is a path that does not exist.' +expectFailure 'toSource { root = ./.; fileset = unions [ ./a ./. ]; }' 'lib.fileset.unions: Element 0 \('"$work"'/a\) is a path that does not exist.' +expectFailure 'toSource { root = ./.; fileset = unions [ ./. ./b ]; }' 'lib.fileset.unions: Element 1 \('"$work"'/b\) is a path that does not exist.' -# unions needs a list with at least 1 element -expectFailure 'toSource { root = ./.; fileset = unions null; }' 'lib.fileset.unions: Expected argument to be a list, but got a null.' -expectFailure 'toSource { root = ./.; fileset = unions [ ]; }' 'lib.fileset.unions: Expected argument to be a list with at least one element, but it contains no elements.' +# unions needs a list +expectFailure 'toSource { root = ./.; fileset = unions null; }' 'lib.fileset.unions: Argument is of type null, but it should be a list instead.' # The tree of later arguments should not be evaluated if a former argument already includes all files tree=() checkFileset 'union ./. (_create ./. (abort "This should not be used!"))' checkFileset 'unions [ ./. (_create ./. 
(abort "This should not be used!")) ]' +# unions doesn't include any files for an empty list or only empty values without a base +tree=( + [x]=0 + [y/z]=0 +) +checkFileset 'unions [ ]' +checkFileset 'unions [ _emptyWithoutBase ]' +checkFileset 'unions [ _emptyWithoutBase _emptyWithoutBase ]' +checkFileset 'union _emptyWithoutBase _emptyWithoutBase' + +# The empty value without a base is the left and right identity of union +tree=( + [x]=1 + [y/z]=0 +) +checkFileset 'union ./x _emptyWithoutBase' +checkFileset 'union _emptyWithoutBase ./x' + # union doesn't include files that weren't specified tree=( [x]=1 @@ -467,12 +632,818 @@ for i in $(seq 1000); do tree[$i/a]=1 tree[$i/b]=0 done -( - # Locally limit the maximum stack size to 100 * 1024 bytes - # If unions was implemented recursively, this would stack overflow - ulimit -s 100 - checkFileset 'unions (mapAttrsToList (name: _: ./. + "/${name}/a") (builtins.readDir ./.))' +# This is actually really hard to test: +# A lot of files would be needed to cause a stack overflow. +# And while we could limit the maximum stack size using `ulimit -s`, +# that turns out to not be very deterministic: https://github.com/NixOS/nixpkgs/pull/256417#discussion_r1339396686. +# Meanwhile, the test infra here is not the fastest, creating 10000 would be too slow. +# So, just using 1000 files for now. +checkFileset 'unions (mapAttrsToList (name: _: ./. 
+ "/${name}/a") (builtins.readDir ./.))' + + +## lib.fileset.intersection + + +# Different filesystem roots in root and fileset are not supported +mkdir -p {foo,bar}/mock-root +expectFailure 'with ((import <nixpkgs/lib>).extend (import <nixpkgs/lib/fileset/mock-splitRoot.nix>)).fileset; + toSource { root = ./.; fileset = intersection ./foo/mock-root ./bar/mock-root; } +' 'lib.fileset.intersection: Filesystem roots are not the same: +\s*First argument: Filesystem root is "'"$work"'/foo/mock-root" +\s*Second argument: Filesystem root is "'"$work"'/bar/mock-root" +\s*Different filesystem roots are not supported.' +rm -rf -- * + +# Coercion errors show the correct context +expectFailure 'toSource { root = ./.; fileset = intersection ./a ./.; }' 'lib.fileset.intersection: First argument \('"$work"'/a\) is a path that does not exist.' +expectFailure 'toSource { root = ./.; fileset = intersection ./. ./b; }' 'lib.fileset.intersection: Second argument \('"$work"'/b\) is a path that does not exist.' + +# The tree of later arguments should not be evaluated if a former argument already excludes all files +tree=( + [a]=0 +) +checkFileset 'intersection _emptyWithoutBase (_create ./. (abort "This should not be used!"))' +# We don't have any combinators that can explicitly remove files yet, so we need to rely on internal functions to test this for now +checkFileset 'intersection (_create ./. { a = null; }) (_create ./. { a = abort "This should not be used!"; })' + +# If either side is empty, the result is empty +tree=( + [a]=0 +) +checkFileset 'intersection _emptyWithoutBase _emptyWithoutBase' +checkFileset 'intersection _emptyWithoutBase (_create ./. null)' +checkFileset 'intersection (_create ./. null) _emptyWithoutBase' +checkFileset 'intersection (_create ./. null) (_create ./. 
null)' + +# If the intersection base paths are not overlapping, the result is empty and has no base path +mkdir a b c +touch {a,b,c}/x +expectEqual 'toSource { root = ./c; fileset = intersection ./a ./b; }' 'toSource { root = ./c; fileset = _emptyWithoutBase; }' +rm -rf -- * + +# If the intersection exists, the resulting base path is the longest of them +mkdir a +touch x a/b +expectEqual 'toSource { root = ./a; fileset = intersection ./a ./.; }' 'toSource { root = ./a; fileset = ./a; }' +expectEqual 'toSource { root = ./a; fileset = intersection ./. ./a; }' 'toSource { root = ./a; fileset = ./a; }' +rm -rf -- * + +# Also finds the intersection with null'd filesetTree's +tree=( + [a]=0 + [b]=1 + [c]=0 ) +checkFileset 'intersection (_create ./. { a = "regular"; b = "regular"; c = null; }) (_create ./. { a = null; b = "regular"; c = "regular"; })' + +# Actually computes the intersection between files +tree=( + [a]=0 + [b]=0 + [c]=1 + [d]=1 + [e]=0 + [f]=0 +) +checkFileset 'intersection (unions [ ./a ./b ./c ./d ]) (unions [ ./c ./d ./e ./f ])' + +tree=( + [a/x]=0 + [a/y]=0 + [b/x]=1 + [b/y]=1 + [c/x]=0 + [c/y]=0 +) +checkFileset 'intersection ./b ./.' +checkFileset 'intersection ./b (unions [ ./a/x ./a/y ./b/x ./b/y ./c/x ./c/y ])' + +# Complicated case +tree=( + [a/x]=0 + [a/b/i]=1 + [c/d/x]=0 + [c/d/f]=1 + [c/x]=0 + [c/e/i]=1 + [c/e/j]=1 +) +checkFileset 'intersection (unions [ ./a/b ./c/d ./c/e ]) (unions [ ./a ./c/d/f ./c/e ])' + +## Difference + +# Subtracting something from itself results in nothing +tree=( + [a]=0 +) +checkFileset 'difference ./. ./.' + +# The tree of the second argument should not be evaluated if not needed +checkFileset 'difference _emptyWithoutBase (_create ./. (abort "This should not be used!"))' +checkFileset 'difference (_create ./. null) (_create ./. (abort "This should not be used!"))' + +# Subtracting nothing gives the same thing back +tree=( + [a]=1 +) +checkFileset 'difference ./. _emptyWithoutBase' +checkFileset 'difference ./. 
(_create ./. null)' + +# Subtracting doesn't influence the base path +mkdir a b +touch {a,b}/x +expectEqual 'toSource { root = ./a; fileset = difference ./a ./b; }' 'toSource { root = ./a; fileset = ./a; }' +rm -rf -- * + +# Also not the other way around +mkdir a +expectFailure 'toSource { root = ./a; fileset = difference ./. ./a; }' 'lib.fileset.toSource: `fileset` could contain files in '"$work"', which is not under the `root` \('"$work"'/a\). Potential solutions: +\s*- Set `root` to '"$work"' or any directory higher up. This changes the layout of the resulting store path. +\s*- Set `fileset` to a file set that cannot contain files outside the `root` \('"$work"'/a\). This could change the files included in the result.' +rm -rf -- * + +# Difference actually works +# We test all combinations of ./., ./a, ./a/x and ./b +tree=( + [a/x]=0 + [a/y]=0 + [b]=0 + [c]=0 +) +checkFileset 'difference ./. ./.' +checkFileset 'difference ./a ./.' +checkFileset 'difference ./a/x ./.' +checkFileset 'difference ./b ./.' +checkFileset 'difference ./a ./a' +checkFileset 'difference ./a/x ./a' +checkFileset 'difference ./a/x ./a/x' +checkFileset 'difference ./b ./b' +tree=( + [a/x]=0 + [a/y]=0 + [b]=1 + [c]=1 +) +checkFileset 'difference ./. ./a' +tree=( + [a/x]=1 + [a/y]=1 + [b]=0 + [c]=0 +) +checkFileset 'difference ./a ./b' +tree=( + [a/x]=1 + [a/y]=0 + [b]=0 + [c]=0 +) +checkFileset 'difference ./a/x ./b' +tree=( + [a/x]=0 + [a/y]=1 + [b]=0 + [c]=0 +) +checkFileset 'difference ./a ./a/x' +tree=( + [a/x]=0 + [a/y]=0 + [b]=1 + [c]=0 +) +checkFileset 'difference ./b ./a' +checkFileset 'difference ./b ./a/x' +tree=( + [a/x]=0 + [a/y]=1 + [b]=1 + [c]=1 +) +checkFileset 'difference ./. ./a/x' +tree=( + [a/x]=1 + [a/y]=1 + [b]=0 + [c]=1 +) +checkFileset 'difference ./. 
./b' + +## File filter + +# The first argument needs to be a function +expectFailure 'fileFilter null (abort "this is not needed")' 'lib.fileset.fileFilter: First argument is of type null, but it should be a function instead.' + +# The second argument needs to be an existing path +expectFailure 'fileFilter (file: abort "this is not needed") _emptyWithoutBase' 'lib.fileset.fileFilter: Second argument is a file set, but it should be a path instead. +\s*If you need to filter files in a file set, use `intersection fileset \(fileFilter pred \./\.\)` instead.' +expectFailure 'fileFilter (file: abort "this is not needed") null' 'lib.fileset.fileFilter: Second argument is of type null, but it should be a path instead.' +expectFailure 'fileFilter (file: abort "this is not needed") ./a' 'lib.fileset.fileFilter: Second argument \('"$work"'/a\) is a path that does not exist.' + +# The predicate is not called when there's no files +tree=() +checkFileset 'fileFilter (file: abort "this is not needed") ./.' + +# The predicate must be able to handle extra attributes +touch a +expectFailure 'toSource { root = ./.; fileset = fileFilter ({ name, type }: true) ./.; }' 'called with unexpected argument '\''"lib.fileset.fileFilter: The predicate function passed as the first argument must be able to handle extra attributes for future compatibility. If you'\''re using `\{ name, file \}:`, use `\{ name, file, ... \}:` instead."'\' +rm -rf -- * + +# .name is the name, and it works correctly, even recursively +tree=( + [a]=1 + [b]=0 + [c/a]=1 + [c/b]=0 + [d/c/a]=1 + [d/c/b]=0 +) +checkFileset 'fileFilter (file: file.name == "a") ./.' +tree=( + [a]=0 + [b]=1 + [c/a]=0 + [c/b]=1 + [d/c/a]=0 + [d/c/b]=1 +) +checkFileset 'fileFilter (file: file.name != "a") ./.' 
+ +# `.type` is the file type +mkdir d +touch d/a +ln -s d/b d/b +mkfifo d/c +expectEqual \ + 'toSource { root = ./.; fileset = fileFilter (file: file.type == "regular") ./.; }' \ + 'toSource { root = ./.; fileset = ./d/a; }' +expectEqual \ + 'toSource { root = ./.; fileset = fileFilter (file: file.type == "symlink") ./.; }' \ + 'toSource { root = ./.; fileset = ./d/b; }' +expectEqual \ + 'toSource { root = ./.; fileset = fileFilter (file: file.type == "unknown") ./.; }' \ + 'toSource { root = ./.; fileset = ./d/c; }' +expectEqual \ + 'toSource { root = ./.; fileset = fileFilter (file: file.type != "regular") ./.; }' \ + 'toSource { root = ./.; fileset = union ./d/b ./d/c; }' +expectEqual \ + 'toSource { root = ./.; fileset = fileFilter (file: file.type != "symlink") ./.; }' \ + 'toSource { root = ./.; fileset = union ./d/a ./d/c; }' +expectEqual \ + 'toSource { root = ./.; fileset = fileFilter (file: file.type != "unknown") ./.; }' \ + 'toSource { root = ./.; fileset = union ./d/a ./d/b; }' +rm -rf -- * + +# It's lazy +tree=( + [b]=1 + [c/a]=1 +) +# Note that union evaluates the first argument first if necessary, that's why we can use ./c/a here +checkFileset 'union ./c/a (fileFilter (file: assert file.name != "a"; true) ./.)' +# but here we need to use ./c +checkFileset 'union (fileFilter (file: assert file.name != "a"; true) ./.) ./c' + +# Make sure single files are filtered correctly +tree=( + [a]=1 + [b]=0 +) +checkFileset 'fileFilter (file: assert file.name == "a"; true) ./a' +tree=( + [a]=0 + [b]=0 +) +checkFileset 'fileFilter (file: assert file.name == "a"; false) ./a' + +## Tracing + +# The second trace argument is returned +expectEqual 'trace ./. "some value"' 'builtins.trace "(empty)" "some value"' + +# The fileset traceVal argument is returned +expectEqual 'traceVal ./.' 'builtins.trace "(empty)" (_create ./. "directory")' + +# The tracing happens before the final argument is needed +expectEqual 'trace ./.' 
'builtins.trace "(empty)" (x: x)' + +# Tracing an empty directory shows it as such +expectTrace './.' '(empty)' + +# This also works if there are directories, but all recursively without files +mkdir -p a/b/c +expectTrace './.' '(empty)' +rm -rf -- * + +# The empty file set without a base also prints as empty +expectTrace '_emptyWithoutBase' '(empty)' +expectTrace 'unions [ ]' '(empty)' +mkdir foo bar +touch {foo,bar}/x +expectTrace 'intersection ./foo ./bar' '(empty)' +rm -rf -- * + +# If a directory is fully included, print it as such +touch a +expectTrace './.' "$work"' (all files in directory)' +rm -rf -- * + +# If a directory is not fully included, recurse +mkdir a b +touch a/{x,y} b/{x,y} +expectTrace 'union ./a/x ./b' "$work"' +- a + - x (regular) +- b (all files in directory)' +rm -rf -- * + +# If an included path is a file, print its type +touch a x +ln -s a b +mkfifo c +expectTrace 'unions [ ./a ./b ./c ]' "$work"' +- a (regular) +- b (symlink) +- c (unknown)' +rm -rf -- * + +# Do not print directories without any files recursively +mkdir -p a/b/c +touch b x +expectTrace 'unions [ ./a ./b ]' "$work"' +- b (regular)' +rm -rf -- * + +# If all children are either fully included or empty directories, +# the parent should be printed as fully included +touch a +mkdir b +expectTrace 'union ./a ./b' "$work"' (all files in directory)' +rm -rf -- * + +mkdir -p x/b x/c +touch x/a +touch a +# If all children are either fully excluded or empty directories, +# the parent should be shown (or rather not shown) as fully excluded +expectTrace 'unions [ ./a ./x/b ./x/c ]' "$work"' +- a (regular)' +rm -rf -- * + +# Completely filtered out directories also print as empty +touch a +expectTrace '_create ./. 
{}' '(empty)' +rm -rf -- * + +# A general test to make sure the resulting format makes sense +# Such as indentation and ordering +mkdir -p bar/{qux,someDir} +touch bar/{baz,qux,someDir/a} foo +touch bar/qux/x +ln -s x bar/qux/a +mkfifo bar/qux/b +expectTrace 'unions [ + ./bar/baz + ./bar/qux/a + ./bar/qux/b + ./bar/someDir/a + ./foo +]' "$work"' +- bar + - baz (regular) + - qux + - a (symlink) + - b (unknown) + - someDir (all files in directory) +- foo (regular)' +rm -rf -- * + +# For recursively included directories, +# `(all files in directory)` should only be used if there's at least one file (otherwise it would be `(empty)`) +# and this should be determined without doing a full search +# +# a is intentionally ordered first here in order to allow triggering the short-circuit behavior +# We then check that b is not read +# In a more realistic scenario, some directories might need to be recursed into, +# but a file would be quickly found to trigger the short-circuit. +touch a +mkdir b +# We don't have lambda's in bash unfortunately, +# so we just define a function instead and then pass its name +# shellcheck disable=SC2317 +run() { + # This shouldn't read b/ + expectTrace './.' "$work"' (all files in directory)' + # Remove all files immediately after, triggering delete_self events for all of them + rmdir b +} +# Runs the function while checking that b isn't read +withFileMonitor run b +rm -rf -- * + +# Partially included directories trace entries as they are evaluated +touch a b c +expectTrace '_create ./. { a = null; b = "regular"; c = throw "b"; }' "$work"' +- b (regular)' + +# Except entries that need to be evaluated to even figure out if it's only partially included: +# Here the directory could be fully excluded or included just from seeing a and b, +# so c needs to be evaluated before anything can be traced +expectTrace '_create ./. { a = null; b = null; c = throw "c"; }' '' +expectTrace '_create ./. 
{ a = "regular"; b = "regular"; c = throw "c"; }' '' +rm -rf -- * + +# We can trace large directories (10000 here) without any problems +filesToCreate=({0..9}{0..9}{0..9}{0..9}) +expectedTrace=$work$'\n'$(printf -- '- %s (regular)\n' "${filesToCreate[@]}") +# We need an excluded file so it doesn't print as `(all files in directory)` +touch 0 "${filesToCreate[@]}" +expectTrace 'unions (mapAttrsToList (n: _: ./. + "/${n}") (removeAttrs (builtins.readDir ./.) [ "0" ]))' "$expectedTrace" +rm -rf -- * + +## lib.fileset.fromSource + +# Check error messages +expectFailure 'fromSource null' 'lib.fileset.fromSource: The source origin of the argument is of type null, but it should be a path instead.' + +expectFailure 'fromSource (lib.cleanSource "")' 'lib.fileset.fromSource: The source origin of the argument is a string-like value \(""\), but it should be a path instead. +\s*Sources created from paths in strings cannot be turned into file sets, use `lib.sources` or derivations instead.' + +expectFailure 'fromSource (lib.cleanSource null)' 'lib.fileset.fromSource: The source origin of the argument is of type null, but it should be a path instead.' + +# fromSource on a path works and is the same as coercing that path +mkdir a +touch a/b c +expectEqual 'trace (fromSource ./.) null' 'trace ./. 
null' +rm -rf -- * + +# Check that converting to a file set doesn't read the included files +mkdir a +touch a/b +run() { + expectEqual "trace (fromSource (lib.cleanSourceWith { src = ./a; })) null" "builtins.trace \"$work/a (all files in directory)\" null" + rm a/b +} +withFileMonitor run a/b +rm -rf -- * + +# Check that converting to a file set doesn't read entries for directories that are filtered out +mkdir -p a/b +touch a/b/c +run() { + expectEqual "trace (fromSource (lib.cleanSourceWith { + src = ./a; + filter = pathString: type: false; + })) null" "builtins.trace \"(empty)\" null" + rm a/b/c + rmdir a/b +} +withFileMonitor run a/b +rm -rf -- * + +# The filter is not needed on empty directories +expectEqual 'trace (fromSource (lib.cleanSourceWith { + src = ./.; + filter = abort "filter should not be needed"; +})) null' 'trace _emptyWithoutBase null' + +# Single files also work +touch a b +expectEqual 'trace (fromSource (cleanSourceWith { src = ./a; })) null' 'trace ./a null' +rm -rf -- * + +# For a tree assigning each subpath true/false, +# check whether a source filter with those results includes the same files +# as a file set created using fromSource. 
Usage: +# +# tree=( +# [a]=1 # ./a is a file and the filter should return true for it +# [b/]=0 # ./b is a directory and the filter should return false for it +# ) +# checkSource +checkSource() { + createTree + + # Serialise the tree as JSON (there's only minimal savings with jq, + # and we don't need to handle escapes) + { + echo "{" + first=1 + for p in "${!tree[@]}"; do + if [[ -z "$first" ]]; then + echo "," + else + first= + fi + echo "\"$p\":" + case "${tree[$p]}" in + 1) + echo "true" + ;; + 0) + echo "false" + ;; + *) + die "Unsupported tree value: ${tree[$p]}" + esac + done + echo "}" + } > "$tmp/tree.json" + + # An expression to create a source value with a filter matching the tree + sourceExpr=' + let + tree = importJSON '"$tmp"'/tree.json; + in + cleanSourceWith { + src = ./.; + filter = + pathString: type: + let + stripped = removePrefix (toString ./. + "/") pathString; + key = stripped + optionalString (type == "directory") "/"; + in + tree.${key} or + (throw "tree key ${key} missing"); + } + ' + + filesetExpr=' + toSource { + root = ./.; + fileset = fromSource ('"$sourceExpr"'); + } + ' + + # Turn both into store paths + sourceStorePath=$(expectStorePath "$sourceExpr") + filesetStorePath=$(expectStorePath "$filesetExpr") + + # Loop through each path in the tree + while IFS= read -r -d $'\0' subpath; do + if [[ ! 
-e "$sourceStorePath"/"$subpath" ]]; then + # If it's not in the source store path, it's also not in the file set store path + if [[ -e "$filesetStorePath"/"$subpath" ]]; then + die "The store path $sourceStorePath created by $expr doesn't contain $subpath, but the corresponding store path $filesetStorePath created via fromSource does contain $subpath" + fi + elif [[ -z "$(find "$sourceStorePath"/"$subpath" -type f)" ]]; then + # If it's an empty directory in the source store path, it shouldn't be in the file set store path + if [[ -e "$filesetStorePath"/"$subpath" ]]; then + die "The store path $sourceStorePath created by $expr contains the path $subpath without any files, but the corresponding store path $filesetStorePath created via fromSource didn't omit it" + fi + else + # If it's a non-empty directory or a file, it should be in the file set store path + if [[ ! -e "$filesetStorePath"/"$subpath" ]]; then + die "The store path $sourceStorePath created by $expr contains the non-empty path $subpath, but the corresponding store path $filesetStorePath created via fromSource doesn't include it" + fi + fi + done < <(find . 
-mindepth 1 -print0) + + rm -rf -- * +} + +# Check whether the filter is evaluated correctly +tree=( + [a]= + [b/]= + [b/c]= + [b/d]= + [e/]= + [e/e/]= +) +# We fill out the above tree values with all possible combinations of 0 and 1 +# Then check whether a filter based on those return values gets turned into the corresponding file set +for i in $(seq 0 $((2 ** ${#tree[@]} - 1 ))); do + for p in "${!tree[@]}"; do + tree[$p]=$(( i % 2 )) + (( i /= 2 )) || true + done + checkSource +done + +# The filter is called with the same arguments in the same order +mkdir a e +touch a/b a/c d e +expectEqual ' + trace (fromSource (cleanSourceWith { + src = ./.; + filter = pathString: type: builtins.trace "${pathString} ${toString type}" true; + })) null +' ' + builtins.seq (cleanSourceWith { + src = ./.; + filter = pathString: type: builtins.trace "${pathString} ${toString type}" true; + }).outPath + builtins.trace "'"$work"' (all files in directory)" + null +' +rm -rf -- * + +# Test that if a directory is not included, the filter isn't called on its contents +mkdir a b +touch a/c b/d +expectEqual 'trace (fromSource (cleanSourceWith { + src = ./.; + filter = pathString: type: + if pathString == toString ./a then + false + else if pathString == toString ./b then + true + else if pathString == toString ./b/d then + true + else + abort "This filter should not be called with path ${pathString}"; +})) null' 'trace (_create ./. 
{ b = "directory"; }) null' +rm -rf -- * + +# The filter is called lazily: +# If a later say intersection removes a part of the tree, the filter won't run on it +mkdir a d +touch a/{b,c} d/e +expectEqual 'trace (intersection ./a (fromSource (lib.cleanSourceWith { + src = ./.; + filter = pathString: type: + if pathString == toString ./a || pathString == toString ./a/b then + true + else if pathString == toString ./a/c then + false + else + abort "filter should not be called on ${pathString}"; +}))) null' 'trace ./a/b null' +rm -rf -- * + +## lib.fileset.gitTracked/gitTrackedWith + +# The first/second argument has to be a path +expectFailure 'gitTracked null' 'lib.fileset.gitTracked: Expected the argument to be a path, but it'\''s a null instead.' +expectFailure 'gitTrackedWith {} null' 'lib.fileset.gitTrackedWith: Expected the second argument to be a path, but it'\''s a null instead.' + +# The path has to contain a .git directory +expectFailure 'gitTracked ./.' 'lib.fileset.gitTracked: Expected the argument \('"$work"'\) to point to a local working tree of a Git repository, but it'\''s not.' +expectFailure 'gitTrackedWith {} ./.' 'lib.fileset.gitTrackedWith: Expected the second argument \('"$work"'\) to point to a local working tree of a Git repository, but it'\''s not.' + +# recurseSubmodules has to be a boolean +expectFailure 'gitTrackedWith { recurseSubmodules = null; } ./.' 'lib.fileset.gitTrackedWith: Expected the attribute `recurseSubmodules` of the first argument to be a boolean, but it'\''s a null instead.' + +# recurseSubmodules = true is not supported on all Nix versions +if [[ "$(nix-instantiate --eval --expr "$(prefixExpression) (versionAtLeast builtins.nixVersion _fetchGitSubmodulesMinver)")" == true ]]; then + fetchGitSupportsSubmodules=1 +else + fetchGitSupportsSubmodules= + expectFailure 'gitTrackedWith { recurseSubmodules = true; } ./.' 
'lib.fileset.gitTrackedWith: Setting the attribute `recurseSubmodules` to `true` is only supported for Nix version 2.4 and after, but Nix version [0-9.]+ is used.' +fi + +# Checks that `gitTrackedWith` contains the same files as `git ls-files` +# for the current working directory. +# If --recurse-submodules is passed, the flag is passed through to `git ls-files` +# and as `recurseSubmodules` to `gitTrackedWith` +checkGitTrackedWith() { + if [[ "${1:-}" == "--recurse-submodules" ]]; then + gitLsFlags="--recurse-submodules" + gitTrackedArg="{ recurseSubmodules = true; }" + else + gitLsFlags="" + gitTrackedArg="{ }" + fi + + # All files listed by `git ls-files` + expectedFiles=() + while IFS= read -r -d $'\0' file; do + # If there are submodules but --recurse-submodules isn't passed, + # `git ls-files` lists them as empty directories, + # we need to filter that out since we only want to check/count files + if [[ -f "$file" ]]; then + expectedFiles+=("$file") + fi + done < <(git ls-files -z $gitLsFlags) + + storePath=$(expectStorePath 'toSource { root = ./.; fileset = gitTrackedWith '"$gitTrackedArg"' ./.; }') + + # Check that each expected file is also in the store path with the same content + for expectedFile in "${expectedFiles[@]}"; do + if [[ ! -e "$storePath"/"$expectedFile" ]]; then + die "Expected file $expectedFile to exist in $storePath, but it doesn't.\nGit status:\n$(git status)\nStore path contents:\n$(find "$storePath")" + fi + if ! diff "$expectedFile" "$storePath"/"$expectedFile"; then + die "Expected file $expectedFile to have the same contents as in $storePath, but it doesn't.\nGit status:\n$(git status)\nStore path contents:\n$(find "$storePath")" + fi + done + + # This is a cheap way to verify the inverse: That all files in the store path are also expected + # We just count the number of files in both and verify they're the same + actualFileCount=$(find "$storePath" -type f -printf . 
| wc -c) + if [[ "${#expectedFiles[@]}" != "$actualFileCount" ]]; then + die "Expected ${#expectedFiles[@]} files in $storePath, but got $actualFileCount.\nGit status:\n$(git status)\nStore path contents:\n$(find "$storePath")" + fi +} + + +# Runs checkGitTrackedWith with and without --recurse-submodules +# Allows testing both variants together +checkGitTracked() { + checkGitTrackedWith + if [[ -n "$fetchGitSupportsSubmodules" ]]; then + checkGitTrackedWith --recurse-submodules + fi +} + +createGitRepo() { + git init -q "$1" + # Only repo-local config + git -C "$1" config user.name "Nixpkgs" + git -C "$1" config user.email "nixpkgs@nixos.org" + # Get at least a HEAD commit, needed for older Nix versions + git -C "$1" commit -q --allow-empty -m "Empty commit" +} + +# Check the error message for pure eval mode +createGitRepo . +expectFailure --simulate-pure-eval 'toSource { root = ./.; fileset = gitTracked ./.; }' 'lib.fileset.gitTracked: This function is currently not supported in pure evaluation mode, since it currently relies on `builtins.fetchGit`. See https://github.com/NixOS/nix/issues/9292.' +expectFailure --simulate-pure-eval 'toSource { root = ./.; fileset = gitTrackedWith {} ./.; }' 'lib.fileset.gitTrackedWith: This function is currently not supported in pure evaluation mode, since it currently relies on `builtins.fetchGit`. See https://github.com/NixOS/nix/issues/9292.' +rm -rf -- * + +# Go through all stages of Git files +# See https://www.git-scm.com/book/en/v2/Git-Basics-Recording-Changes-to-the-Repository + +# Empty repository +createGitRepo . +checkGitTracked + +# Untracked file +echo a > a +checkGitTracked + +# Staged file +git add a +checkGitTracked + +# Committed file +git commit -q -m "Added a" +checkGitTracked + +# Edited file +echo b > a +checkGitTracked + +# Removed file +git rm -f -q a +checkGitTracked + +rm -rf -- * + +# gitignored file +createGitRepo . 
+echo a > .gitignore +touch a +git add -A +checkGitTracked + +# Add it regardless (needs -f) +git add -f a +checkGitTracked +rm -rf -- * + +# Directory +createGitRepo . +mkdir -p d1/d2/d3 +touch d1/d2/d3/a +git add d1 +checkGitTracked +rm -rf -- * + +# Submodules +createGitRepo . +createGitRepo sub + +# Untracked submodule +git -C sub commit -q --allow-empty -m "Empty commit" +checkGitTracked + +# Tracked submodule +git submodule add ./sub sub >/dev/null +checkGitTracked + +# Untracked file +echo a > sub/a +checkGitTracked + +# Staged file +git -C sub add a +checkGitTracked + +# Committed file +git -C sub commit -q -m "Add a" +checkGitTracked + +# Changed file +echo b > sub/b +checkGitTracked + +# Removed file +git -C sub rm -f -q a +checkGitTracked + +rm -rf -- * # TODO: Once we have combinators and a property testing library, derive property tests from https://en.wikipedia.org/wiki/Algebra_of_sets
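The closing TODO refers to the algebra of sets. As a rough illustration of the kind of laws such property tests could check, here is a minimal bash sketch. It is not part of the commit and does not use `lib.fileset`: it assumes a stand-in model in which a "file set" is a newline-separated list of paths, and the helper names `set_union`, `set_intersection` and `set_difference` are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail
# Use a fixed collation so that sort and comm agree on ordering
export LC_ALL=C

# Hypothetical stand-in model, NOT lib.fileset:
# a "file set" is a newline-separated list of paths, the empty set is ''.
set_union() { printf '%s\n%s\n' "$1" "$2" | sed '/^$/d' | sort -u; }
set_intersection() { comm -12 <(sort -u <<< "$1") <(sort -u <<< "$2") | sed '/^$/d'; }
set_difference() { comm -23 <(sort -u <<< "$1") <(sort -u <<< "$2") | sed '/^$/d'; }

a=$'a/x\na/y'
b=$'a/y\nb'
empty=''

# Commutativity: A ∪ B = B ∪ A
[[ "$(set_union "$a" "$b")" == "$(set_union "$b" "$a")" ]]
# Identity element: A ∪ ∅ = A
[[ "$(set_union "$a" "$empty")" == "$a" ]]
# Subtracting a set from itself leaves nothing: A \ A = ∅
[[ -z "$(set_difference "$a" "$a")" ]]
# Difference and intersection partition A: (A \ B) ∪ (A ∩ B) = A
[[ "$(set_union "$(set_difference "$a" "$b")" "$(set_intersection "$a" "$b")")" == "$a" ]]

echo "all set laws hold"
```

In the actual test suite, checks of this shape would presumably be phrased with the existing helpers (for example comparing two `toSource` results with `expectEqual`) and with generated rather than fixed inputs; the fixed sets above only show what each law asserts.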