Files
git-mirror/Documentation/git-diff.adoc
Jacob Keller 09fb155f11 diff --no-index: support limiting by pathspec
The --no-index option of git-diff enables using the diff machinery from
git while operating outside of a repository. This mode of git diff is
able to compare directories and produce a diff of their contents.

When operating git diff in a repository, git has the notion of
"pathspecs" which can specify which files to compare. In particular,
when using git to diff two trees, you might invoke:

  $ git diff-tree -r <treeish1> <treeish2>.

where the treeish could point to a subdirectory of the repository.

When invoked this way, users can limit the selected paths of the tree by
using a pathspec. Either by providing some list of paths to accept, or
by removing paths via a negative refspec.

The git diff --no-index mode does not support pathspecs, and cannot
limit the diff output in this way. Other diff programs such as GNU
difftools have options for excluding paths based on a pattern match.
However, using git diff as a diff replacement has several advantages
over many popular diff tools, including coloring moved lines, rename
detections, and similar.

Teach git diff --no-index how to handle pathspecs to limit the
comparisons. This will only be supported if both provided paths are
directories.

For comparisons where one path isn't a directory, the --no-index mode
already has some DWIM shortcuts implemented in the fixup_paths()
function.

Modify the fixup_paths function to return 1 if both paths are
directories. If this is the case, interpret any extra arguments to git
diff as pathspecs via parse_pathspec.

Use parse_pathspec to load the remaining arguments (if any) to git diff
--no-index as pathspec items. Disable PATHSPEC_ATTR support since we do
not have a repository to do attribute lookup. Disable PATHSPEC_FROMTOP
since we do not have a repository root. All pathspecs are treated as
rooted at the provided comparison paths.

After loading the pathspec data, calculate skip offsets for skipping
past the root portion of the paths. This is required to ensure that
pathspecs start matching from the provided path, rather than matching
from the absolute path. We could instead pass the paths as prefix values
to parse_pathspec. This is slightly problematic because the paths come
from the command line and don't necessarily have the proper trailing
slash. Additionally, that would require parsing pathspecs multiple
times.

Pass the pathspec object and the skip offsets into queue_diff, which
in-turn must pass them along to read_directory_contents.

Modify read_directory_contents to check against the pathspecs when
scanning the directory. Use the skip offset to skip past the initial
root of the path, and only match against portions that are below the
intended directory structure being compared.

The search algorithm for finding paths is recursive with read_dir. To
make pathspec matching work properly, we must set both
DO_MATCH_DIRECTORY and DO_MATCH_LEADING_PATHSPEC.

Without DO_MATCH_DIRECTORY, paths like "a/b/c/d" will not match against
pathspecs like "a/b/c". This is usually achieved by setting the is_dir
parameter of match_pathspec.

Without DO_MATCH_LEADING_PATHSPEC, paths like "a/b/c" would not match
against pathspecs like "a/b/c/d". This is crucial because we recursively
iterate down the directories. We could simply avoid checking pathspecs
at subdirectories, but this would force recursion down directories
which would simply be skipped.

If we always passed DO_MATCH_LEADING_PATHSPEC, then we will
incorrectly match in certain cases such as matching 'a/c' against
':(glob)**/d'. The match logic will see that a matches the leading part
of the **/ and accept this even tho c doesn't match.

To avoid this, use the match_leading_pathspec() variant recently
introduced. This sets both flags when is_dir is set, but leaves them
both cleared when is_dir is 0.

Add test cases and documentation covering the new functionality. Note
for the documentation I opted not to move the placement of '--' which is
sometimes used to disambiguate arguments. The diff --no-index mode
requires exactly 2 arguments determining what to compare. Any additional
arguments are interpreted as pathspecs and must come afterwards. Use of
'--' would not actually disambiguate anything, since there will never be
ambiguity over which arguments represent paths or pathspecs.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-22 14:20:11 -07:00

257 lines
8.3 KiB
Plaintext

git-diff(1)
===========
NAME
----
git-diff - Show changes between commits, commit and working tree, etc
SYNOPSIS
--------
[synopsis]
git diff [<options>] [<commit>] [--] [<path>...]
git diff [<options>] --cached [--merge-base] [<commit>] [--] [<path>...]
git diff [<options>] [--merge-base] <commit> [<commit>...] <commit> [--] [<path>...]
git diff [<options>] <commit>...<commit> [--] [<path>...]
git diff [<options>] <blob> <blob>
git diff [<options>] --no-index [--] <path> <path> [<pathspec>...]
DESCRIPTION
-----------
Show changes between the working tree and the index or a tree, changes
between the index and a tree, changes between two trees, changes resulting
from a merge, changes between two blob objects, or changes between two
files on disk.
`git diff [<options>] [--] [<path>...]`::
This form is to view the changes you made relative to
the index (staging area for the next commit). In other
words, the differences are what you _could_ tell Git to
further add to the index but you still haven't. You can
stage these changes by using linkgit:git-add[1].
`git diff [<options>] --no-index [--] <path> <path> [<pathspec>...]`::
This form is to compare the given two paths on the
filesystem. You can omit the `--no-index` option when
running the command in a working tree controlled by Git and
at least one of the paths points outside the working tree,
or when running the command outside a working tree
controlled by Git. This form implies `--exit-code`. If both
paths point to directories, additional pathspecs may be
provided. These will limit the files included in the
difference. All such pathspecs must be relative as they
apply to both sides of the diff.
`git diff [<options>] --cached [--merge-base] [<commit>] [--] [<path>...]`::
This form is to view the changes you staged for the next
commit relative to the named _<commit>_. Typically you
would want comparison with the latest commit, so if you
do not give _<commit>_, it defaults to `HEAD`.
If `HEAD` does not exist (e.g. unborn branches) and
_<commit>_ is not given, it shows all staged changes.
`--staged` is a synonym of `--cached`.
+
If `--merge-base` is given, instead of using _<commit>_, use the merge base
of _<commit>_ and `HEAD`. `git diff --cached --merge-base A` is equivalent to
`git diff --cached $(git merge-base A HEAD)`.
`git diff [<options>] [--merge-base] <commit> [--] [<path>...]`::
This form is to view the changes you have in your
working tree relative to the named _<commit>_. You can
use `HEAD` to compare it with the latest commit, or a
branch name to compare with the tip of a different
branch.
+
If `--merge-base` is given, instead of using _<commit>_, use the merge base
of _<commit>_ and `HEAD`. `git diff --merge-base A` is equivalent to
`git diff $(git merge-base A HEAD)`.
`git diff [<options>] [--merge-base] <commit> <commit> [--] [<path>...]`::
This is to view the changes between two arbitrary
_<commit>_.
+
If `--merge-base` is given, use the merge base of the two commits for the
"before" side. `git diff --merge-base A B` is equivalent to
`git diff $(git merge-base A B) B`.
`git diff [<options>] <commit> <commit>...<commit> [--] [<path>...]`::
This form is to view the results of a merge commit. The first
listed _<commit>_ must be the merge itself; the remaining two or
more commits should be its parents. Convenient ways to produce
the desired set of revisions are to use the suffixes `@` and
`^!`. If `A` is a merge commit, then `git diff A A^@`,
`git diff A^!` and `git show A` all give the same combined diff.
`git diff [<options>] <commit>..<commit> [--] [<path>...]`::
This is synonymous to the earlier form (without the `..`) for
viewing the changes between two arbitrary _<commit>_. If _<commit>_ on
one side is omitted, it will have the same effect as
using `HEAD` instead.
`git diff [<options>] <commit>...<commit> [--] [<path>...]`::
This form is to view the changes on the branch containing
and up to the second _<commit>_, starting at a common ancestor
of both _<commit>_. `git diff A...B` is equivalent to
`git diff $(git merge-base A B) B`. You can omit any one
of _<commit>_, which has the same effect as using `HEAD` instead.
Just in case you are doing something exotic, it should be
noted that all of the _<commit>_ in the above description, except
in the `--merge-base` case and in the last two forms that use `..`
notations, can be any _<tree>_. A tree of interest is the one pointed to
by the ref named `AUTO_MERGE`, which is written by the `ort` merge
strategy upon hitting merge conflicts (see linkgit:git-merge[1]).
Comparing the working tree with `AUTO_MERGE` shows changes you've made
so far to resolve textual conflicts (see the examples below).
For a more complete list of ways to spell _<commit>_, see
"SPECIFYING REVISIONS" section in linkgit:gitrevisions[7].
However, `diff` is about comparing two _endpoints_, not ranges,
and the range notations (`<commit>..<commit>` and `<commit>...<commit>`)
do not mean a range as defined in the
"SPECIFYING RANGES" section in linkgit:gitrevisions[7].
`git diff [<options>] <blob> <blob>`::
This form is to view the differences between the raw
contents of two blob objects.
OPTIONS
-------
:git-diff: 1
include::diff-options.adoc[]
`-1`::
`--base`::
`-2`::
`--ours`::
`-3`::
`--theirs`::
Compare the working tree with
+
--
* the "base" version (stage #1) when using `-1` or `--base`,
* "our branch" (stage #2) when using `-2` or `--ours`, or
* "their branch" (stage #3) when using `-3` or `--theirs`.
--
+
The index contains these stages only for unmerged entries i.e.
while resolving conflicts. See linkgit:git-read-tree[1]
section "3-Way Merge" for detailed information.
`-0`::
Omit diff output for unmerged entries and just show
"Unmerged". Can be used only when comparing the working tree
with the index.
`<path>...`::
The _<path>_ parameters, when given, are used to limit
the diff to the named paths (you can give directory
names and get diff for all files under them).
include::diff-format.adoc[]
EXAMPLES
--------
Various ways to check your working tree::
+
------------
$ git diff <1>
$ git diff --cached <2>
$ git diff HEAD <3>
$ git diff AUTO_MERGE <4>
------------
+
<1> Changes in the working tree not yet staged for the next commit.
<2> Changes between the index and your last commit; what you
would be committing if you run `git commit` without `-a` option.
<3> Changes in the working tree since your last commit; what you
would be committing if you run `git commit -a`
<4> Changes in the working tree you've made to resolve textual
conflicts so far.
Comparing with arbitrary commits::
+
------------
$ git diff test <1>
$ git diff HEAD -- ./test <2>
$ git diff HEAD^ HEAD <3>
------------
+
<1> Instead of using the tip of the current branch, compare with the
tip of "test" branch.
<2> Instead of comparing with the tip of "test" branch, compare with
the tip of the current branch, but limit the comparison to the
file "test".
<3> Compare the version before the last commit and the last commit.
Comparing branches::
+
------------
$ git diff topic master <1>
$ git diff topic..master <2>
$ git diff topic...master <3>
------------
+
<1> Changes between the tips of the topic and the master branches.
<2> Same as above.
<3> Changes that occurred on the master branch since when the topic
branch was started off it.
Limiting the diff output::
+
------------
$ git diff --diff-filter=MRC <1>
$ git diff --name-status <2>
$ git diff arch/i386 include/asm-i386 <3>
------------
+
<1> Show only modification, rename, and copy, but not addition
or deletion.
<2> Show only names and the nature of change, but not actual
diff output.
<3> Limit diff output to named subtrees.
Munging the diff output::
+
------------
$ git diff --find-copies-harder -B -C <1>
$ git diff -R <2>
------------
+
<1> Spend extra cycles to find renames, copies and complete
rewrites (very expensive).
<2> Output diff in reverse.
CONFIGURATION
-------------
include::includes/cmd-config-section-all.adoc[]
:git-diff: 1
include::config/diff.adoc[]
SEE ALSO
--------
`diff`(1),
linkgit:git-difftool[1],
linkgit:git-log[1],
linkgit:gitdiffcore[7],
linkgit:git-format-patch[1],
linkgit:git-apply[1],
linkgit:git-show[1]
GIT
---
Part of the linkgit:git[1] suite