The XDF_DIFF_ALGORITHM_MASK bit mask only includes bits for the patience
and histogram diffs, not for the minimal one. This means that when
reseting the diff algorithm to the default one, one needs to separately
clear the bit for the minimal diff. There are places in the code that fail
to do that: merge-ort.c and builtin/merge-file.c.
Add the XDF_NEED_MINIMAL bit to the bit mask, and remove the separate
clearing of this bit in the places where it hasn't been forgotten.
Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A lot of code clean-up of xdiff.
Split out of a larger topic.
* en/xdiff-cleanup:
xdiff: change type of xdfile_t.changed from char to bool
xdiff: add macros DISCARD(0), KEEP(1), INVESTIGATE(2) in xprepare.c
xdiff: rename rchg -> changed in xdfile_t
xdiff: delete chastore from xdfile_t
xdiff: delete fields ha, line, size in xdlclass_t in favor of an xrecord_t
xdiff: delete redundant array xdfile_t.ha
xdiff: delete struct diffdata_t
xdiff: delete local variables that alias fields in xrecord_t
xdiff: delete superfluous function xdl_get_rec() in xemit
xdiff: delete unnecessary fields from xrecord_t and xdfile_t
xdiff: delete local variables and initialize/free xdfile_t directly
xdiff: delete static forward declarations in xprepare
The only values possible for 'changed' is 1 and 0, which exactly maps
to a bool type. It might not look like this because action1 and action2
(which use to be dis1, and dis2) were also of type char and were
assigned numerical values within a few lines of 'changed' (what used to
be rchg).
Using DISCARD/KEEP/INVESTIGATE for action1[i]/action2[j], and true/false
for changed[k] makes it clear to future readers that these are
logically separate concepts.
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit is refactor-only; no behavior is changed. A future commit
will use bool literals for changed[i].
The functions xdl_clean_mmatch() and xdl_cleanup_records() will be
cleaned up more in a future patch series. The changes to
xdl_cleanup_records(), in this patch, are just to make it clear why
`char rchg` is refactored to `bool changed`.
Rename dis* to action* and replace literal numericals with macros.
The old names came from when dis* (which I think was short for discard)
was treated like a boolean, but over time it grew into a ternary state
machine. The result was confusing because dis* and rchg* both used 0/1
values with different meanings.
The new names and macros make the states explicit. nm is short for
number of matches, and mlim is a heuristic limit:
nm == 0 -> action[i] = DISCARD -> changed[i] = true
0 < nm < mlim -> action[i] = KEEP -> changed[i] = false
nm >= mlim -> action[i] = INVESTIGATE -> changed[i] = xdl_clean_mmatch()
When need_min is true, only DISCARD and KEEP occur because the limit
is effectively infinite.
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The field rchg (now 'changed') declares if a line in a file is changed
or not. A later commit will change it's type from 'char' to 'bool'
to make its purpose even more clear.
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
xdfile_t currently uses chastore_t which is an arena allocator. I
think that xrecord_t used to be a linked list and recs didn't exist
originally. When recs was added I think they forgot to remove
xdfile_t.next, but was overlooked. This dual data structure setup
makes the code somewhat confusing.
Additionally the C type chastore_t isn't FFI friendly, and provides
little to no performance benefit over using realloc to grow an array.
Performance impact of deleting fields from xdfile_t:
Deleting ha is about 5% slower.
Deleting cha is about 5% faster.
Delete ha, but keep cha
time hyperfine --warmup 3 -L exe build_v2.51.0/git,build_delete_ha/git '{exe} log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null'
Benchmark 1: build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.269 s ± 0.017 s [User: 1.135 s, System: 0.128 s]
Range (min … max): 1.249 s … 1.286 s 10 runs
Benchmark 2: build_delete_ha/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.339 s ± 0.017 s [User: 1.234 s, System: 0.099 s]
Range (min … max): 1.320 s … 1.358 s 10 runs
Summary
build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null ran
1.06 ± 0.02 times faster than build_delete_ha/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Delete cha, but keep ha
time hyperfine --warmup 3 -L exe build_v2.51.0/git,build_delete_chastore/git '{exe} log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null'
Benchmark 1: build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.290 s ± 0.001 s [User: 1.154 s, System: 0.130 s]
Range (min … max): 1.288 s … 1.292 s 10 runs
Benchmark 2: build_delete_chastore/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.232 s ± 0.017 s [User: 1.105 s, System: 0.121 s]
Range (min … max): 1.205 s … 1.249 s 10 runs
Summary
build_delete_chastore/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null ran
1.05 ± 0.01 times faster than build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Delete ha AND chastore
time hyperfine --warmup 3 -L exe build_v2.51.0/git,build_delete_ha_and_chastore/git '{exe} log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null'
Benchmark 1: build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.291 s ± 0.002 s [User: 1.156 s, System: 0.129 s]
Range (min … max): 1.287 s … 1.295 s 10 runs
Benchmark 2: build_delete_ha_and_chastore/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Time (mean ± σ): 1.306 s ± 0.001 s [User: 1.195 s, System: 0.105 s]
Range (min … max): 1.305 s … 1.308 s 10 runs
Summary
build_v2.51.0/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null ran
1.01 ± 0.00 times faster than build_delete_ha_and_chastore/git log --oneline --shortstat --diff-algorithm=myers -3000 v2.39.1 >/dev/null
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fields from xdlclass_t are aliases of xrecord_t:
xdlclass_t.line -> xrecord_t.ptr
xdlclass_t.size -> xrecord_t.size
xdlclass_t.ha -> xrecord_t.ha
xdlclass_t carries a copy of the data in xrecord_t, but instead of
embedding xrecord_t it duplicates the individual fields. A future
commit will change the types used in xrecord_t so embed it in
xdlclass_t first, so we don't have to remember to change the types
here as well.
Best-viewed-with: --color-words
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 0 <= i < xdfile_t.nreff the following is true:
xdfile_t.ha[i] == xdfile_t.recs[xdfile_t.rindex[i]]
This makes the code about 5% slower. The fields rindex and ha are
specific to the classic diff (myers and minimal). I plan on creating a
struct for classic diff, but there's a lot of cleanup that needs to be
done before that can happen and leaving ha in would make those cleanups
harder to follow.
A subsequent commit will delete the chastore cha from xdfile_t. That
later commit will investigate deleting ha and cha independently and
together.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Every field in this struct is an alias for a certain field in xdfile_t.
diffdata_t.nrec -> xdfile_t.nreff
diffdata_t.ha -> xdfile_t.ha
diffdata_t.rindex -> xdfile_t.rindex
diffdata_t.rchg -> xdfile_t.rchg
I think this struct existed before xdfile_t, and was kept for backward
compatibility reasons. I think xdiffi should have been refactored to
use the new (xdfile_t) struct, but was easier to alias it instead.
The local variables rchg* and rindex* don't shorten the lines by much,
nor do they really need to be there to make the code more readable.
Delete them.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the type xrecord_t as the local variable for the functions in the
file xdiff/xemit.c. Most places directly reference the fields inside of
this struct, doing that here makes it more consistent with the rest of
the code.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When xrecord_t was a linked list, and recs didn't exist, I assume this
function walked the list until it found the right record. Accessing
a contiguous array is so trivial that this function is now superfluous.
Delete it.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
xrecord_t.next, xdfile_t.hbits, xdfile_t.rhash are initialized,
but never used for anything by the code. Remove them.
Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These local variables are essentially a hand-rolled additional
implementation of xdl_free_ctx() inlined into xdl_prepare_ctx(). Modify
the code to use the existing xdl_free_ctx() function so there aren't
two ways to free such variables.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move xdl_prepare_env() later in the file to avoid the need
for static forward declarations.
Best-viewed-with: --color-moved
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
xdl_hash_record_verbatim uses modified djb2 hash with XOR instead of ADD
for combining. The ADD-based variant is used as the basis of the modern
("GNU") symbol lookup scheme in ELF. Glibc dynamic loader received an
optimized version of this hash function thanks to Noah Goldstein [1].
Switch xdl_hash_record_verbatim to additive hashing and implement
an optimized loop following the scheme suggested by Noah.
Timing 'git log --oneline --shortstat v2.0.0..v2.5.0' under perf, I got
version | cycles, bn | instructions, bn
---------------------------------------
A 6.38 11.3
B 6.21 10.89
C 5.80 9.95
D 5.83 8.74
---------------------------------------
A: baseline (git master at e4ef0485fd)
B: plus 'xdiff: refactor xdl_hash_record()'
C: and plus this patch
D: with 'xdiff: use xxhash' by Phillip Wood
The resulting speedup for xdl_hash_record_verbatim itself is about 1.5x.
[1] https://inbox.sourceware.org/libc-alpha/20220519221803.57957-6-goldstein.w.n@gmail.com/
Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Inline the check for whitespace flags so that the compiler can hoist
it out of the loop in xdl_prepare_ctx(). This improves the performance
by 8%.
$ hyperfine --warmup=1 -L rev HEAD,HEAD^ --setup='git checkout {rev} -- :/ && make git' ': {rev}; GIT_CONFIG_GLOBAL=/dev/null ./git log --oneline --shortstat v2.0.0..v2.5.0'
Benchmark 1: : HEAD; GIT_CONFIG_GLOBAL=/dev/null ./git log --oneline --shortstat v2.0.0..v2.5.0
Time (mean ± σ): 1.670 s ± 0.044 s [User: 1.473 s, System: 0.196 s]
Range (min … max): 1.619 s … 1.754 s 10 runs
Benchmark 2: : HEAD^; GIT_CONFIG_GLOBAL=/dev/null ./git log --oneline --shortstat v2.0.0..v2.5.0
Time (mean ± σ): 1.801 s ± 0.021 s [User: 1.605 s, System: 0.192 s]
Range (min … max): 1.766 s … 1.831 s 10 runs
Summary
': HEAD^; GIT_CONFIG_GLOBAL=/dev/null ./git log --oneline --shortstat v2.0.0..v2.5.0' ran
1.08 ± 0.03 times faster than ': HEAD^^; GIT_CONFIG_GLOBAL=/dev/null ./git log --oneline --shortstat v2.0.0..v2.5.0'
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The cleanup_records function marks some lines as changed before running
the actual diff algorithm. For most lines, this is a good performance
optimization, but it also marks lines that are surrounded by many
changed lines as changed as well. This can cause redundant changes and
longer-than-necessary diffs.
Whether this results in better-looking diffs is subjective. However, the
--minimal flag explicitly requests the shortest possible diff.
The change results in shorter diffs in about 1.3% of all diffs in Git's
history. Performance wise, I have measured the impact on
"git log -p -3000 --minimal > /dev/null". With this change, I get
Time (mean ± σ): 2.363 s ± 0.023 s (25 runs)
and without this patch I measured
Time (mean ± σ): 2.362 s ± 0.035 s (25 runs).
As the difference is well within the margin of error, this does not seem
to have an impact on performance.
Signed-off-by: Niels Glodny <n.glodny@campus.lmu.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up.
* js/comma-semicolon-confusion:
detect-compiler: detect clang even if it found CUDA
clang: warn when the comma operator is used
compat/regex: explicitly mark intentional use of the comma operator
wildmatch: avoid using of the comma operator
diff-delta: avoid using the comma operator
xdiff: avoid using the comma operator unnecessarily
clar: avoid using the comma operator unnecessarily
kwset: avoid using the comma operator unnecessarily
rebase: avoid using the comma operator unnecessarily
remote-curl: avoid using the comma operator unnecessarily
The xdiff code on 32-bit platform misbehaved when an insanely large
context size is given, which has been corrected.
* rs/xdiff-context-length-fix:
xdiff: avoid arithmetic overflow in xdl_get_hunk()
The comma operator is a somewhat obscure C feature that is often used by
mistake and can even cause unintentional code flow. While the code in
this patch used the comma operator intentionally (to avoid curly
brackets around two statements, each, that want to be guarded by a
condition), it is better to surround it with curly brackets and to use a
semicolon instead.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
xdl_get_hunk() calculates the maximum number of common lines between two
changes that would fit into the same hunk for the given context options.
It involves doubling and addition and thus can overflow if the terms are
huge.
The type of ctxlen and interhunkctxlen in xdemitconf_t is long, while
the type of the corresponding context and interhunkcontext in struct
diff_options is int. On many platforms longs are bigger that ints,
which prevents the overflow. On Windows they have the same range and
the overflow manifests as hunks that are split erroneously and lines
being repeated between them.
Fix the overflow by checking and not going beyond LONG_MAX. This allows
specifying a huge context line count and getting all lines of a changed
files in a single hunk, as expected.
Reported-by: Jason Cho <jason11choca@proton.me>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The comparisons all involve comparisons against unsigned values.
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The loop iteration variable is non-negative and used in comparisons
against a size_t value. Use size_t to eliminate the mismatch.
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The comparisons all involve unsigned variables. Cast the comparison
to unsigned to eliminate the mismatch.
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The unsigned `ignored` variable causes expressions to promote to
unsigned. Use a signed value to make comparisons use the same types.
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The loop iteration variable is non-negative and only used in comparisons
against other size_t values.
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow each file to fix the warnings guarded by the macro separately by
moving the definition from the shared xinclude.h into each file that
needs it.
xmerge.c and xprepare.c do not contain any signed vs. unsigned
comparisons so the definition was not included in these files.
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark code units that generate warnings with `-Wsign-compare`. This
allows for a structured approach to get rid of all such warnings over
time in a way that can be easily measured.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function is used interchangeably with xdl_emit via a function
pointer, so we can't just drop the unused parameter. Mark it to silence
-Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The def_ff() function is the default "find_func" for finding hunk
headers. It has never used its "priv" argument since it was introduced
in f258475a6e (Per-path attribute based hunk header selection.,
2007-07-06). But back then we used a function pointer to switch between
a caller-provided function and the default, so the two had to conform to
the same interface.
In ff2981f724 (xdiff: factor out match_func_rec(), 2016-05-28), that
pointer indirection went away in favor of code which directly calls
either of the two functions. So there's no need for def_ff() to retain
this unused parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The entry point to the patience-diff algorithm takes two mmfile_t
structs with the original file contents, but it doesn't actually do
anything useful with them. This is similar to the case recently cleaned
up in the histogram code via f1d019071e (xdiff: drop unused mmfile
parameters from xdl_do_histogram_diff(), 2022-08-19), but there's a bit
more subtlety going on.
We pass them into the recursive patience_diff(), which in turn passes
them into fill_hashmap(), which stuffs the pointers into a struct. But
the only thing which reads the struct fields is our recursion into
patience_diff()!
So it's unlikely that something like -Wunused-parameter could find this
case: it would have to detect the circular dependency caused by the
recursion (not to mention tracing across struct field assignments).
But once found, it's easy to have the compiler confirm what's going on:
1. Drop the "file1" and "file2" fields from the hashmap struct
definition. Remove the assignments in fill_hashmap(), and
temporarily substitute NULL in the recursive call to
patience_diff(). Compiling shows that no other code touched those
fields.
2. Now fill_hashmap() will trigger -Wunused-parameter. Drop "file1"
and "file2" from its definition and callsite.
3. Now patience_diff() will trigger -Wunused-parameter. Drop them
there, too. One of the callsites is the recursion with our
NULL values, so those temporary values go away.
4. Now xdl_do_patience_diff() will trigger -Wunused-parameter. Drop
them there. And we're done.
Suggested-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These are no longer used since 9df0fc3d57 (xdiff: fix a memory leak,
2022-02-16), as the caller is expected to call xdl_prepare_env() itself.
After that change the histogram code only examines the prepared
xdfenv_t, not the original buffers.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a helper to grow an array. This is analogous to ALLOC_GROW() in
the rest of the codebase but returns −1 on allocation failure to
accommodate other users of libxdiff such as libgit2. It will also
return a error if the multiplication overflows while calculating the
new allocation size. Note that this keeps doubling on reallocation
like the code it is replacing rather than increasing the existing size
by half like ALLOC_GROW(). It does however copy ALLOC_GROW()'s trick
of adding a small amount to the new allocation to avoid a lot of
reallocations at small sizes.
Note that xdl_alloc_grow_helper() uses long rather than size_t for
`nr` and `alloc` to match the existing code.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a helper for allocating an array and initialize the elements to
zero. This is analogous to CALLOC_ARRAY() in the rest of the codebase
but it returns NULL on allocation failures rather than dying to
accommodate other users of libxdiff such as libgit2.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for introducing XDL_CALLOC_ARRAY() use calloc() to
obtain zeroed out memory rather than malloc() followed by memset(). To
try and keep the lines a reasonable length this commit also stops
casting the pointer returned by calloc() as this is unnecessary.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a helper to allocate an array that automatically calculates the
allocation size. This is analogous to ALLOC_ARRAY() in the rest of the
codebase but returns NULL if the allocation fails to accommodate other
users of libxdiff such as libgit2. The helper will also return NULL if
the multiplication in the allocation calculation overflows.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce and apply coccinelle rule to discourage an explicit
comparison between a pointer and NULL, and applies the clean-up to
the maintenance track.
* ep/maint-equals-null-cocci:
tree-wide: apply equals-null.cocci
tree-wide: apply equals-null.cocci
contrib/coccinnelle: add equals-null.cocci
This macro was added in 3443546f6e (Use a *real* built-in diff
generator, 2006-03-24), but none of the xdiff code uses it, it uses
xdl_free() directly.
If we need its functionality again we'll use the FREE_AND_NULL() macro
added in 481df65f4f (git-compat-util: add a FREE_AND_NULL() wrapper
around free(ptr); ptr = NULL, 2017-06-15).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Other users of xdiff such as libgit2 need to be able to handle
allocation failures. These allocation failures were previously
ignored.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the standard "goto out" pattern rather than repeating very similar
code after checking for each error. This will simplify the next commit
that starts handling allocation failures that are currently ignored.
On error xdl_do_diff() frees the environment so we need to take care
to avoid a double free in that case. xdl_build_script() does not
assign a result unless it is successful so there is no possibility of
a double free if it fails.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Other users of libxdiff such as libgit2 need to be able to handle
allocation failures. As NULL is a valid return value the function
signature is changed to be able report allocation failures.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Although the patience and histogram algorithms initialize the
environment they do not free it if there is an error. In contrast for
the Myers algorithm the environment is initalized in xdl_do_diff() and
it is freed if there is an error. Fix this by always initializing the
environment in xdl_do_diff() and freeing it there if there is an
error. Remove the comment in do_patience_diff() about the environment
being freed by xdl_diff() as it is not accurate because (a) xdl_diff()
does not do that if there is an error and (b) xdl_diff() is not the
only caller.
Reported-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 6b13bc3232 (xdiff: simplify comparison, 2021-11-17), we do not
call xdl_recmatch() from xdiffi.c's recs_match(), so we no longer need
the "flags" parameter. That in turn lets us drop the flags parameters
from the group-slide functions which call it.
There's no functional change here; it's just making these functions a
little simpler.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 663c5ad035 (diff histogram: intern strings, 2021-11-17), our
cmp_recs() does not call xdl_recmatch(), and thus no longer needs an
xpparam_t struct from which to get the flags. We can drop the unused
parameter from the function, as well as the macro which wraps it.
There's no functional change here; it's just simplifying things (and
making -Wunused-parameter happier).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This macro has been unused since 43ca7530df (xdiff/xhistogram: rely on
xdl_trim_ends(), 2011-08-01). The function that called it went away
there, and its use in the CMP() macro was inlined. It probably should
have been deleted then, but nobody noticed.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"zdiff3" is identical to ordinary diff3 except that it allows compaction
of common lines on the two sides of history at the beginning or end of
the conflict hunk. For example, the following diff3 conflict:
1
2
3
4
<<<<<<
A
B
C
D
E
||||||
5
6
======
A
X
C
Y
E
>>>>>>
7
8
9
has common lines 'A', 'C', and 'E' on the two sides. With zdiff3, one
would instead get the following conflict:
1
2
3
4
A
<<<<<<
B
C
D
||||||
5
6
======
X
C
Y
>>>>>>
E
7
8
9
Note that the common lines, 'A', and 'E' were moved outside the
conflict. Unlike with the two-way conflicts from the 'merge'
conflictStyle, the zdiff3 conflict is NOT split into multiple conflict
regions to allow the common 'C' lines to be shown outside a conflict,
because zdiff3 shows the base version too and the base version cannot be
reasonably split.
Note also that the removing of lines common to the two sides might make
the remaining text inside the conflict region match the base text inside
the conflict region (for example, if the diff3 conflict had '5 6 E' on
the right side of the conflict, then the common line 'E' would be moved
outside and both the base and right side's remaining conflict text would
be the lines '5' and '6'). This has the potential to surprise users and
make them think there should not have been a conflict, but there
definitely was a conflict and it should remain.
Based-on-patch-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Co-authored-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>