79181 Commits

Author SHA1 Message Date
Junio C Hamano
e85ae279b0 The seventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-09 07:54:56 +09:00
Junio C Hamano
bbefa15ff5 Merge branch 'en/replay-doc-revision-range'
The use of "revision" (a connected set of commits) has been
clarified in the "git replay" documentation.

* en/replay-doc-revision-range:
  Documentation/git-replay.adoc: fix errors around revision range
2025-12-09 07:54:56 +09:00
Junio C Hamano
7fc0b33b5d Merge branch 'yc/xdiff-patience-optim'
The way patience diff finds LCS has been optimized.

* yc/xdiff-patience-optim:
  xdiff: optimize patience diff's LCS search
2025-12-09 07:54:55 +09:00
Junio C Hamano
fe0e6ffa19 Merge branch 'bc/zsh-testsuite'
A few tests have been updated to work under the shell compatible
mode of zsh.

* bc/zsh-testsuite:
  t5564: fix test hang under zsh's sh mode
  t0614: use numerical comparison with test_line_count
2025-12-09 07:54:54 +09:00
Junio C Hamano
c64b234a0b Merge branch 'pw/replay-exclude-gpgsig-fix'
"git replay" forgot to omit the "gpgsig-sha256" extended header
from the resulting commit the same way it omits "gpgsig", which has
been corrected.

* pw/replay-exclude-gpgsig-fix:
  replay: do not copy "gpgsign-sha256" header
2025-12-09 07:54:54 +09:00
Junio C Hamano
bdc5341ff6 The sixth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-05 14:49:59 +09:00
Junio C Hamano
644aed8921 Merge branch 'rs/config-set-multi-error-message-fix'
The error message given by "git config set", when the variable
being updated has more than one values defined, used old style "git
config" syntax with an incorrect option in its hint, both of which
have been corrected.

* rs/config-set-multi-error-message-fix:
  config: fix suggestion for failed set of multi-valued option
2025-12-05 14:49:59 +09:00
Junio C Hamano
e74a6e0cb9 Merge branch 'rs/config-unset-opthelp-fix'
The option help text given by "git config unset -h" described
the "--all" option to "replace", not "unset", multiple variables,
which has been corrected.

* rs/config-unset-opthelp-fix:
  config: fix short help of unset flags
2025-12-05 14:49:59 +09:00
Junio C Hamano
9d442ce2e2 Merge branch 'ps/object-source-management'
Code refactoring around object database sources.

* ps/object-source-management:
  odb: handle recreation of quarantine directories
  odb: handle changing a repository's commondir
  chdir-notify: add function to unregister listeners
  odb: handle initialization of sources in `odb_new()`
  http-push: stop setting up `the_repository` for each reference
  t/helper: stop setting up `the_repository` repeatedly
  builtin/index-pack: fix deferred fsck outside repos
  oidset: introduce `oidset_equal()`
  odb: move logic to disable ref updates into repo
  odb: refactor `odb_clear()` to `odb_free()`
  odb: adopt logic to close object databases
  setup: convert `set_git_dir()` to have file scope
  path: move `enter_repo()` into "setup.c"
2025-12-05 14:49:58 +09:00
Junio C Hamano
1b40ddc1a5 Merge branch 'cc/fast-import-strip-if-invalid'
"git fast-import" learns "--strip-if-invalid" option to drop
invalid cryptographic signature from objects.

* cc/fast-import-strip-if-invalid:
  fast-import: add 'strip-if-invalid' mode to --signed-commits=<mode>
  commit: refactor verify_commit_buffer()
  fast-import: refactor finalize_commit_buffer()
2025-12-05 14:49:58 +09:00
Junio C Hamano
85f99338e1 Merge branch 'js/ci-show-breakage-in-dockerized-jobs'
Dockerised jobs at the GitHub Actions CI have been taught to show
more details of failed tests.

* js/ci-show-breakage-in-dockerized-jobs:
  ci(dockerized): do show the result of failing tests again
2025-12-05 14:49:58 +09:00
Junio C Hamano
77f8d994a8 Merge branch 'kh/doc-committer-date-is-author-date'
The "--committer-date-is-author-date" option of "git am/rebase" is
a misguided one.  The documentation is updated to discourage its
use.

* kh/doc-committer-date-is-author-date:
  doc: warn against --committer-date-is-author-date
2025-12-05 14:49:57 +09:00
Junio C Hamano
0534b78576 Merge branch 'jc/optional-path'
"git config get --path" segfaulted on an ":(optional)path" that
does not exist, which has been corrected.

* jc/optional-path:
  config: really treat missing optional path as not configured
  config: really pretend missing :(optional) value is not there
  config: mark otherwise unused function as file-scope static
2025-12-05 14:49:56 +09:00
Junio C Hamano
5eadcbf815 Merge branch 'js/strip-scalar-too'
"make strip" has been taught to strip "scalar" as well as "git".

* js/strip-scalar-too:
  make strip: include `scalar`
2025-12-05 14:49:56 +09:00
Junio C Hamano
0c6707687f Merge branch 'en/xdiff-cleanup-2'
Code clean-up.

* en/xdiff-cleanup-2:
  xdiff: rename rindex -> reference_index
  xdiff: change rindex from long to size_t in xdfile_t
  xdiff: make xdfile_t.nreff a size_t instead of long
  xdiff: make xdfile_t.nrec a size_t instead of long
  xdiff: split xrecord_t.ha into line_hash and minimal_perfect_hash
  xdiff: use unambiguous types in xdl_hash_record()
  xdiff: use size_t for xrecord_t.size
  xdiff: make xrecord_t.ptr a uint8_t instead of char
  xdiff: use ptrdiff_t for dstart/dend
  doc: define unambiguous type mappings across C and Rust
2025-12-05 14:49:56 +09:00
Junio C Hamano
f0ef5b6d9b The fifth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-30 18:31:41 -08:00
Junio C Hamano
aea8cc3a10 Merge branch 'jk/asan-bonanza'
Various issues detected by Asan have been corrected.

* jk/asan-bonanza:
  t: enable ASan's strict_string_checks option
  fsck: avoid parse_timestamp() on buffer that isn't NUL-terminated
  fsck: remove redundant date timestamp check
  fsck: avoid strcspn() in fsck_ident()
  fsck: assert newline presence in fsck_ident()
  cache-tree: avoid strtol() on non-string buffer
  Makefile: turn on NO_MMAP when building with ASan
  pack-bitmap: handle name-hash lookups in incremental bitmaps
  compat/mmap: mark unused argument in git_munmap()
2025-11-30 18:31:41 -08:00
Junio C Hamano
6912d80f55 Merge branch 'je/doc-data-model'
Add a new manual that describes the data model.

* je/doc-data-model:
  doc: add an explanation of Git's data model
2025-11-30 18:31:40 -08:00
Junio C Hamano
3b212a83fe Merge branch 'jc/whitespace-incomplete-line'
Both "git apply" and "git diff" learn a new whitespace error class,
"incomplete-line".

* jc/whitespace-incomplete-line:
  attr: enable incomplete-line whitespace error for this project
  diff: highlight and error out on incomplete lines
  apply: check and fix incomplete lines
  whitespace: allocate a few more bits and define WS_INCOMPLETE_LINE
  apply: revamp the parsing of incomplete lines
  diff: update the way rewrite diff handles incomplete lines
  diff: call emit_callback ecbdata everywhere
  diff: refactor output of incomplete line
  diff: keep track of the type of the last line seen
  diff: correct suppress_blank_empty hack
  diff: emit_line_ws_markup() if/else style fix
  whitespace: correct bit assignment comments
2025-11-30 18:31:40 -08:00
Junio C Hamano
ffd9bb1bc7 Merge branch 'ja/doc-synopsis-style'
Doc mark-up updates.

* ja/doc-synopsis-style:
  doc: pull-fetch-param typofix
  doc: convert git push to synopsis style
  doc: convert git pull to synopsis style
  doc: convert git fetch to synopsis style
2025-11-30 18:31:39 -08:00
Junio C Hamano
0fec747d59 Merge branch 'lo/repo-info-all'
"git repo info" learned "--all" option.

* lo/repo-info-all:
  repo: add --all to git-repo-info
  repo: factor out field printing to dedicated function
2025-11-30 18:31:39 -08:00
Elijah Newren
136f86abc0 Documentation/git-replay.adoc: fix errors around revision range
There was significant confusion in the git-replay manual about what
constitutes a revision range.  As noted in f302c1e4aa (revisions(7):
clarify that most commands take a single revision range, 2021-05-18):

   Commands that are specifically designed to take two distinct ranges
   (e.g. "git range-diff R1 R2" to compare two ranges) do exist, but they
   are exceptions. Unless otherwise noted, all "git" commands that operate
   on a set of commits work on a single revision range.

`git replay` is not an exception, but a few places in the manual were
written as though it were.  These appear to have come in revisions to
the original series, between v3->v4 (see
https://lore.kernel.org/git/CAP8UFD3bpLrVW97DH7j=V9H2GsTSAkksC9L3QujQERFk_kLnZA@mail.gmail.com/
, "More than one <revision-range> can be passed") and between v6->v7
(https://lore.kernel.org/git/20231115143327.2441397-1-christian.couder@gmail.com/,
"Takes ranges of commits"), and I missed both of these revisions when
reviewing.  Fix them now.

There was also a reference to the "Commit Limiting options below", but
this page has no such section of options; strike the misleading
reference.

It is worth noting that we are documenting existing behavior, rather
than optimal behavior.  Junio has multiple times suggested introducing
alternative ways to walk revisions and use them in `git replay
--advance`, e.g. at
  * https://lore.kernel.org/git/xmqqy1mqo6kv.fsf@gitster.g/
  * https://lore.kernel.org/git/xmqq8rb3is8c.fsf@gitster.g/
  * https://lore.kernel.org/git/xmqqtsydj2zk.fsf@gitster.g/ (item (2))
If/when we introduce some new revision walking flag that implements one
of these alternate types of revision walks, we can update the --advance
option and this manual appropriately.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-28 23:06:46 -08:00
Yee Cheng Chin
c7e3b8085b xdiff: optimize patience diff's LCS search
The find_longest_common_sequence() function in patience diff is
inefficient as it calls binary_search() for every unique line it
encounters when deciding where to put it in the sequence. From
instrumentation (using xctrace) on popular repositories, binary_search()
takes up 50-60% of the run time within patience_diff() when performing a
diff.

To optimize this, add a boundary condition check before binary_search()
is called to see if the encountered unique line is located after the
entire currently tracked longest subsequence. If so, skip the
unnecessary binary search and simply append the entry to the end of
sequence. Given that most files compared in a diff are usually quite
similar to each other, this condition is very common, and should be hit
much more frequently than the binary search.

Below are some end-to-end performance results by timing `git log
--shortstat --oneline -500 --patience` on different repositories with
the old and new code. Generally speaking this seems to give at least
8-10% speed up. The "binary search hit %" column describes how often the
algorithm enters the binary search path instead of the new faster path.
Even in the WebKit case we can see that it's quite rare (1.46%).

| Repo     | Speed difference | binary search hit % |
|----------|------------------|---------------------|
| vim      | 1.27x            | 0.01%               |
| pytorch  | 1.16x            | 0.02%               |
| cpython  | 1.14x            | 0.06%               |
| ripgrep  | 1.14x            | 0.03%               |
| git      | 1.13x            | 0.12%               |
| vscode   | 1.09x            | 0.10%               |
| WebKit   | 1.08x            | 1.46%               |

The benchmarks were done using hyperfine, on an Apple M1 Max laptop,
with git compiled with `-O3 -flto`.

Signed-off-by: Yee Cheng Chin <ychin.git@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-27 19:11:41 -08:00
brian m. carlson
a92f243a94 t5564: fix test hang under zsh's sh mode
This test starts a SOCKS server in Perl in the background and then kills
it after the tests are done.  However, when using zsh (in sh mode) in
the tests, the start_socks function hangs until the background process
is killed.

Note that this does not reproduce in a simple shell script, so there is
likely some interaction between job handling, our heavy use of eval in
the test framework, and possibly other complexities of our test
framework.  What is clear, however, is that switching from a compound
statement to a subshell fixes the problem entirely and the test passes
with no problem, so do that.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-27 19:06:03 -08:00
brian m. carlson
bf25fca31c t0614: use numerical comparison with test_line_count
In this comparison, we want to know whether the number of lines is
greater than 1.  Our test_line_count function passes the first argument
as the comparison operator to test, so what we want is a numerical
comparison, not a string comparison.  While this does not produce a
functional problem now, it could very well if we expected two or more
items, in which case the value "10" would not match when it should.

Furthermore, the "<" and ">" comparisons are new in POSIX 1003.1-2024
and we don't want to require such a new version of POSIX since many
popular and supported operating systems were released before that
version of POSIX was released.

Finally, zsh's builtin test operator does not like the greater-than sign
in "test", since it is only supported in the double-bracket extension.
This has been reported and will be addressed in a future version, but
since our code is also technically incorrect, as well as not very
compatible, let's fix it by using a numeric comparison.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-27 19:06:01 -08:00
Junio C Hamano
b31ab939fe The fourth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-26 10:32:43 -08:00
Junio C Hamano
54af646904 Merge branch 'gf/win32-pthread-cond-wait-err'
Emulation code clean-up.

* gf/win32-pthread-cond-wait-err:
  win32: return error if SleepConditionVariableCS fails
2025-11-26 10:32:43 -08:00
Junio C Hamano
536d284f3b Merge branch 'jk/ci-windows-meson-test-fix'
"Windows+meson" job at the GitHub Actions CI was hard to debug, as
it did not show and save failed test artifacts, which has been
corrected.

* jk/ci-windows-meson-test-fix:
  ci(windows-meson-test): handle options and output like other test jobs
  unit-test: ignore --no-chain-lint
2025-11-26 10:32:43 -08:00
Junio C Hamano
d65eab5d30 Merge branch 'pw/worktree-list-display-width-fix'
"git worktree list" attempts to show paths to worktrees while
aligning them, but miscounted display columns for the paths when
non-ASCII characters were involved, which has been corrected.

* pw/worktree-list-display-width-fix:
  worktree list: quote paths
  worktree list: fix column spacing
2025-11-26 10:32:42 -08:00
Junio C Hamano
e539545396 Merge branch 'js/wincred-get-credential-alloc-fix'
Under-allocation fix.

* js/wincred-get-credential-alloc-fix:
  wincred: avoid memory corruption
2025-11-26 10:32:42 -08:00
Junio C Hamano
35eaf96add Merge branch 'js/cmake-libgit-fix'
Makefile based build have recently been updated to build a
libgit.a that also has reftable and xdiff objects; CMake based
build procedure has been updated to match.

* js/cmake-libgit-fix:
  cmake: stop trying to build the reftable and xdiff libraries
2025-11-26 10:32:42 -08:00
Junio C Hamano
eb474aa7e6 Merge branch 'js/mingw-assign-comma-fix'
The "return errno = EFOO, -1" construct, which is heavily used in
compat/mingw.c and triggers warnings under "-Wcomma", has been
rewritten to avoid the warnings.

* js/mingw-assign-comma-fix:
  mingw: avoid the comma operator
2025-11-26 10:32:41 -08:00
Junio C Hamano
fa40522717 Merge branch 'js/ci-github-setup-go-update'
Update a version of action used at the GitHub Actrions CI.

* js/ci-github-setup-go-update:
  ci: bump actions/setup-go from 5 to 6
2025-11-26 10:32:41 -08:00
Junio C Hamano
24ddb3f1fc Merge branch 'jk/test-mktemp-leakfix'
Test leakfix.

* jk/test-mktemp-leakfix:
  test-mktemp: plug memory and descriptor leaks
2025-11-26 10:32:41 -08:00
Junio C Hamano
370470e240 Merge branch 'rs/xmkstemp-simplify'
Code simplification.

* rs/xmkstemp-simplify:
  wrapper: simplify xmkstemp()
2025-11-26 10:32:40 -08:00
Junio C Hamano
1b93acd13a Merge branch 'ad/blame-diff-algorithm'
"git blame" learns "--diff-algorithm=<algo>" option.

* ad/blame-diff-algorithm:
  blame: make diff algorithm configurable
  xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK
2025-11-26 10:32:40 -08:00
Junio C Hamano
716e871d50 Merge branch 'en/ort-rename-another-fix'
Yet another corner case fix around renames in the "ort" merge
strategy.

* en/ort-rename-another-fix:
  merge-ort: fix failing merges in special corner case
  merge-ort: remove debugging crud
  t6429: update comment to mention correct tool
2025-11-26 10:32:40 -08:00
Johannes Schindelin
0458e8b854 ci(dockerized): do show the result of failing tests again
The quality of tests and test suites is most apparent not when
everything passes, but in how quickly bugs can be identified,
analyzed, and resolved after test failures occur.

As such, it is an unfortunate side effect of 2a21098b98 (github: adapt
containerized jobs to be rootless, 2025-01-10) that the output of failed
test cases, which was shown before that change directly in the build
logs, is now no longer shown at all.

The reason is a side effect of trying to run the build and the tests
with permissions other than the `root` user, but without providing the
prerequisite permissions to signal what tests failed and whose output
hence needs to be included in the logs.

The way this signaling works is for the workflow to write into
special-purpose files whose path is specific to the current workflow
step and which can be accessed via the `$GITHUB_ENV` environment
variable, which differs between workflow steps. It is this file that is
missing write permission for the `builder` user that was introduced in
above-mentioned commit.

The solution is simple: make the file world-writable.

Technically, this write permission should be removed after the step has
completed, if proper security practices were to be upheld, but since
nothing uses that file again, it does not matter, and the fix is more
succinct this way.

This commit is best viewed with `--color-words`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
[jc: squashed Elijah's rewrite of the first paragraph of the log message]
[jc: updated chmod to match "world-writable" in the log message]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-26 10:17:44 -08:00
Junio C Hamano
42bf8a534b Merge branch 'master' of https://github.com/j6t/gitk
* 'master' of https://github.com/j6t/gitk:
  gitk: add external diff file rename detection
  gitk: show unescaped file names on 'rename' and 'copy' lines
  gitk: fix a 'continue' statement outside a loop to 'return'
  gitk: persist position and size of the Tags and Heads window
  Revert "gitk: Only restore window size from ~/.gitk, not position"
2025-11-26 09:35:09 -08:00
Phillip Wood
9f3a115087 replay: do not copy "gpgsign-sha256" header
When "git replay" replays a commit it copies the extended headers
across from the original commit. However, if the original commit
was signed, we do not want to copy the header associated with the
signature is it wont be valid for the new commit. The code already
knows to avoid coping the "gpgsig" header but does not know to avoid
copying the "gpgsig-sha256" header.  Add that header to the list of
exclusions to match what "git commit --amend" does.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-26 09:33:52 -08:00
Christian Couder
c20f112e51 fast-import: add 'strip-if-invalid' mode to --signed-commits=<mode>
Tools like `git filter-repo`[1] use `git fast-export` and
`git fast-import` to rewrite repository history. When rewriting
history using one such tool though, commit signatures might become
invalid because the commits they sign changed due to the changes
in the repository history made by the tool between the fast-export
and the fast-import steps.

Note that as far as signature handling goes:

  * Since fast-export doesn't know what changes filter-repo may make
to the stream, it can't know whether the signatures will still be
valid.

  * Since filter-repo doesn't know what history canonicalizations
fast-export performed (and it performs a few), it can't know whether
the signatures will still be valid.

  * Therefore, fast-import is the only process in the pipeline that
can know whether a specified signature remains valid.

Having invalid signatures in a rewritten repository could be
confusing, so users rewritting history might prefer to simply
discard signatures that are invalid at the fast-import step.

For example a common use case is to rewrite only "recent" history.
While specifying commit ranges corresponding to "recent" commits
could work, users worry about getting it wrong and want to just
automatically rewrite everything, expecting older commit signatures
to be untouched.

To let them do that, let's add a new 'strip-if-invalid' mode to the
`--signed-commits=<mode>` option of `git fast-import`.

It would be interesting for the `--signed-tags=<mode>` option to
have this mode too, but we leave that for a future improvement.

It might also be possible for `git fast-export` to have such a mode
in its `--signed-commits=<mode>` and `--signed-tags=<mode>`
options, but the use cases for it are much less clear, so we also
leave that for possible future improvements.

For now let's just die() if 'strip-if-invalid' is passed to these
options where it hasn't been implemented yet.

[1]: https://github.com/newren/git-filter-repo

Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-26 08:43:44 -08:00
Johannes Sixt
776223c4d8 Merge branch 'tb/external-diff-renamed'
* tb/external-diff-renamed:
  gitk: add external diff file rename detection
2025-11-26 16:04:14 +01:00
Johannes Sixt
bd3fd7e77c Merge branch 'js/persist-ref-window-geometry'
* js/persist-ref-window-geometry:
  gitk: persist position and size of the Tags and Heads window
  Revert "gitk: Only restore window size from ~/.gitk, not position"
2025-11-26 16:02:23 +01:00
Patrick Steinhardt
ac65c70663 odb: handle recreation of quarantine directories
In the preceding commit we have moved the logic that reparents object
database sources on chdir(3p) from "setup.c" into "odb.c". Let's also do
the same for any temporary quarantine directories so that the complete
reparenting logic is self-contained in "odb.c".

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-25 12:16:00 -08:00
Patrick Steinhardt
2816b748e5 odb: handle changing a repository's commondir
The function `repo_set_gitdir()` is called in two situations:

  - To initialize the repository with its discovered location. As part
    of this we also set up the new object database.

  - To update the repository's discovered location in case the process
    changes its working directory so that we update relative paths. This
    means we also have to update any relative paths that are potentially
    used in the object database.

In the context of the object database we ideally wouldn't ever have to
worry about the second case: if all paths used by our object database
sources were absolute, then we wouldn't have to update them. But
unfortunately, the paths aren't only used to locate files owned by the
given source, but we also use them for reporting purposes. One such
example is `repo_get_object_directory()`, where we cannot just change
semantics to always return absolute paths, as that is likely to break
tooling out there.

One solution to this would be to have both a "display path" and an
"internal path". This would allow us to use internal paths for all
internal matters, but continue to use the potentially-relative display
paths so that we don't break compatibility. But converting the codebase
to honor this split is quite a messy endeavour, and it wouldn't even
help us with the goal to get rid of the need to update the display path
on chdir(3p).

Another solution would be to rework "setup.c" so that we never have to
update paths in the first place. In that case, we'd only initialize the
repository once we have figured out final locations for all directories.
This would be a significant simplification of that subsystem indeed, but
the current logic is so messy that it would take significant investments
to get there.

Meanwhile though, while object sources may still use relative paths, the
best thing we can do is to handle the reparenting of the object source
paths in the object database itself. This can be done by registering one
callback for each object database so that we get notified whenever the
current working directory changes, and we then perform the reparenting
ourselves.

Ideally, this wouldn't even happen on the object database level, but
instead handled by each object database source. But we don't yet have
proper pluggable object database sources, so this will need to be
handled at a later point in time.

The logic itself is rather simple:

  - We register the callback when creating the object database.

  - We unregister the callback when releasing it again.

  - We split up `set_git_dir_1()` so that it becomes possible to skip
    recreating the object database. This is required because the
    function is called both when the current working directory changes,
    but also when we set up the repository. Calling this function
    without skipping creation of the ODB will result in a bug in case
    it's already created.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-25 12:16:00 -08:00
Patrick Steinhardt
2574c61736 chdir-notify: add function to unregister listeners
While we (obviously) have a way to register new listeners that get
called whenever we chdir(3p), we don't have an equivalent that can be
used to unregister such a listener again.

Add one, as it will be required in a subsequent commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-25 12:16:00 -08:00
Patrick Steinhardt
35d9fc65ed odb: handle initialization of sources in odb_new()
The logic to set up a new object database is currently distributed
across two functions in "repository.c":

  - In `initialize_repository()` we initialize an empty object database.
    This object database is not fully initialized and doesn't have any
    sources attached to it.

  - The primary object database source is then created in
    `repo_set_gitdir()`.

Ideally though, the logic should be entirely self-contained so that we
can iterate more readily on how exactly the sources themselves get set
up.

Refactor `odb_new()` to handle both allocation and setup of the object
database. This ensures that the object database is always initialized
and ready for use, and it allows us to change how the sources get set up
eventually.

Note that `repo_set_gitdir()` still reaches into the sources when the
function gets called with an already-initialized object database. This
will be fixed in the next commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-25 12:16:00 -08:00
Patrick Steinhardt
c257bd5916 http-push: stop setting up the_repository for each reference
When pushing references via HTTP we call `repo_init_revisions()` in a
loop for each reference that we're about to push. As third argument we
pass the result of `setup_git_directory()`, which causes us to
reinitialize the repository every single time.

This is an obvious waste of compute, as the repository that we're
working in will never change across any of the initializations. The only
reason that we do this is to retrieve the directory of the repository.
Furthermore, this is about to create issues in a subsequent commit,
where reinitializing the repository will cause a `BUG()`.

Address this by storing the Git directory in a variable instead so that
we don't have to call the function repeatedly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-25 12:16:00 -08:00
Patrick Steinhardt
eea83c010c t/helper: stop setting up the_repository repeatedly
The "repository" test helper sets up `the_repository` twice. In fact
though, we don't even have to set it up even once: all we need is to set
up its hash algorithm, because we still depend on some subsystems that
aren't free of `the_repository`.

Refactor the code accordingly. This prepares for a subsequent change,
where setting up the repository repeatedly will lead to a `BUG()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-25 12:16:00 -08:00
Patrick Steinhardt
8dc22e87f0 builtin/index-pack: fix deferred fsck outside repos
When asked to perform object consistency checks via the `--fsck-objects`
flag we verify that each object part of the pack is valid. In general,
this check can even be performed outside of a Git repository: we don't
need an initialized object database as we simply read the object from
the packfile directly.

But there's one exception: a subset of the object checks may be deferred
to a later point in time. For now, this only concerns ".gitmodules" and
".gitattributes" files: whenever we see a tree referencing these files
we queue them for a deferred check. This is done because we need to do
some extra checks for those files to ensure that they are well-formed,
and these checks need to be done regardless of whether the corresponding
blobs are part of the packfile or not.

This works inside a repository, but unfortunately the logic leads to a
segfault when running outside of one. This is because we eventually call
`odb_read_object()`, which will crash because the object database has
not been initialized.

There's multiple options here:

  - We could in theory create a purely in-memory database with only a
    packfile store that contains the single packfile. We don't really
    have the infrastructure for this yet though, and it would end up
    being quite hacky.

  - We could refuse to perform consistency checks outside of a
    repository. But most of the checks work alright, so this would be a
    regression.

  - We can skip the finalizing consistency checks when running outside
    of a repository. This is not as invasive as skipping all checks,
    but it's not great to randomly skip a subset of tests, either.

None of these options really feel perfect. The first one would be the
obvious choice if easily possible.

There's another option though: instead of skipping the final object
checks, we can die if there are any queued object checks. With this
change we now die exactly if and only if we would have previously
segfaulted. Like this we ensure that objects that _may_ fail the
consistency checks won't be silently skipped, and at the same time we
give users a much better error message.

Refactor the code accordingly and add a test that would have triggered
the segfault. Note that we also move down the logic to add the packfile
to the store. There is no point doing this any earlier than right before
we execute `fsck_finish()`, and it ensures that the logic to set up and
perform the consistency check is self-contained.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-25 12:15:59 -08:00