17181 Commits

Author SHA1 Message Date
Junio C Hamano
d1755d5489 Merge branch 'ar/parallel-hooks' into seen
* ar/parallel-hooks:
  hook: allow runtime enabling extensions.hookStdoutToStderr
  hook: introduce extensions.hookStdoutToStderr
  hook: add per-event jobs config
  hook: add -j/--jobs option to git hook run
  hook: mark non-parallelizable hooks
  hook: allow parallel hook execution
  hook: parse the hook.jobs config
  hook: refactor hook_config_cache from strmap to named struct
  config: add a repo_config_get_uint() helper
  repository: fix repo_init() memleak due to missing _clear()
2026-02-27 15:14:42 -08:00
Junio C Hamano
8df10cd2d3 Merge branch 'js/neuter-sideband' into seen
Invalidate control characters in sideband messages, to avoid
terminal state getting messed up.

* js/neuter-sideband:
  sideband: delay sanitizing by default to Git v3.0
  sideband: offer to configure sanitizing on a per-URL basis
  sideband: add options to allow more control sequences to be passed through
  sideband: do allow ANSI color sequences by default
  sideband: introduce an "escape hatch" to allow control characters
  sideband: mask control characters
2026-02-27 15:14:41 -08:00
Junio C Hamano
52cae57153 Merge branch 'pt/fsmonitor-linux' into seen
The fsmonitor daemon has been implemented for Linux.

* pt/fsmonitor-linux:
  fsmonitor: close inherited file descriptors and detach in daemon
  run-command: add close_fd_above_stderr option
  fsmonitor: add tests for Linux
  fsmonitor: implement filesystem change listener for Linux
  fsmonitor: deduplicate settings logic for Unix platforms
  fsmonitor: deduplicate IPC path logic for Unix platforms
  fsmonitor: use pthread_cond_timedwait for cookie wait
  compat/win32: add pthread_cond_timedwait
  fsmonitor: fix hashmap memory leak in fsmonitor_run_daemon
  fsmonitor: fix khash memory leak in do_handle_client
2026-02-27 15:14:40 -08:00
Junio C Hamano
d7a75bf96d Merge branch 'vp/http-rate-limit-retries' into seen
The HTTP transport learned to react to "429 Too Many Requests".

* vp/http-rate-limit-retries:
  http: add support for HTTP 429 rate limit retries
  remote-curl: introduce show_http_message_fatal() helper
  strbuf_attach: fix call sites to pass correct alloc
  strbuf: pass correct alloc to strbuf_attach() in strbuf_reencode()
2026-02-27 15:14:40 -08:00
Junio C Hamano
863838a955 Merge branch 'mf/format-patch-cover-letter-format' into jch
"git format-patch --cover-letter" learns to use a simpler format
instead of the traditional shortlog format to list its commits with
a new --cover-letter-format option and format.commitListFormat
configuration variable.

* mf/format-patch-cover-letter-format:
  docs: add usage for the cover-letter fmt feature
  format-patch: add commitListFormat config
  format-patch: add ability to use alt cover format
  format-patch: move cover letter summary generation
  pretty.c: add %(count) and %(total) placeholders
2026-02-27 15:14:29 -08:00
Junio C Hamano
34e61c49e2 Merge branch 'tb/incremental-midx-part-3.2' into jch
Further work on incremental repacking using MIDX/bitmap

* tb/incremental-midx-part-3.2:
  midx: enable reachability bitmaps during MIDX compaction
  midx: implement MIDX compaction
  t/helper/test-read-midx.c: plug memory leak when selecting layer
  midx-write.c: factor fanout layering from `compute_sorted_entries()`
  midx-write.c: enumerate `pack_int_id` values directly
  midx-write.c: extract `fill_pack_from_midx()`
  midx-write.c: introduce `midx_pack_perm()` helper
  midx: do not require packs to be sorted in lexicographic order
  midx-write.c: introduce `struct write_midx_opts`
  midx-write.c: don't use `pack_perm` when assigning `bitmap_pos`
  t/t5319-multi-pack-index.sh: fix copy-and-paste error in t5319.39
  git-multi-pack-index(1): align SYNOPSIS with 'git multi-pack-index -h'
  git-multi-pack-index(1): remove non-existent incompatibility
  builtin/multi-pack-index.c: make '--progress' a common option
  midx: introduce `midx_get_checksum_hex()`
  midx: rename `get_midx_checksum()` to `midx_get_checksum_hash()`
  midx: mark `get_midx_checksum()` arguments as const
2026-02-27 15:14:25 -08:00
Junio C Hamano
8f23ae43fb Merge branch 'lc/rebase-trailer' into jch
"git rebase" learns "--trailer" command to drive the
interpret-trailers machinery.

* lc/rebase-trailer:
  rebase: support --trailer
  commit, tag: parse --trailer with OPT_STRVEC
  trailer: append trailers without fork/exec
  trailer: move process_trailers to trailer.h
  interpret-trailers: factor trailer rewriting
2026-02-27 15:14:24 -08:00
Junio C Hamano
a972b6daa8 Merge branch 'jt/repo-structure-extrema' into jch
"git repo structure" command learns to report maximum values on
various aspects of objects it inspects.

* jt/repo-structure-extrema:
  builtin/repo: find tree with most entries
  builtin/repo: find commit with most parents
  builtin/repo: add OID annotations to table output
  builtin/repo: collect largest inflated objects
  builtin/repo: update stats for each object
2026-02-27 15:14:20 -08:00
Junio C Hamano
f3d439aaa1 Merge branch 'sa/replay-revert' into jch
"git replay" (experimental) learns, in addition to "pick" and
"replay", a new operating mode "revert".

* sa/replay-revert:
  replay: add --revert mode to reverse commit changes
  sequencer: extract revert message formatting into shared function
2026-02-27 15:14:19 -08:00
Junio C Hamano
a4513813ed Merge branch 'dt/send-email-client-cert' into jch
"git send-email" learns to support use of client-side certificates.

* dt/send-email-client-cert:
  send-mail: add client certificate options
2026-02-27 15:14:19 -08:00
Junio C Hamano
3624bed86a Merge branch 'bc/sha1-256-interop-02' into jch
The code to maintain mapping between object names in multiple hash
functions is being added, written in Rust.

* bc/sha1-256-interop-02:
  object-file-convert: always make sure object ID algo is valid
  rust: add a small wrapper around the hashfile code
  rust: add a new binary object map format
  rust: add functionality to hash an object
  rust: add a build.rs script for tests
  rust: fix linking binaries with cargo
  hash: expose hash context functions to Rust
  write-or-die: add an fsync component for the object map
  csum-file: define hashwrite's count as a uint32_t
  rust: add additional helpers for ObjectID
  hash: add a function to look up hash algo structs
  rust: add a hash algorithm abstraction
  rust: add a ObjectID struct
  hash: use uint32_t for object_id algorithm
  conversion: don't crash when no destination algo
  repository: require Rust support for interoperability
2026-02-27 15:14:18 -08:00
Junio C Hamano
882a88c436 Merge branch 'jh/alias-i18n-fixes' into jch
Further update to the i18n alias support to avoid regressions.

* jh/alias-i18n-fixes:
  git, help: fix memory leaks in alias listing
  alias: treat empty subsection [alias ""] as plain [alias]
  doc: fix list continuation in alias subsection example
2026-02-27 15:14:16 -08:00
Junio C Hamano
c1d17b07b8 Merge branch 'cs/add-skip-submodule-ignore-all' into jch
"git add <submodule>" has been taught to honor
submodule.<name>.ignore that is set to "all" (and requires "git add
-f" to override it).

* cs/add-skip-submodule-ignore-all:
  Documentation: update add --force option + ignore=all config
  tests: fix existing tests when add an ignore=all submodule
  tests: t2206-add-submodule-ignored: ignore=all and add --force tests
  read-cache: submodule add need --force given ignore=all configuration
  read-cache: update add_files_to_cache take param ignored_too
2026-02-27 15:14:16 -08:00
Junio C Hamano
3cd3f31f03 Merge branch 'ar/config-hooks' into jch
Allow hook commands to be defined (possibly centrally) in the
configuration files, and run multiple of them for the same hook
event.

* ar/config-hooks:
  hook: add -z option to "git hook list"
  hook: allow out-of-repo 'git hook' invocations
  hook: allow event = "" to overwrite previous values
  hook: allow disabling config hooks
  hook: include hooks from the config
  hook: add "git hook list" command
  hook: run a list of hooks to prepare for multihook support
  hook: add internal state alloc/free callbacks
2026-02-27 15:14:13 -08:00
Junio C Hamano
eab5960d1e Merge branch 'ds/config-list-with-type' into jch
"git config list" is taught to show the values interpreted for
specific type with "--type=<X>" option.

* ds/config-list-with-type:
  config: use an enum for type
  config: restructure format_config()
  config: format colors quietly
  color: add color_parse_quietly()
  config: format expiry dates quietly
  config: format paths gently
  config: format bools or strings in helper
  config: format bools or ints gently
  config: format bools gently
  config: format int64s gently
  config: make 'git config list --type=<X>' work
  config: add 'gently' parameter to format_config()
  config: move show_all_config()
2026-02-27 15:14:12 -08:00
Junio C Hamano
87975e28d9 Merge branch 'lo/repo-leftover-bits' into jch
Clean-up the code around "git repo info" command.

* lo/repo-leftover-bits:
  Documentation/git-repo: capitalize format descriptions
  Documentation/git-repo: replace 'NUL' with '_NUL_'
  t1901: adjust nul format output instead of expected value
  t1900: rename t1900-repo to t1900-repo-info
  repo: rename struct field to repo_info_field
  repo: replace get_value_fn_for_key by get_repo_info_field
  repo: rename repo_info_fields to repo_info_field
  CodingGuidelines: instruct to name arrays in singular
2026-02-27 15:14:11 -08:00
Junio C Hamano
e521e926b3 Merge branch 'ps/maintenance-geometric-default' into jch
"git maintenance" starts using the "geometric" strategy by default.

* ps/maintenance-geometric-default:
  builtin/maintenance: use "geometric" strategy by default
  t7900: prepare for switch of the default strategy
  t6500: explicitly use "gc" strategy
  t5510: explicitly use "gc" strategy
  t5400: explicitly use "gc" strategy
  t34xx: don't expire reflogs where it matters
  t: disable maintenance where we verify object database structure
  t: fix races caused by background maintenance
2026-02-27 15:14:10 -08:00
Junio C Hamano
3431f6f0fd Merge branch 'kh/format-patch-noprefix-is-boolean' into jch
The configuration variable format.noprefix did not behave as a
proper boolean variable, which has now been fixed and documented.

* kh/format-patch-noprefix-is-boolean:
  doc: diff-options.adoc: show format.noprefix for format-patch
  format-patch: make format.noprefix a boolean
2026-02-27 15:14:08 -08:00
Junio C Hamano
2a84c3141d Merge branch 'hn/status-compare-with-push' into jch
"git status" learned to show comparison between the current branch
and various other branches listed on status.compareBranches
configuration.

* hn/status-compare-with-push:
  status: add status.compareBranches config for multiple branch comparisons
  refactor format_branch_comparison in preparation
2026-02-27 15:14:05 -08:00
Junio C Hamano
8a0da64a51 Merge branch 'kn/ref-location' into jch
Allow the directory in which reference backends store their data to
be specified.

* kn/ref-location:
  refs: add GIT_REFERENCE_BACKEND to specify reference backend
  refs: allow reference location in refstorage config
  refs: receive and use the reference storage payload
  refs: move out stub modification to generic layer
  refs: extract out `refs_create_refdir_stubs()`
  setup: don't modify repo in `create_reference_database()`
2026-02-27 15:14:05 -08:00
Junio C Hamano
9111f466ce Merge branch 'kh/doc-patch-id-4' into jch
Doc update.

* kh/doc-patch-id-4:
  doc: patch-id: see also git-cherry(1)
  doc: patch-id: add script example
  doc: patch-id: emphasize multi-patch processing
2026-02-27 15:14:01 -08:00
Junio C Hamano
6a1851b551 Merge branch 'pw/meson-doc-mergetool' into jch
Update build precedure for mergetool documentation in meson-based builds.

* pw/meson-doc-mergetool:
  meson: fix building mergetool docs
2026-02-27 15:14:00 -08:00
Junio C Hamano
bd4979d617 Merge branch 'kh/doc-am-xref' into jch
Doc update.

* kh/doc-am-xref:
  doc: am: fill out hook discussion
  doc: am: add missing config am.messageId
  doc: am: say that --message-id adds a trailer
  doc: am: normalize git(1) command links
2026-02-27 15:14:00 -08:00
Mirko Faina
40d1aa8785 docs: add usage for the cover-letter fmt feature
Document the new "--cover-letter-format" feature in format-patch and its
related config variable "format.commitListFormat".

Signed-off-by: Mirko Faina <mroik@delayed.space>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-27 15:12:21 -08:00
Junio C Hamano
2cc7191751 The 8th batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-27 15:11:55 -08:00
Junio C Hamano
4416ec1ae3 Merge branch 'db/doc-fetch-jobs-auto'
Doc update.

* db/doc-fetch-jobs-auto:
  doc: fetch: document `--jobs=0` behavior
2026-02-27 15:11:54 -08:00
Junio C Hamano
d64a20a1b1 Merge branch 'mf/format-patch-honor-from-for-cover-letter'
"git format-patch --from=<me>" did not honor the command line
option when writing out the cover letter, which has been corrected.

* mf/format-patch-honor-from-for-cover-letter:
  format-patch: fix From header in cover letter
2026-02-27 15:11:54 -08:00
Junio C Hamano
c0d0b8daed Merge branch 'jh/alias-i18n'
Extend the alias configuration syntax to allow aliases using
characters outside ASCII alphanumeric (plus '-').

* jh/alias-i18n:
  completion: fix zsh alias listing for subsection aliases
  alias: support non-alphanumeric names via subsection syntax
  alias: prepare for subsection aliases
  help: use list_aliases() for alias listing
2026-02-27 15:11:53 -08:00
Junio C Hamano
bb9c781f4f Merge branch 'ps/history-ergonomics-updates'
UI improvements for "git history reword".

* ps/history-ergonomics-updates:
  Documentation/git-history: document default for "--update-refs="
  builtin/history: rename "--ref-action=" to "--update-refs="
  builtin/history: replace "--ref-action=print" with "--dry-run"
  builtin/history: check for merges before asking for user input
  builtin/history: perform revwalk checks before asking for user input
2026-02-27 15:11:50 -08:00
Junio C Hamano
aa95f87c74 Merge branch 'ps/for-each-ref-in-fixes'
A handful of places used refs_for_each_ref_in() API incorrectly,
which has been corrected.

* ps/for-each-ref-in-fixes:
  bisect: simplify string_list memory handling
  bisect: fix misuse of `refs_for_each_ref_in()`
  pack-bitmap: fix bug with exact ref match in "pack.preferBitmapTips"
  pack-bitmap: deduplicate logic to iterate over preferred bitmap tips
2026-02-27 15:11:50 -08:00
Junio C Hamano
341be27dfe Merge branch 'lo/repo-info-keys'
"git repo info" learns "--keys" action to list known keys.

* lo/repo-info-keys:
  repo: add new flag --keys to git-repo-info
  repo: rename the output format "keyvalue" to "lines"
2026-02-27 15:11:49 -08:00
Jonatan Holmgren
2e3a987f3b doc: fix list continuation in alias subsection example
The example showing the equivalence between alias.last and
alias.last.command was missing the list continuation marks (+
between the shell session block and the following prose, leaving
the paragraph detached from the list item in the rendered output.

Signed-off-by: Jonatan Holmgren <jonatan@jontes.page>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-26 13:06:48 -08:00
Paul Tarjan
149c19ba1b fsmonitor: implement filesystem change listener for Linux
Implement the built-in fsmonitor daemon for Linux using the inotify
API, bringing it to feature parity with the existing Windows and macOS
implementations.

The implementation uses inotify rather than fanotify because fanotify
requires either CAP_SYS_ADMIN or CAP_PERFMON capabilities, making it
unsuitable for an unprivileged user-space daemon.  While inotify has
the limitation of requiring a separate watch on every directory (unlike
macOS's FSEvents, which can monitor an entire directory tree with a
single watch), it operates without elevated privileges and provides
the per-file event granularity needed for fsmonitor.

The listener uses inotify_init1(O_NONBLOCK) with a poll loop that
checks for events with a 50-millisecond timeout, keeping the inotify
queue well-drained to minimize the risk of overflows.  Bidirectional
hashmaps map between watch descriptors and directory paths for efficient
event resolution.  Directory renames are tracked using inotify's cookie
mechanism to correlate IN_MOVED_FROM and IN_MOVED_TO event pairs; a
periodic check detects stale renames where the matching IN_MOVED_TO
never arrived, forcing a resync.

New directory creation triggers recursive watch registration to ensure
all subdirectories are monitored.  The IN_MASK_CREATE flag is used
where available to prevent modifying existing watches, with a fallback
for older kernels.  When IN_MASK_CREATE is available and
inotify_add_watch returns EEXIST, it means another thread or recursive
scan has already registered the watch, so it is safe to ignore.

Remote filesystem detection uses statfs() to identify network-mounted
filesystems (NFS, CIFS, SMB, FUSE, etc.) via their magic numbers.
Mount point information is read from /proc/mounts and matched against
the statfs f_fsid to get accurate, human-readable filesystem type names
for logging.  When the .git directory is on a remote filesystem, the
IPC socket falls back to $HOME or a user-configured directory via the
fsmonitor.socketDir setting.

Based-on-patch-by: Eric DeCosta <edecosta@mathworks.com>
Based-on-patch-by: Marziyeh Esipreh <marziyeh.esipreh@gmail.com>
Signed-off-by: Paul Tarjan <github@paulisageek.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-26 07:29:46 -08:00
Harald Nordgren
3ea95ac9c5 status: add status.compareBranches config for multiple branch comparisons
Add a new configuration variable status.compareBranches that allows
users to specify a space-separated list of branch comparisons in
git status output.

Supported values:
- @{upstream} for the current branch's upstream tracking branch
- @{push} for the current branch's push destination

Any other value is ignored and a warning is shown.

When not configured, the default behavior is equivalent to setting
`status.compareBranches = @{upstream}`, preserving backward
compatibility.

The advice messages shown are context-aware:
- "git pull" advice is shown only when comparing against @{upstream}
- "git push" advice is shown only when comparing against @{push}
- Divergence advice is shown for upstream branch comparisons

This is useful for triangular workflows where the upstream tracking
branch differs from the push destination, allowing users to see their
status relative to both branches at once.

Example configuration:
    [status]
        compareBranches = @{upstream} @{push}

Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-26 07:25:48 -08:00
Junio C Hamano
7b2bccb0d5 The 7th batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25 11:54:18 -08:00
Junio C Hamano
1a46f31b3e Merge branch 'jc/doc-cg-needswork'
A CodingGuidelines update.

* jc/doc-cg-needswork:
  CodingGuidelines: document NEEDSWORK comments
2026-02-25 11:54:17 -08:00
Junio C Hamano
8d15dd1ce1 Merge branch 'ds/revision-maximal-only'
"git rev-list" and friends learn "--maximal-only" to show only the
commits that are not reachable by other commits.

* ds/revision-maximal-only:
  revision: add --maximal-only option
2026-02-25 11:54:17 -08:00
Junio C Hamano
6b5ad01886 Merge branch 'cc/lop-filter-auto'
"auto filter" logic for large-object promisor remote.

* cc/lop-filter-auto:
  fetch-pack: wire up and enable auto filter logic
  promisor-remote: change promisor_remote_reply()'s signature
  promisor-remote: keep advertised filters in memory
  list-objects-filter-options: support 'auto' mode for --filter
  doc: fetch: document `--filter=<filter-spec>` option
  fetch: make filter_options local to cmd_fetch()
  clone: make filter_options local to cmd_clone()
  promisor-remote: allow a client to store fields
  promisor-remote: refactor initialising field lists
2026-02-25 11:54:17 -08:00
Junio C Hamano
bf3c3603fd Merge branch 'kh/doc-am-format-sendmail'
Doc update.

* kh/doc-am-format-sendmail:
  doc: add caveat about round-tripping format-patch
2026-02-25 11:54:16 -08:00
Lucas Seiki Oshiro
8b97dc367a Documentation/git-repo: capitalize format descriptions
The descriptions for the git-repo output formats are in lowercase.
Capitalize these descriptions, making them consistent with the rest of
the documentation.

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25 11:47:43 -08:00
Lucas Seiki Oshiro
906b632c4f Documentation/git-repo: replace 'NUL' with '_NUL_'
Replace all occurrences of "NUL" by "_NUL_" in git-repo.adoc, following the
convention used by other documentation files.

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25 11:47:43 -08:00
Lucas Seiki Oshiro
c63e64e04d CodingGuidelines: instruct to name arrays in singular
Arrays should be named in the singular form, ensuring that when
accessing an element within an array (e.g. dog[0]) it's clear that
we're referring to an element instead of a collection.

Add a new rule to CodingGuidelines asking for arrays to be named in
singular instead of plural.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25 11:47:41 -08:00
Karthik Nayak
53592d68e8 refs: add GIT_REFERENCE_BACKEND to specify reference backend
Git allows setting a different object directory via
'GIT_OBJECT_DIRECTORY', but provides no equivalent for references. In
the previous commit we extended the 'extensions.refStorage' config to
also support an URI input for reference backend with location.

Let's also add a new environment variable 'GIT_REFERENCE_BACKEND' that
takes in the same input as the config variable. Having an environment
variable allows us to modify the reference backend and location on the
fly for individual Git commands.

The environment variable also allows usage of alternate reference
directories during 'git-clone(1)' and 'git-init(1)'. Add the config to
the repository when created with the environment variable set.

When initializing the repository with an alternate reference folder,
create the required stubs in the repositories $GIT_DIR. The inverse,
i.e. removal of the ref store doesn't clean up the stubs in the $GIT_DIR
since that would render it unusable. Removal of ref store is only used
when migrating between ref formats and cleanup of the $GIT_DIR doesn't
make sense in such a situation.

Helped-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25 09:40:00 -08:00
Karthik Nayak
01dc84594e refs: allow reference location in refstorage config
The 'extensions.refStorage' config is used to specify the reference
backend for a given repository. Both the 'files' and 'reftable' backends
utilize the $GIT_DIR as the reference folder by default in
`get_main_ref_store()`.

Since the reference backends are pluggable, this means that they could
work with out-of-tree reference directories too. Extend the 'refStorage'
config to also support taking an URI input, where users can specify the
reference backend and the location.

Add the required changes to obtain and propagate this value to the
individual backends. Add the necessary documentation and tests.

Traditionally, for linked worktrees, references were stored in the
'$GIT_DIR/worktrees/<wt_id>' path. But when using an alternate reference
storage path, it doesn't make sense to store the main worktree
references in the new path, and the linked worktree references in the
$GIT_DIR. So, let's store linked worktree references in
'$ALTERNATE_REFERENCE_DIR/worktrees/<wt_id>'. To do this, create the
necessary files and folders while also adding stubs in the $GIT_DIR path
to ensure that it is still considered a Git directory.

Ideally, we would want to pass in a `struct worktree *` to individual
backends, instead of passing the `gitdir`. This allows them to handle
worktree specific logic. Currently, that is not possible since the
worktree code is:

  - Tied to using the global `the_repository` variable.

  - Is not setup before the reference database during initialization of
    the repository.

Add a TODO in 'refs.c' to ensure we can eventually make that change.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25 09:38:41 -08:00
Taylor Blau
d54da84bd9 midx: enable reachability bitmaps during MIDX compaction
Enable callers to generate reachability bitmaps when performing MIDX
layer compaction by combining all existing bitmaps from the compacted
layers.

Note that because of the object/pack ordering described by the previous
commit, the pseudo-pack order for the compacted MIDX is the same as
concatenating the individual pseudo-pack orderings for each layer in the
compaction range.

As a result, the only non-test or documentation change necessary is to
treat all objects as non-preferred during compaction so as not to
disturb the object ordering.

In the future, we may want to adjust which commit(s) receive
reachability bitmaps when compacting multiple .bitmap files into one, or
even generate new bitmaps (e.g., if the references have moved
significantly since the .bitmap was generated). This commit only
implements combining all existing bitmaps in range together in order to
demonstrate and lay the groundwork for more exotic strategies.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-24 11:16:35 -08:00
Taylor Blau
9df44a97f1 midx: implement MIDX compaction
When managing a MIDX chain with many layers, it is convenient to combine
a sequence of adjacent layers into a single layer to prevent the chain
from growing too long.

While it is conceptually possible to "compact" a sequence of MIDX layers
together by running "git multi-pack-index write --stdin-packs", there
are a few drawbacks that make this less than desirable:

 - Preserving the MIDX chain is impossible, since there is no way to
   write a MIDX layer that contains objects or packs found in an earlier
   MIDX layer already part of the chain. So callers would have to write
   an entirely new (non-incremental) MIDX containing only the compacted
   layers, discarding all other objects/packs from the MIDX.

 - There is (currently) no way to write a MIDX layer outside of the MIDX
   chain to work around the above, such that the MIDX chain could be
   reassembled substituting the compacted layers with the MIDX that was
   written.

 - The `--stdin-packs` command-line option does not allow us to specify
   the order of packs as they appear in the MIDX. Therefore, even if
   there were workarounds for the previous two challenges, any bitmaps
   belonging to layers which come after the compacted layer(s) would no
   longer be valid.

This commit introduces a way to compact a sequence of adjacent MIDX
layers into a single layer while preserving the MIDX chain, as well as
any bitmap(s) in layers which are newer than the compacted ones.

Implementing MIDX compaction does not require a significant number of
changes to how MIDX layers are written. The main changes are as follows:

 - Instead of calling `fill_packs_from_midx()`, we call a new function
   `fill_packs_from_midx_range()`, which walks backwards along the
   portion of the MIDX chain which we are compacting, and adds packs one
   layer a time.

   In order to preserve the pseudo-pack order, the concatenated pack
   order is preserved, with the exception of preferred packs which are
   always added first.

 - After adding entries from the set of packs in the compaction range,
   `compute_sorted_entries()` must adjust the `pack_int_id`'s for all
   objects added in each fanout layer to match their original
   `pack_int_id`'s (as opposed to the index at which each pack appears
   in `ctx.info`).

   Note that we cannot reuse `midx_fanout_add_midx_fanout()` directly
   here, as it unconditionally recurs through the `->base_midx`. Factor
   out a `_1()` variant that operates on a single layer, reimplement
   the existing function in terms of it, and use the new variant from
   `midx_fanout_add_compact()`.

   Since we are sorting the list of objects ourselves, the order we add
   them in does not matter.

 - When writing out the new 'multi-pack-index-chain' file, discard any
   layers in the compaction range, replacing them with the newly written
   layer, instead of keeping them and placing the new layer at the end
   of the chain.

This ends up being sufficient to implement MIDX compaction in such a way
that preserves bitmaps corresponding to more recent layers in the MIDX
chain.

The tests for MIDX compaction are so far fairly spartan, since the main
interesting behavior here is ensuring that the right packs/objects are
selected from each layer, and that the pack order is preserved despite
whether or not they are sorted in lexicographic order in the original
MIDX chain.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-24 11:16:34 -08:00
Taylor Blau
b2ec8e90c2 midx: do not require packs to be sorted in lexicographic order
The MIDX file format currently requires that pack files be identified by
the lexicographic ordering of their names (that is, a pack having a
checksum beginning with "abc" would have a numeric pack_int_id which is
smaller than the same value for a pack beginning with "bcd").

As a result, it is impossible to combine adjacent MIDX layers together
without permuting bits from bitmaps that are in more recent layer(s).

To see why, consider the following example:

          | packs       | preferred pack
  --------+-------------+---------------
  MIDX #0 | { X, Y, Z } | Y
  MIDX #1 | { A, B, C } | B
  MIDX #2 | { D, E, F } | D

, where MIDX #2's base MIDX is MIDX #1, and so on. Suppose that we want
to combine MIDX layers #0 and #1, to create a new layer #0' containing
the packs from both layers. With the original three MIDX layers, objects
are laid out in the bitmap in the order they appear in their source
pack, and the packs themselves are arranged according to the pseudo-pack
order. In this case, that ordering is Y, X, Z, B, A, C.

But recall that the pseudo-pack ordering is defined by the order that
packs appear in the MIDX, with the exception of the preferred pack,
which sorts ahead of all other packs regardless of its position within
the MIDX. In the above example, that means that pack 'Y' could be placed
anywhere (so long as it is designated as preferred), however, all other
packs must be placed in the location listed above.

Because that ordering isn't sorted lexicographically, it is impossible
to compact MIDX layers in the above configuration without permuting the
object-to-bit-position mapping. Changing this mapping would affect all
bitmaps belonging to newer layers, rendering the bitmaps associated with
MIDX #2 unreadable.

One of the goals of MIDX compaction is that we are able to shrink the
length of the MIDX chain *without* invalidating bitmaps that belong to
newer layers, and the lexicographic ordering constraint is at odds with
this goal.

However, packs do not *need* to be lexicographically ordered within the
MIDX. As far as I can gather, the only reason they are sorted lexically
is to make it possible to perform a binary search over the pack names in
a MIDX, necessary to make `midx_contains_pack()`'s performance
logarithmic in the number of packs rather than linear.

Relax this constraint by allowing MIDX writes to proceed with packs that
are not arranged in lexicographic order. `midx_contains_pack()` will
lazily instantiate a `pack_names_sorted` array on the MIDX, which will
be used to implement the binary search over pack names.

This change produces MIDXs which may not be correctly read with external
tools or older versions of Git. Though older versions of Git know how to
gracefully degrade and ignore any MIDX(s) they consider corrupt,
external tools may not be as robust. To avoid unintentionally breaking
any such tools, guard this change behind a version bump in the MIDX's
on-disk format.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-24 11:16:33 -08:00
Taylor Blau
d0e91c128b git-multi-pack-index(1): align SYNOPSIS with 'git multi-pack-index -h'
Since c39fffc1c9 (tests: start asserting that *.txt SYNOPSIS matches -h
output, 2022-10-13), the manual page for 'git multi-pack-index' has a
SYNOPSIS section which differs from 'git multi-pack-index -h'.

Correct this while also documenting additional options accepted by the
'write' sub-command.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-24 11:16:32 -08:00
Taylor Blau
f775d5b1cf git-multi-pack-index(1): remove non-existent incompatibility
Since fcb2205b77 (midx: implement support for writing incremental MIDX
chains, 2024-08-06), the command-line options '--incremental' and
'--bitmap' were declared to be incompatible with one another when
running 'git multi-pack-index write'.

However, since 27afc272c4 (midx: implement writing incremental MIDX
bitmaps, 2025-03-20), that incompatibility no longer exists, despite the
documentation saying so. Correct this by removing the stale reference to
their incompatibility.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-24 11:16:32 -08:00
Taylor Blau
6b8fb17490 builtin/multi-pack-index.c: make '--progress' a common option
All multi-pack-index sub-commands (write, verify, repack, and expire)
support a '--progress' command-line option, despite not listing it as
one of the common options in `common_opts`.

As a result each sub-command declares its own `OPT_BIT()` for a
"--progress" command-line option. Centralize this within the
`common_opts` to avoid re-declaring it in each sub-command.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-24 11:16:32 -08:00