Code refactoring around alternate object store.
* ps/odb-alternates-object-sources:
odb: write alternates via sources
odb: read alternates via sources
odb: drop forward declaration of `read_info_alternates()`
odb: remove mutual recursion when parsing alternates
odb: stop splitting alternate in `odb_add_to_alternates_file()`
odb: move computation of normalized objdir into `alt_odb_usable()`
odb: resolve relative alternative paths when parsing
odb: refactor parsing of alternates to be self-contained
Refactor writing of alternates so that the actual business logic is
structured around the object database source we want to write the
alternate to. Same as with the preceding commit, this will eventually
allow us to have different logic for writing alternates depending on the
backend used.
Note that after the refactoring we start to call
`odb_add_alternate_recursively()` unconditionally. This is fine though
as we know to skip adding sources that are tracked already.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adapt how we read alternates so that the interface is structured around
the object database source we're reading from. This will eventually
allow us to abstract away this behaviour with pluggable object databases
so that every format can have its own mechanism for listing alternates.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we have removed the mutual recursion in the preceding commit
it is not necessary anymore to have a forward declaration of the
`read_info_alternates()` function. Move the function and its
dependencies further up so that we can remove it.
Note that this commit also removes the function documentation of
`read_info_alternates()`. It's unclear what it's documenting, but it for
sure isn't documenting the modern behaviour of the function anymore.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When adding an alternative object database source we not only have to
consider the added source itself, but we also have to add _its_ sources
to our database. We implement this via mutual recursion:
1. We first call `link_alt_odb_entries()`.
2. `link_alt_odb_entries()` calls `parse_alternates()`.
3. We then add each alternate via `odb_add_alternate_recursively()`.
4. `odb_add_alternate_recursively()` calls `link_alt_odb_entries()`
again.
This flow is somewhat hard to follow, but more importantly it means that
parsing of alternates is somewhat tied to the recursive behaviour.
Refactor the function to remove the mutual recursion between adding
sources and parsing alternates. The parsing step thus becomes completely
oblivious to the fact that there is recursive behaviour going on at all.
The recursion is handled by `odb_add_alternate_recursively()` instead,
which now recurses with itself.
This refactoring allows us to move parsing of alternates into object
database sources in a subsequent step.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When calling `odb_add_to_alternates_file()` we know to add the newly
added source to the object database in case we have already loaded
alternates. This is done so that we can make its objects accessible
immediately without having to fully reload all alternates.
The way we do this though is to call `link_alt_odb_entries()`, which
adds _multiple_ sources to the object database source in case we have
newline-separated entries. This behaviour is not documented in the
function documentation of `odb_add_to_alternates_file()`, and all
callers only ever pass a single directory to it. It's thus entirely
surprising and a conceptual mismatch.
Fix this issue by directly calling `odb_add_alternate_recursively()`
instead.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function `alt_odb_usable()` receives as input the object database,
the path it's supposed to determine usability for as well as the
normalized path of the main object directory of the repository. The last
part is derived by the function's caller from the object database. As we
already pass the object database to `alt_odb_usable()` it is redundant
information.
Drop the extra parameter and compute the normalized object directory in
the function itself.
While at it, rename the function to `odb_is_source_usable()` to align it
with modern terminology.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Parsing alternates and resolving potential relative paths is currently
handled in two separate steps. This has the effect that the logic to
retrieve alternates is not entirely self-contained. We want it to be
just that though so that we can eventually move the logic to list
alternates into the `struct odb_source`.
Move the logic to resolve relative alternative paths into
`parse_alternates()`. Besides bringing us a step closer towards the
above goal, it also neatly separates concerns of generating the list of
alternatives and linking them into the object database.
Note that we ignore any errors when the relative path cannot be
resolved. This isn't really a change in behaviour though: if the path
cannot be resolved to a directory then `alt_odb_usable()` still knows to
bail out.
While at it, rename the function to `odb_add_alternate_recursively()` to
more clearly indicate what its intent is and to align it with modern
terminology.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Parsing of the alternates file and environment variable is currently
split up across multiple different functions and is entangled with
`link_alt_odb_entries()`, which is responsible for linking the parsed
object database sources. This results in two downsides:
- We have mutual recursion between parsing alternates and linking them
into the object database. This is because we also parse alternates
that the newly added sources may have.
- We mix up the actual logic to parse the data and to link them into
place.
Refactor the logic so that parsing of the alternates file is entirely
self-contained. Note that this doesn't yet fix the above two issues, but
it is a necessary step to get there.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rewrite the only use of "mktemp()" that is subject to TOCTOU race
and Stop using the insecure "mktemp()" function.
* rs/ban-mktemp:
compat: remove gitmkdtemp()
banned.h: ban mktemp(3)
compat: remove mingw_mktemp()
compat: use git_mkdtemp()
wrapper: add git_mkdtemp()
The "git_istream" abstraction has been revamped to make it easier
to interface with pluggable object database design.
* ps/object-read-stream:
streaming: drop redundant type and size pointers
streaming: move into object database subsystem
streaming: refactor interface to be object-database-centric
streaming: move logic to read packed objects streams into backend
streaming: move logic to read loose objects streams into backend
streaming: make the `odb_read_stream` definition public
streaming: get rid of `the_repository`
streaming: rely on object sources to create object stream
packfile: introduce function to read object info from a store
streaming: move zlib stream into backends
streaming: create structure for filtered object streams
streaming: create structure for packed object streams
streaming: create structure for loose object streams
streaming: create structure for in-core object streams
streaming: allocate stream inside the backend-specific logic
streaming: explicitly pass packfile info when streaming a packed object
streaming: propagate final object type via the stream
streaming: drop the `open()` callback function
streaming: rename `git_istream` into `odb_read_stream`
The use of "revision" (a connected set of commits) has been
clarified in the "git replay" documentation.
* en/replay-doc-revision-range:
Documentation/git-replay.adoc: fix errors around revision range
A few tests have been updated to work under the shell compatible
mode of zsh.
* bc/zsh-testsuite:
t5564: fix test hang under zsh's sh mode
t0614: use numerical comparison with test_line_count
"git replay" forgot to omit the "gpgsig-sha256" extended header
from the resulting commit the same way it omits "gpgsig", which has
been corrected.
* pw/replay-exclude-gpgsig-fix:
replay: do not copy "gpgsign-sha256" header
While investigating the config options set by 'scalar' I noticed this
one wasn't documented.
Signed-off-by: Matthew Hughes <matthewhughes934@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When pushing to a set of remotes using a nickname for the group, the
client initializes the connection to each remote, talks to the
remote and reads and parses capabilities line, and holds the
capabilities in a file-scope static variable server_capabilities_v1.
There are a few other such file-scope static variables, and these
connections cannot be parallelized until they are refactored to a
structure that keeps track of active connections.
Which is *not* the theme of this patch ;-)
For a single connection, the server_capabilities_v1 variable is
initialized to NULL (at the program initialization), populated when
we talk to the other side, used to look up capabilities of the other
side possibly multiple times, and the memory is held by the variable
until program exit, without leaking. When talking to multiple remotes,
however, the server capabilities from the second connection overwrites
without freeing the one from the first connection, which leaks.
==1080970==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 421 byte(s) in 2 object(s) allocated from:
#0 0x5615305f849e in strdup (/home/gitster/g/git-jch/bin/bin/git+0x2b349e) (BuildId: 54d149994c9e85374831958f694bd0aa3b8b1e26)
#1 0x561530e76cc4 in xstrdup /home/gitster/w/build/wrapper.c:43:14
#2 0x5615309cd7fa in process_capabilities /home/gitster/w/build/connect.c:243:27
#3 0x5615309cd502 in get_remote_heads /home/gitster/w/build/connect.c:366:4
#4 0x561530e2cb0b in handshake /home/gitster/w/build/transport.c:372:3
#5 0x561530e29ed7 in get_refs_via_connect /home/gitster/w/build/transport.c:398:9
#6 0x561530e26464 in transport_push /home/gitster/w/build/transport.c:1421:16
#7 0x561530800bec in push_with_options /home/gitster/w/build/builtin/push.c:387:8
#8 0x5615307ffb99 in do_push /home/gitster/w/build/builtin/push.c:442:7
#9 0x5615307fe926 in cmd_push /home/gitster/w/build/builtin/push.c:664:7
#10 0x56153065673f in run_builtin /home/gitster/w/build/git.c:506:11
#11 0x56153065342f in handle_builtin /home/gitster/w/build/git.c:779:9
#12 0x561530655b89 in run_argv /home/gitster/w/build/git.c:862:4
#13 0x561530652cba in cmd_main /home/gitster/w/build/git.c:984:19
#14 0x5615308dda0a in main /home/gitster/w/build/common-main.c:9:11
#15 0x7f051651bca7 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
SUMMARY: AddressSanitizer: 421 byte(s) leaked in 2 allocation(s).
Free the capablities data for the previous server before overwriting
it with the next server to plug this leak.
The added test fails without the freeing with SANITIZE=leak; I
somehow couldn't get it fail reliably with SANITIZE=leak,address
though.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Join two paragraphs that start with the standard “The default <hook>,
when enabled” into one and put it at the end of the “pre-commit”
section.
The trailing whitespace paragraph was added in the first commit for the
doc, in 6d35cc76 (Document hooks., 2005-09-02). Then 3e14dd2c (mention
use of "hooks.allownonascii" in "man githooks", 2019-02-20) updated the
“pre-commit” section to mention the non-ASCII check that was added in
d00e364d.[1] But this paragraph was added one-past the original
“default” paragraph, after the env. variable paragraph, and starts
exactly the same. That causes the flow of this section to feel
off (paragraphs in order):
1. Invoked by <cmd> and what parameters it takes
2. The default 'pre-commit' hook catches introduction of trailing
whitespace
3. `GIT_EDITOR=:`
4. The default pre-commit' hook catches introduction of non-ASCII
filenames
Let’s instead join these two paragrahs and explain the whole behavior of
the default script.
† 1: Extend sample pre-commit hook to check for non ascii filenames,
2009-05-19
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitmkdtemp() has become a trivial wrapper around git_mkdtemp(). Remove
this now unnecessary layer of indirection.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Older versions of mktemp(3) generate easily guessable file names. The
function checks if the generated name is used, which is unreliable, as
a file with that name might then be created by some other process before
we can do it ourselves. The function was dropped from POSIX due to its
security problems. Forbid its use.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the mktemp(3) compatibility function now that its last caller was
removed by the previous commit.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A file might appear at the path returned by mktemp(3) before we call
mkdir(2). Use the more robust git_mkdtemp() instead, which retries a
number of times and doesn't need to call lstat(2).
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend git_mkstemps_mode() to optionally call mkdir(2) instead of
open(2), then use that ability to create a mkdtemp(3) replacement,
git_mkdtemp(). We'll start using it in the next commit.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git repo struct" learned to take "-z" as a synonym to "--format=nul".
* lo/repo-struct-z:
repo: add -z as an alias for --format=nul to git-repo-structure
repo: use [--format=... | -z] instead of [-z] in git-repo-info synopsis
repo: remove blank line from Documentation/git-repo.adoc
A help message from "git branch" now mentions "git help" instead of
"man" when suggesting to read some documentation.
* kh/advise-w-git-help-in-branch:
branch: advice using git-help(1) instead of man(1)
Build fix.
* tc/meson-cross-compile-fix:
meson: use is_cross_build() where possible
meson: only detect ICONV_OMITS_BOM if possible
meson: ignore subprojects/.wraplock
"git last-modified" used to mishandle "--" to mark the beginning of
pathspec, which has been corrected.
* js/last-modified-with-sparse-checkouts:
last-modified: support sparse checkouts
Halve the memory consumed by artificial filepairs created during
"git diff --find-copioes-harder", also making the operation run
faster.
* rs/diff-index-find-copies-harder-optim:
diff-index: don't queue unchanged filepairs with diff_change()
Recent optimization to "last-modified" command introduced use of
uninitialized block of memory, which has been corrected.
* tc/last-modified-active-paths-optimization:
last-modified: fix use of uninitialized memory
The error message given by "git config set", when the variable
being updated has more than one values defined, used old style "git
config" syntax with an incorrect option in its hint, both of which
have been corrected.
* rs/config-set-multi-error-message-fix:
config: fix suggestion for failed set of multi-valued option
The option help text given by "git config unset -h" described
the "--all" option to "replace", not "unset", multiple variables,
which has been corrected.
* rs/config-unset-opthelp-fix:
config: fix short help of unset flags
Code refactoring around object database sources.
* ps/object-source-management:
odb: handle recreation of quarantine directories
odb: handle changing a repository's commondir
chdir-notify: add function to unregister listeners
odb: handle initialization of sources in `odb_new()`
http-push: stop setting up `the_repository` for each reference
t/helper: stop setting up `the_repository` repeatedly
builtin/index-pack: fix deferred fsck outside repos
oidset: introduce `oidset_equal()`
odb: move logic to disable ref updates into repo
odb: refactor `odb_clear()` to `odb_free()`
odb: adopt logic to close object databases
setup: convert `set_git_dir()` to have file scope
path: move `enter_repo()` into "setup.c"
Dockerised jobs at the GitHub Actions CI have been taught to show
more details of failed tests.
* js/ci-show-breakage-in-dockerized-jobs:
ci(dockerized): do show the result of failing tests again
The "--committer-date-is-author-date" option of "git am/rebase" is
a misguided one. The documentation is updated to discourage its
use.
* kh/doc-committer-date-is-author-date:
doc: warn against --committer-date-is-author-date
"git config get --path" segfaulted on an ":(optional)path" that
does not exist, which has been corrected.
* jc/optional-path:
config: really treat missing optional path as not configured
config: really pretend missing :(optional) value is not there
config: mark otherwise unused function as file-scope static
Code clean-up.
* en/xdiff-cleanup-2:
xdiff: rename rindex -> reference_index
xdiff: change rindex from long to size_t in xdfile_t
xdiff: make xdfile_t.nreff a size_t instead of long
xdiff: make xdfile_t.nrec a size_t instead of long
xdiff: split xrecord_t.ha into line_hash and minimal_perfect_hash
xdiff: use unambiguous types in xdl_hash_record()
xdiff: use size_t for xrecord_t.size
xdiff: make xrecord_t.ptr a uint8_t instead of char
xdiff: use ptrdiff_t for dstart/dend
doc: define unambiguous type mappings across C and Rust