304 Commits

Author SHA1 Message Date
Junio C Hamano
9d442ce2e2 Merge branch 'ps/object-source-management'
Code refactoring around object database sources.

* ps/object-source-management:
  odb: handle recreation of quarantine directories
  odb: handle changing a repository's commondir
  chdir-notify: add function to unregister listeners
  odb: handle initialization of sources in `odb_new()`
  http-push: stop setting up `the_repository` for each reference
  t/helper: stop setting up `the_repository` repeatedly
  builtin/index-pack: fix deferred fsck outside repos
  oidset: introduce `oidset_equal()`
  odb: move logic to disable ref updates into repo
  odb: refactor `odb_clear()` to `odb_free()`
  odb: adopt logic to close object databases
  setup: convert `set_git_dir()` to have file scope
  path: move `enter_repo()` into "setup.c"
2025-12-05 14:49:58 +09:00
Junio C Hamano
0534b78576 Merge branch 'jc/optional-path'
"git config get --path" segfaulted on an ":(optional)path" that
does not exist, which has been corrected.

* jc/optional-path:
  config: really treat missing optional path as not configured
  config: really pretend missing :(optional) value is not there
  config: mark otherwise unused function as file-scope static
2025-12-05 14:49:56 +09:00
Junio C Hamano
aea8cc3a10 Merge branch 'jk/asan-bonanza'
Various issues detected by Asan have been corrected.

* jk/asan-bonanza:
  t: enable ASan's strict_string_checks option
  fsck: avoid parse_timestamp() on buffer that isn't NUL-terminated
  fsck: remove redundant date timestamp check
  fsck: avoid strcspn() in fsck_ident()
  fsck: assert newline presence in fsck_ident()
  cache-tree: avoid strtol() on non-string buffer
  Makefile: turn on NO_MMAP when building with ASan
  pack-bitmap: handle name-hash lookups in incremental bitmaps
  compat/mmap: mark unused argument in git_munmap()
2025-11-30 18:31:41 -08:00
Patrick Steinhardt
8dc22e87f0 builtin/index-pack: fix deferred fsck outside repos
When asked to perform object consistency checks via the `--fsck-objects`
flag we verify that each object part of the pack is valid. In general,
this check can even be performed outside of a Git repository: we don't
need an initialized object database as we simply read the object from
the packfile directly.

But there's one exception: a subset of the object checks may be deferred
to a later point in time. For now, this only concerns ".gitmodules" and
".gitattributes" files: whenever we see a tree referencing these files
we queue them for a deferred check. This is done because we need to do
some extra checks for those files to ensure that they are well-formed,
and these checks need to be done regardless of whether the corresponding
blobs are part of the packfile or not.

This works inside a repository, but unfortunately the logic leads to a
segfault when running outside of one. This is because we eventually call
`odb_read_object()`, which will crash because the object database has
not been initialized.

There's multiple options here:

  - We could in theory create a purely in-memory database with only a
    packfile store that contains the single packfile. We don't really
    have the infrastructure for this yet though, and it would end up
    being quite hacky.

  - We could refuse to perform consistency checks outside of a
    repository. But most of the checks work alright, so this would be a
    regression.

  - We can skip the finalizing consistency checks when running outside
    of a repository. This is not as invasive as skipping all checks,
    but it's not great to randomly skip a subset of tests, either.

None of these options really feel perfect. The first one would be the
obvious choice if easily possible.

There's another option though: instead of skipping the final object
checks, we can die if there are any queued object checks. With this
change we now die exactly if and only if we would have previously
segfaulted. Like this we ensure that objects that _may_ fail the
consistency checks won't be silently skipped, and at the same time we
give users a much better error message.

Refactor the code accordingly and add a test that would have triggered
the segfault. Note that we also move down the logic to add the packfile
to the store. There is no point doing this any earlier than right before
we execute `fsck_finish()`, and it ensures that the logic to set up and
perform the consistency check is self-contained.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-25 12:15:59 -08:00
Junio C Hamano
0bd16856ff config: really treat missing optional path as not configured
These callers expect that git_config_pathname() that returns 0 is a
signal that the variable they passed has a string they need to act
on.  But with the introduction of ":(optional)path" earlier, that is
no longer the case.  If the path specified by the configuration
variable is missing, their variable will get a NULL in it, and they
need to act on it (often, just refraining from copying it elsewhere).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-24 17:00:47 -08:00
Jeff King
5a993593b2 fsck: avoid parse_timestamp() on buffer that isn't NUL-terminated
In fsck_ident(), we parse the timestamp with parse_timestamp(), which is
really an alias for strtoumax(). But since our buffer may not be
NUL-terminated, this can trigger a complaint from ASan's
strict_string_checks mode. This is a false positive, since we know that
the buffer contains a trailing newline (which we checked earlier in the
function), and that strtoumax() would stop there.

But it is worth working around ASan's complaint. One is because that
will let us turn on strict_string_checks by default, which has helped
catch other real problems. And two is that the safety of the current
code is very hard to reason about (it subtly depends on distant code
which could change).

One option here is to just parse the number left-to-right ourselves. But
we care about the size of a timestamp_t and detecting overflow, since
that's part of the point of these checks. And doing that correctly is
tricky. So we'll instead just pull the digits into a separate,
NUL-terminated buffer, and use that to call parse_timestamp().

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18 09:36:12 -08:00
Jeff King
f05df7ffca fsck: remove redundant date timestamp check
After calling "parse_timestamp(p, &end, 10)", we complain if "p == end",
which would imply that we did not see any digits at all. But we know
this cannot be the case, since we would have bailed already if we did
not see any digits, courtesy of extra checks added by 8e4309038f (fsck:
do not assume NUL-termination of buffers, 2023-01-19). Since then,
checking "p == end" is redundant and we can drop it.

This will make our lives a little easier as we refactor further.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18 09:36:11 -08:00
Jeff King
830424def4 fsck: avoid strcspn() in fsck_ident()
We may be operating on a buffer that is not NUL-terminated, but we use
strcspn() to parse it. This is OK in practice, as discussed in
8e4309038f (fsck: do not assume NUL-termination of buffers, 2023-01-19),
because we know there is at least a trailing newline in our buffer, and
we always pass "\n" to strcspn(). So we know it will stop before running
off the end of the buffer.

But this is a subtle point to hang our memory safety hat on. And it
confuses ASan's strict_string_checks mode, even though it is technically
a false positive (that mode complains that we have no NUL, which is
true, but it does not know that we have verified the presence of the
newline already).

Let's instead open-code the loop. As a bonus, this makes the logic more
obvious (to my mind, anyway). The current code skips forward with
strcspn until it hits "<", ">", or "\n". But then it must check which it
saw to decide if that was what we expected or not, duplicating some
logic between what's in the strcspn() and what's in the domain logic.
Instead, we can just check each character as we loop and act on it
immediately.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18 09:36:11 -08:00
Jeff King
0b6ec075df fsck: assert newline presence in fsck_ident()
The fsck code purports to handle buffers that are not NUL-terminated,
but fsck_ident() uses some string functions. This works OK in practice,
as explained in 8e4309038f (fsck: do not assume NUL-termination of
buffers, 2023-01-19). Before calling fsck_ident() we'll have called
verify_headers(), which makes sure we have at least a trailing newline.
And none of our string-like functions will walk past that newline.

However, that makes this code at the top of fsck_ident() very confusing:

    *ident = strchrnul(*ident, '\n');
    if (**ident == '\n')
            (*ident)++;

We should always see that newline, or our memory safety assumptions have
been violated! Further, using strchrnul() is weird, since the whole
point is that if the newline is not there, we don't necessarily have a
NUL at all, and might read off the end of the buffer.

So let's have callers pass in the boundary of our buffer, which lets us
safely find the newline with memchr(). And if it is not there, this is a
BUG(), because it means our caller did not validate the input with
verify_headers() as it was supposed to (and we are better off bailing
rather than having memory-safety problems).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18 09:36:11 -08:00
brian m. carlson
51acda73d3 fsck: consider gpgsig headers expected in tags
When we're creating a tag, we want to make sure that gpgsig and
gpgsig-sha256 headers are allowed for the commit.  The default fsck
behavior is to ignore the fact that they're left over, but some of our
tests enable strict checking which flags them nonetheless.  Add
improved checking for these headers as well as documentation and several
tests.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-09 17:46:14 -07:00
Patrick Steinhardt
08b775864e config: move Git config parsing into "environment.c"
In "config.c" we host both the business logic to read and write config
files as well as the logic to parse specific Git-related variables. On
the one hand this is mixing concerns, but even more importantly it means
that we cannot easily remove the dependency on `the_repository` in our
config parsing logic.

Move the logic into "environment.c". This file is a grab bag of all
kinds of global state already, so it is quite a good fit. Furthermore,
it also hosts most of the global variables that we're parsing the config
values into, making this an even better fit.

Note that there is one hidden change: in `parse_fsync_components()` we
use an `int` to iterate through `ARRAY_SIZE(fsync_component_names)`. But
as -Wsign-compare warnings are enabled in this file this causes a
compiler warning. The issue is fixed by using a `size_t` instead.

This change allows us to drop the `USE_THE_REPOSITORY_VARIABLE`
declaration.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-23 08:15:22 -07:00
Patrick Steinhardt
d4ff88aee3 odb: rename repo_read_object_file()
Rename `repo_read_object_file()` to `odb_read_object()` to match other
functions related to the object database and our modern coding
guidelines.

Introduce a compatibility wrapper so that any in-flight topics will
continue to compile.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01 14:46:38 -07:00
Patrick Steinhardt
8f49151763 object-store: rename files to "odb.{c,h}"
In the preceding commits we have renamed the structures contained in
"object-store.h" to `struct object_database` and `struct odb_backend`.
As such, the code files "object-store.{c,h}" are confusingly named now.
Rename them to "odb.{c,h}" accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01 14:46:34 -07:00
Patrick Steinhardt
68cd492a3e object-store: merge "object-store-ll.h" and "object-store.h"
The "object-store-ll.h" header has been introduced to keep transitive
header dependendcies and compile times at bay. Now that we have created
a new "object-store.c" file though we can easily move the last remaining
additional bit of "object-store.h", the `odb_path_map`, out of the
header.

Do so. As the "object-store.h" header is now equivalent to its low-level
alternative we drop the latter and inline it into the former.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-15 08:24:37 -07:00
Justin Tobler
ca7158076f fsck: reject misconfigured fsck.skipList
In Git, fsck operations can ignore known broken objects via the
`fsck.skipList` configuration. This option expects a path to a file with
the list of object names. When the configuration is specified without a
path, an error message is printed, but the command continues as if the
configuration was not set. Configuring `fsck.skipList` without a value
is a misconfiguration so config parsing should be more strict and reject
it.

Update `git_fsck_config()` to no longer ignore misconfiguration of
`fsck.skipList`. The same behavior is also present for
`fetch.fsck.skipList` and `receive.fsck.skipList` so the configuration
parsers for these are updated to ensure the related operations remain
consistent.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-07 09:22:25 -08:00
Karthik Nayak
c87910b96b packfile: pass down repository to for_each_packed_object
The function `for_each_packed_object` currently relies on the global
variable `the_repository`. To eliminate global variable usage in
`packfile.c`, we should progressively shift the dependency on
the_repository to higher layers. Let's remove its usage from this
function and closely related function `is_promisor_object`.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04 08:21:54 +09:00
shejialuo
bf061d26c7 builtin/refs: add verify subcommand
Introduce a new subcommand "verify" in git-refs(1) to allow the user to
check the reference database consistency and also this subcommand will
be used as the entry point of checking refs for "git-fsck(1)".

Add "verbose" field into "fsck_options" to indicate whether we should
print verbose messages when checking refs and objects consistency.

Remove bit-field for "strict" field, this is because we cannot take
address of a bit-field which makes it unhandy to set member variables
when parsing the command line options.

The "git-fsck(1)" declares "fsck_options" variable with "static"
identifier which avoids complaint by the leak-checker. However, in
"git-refs verify", we need to do memory clean manually. Thus add
"fsck_options_clear" function in "fsck.c" to provide memory clean
operation.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:36:53 -07:00
shejialuo
2de307cdb2 fsck: add refs report function
Introduce a new struct "fsck_ref_report" to contain the information we
need when reporting refs-related messages.

With the new "fsck_vreport" function, add a new function
"fsck_report_ref" to report refs-related fsck error message. Unlike
"report" function uses the exact parameters, we simply pass "struct
fsck_ref_report *report" as the parameter. This is because at current we
don't know exactly how many fields we need. By passing this parameter,
we don't need to change this function prototype when we want to add more
information into "fsck_ref_report".

We have introduced "fsck_report_ref" function to report the error
message for refs. We still need to add the corresponding callback
function. Create refs-specific "error_func" callback
"fsck_refs_error_function".

Last, add "FSCK_REFS_OPTIONS_DEFAULT" macro to create default options
when checking ref consistency.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:36:52 -07:00
shejialuo
3473d18fad fsck: add a unified interface for reporting fsck messages
The static function "report" provided by "fsck.c" aims at checking error
type and calling the callback "error_func" to report the message. Both
refs and objects need to check the error type of the current fsck
message. In order to extract this common behavior, create a new function
"fsck_vreport". Instead of using "...", provide "va_list" to allow more
flexibility.

Instead of changing "report" prototype to be align with the
"fsck_vreport" function, we leave the "report" prototype unchanged due
to the reason that there are nearly 62 references about "report"
function. Simply change "report" function to use "fsck_vreport" to
report objects related messages.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:36:52 -07:00
shejialuo
0ec5dfe8c4 fsck: make "fsck_error" callback generic
The "fsck_error" callback is designed to report the objects-related
error messages. It accepts two parameter "oid" and "object_type" which
is not generic. In order to provide a unified callback which can report
either objects or refs, remove the objects-related parameters and add
the generic parameter "void *fsck_report".

Create a new "fsck_object_report" structure which incorporates the
removed parameters "oid" and "object_type". Then change the
corresponding references to adapt to new "fsck_error" callback.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:36:52 -07:00
shejialuo
8cd4a447b8 fsck: rename objects-related fsck error functions
The names of objects-related fsck error functions are generic. It's OK
when there is only object database check. However, we are going to
introduce refs database check report function. To avoid ambiguity,
rename object-related fsck error functions to explicitly indicate these
functions are used to report objects-related messages.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:36:52 -07:00
shejialuo
2d79aa9095 fsck: rename "skiplist" to "skip_oids"
The "skiplist" field in "fsck_options" is related to objects. Because we
are going to introduce ref consistency check, the "skiplist" name is too
general which will make the caller think "skiplist" is related to both
the refs and objects.

It may seem that for both refs and objects, we should provide a general
"skiplist" here. However, the type for "skiplist" is `struct oidset`
which is totally unsuitable for refs.

To avoid above ambiguity, rename "skiplist" to "skip_oids".

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:36:52 -07:00
Junio C Hamano
7b472da915 Merge branch 'ps/use-the-repository'
A CPP macro USE_THE_REPOSITORY_VARIABLE is introduced to help
transition the codebase to rely less on the availability of the
singleton the_repository instance.

* ps/use-the-repository:
  hex: guard declarations with `USE_THE_REPOSITORY_VARIABLE`
  t/helper: remove dependency on `the_repository` in "proc-receive"
  t/helper: fix segfault in "oid-array" command without repository
  t/helper: use correct object hash in partial-clone helper
  compat/fsmonitor: fix socket path in networked SHA256 repos
  replace-object: use hash algorithm from passed-in repository
  protocol-caps: use hash algorithm from passed-in repository
  oidset: pass hash algorithm when parsing file
  http-fetch: don't crash when parsing packfile without a repo
  hash-ll: merge with "hash.h"
  refs: avoid include cycle with "repository.h"
  global: introduce `USE_THE_REPOSITORY_VARIABLE` macro
  hash: require hash algorithm in `empty_tree_oid_hex()`
  hash: require hash algorithm in `is_empty_{blob,tree}_oid()`
  hash: make `is_null_oid()` independent of `the_repository`
  hash: convert `oidcmp()` and `oideq()` to compare whole hash
  global: ensure that object IDs are always padded
  hash: require hash algorithm in `oidread()` and `oidclr()`
  hash: require hash algorithm in `hasheq()`, `hashcmp()` and `hashclr()`
  hash: drop (mostly) unused `is_empty_{blob,tree}_sha1()` functions
2024-07-02 09:59:00 -07:00
Junio C Hamano
4216329457 Merge branch 'ps/no-writable-strings'
Building with "-Werror -Wwrite-strings" is now supported.

* ps/no-writable-strings: (27 commits)
  config.mak.dev: enable `-Wwrite-strings` warning
  builtin/merge: always store allocated strings in `pull_twohead`
  builtin/rebase: always store allocated string in `options.strategy`
  builtin/rebase: do not assign default backend to non-constant field
  imap-send: fix leaking memory in `imap_server_conf`
  imap-send: drop global `imap_server_conf` variable
  mailmap: always store allocated strings in mailmap blob
  revision: always store allocated strings in output encoding
  remote-curl: avoid assigning string constant to non-const variable
  send-pack: always allocate receive status
  parse-options: cast long name for OPTION_ALIAS
  http: do not assign string constant to non-const field
  compat/win32: fix const-correctness with string constants
  pretty: add casts for decoration option pointers
  object-file: make `buf` parameter of `index_mem()` a constant
  object-file: mark cached object buffers as const
  ident: add casts for fallback name and GECOS
  entry: refactor how we remove items for delayed checkouts
  line-log: always allocate the output prefix
  line-log: stop assigning string constant to file parent buffer
  ...
2024-06-17 15:55:58 -07:00
Patrick Steinhardt
f2c32a66f5 oidset: pass hash algorithm when parsing file
The `oidset_parse_file_carefully()` function implicitly depends on
`the_repository` when parsing object IDs. Fix this by having callers
pass in the hash algorithm to use.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 10:26:34 -07:00
Patrick Steinhardt
e7da938570 global: introduce USE_THE_REPOSITORY_VARIABLE macro
Use of the `the_repository` variable is deprecated nowadays, and we
slowly but steadily convert the codebase to not use it anymore. Instead,
callers should be passing down the repository to work on via parameters.

It is hard though to prove that a given code unit does not use this
variable anymore. The most trivial case, merely demonstrating that there
is no direct use of `the_repository`, is already a bit of a pain during
code reviews as the reviewer needs to manually verify claims made by the
patch author. The bigger problem though is that we have many interfaces
that implicitly rely on `the_repository`.

Introduce a new `USE_THE_REPOSITORY_VARIABLE` macro that allows code
units to opt into usage of `the_repository`. The intent of this macro is
to demonstrate that a certain code unit does not use this variable
anymore, and to keep it from new dependencies on it in future changes,
be it explicit or implicit

For now, the macro only guards `the_repository` itself as well as
`the_hash_algo`. There are many more known interfaces where we have an
implicit dependency on `the_repository`, but those are not guarded at
the current point in time. Over time though, we should start to add
guards as required (or even better, just remove them).

Define the macro as required in our code units. As expected, most of our
code still relies on the global variable. Nearly all of our builtins
rely on the variable as there is no way yet to pass `the_repository` to
their entry point. For now, declare the macro in "biultin.h" to keep the
required changes at least a little bit more contained.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 10:26:33 -07:00
Patrick Steinhardt
b567004b4b global: improve const correctness when assigning string constants
We're about to enable `-Wwrite-strings`, which changes the type of
string constants to `const char[]`. Fix various sites where we assign
such constants to non-const variables.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:48 -07:00
Junio C Hamano
cf792653ad Merge branch 'ps/leakfixes'
Leakfixes.

* ps/leakfixes:
  builtin/mv: fix leaks for submodule gitfile paths
  builtin/mv: refactor to use `struct strvec`
  builtin/mv duplicate string list memory
  builtin/mv: refactor `add_slash()` to always return allocated strings
  strvec: add functions to replace and remove strings
  submodule: fix leaking memory for submodule entries
  commit-reach: fix memory leak in `ahead_behind()`
  builtin/credential: clear credential before exit
  config: plug various memory leaks
  config: clarify memory ownership in `git_config_string()`
  builtin/log: stop using globals for format config
  builtin/log: stop using globals for log config
  convert: refactor code to clarify ownership of check_roundtrip_encoding
  diff: refactor code to clarify memory ownership of prefixes
  config: clarify memory ownership in `git_config_pathname()`
  http: refactor code to clarify memory ownership
  checkout: clarify memory ownership in `unique_tracking_name()`
  strbuf: fix leak when `appendwholeline()` fails with EOF
  transport-helper: fix leaking helper name
2024-06-06 12:49:23 -07:00
Patrick Steinhardt
6073b3b5c3 config: clarify memory ownership in git_config_pathname()
The out parameter of `git_config_pathname()` is a `const char **` even
though we transfer ownership of memory to the caller. This is quite
misleading and has led to many memory leaks all over the place. Adapt
the parameter to instead be `char **`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:19:59 -07:00
Junio C Hamano
d36cc0d5a4 Merge branch 'fixes/2.45.1/2.44' into jc/fix-2.45.1-and-friends-for-maint
* fixes/2.45.1/2.44:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-24 16:59:12 -07:00
Junio C Hamano
863c0ed71e Merge branch 'fixes/2.45.1/2.43' into fixes/2.45.1/2.44
* fixes/2.45.1/2.43:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-24 16:58:35 -07:00
Junio C Hamano
3c562ef2e6 Merge branch 'fixes/2.45.1/2.42' into fixes/2.45.1/2.43
* fixes/2.45.1/2.42:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-24 16:58:11 -07:00
Junio C Hamano
73339e4dc2 Merge branch 'fixes/2.45.1/2.41' into fixes/2.45.1/2.42
* fixes/2.45.1/2.41:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-24 16:57:43 -07:00
Junio C Hamano
4f215d214f Merge branch 'fixes/2.45.1/2.40' into fixes/2.45.1/2.41
* fixes/2.45.1/2.40:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-24 16:57:02 -07:00
Junio C Hamano
48440f60a7 Merge branch 'jc/fix-2.45.1-and-friends-for-2.39' into fixes/2.45.1/2.40
Revert overly aggressive "layered defence" that went into 2.45.1
and friends, which broke "git-lfs", "git-annex", and other use
cases, so that we can rebuild necessary counterparts in the open.

* jc/fix-2.45.1-and-friends-for-2.39:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-24 12:29:36 -07:00
Junio C Hamano
1991703bdb Revert "fsck: warn about symlink pointing inside a gitdir"
This reverts commit a33fea08 (fsck: warn about symlink pointing
inside a gitdir, 2024-04-10), which warns against symbolic links
commonly created by git-annex.
2024-05-22 21:55:31 -07:00
Johannes Schindelin
1c00f92eb5 Sync with 2.44.1
* maint-2.44: (41 commits)
  Git 2.44.1
  Git 2.43.4
  Git 2.42.2
  Git 2.41.1
  Git 2.40.2
  Git 2.39.4
  fsck: warn about symlink pointing inside a gitdir
  core.hooksPath: add some protection while cloning
  init.templateDir: consider this config setting protected
  clone: prevent hooks from running during a clone
  Add a helper function to compare file contents
  init: refactor the template directory discovery into its own function
  find_hook(): refactor the `STRIP_EXTENSION` logic
  clone: when symbolic links collide with directories, keep the latter
  entry: report more colliding paths
  t5510: verify that D/F confusion cannot lead to an RCE
  submodule: require the submodule path to contain directories only
  clone_submodule: avoid using `access()` on directories
  submodules: submodule paths must not contain symlinks
  clone: prevent clashing git dirs when cloning submodule in parallel
  ...
2024-04-29 20:42:30 +02:00
Johannes Schindelin
e5e6663e69 Sync with 2.43.4
* maint-2.43: (40 commits)
  Git 2.43.4
  Git 2.42.2
  Git 2.41.1
  Git 2.40.2
  Git 2.39.4
  fsck: warn about symlink pointing inside a gitdir
  core.hooksPath: add some protection while cloning
  init.templateDir: consider this config setting protected
  clone: prevent hooks from running during a clone
  Add a helper function to compare file contents
  init: refactor the template directory discovery into its own function
  find_hook(): refactor the `STRIP_EXTENSION` logic
  clone: when symbolic links collide with directories, keep the latter
  entry: report more colliding paths
  t5510: verify that D/F confusion cannot lead to an RCE
  submodule: require the submodule path to contain directories only
  clone_submodule: avoid using `access()` on directories
  submodules: submodule paths must not contain symlinks
  clone: prevent clashing git dirs when cloning submodule in parallel
  t7423: add tests for symlinked submodule directories
  ...
2024-04-19 12:38:54 +02:00
Johannes Schindelin
8e97ec3662 Sync with 2.42.2
* maint-2.42: (39 commits)
  Git 2.42.2
  Git 2.41.1
  Git 2.40.2
  Git 2.39.4
  fsck: warn about symlink pointing inside a gitdir
  core.hooksPath: add some protection while cloning
  init.templateDir: consider this config setting protected
  clone: prevent hooks from running during a clone
  Add a helper function to compare file contents
  init: refactor the template directory discovery into its own function
  find_hook(): refactor the `STRIP_EXTENSION` logic
  clone: when symbolic links collide with directories, keep the latter
  entry: report more colliding paths
  t5510: verify that D/F confusion cannot lead to an RCE
  submodule: require the submodule path to contain directories only
  clone_submodule: avoid using `access()` on directories
  submodules: submodule paths must not contain symlinks
  clone: prevent clashing git dirs when cloning submodule in parallel
  t7423: add tests for symlinked submodule directories
  has_dir_name(): do not get confused by characters < '/'
  ...
2024-04-19 12:38:50 +02:00
Johannes Schindelin
be348e9815 Sync with 2.41.1
* maint-2.41: (38 commits)
  Git 2.41.1
  Git 2.40.2
  Git 2.39.4
  fsck: warn about symlink pointing inside a gitdir
  core.hooksPath: add some protection while cloning
  init.templateDir: consider this config setting protected
  clone: prevent hooks from running during a clone
  Add a helper function to compare file contents
  init: refactor the template directory discovery into its own function
  find_hook(): refactor the `STRIP_EXTENSION` logic
  clone: when symbolic links collide with directories, keep the latter
  entry: report more colliding paths
  t5510: verify that D/F confusion cannot lead to an RCE
  submodule: require the submodule path to contain directories only
  clone_submodule: avoid using `access()` on directories
  submodules: submodule paths must not contain symlinks
  clone: prevent clashing git dirs when cloning submodule in parallel
  t7423: add tests for symlinked submodule directories
  has_dir_name(): do not get confused by characters < '/'
  docs: document security issues around untrusted .git dirs
  ...
2024-04-19 12:38:46 +02:00
Johannes Schindelin
f5b2af06f5 Sync with 2.40.2
* maint-2.40: (39 commits)
  Git 2.40.2
  Git 2.39.4
  fsck: warn about symlink pointing inside a gitdir
  core.hooksPath: add some protection while cloning
  init.templateDir: consider this config setting protected
  clone: prevent hooks from running during a clone
  Add a helper function to compare file contents
  init: refactor the template directory discovery into its own function
  find_hook(): refactor the `STRIP_EXTENSION` logic
  clone: when symbolic links collide with directories, keep the latter
  entry: report more colliding paths
  t5510: verify that D/F confusion cannot lead to an RCE
  submodule: require the submodule path to contain directories only
  clone_submodule: avoid using `access()` on directories
  submodules: submodule paths must not contain symlinks
  clone: prevent clashing git dirs when cloning submodule in parallel
  t7423: add tests for symlinked submodule directories
  has_dir_name(): do not get confused by characters < '/'
  docs: document security issues around untrusted .git dirs
  upload-pack: disable lazy-fetching by default
  ...
2024-04-19 12:38:42 +02:00
Johannes Schindelin
93a88f42db Sync with 2.39.4
* maint-2.39: (38 commits)
  Git 2.39.4
  fsck: warn about symlink pointing inside a gitdir
  core.hooksPath: add some protection while cloning
  init.templateDir: consider this config setting protected
  clone: prevent hooks from running during a clone
  Add a helper function to compare file contents
  init: refactor the template directory discovery into its own function
  find_hook(): refactor the `STRIP_EXTENSION` logic
  clone: when symbolic links collide with directories, keep the latter
  entry: report more colliding paths
  t5510: verify that D/F confusion cannot lead to an RCE
  submodule: require the submodule path to contain directories only
  clone_submodule: avoid using `access()` on directories
  submodules: submodule paths must not contain symlinks
  clone: prevent clashing git dirs when cloning submodule in parallel
  t7423: add tests for symlinked submodule directories
  has_dir_name(): do not get confused by characters < '/'
  docs: document security issues around untrusted .git dirs
  upload-pack: disable lazy-fetching by default
  fetch/clone: detect dubious ownership of local repositories
  ...
2024-04-19 12:38:37 +02:00
Johannes Schindelin
a33fea0886 fsck: warn about symlink pointing inside a gitdir
In the wake of fixing a vulnerability where `git clone` mistakenly
followed a symbolic link that it had just written while checking out
files, writing into a gitdir, let's add some defense-in-depth by
teaching `git fsck` to report symbolic links stored in its trees that
point inside `.git/`.

Even though the Git project never made any promises about the exact
shape of the `.git/` directory's contents, there are likely repositories
out there containing symbolic links that point inside the gitdir. For
that reason, let's only report these as warnings, not as errors.
Security-conscious users are encouraged to configure
`fsck.symlinkPointsToGitDir = error`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-19 12:38:25 +02:00
Junio C Hamano
1002f28a52 Merge branch 'eb/hash-transition'
Work to support a repository that work with both SHA-1 and SHA-256
hash algorithms has started.

* eb/hash-transition: (30 commits)
  t1016-compatObjectFormat: add tests to verify the conversion between objects
  t1006: test oid compatibility with cat-file
  t1006: rename sha1 to oid
  test-lib: compute the compatibility hash so tests may use it
  builtin/ls-tree: let the oid determine the output algorithm
  object-file: handle compat objects in check_object_signature
  tree-walk: init_tree_desc take an oid to get the hash algorithm
  builtin/cat-file: let the oid determine the output algorithm
  rev-parse: add an --output-object-format parameter
  repository: implement extensions.compatObjectFormat
  object-file: update object_info_extended to reencode objects
  object-file-convert: convert commits that embed signed tags
  object-file-convert: convert commit objects when writing
  object-file-convert: don't leak when converting tag objects
  object-file-convert: convert tag objects when writing
  object-file-convert: add a function to convert trees between algorithms
  object: factor out parse_mode out of fast-import and tree-walk into in object.h
  cache: add a function to read an OID of a specific algorithm
  tag: sign both hashes
  commit: export add_header_signature to support handling signatures on tags
  ...
2024-03-28 14:13:50 -07:00
Junio C Hamano
0f7a10a3aa Merge branch 'en/header-cleanup' into maint-2.43
Remove unused header "#include".

* en/header-cleanup:
  treewide: remove unnecessary includes in source files
  treewide: add direct includes currently only pulled in transitively
  trace2/tr2_tls.h: remove unnecessary include
  submodule-config.h: remove unnecessary include
  pkt-line.h: remove unnecessary include
  line-log.h: remove unnecessary include
  http.h: remove unnecessary include
  fsmonitor--daemon.h: remove unnecessary includes
  blame.h: remove unnecessary includes
  archive.h: remove unnecessary include
  treewide: remove unnecessary includes in source files
  treewide: remove unnecessary includes from header files
2024-02-08 16:22:10 -08:00
Junio C Hamano
76bd1294d8 Merge branch 'vd/fsck-submodule-url-test'
Tighten URL checks fsck makes in a URL recorded for submodules.

* vd/fsck-submodule-url-test:
  submodule-config.c: strengthen URL fsck check
  t7450: test submodule urls
  test-submodule: remove command line handling for check-name
  submodule-config.h: move check_submodule_url
2024-01-26 08:54:47 -08:00
Victoria Dye
13320ff610 submodule-config.h: move check_submodule_url
Move 'check_submodule_url' out of 'fsck.c' and into 'submodule-config.h' as
a public method, similar to 'check_submodule_name'. With the function now
accessible outside of 'fsck', it can be used in a later commit to extend
'test-tool submodule' to check the validity of submodule URLs as it does
with names in the 'check-name' subcommand.

Other than its location, no changes are made to 'check_submodule_url' in
this patch.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-18 10:12:48 -08:00
Junio C Hamano
492ee03f60 Merge branch 'en/header-cleanup'
Remove unused header "#include".

* en/header-cleanup:
  treewide: remove unnecessary includes in source files
  treewide: add direct includes currently only pulled in transitively
  trace2/tr2_tls.h: remove unnecessary include
  submodule-config.h: remove unnecessary include
  pkt-line.h: remove unnecessary include
  line-log.h: remove unnecessary include
  http.h: remove unnecessary include
  fsmonitor--daemon.h: remove unnecessary includes
  blame.h: remove unnecessary includes
  archive.h: remove unnecessary include
  treewide: remove unnecessary includes in source files
  treewide: remove unnecessary includes from header files
2024-01-08 14:05:15 -08:00
Elijah Newren
eea0e59ffb treewide: remove unnecessary includes in source files
Each of these were checked with
   gcc -E -I. ${SOURCE_FILE} | grep ${HEADER_FILE}
to ensure that removing the direct inclusion of the header actually
resulted in that header no longer being included at all (i.e. that
no other header pulled it in transitively).

...except for a few cases where we verified that although the header
was brought in transitively, nothing from it was directly used in
that source file.  These cases were:
  * builtin/credential-cache.c
  * builtin/pull.c
  * builtin/send-pack.c

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26 12:04:31 -08:00
Jeff King
d49cb162fa fsck: handle NULL value when parsing message config
When parsing fsck.*, receive.fsck.*, or fetch.fsck.*, we don't check for
an implicit bool. So any of:

  [fsck]
  badTree
  [receive "fsck"]
  badTree
  [fetch "fsck"]
  badTree

will cause us to segfault. We can fix it with config_error_nonbool() in
the usual way, but we have to make a few more changes to get good error
messages. The problem is that all three spots do:

  if (skip_prefix(var, "fsck.", &var))

to match and parse the actual message id. But that means that "var" now
just says "badTree" instead of "receive.fsck.badTree", making the
resulting message confusing. We can fix that by storing the parsed
message id in its own separate variable.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 08:24:47 +09:00