An earlier patch had a typo discovered after it had been merged to
'next'. Fix it.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ever since we added whitespace rules for this project, we misspelt
an entry, which has been corrected.
* jc/gitattributes-whitespace-no-indent-fix:
.gitattributes: remove misspelled no-op whitespace attribute
"git maintenance" command learned "is-needed" subcommand to tell if
it is necessary to perform various maintenance tasks.
* kn/maintenance-is-needed:
maintenance: add 'is-needed' subcommand
maintenance: add checking logic in `pack_refs_condition()`
refs: add a `optimize_required` field to `struct ref_storage_be`
reftable/stack: add function to check if optimization is required
reftable/stack: return stack segments directly
As "git diff --quiet" only cares about the existence of any
changes, disable rename/copy detection to skip more expensive
processing whose result will be discarded anyway.
* rs/diff-quiet-no-rename:
diff: disable rename detection with --quiet
git_configset_get_pathname() is only used once inside config.c; we do
not have to expose it as a public function.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This option could create a commit history which violates the assumption
that commits have non-decreasing commit timestamps. Warn against that in
both git-am(1) and git-rebase(1).
The genesis of this option is from git-am(1) and was added in
3f01ad66 (am: Add --committer-date-is-author-date option,
2009-01-22). The commit message doesn’t give us an example
of a use case, but the thread starter does:[1]
I've a big set of patches in a mbox file: there's sufficient info
inside for git-am to work.
Yet, each time I do import these, my sha1sums are changing because of
different commit dates.
I'd like to force the commit date to match the info/date from the time
I received the email (and therefore always get back the right
sha1sums).
[1]: https://lore.kernel.org/git/46d6db660901221441q60eb90bdge601a7a250c3a247@mail.gmail.com/
So the motivation was to treat git-am(1) as an import command that
creates the same commit IDs.
Putting aside the question of whether you should be using git-am(1) for
importing commits, this approach is problematic:
• you still need to apply the commits to the same base if you want the
same hashes; and
• you need the same committer.
And if you expect the same committer, why is this person applying the
same patches multiple times with the goal of making *identical* commits?
That was all for git-am(1).
It was added to git-rebase(1) in 570ccad3 (rebase: add options passed to
git-am, 2009-03-18)[2] in order to plug options that could not be sent
on to git-am(1). At this point the utility of the option graduated to
making no sense; a use case for `git rebase --committer-date-is-author-date`
has yet to be found.
Just warn against using this option on both commands and remind the user
to consider whether they really need it.
[2]: See also 7573cec5 (rebase -i: support
--committer-date-is-author-date, 2020-08-17) for the commit for the
merge backend
Suggested-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function `odb_clear()` releases all resources allocated to an object
database and ensures that all fields become zero'd out. Despite its
naming though it doesn't really clear the object database so that it
becomes ready for reuse afterwards again -- the caller would first have
to reinitialize it, and that contradicts the terminology of "clearing"
as we have defined it in our coding guidelines.
There isn't really a reason to have "clearing" semantics, either.
There's only a single caller of `odb_clear()`, and that caller also ends
up freeing the object database structure itself.
Refactor the function to have "freeing" semantics instead, so that the
structure itself is also freed, which allows us to drop some useless
boilerplate to zero out the structure's members.
This refactoring reveals that we're trying to close the commit graph
multiple times: once directly via `free_commit_graph()`, and once via
`odb_close()`. Drop the former call.
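For illustration, a minimal sketch of the resulting shape, under assumed
names (the real code in odb.c may structure this differently):

    void odb_free(struct object_database *odb)
    {
        if (!odb)
            return;
        odb_close(odb);   /* closes packs, sources and the commit graph */
        /* ... release the remaining members ... */
        free(odb);        /* unlike odb_clear(), the struct itself goes, too */
    }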
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The logic to close an object database is currently contained in the
packfile subsystem. That choice is somewhat relatable, as most of the
logic really is to close resources associated with the packfile store
itself. But we also end up handling object sources and commit graphs,
which certainly is not related to packfiles.
Move the function into the object database subsystem and rename it to
`odb_close()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We don't have any external callers of `set_git_dir()` anymore now that
`enter_repo()` has been moved into "setup.c". Remove the declaration and
mark the function as static.
Note that this change requires us to move the implementation around so
that we can avoid adding any new forward declarations.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function `enter_repo()` is used to enter a repository at a given
path. As such it sits way closer to setting up a repository than to
handling paths, but regardless of that it's located in "path.c"
instead of in "setup.c".
Move the function into "setup.c".
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A part of code paths that deals with loose objects has been cleaned
up.
* ps/object-source-loose:
object-file: refactor writing objects via a stream
object-file: rename `write_object_file()`
object-file: refactor freshening of objects
object-file: rename `has_loose_object()`
object-file: read objects via the loose object source
object-file: move loose object map into loose source
object-file: hide internals when we need to reprepare loose sources
object-file: move loose object cache into loose source
object-file: introduce `struct odb_source_loose`
object-file: move `fetch_if_missing`
odb: adjust naming to free object sources
odb: introduce `odb_source_new()`
odb: fix subtle logic to check whether an alternate is usable
- Switch the synopsis to a synopsis block which will automatically
format placeholders in italics and keywords in monospace
- Use _<placeholder>_ instead of <placeholder> in the description
- Use `backticks` for keywords and more complex option
descriptions. The new rendering engine will apply synopsis rules to
these spans.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
- Switch the synopsis to a synopsis block which will automatically
format placeholders in italics and keywords in monospace
- Use _<placeholder>_ instead of <placeholder> in the description
- Use `backticks` for keywords and more complex option
descriptions. The new rendering engine will apply synopsis rules to
these spans.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
- Switch the synopsis to a synopsis block which will automatically
format placeholders in italics and keywords in monospace
- Use _<placeholder>_ instead of <placeholder> in the description
- Use `backticks` for keywords and more complex option
descriptions. The new rendering engine will apply synopsis rules to
these spans.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up.
* kn/refs-optim-cleanup:
t/pack-refs-tests: move the 'test_done' to callees
refs: rename 'pack_refs_opts' to 'refs_optimize_opts'
refs: move to using the '.optimize' functions
Some ref backend storage can hold not just the object name of an
annotated tag, but the object name of the object the tag points at.
The code to handle this information has been streamlined.
* ps/ref-peeled-tags:
t7004: do not chdir around in the main process
ref-filter: fix stale parsed objects
ref-filter: parse objects on demand
ref-filter: detect broken tags when dereferencing them
refs: don't store peeled object IDs for invalid tags
object: add flag to `peel_object()` to verify object type
refs: drop infrastructure to peel via iterators
refs: drop `current_ref_iter` hack
builtin/show-ref: convert to use `reference_get_peeled_oid()`
ref-filter: propagate peeled object ID
upload-pack: convert to use `reference_get_peeled_oid()`
refs: expose peeled object ID via the iterator
refs: refactor reference status flags
refs: fully reset `struct ref_iterator::ref` on iteration
refs: introduce `.ref` field for the base iterator
refs: introduce wrapper struct for `each_ref_fn`
The list of packfiles used in a running Git process is moved from
the packed_git structure into the packfile store.
* ps/packed-git-in-object-store:
packfile: track packs via the MRU list exclusively
packfile: always add packfiles to MRU when adding a pack
packfile: move list of packs into the packfile store
builtin/pack-objects: simplify logic to find kept or nonlocal objects
packfile: fix approximation of object counts
http: refactor subsystem to use `packfile_list`s
packfile: move the MRU list into the packfile store
packfile: use a `strmap` to store packs by name
The classic diff adds only the lines that it is going to consider
during the diff to an array. This array provides the mapping between
the compacted array and the lines of the file that they reference.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The field rindex describes an index offset for other arrays. Change it
to size_t.
Changing the type of rindex from long to size_t has no cascading
refactor impact because it is only ever used to directly index other
arrays.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
size_t is used because nreff describes the number of elements in memory
for rindex.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
size_t is used because nrec describes the number of elements for both
recs, and for 'changed' + 2.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ha field is serving two different purposes, which makes the code
harder to read. At first glance, it looks like many places assume
there could never be hash collisions between lines of the two input
files. In reality, line_hash is used together with xdl_recmatch() to
ensure correct comparisons of lines, even when collisions occur.
To make this clearer, the old ha field has been split:
* line_hash: a straightforward hash of a line, independent of any
external context. Its type is uint64_t, as it comes from a fixed
width hash function.
* minimal_perfect_hash: Not a new concept, but now a separate
field. It comes from the classifier's general-purpose hash table,
which assigns each line a unique and minimal hash across the two
files. A size_t is used here because it's meant to be used to
index an array. This also avoids ` as usize` casts on the Rust
side when using it to index a slice.
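A minimal sketch of the resulting field layout; every name other than the
two fields above is assumed, and the real xrecord_t carries more members:

    struct xrecord_sketch {
        const uint8_t *ptr;           /* start of the line in memory */
        size_t size;                  /* length of the line in bytes */
        uint64_t line_hash;           /* context-free hash of the line */
        size_t minimal_perfect_hash;  /* classifier-assigned id, used to index arrays */
    };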
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the function signature and body to use unambiguous types. char
is changed to uint8_t because this function processes bytes in memory.
unsigned long is changed to uint64_t so that the hash output is
consistent across platforms. `flags` was changed from long to uint64_t
to ensure the
high order bits are not dropped on platforms that treat long as 32
bits.
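A hedged sketch of what the converted prototype could look like (the
exact parameter names and const placement in xutils.h may differ):

    uint64_t xdl_hash_record(const uint8_t **data, const uint8_t *top,
                             uint64_t flags);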
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
size_t is the appropriate type because size is describing the number of
elements, bytes in this case, in memory.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make xrecord_t.ptr point to uint8_t because it refers to bytes in memory.
In order to avoid a refactor avalanche, many uses of this field were
cast to char* or similar.
Places where casting was unnecessary:
xemit.c:156
xmerge.c:124
xmerge.c:127
xmerge.c:164
xmerge.c:169
xmerge.c:172
xmerge.c:178
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
ptrdiff_t is appropriate for dstart and dend because they both describe
positive or negative offsets relative to a pointer.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Document other nuances when crossing the FFI boundary. Other language
mappings may be added in the future.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new flag `--all` to git-repo-info for requesting values for all
the available keys. By using this flag, the user can retrieve all the
values without having to figure out which keys are needed for what
they want.
Helped-by: Karthik Nayak <karthik.188@gmail.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the field printing in git-repo-info to a new function called
`print_field`, allowing it to be called by functions other than
`print_fields`.
Also change its use of quote_c_style() helper to output directly to
the standard output stream, instead of taking a result in a strbuf
and then printing it ourselves.
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a worktree path contains newlines or other control characters
it messes up the output of "git worktree list". Fix this by using
quote_path() to display the worktree path. The output of "git worktree
list" is designed for human consumption, scripts should be using the
"--porcelain" option so this change should not break them.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The output of "git worktree list" displays a table containing the
worktree path, HEAD OID and branch name for each worktree. The code
aligns the columns by measuring the visual width of the worktree path
when it is printed. Unfortunately it fails to use the visual width
when calculating the width of the column so, if any of the paths
contain a multibyte character, we can end up with excess padding
between columns. The simplest fix would be to replace strlen() with
utf8_strwidth() in measure_widths(). However that leaves us measuring
the visual width twice and the byte length once. By caching the visual
width and printing the padding separately to the worktree path, we only
need to calculate the visual width once and do not need the byte length
at all. The visual widths are stored in an array of structs rather
than an array of ints as the next commit will add more struct members.
Even if there are no multibyte characters in any of the paths we still
print an extra space between the path and the object id as the field
width is calculated as one plus the length of the path and we print an
explicit space as well. This is fixed by not printing the extra space.
The tests are updated to include multibyte characters in one of the
worktree paths and to check the spacing of the columns.
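A minimal sketch, with hypothetical struct and field names, of caching
the visual width next to each path so that utf8_strwidth() runs only once
and the padding is printed separately from the path:

    struct worktree_row {
        const char *path;
        int path_width;    /* cached result of utf8_strwidth(path) */
    };

    static void print_aligned(const struct worktree_row *row, int max_width)
    {
        /* print the path, then pad based on its cached visual width */
        printf("%s%*s", row->path, max_width - row->path_width + 1, "");
    }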
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We test xmkstemp() in our helper by just calling:
    xmkstemp(xstrdup(argv[1]));
This leaks both the copied string as well as the descriptor returned by
the function. In practice this isn't a big deal, since we immediately
exit the program, but:
1. LSan will complain about the memory leak. The only reason we did
not notice this in our leak-checking builds is that both of the
callers in the test suite (both in t0070) pass a broken template
(and expect failure). So the function calls die() before we can
actually leak.
But it's an accident waiting to happen if anybody adds a call which
succeeds.
2. Coverity complains about the descriptor leak. There's a long list
of uninteresting or false positives in Coverity's results, but
since we're here we might as well fix it, too.
I didn't bother adding a new test that triggers the leak. It's not even
in real production code, but just in the test-helper itself.
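A sketch of what plugging both leaks could look like, keeping the
helper's one-shot structure:

    char *path = xstrdup(argv[1]);
    int fd = xmkstemp(path);   /* still dies on a broken template */

    close(fd);    /* release the descriptor for Coverity's sake ... */
    free(path);   /* ... and the copied template for LSan's */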
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The GitHub windows-meson-test jobs directly run "meson test" with the
--slice option. This means they skip all of the ci/lib.sh
infrastructure, and in particular:
1. They do not actually set any GIT_TEST_OPTS like --verbose-log or
-x.
2. They do not do the usual handle_failed_tests() magic to print test
failures or tar up failed directories.
As a result, you get almost no feedback at all when a test fails in this
job, making debugging rather tricky.
Let's try to make this behave more like the other CI jobs. Because we're
on Windows, we can't just use the normal run-build-and-tests.sh script.
Our build runs as a separate job (like the non-meson Windows job), and
then we parallelize the tests across several job slices. So we need
something like the run-test-slice.sh script that the "windows-test" job
uses.
In theory we could just swap out the "make" invocation there for
"meson". But it doesn't quite work, because "make" knows how to pull
GIT_TEST_OPTS out of GIT-BUILD-OPTIONS automatically. But for meson, we
have to extract them into the --test-args option ourselves. I tried
making the logic in run-test-slice.sh conditional, but there ended up
being hardly any common code at all (and there are some tricky ordering
constraints). So I ended up adding a new meson-specific test-slice runner.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the same spirit as 9faf3963b6 (t: introduce compatibility options to
clar-based tests, 2024-12-13), we should ignore --no-chain-lint passed
to our clar tests, since it may appear in GIT_TEST_OPTS to be used with
other tests.
This is particularly important on Windows CI, where --no-chain-lint is
added to the test options by default, and the meson build will pass all
options to the unit tests. The only reason our meson Windows CI job does
not run into this currently is that it is not respecting GIT_TEST_OPTS
at all! So ignoring this option is a prerequisite to fixing that
situation.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
ASan has an option to enable strict string checking, where any pointer
passed to a function that expects a NUL-terminated string will be
checked for that NUL termination. This can sometimes produce false
positives. E.g., it is not wrong to pass a buffer with { '1', '2', '\n' }
into strtoul(). Even though it is not NUL-terminated, it will stop at
the newline.
But in trying it out, it identified two problematic spots in our test
suite (which have now been adjusted):
1. The strtol() parsing in cache-tree.c was a real potential problem,
which would have been very hard to find otherwise (since it
required constructing a very specific broken index file).
2. The uses of string functions in fsck_ident() were false positives,
because we knew that there was always a trailing newline which
would stop the functions from reading off the end of the buffer.
But the reasoning behind that is somewhat fragile, and silencing
those complaints made the code easier to reason about.
So even though this did not find any earth-shattering bugs, and even had
a few false positives, I'm sufficiently convinced that its complaints
are more helpful than hurtful. Let's turn it on by default (since the
test suite now runs cleanly with it) and see if it ever turns up any
other instances.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In fsck_ident(), we parse the timestamp with parse_timestamp(), which is
really an alias for strtoumax(). But since our buffer may not be
NUL-terminated, this can trigger a complaint from ASan's
strict_string_checks mode. This is a false positive, since we know that
the buffer contains a trailing newline (which we checked earlier in the
function), and that strtoumax() would stop there.
But it is worth working around ASan's complaint. One reason is that it
will let us turn on strict_string_checks by default, which has helped
catch other real problems. The other is that the safety of the current
code is very hard to reason about (it subtly depends on distant code
which could change).
One option here is to just parse the number left-to-right ourselves. But
we care about the size of a timestamp_t and detecting overflow, since
that's part of the point of these checks. And doing that correctly is
tricky. So we'll instead just pull the digits into a separate,
NUL-terminated buffer, and use that to call parse_timestamp().
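A minimal sketch of that approach, with a hypothetical helper name (the
real fsck.c code may size and validate the buffer differently):

    static timestamp_t parse_bounded_timestamp(const char *p, const char *end)
    {
        char buf[64];
        size_t len = 0;

        /* copy the digit run into a NUL-terminated scratch buffer */
        while (p < end && *p >= '0' && *p <= '9' && len < sizeof(buf) - 1)
            buf[len++] = *p++;
        buf[len] = '\0';
        return parse_timestamp(buf, NULL, 10);
    }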
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After calling "parse_timestamp(p, &end, 10)", we complain if "p == end",
which would imply that we did not see any digits at all. But we know
this cannot be the case, since we would have bailed already if we did
not see any digits, courtesy of extra checks added by 8e4309038f (fsck:
do not assume NUL-termination of buffers, 2023-01-19). Since then,
checking "p == end" is redundant and we can drop it.
This will make our lives a little easier as we refactor further.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We may be operating on a buffer that is not NUL-terminated, but we use
strcspn() to parse it. This is OK in practice, as discussed in
8e4309038f (fsck: do not assume NUL-termination of buffers, 2023-01-19),
because we know there is at least a trailing newline in our buffer, and
we always pass "\n" to strcspn(). So we know it will stop before running
off the end of the buffer.
But this is a subtle point to hang our memory safety hat on. And it
confuses ASan's strict_string_checks mode, even though it is technically
a false positive (that mode complains that we have no NUL, which is
true, but it does not know that we have verified the presence of the
newline already).
Let's instead open-code the loop. As a bonus, this makes the logic more
obvious (to my mind, anyway). The current code skips forward with
strcspn until it hits "<", ">", or "\n". But then it must check which it
saw to decide if that was what we expected or not, duplicating some
logic between what's in the strcspn() and what's in the domain logic.
Instead, we can just check each character as we loop and act on it
immediately.
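A hedged sketch of such an open-coded scan (hypothetical helper, not the
exact fsck.c code):

    static const char *find_email_start(const char *p)
    {
        /* walk up to the newline we already know is present */
        while (*p != '\n') {
            if (*p == '<')
                return p;     /* start of the email part */
            if (*p == '>')
                return NULL;  /* '>' before '<' is malformed */
            p++;
        }
        return NULL;          /* no email on this line */
    }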
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fsck code purports to handle buffers that are not NUL-terminated,
but fsck_ident() uses some string functions. This works OK in practice,
as explained in 8e4309038f (fsck: do not assume NUL-termination of
buffers, 2023-01-19). Before calling fsck_ident() we'll have called
verify_headers(), which makes sure we have at least a trailing newline.
And none of our string-like functions will walk past that newline.
However, that makes this code at the top of fsck_ident() very confusing:
    *ident = strchrnul(*ident, '\n');
    if (**ident == '\n')
        (*ident)++;
We should always see that newline, or our memory safety assumptions have
been violated! Further, using strchrnul() is weird, since the whole
point is that if the newline is not there, we don't necessarily have a
NUL at all, and might read off the end of the buffer.
So let's have callers pass in the boundary of our buffer, which lets us
safely find the newline with memchr(). And if it is not there, this is a
BUG(), because it means our caller did not validate the input with
verify_headers() as it was supposed to (and we are better off bailing
rather than having memory-safety problems).
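A sketch of the shape of that change, with the boundary pointer name
assumed:

    const char *eol = memchr(*ident, '\n', end - *ident);

    if (!eol)
        BUG("fsck_ident() called on a buffer without a trailing newline");
    *ident = eol + 1;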
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A cache-tree extension entry in the index looks like this:
<name> NUL <entry_nr> SPACE <subtree_nr> NEWLINE <binary_oid>
where the "_nr" items are human-readable base-10 ASCII. We parse them
with strtol(), even though we do not have a NUL-terminated string (we'd
generally have an mmap() of the on-disk index file). For a well-formed
entry, this is not a problem; strtol() will stop when it sees the
newline. But there are two problems:
1. A corrupted entry could omit the newline, causing us to read
further. You'd mostly get stopped by seeing non-digits in the oid
field (and if it is likewise truncated, there will still be 20 or
more bytes of the index checksum). So it's possible, though
unlikely, to read off the end of the mmap'd buffer. Of course a
malicious index file can fake the oid and the index checksum to all
(ASCII) 0's.
This is further complicated by the fact that mmap'd buffers tend to
be zero-padded up to the page boundary. So to run off the end, the
index size also has to be a multiple of the page size. This is also
unlikely, though you can construct a malicious index file that
matches this.
The security implications aren't too interesting. The index file is
a local file anyway (so you can't attack somebody by cloning, but
only if you convince them to operate in a .git directory you made,
at which point attacking .git/config is much easier). And it's just
a read overflow via strtol(), which is unlikely to buy you much
beyond a crash.
2. ASan has a strict_string_checks option, which tells it to make sure
that options to string functions (like strtol) have some eventual
NUL, without regard to what the function would actually do (like
stopping at a newline here). This option sometimes has false
positives, but it can point to sketchy areas (like this one) where
the input we use doesn't exhibit a problem, but different input
_could_ cause us to misbehave.
Let's fix it by just parsing the values ourselves with a helper function
that is careful not to go past the end of the buffer. There are a few
behavior changes here that should not matter:
- We do not consider overflow, as strtol() would. But nor did the
original code. However, we don't trust the value we get from the
on-disk file, and if it says to read 2^30 entries, we would notice
that we do not have that many and bail before reading off the end of
the buffer.
- Our helper does not skip past extra leading whitespace as strtol()
would, but according to gitformat-index(5) there should not be any.
- The original quit parsing at a newline or a NUL byte, but now we
insist on a newline (which is what the documentation says, and what
Git has always produced).
Since we are providing our own helper function, we can tweak the
interface a bit to make our lives easier. The original code does not use
strtol's "end" pointer to find the end of the parsed data, but rather
uses a separate loop to advance our "buf" pointer to the trailing
newline. We can instead provide a helper that advances "buf" as it
parses, letting us read strictly left-to-right through the buffer.
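A minimal sketch of such a helper, with a hypothetical name (the real one
presumably reports corruption through the usual cache-tree error path):

    static int parse_count(const char **buf, const char *end, int *out)
    {
        const char *p = *buf;
        int value = 0;

        if (p == end || *p < '0' || *p > '9')
            return -1;    /* no digits at all */
        while (p < end && *p >= '0' && *p <= '9')
            value = value * 10 + (*p++ - '0');  /* overflow not detected, as before */
        *buf = p;         /* advance strictly left-to-right */
        *out = value;
        return 0;
    }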
I didn't add a new test here. It's surprisingly difficult to construct
an index of exactly the right size due to the way we pad entries. But it
is easy to trigger the problem in existing tests when using ASan's
strict string checking, coupled with a recent change to use NO_MMAP with
ASan builds. So:
make SANITIZE=address
cd t
ASAN_OPTIONS=strict_string_checks=1 ./t0090-cache-tree.sh
triggers it reliably. Technically it is not deterministic because there
is ~8% chance (it's 1-(255/256)^20, or ^32 for sha256) that the trailing
checksum hash has a NUL byte in it. But we compute enough cache-trees in
the course of that script that we are very likely to hit the problem in
one of them.
We can look at making strict_string_checks the default for ASan builds,
but there are some other cases we'd want to fix first.
Reported-by: correctmost <cmlists@sent.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git often uses mmap() to access on-disk files. This leaves a blind spot
in our SANITIZE=address builds, since ASan does not seem to handle mmap
at all. Nor does the OS notice most out-of-bounds access, since it tends
to round up to the nearest page size (so depending on how big the map
is, you might have to overrun it by up to 4095 bytes to trigger a
segfault).
The previous commit demonstrates a memory bug that we missed. We could
have made a new test where the out-of-bounds access was much larger, or
where the mapped file ended closer to a page boundary. But the point of
running the test suite with sanitizers is to catch these problems
without having to construct specific tests.
Let's enable NO_MMAP for our ASan builds by default, which should give
us better coverage. This does increase the memory usage of Git, since
we're copying from the filesystem into heap. But the repositories in the
test suite tend to be small, so the overhead isn't really noticeable
(and ASan already has quite a performance penalty).
There are a few other known bugs that this patch will help flush out.
However, they aren't directly triggered in the test suite (yet). So
it's safe to turn this on now without breaking the test suite, which
will help us add new tests to demonstrate those other bugs as we fix
them.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a bitmap has a name-hash cache, it is an array of 32-bit integers,
one per entry in the bitmap, which we've mmap'd from the .bitmap file.
We access it directly like this:
    if (bitmap_git->hashes)
        hash = get_be32(bitmap_git->hashes + index_pos);
That works for both regular pack bitmaps and for non-incremental midx
bitmaps. There is one bitmap_index with one "hashes" array, and
index_pos is within its bounds (we do the bounds-checking when we load
the bitmap).
But for an incremental midx bitmap, we have a linked list of
bitmap_index structs, and each one has only its own small slice of the
name-hash array. If index_pos refers to an object that is not in the
first bitmap_git of the chain, then we'll access memory outside of the
bounds of its "hashes" array, and often outside of the mmap.
Instead, we should walk through the list until we find the bitmap_index
which serves our index_pos, and use its hash (after adjusting index_pos
to make it relative to the slice we found). This is exactly what we do
elsewhere for incremental midx lookups (like the pack_pos_to_midx() call
a few lines above). But we can't use existing helpers like
midx_for_object() here, because we're walking through the chain of
bitmap_index structs (each of which refers to a midx), not the chain of
incremental multi_pack_index structs themselves.
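A rough sketch of that walk, under assumed field names (num_objects_in_base
counting all objects in earlier layers, and base pointing at the previous
layer):

    struct bitmap_index *b = bitmap_git;

    /* find the layer whose slice of the name-hash cache covers index_pos */
    while (b && index_pos < b->num_objects_in_base)
        b = b->base;
    if (b && b->hashes)
        hash = get_be32(b->hashes + (index_pos - b->num_objects_in_base));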
The problem is triggered in the test suite, but we don't get a segfault
because the out-of-bounds index is too small. The OS typically rounds
our mmap up to the nearest page size, so we just end up accessing some
extra zero'd memory. Nor do we catch it with ASan, since it doesn't seem
to instrument mmaps at all. But if we build with NO_MMAP, then our maps
are replaced with heap allocations, which ASan does check. And so:
make NO_MMAP=1 SANITIZE=address
cd t
./t5334-incremental-multi-pack-index.sh
does show the problem (and this patch makes it go away).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our mmap compat code emulates mapping by using malloc/free. Our
git_munmap() must take a "length" parameter to match the interface of
munmap(), but we don't use it (it is up to the allocator to know how big
the block is in free()).
Let's mark it as UNUSED to avoid complaints from -Wunused-parameter.
Otherwise you cannot build with "make DEVELOPER=1 NO_MMAP=1".
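For illustration, the change amounts to something like this (assuming the
emulation simply hands the block back to free()):

    int git_munmap(void *start, size_t length UNUSED)
    {
        free(start);   /* the allocator already knows the block's size */
        return 0;
    }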
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The pattern `return errno = ..., -1;` is observed several times in
`compat/mingw.c`. It has served us well over the years, but now clang
starts complaining:
compat/mingw.c:723:24: error: possible misuse of comma operator here [-Werror,-Wcomma]
723 | return errno = ENOSYS, -1;
| ^
See for example this failing workflow run:
https://github.com/git-for-windows/git-sdk-arm64/actions/runs/15457893907/job/43513458823#step:8:201
Let's appease clang (and also reduce the use of the no longer common
comma operator).
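For illustration, the rewrite turns the comma-operator form into two
plain statements:

    /* before: clang's -Wcomma flags the comma operator */
    return errno = ENOSYS, -1;

    /* after: same effect, no comma operator */
    errno = ENOSYS;
    return -1;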
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the `en/make-libgit-a` topic branch, more precisely in the commits
f3b4c89d59 (make: delete REFTABLE_LIB, add reftable to LIB_OBJS,
2025-10-02) and cf680cdb95 (make: delete XDIFF_LIB, add xdiff to
LIB_OBJS, 2025-10-02), the strategy to build three static libraries was
rethought, and instead only one static library is now built.
This is good.
However, the CMake definition was not changed accordingly, and now
CMake-based builds fail thusly:
[...]
Generating hook-list.h
CMake Error at CMakeLists.txt:122 (string):
string sub-command REPLACE requires at least four arguments.
Call Stack (most recent call first):
CMakeLists.txt:711 (parse_makefile_for_sources)
CMake Error at CMakeLists.txt:122 (string):
string sub-command REPLACE requires at least four arguments.
Call Stack (most recent call first):
CMakeLists.txt:717 (parse_makefile_for_sources)
-- Configuring incomplete, errors occurred!
Fix that by removing the parts that still expect the reftable and xdiff
objects to be defined separately in the Makefile.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>