git-mirror

mirror of https://github.com/git/git.git synced 2025-12-12 20:36:24 +01:00

Author	SHA1	Message	Date
Karthik Nayak	f6c5ca387a	refs: add a `optimize_required` field to `struct ref_storage_be` To allow users of the refs namespace to check if the reference backend requires optimization, add a new field `optimize_required` field to `struct ref_storage_be`. This field is of type `optimize_required_fn` which is also introduced in this commit. Modify the debug, files, packed and reftable backend to implement this field. A following commit will expose this via 'git pack-refs' and 'git refs optimize'. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-11-10 09:28:48 -08:00
Karthik Nayak	2cd99d9841	refs: rename 'pack_refs_opts' to 'refs_optimize_opts' The previous commit removed all references to 'pack_refs()' within the refs subsystem. Continue this cleanup by also renaming 'pack_refs_opts' to 'refs_optimize_opts' and the respective flags accordingly. Keeping the naming consistent will make the code easier to maintain. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-11-04 07:35:12 -08:00
Karthik Nayak	9b93ab8a9c	refs: move to using the '.optimize' functions The `struct ref_store` variable exposes two ways to optimize a reftable backend: 1. pack_refs 2. optimize The former was specific to the 'files' + 'packed' refs backend. The latter is more generic and covers all backends. While the naming is different, both of these functions perform the same functionality. Consolidate this code to only maintain the 'optimize' functions. Do this by modifying the backends so that they exclusively implement the `optimize` callback, only. All users of the refs subsystem already use the 'optimize' function so there is no changes needed on the callee side. Finally, cleanup all references to the 'pack_refs' field of the structure and code around it. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-11-04 07:35:12 -08:00
Patrick Steinhardt	705114772e	refs: drop infrastructure to peel via iterators Now that the peeled object ID gets propagated via the `struct reference` there is no need anymore to call into the reference iterator itself to dereference an object. Remove this infrastructure. Most of the changes are straight-forward deletions of code. There is one exception though in `refs/packed-backend.c::write_with_updates()`. Here we stop peeling the iterator and instead just pass the peeled object ID of that iterator directly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-11-04 07:32:25 -08:00
Patrick Steinhardt	5a5c7359f7	refs: drop `current_ref_iter` hack In preceding commits we have refactored all callers of `peel_iterated_oid()` to instead use `reference_get_peeled_oid()`. This allows us to thus get rid of the former function. Getting rid of that function is nice, but even nicer is that this also allows us to get rid of the `current_ref_iter` hack. This global variable tracked the currently-active ref iterator so that we can use it to peel an object ID. Now that the peeled object ID is propagated via `struct reference` though we don't have to depend on this hack anymore, which makes for a more robust and easier-to-understand infrastructure. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-11-04 07:32:25 -08:00
Patrick Steinhardt	89baa52da6	refs: introduce `.ref` field for the base iterator The base iterator has a couple of fields that tracks the name, target, object ID and flags for the current reference. Due to this design we have to create a new `struct reference` whenever we want to hand over that reference to the callback function, which is tedious and not very efficient. Convert the structure to instead contain a `struct reference` as member. This member is expected to be populated by the implementations of the iterator and is handed over to the callback directly. While at it, simplify `should_pack_ref()` to take a `struct reference` directly instead of passing its respective fields. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-11-04 07:32:25 -08:00
Meet Soni	8dfe077fb6	refs: add a generic 'optimize' API The existing `pack-refs` API is conceptually tied to the 'files' backend, but its behavior is generic (e.g., it triggers compaction for reftable). This naming is confusing. Introduce a new generic refs_optimize() API that dispatches to a backend-specific implementation via a new 'optimize' vtable method. This lays the architectural groundwork for different reference backends (like 'files' and 'reftable') to provide their own storage optimization logic, which will be called from a single, generic entry point. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Meet Soni <meetsoni3017@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-09-19 10:02:55 -07:00
Junio C Hamano	c3c8b6910a	Merge branch 'ps/reflog-migrate-fixes' "git refs migrate" to migrate the reflog entries from a refs backend to another had a handful of bugs squashed. * ps/reflog-migrate-fixes: refs: fix invalid old object IDs when migrating reflogs refs: stop unsetting REF_HAVE_OLD for log-only updates refs/files: detect race when generating reflog entry for HEAD refs: fix identity for migrated reflogs ident: fix type of string length parameter builtin/reflog: implement subcommand to write new entries refs: export `ref_transaction_update_reflog()` builtin/reflog: improve grouping of subcommands Documentation/git-reflog: convert to use synopsis type	2025-08-21 13:46:57 -07:00
Patrick Steinhardt	046c67325c	refs: stop unsetting REF_HAVE_OLD for log-only updates The `REF_HAVE_OLD` flag indicates whether a given ref update has its old object ID set. If so, the value of that field is used to verify whether the current state of the reference matches this expected state. It is thus an important part of mitigating races with a concurrent process that updates the same set of references. When writing reflogs though we explicitly unset that flag. This is a sensible thing to do: the old state of reflog entry updates may not necessarily match the current on-disk state of its accompanying ref, but it's only intended to signal what old object ID we want to write into the new reflog entry. For example when migrating refs we end up writing many reflog entries for a single reference, and most likely those reflog entries will have many different old object IDs. But unsetting this flag also removes a useful signal, namely that the caller _did_ provide an old object ID for a given reflog entry. This signal will become useful in a subsequent commit, where we add a new flag that tells the transaction to use the provided old and new object IDs to write a reflog entry. The `REF_HAVE_OLD` flag is then used as a signal to verify that the caller really did provide an old object ID. Stop unsetting the flag so that we can use it as this described signal in a subsequent commit. Skip checking the old object ID for log-only updates so that we don't expect it to match the current on-disk state. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-08-06 07:36:31 -07:00
Karthik Nayak	2b4648b919	refs: selectively set prefix in the seek functions The ref iterator exposes a `ref_iterator_seek()` function. The name suggests that this would seek the iterator to a specific reference in some ways similar to how `fseek()` works for the filesystem. However, the function actually sets the prefix for refs iteration. So further iteration would only yield references which match the particular prefix. This is a bit confusing. Let's add a 'flags' field to the function, which when set with the 'REF_ITERATOR_SEEK_SET_PREFIX' flag, will set the prefix for the iteration in-line with the existing behavior. Otherwise, the reference backends will simply seek to the specified reference and clears any previously set prefix. This allows users to start iteration from a specific reference. In the packed and reftable backend, since references are available in a sorted list, the changes are simply setting the prefix if needed. The changes on the files-backend are a little more involved, since the files backend uses the 'ref-cache' mechanism. We move out the existing logic within `cache_ref_iterator_seek()` to `cache_ref_iterator_set_prefix()` which is called when the 'REF_ITERATOR_SEEK_SET_PREFIX' flag is set. We then parse the provided seek string and set the required levels and their indexes to ensure that seeking is possible. Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-07-15 11:54:20 -07:00
Karthik Nayak	6bde5d43b7	refs: expose `ref_iterator` via 'refs.h' The `ref_iterator` is an internal structure to the 'refs/' sub-directory, which allows iteration over refs. All reference iteration is built on top of these iterators. External clients of the 'refs' subsystem use the various 'refs_for_each...()' functions to iterate over refs. However since these are wrapper functions, each combination of functionality requires a new wrapper function. This is not feasible as the functions pile up with the increase in requirements. Expose the internal reference iterator, so advanced users can mix and match options as needed. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-07-15 11:54:19 -07:00
Karthik Nayak	31726bb90d	refs: support rejection in batch updates during F/D checks The `refs_verify_refnames_available()` is used to batch check refnames for F/D conflicts. While this is the more performant alternative than its individual version, it does not provide rejection capabilities on a single update level. For batched updates, this would mean a rejection of the entire transaction whenever one reference has a F/D conflict. Modify the function to call `ref_transaction_maybe_set_rejected()` to check if a single update can be rejected. Since this function is only internally used within 'refs/' and we want to pass in a `struct ref_transaction *` as a variable. We also move and mark `refs_verify_refnames_available()` to 'refs-internal.h' to be an internal function. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-04-08 07:57:21 -07:00
Karthik Nayak	23fc8e4f61	refs: implement batch reference update support Git supports making reference updates with or without transactions. Updates with transactions are generally better optimized. But transactions are all or nothing. This means, if a user wants to batch updates to take advantage of the optimizations without the hard requirement that all updates must succeed, there is no way currently to do so. Particularly with the reftable backend where batching multiple reference updates is more efficient than performing them sequentially. Introduce batched update support with a new flag, 'REF_TRANSACTION_ALLOW_FAILURE'. Batched updates while different from transactions, use the transaction infrastructure under the hood. When enabled, this flag allows individual reference updates that would typically cause the entire transaction to fail due to non-system-related errors to be marked as rejected while permitting other updates to proceed. System errors referred by 'REF_TRANSACTION_ERROR_GENERIC' continue to result in the entire transaction failing. This approach enhances flexibility while preserving transactional integrity where necessary. The implementation introduces several key components: - Add 'rejection_err' field to struct `ref_update` to track failed updates with failure reason. - Add a new struct `ref_transaction_rejections` and a field within `ref_transaction` to this struct to allow quick iteration over rejected updates. - Modify reference backends (files, packed, reftable) to handle partial transactions by using `ref_transaction_set_rejected()` instead of failing the entire transaction when `REF_TRANSACTION_ALLOW_FAILURE` is set. - Add `ref_transaction_for_each_rejected_update()` to let callers examine which updates were rejected and why. This foundational change enables batched update support throughout the reference subsystem. A following commit will expose this capability to users by adding a `--batch-updates` flag to 'git-update-ref(1)', providing both a user-facing feature and a testable implementation. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-04-08 07:57:20 -07:00
Karthik Nayak	76e760b999	refs: introduce enum-based transaction error types Replace preprocessor-defined transaction errors with a strongly-typed enum `ref_transaction_error`. This change: - Improves type safety and function signature clarity. - Makes error handling more explicit and discoverable. - Maintains existing error cases, while adding new error cases for common scenarios. This refactoring paves the way for more comprehensive error handling which we will utilize in the upcoming commits to add batch reference update support. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-04-08 07:57:20 -07:00
Karthik Nayak	4dfcf18089	refs/files: remove duplicate duplicates check Within the files reference backend's transaction's 'finish' phase, a verification step is currently performed wherein the refnames list is sorted and examined for multiple updates targeting the same refname. It has been observed that this verification is redundant, as an identical check is already executed during the transaction's 'prepare' stage. Since the refnames list remains unmodified following the 'prepare' stage, this secondary verification can be safely eliminated. The duplicate check has been removed accordingly, and the `ref_update_reject_duplicates()` function has been marked as static, as its usage is now confined to 'refs.c'. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-04-08 07:57:19 -07:00
Karthik Nayak	c3baddf04f	refs: move duplicate refname update check to generic layer Move the tracking of refnames in `affected_refnames` from individual backends into the generic layer in 'refs.c'. This centralizes the duplicate refname detection that was previously handled separately by each backend. Make some changes to accommodate this move: - Add a `string_list` field `refnames` to `ref_transaction` to contain all the references in a transaction. This field is updated whenever a new update is added via `ref_transaction_add_update`, so manual additions in reference backends are dropped. - Modify the backends to use this field internally as needed. The backends need to check if an update for refname already exists when splitting symrefs or adding an update for 'HEAD'. - In the reftable backend, within `reftable_be_transaction_prepare()`, move the `string_list_has_string()` check above `ref_transaction_add_update()`. Since `ref_transaction_add_update()` automatically adds the refname to `transaction->refnames`, performing the check after will always return true, so we perform the check before adding the update. This helps reduce duplication of functionality between the backends and makes it easier to make changes in a more centralized manner. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-04-08 07:57:18 -07:00
Patrick Steinhardt	82c39c6055	refs/iterator: provide infrastructure to re-seek iterators Reftable iterators need to be scrapped after they have either been exhausted or aren't useful to the caller anymore, and it is explicitly not possible to reuse them for iterations. But enabling for reuse of iterators may allow us to tune them by reusing internal state of an iterator. The reftable iterators for example can already be reused internally, but we're not able to expose this to any users outside of the reftable backend. Introduce a new `.seek` function in the ref iterator vtable that allows callers to seek an iterator multiple times. It is expected to be functionally the same as calling `refs_ref_iterator_begin()` with a different (or the same) prefix. Note that it is not possible to adjust parameters other than the seeked prefix for now, so exclude patterns, trimmed prefixes and flags will remain unchanged. We do not have a usecase for changing these parameters right now, but if we ever find one we can adapt accordingly. Implement the callback for trivial cases. The other iterators will be implemented in subsequent commits. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-03-12 11:31:19 -07:00
Patrick Steinhardt	cec2b6f55a	refs/iterator: separate lifecycle from iteration The ref and reflog iterators have their lifecycle attached to iteration: once the iterator reaches its end, it is automatically released and the caller doesn't have to care about that anymore. When the iterator should be released before it has been exhausted, callers must explicitly abort the iterator via `ref_iterator_abort()`. This lifecycle is somewhat unusual in the Git codebase and creates two problems: - Callsites need to be very careful about when exactly they call `ref_iterator_abort()`, as calling the function is only valid when the iterator itself still is. This leads to somewhat awkward calling patterns in some situations. - It is impossible to reuse iterators and re-seek them to a different prefix. This feature isn't supported by any iterator implementation except for the reftable iterators anyway, but if it was implemented it would allow us to optimize cases where we need to search for specific references repeatedly by reusing internal state. Detangle the lifecycle from iteration so that we don't deallocate the iterator anymore once it is exhausted. Instead, callers are now expected to always call a newly introduce `ref_iterator_free()` function that deallocates the iterator and its internal state. Note that the `dir_iterator` is somewhat special because it does not implement the `ref_iterator` interface, but is only used to implement other iterators. Consequently, we have to provide `dir_iterator_free()` instead of `dir_iterator_release()` as the allocated structure itself is managed by the `dir_iterator` interfaces, as well, and not freed by `ref_iterator_free()` like in all the other cases. While at it, drop the return value of `ref_iterator_abort()`, which wasn't really required by any of the iterator implementations anyway. Furthermore, stop calling `base_ref_iterator_free()` in any of the backends, but instead call it in `ref_iterator_free()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-03-12 11:31:18 -07:00
Karthik Nayak	e7c1b9f123	refs: use 'uint64_t' for 'ref_update.index' The 'ref_update.index' variable is used to store an index for a given reference update. This index is used to order the updates in a predetermined order, while the default ordering is alphabetical as per the refname. For large repositories with millions of references, it should be safer to use 'uint64_t'. Let's do that. This also is applied for all other code sections where we store 'index' and pass it around. Reported-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-01-22 09:51:36 -08:00
Junio C Hamano	0f3d8e2e46	Merge branch 'kn/reflog-migration-fix' into kn/reflog-migration-fix-followup * kn/reflog-migration-fix: reftable: write correct max_update_index to header	2025-01-17 15:42:58 -08:00
Karthik Nayak	bc67b4ab5f	reftable: write correct max_update_index to header In `297c09eabb` (refs: allow multiple reflog entries for the same refname, 2024-12-16), the reftable backend learned to handle multiple reflog entries within the same transaction. This was done modifying the `update_index` for reflogs with multiple indices. During writing the logs, the `max_update_index` of the writer was modified to ensure the limits were raised to the modified `update_index`s. However, since ref entries are written before the modification to the `max_update_index`, if there are multiple blocks to be written, the reftable backend writes the header with the old `max_update_index`. When all logs are finally written, the footer will be written with the new `min_update_index`. This causes a mismatch between the header and the footer and causes the reftable file to be corrupted. The existing tests only spawn a single block and since headers are lazily written with the first block, the tests didn't capture this bug. To fix the issue, the appropriate `max_update_index` limit must be set even before the first block is written. Add a `max_index` field to the transaction which holds the `max_index` within all its updates, then propagate this value to the reftable backend, wherein this is used to the set the `max_update_index` correctly. Add a test which creates a few thousand reference updates with multiple reflog entries, which should trigger the bug. Reported-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-01-15 09:12:09 -08:00
Junio C Hamano	6f8ae955bd	Merge branch 'kn/reflog-migration' "git refs migrate" learned to also migrate the reflog data across backends. * kn/reflog-migration: refs: mark invalid refname message for translation refs: add support for migrating reflogs refs: allow multiple reflog entries for the same refname refs: introduce the `ref_transaction_update_reflog` function refs: add `committer_info` to `ref_transaction_add_update()` refs: extract out refname verification in transactions refs/files: add count field to ref_lock refs: add `index` field to `struct ref_udpate` refs: include committer info in `ref_update` struct	2024-12-23 09:32:29 -08:00
Junio C Hamano	5f212684ab	Merge branch 'bf/set-head-symref' When "git fetch $remote" notices that refs/remotes/$remote/HEAD is missing and discovers what branch the other side points with its HEAD, refs/remotes/$remote/HEAD is updated to point to it. * bf/set-head-symref: fetch set_head: handle mirrored bare repositories fetch: set remote/HEAD if it does not exist refs: add create_only option to refs_update_symref_extended refs: add TRANSACTION_CREATE_EXISTS error remote set-head: better output for --auto remote set-head: refactor for readability refs: atomically record overwritten ref in update_symref refs: standardize output of refs_read_symbolic_ref t/t5505-remote: test failure of set-head t/t5505-remote: set default branch to main	2024-12-19 10:58:27 -08:00
Karthik Nayak	4483be36f4	refs: add `committer_info` to `ref_transaction_add_update()` The `ref_transaction_add_update()` creates the `ref_update` struct. To facilitate addition of reflogs in the next commit, the function needs to accommodate setting the `committer_info` field in the struct. So modify the function to also take `committer_info` as an argument and set it accordingly. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-12-16 09:45:33 -08:00
Karthik Nayak	a3582e2eac	refs: add `index` field to `struct ref_udpate` The reftable backend, sorts its updates by refname before applying them, this ensures that the references are stored sorted. When migrating reflogs from one backend to another, the order of the reflogs must be maintained. Add a new `index` field to the `ref_update` struct to facilitate this. This field is used in the reftable backend's sort comparison function `transaction_update_cmp`, to ensure that indexed fields maintain their order. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-12-16 09:45:32 -08:00
Karthik Nayak	1a83e26d72	refs: include committer info in `ref_update` struct The reference backends obtain the committer information from `git_committer_info(0)` when adding a reflog. The upcoming patches introduce support for migrating reflogs between the reference backends. This requires an interface to creating reflogs, including custom committer information. Add a new field `committer_info` to the `ref_update` struct, which is then used by the reference backends. If there is no `committer_info` provided, the reference backends default to using `git_committer_info(0)`. The field itself cannot be set to `git_committer_info(0)` since the values are dynamic and must be obtained right when the reflog is being committed. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-12-16 09:45:32 -08:00
Junio C Hamano	57e81b59f3	Merge branch 'sj/ref-contents-check' "git fsck" learned to issue warnings on "curiously formatted" ref contents that have always been taken valid but something Git wouldn't have written itself (e.g., missing terminating end-of-line after the full object name). * sj/ref-contents-check: ref: add symlink ref content check for files backend ref: check whether the target of the symref is a ref ref: add basic symref content check for files backend ref: add more strict checks for regular refs ref: port git-fsck(1) regular refs check for files backend ref: support multiple worktrees check for refs ref: initialize ref name outside of check functions ref: check the full refname instead of basename ref: initialize "fsck_ref_report" with zero	2024-12-04 10:14:42 +09:00
Bence Ferdinandy	8102d10ff8	refs: standardize output of refs_read_symbolic_ref When the symbolic reference we want to read with refs_read_symbolic_ref is actually not a symbolic reference, the files and the reftable backends return different values (1 and -1 respectively). Standardize the returned values so that 0 is success, -1 is a generic error and -2 is that the reference was actually non-symbolic. Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-11-25 11:46:35 +09:00
shejialuo	1c0e2a0019	ref: add more strict checks for regular refs We have already used "parse_loose_ref_contents" function to check whether the ref content is valid in files backend. However, by using "parse_loose_ref_contents", we allow the ref's content to end with garbage or without a newline. Even though we never create such loose refs ourselves, we have accepted such loose refs. So, it is entirely possible that some third-party tools may rely on such loose refs being valid. We should not report an error fsck message at current. We should notify the users about such "curiously formatted" loose refs so that adequate care is taken before we decide to tighten the rules in the future. And it's not suitable either to report a warn fsck message to the user. We don't yet want the "--strict" flag that controls this bit to end up generating errors for such weirdly-formatted reference contents, as we first want to assess whether this retroactive tightening will cause issues for any tools out there. It may cause compatibility issues which may break the repository. So, we add the following two fsck infos to represent the situation where the ref content ends without newline or has trailing garbages: 1. refMissingNewline(INFO): A loose ref that does not end with newline(LF). 2. trailingRefContent(INFO): A loose ref has trailing content. It might appear that we can't provide the user with any warnings by using FSCK_INFO. However, in "fsck.c::fsck_vreport", we will convert FSCK_INFO to FSCK_WARN and we can still warn the user about these situations when using "git refs verify" without introducing compatibility issues. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-11-21 08:21:33 +09:00
shejialuo	7c78d819e6	ref: support multiple worktrees check for refs We have already set up the infrastructure to check the consistency for refs, but we do not support multiple worktrees. However, "git-fsck(1)" will check the refs of worktrees. As we decide to get feature parity with "git-fsck(1)", we need to set up support for multiple worktrees. Because each worktree has its own specific refs, instead of just showing the users "refs/worktree/foo", we need to display the full name such as "worktrees/<id>/refs/worktree/foo". So we should know the id of the worktree to get the full name. Add a new parameter "struct worktree *" for "refs-internal.h::fsck_fn". Then change the related functions to follow this new interface. The "packed-refs" only exists in the main worktree, so we should only check "packed-refs" in the main worktree. Use "is_main_worktree" method to skip checking "packed-refs" in "packed_fsck" function. Then, enhance the "files-backend.c::files_fsck_refs_dir" function to add "worktree/<id>/" prefix when we are not in the main worktree. Last, add a new test to check the refname when there are multiple worktrees to exercise the code. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-11-21 08:21:32 +09:00
Patrick Steinhardt	1c299d03e5	refs: introduce "initial" transaction flag There are two different ways to commit a transaction: - `ref_transaction_commit()` can be used to commit a regular transaction and is what almost every caller wants. - `initial_ref_transaction_commit()` can be used when it is known that the ref store that the transaction is committed for is empty and when there are no concurrent processes. This is used when cloning a new repository. Implementing this via two separate functions has a couple of downsides. First, every reference backend needs to implement a separate callback even in the case where they don't special-case the initial transaction. Second, backends are basically forced to reimplement the whole logic for how to commit the transaction like the "files" backend does, even though backends may wish to only tweak certain behaviour of a "normal" commit. Third, it is awkward that callers must never prepare the transaction as this is somewhat different than how a transaction typically works. Refactor the code such that we instead mark initial transactions via a separate flag when starting the transaction. This addresses all of the mentioned painpoints, where the most important part is that it will allow backends to have way more leeway in how exactly they want to handle the initial transaction. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-11-21 07:59:15 +09:00
Patrick Steinhardt	a0efef1446	refs: allow passing flags when setting up a transaction Allow passing flags when setting up a transaction such that the behaviour of the transaction itself can be altered. This functionality will be used in a subsequent patch. Adapt callers accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-11-21 07:59:14 +09:00
Junio C Hamano	b3d175409d	Merge branch 'sj/ref-fsck' "git fsck" infrastructure has been taught to also check the sanity of the ref database, in addition to the object database. * sj/ref-fsck: fsck: add ref name check for files backend files-backend: add unified interface for refs scanning builtin/refs: add verify subcommand refs: set up ref consistency check infrastructure fsck: add refs report function fsck: add a unified interface for reporting fsck messages fsck: make "fsck_error" callback generic fsck: rename objects-related fsck error functions fsck: rename "skiplist" to "skip_oids"	2024-08-16 12:51:51 -07:00
Junio C Hamano	e7f86cb69d	Merge branch 'jc/refs-symref-referent' The refs API has been taught to give symref target information to the users of ref iterators, allowing for-each-ref and friends to avoid an extra ref_resolve_* API call per a symbolic ref. * jc/refs-symref-referent: ref-filter: populate symref from iterator refs: add referent to each_ref_fn refs: keep track of unresolved reference value in iterators	2024-08-15 13:22:15 -07:00
John Cai	cfd971520e	refs: keep track of unresolved reference value in iterators Since ref iterators do not hold onto the direct value of a reference without resolving it, the only way to get ahold of a direct value of a symbolic ref is to make a separate call to refs_read_symbolic_ref. To make accessing the direct value of a symbolic ref more efficient, let's save the direct value of the ref in the iterators for both the files backend and the reftable backend. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-09 08:47:33 -07:00
shejialuo	ab6f79d8df	refs: set up ref consistency check infrastructure The "struct ref_store" is the base class which contains the "be" pointer which provides backend-specific functions whose interfaces are defined in the "ref_storage_be". We could reuse this polymorphism to define only one interface. For every backend, we need to provide its own function pointer. The interfaces defined in the `ref_storage_be` are carefully structured in semantic. It's organized as the five parts: 1. The name and the initialization interfaces. 2. The ref transaction interfaces. 3. The ref internal interfaces (pack, rename and copy). 4. The ref filesystem interfaces. 5. The reflog related interfaces. To keep consistent with the git-fsck(1), add a new interface named "fsck_refs_fn" to the end of "ref_storage_be". This semantic cannot be grouped into any above five categories. Explicitly add blank line to make it different from others. Last, implement placeholder functions for each ref backends. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-08 09:36:53 -07:00
Patrick Steinhardt	080b068ffb	refs/files: stop using `the_repository` in `parse_loose_ref_contents()` We implicitly rely on `the_repository` in `parse_loose_ref_contents()` by calling `parse_oid_hex()`. Convert the function to instead use `parse_oid_hex_algop()` and have callers pass in the hash algorithm to use. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-07-30 13:41:23 -07:00
Junio C Hamano	5f14d20984	Merge branch 'kn/update-ref-symref' "git update-ref --stdin" learned to handle transactional updates of symbolic-refs. * kn/update-ref-symref: update-ref: add support for 'symref-update' command reftable: pick either 'oid' or 'target' for new updates update-ref: add support for 'symref-create' command update-ref: add support for 'symref-delete' command update-ref: add support for 'symref-verify' command refs: specify error for regular refs with `old_target` refs: create and use `ref_update_expects_existing_old_ref()`	2024-06-20 15:45:12 -07:00
Karthik Nayak	aba381c090	refs: create and use `ref_update_expects_existing_old_ref()` The files and reftable backend, need to check if a ref must exist, so that the required validation can be done. A ref must exist only when the `old_oid` value of the update has been explicitly set and it is not the `null_oid` value. Since we also support symrefs now, we need to ensure that even when `old_target` is set a ref must exist. While this was missed when we added symref support in transactions, there are no active users of this path. As we introduce the 'symref-verify' command in the upcoming commits, it is important to fix this. So let's export this to a function called `ref_update_expects_existing_old_ref()` and expose it internally via 'refs-internal.h'. Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-07 10:25:44 -07:00
Patrick Steinhardt	64a6dd8ffc	refs: implement removal of ref storages We're about to introduce logic to migrate ref storages. One part of the migration will be to delete the files that are part of the old ref storage format. We don't yet have a way to delete such data generically across ref backends though. Implement a new `delete` callback and expose it via a new `ref_storage_delete()` function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-06 09:04:33 -07:00
Junio C Hamano	e55f364398	Merge branch 'ps/refs-without-the-repository-updates' into ps/ref-storage-migration * ps/refs-without-the-repository-updates: refs/packed: remove references to `the_hash_algo` refs/files: remove references to `the_hash_algo` refs/files: use correct repository refs: remove `dwim_log()` refs: drop `git_default_branch_name()` refs: pass repo when peeling objects refs: move object peeling into "object.c" refs: pass ref store when detecting dangling symrefs refs: convert iteration over replace refs to accept ref store refs: retrieve worktree ref stores via associated repository refs: refactor `resolve_gitlink_ref()` to accept a repository refs: pass repo when retrieving submodule ref store refs: track ref stores via strmap refs: implement releasing ref storages refs: rename `init_db` callback to avoid confusion refs: adjust names for `init` and `init_db` callbacks	2024-05-23 09:14:08 -07:00
Patrick Steinhardt	19c76e8235	refs: move object peeling into "object.c" Peeling an object has nothing to do with refs, but we still have the code in "refs.c". Move it over into "object.c", which is a more natural place to put it. Ideally, we'd also move `peel_iterated_oid()` over into "object.c". But this function is tied to the refs interfaces because it uses a global ref iterator variable to optimize peeling when the iterator already has the peeled object ID readily available. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-17 10:33:39 -07:00
Patrick Steinhardt	8378c9d27b	refs: convert iteration over replace refs to accept ref store The function `for_each_replace_ref()` is a bit of an oddball across the refs interfaces as it accepts a pointer to the repository instead of a pointer to the ref store. The only reason for us to accept a repository is so that we can eventually pass it back to the callback function that the caller has provided. This is somewhat arbitrary though, as callers that need the repository can instead make it accessible via the callback payload. Refactor the function to instead accept the ref store and adjust callers accordingly. This allows us to get rid of some of the boilerplate that we had to carry to pass along the repository and brings us in line with the other functions that iterate through refs. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-17 10:33:38 -07:00
Patrick Steinhardt	965f8991e5	refs: pass repo when retrieving submodule ref store Looking up submodule ref stores has two deficiencies: - The initialized subrepo will be attributed to `the_repository`. - The submodule ref store will be tracked in a global map. This makes it impossible to have submodule ref stores for a repository other than `the_repository`. Modify the function to accept the parent repository as parameter and move the global map into `struct repository`. Like this it becomes possible to look up submodule ref stores for arbitrary repositories. Note that this also adds a new reference to `the_repository` in `resolve_gitlink_ref()`, which is part of the refs interfaces. This will get adjusted in the next patch. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-17 10:33:37 -07:00
Patrick Steinhardt	71c871b48d	refs: implement releasing ref storages Ref storages are typically only initialized once for `the_repository` and then never released. Until now we got away with that without causing memory leaks because `the_repository` stays reachable, and because the ref backend is reachable via `the_repository` its memory basically never leaks. This is about to change though because of the upcoming migration logic, which will create a secondary ref storage. In that case, we will either have to release the old or new ref storage to avoid leaks. Implement a new `release` callback and expose it via a new `ref_storage_release()` function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-17 10:33:37 -07:00
Patrick Steinhardt	ed93ea1602	refs: rename `init_db` callback to avoid confusion Reference backends have two callbacks `init` and `init_db`. The similarity of these two callbacks has repeatedly confused me whenever I was looking at them, where I always had to look up which of them does what. Rename the `init_db` callback to `create_on_disk`, which should hopefully be clearer. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-17 10:33:36 -07:00
Karthik Nayak	4865707bda	refs: remove `create_symref` and associated dead code In the previous commits, we converted `refs_create_symref()` to utilize transactions to perform symref updates. Earlier `refs_create_symref()` used `create_symref()` to do the same. We can now remove `create_symref()` and any code associated with it which is no longer used. We remove `create_symref()` code from all the reference backends and also remove it entirely from the `ref_storage_be` struct. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-07 08:51:50 -07:00
Karthik Nayak	644daf7785	refs: add support for transactional symref updates The reference backends currently support transactional reference updates. While this is exposed to users via 'git-update-ref' and its '--stdin' mode, it is also used internally within various commands. However, we do not support transactional updates of symrefs. This commit adds support for symrefs in both the 'files' and the 'reftable' backend. Here, we add and use `ref_update_has_null_new_value()`, a helper function which is used to check if there is a new_value in a reference update. The new value could either be a symref target `new_target` or a OID `new_oid`. We also add another common function `ref_update_check_old_target` which will be used to check if the update's old_target corresponds to a reference's current target. Now transactional updates (verify, create, delete, update) can be used for: - regular refs - symbolic refs - conversion of regular to symbolic refs and vice versa This also allows us to expose this to users via new commands in 'git-update-ref' in the future. Note that a dangling symref update does not record a new reflog entry, which is unchanged before and after this commit. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-07 08:51:49 -07:00
Karthik Nayak	e9965ba477	refs: move `original_update_refname` to 'refs.c' The files backend and the reftable backend implement `original_update_refname` to obtain the original refname of the update. Move it out to 'refs.c' and only expose it internally to the refs library. This will be used in an upcoming commit to also introduce another common functionality for the two backends. We also rename the function to `ref_update_original_update_refname` to keep it consistent with the upcoming other 'ref_update_*' functions that'll be introduced. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-07 08:51:49 -07:00
Karthik Nayak	1bc4cc3fc4	refs: accept symref values in `ref_transaction_update()` The function `ref_transaction_update()` obtains ref information and flags to create a `ref_update` and add them to the transaction at hand. To extend symref support in transactions, we need to also accept the old and new ref targets and process it. This commit adds the required parameters to the function and modifies all call sites. The two parameters added are `new_target` and `old_target`. The `new_target` is used to denote what the reference should point to when the transaction is applied. Some functions allow this parameter to be NULL, meaning that the reference is not changed. The `old_target` denotes the value the reference must have before the update. Some functions allow this parameter to be NULL, meaning that the old value of the reference is not checked. We also update the internal function `ref_transaction_add_update()` similarly to take the two new parameters. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-07 08:51:49 -07:00

1 2 3 4

184 Commits