Commit Graph

1712 Commits

Author SHA1 Message Date
Laszlo Nagy b45d85e32e docs(meta): plan documentation restructure for agent workflows
Add plan.md capturing an eight-phase plan to reorganise the repo so
Claude Code (and humans) can find the right rule, write to the right
place, and keep documentation in sync with code.

The plan introduces a `docs/` parent for requirements and rationale,
single-source-of-truth files for configuration and CLI surface, sync
checks to catch drift, and a preflight phase plus recovery appendix
to make phase-by-phase execution safe.

Each subsequent commit on this branch should execute one phase.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 12:16:31 +00:00
Laszlo Nagy ead58fe113 chore: add tracking label for maintainer-authored work items
Distinguishes project work items the maintainer files against the
project board from user-reported enhancements. Used to keep
`enhancement` semantically clean (= user-reported idea).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 15:09:12 +00:00
Laszlo Nagy f492d34194 chore: add labels.yml as source of truth for issue and PR labels
Records the full label set (existing plus newly created area:* and
needs-repro labels) in a format consumable by label-sync actions.
Lets future label changes go through PR review rather than ad-hoc
gh label create calls.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 14:45:40 +00:00
Laszlo Nagy 26b955f621 chore: disable blank issues and link Discussions from issue picker
Forces new issues through the structured bug form so the load-bearing
fields are always captured. Adds a contact link to Discussions as the
escape hatch for questions and feature ideas.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 14:31:35 +00:00
Laszlo Nagy 035d17f989 chore: convert bug report template to structured issue form
Replace the legacy markdown template with a YAML issue form. Lifts
load-bearing fields (build tool, compiler, install method, OS, arch)
out of free-text "Additional context" prose into validated dropdowns
and required inputs, and adds explicit fields for the exact bear
command, expected vs actual output, and the RUST_LOG=debug log. The
pre-flight checklist is now enforced via required checkboxes, and a
top-of-form note routes general questions to Discussions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 14:24:44 +00:00
Laszlo Nagy fc7eb4ad91 recognition: probe cc/c++ to pick clang vs gcc on BSD/macOS hosts
`cc` and `c++` are GCC on most Linuxes but Clang on FreeBSD,
OpenBSD, NetBSD, DragonFly, and macOS. The regex defaulted them
to GCC, which corrupted the compilation database on those hosts
via wrong flag-arity tables (e.g. Clang's `-Xclang <arg>`
consumes the next argv slot, GCC's does not).

Recognition now runs `--version` lazily for these ambiguous
basenames, classifies by signature, and dispatches accordingly.
The probe is the sole classifier: gcc.yaml deliberately omits
`cc`/`c++`, so a failed probe returns NotRecognized rather than
guessing -- a missing entry is visible and debuggable, whereas a
wrongly-classified entry corrupts the database silently via
mismatched flag-arity tables (the bug this work exists to fix).

Layered design:
- CompilerRecognizer dispatches.
- CompilerProbe classifies. VersionProbe on Unix (hardened:
  closed stdin, process-group SIGKILL on timeout, LD_PRELOAD /
  DYLD_INSERT_LIBRARIES stripped); NoProbe on Windows where
  basenames are unambiguous and the Unix subprocess primitives
  the probe relies on aren't available.
- CachingProbe memoizes the probe's verdict per canonical path
  so each unique compiler is fork-exec'd at most once per
  process.

A user `compilers:` entry preempts the probe -- the sole
supported override.

Also simplifies the WrapperInterpreter that the probe work
exposed: replaces the cyclic Arc::new_cyclic +
OnceLock<Box<dyn Interpreter>> + Weak<dyn Interpreter> machinery
with a flat wrapper::unwrap() helper called inline from
CompilerInterpreter::recognize.

See requirements/recognition-ambiguous-name-probe.md for the spec.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
4.1.3
2026-05-03 14:00:59 +00:00
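
The classify-then-memoize layering described in fc7eb4ad91 can be sketched roughly as follows. This is an illustration, not the project's code: the `Family` enum, the signature substrings, and the closure-based probe are assumptions; only the names `CachingProbe` and the "probe at most once per canonical path" behaviour come from the commit message.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::sync::Mutex;

/// Compiler family decided by the probe (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Family {
    Gcc,
    Clang,
}

/// Classify `--version` output by signature.
/// Assumption: Clang builds print "clang version" and GCC prints an FSF
/// copyright line; the real signature list is not given in the commit.
pub fn classify_version_output(output: &str) -> Option<Family> {
    if output.contains("clang version") {
        Some(Family::Clang)
    } else if output.contains("Free Software Foundation") {
        Some(Family::Gcc)
    } else {
        // Unknown output: report "not recognized" instead of guessing.
        None
    }
}

/// Memoizes the inner probe's verdict per path, so each unique compiler
/// is probed at most once for the lifetime of this value.
pub struct CachingProbe<P: Fn(&Path) -> Option<Family>> {
    inner: P,
    cache: Mutex<HashMap<PathBuf, Option<Family>>>,
}

impl<P: Fn(&Path) -> Option<Family>> CachingProbe<P> {
    pub fn new(inner: P) -> Self {
        Self { inner, cache: Mutex::new(HashMap::new()) }
    }

    pub fn classify(&self, compiler: &Path) -> Option<Family> {
        // Key on the canonical path; fall back to the path as given when
        // it cannot be canonicalized (an assumption of this sketch).
        let key = compiler
            .canonicalize()
            .unwrap_or_else(|_| compiler.to_path_buf());
        let mut cache = self.cache.lock().unwrap();
        if let Some(verdict) = cache.get(&key) {
            return *verdict;
        }
        let verdict = (self.inner)(&key);
        cache.insert(key, verdict);
        verdict
    }
}
```

In this sketch a failed probe is cached as `None` too, so an unclassifiable `cc` is not re-executed on every event.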
Laszlo Nagy 9033e598fb intercept: drop Event wrapper, send Execution directly
The Event wrapper carried a pid field that no production code ever read.
The captured pid was the shim/wrapper's process id (not the compiler's),
so it was semantically misleading and only invited future misuse.

The TCP wire and on-disk JSON Lines format now carry bare Execution
objects rather than {"pid": N, "execution": {...}}. The trim invariant
(strip env vars irrelevant for compilation database generation) moved
from the deleted Event::new into ReporterOnTcp::report, so the
boundary that emits to the wire is the single place that enforces it.

The on-disk format break is hard: prior .events files will be skipped
on read with a warning per line. The output/intercept.rs module
already documents the format as not stable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 08:24:27 +00:00
Laszlo Nagy 5583e96e27 clang: rename CommandConverter::to_entries to convert
Inline the trivial convert_compiler_command pass-through into the
public method, rename it to convert, and use sut for the test
bindings. The name to_entries hid the type's single purpose behind
its return shape; convert names the operation directly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 06:56:40 +00:00
Laszlo Nagy a519f60390 clang: replace PathFormatter trait with resolver fn pointers
The trait existed only so converter tests could mock the formatter, but
the formatter has no side effects worth mocking. Drop the trait, the
auto-generated MockPathFormatter, the Box<dyn> indirection, and the
ConfigurablePathFormatter wrapper. CommandConverter now stores two
ResolveFn fn pointers directly; resolver_for(strategy) is a free factory
in path_format. Filesystem-touching tests stay in path_format (where the
real resolver behavior lives); converter tests inject synthetic resolver
fns and exercise control flow without the filesystem.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 13:02:13 +00:00
Laszlo Nagy c9a1344afa config: drop ValidationCollector in favor of free helpers
The collector struct was only ever used as a Vec with a fixed 0/1/N
collapse step. Replace with a free `collapse` helper plus a small
`extend_with` flattener used by `Main::validate`. Sub-validators that
produce at most one error (`Compiler`, `PathFormat`) now return their
error directly. Behavior and tests are unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 12:31:54 +00:00
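
A minimal sketch of the 0/1/N collapse that c9a1344afa describes. The helper names `collapse` and `extend_with` come from the commit message; the `ValidationError` enum and both signatures are assumptions made for illustration.

```rust
/// Hypothetical error type: one problem, or several grouped together.
#[derive(Debug, PartialEq)]
pub enum ValidationError {
    Single(String),
    Multiple(Vec<ValidationError>),
}

/// Collapse a list of errors: none is success, one is returned as-is,
/// several are wrapped in `Multiple`.
pub fn collapse(mut errors: Vec<ValidationError>) -> Result<(), ValidationError> {
    match errors.len() {
        0 => Ok(()),
        1 => Err(errors.remove(0)),
        _ => Err(ValidationError::Multiple(errors)),
    }
}

/// Flatten a sub-validator's result into the running list, so nested
/// `Multiple` values do not pile up.
pub fn extend_with(errors: &mut Vec<ValidationError>, result: Result<(), ValidationError>) {
    match result {
        Ok(()) => {}
        Err(ValidationError::Multiple(nested)) => errors.extend(nested),
        Err(single) => errors.push(single),
    }
}
```

A caller would run each sub-validator, feed its result through `extend_with`, and finish with `collapse(errors)`.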
Laszlo Nagy a2fcd51857 docs: explain the build pipeline per crate
Item #5 consolidated platform-checks but left the build pipeline
under-documented: contributors had to read three build.rs files
to find out who runs what, and the lld prerequisite was a silent
trap. This commit spreads that knowledge across CLAUDE.md files,
co-located with the crates each piece belongs to:

  - New platform-checks/CLAUDE.md: role, post-Item-#5 public API,
    recipe for adding a probe, scope boundary against the
    intercept-family list.
  - New bear-codegen/CLAUDE.md: build-time codegen role; YAML to
    OUT_DIR via include!(); snapshot-test contract.
  - New bear-completions/CLAUDE.md: why a separate crate
    (clap_complete cost), how it's actually invoked (distributor
    runs it; install.sh only picks up pre-generated files).
  - Top-level CLAUDE.md: short "Build pipeline" routing section
    pointing at the per-crate files; "Host requirements" calling
    out lld as a Linux-only prerequisite; routing table grew
    three entries.
  - bear/CLAUDE.md: replaced the narrow "Code generation"
    subsection with a "Build script" section that also covers
    INTERCEPT_LIBDIR validation and the rustc-env emissions
    consumed by installation.rs.
  - intercept-preload/CLAUDE.md: "Build script duties" listing
    the cc-shim build, exports list, and link directives, plus a
    pointer at src/c/shim.c as the source of truth for
    INTERCEPT_FAMILY.
  - integration-tests/CLAUDE.md: "Build script duties" describing
    the executable probes (single vs grouped cfgs) and the
    ccache-masquerade detection.

Side cleanup: dropped the dead `cargo:rustc-cfg=build_cdylib`
directive from intercept-preload/build.rs. It was emitted but
read by no source -- the existing comment claiming it forced
cdylib generation was misleading; cdylib production is decided
by Cargo.toml's crate-type, not by the cfg.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 11:57:36 +00:00
Laszlo Nagy 82574029a6 build: run platform-checks detection once per workspace build
`platform_checks::perform_system_checks()` was called from two
build scripts (intercept-preload and integration-tests), so cc
recompiled the same ~25 header / symbol probes twice on every
cold build. Move detection into platform-checks's own build.rs;
expose results as `DETECTED_HEADERS` / `DETECTED_SYMBOLS` /
`KNOWN_HEADERS` / `KNOWN_SYMBOLS` constants baked into the
library; add `emit_cfg()` / `emit_check_cfg()` for consumer
build scripts to replay as `cargo:rustc-cfg=` directives against
their own crate.

Cargo's build ordering guarantees the detection runs once before
any consumer's build.rs. Cold-build probe lines drop from 52 to
26.

Side cleanups that fall out of the move:

  - `cc` and `tempfile` move from `[dependencies]` to
    `[build-dependencies]` in platform-checks/Cargo.toml; the
    library code is dependency-free.
  - The hand-maintained `[lints.rust] unexpected_cfgs.check-cfg`
    allowlist (~30 lines) is dropped from
    intercept-preload/Cargo.toml entirely, replaced by
    `emit_check_cfg()` printing
    `cargo:rustc-check-cfg=cfg(has_*_X)` at build time. The
    integration-tests allowlist keeps `has_executable_*` and
    `has_preload_library` entries since those probes still live
    in its own build.rs.
  - intercept-preload/build.rs filters DETECTED_SYMBOLS through
    a local INTERCEPT_FAMILY constant for the cc -D defines and
    the version script / exports list, preserving today's
    behavior of restricting exported symbols to the family that
    src/c/shim.c actually defines.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 11:39:48 +00:00
Laszlo Nagy ae2493d5ec test: replace BEAR_TEST_VERBOSE with RUST_LOG
Collapse the integration-test verbosity controls onto a single
inherited RUST_LOG. run_bear no longer forces RUST_LOG=debug; it
inherits the test-process value, defaulting to info when unset --
this keeps warn/info/error log lines in captured stderr (so tests
that assert on them still work) while filtering the per-event
debug traces from the preload library that ccache was caching and
replaying through Command::status() leaks. CI sets RUST_LOG=debug
explicitly for full per-event traces on platforms that can't be
reproduced locally.

Drop::preserve_on_panic now dumps the last captured BearOutput
unconditionally; cargo's per-test capture handles the show-on-
failure filter for both Err returns and panics. The verbose-gated
inline debug-info blocks in CompilationDatabase / InterceptEvents
assertion methods, the new_with_verbose constructor, the
verbose: macro arm of bear_test!, and all the
show_verbose_if_enabled / force_show_verbose / show_last_bear_output
helpers are gone. BEAR_TEST_PRESERVE_FAILURES stays unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 11:26:53 +00:00
Laszlo Nagy 811f19ea00 chore: sync Cargo.lock to workspace version 4.1.3
The bear-completions extraction was authored in a worktree based on
a pre-bump commit, so its Cargo.lock entry recorded 4.1.2. Cargo
regenerated the entry on the next build; commit the fixup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 10:46:44 +00:00
Laszlo Nagy e72d69e6ba config: replace directories with direct env lookup
bear/src/config/loader.rs was the only consumer of the directories
crate. Replace BaseDirs / ProjectDirs with a small per-platform
resolver (XDG_CONFIG_HOME or $HOME/.config on Unix; LOCALAPPDATA /
APPDATA on Windows). Drops directories, dirs-sys, and option-ext
from the graph.

Behavior changes (silent for affected users):
- macOS no longer probes ~/Library/Application Support/. CLI tools
  on macOS should use ~/.config/, matching git, gh, nvim, fish,
  ripgrep, helix, etc. Bear's INSTALL.md already nudges macOS users
  toward ~/.config for fish completions.
- Windows path flattens from \rizsotto\Bear\config\ to \Bear\,
  paralleling the Unix Bear/ subdir.

Documentation updates: a new FILES section in man/bear.1.md lists
the concrete search order; the rustdoc on file_locations does the
same; man/bear.1 regenerated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 10:44:08 +00:00
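
The per-platform lookup in e72d69e6ba might look roughly like the sketch below. The function name, the `lookup` closure (standing in for `std::env::var`), the treatment of empty values, and the LOCALAPPDATA-before-APPDATA order are assumptions; the concrete search order is the one documented in the man page's FILES section, which is not reproduced here.

```rust
use std::path::PathBuf;

/// Resolve the base configuration directory from environment lookups.
/// `lookup` replaces `std::env::var` so the logic can be exercised
/// without touching the process environment.
pub fn config_base_dir(
    windows: bool,
    lookup: impl Fn(&str) -> Option<String>,
) -> Option<PathBuf> {
    // Treat unset and empty variables the same way (an assumption).
    let non_empty = |name: &str| lookup(name).filter(|value| !value.is_empty());
    if windows {
        non_empty("LOCALAPPDATA")
            .or_else(|| non_empty("APPDATA"))
            .map(PathBuf::from)
    } else {
        non_empty("XDG_CONFIG_HOME")
            .map(PathBuf::from)
            .or_else(|| non_empty("HOME").map(|home| PathBuf::from(home).join(".config")))
    }
}
```

A caller would then append the `Bear` subdirectory mentioned in the commit message to the returned base.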
Laszlo Nagy 6e2e02af80 build: extract generate-completions into bear-completions crate
The completions binary was the sole consumer of clap_complete in
the bear crate. Moving it to its own workspace member removes
clap_complete from bear's dependency graph for every non-completions
build. The binary is still produced at target/release/generate-
completions; no user-visible workflow change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 10:43:13 +00:00
Laszlo Nagy 3c9ef6985c build: drop regex crate in favor of regex-lite
bear-codegen's identifier check is replaced with a hand-rolled
validator (the only regex usage was a 4-char-class pattern).
bear's compiler recognizer switches from regex to regex-lite.
This drops regex, regex-syntax, regex-automata, and aho-corasick
(~6.3s combined) from production builds; regex-lite (~0.5s)
takes their place.

The regex-lite engine is pure NFA, no DFA/SIMD optimizations,
but the recognizer runs in the post-build semantic pass on
short anchored ASCII inputs, not on the LD_PRELOAD per-syscall
hot path. If profiling ever flags it, memoize by filename.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 10:12:26 +00:00
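
For a sense of what a hand-rolled replacement for a small character-class regex looks like, here is a sketch. The commit 3c9ef6985c does not state the pattern, so the rule used here (`[A-Za-z_][A-Za-z0-9_]*`) and the function name are assumptions, not the validator in bear-codegen.

```rust
/// Returns true when `s` is a non-empty ASCII identifier: a letter or
/// underscore followed by letters, digits, or underscores.
/// Assumed rule; the pattern the commit replaced is not shown.
pub fn is_valid_identifier(s: &str) -> bool {
    let mut chars = s.chars();
    match chars.next() {
        Some(first) if first.is_ascii_alphabetic() || first == '_' => {}
        _ => return false,
    }
    chars.all(|c| c.is_ascii_alphanumeric() || c == '_')
}
```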
Laszlo Nagy 6ada5e21ee build: drop env_logger humantime feature
Replace buf.timestamp() in the wrapper and preload log formatters
with an inline HH:MM:SS.mmm UTC formatter. The humantime feature
was the sole reason jiff (~3s of compile time) entered the graph.

Log lines change from "[2026-04-27T14:23:45Z wrapper/PID] msg" to
"[14:23:45.123 wrapper/PID] msg" -- loses calendar date, gains
millisecond precision.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 10:12:18 +00:00
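
An inline HH:MM:SS.mmm UTC formatter of the kind 6ada5e21ee describes needs only the standard library. The sketch below is one possible shape; the function names are hypothetical and the project's formatter may differ.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Format a duration since the Unix epoch as the UTC time of day,
/// `HH:MM:SS.mmm`. The calendar date is discarded on purpose.
pub fn format_time_of_day(since_epoch: Duration) -> String {
    let secs_of_day = since_epoch.as_secs() % 86_400;
    let hours = secs_of_day / 3_600;
    let minutes = (secs_of_day % 3_600) / 60;
    let seconds = secs_of_day % 60;
    format!(
        "{:02}:{:02}:{:02}.{:03}",
        hours,
        minutes,
        seconds,
        since_epoch.subsec_millis()
    )
}

/// Current UTC time of day; a clock set before the epoch falls back to
/// the zero duration.
pub fn now_utc_time_of_day() -> String {
    let since_epoch = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .unwrap_or_default();
    format_time_of_day(since_epoch)
}
```

Because the Unix epoch starts at midnight UTC, taking the seconds modulo 86,400 gives the time of day without any calendar arithmetic, which is why no date library is needed.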
Laszlo Nagy d244e70744 build(bear): drop tempfile from runtime deps
All tempfile call sites under bear/src/ are inside #[cfg(test)]
modules; the [dev-dependencies] entry covers them. Removing the
duplicate [dependencies] entry takes rustix (and fastrand /
linux-raw-sys / bitflags) off the driver / wrapper / preload
target build chain.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 10:04:09 +00:00
Laszlo Nagy cf8473ec91 repo: update gitignore file 2026-04-26 07:52:33 +00:00
Laszlo Nagy 2821858fda chore: bump workspace version to 4.1.3
Open the 4.1.3 development cycle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 06:17:29 +00:00
Laszlo Nagy 98c8a500fc docs: add release process guide
Capture the steps to cut a release in RELEASE.md so the procedure is
discoverable in the repo: preconditions, mandatory pre-flight checks,
fast-forward merge, signed-tag conventions, GitHub release publication,
and the discussions-thread announcement.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
4.1.2
2026-04-25 05:46:19 +00:00
Laszlo Nagy 96252bdb87 docs(man): refresh page date for the 4.1.2 release
Update the date header in bear.1.md and regenerate bear.1 with pandoc so
the man page reflects the release date.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 05:46:14 +00:00
Laszlo Nagy 3f0960f991 test(intercept): factor out shared env-wrapper test fixtures
The wrapper-mode tests in bear/src/intercept/environment.rs were each
~40-100 lines, the majority of which was identical filesystem
scaffolding (TempDir for current_dir, TempDir with bin/<WRAPPER_NAME>
for install_dir, TempDir with fake compiler(s) for PATH, Context
construction, test SocketAddr). Intent was buried under fixture noise.

Introduce a test-local `fixture` module inside `mod test` that owns
the TempDirs, builds the Context, and exposes just the knobs tests
actually use (add_compiler_on_path, add_compiler_off_path,
add_compiler_in_cwd_subdir, with_env, with_path_string, ...). Migrate
all wrapper-mode tests to use it.

No production code changes. No behavior changes. 602 tests, still
passing on all platforms. The `masquerade` sub-module and the
parse_program_env_value unit tests are left as-is (the former uses
smaller one-off fixtures for direct helper tests; the latter is pure
and needs no fixtures).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 13:19:42 +00:00
Laszlo Nagy be851b9f00 feat(intercept): handle compiler env vars that contain flags
Wrapper mode now splits values like CC="gcc -std=c11" into program and
trailing flags on whitespace, resolves the program via the existing
masquerade-aware path, and rewrites the env var so the build still
receives the flags (CC=<wrapper_path> -std=c11). The override is
space-joined with no shell quoting, because neither $CC expansion (no
quote removal) nor GNU Make recipes handed to sh -c round-trip added
quoting correctly for common flags like -DFOO=1.

The no-flag case is a bare wrapper-path string, matching pre-feature
output. resolve_program_path is untouched so the
interception-wrapper-recursion contract is preserved. The man page
points users at CFLAGS / CXXFLAGS / LDFLAGS for anything beyond
simple trailing flags, and notes that the convention is a Unix / GNU
Make inheritance that does not apply to native Windows build
tooling.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 12:54:28 +00:00
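
The split-and-rewrite step from be851b9f00 can be sketched as below. Both function names and the example wrapper path are hypothetical; the sketch shows only the whitespace split and the unquoted space-join, not the masquerade-aware resolution.

```rust
/// Split a value such as `gcc -std=c11` into the program and its
/// trailing flags on whitespace. Returns `None` for a blank value.
pub fn split_program_and_flags(value: &str) -> Option<(&str, Vec<&str>)> {
    let mut words = value.split_whitespace();
    let program = words.next()?;
    Some((program, words.collect()))
}

/// Build the replacement value: the wrapper path followed by the
/// original flags, space-joined with no shell quoting.
pub fn rewrite_env_value(wrapper_path: &str, flags: &[&str]) -> String {
    let mut parts = vec![wrapper_path];
    parts.extend_from_slice(flags);
    parts.join(" ")
}
```

With no flags, `rewrite_env_value` returns the bare wrapper path, which matches the no-flag case described above.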
Laszlo Nagy 2425630e9c fix(intercept): address second-round proof-read findings
Follow-up to bee907e. Narrow-scope corrections from an independent
second review of the masquerade-wrapper handling.

Behaviour

- filter_out_paths now normalises trailing path separators on both
  sides before comparing, so a PATH entry written as
  '/usr/lib64/ccache/' (with trailing slash) still matches an
  excluded dir derived from PathBuf::parent() (which never has one).
  Without this, the defensive "already excluded but returned again"
  branch fired and Bear gave up with a bogus "no real compiler past
  masquerade dir" warning even when a real compiler was reachable.
  New unit test filter_out_paths_matches_across_trailing_separator
  protects the case.

- integration-tests/build.rs now also scans
  /opt/homebrew/opt/ccache/libexec and /usr/local/opt/ccache/libexec,
  the default Homebrew ccache masquerade locations on Apple Silicon
  and Intel macOS. Without these, a Homebrew-equipped developer saw
  host_has_ccache_masquerade silently stay unset and the recursion
  integration test skipped.

Tests

- wrapper_mode_survives_masquerade_wrapper_in_path now also asserts
  that the recorded compiler path is absolute, matching the
  acceptance criterion's "absolute path to the real compiler"
  language.

Docs

- Requirement: struck the "nested compiler invocations ... .bear/
  stays at the front of the child's PATH" bullet from the
  acceptance criteria. That guarantee belongs to
  interception-wrapper-mechanism and is preserved here by not
  modifying the child's PATH; the previous wording implied this
  requirement owns a guarantee it only protects.
- Requirement: clarified that the PATH-scan path
  (compiler_candidates) filters per-file rather than per-directory.
  Distro-shipped masquerade dirs contain only symlinks, so the
  behaviours coincide in practice; the wording now matches what
  the code does.
- Requirement: added a "detection is symlink-based" entry under
  non-functional constraints. Masquerade wrappers installed as
  shell scripts or hard copies are out of scope and will not be
  detected; if a non-symlink masquerade appears in the wild,
  extend detection rather than widen the classification helper to
  read file contents.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 10:22:35 +00:00
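
One way to make a PATH filter insensitive to trailing separators, as 2425630e9c requires of `filter_out_paths`, is to compare entries component by component. The signature below is an assumption, and the real helper may normalise differently.

```rust
use std::ffi::OsStr;
use std::path::{Path, PathBuf};

/// Rebuild a path from its components, which drops trailing separators
/// (`/usr/lib64/ccache/` becomes `/usr/lib64/ccache`).
fn normalise(path: &Path) -> PathBuf {
    path.components().collect()
}

/// Return the entries of a PATH-style value that are not in `excluded`,
/// comparing both sides after normalisation.
pub fn filter_out_paths(path_var: &OsStr, excluded: &[PathBuf]) -> Vec<PathBuf> {
    let excluded: Vec<PathBuf> = excluded.iter().map(|dir| normalise(dir)).collect();
    std::env::split_paths(path_var)
        .filter(|entry| !excluded.contains(&normalise(entry)))
        .collect()
}
```

The surviving entries are returned as written in the input, so the caller can rejoin them with `std::env::join_paths` without altering the user's PATH spelling.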
Laszlo Nagy bee907ea2e fix(intercept): apply review findings for masquerade-wrapper handling
Follow-up to d7305ba. Addresses issues raised in an independent review
of the wrapper-recursion fix.

Behaviour

- resolve_program_path now also filters masquerade wrappers when the
  supplied CC/CXX/... value is an absolute path or a relative path
  with a directory component. Before, CC=/usr/lib64/ccache/gcc
  (or CC=./ccache-dir/gcc) bypassed the filter entirely and Bear
  would store the ccache symlink in the wrapper config, recreating
  the exact loop the filter is meant to prevent. When the supplied
  path IS a masquerade wrapper, resolution falls back to the
  basename via PATH past the masquerade dir; if no real compiler is
  found, resolution returns None and the caller logs that it is
  skipping the compiler.

- resolve_past_masquerade_wrappers now emits a WARN when PATH is
  exhausted after excluding one or more masquerade dirs, naming the
  compiler and the dir(s). The acceptance criterion about logging
  "names the compiler and the detected wrapper" was otherwise only
  served by the generic "could not resolve to an executable on PATH"
  warning.

Performance

- is_masquerade_wrapper short-circuits with symlink_metadata before
  calling canonicalize. Without this, every executable in every
  PATH directory paid a real canonicalize syscall during PATH
  discovery; masquerade targets are always symlinks, so the
  non-symlink path is common and should be cheap.

Tests

- New unit test resolve_program_path_falls_back_past_masquerade_for_absolute_cc
  covers CC=/abs/path/to/masquerade/gcc with a real compiler
  elsewhere on PATH.
- The integration test wrapper_mode_survives_masquerade_wrapper_in_path
  replaces the .contains(".bear") substring check with a
  Path::starts_with on the exact wrapper directory -- the loose
  substring check could false-positive on any path that happens to
  contain ".bear".

Docs

- Requirement wording about detection rewritten: drops the
  prescriptive "read_link iteratively, not canonicalize" language
  (the /usr/bin/gcc -> gcc-13 rationale applied to compiler
  registration, not masquerade detection), and spells out that
  canonicalize must NOT be used for the registration path.
- Struck the "nested compiler invocation" Testing scenario from
  this requirement -- that guarantee belongs to
  interception-wrapper-mechanism and is preserved here by not
  modifying the child's PATH. Added a short note pointing there.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 09:57:18 +00:00
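
The short-circuit described under "Performance" in bee907ea2e can be sketched as below: check for a symlink with `symlink_metadata` first, and pay for `canonicalize` only when the candidate is a symlink. The tool list is the fixed set named in d7305bac20; the constant name and the function body are assumptions, not the project's implementation.

```rust
use std::fs;
use std::path::Path;

/// Basenames treated as masquerade tools (set taken from the d7305bac20
/// message; the constant name is hypothetical).
const MASQUERADE_TOOLS: [&str; 5] = ["ccache", "distcc", "icecc", "colorgcc", "buildcache"];

/// Returns true when `candidate` is a symlink whose fully resolved
/// target is named after a known masquerade tool.
pub fn is_masquerade_wrapper(candidate: &Path) -> bool {
    // Cheap check first: inspect the entry itself without following links.
    let is_symlink = fs::symlink_metadata(candidate)
        .map(|meta| meta.file_type().is_symlink())
        .unwrap_or(false);
    if !is_symlink {
        return false;
    }
    // Only symlinks reach the more expensive full resolution.
    match fs::canonicalize(candidate) {
        Ok(target) => target
            .file_name()
            .and_then(|name| name.to_str())
            .map_or(false, |name| MASQUERADE_TOOLS.contains(&name)),
        Err(_) => false,
    }
}
```

Regular files, missing paths, and dangling symlinks all return `false` here, consistent with detection being symlink-based.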
Laszlo Nagy d7305bac20 fix(intercept): resolve past masquerade compiler wrappers in wrapper mode
Wrapper mode previously stored whatever `which(gcc)` returned as the
"real compiler" for each wrapper. On distributions with a ccache
masquerade in PATH (Fedora/Arch/Gentoo by default), that is the
ccache symlink, so the wrapper's child process was ccache. ccache
then searched PATH for gcc, skipping only symlinks to itself; Bear's
hard-linked wrapper in `.bear/` passed the self-check and was
re-executed, producing an infinite loop.

environment.rs now detects masquerade wrappers at discovery time by
canonicalising candidate paths and checking the target's basename
against a fixed set (ccache, distcc, icecc, colorgcc, buildcache).
The containing directory is stripped from the lookup PATH and
resolution retries, so the wrapper config always names the real
compiler. Both the CC-env and PATH-scan discovery paths are covered.

Other changes in the same fix:
- Requirement reworked around "resolve past masquerade wrappers at
  discovery time"; the original CCACHE_COMPILER proposal is
  documented as rejected, verified empirically to reproduce the
  hang via CCACHE_COMPILER pointing at the ccache symlink.
- Nine new unit tests cover detection, filtering, and the
  no-real-compiler fallback.
- New integration test wrapper_mode_survives_masquerade_wrapper_in_path
  prepends the masquerade dir to its own child PATH so the
  recursion scenario is exercised regardless of host PATH, while
  keeping other tests ccache-free.
- build.rs scans well-known masquerade locations (/usr/lib/ccache,
  /usr/lib64/ccache, /usr/libexec/ccache), exposes the found dir
  via CCACHE_MASQUERADE_DIR, and sets cfg(host_has_ccache_masquerade)
  to gate the new test.
- The manual ccache_free_path_and_compiler workaround in the
  wrapper-mode tests is gone; the tests now run against the host's
  real PATH and also protect this requirement.
- CI: Ubuntu job runs apt-get install ccache so the masquerade dir
  exists on every PR. The job PATH is deliberately not modified --
  ccache first on PATH would inflate event counts for preload-mode
  tests that assert exact compiler-event counts.

Side effect: ccache is bypassed while Bear is observing. That
matches Bear's observe-don't-optimise stance and keeps
compile_commands.json recording the real compiler.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 09:35:38 +00:00
Laszlo Nagy 9dedc88cfb chore(deps): refresh dependency pins and consolidate into the workspace
Bump three constraints in the workspace table:
  - ctor 0.4 -> 0.6 to align with the version Fedora is staging in
    rust-ctor PR #2, removing the need for a downstream relaxation
    patch.
  - signal-hook 0.3 -> 0.4 and serde-saphyr 0.0.22 -> 0.0.24 to
    pick up upstream patch fixes.
None require source changes.

Raise the insta floor to 1.46 and proptest to 1.11 so the manifest
stops advertising stale minimums; both stay at-or-below what Fedora
rawhide ships, so no new substitution constraints are imposed on
packagers.

Hoist the four remaining inline dev-deps (mockall, insta, proptest,
encoding_rs) into [workspace.dependencies] so every dependency
version lives in one place.
2026-04-24 07:50:54 +00:00
Laszlo Nagy 8fc472ecf6 feat(output): drop invalid entries with a warning instead of aborting
Previously, the first entry that failed Entry::validate() returned a
SerializationError from the serializer, which propagated back through
the consumer, dropped the event channel, and aborted the build's output
entirely -- producing no compile_commands.json and, in live-intercept
mode, a flood of "sending on a disconnected channel" errors as ongoing
compilations kept sending events past the already-closed consumer
(issue #692).

Make validation a distinct stage in the output pipeline instead of a
side-effect of serialization. A new ValidatingOutputWriter sits just
above ClangOutputWriter: invalid entries are dropped, each drop emits
a WARN line with file + directory + reason, and a new
entries_dropped_invalid counter appears in the pipeline summary.

When every candidate entry is dropped (entries_written == 0 &&
entries_dropped_invalid > 0), Bear also emits a single ERROR-level
summary line so the resulting empty database is never silent.

With validation moved upstream, JsonCompilationDatabase::write no
longer needs Result-wrapped entries; serialize_result_seq is renamed
to serialize_seq and takes a plain iterator.

Tangentially fixes the channel-disconnect log spam: TcpEventProducer
and RawEventReader now break quietly on first disconnect rather than
logging an ERROR per remaining event. A closed channel during shutdown
is normal, not a failure.

Requirement output-json-compilation-database is extended with a
"Validation failure handling" subsection documenting the contract,
along with Given/When/Then scenarios for partial-drop and total-drop
cases. Integration tests in integration-tests/tests/cases/config.rs
cover both scenarios end-to-end via the semantic subcommand.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 12:05:27 +00:00
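
The validation stage introduced by 8fc472ecf6 can be pictured as a filter in front of the writer. The sketch below is an illustration under assumptions: the `Entry` fields, the validation rule, the `Summary` struct, and the `eprintln!` lines (standing in for the real logger) are hypothetical; only the counter name `entries_dropped_invalid` and the all-dropped condition come from the commit message.

```rust
/// Simplified compilation database entry (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct Entry {
    pub file: String,
    pub directory: String,
}

impl Entry {
    /// Assumed rule: both fields must be non-empty.
    pub fn validate(&self) -> Result<(), String> {
        if self.file.is_empty() {
            return Err("empty file".to_string());
        }
        if self.directory.is_empty() {
            return Err("empty directory".to_string());
        }
        Ok(())
    }
}

/// Counters reported after the pipeline finishes.
#[derive(Debug, Default, PartialEq)]
pub struct Summary {
    pub entries_written: usize,
    pub entries_dropped_invalid: usize,
}

/// Forward valid entries to `sink`; drop invalid ones with a warning
/// instead of aborting the whole output.
pub fn write_valid(entries: impl IntoIterator<Item = Entry>, sink: &mut Vec<Entry>) -> Summary {
    let mut summary = Summary::default();
    for entry in entries {
        match entry.validate() {
            Ok(()) => {
                sink.push(entry);
                summary.entries_written += 1;
            }
            Err(reason) => {
                eprintln!(
                    "WARN dropping entry: file={:?} directory={:?} reason={}",
                    entry.file, entry.directory, reason
                );
                summary.entries_dropped_invalid += 1;
            }
        }
    }
    // An empty database caused purely by invalid entries must not be silent.
    if summary.entries_written == 0 && summary.entries_dropped_invalid > 0 {
        eprintln!("ERROR every candidate entry was invalid; the database is empty");
    }
    summary
}
```

Because the sink only ever receives validated entries, the serializer behind it can take a plain iterator rather than `Result`-wrapped items.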
Laszlo Nagy 6afbc8e8e0 fix(output): return "." from relative_to for self-referential paths
When `format.paths.directory: relative` is configured, the converter
resolves the working directory relative to itself. The component walk
in relative_to exhausts both iterators with no remaining components and
returned an empty PathBuf, which failed Entry::validate (EmptyDirectory)
and aborted the output pipeline -- producing no compile_commands.json
and a flood of "sending on a disconnected channel" log errors on the
live-intercept side.

Emit the POSIX "same directory" form instead. Covered by a new unit
test for the self-to-self case and two integration tests that reproduce
the reporter's config (directory + file both relative) and isolate the
bug to the `directory: relative` axis.

Fixes #692
2026-04-23 11:44:18 +00:00
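
A sketch of a component-walk `relative_to` with the self-to-self case handled as 6afbc8e8e0 describes. The signature and the `..` handling are assumptions; the point illustrated is the final fallback to `.` when both iterators are exhausted.

```rust
use std::path::{Component, Path, PathBuf};

/// Express `path` relative to `base` by walking components.
/// Both inputs are assumed to be absolute and already normalised.
pub fn relative_to(path: &Path, base: &Path) -> PathBuf {
    let mut path_parts = path.components().peekable();
    let mut base_parts = base.components().peekable();

    // Skip the prefix the two paths share.
    loop {
        match (path_parts.peek(), base_parts.peek()) {
            (Some(a), Some(b)) if a == b => {}
            _ => break,
        }
        path_parts.next();
        base_parts.next();
    }

    let mut result = PathBuf::new();
    // Climb out of whatever remains of the base...
    for _ in base_parts {
        result.push(Component::ParentDir);
    }
    // ...then descend into whatever remains of the path.
    for part in path_parts {
        result.push(part);
    }
    // Identical inputs leave nothing behind: emit "." rather than an
    // empty path, which would fail validation downstream.
    if result.as_os_str().is_empty() {
        result.push(Component::CurDir);
    }
    result
}
```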
Laszlo Nagy 5c94ee616c requirements: drop inline test-name pointers and coverage-pending stubs
Two requirement files still named specific integration tests in
prose (`Protected by `<test_name>`.`) and flagged missing
coverage with `Coverage pending.` sentinels. Both patterns are the
same rot-prone anti-pattern we removed from frontmatter in 51d1bbb:
the mapping belongs in the test source via `// Requirements: <id>`,
not in requirement files.

- output-compilation-entries.md: remove the three `Protected by`
  lines and the six `Coverage pending.` stubs, plus the outdated
  Notes bullet that listed pending test cases.
- interception-signal-forwarding.md: remove the four `Protected by`
  lines, the `Coverage pending.` stub, and the Notes bullet
  pointing at the now-implemented mid-compile test.

The Testing sections now read as pure Given/When/Then scenario
specs. To find tests that protect a requirement, grep the repo for
`Requirements:.*<id>`; the coverage script at
`requirements/check-coverage.sh` verifies that every implemented
requirement has at least one tagged test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 12:29:58 +00:00
Laszlo Nagy 56db46349a test(intercept): cover mid-compile signal interruption
Add exit_code_when_compiler_is_interrupted_mid_compile: uses a FIFO
source with no writer so the compiler blocks on read(), sends SIGTERM
to bear via `kill -TERM`, and asserts Bear exits with non-success
within ~1 second. This covers the "Coverage pending" scenario
explicitly called out in interception-signal-forwarding's Testing
section (interrupted mid-compile, as opposed to the interrupted-sleep
case already covered by exit_code_when_signaled).

Uses `kill -TERM` rather than std::process::Child::kill() (which is
SIGKILL) so Bear's signal handler path is exercised, not bypassed.

Known implementation gap not covered here: the spec's shell-trap
criterion ("script receives the signal, its trap runs, and Bear's
exit code reflects whatever the script ultimately exited with") is
not satisfied by the current implementation. bear/src/intercept/
supervise.rs:35 sends SIGKILL to the child, which cannot be caught,
so shell traps never run. Writing this test revealed the gap; fixing
the gap is out of scope for this commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 12:24:23 +00:00
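
The distinction drawn in 56db46349a between `kill -TERM` and `Child::kill()` can be sketched for Unix hosts as below. The helper names are hypothetical, and the sketch assumes the `kill` utility is on PATH, as the test in the commit does.

```rust
use std::io;
use std::os::unix::process::ExitStatusExt;
use std::process::{Child, Command, ExitStatus};

/// Ask the child to terminate with SIGTERM through the `kill` utility,
/// then wait for it. Unlike `Child::kill()`, which sends SIGKILL on
/// Unix, this lets the child's signal handler run.
pub fn terminate_and_wait(child: &mut Child) -> io::Result<ExitStatus> {
    let status = Command::new("kill")
        .arg("-TERM")
        .arg(child.id().to_string())
        .status()?;
    if !status.success() {
        return Err(io::Error::new(io::ErrorKind::Other, "kill -TERM failed"));
    }
    child.wait()
}

/// True when the process was ended by SIGTERM rather than exiting on
/// its own. SIGTERM is signal 15 on Linux, macOS, and the BSDs.
pub fn terminated_by_sigterm(status: &ExitStatus) -> bool {
    status.signal() == Some(15)
}
```

A process that installs a handler and exits with its own code reports `signal() == None`, so a test aimed at the handler path would assert on the exit code instead.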
Laszlo Nagy a8adfce111 test(intercept): cover wrapper-mode .bear/ lifecycle and determinism
Two integration tests for interception-wrapper-mechanism scenarios
that were not yet exercised:

- wrapper_mode_creates_and_cleans_up_bear_directory: the build script
  observes .bear/ during execution (records a sentinel file), and the
  test asserts .bear/ is removed after Bear exits.
- wrapper_mode_bear_directory_is_deterministic_across_runs: two
  back-to-back Bear invocations in the same working directory both
  log the observed wrapper directory; the assertion pins the name to
  the deterministic ".bear" form (not a random temp dir) and verifies
  cleanup after each run.

Both tests strip ccache directories from PATH (same workaround
wrapper_mode_resolves_cc_bare_name_via_path uses) to prevent the
ccache recursion documented in interception-wrapper-recursion: an
initial run without it took 6 minutes, produced 1929 duplicate events,
and failed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 12:17:24 +00:00
Laszlo Nagy 9995e9f220 test(output): cover duplicate-detection match fields and append priority
Add five integration tests for output-duplicate-detection that exercise
acceptance criteria not covered by the existing duplicate_filter_config:

- duplicate_match_on_file_alone_collapses_flag_variants: two semantic
  events with the same file but different flags collapse to one entry
  when match_on is [file]; first-occurrence wins (-O2 kept, -O3
  dropped).
- duplicate_match_on_file_and_output_preserves_differing_outputs: same
  source compiled to debug/test.o vs release/test.o yields two entries
  under match_on: [file, output].
- duplicate_match_on_command_and_arguments_is_rejected: config
  validation rejects lists that mix command and arguments.
- duplicate_match_on_empty_is_rejected: config validation rejects an
  empty match_on list.
- duplicate_append_mode_preserves_original_entry: with
  match_on: [file, directory] and --append, the original -O2 entry
  read from the existing compile_commands.json takes priority over a
  new -O3 entry for the same file (also exercises output-append).

All tests drive the semantic subcommand with hand-crafted events for
deterministic field control; no build/interception required.
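
The first-occurrence-wins rule can be sketched as below. This is a
simplified model, not Bear's implementation: `Entry`, `Field`, `key`,
and `dedup` are hypothetical names, and only the empty-list rejection
is modelled (the command/arguments mixing check is left out).

```rust
use std::collections::HashSet;

#[derive(Debug, Clone)]
struct Entry {
    file: String,
    directory: String,
    output: Option<String>,
    arguments: Vec<String>,
}

#[derive(Debug, Clone, Copy)]
enum Field {
    File,
    Directory,
    Output,
    Arguments,
}

/// Builds the comparison key from the fields listed in `match_on`.
fn key(entry: &Entry, match_on: &[Field]) -> Vec<String> {
    match_on
        .iter()
        .map(|field| match field {
            Field::File => entry.file.clone(),
            Field::Directory => entry.directory.clone(),
            Field::Output => entry.output.clone().unwrap_or_default(),
            Field::Arguments => entry.arguments.join("\u{0}"),
        })
        .collect()
}

/// Keeps the first entry for every key and drops later ones.
/// An empty `match_on` list is rejected, as in the config validation.
fn dedup(entries: Vec<Entry>, match_on: &[Field]) -> Result<Vec<Entry>, String> {
    if match_on.is_empty() {
        return Err("match_on must not be empty".to_string());
    }
    let mut seen = HashSet::new();
    Ok(entries
        .into_iter()
        .filter(|entry| seen.insert(key(entry, match_on)))
        .collect())
}
```

Under this model, append priority falls out of ordering: if entries
read from the existing file are placed before the new ones, the
original entry is the first occurrence and wins.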

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 12:06:15 +00:00
Laszlo Nagy e89dd8a6f7 test(output): cover canonical, relative, and fallback path formats
The existing path_format_config test only exercised the 'absolute'
strategy. Add three tests for the remaining acceptance criteria:

- canonical_path_format_resolves_symlinks: compiles via a symlinked
  directory (src/ -> real/) and asserts the canonical file field
  resolves through the symlink to the real path.
- relative_file_format_is_relative_to_directory: invokes the compiler
  with an absolute source path and asserts 'file: relative' rewrites
  it relative to the formatted directory (which is absolute).
- canonical_file_format_falls_back_for_missing_source: feeds the
  semantic subcommand an event referencing a source that does not
  exist on disk; asserts the entry is kept (not dropped) and file
  falls back to the unformatted path.

The Windows \\?\ prefix stripping scenario is Windows-only and remains
uncovered.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 11:57:56 +00:00
Laszlo Nagy 7fc6c097ae test(output): cover atomic-write success and failure paths
Add two integration tests for output-atomic-write:

- atomic_write_cleans_up_temp_file_on_success: verifies the
  deterministic temp filename (output extension replaced with .tmp) is
  removed after a successful run so only the final compile_commands.json
  remains.
- atomic_write_preserves_existing_object_on_failure: forces a rename
  failure by making the output path a non-empty directory (stable
  across environments regardless of whether tests run as root), then
  verifies the pre-existing filesystem object is untouched.

The read-only-directory approach was considered and rejected because
root bypasses DAC permission checks in the typical devcontainer setup.
Rename-over-non-empty-directory fails with ENOTEMPTY/EISDIR for every
user, including root.
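
A minimal sketch of the temp-file-plus-rename pattern, assuming a
hypothetical helper named `write_atomically`. Removing the temp file
on failure is an assumption of this sketch, not something the tests
above assert.

```rust
use std::fs;
use std::io::{self, Write};
use std::path::Path;

/// Writes `contents` to `temp`, then renames it over `target`.
fn write_then_rename(temp: &Path, target: &Path, contents: &[u8]) -> io::Result<()> {
    let mut file = fs::File::create(temp)?;
    file.write_all(contents)?;
    file.sync_all()?;
    fs::rename(temp, target)
}

/// Replaces `target` only if the full content was written successfully.
/// The temp name is the target with its extension replaced by `.tmp`.
fn write_atomically(target: &Path, contents: &[u8]) -> io::Result<()> {
    let temp = target.with_extension("tmp");
    let result = write_then_rename(&temp, target, contents);
    if result.is_err() {
        // Assumption of this sketch: do not leave the temp file behind.
        let _ = fs::remove_file(&temp);
    }
    result
}
```

When `target` is a non-empty directory the rename step fails, so the
directory and its contents stay as they were.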

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 11:53:21 +00:00
Laszlo Nagy e51d879d5c test(output): cover compile-entries transformation rules
Add six integration tests that exercise acceptance criteria in
output-compilation-entries that existed in the requirement spec but
had no integration coverage:

- compile_and_link_split_produces_compile_entries: cc -o a.out src1.c
  src2.c src3.c yields one compile entry per source, none for a.out.
- pure_link_invocation_produces_no_entries: cc -o a.out src.o (after
  pre-building src.o outside Bear) yields an empty database.
- link_only_flags_are_stripped_from_entries: cc -o a.out -lm -O2 src.c
  keeps -O2 in the entry and drops -lm.
- argument_order_is_preserved_in_entries: cc -I first -I second -DFOO
  -DBAR -c src.c preserves the original relative order of include paths
  and macro defines.
- info_only_invocation_produces_no_entries: cc --version yields an
  empty database.
- output_field_is_recorded_when_enabled: with format.entries.
  include_output_field: true, a multi-source invocation records
  output = a.out on every entry (the known limitation: the single -o
  value is copied verbatim, not inferred per source).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 11:50:21 +00:00
Laszlo Nagy 2e117e6b45 test(output): add integration coverage for source directory filter
The filter already had thorough unit tests in
bear/src/output/writers/filtering/source.rs (last-match-wins,
component-boundary matching, case sensitivity, platform separators,
complex scenarios), but none of them were tagged with the requirement,
so the coverage script flagged output-source-directory-filter as
uncovered.

- Tag the unit tests in source.rs with the requirement ID.
- Add source_directory_filter_config: an end-to-end integration test
  that exercises the YAML-config -> pipeline path with three rules
  (include src, exclude src/test, include src/test/integration) plus
  a file outside any rule, verifying last-match-wins and default-include
  semantics on the actual output.
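
The last-match-wins and default-include semantics can be sketched as
below. `Rule`, `Action`, and `is_included` are hypothetical names for
illustration; the real filter is in source.rs. `Path::starts_with`
compares whole path components, which gives component-boundary
matching.

```rust
use std::path::{Path, PathBuf};

enum Action {
    Include,
    Exclude,
}

struct Rule {
    path: PathBuf,
    action: Action,
}

/// The last rule whose path is a component-wise prefix of `file` decides.
/// A file matched by no rule is included by default.
fn is_included(file: &Path, rules: &[Rule]) -> bool {
    rules
        .iter()
        .rev()
        .find(|rule| file.starts_with(&rule.path))
        .map_or(true, |rule| matches!(rule.action, Action::Include))
}
```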

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 11:43:39 +00:00
Laszlo Nagy d9b050be00 test: tag existing tests with the requirements they protect
Four tests already exercised requirements but carried no `Requirements:`
tag, so they were invisible to the coverage script. Tag them with the
requirement IDs they already protect:

- hardened_env_clear, hardened_competing_ld_preload,
  shell_command_interception -> interception-preload-mechanism
- semantic_non_compilation_events -> output-compilation-entries
  (exercises the non-compiler filter)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 11:41:13 +00:00
Laszlo Nagy 51d1bbb48d requirements: link tests to requirements via source tags
Replace the `tests:` frontmatter list in each requirement file with a
`Requirements: <id>` tag placed directly on the protecting test(s). This
gives tests a single source of truth for the link, so renaming or deleting
a test cannot silently orphan a requirement.

- Drop `tests:` from the template and all 11 requirement files.
- Document the new tag convention in requirements/CLAUDE.md and
  integration-tests/CLAUDE.md; remove the unused `test_req_<id>_<desc>`
  naming rule.
- Tag the existing integration tests (compilation_output, config,
  exit_codes, intercept) with the requirements they protect.
- Add `requirements/check-coverage.sh`, which scans implemented
  requirements and fails when any has zero tagged tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 11:23:11 +00:00
Laszlo Nagy 78b6532e4f test(msvc): cover glued form and clang-cl inheritance for per-warning options
Follow-up to 9c04b2e. That change added integration coverage for the
space-separated forms of /wd, /we, /wo, /w1-/w4, but left two gaps:

1. The glued form (/wd4995, /w34326) now runs through
   FlagPattern::ExactlyWithGluedOrSep instead of the former
   FlagPattern::Prefix -- a different matcher path.

2. clang_cl.yaml inherits these rules via `extends: msvc`. The
   bear-codegen snapshot proves the generated array is correct, but
   no runtime test drives clang-cl with these flags.

Add msvc_per_warning_options_preserve_glued_value and
clang_cl_inherits_msvc_per_warning_options.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 10:54:55 +00:00
scc 9c04b2e1a5 fix(msvc): handle all per-warning cl.exe options
cl.exe accepts a warning-number argument either glued (/wd4995) or as
a separate token (/wd 4995); the latter form is widely used in nmake
Makefiles. Three gaps on the current 4.1.2-rc tip:

1. `/wd*`, `/we*`, `/wo*` used a plain prefix pattern that matches the
   glued form only. When bear saw "/wd 4995", the flag consumed zero
   extra args and the trailing numeric token was reclassified as a
   Source and dropped from compile_commands.json. clangd then emitted
   drv_invalid_int_value for every translation unit.

2. `/w1nnnn`, `/w2nnnn`, `/w3nnnn`, `/w4nnnn` (set warning level for a
   specific warning) were not defined at all, so `/w1 4326` was split
   into an unknown flag plus an orphan numeric token.

3. `/Wv[:version]` was not defined. Both the bare `/Wv` form (cl uses
   the current compiler version when omitted) and `/Wv:17` were
   affected.

All three classes are documented on the MS warning-level options page:
https://learn.microsoft.com/en-us/cpp/build/reference/compiler-option-warning-level

Fix:
  * /wd*, /we*, /wo*  ->  /wd{ }*, /we{ }*, /wo{ }*
    (ExactlyWithGluedOrSep, matching /D, /I, /U, /FI).
  * Add /w1{ }*, /w2{ }*, /w3{ }*, /w4{ }*.
  * Add /Wv (exact) plus /Wv:* (ExactlyWithColon, required value).
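
The glued-or-separate behaviour can be illustrated with a small
sketch. This is not Bear's matcher: `match_glued_or_sep` is a
hypothetical function that only shows how many tokens such a flag
should consume.

```rust
/// If `args[0]` is `flag` with a value, returns the value and the number
/// of tokens consumed: 1 for the glued form, 2 for the separate form.
fn match_glued_or_sep<'a>(flag: &str, args: &[&'a str]) -> Option<(&'a str, usize)> {
    let first = *args.first()?;
    let rest = first.strip_prefix(flag)?;
    if !rest.is_empty() {
        // Glued form, e.g. "/wd4995".
        Some((rest, 1))
    } else {
        // Separate form, e.g. "/wd 4995": the next token is the value.
        let value = *args.get(1)?;
        Some((value, 2))
    }
}
```

With a plain prefix match the separate form consumes one token only,
which leaves the numeric token orphaned as described in gap 1.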

clang_cl.yaml inherits the fix via `extends: msvc`. Codegen snapshot
fixtures updated accordingly.

Two integration tests in integration-tests/tests/cases/semantic.rs
cover all three classes.

Manually verified by scc-tw <scc@scc.tw>.
Closes: #690

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 20:49:57 +10:00
Laszlo Nagy c11cf30ac4 requirements: flesh out signal-forwarding and compilation-entries specs
Expand the stubs into full user-facing requirement pages. Both follow
the template from requirements/CLAUDE.md: intent, acceptance criteria,
known limitations (where applicable), and Given/When/Then testing.

interception-signal-forwarding covers Ctrl-C, SIGTERM, and SIGQUIT
propagation to the build, exit-code preservation including the
signal-termination case, shell-trap interactions, and the
nested-bear unsupported case.

output-compilation-entries covers the transformation from an
intercepted invocation into zero, one, or many entries: multi-source
splitting, pure-link rejection, link-flag stripping, argument-order
preservation, and the current surprising behaviour of the output
field on multi-source invocations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 09:36:08 +00:00
Laszlo Nagy 3ebfea41f5 fix(lint): use sort_by_key instead of sort_by in bear-codegen
Fixes clippy::unnecessary_sort_by warning on Rust 1.95+.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-19 06:38:02 +00:00
Laszlo Nagy 7566a86c40 requirements: rewrite preload and add wrapper interception specs
The preload requirement was a stub (27 lines). Rewrote it to match
the quality of other requirement files: user-perspective intent,
implementation details, acceptance criteria, known limitations with
GitHub issue references, and Given/When/Then test scenarios.

Added a new requirement for wrapper-based interception covering
activation defaults, setup phase, design decisions (hard links,
deterministic directory, startup resolution), and known limitations.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-19 06:30:09 +00:00
Laszlo Nagy 16406858ac requirements: document output pipeline and simplify naming
Add requirement pages for the output pipeline features:
- output-append: --append flag and merge behavior
- output-atomic-write: temp file + rename pattern
- output-duplicate-detection: configurable hash-based dedup
- output-path-format: as-is/absolute/relative/canonical strategies
- output-source-directory-filter: include/exclude rules by path

Extend output-json-compilation-database with the full Clang spec
format definition, command field escaping details, and compiler
path handling.

Remove the sequential number from requirement filenames and drop
the redundant id frontmatter field. The filename itself serves as
the unique identifier for cross-references.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-19 06:02:31 +00:00
dependabot[bot] 8315c4d6ba build(deps): Bump rand from 0.9.2 to 0.9.4
Bumps [rand](https://github.com/rust-random/rand) from 0.9.2 to 0.9.4.
- [Release notes](https://github.com/rust-random/rand/releases)
- [Changelog](https://github.com/rust-random/rand/blob/0.9.4/CHANGELOG.md)
- [Commits](https://github.com/rust-random/rand/compare/rand_core-0.9.2...0.9.4)

---
updated-dependencies:
- dependency-name: rand
  dependency-version: 0.9.4
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-04-16 18:21:34 +10:00
Water_s0urce 14fd1ec1bf Update INSTALL.md with 'lld' requirement for Linux
Added a note that 'lld' is required on the Linux/ELF platform. Follow-up to commit ead6251 (Clarify comment about forcing lld linker), which changed a comment in the `intercept-preload/build.rs` file for version 4.1.0.
2026-04-16 18:17:38 +10:00
Laszlo Nagy a3f9c8671e requirements: capture missing functionalities
2026-04-16 08:15:20 +00:00