16 Commits

Author SHA1 Message Date
Mohamed Bassem
4c0220f217 fix: drop idProvider from restate hot path 2026-01-03 13:27:59 +00:00
Mohamed Bassem
0efffdcc83 fix(restate): change journal retention for services to 3d 2025-12-25 13:02:37 +00:00
Mohamed Bassem
ddd4b578cd fix: preserve failure count when rescheduling rate limited domains (#2303)
* fix: preserve retry count when rate-limited jobs are rescheduled

Previously, when a domain was rate-limited in the crawler worker,
the job would be re-enqueued as a new job, which reset the failure
count. This meant rate-limited jobs could retry indefinitely without
respecting the max retry limit.

This commit introduces a RateLimitRetryError exception that signals
the queue system to retry the job after a delay without counting it
as a failed attempt. The job is retried within the same invocation,
preserving the original retry count.

Changes:
- Add RateLimitRetryError class to shared/queueing.ts
- Update crawler worker to throw RateLimitRetryError instead of re-enqueuing
- Update Restate queue service to handle RateLimitRetryError with delay
- Update Liteque queue wrapper to handle RateLimitRetryError with delay

This ensures that rate-limited jobs respect the configured retry limits
while still allowing for delayed retries when domains are rate-limited.
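
A minimal sketch of the idea described above, not the project's actual code. Only the name RateLimitRetryError comes from the commit message; `runWithRetries`, `Outcome`, the `delayMs` field, and the injected `wait` callback are hypothetical names chosen for illustration. A real queue would wait asynchronously instead of calling a synchronous callback.

```typescript
// Signals "retry later, but do not count this as a failure".
// The delayMs field is an assumption; the commit only names the class.
class RateLimitRetryError extends Error {
  constructor(public readonly delayMs: number) {
    super(`Rate limited, retry after ${delayMs}ms`);
    this.name = "RateLimitRetryError";
    // Keeps instanceof working when compiled to older JS targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

interface Outcome {
  status: "completed" | "failed";
  failures: number; // attempts that count against the retry limit
  rateLimitedRetries: number; // attempts that do not count
}

// Hypothetical runner: maxRetries bounds real failures only.
function runWithRetries(
  job: () => void,
  maxRetries: number,
  wait: (ms: number) => void = () => {},
): Outcome {
  let failures = 0;
  let rateLimitedRetries = 0;
  while (true) {
    try {
      job();
      return { status: "completed", failures, rateLimitedRetries };
    } catch (e) {
      if (e instanceof RateLimitRetryError) {
        // Same invocation, same failure count: just delay and try again.
        rateLimitedRetries++;
        wait(e.delayMs);
        continue;
      }
      failures++;
      if (failures > maxRetries) {
        return { status: "failed", failures, rateLimitedRetries };
      }
    }
  }
}
```

In this sketch a job that is rate-limited five times and then succeeds completes with a failure count of zero, even with `maxRetries` set to 2, while a job that keeps throwing ordinary errors fails after three attempts.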

* refactor: use liteque's native RetryAfterError for rate limiting

Instead of manually handling retries in a while loop, translate
RateLimitRetryError to liteque's native RetryAfterError. This is
cleaner and lets liteque handle the retry logic using its built-in
mechanism.

* test: add tests for RateLimitRetryError handling in restate queue

Added comprehensive tests to verify that:
1. RateLimitRetryError delays retry appropriately
2. Rate-limited retries don't count against the retry limit
3. Jobs can be rate-limited more times than the retry limit
4. Regular errors still respect the retry limit

These tests ensure the queue correctly handles rate limiting
without exhausting retry attempts.

* lint & format

* fix: prevent onError callback for RateLimitRetryError

Fixed two issues with RateLimitRetryError handling in restate queue:

1. RateLimitRetryError now doesn't trigger the onError callback since
   it's not a real error - it's an expected rate limiting behavior

2. Check for RateLimitRetryError in runWorkerLogic before calling onError,
   ensuring the instanceof check works correctly before the error gets
   further wrapped by restate

Updated tests to verify onError is not called for rate limit retries.

* fix: catch RateLimitRetryError before ctx.run wraps it

Changed approach to use a discriminated union instead of throwing
and catching RateLimitRetryError. Now we catch the error inside the
ctx.run callback before it gets wrapped by restate's TerminalError,
and return a RunResult type that indicates success, rate limit, or error.

This fixes the issue where instanceof checks would fail because
ctx.run wraps all errors in TerminalError.
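
The discriminated-union approach can be sketched as follows. This is an illustration under stated assumptions, not the actual Restate integration: `runStep` and `WrappedError` are hypothetical stand-ins for a runtime that rewraps thrown errors, and the variant names and fields of `RunResult` are guesses beyond the three outcomes the commit message lists.

```typescript
class RateLimitRetryError extends Error {
  constructor(public readonly delayMs: number) {
    super(`Rate limited, retry after ${delayMs}ms`);
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Stand-in for a runtime error wrapper: the original class is lost.
class WrappedError extends Error {}

// Stand-in for a step runner that rewraps anything thrown inside it.
function runStep<T>(fn: () => T): T {
  try {
    return fn();
  } catch (e) {
    throw new WrappedError(e instanceof Error ? e.message : String(e));
  }
}

// Success, rate limit, or error, carried as plain data instead of a throw.
type RunResult<T> =
  | { type: "success"; value: T }
  | { type: "rate_limit"; delayMs: number }
  | { type: "error"; message: string };

function runJob<T>(job: () => T): RunResult<T> {
  return runStep((): RunResult<T> => {
    try {
      return { type: "success", value: job() };
    } catch (e) {
      // The instanceof check happens here, before any wrapping occurs.
      if (e instanceof RateLimitRetryError) {
        return { type: "rate_limit", delayMs: e.delayMs };
      }
      return { type: "error", message: e instanceof Error ? e.message : String(e) };
    }
  });
}
```

Because the callback returns a value instead of throwing, the caller can switch on `type` and never needs an `instanceof` check on an error that has already been wrapped.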

* more fixes

* rename error name

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-12-25 12:46:45 +00:00
Mohamed Bassem
dc8ab86279 feat(restate): Add a var to control whether to expose core services or not 2025-12-21 22:02:05 +00:00
Mohamed Bassem
58eb6c0054 feat: add more restate semaphore controls 2025-12-20 13:27:51 +00:00
Mohamed Bassem
510174db96 Revert "fix: fix restate service to return control to restate service on timeout"
This reverts commit 6db14ac492.
2025-12-15 15:48:07 +00:00
Mohamed Bassem
6db14ac492 fix: fix restate service to return control to restate service on timeout 2025-12-15 00:02:24 +00:00
Mohamed Bassem
a71b9505ea fix: Add restate queued idempotency (#2169)
* fix: Add restate queued idempotency

* return on failed to acquire
2025-11-30 00:19:28 +00:00
Mohamed Bassem
5426875949 feat: Introduce groupId in restate queue (#2168)
* feat: Introduce groupId in restate queue

* add group ids to the interface

* use last served timestamp
2025-11-24 01:23:06 +00:00
Mohamed Bassem
38842f77e5 fix: support invocation cancellation while awaiting semaphore 2025-11-24 00:47:03 +00:00
Mohamed Bassem
1b44eafeb3 fix: drop journal retention for semaphore and id providers 2025-11-17 01:48:29 +00:00
Mohamed Bassem
d4b7b89ae2 fix: stop retrying indefinitely in restate queues 2025-11-10 09:54:42 +00:00
Mohamed Bassem
4cf0856e39 feat: add crawler domain rate limiting (#2115) 2025-11-09 21:20:54 +00:00
Mohamed Bassem
b28cd03a4a refactor: Allow runner functions to return results to onComplete 2025-11-09 20:13:39 +00:00
Mohamed Bassem
03161482b4 refactor: Extract ratelimiter into separate plugin (#2112)
* refactor(trpc): extract rate limiter into dedicated plugin

Move the rate limiting middleware from the trpc package to the
centralized plugins package. This improves code organization by
consolidating all plugins in a single location.

Changes:
- Created packages/plugins/trpc-ratelimit/ plugin
- Moved rate limiter from packages/trpc/rateLimit.ts to
  packages/plugins/trpc-ratelimit/src/index.ts
- Added trpc-ratelimit export to plugins package.json
- Added @trpc/server dependency to plugins package
- Updated trpc package to import from @karakeep/plugins/trpc-ratelimit
- Added @karakeep/plugins dependency to trpc package
- Removed packages/trpc/plugins/ directory

* refactor(plugins): decouple rate limiter from tRPC

Refactor the rate limiting plugin to be framework-agnostic, allowing
it to be used outside of tRPC contexts. The plugin now has a generic
core with a tRPC-specific adapter.

Changes:
- Renamed trpc-ratelimit to ratelimit plugin
- Created generic RateLimiter class with framework-agnostic API
- Added checkRateLimit() method that returns allow/deny results
- Created separate tRPC adapter (src/trpc.ts) that uses the generic core
- Exported both generic (RateLimiter, globalRateLimiter) and
  tRPC-specific (createRateLimitMiddleware) APIs
- Updated trpc package to import from @karakeep/plugins/ratelimit
- Updated plugins package.json exports

Benefits:
- Rate limiter can now be used in any context (HTTP handlers, WebSocket, etc.)
- Cleaner separation of concerns
- Easy to create adapters for other frameworks
- Generic API allows for custom error handling
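
A framework-agnostic limiter with an allow/deny result could look roughly like the sketch below. The names RateLimiter and checkRateLimit() come from the commit message; the fixed-window algorithm, the `RateLimitConfig` shape, the `resetInSeconds` field, and the injectable clock are assumptions made for illustration and may not match the plugin.

```typescript
interface RateLimitConfig {
  windowMs: number; // length of one counting window
  maxRequests: number; // requests allowed per window
}

type RateLimitResult =
  | { allowed: true }
  | { allowed: false; resetInSeconds: number };

class RateLimiter {
  private readonly windows = new Map<string, { count: number; resetAt: number }>();

  // The clock is injectable so tests can control time (an assumption).
  constructor(private readonly now: () => number = Date.now) {}

  checkRateLimit(key: string, config: RateLimitConfig): RateLimitResult {
    const t = this.now();
    const entry = this.windows.get(key);
    // No window yet, or the previous one expired: start a fresh window.
    if (!entry || t >= entry.resetAt) {
      this.windows.set(key, { count: 1, resetAt: t + config.windowMs });
      return { allowed: true };
    }
    if (entry.count >= config.maxRequests) {
      return {
        allowed: false,
        resetInSeconds: Math.ceil((entry.resetAt - t) / 1000),
      };
    }
    entry.count++;
    return { allowed: true };
  }
}
```

Returning a result object instead of throwing leaves error handling to the caller, so a tRPC adapter, an HTTP handler, or a worker can each turn a deny into whatever response fits.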

* refactor(plugins): integrate rate limiter with plugin registry

Refactor the rate limiting plugin to use the centralized plugin
system with PluginManager, making it consistent with other plugins
like queue and search providers.

Changes:
- Added RateLimit plugin type to PluginType enum
- Created RateLimitClient interface in packages/shared/ratelimiting.ts
- Created RateLimitProvider class implementing PluginProvider
- Updated plugin to auto-register with PluginManager on import
- Updated tRPC adapter to use getRateLimitClient() from PluginManager
- Added ratelimit plugin to loadAllPlugins() in shared-server
- Updated shared/plugins.ts with RateLimit type mapping

Benefits:
- Consistent plugin architecture across the codebase
- Rate limiter can be swapped with alternative implementations
- Centralized plugin management and logging
- Better separation of concerns
- Framework-agnostic core with tRPC adapter pattern
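
The registry pattern described above might be sketched like this. The identifiers PluginType, PluginProvider, PluginManager, RateLimitClient, RateLimitProvider, and getRateLimitClient are named in the commit message, but every signature here is a guess: the real PluginManager may be static or global, and the real client interface may take different arguments.

```typescript
enum PluginType {
  RateLimit = "ratelimit",
}

// Assumed client shape; the real interface lives in shared/ratelimiting.ts.
interface RateLimitClient {
  checkRateLimit(key: string): { allowed: boolean };
}

interface PluginProvider<T> {
  getClient(): T | null;
}

// Hypothetical instance-based registry keyed by plugin type.
class PluginManager {
  private readonly providers = new Map<PluginType, PluginProvider<unknown>>();

  register<T>(type: PluginType, provider: PluginProvider<T>): void {
    this.providers.set(type, provider);
  }

  getClient<T>(type: PluginType): T | null {
    const provider = this.providers.get(type);
    return provider ? (provider.getClient() as T | null) : null;
  }
}

class RateLimitProvider implements PluginProvider<RateLimitClient> {
  private client: RateLimitClient | null = null;

  // Lazily creates one client and reuses it afterwards.
  getClient(): RateLimitClient {
    if (!this.client) {
      // Placeholder client that allows everything.
      this.client = { checkRateLimit: () => ({ allowed: true }) };
    }
    return this.client;
  }
}

function getRateLimitClient(manager: PluginManager): RateLimitClient | null {
  return manager.getClient<RateLimitClient>(PluginType.RateLimit);
}
```

Callers depend only on the client interface, so registering a different provider under the same plugin type would swap the implementation without touching them.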

* refactor(trpc): move rate limit middleware to trpc package

Move the tRPC-specific rate limiting middleware from the plugins
package to the trpc package, making the plugins package
framework-agnostic.

Changes:
- Moved packages/plugins/ratelimit/src/trpc.ts to
  packages/trpc/lib/rateLimit.ts
- Updated packages/trpc/index.ts to import from local lib/rateLimit
- Removed tRPC export from packages/plugins/ratelimit/index.ts
- Removed @trpc/server dependency from packages/plugins/package.json

Benefits:
- plugins package is now framework-agnostic
- tRPC-specific code lives in the trpc package where it belongs
- Cleaner separation of concerns
- Rate limiter plugin can be used in any context without tRPC

* refactor(plugins): rename to ratelimit-memory and add tests

Rename the rate limiting plugin from "ratelimit" to "ratelimit-memory"
to better indicate it's an in-memory implementation. This naming leaves
room for future implementations like ratelimit-redis. Also added
comprehensive test coverage.

Changes:
- Renamed packages/plugins/ratelimit to ratelimit-memory
- Updated package.json export from ./ratelimit to ./ratelimit-memory
- Updated shared-server to import @karakeep/plugins/ratelimit-memory
- Added comprehensive unit tests (index.test.ts):
  - Rate limit enforcement tests
  - Window expiration tests
  - Identifier and path isolation tests
  - Reset functionality tests
  - Cleanup mechanism tests
- Added provider integration tests (provider.test.ts):
  - PluginProvider interface compliance
  - Client singleton behavior
  - End-to-end rate limiting functionality

Benefits:
- More descriptive plugin name indicating the storage mechanism
- Better test coverage ensuring reliability
- Easier to add alternative implementations (Redis, etc.)

* change the api to only take the key

* move the serverConfig check to the trpc

* fix lockfile

* get rid of the timer

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-11-09 17:10:54 +00:00
Mohamed Bassem
99413db0e7 refactor: consolidate multiple karakeep plugins into one package (#2101)
* refactor: consolidate plugin packages into single plugins directory

- Create new `packages/plugins` directory with consolidated package.json
- Move queue-liteque, queue-restate, and search-meilisearch to subdirectories
- Update imports in packages/shared-server/src/plugins.ts
- Remove individual plugin package directories
- Update shared-server dependency to use @karakeep/plugins

This reduces overhead of maintaining multiple separate packages for plugins.

* refactor: consolidate plugin config files to root level

- Move .oxlintrc.json to packages/plugins root
- Move vitest.config.ts to packages/plugins root
- Update vitest config paths to work from root
- Remove individual config files from plugin subdirectories

This reduces configuration duplication across plugin subdirectories.

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-11-08 14:50:00 +00:00