149 Commits

Author SHA1 Message Date
hius07
f8175e34a4 koptinterface: prevent crash when word is nil (#14705) 2025-12-10 14:18:33 +02:00
Frans de Jonge
f69a2eebb2 DjVu: flatten nested text layer
Closes #2943.
2025-09-27 12:48:02 +02:00
Frans de Jonge
d2d39c503d KoptInterface: prevent crash when word is nil 2025-09-15 19:49:11 +02:00
zwim
6fd1335196 Fix some typos (harmless) (#14079) 2025-07-21 21:18:13 +02:00
hius07
d412dd2fba koptinterface: search with leading and trailing spaces (#13414) 2025-03-16 08:03:54 +02:00
weijiuqiao
b045df3ff3 VocabBuilder.koplugin: fix PDF context extraction with hyphenation (#12975)
As pointed out at https://github.com/koreader/koreader/issues/12916#issuecomment-2564755827.
2025-01-01 11:28:11 +01:00
weijiuqiao
2298ef76f0 vocabbuilder.koplugin: fix selecting context bug (#12938)
Missed one line in #12917.

Fixes #12916
2024-12-25 09:28:00 +01:00
weijiuqiao
9df814593d VocabBuilder.koplugin: support PDF context extraction for multi-word phrase (#12917)
Closes #12916. Closes #12475.
2024-12-18 14:13:37 +01:00
Benoit Pierre
0fd8c5af81 kopt: implement reflow
Factorized from lower layers' implementations (djvu & mupdf).
2024-11-20 22:23:02 +01:00
Benoit Pierre
94ee6813f6 kopt: minor factorization 2024-11-20 22:23:02 +01:00
Benoit Pierre
18d2ec6761 kopt: fix OCR segmentation mode (#12726)
Previously unused by `libk2pdfopt`, the `ocr_type` argument passed to `k2pdfopt_tocr_single_word`
and forwarded to `ocrtess_ocrwords_from_bmp8` now has a big impact for some languages (e.g. Arabic).
2024-11-11 12:44:22 +01:00
mergen3107
f8446538c0 Fix KOReader spelling in the code (#12670) 2024-10-24 10:46:46 +02:00
hius07
975efae929 ReaderSearch: "All text" improve (#12287) 2024-08-06 19:12:16 +03:00
Benoit Pierre
99d45d7584 djvu: honor render mode when reflowing 2024-07-19 21:32:20 +02:00
Benoit Pierre
0c17941ffb kopt: color support
Keep colors when reflowing documents.
2024-07-19 21:32:20 +02:00
Benoit Pierre
cb002f3d1f kopt: fix bad KoptInterface:renderPage call
Fix `render_mode` argument: add missing `gamma` argument.
2024-07-19 21:32:20 +02:00
Benoit Pierre
4c6919ac2a bump base: update tesseract, leptonica and libk2pdfopt
- update leptonica to 1.84.1
- update tesseract to 5.3.4
- update libk2pdfopt to 2.55
2024-06-01 09:56:36 +02:00
Benoit Pierre
10e6f489d0 kopt: honor TESSDATA_PREFIX environment variable
Don't override it by default, but honor it if present.
2024-06-01 09:56:36 +02:00
Galunid
ca14420372 Add relevant nil guards to prevent reflow crashes (#11715)
closes #10854 #9272 #4481
2024-04-27 23:16:13 +02:00
poire-z
43d36b2ea9 TextBoxWidget: allow showing bits of text in bold
Allow for embedding "tags" (invalid Unicode codepoints)
in the text string to trigger some text formatting:
for now only bolding some parts of text is possible.

Use it with fulltext search "all results" to highlight the
matched word (instead of the previously used brackets).
2024-01-18 12:51:10 +01:00
hius07
0ceb88a9a3 Fulltext search: all entries in entire document (#11313) 2024-01-13 12:58:05 +02:00
hius07
a767ad44db PDF contrast: incorrect set by a gesture (#10798) 2023-09-02 09:41:27 +03:00
poire-z
626864f856 [chore] replace utf8 bytes with Unicode escape sequence 2023-08-02 01:28:24 +02:00
poire-z
a4720b44cd Text search: various Kopt search fixes
- Properly parse input text for words (the previous
  code wasn't working with Greek letters)
- With multiple words search, don't allow "substring
  matching" for words in the middle
- Remove support for Lua patterns, so as to get proper
  substring matching (as we have with cre text search)
2023-07-04 09:03:34 +02:00
hius07
cd56dd2edf ReaderHighlight: pdf multi-page highlights (#9850) 2022-12-02 20:22:27 +02:00
weijiuqiao
edf7cc9a61 Vocabulary builder: support extracting context from pdfs (#9622)
Move getSelectedWordContext(), now document specific,
from ReaderHighlight into each document module.
2022-10-25 12:23:18 +02:00
NiLuJe
62059f8d68 Misc: Get rid of the legacy defaults.lua globals (#9546)
* This removes support for the following deprecated constants: `DTAP_ZONE_FLIPPING`, `DTAP_ZONE_BOOKMARK`, `DCREREADER_CONFIG_DEFAULT_FONT_GAMMA`
* The "Advanced settings" panel now highlights modified values in bold (think about:config in Firefox ;)).
* LuaData: Isolate global table lookup shenanigans, and fix a few issues in unused-in-prod codepaths.
* CodeStyle: Require module locals for Lua/C modules, too.
* ScreenSaver: Actually garbage collect our widget on close (ScreenSaver itself is not an instantiated object).
* DateTimeWidget: Code cleanups to ensure child widgets can be GC'ed.
2022-09-28 01:10:50 +02:00
NiLuJe
13e8213e0a A random assortment of fixes (#9513)
* Android: Make sure sdcv can find the STL
* DocCache: Be less greedy when serializing to disk, and only do that for the *current* document ;).
* CanvasContext: Explicitly document API quirks.
* Fontlist: Switch the on-disk Persist format to zstd (it's ever so slightly faster).
* Bump base for https://github.com/koreader/koreader-base/pull/1515 (fix #9506)
2022-09-14 03:49:50 +02:00
NiLuJe
dcb11c2542 Make luacheck >= 0.26 happy (#9174)
Re: https://github.com/koreader/koreader-base/pull/1487
2022-06-11 19:06:06 +02:00
NiLuJe
7018853940 Stash enableCPUCores in CanvasContext
Avoids requring Device direction in Document.

The method needs complete access to the Device object, though, so it's
just another layer of indirection, with an extra reference on the Device
object stashed in CanvasContext...
(much like it already does for Screen)
2022-01-19 12:44:35 +01:00
hius07
6a5f330b3b Fix djvu crash on long-press on scanned text (#8626) 2022-01-08 08:39:13 +02:00
hius07
00b08d7b54 Bookmarks: fix sort within one page (#8616)
Accurate sorting of bookmarks located in one page depending on their positions in text.
2022-01-06 21:54:33 +02:00
Aleksa Sarai
5709b4c2f1 kopt: correctly handle CJK character detection for space insertion (#8438)
Previously getTextFromBoxes would just pass the first and last three
bytes of the current and previous words when trying to detect CJK
characters (which shouldn't have spaces inserted).

However, this handling was not correct because CJK characters can be
longer than 3 bytes, and internally BaseUtil.utf8charcode doesn't ensure
that it was only given a single utf8 character (it blindly does the bit
operations on whatever length code you give it).

As a result, before this patch selections in PDF documents would have
lots of spaces stripped because getTextFromBoxes would think that almost
all characters were CJK characters.

Fixes: 6f1b70e5eb ("util.utf8: improve CJK character detection")
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2021-11-11 16:09:05 +01:00
NiLuJe
3483238546 Fix reflow calls for DjVu documents (#8379)
The second argument is a ddjvu_render_mode_t
Try to actually honor the user settings instead of enforcing COLOR
while we're there.

Fix #8376
Regression since #8250
2021-10-26 21:13:57 +02:00
Aleksa Sarai
3ffb4c1692 kopt: add fallbacks for cases where kctx is not in cache
There were a handful of cases where if there was no cached kctx there
was no fallback and several KoptInterface methods would return nil,
causing issues in various parts of KOReader (this happened with the
migration to selected_text everywhere but it's unclear how that change
caused this regression).

In any case, from a correctness perspective it makes sense to have the
corresponding fallback paths to create a new kctx if we couldn't find a
cached one.

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2021-10-23 15:49:54 +02:00
Aleksa Sarai
a29d24f86d geom: supplement :combine with more generic .boundingBox
It is a bit cleaner to do all of the necessary looping over lists of
Geoms within a straight-forward Geom.boundingBox function rather than
looping over :combine every time (or reimplementing :combine in some
cases). Geom:combine can be trivially reimplemented in terms of
Geom.boundingBox as well.

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2021-10-23 15:49:54 +02:00
yparitcher
888802f618 kopt: allow pdf auto straighten 2021-10-05 23:34:29 +02:00
NiLuJe
48da545e32 Kobo/Elipsa: More fine-grained control over the amount of online CPU cores

* Only keep a single core online most of the time.
* Device: Add an enableCPUCores method to allow controlling the amount of
  online CPU cores.
* Move the initial core onlining setup to Kobo:init, instead of the startup script.
* Enable two CPU cores while hinting new (e.g., cache miss) pages in PDF land.
* Enable two CPU cores while processing book metadata.
* Drive-by fix to isolate the DocCache pressure check to KoptInterface
  and actually apply it when it matters most (e.g., k2pdfopt stuff).
2021-09-25 02:47:06 +02:00
poire-z
eeb09d2150 PDF text selection: fix/tweak spacing between words/boxes
We may get multiple boxes when selecting texts, one for each
word, and we have to add spaces between the extracted words
ourselves. Previously, we were only adding a space if the
last char of previous word was ASCII, so missing spaces
after accents or Greek words.
Try to do better by measuring the distances between boxes
and comparing to box heights, with a few heuristics.
2021-07-20 15:19:59 +02:00
NiLuJe
e4a333a980 KOptInterface: Keep returning nil in get*Boxes when we don't actually get any boxes

Exposed by #7624, but we were arguably putting garbage in the Cache
before that anyway, so, it wasn't all that great either ;p.

Fix #7850
2021-06-16 13:53:17 +02:00
NiLuJe
1ffbd8760d KOPTInterface: Minor optimization when hashing the configurable status
Use a table & table.concat instead of individual concats.
And then use that same table for every hash-related operation.

(Nothing else uses the configurable hash function, otherwise I'd have
limited the table shenanigans to the function itself).
2021-05-09 23:10:44 +02:00
NiLuJe
21b067792d Cache: Rewrite based on lua-lru
Ought to be faster than our naive array-based approach.
Especially for the glyph cache, which has a solid amount of elements,
and is mostly cache hits.
(There are few things worse for performance in Lua than
table.remove @ !tail and table.insert @ !tail, which this was full of :/).

DocCache: New module that's now an actual Cache instance instead of a
weird hack. Replaces "Cache" (the instance) as used across Document &
co.
Only Cache instance with on-disk persistence.

ImageCache: Update to new Cache.

GlyphCache: Update to new Cache.
Also, actually free glyph bbs on eviction.
2021-05-05 20:37:33 +02:00
NiLuJe
ce624be8b8 Cache: Fix a whole lot of things.
* Minor updates to the min & max cache sizes (16 & 64MB). Mostly to satisfy my power-of-two OCD.
  * Purge broken on-disk cache files
  * Optimize free RAM computations
  * Start dropping LRU items when running low on memory before pre-rendering (hinting) pages in non-reflowable documents.
  * Make serialize dump the most recently *displayed* page, as the actual MRU item is the most recently *hinted* page, not the current one.
  * Use more accurate item size estimations across the whole codebase.

TileCacheItem:

  * Drop lua-serialize in favor of Persist.

KoptInterface:

  * Drop lua-serialize in favor of Persist.
  * Make KOPTContext caching actually work by ensuring its hash is stable.
2021-05-05 20:37:33 +02:00
Martín Fernández
40b4ccffa8 KoptInterface:getWordFromBoxes: guard against nil boxes (#6827)
Fixes #6825
2020-10-26 12:26:18 +01:00
Galunid
15455b594d [feat] Comics: zoom to panel (#6511)
This pull request aims to provide a convenient way to zoom in on comics. The idea is that when the user holds or double-taps (not decided yet) a manga/comic panel, it gets cut out from the rest of the image and zoomed. More details in koreader/koreader-base#1148. Depends on koreader/koreader-base#1159.
2020-09-24 15:17:37 +02:00
Martín Fernández
d87b09d11c handle newlines in exported pdf highlights (#6247) 2020-06-09 17:07:51 +02:00
Frans de Jonge
b2554ba5da [fix] Prevent crash when no page boxes (#5289)
Can occur with invalid page numbers, for example by changing the font size in a reflowable MuPDF document.

Discussion in <https://github.com/koreader/koreader/pull/5282#issuecomment-526842921>.
2019-09-01 22:17:15 +02:00
Frans de Jonge
bc2412a67a [doc] Convert koptinterface comments to LDoc format (#5290) 2019-08-31 21:15:22 +02:00
poire-z
5c38bcb8b7 [UX] Links menu and handling tweaks (#4867)
- Removed "Swipe to follow first link on page" menu item and
  handling code, as it feels not really as practical as
  "Swipe to follow nearest link".
- Removed recently added "External link action", as we can
  just always present a popup with the url and the available
  actions.
- Generic handling of these actions in onGoToExternalLink(),
  so they are proposed on the Wikipedia lookup popup too.
- Allow external link on PDF documents (previously, only
  internal links were handled).
- Have "Ignore external links on tap" available on all
  document types.
- Added "Ignore external links on swipe" (default to true,
  the current behaviour).
- Added multiswipe gesture "Follow nearest internal link"
  (the existing "Follow nearest link" now follows the
  nearest external or internal link)
- ButtonDialogTitle: added an option to look a bit more
  like ConfirmBoxes.
- Footnote popups: fix link unhighlight when tapping on an external link.
2019-04-02 18:27:35 +02:00
Qingping Hou
1605409c60 rename runtimectl to document/canvascontext 2019-03-03 13:10:45 +01:00