Kovid Goyal
82e2fe82d6
Add a couple more gseg tests
2025-04-11 13:34:16 +05:30
Kovid Goyal
c03dd673ae
Restore fast path for printable ASCII
2025-04-11 09:34:21 +05:30
Kovid Goyal
e976cf67fd
Make GraphemeBreakProperty available globally
2025-04-11 09:34:21 +05:30
Kovid Goyal
6712169c0f
...
2025-04-01 17:18:11 +05:30
Kovid Goyal
057dde35a7
Use a two stage lookup table for segmentation
...
Saves one extra array lookup at no cost in size
2025-04-01 14:25:24 +05:30
Kovid Goyal
557e6547f2
...
2025-04-01 13:31:20 +05:30
Kovid Goyal
d4d2ae969e
Use a branchless check for unicode range
2025-04-01 12:32:17 +05:30
Kovid Goyal
6ecd78d9db
Remove bounds checking for unicode table access in Go
2025-04-01 10:41:17 +05:30
Kovid Goyal
de1adeee5e
DRYer
2025-03-31 22:01:49 +05:30
Kovid Goyal
66856e7b52
Use a multi-stage lookup table for grapheme segmentation
2025-03-31 21:51:28 +05:30
Kovid Goyal
163b3de85b
Also forgot to add non-characters to invalid class
2025-03-30 10:44:26 +05:30
Kovid Goyal
a5a25fbd8c
Fix some codepoints being missed when porting is_non_rendered to unicode lookup table
...
Fixes #8495
2025-03-30 10:40:19 +05:30
Kovid Goyal
2eed7b62ab
More work on seg lookup tables
2025-03-29 09:35:44 +05:30
Kovid Goyal
d9d483d2c1
More work on segmentation lookup table
2025-03-29 08:49:52 +05:30
Kovid Goyal
01cdfcd002
Work on table based lookup for grapheme segmentation
2025-03-28 15:06:48 +05:30
Kovid Goyal
3e50588525
Add a test for PUA recognition
2025-03-25 16:52:01 +05:30
Kovid Goyal
fd2bbf57e3
Make unicode category data useable in other modules
2025-03-25 16:35:09 +05:30
Kovid Goyal
294de16898
Use ms table for remaining UCD lookups
2025-03-25 15:41:34 +05:30
Kovid Goyal
aad58cf703
Declare CharProps just once
2025-03-25 14:08:47 +05:30
Kovid Goyal
d429f732e1
DRYer
2025-03-25 13:45:56 +05:30
Kovid Goyal
61ae12e0a9
DRYer
2025-03-25 13:29:11 +05:30
Kovid Goyal
b66a763ddf
Use a 3 stage table for Unicode properties
...
Halves the data size and reduces source code size by 50x
Shows no significant runtime performance effect.
Allows for easily adding more properties to the table in the future
2025-03-25 13:16:59 +05:30
Kovid Goyal
9f7643078c
Use unicode multi-table for remaining hot path lookups
...
Results in a 15% improvement in the unicode throughput benchmark
2025-03-24 15:04:33 +05:30
Kovid Goyal
3d0e45ace8
Use the new multi-stage unicode table for wcwidth
2025-03-24 14:20:40 +05:30
Kovid Goyal
7697a1650d
Add is_emoji_presentation_base to char props table
2025-03-24 13:55:49 +05:30
Kovid Goyal
16f7380cb0
Implement grapheme segmentation in Go
2025-03-23 19:24:12 +05:30
Kovid Goyal
aa8c32006f
Implement grapheme seg algo in Go
2025-03-22 14:54:58 +05:30
Kovid Goyal
7e780a2294
CharProps data for Go
2025-03-22 13:18:09 +05:30
Kovid Goyal
9663f935fb
...
2025-03-22 11:56:56 +05:30
Kovid Goyal
583a858769
Use a multistage lookup table for grapheme segmentation
2025-03-22 11:50:04 +05:30
Kovid Goyal
0d866b1f13
Add tests for grapheme segmentation
...
Test data provided by Unicode organisation
2025-03-13 13:48:35 +05:30
Kovid Goyal
9c1c141775
Start work on grapheme segmentation algorithm
2025-03-13 11:19:54 +05:30
Kovid Goyal
98f9a568ce
Add Extended_Pictographic property
2025-03-13 10:01:41 +05:30
Kovid Goyal
039af78785
Add Indic Conjunct Break data
2025-03-13 09:18:42 +05:30
Kovid Goyal
1ee0b3369d
Fix GBP generation
2025-03-13 08:37:52 +05:30
Kovid Goyal
9cb56c2775
Run gofmt on grapheme-segmentation-data
2025-03-13 07:11:21 +05:30
Kovid Goyal
dc625c5e0c
Add grapheme break properties when generating wcwidth data
2025-03-13 07:06:46 +05:30
Kovid Goyal
61edc2aef7
Don't treat multicells containing narrow emoji as having emoji presentation
...
Fixes #8330
2025-02-14 20:37:31 +05:30
Kovid Goyal
1481fb4fe9
Don't generate mark mapping
2024-11-04 09:10:07 +05:30
Kovid Goyal
2b3f2258ff
More pyupgrade to 3.9
2024-08-05 11:00:51 +05:30
Kovid Goyal
9bea8bb5bc
Remove no longer needed code
2024-02-05 13:54:22 +05:30
Kovid Goyal
8cc2cad4d9
Use list of legal chars in URL from the WHATWG standard
...
Notably this excludes some ASCII chars: <>{}[]`|
See https://url.spec.whatwg.org/#url-code-points
Fixes #7095
2024-02-05 13:27:22 +05:30
Kovid Goyal
77292a16d6
Make shebangs consistent
...
Follow PEP 0394 and use /usr/bin/env python so that the python in the
user's venv is respected. Not that the kitty python files are meant to be
executed standalone anyway, but, whatever.
Fixes #6810
2023-11-11 08:32:05 +05:30
Kovid Goyal
119582a9d4
Make relative imports work in gen scripts even when directly executed
2023-10-15 09:51:03 +05:30
Kovid Goyal
a79dd3996a
Also move data files for gen scripts into gen dir
2023-10-14 08:04:37 +05:30
Kovid Goyal
e6ef2fceea
py3.8 support
2023-10-14 07:57:03 +05:30
Kovid Goyal
56063b96fd
Move gen scripts into their own package
2023-10-14 07:44:18 +05:30