20 Commits

Author SHA1 Message Date
Kovid Goyal
ad3ab877f8 Use a fast SIMD implementation to XOR data going into the disk cache 2024-02-25 09:57:43 +05:30
Kovid Goyal
f48e4ffd5e Port aligned load based find algorithm to C 2024-02-25 09:57:42 +05:30
Kovid Goyal
f1fe0bf40a Code to easily compare SIMD and scalar decode in a live instance
Also remove -mtune=intel as it fails with clang
2024-02-25 09:57:41 +05:30
Kovid Goyal
6cdc7ac91d A further 5% speedup for UTF-8 decoding
Achieved by decoding in larger chunks thereby amortizing the cost
of creating various constant vectors over larger chunks.
2024-02-25 09:57:40 +05:30
Kovid Goyal
9cb9373274 Allow unbounded output in UTF8Decoder
This will allow us to eventually decode more than a single
vector's worth in a fast inner loop
2024-02-25 09:57:39 +05:30
Kovid Goyal
3b65c1a58a remove declaration without implementation 2024-02-25 09:57:39 +05:30
Kovid Goyal
7e77a196e6 Build only the SIMD code with SIMD compiler flags 2024-02-25 09:57:38 +05:30
Kovid Goyal
8975d1a9f4 no need to parametrize sentinel 2024-02-25 09:57:31 +05:30
Kovid Goyal
0ed1c6f840 Simplify utf8 parser func
Also show a replacement char for incomplete utf-8 sequences interrupted by an esc char
2024-02-25 09:57:31 +05:30
Kovid Goyal
72e73f2f81 Fix alignment of output array in UTF8Decoder 2024-02-25 09:57:31 +05:30
Kovid Goyal
ba18c5a669 Move ByteLoader back to simd-string.c in preparation for getting rid of it 2024-02-25 09:57:31 +05:30
Kovid Goyal
718f4b328f Go back to a single code path for drawing text
Slightly reduces pure ASCII performance and improves Unicode
performance. We should be able to get pure ASCII performance back
via SIMD eventually.
2024-02-25 09:57:30 +05:30
Kovid Goyal
fe2cd543ba Switch to same algorithm for 128bit SIMD as used for 256 bit SIMD
Avoids needing to write to the haystack and also less chance of a bug in
the never tested simd since all CPUs I have access to have AVX2
2024-02-25 09:57:28 +05:30
Kovid Goyal
b032313c45 Only use SIMD if CPU supports it at runtime 2024-02-25 09:57:27 +05:30
Kovid Goyal
19a41b4d9a Use sse4.2 instruction for normal mode printable ascii detection 2024-02-25 09:57:27 +05:30
Kovid Goyal
25e7a2882d Work on using SIMD for normal mode dispatch 2024-02-25 09:57:27 +05:30
Kovid Goyal
e3d6aa2c60 Use simd in a few loops 2024-02-25 09:57:27 +05:30
Kovid Goyal
89d416806b ... 2024-02-25 09:57:26 +05:30
Kovid Goyal
200e5bf6e3 Examine 8 bytes at once for terminator char 2024-02-25 09:57:26 +05:30
Kovid Goyal
f4819175b0 Start work on vectorizing searches 2024-02-25 09:57:26 +05:30