kitty-mirror

mirror of https://github.com/kovidgoyal/kitty.git synced 2026-02-01 11:34:59 +01:00

Author	SHA1	Message	Date
Kovid Goyal	7e12cc57c6	Fix #7245	2024-03-21 20:50:05 +05:30
Kovid Goyal	3bb9e36fc8	...	2024-03-14 21:00:57 +05:30
Kovid Goyal	76ae5f5b9b	DRYer: Use the SIMD detection in setup.py to avoid calling __builtin_cpu_supports	2024-03-14 20:57:09 +05:30
Kovid Goyal	a839af04dc	Fix #7219	2024-03-14 11:13:54 +05:30
Kovid Goyal	1a9a7a59ac	Make XOR64 test also test alignment issues	2024-02-25 09:57:44 +05:30
Kovid Goyal	ad3ab877f8	Use a fast SIMD implementation to XOR data going into the disk cache	2024-02-25 09:57:43 +05:30
Kovid Goyal	b021e9b648	Do the default func test last so we can see what the failure is on more explicitly	2024-02-25 09:57:42 +05:30
Kovid Goyal	d0797a025b	Add dedicated tests for find_either_of_two	2024-02-25 09:57:42 +05:30
Kovid Goyal	f1fe0bf40a	Code to easily compare SIMD and scalar decode in a live instance. Also remove -mtune=intel as it fails with clang	2024-02-25 09:57:41 +05:30
Kovid Goyal	6cdc7ac91d	A further 5% speedup for UTF-8 decoding. Achieved by decoding in larger chunks thereby amortizing the cost of creating various constant vectors over larger chunks.	2024-02-25 09:57:40 +05:30
Kovid Goyal	9cb9373274	Allow unbounded output in UTF8Decoder. This will allow us to eventually decode more than a single vector's worth in a fast inner loop	2024-02-25 09:57:39 +05:30
Kovid Goyal	66341aa28e	Make the env var controlling which SIMD level to use more capable	2024-02-25 09:57:38 +05:30
Kovid Goyal	7e77a196e6	Build only the SIMD code with SIMD compiler flags	2024-02-25 09:57:38 +05:30
Kovid Goyal	4b846e0106	Turns out that using 256 bit code on ARM is slightly faster even though it is emulated with 128 bit registers	2024-02-25 09:57:38 +05:30
Kovid Goyal	76c6630084	Don't use 256 bit code paths on ARM. ARM only has 128 bit registers. simde simulates 256 bit operations using them, which is fairly pointless for us.	2024-02-25 09:57:38 +05:30
Kovid Goyal	23a4012aeb	Add an env var to turn off use of SIMD instructions	2024-02-25 09:57:38 +05:30
Kovid Goyal	eee14ae148	Workaround for machines on GitHub Actions that incorrectly report CPU vector instruction availability	2024-02-25 09:57:37 +05:30
Kovid Goyal	bbaccfdaae	DRYer	2024-02-25 09:57:37 +05:30
Kovid Goyal	43f64f71e4	DRYer	2024-02-25 09:57:36 +05:30
Kovid Goyal	0e4c49a0d6	Fix building on macOS ARM	2024-02-25 09:57:35 +05:30
Kovid Goyal	b3ca5d51fb	Use the new SIMD utf-8 decoder	2024-02-25 09:57:35 +05:30
Kovid Goyal	7e6459a5e4	DRYer	2024-02-25 09:57:35 +05:30
Kovid Goyal	4c8b8caead	Handle trailing incomplete sequences	2024-02-25 09:57:34 +05:30
Kovid Goyal	99e67f0859	...	2024-02-25 09:57:33 +05:30
Kovid Goyal	2cb87861c0	Ensure cpu is inited before calling cpu_supports()	2024-02-25 09:57:33 +05:30
Kovid Goyal	74391d7c50	More work on SIMD utf-8 decode	2024-02-25 09:57:31 +05:30
Kovid Goyal	8975d1a9f4	no need to parametrize sentinel	2024-02-25 09:57:31 +05:30
Kovid Goyal	0ed1c6f840	Simplify utf8 parser func. Also show a replacement char for incomplete utf-8 sequences interrupted by an esc char	2024-02-25 09:57:31 +05:30
Kovid Goyal	95eac2e510	...	2024-02-25 09:57:31 +05:30
Kovid Goyal	bc499000a5	Infrastructure for developing and testing UTF-8 SIMD decode	2024-02-25 09:57:31 +05:30
Kovid Goyal	e2be8c2d37	Use unaligned loads for SIMD. Makes no difference to the benchmarks and simplifies the code	2024-02-25 09:57:31 +05:30
Kovid Goyal	fd4c8e1e2d	Get rid of ByteLoader. Doesn't move the benchmarks	2024-02-25 09:57:31 +05:30
Kovid Goyal	ba18c5a669	Move ByteLoader back to simd-string.c in preparation for getting rid of it	2024-02-25 09:57:31 +05:30
Kovid Goyal	c79baa56e4	Remove unused SIMD code	2024-02-25 09:57:30 +05:30
Kovid Goyal	8742fb8cce	Detect availability of intrinsics on intel macs just in case	2024-02-25 09:57:30 +05:30
Kovid Goyal	718f4b328f	Go back to a single code path for drawing text. Slightly reduces pure ASCII performance and improves Unicode performance. We should be able to get pure ASCII performance back via SIMD eventually.	2024-02-25 09:57:30 +05:30
Kovid Goyal	794bd85371	Ignore warning from simde on clang	2024-02-25 09:57:29 +05:30
Kovid Goyal	49a54b086f	Use simde so SIMD speedups work on ARM as well	2024-02-25 09:57:28 +05:30
Kovid Goyal	fe2cd543ba	Switch to same algorithm for 128bit SIMD as used for 256 bit SIMD. Avoids needing to write to the haystack and also less chance of a bug in the never tested simd since all CPUs I have access to have AVX2	2024-02-25 09:57:28 +05:30
Kovid Goyal	1925d5ea65	Prepare for plain sse4 fallback	2024-02-25 09:57:27 +05:30
Kovid Goyal	aacdffd539	DRYer	2024-02-25 09:57:27 +05:30
Kovid Goyal	a0e1eb4985	AVX2 implementation for find either of two	2024-02-25 09:57:27 +05:30
Kovid Goyal	e4c48a5f17	Add AVX2 implementation of find byte not in range. Also fix alignment bug and ensure the simd finders don't return a pointer beyond the end	2024-02-25 09:57:27 +05:30
Kovid Goyal	b032313c45	Only use SIMD if CPU supports it at runtime	2024-02-25 09:57:27 +05:30
Kovid Goyal	19a41b4d9a	Use sse4.2 instruction for normal mode printable ascii detection	2024-02-25 09:57:27 +05:30
Kovid Goyal	25e7a2882d	Work on using SIMD for normal mode dispatch	2024-02-25 09:57:27 +05:30
Kovid Goyal	e3d6aa2c60	Use simd in a few loops	2024-02-25 09:57:27 +05:30
Kovid Goyal	89d416806b	...	2024-02-25 09:57:26 +05:30
Kovid Goyal	200e5bf6e3	Examine 8 bytes at once for terminator char	2024-02-25 09:57:26 +05:30
Kovid Goyal	f4819175b0	Start work on vectorizing searches	2024-02-25 09:57:26 +05:30

50 Commits