Kovid Goyal
|
6f588a0c29
|
run modernize
|
2025-11-11 17:09:37 +05:30 |
|
Kovid Goyal
|
fd5876b94e
|
Use SIMD to replace C0 control codes in Go code
|
2025-07-21 08:54:22 +05:30 |
|
Kovid Goyal
|
c861259e3b
|
Rename go module from kitty -> github.com/kovidgoyal/kitty
Makes the code more easily reusable in other projects
|
2025-05-16 08:43:39 +05:30 |
|
Kovid Goyal
|
237bb35ee9
|
More CodeQL fixes
|
2025-04-20 21:53:11 +05:30 |
|
Kovid Goyal
|
32f0da2e77
|
Ensure no frame is created for assembly functions
|
2024-03-15 07:58:09 +05:30 |
|
Kovid Goyal
|
65923b1aba
|
Add some benchmarking
|
2024-03-07 11:09:24 +05:30 |
|
Kovid Goyal
|
47fea26b62
|
Add an IndexByte implementation useful for benchmarking against stdlib SIMD implementation
|
2024-03-07 09:36:40 +05:30 |
|
Kovid Goyal
|
a7c06b38e6
|
We don't actually need vzeroupper at start of function
GCC emits vzeroupper automatically when compiling with native
optimizations but we still need it otherwise
|
2024-02-25 09:57:43 +05:30 |
|
Kovid Goyal
|
720618bc37
|
Use go 1.22 for building
It supports PCALIGN on non-ARM arches as well
|
2024-02-25 09:57:43 +05:30 |
|
Kovid Goyal
|
ede4d7fbca
|
...
|
2024-02-25 09:57:42 +05:30 |
|
Kovid Goyal
|
c01b959723
|
Fix Go unaligned index implementation
|
2024-02-25 09:57:42 +05:30 |
|
Kovid Goyal
|
7467307200
|
Add some alignment tests
|
2024-02-25 09:57:42 +05:30 |
|
Kovid Goyal
|
bbdb0b15f3
|
DRYer
|
2024-02-25 09:57:42 +05:30 |
|
Kovid Goyal
|
b5edd9ad57
|
Don't precalculate mask in loop body
No need since we don't shift. Avoids the extra mask instructions for the
not found case.
|
2024-02-25 09:57:42 +05:30 |
|
Kovid Goyal
|
f9fd6ffd46
|
Use only aligned loads for index funcs
Also obviates the necessity for safe slice wrappers
|
2024-02-25 09:57:41 +05:30 |
|
Kovid Goyal
|
31a5fcf297
|
DRYer
|
2024-02-25 09:57:41 +05:30 |
|
Kovid Goyal
|
561712090d
|
Fix cmplt implementation
|
2024-02-25 09:57:41 +05:30 |
|
Kovid Goyal
|
d9190ea675
|
DRYer
|
2024-02-25 09:57:41 +05:30 |
|
Kovid Goyal
|
57f4ea4d4a
|
Add some tests for broadcast from constant intrinsic
|
2024-02-25 09:57:41 +05:30 |
|
Kovid Goyal
|
9b0ae8d403
|
Don't use VEX-encoded instructions for 128-bit ISA
|
2024-02-25 09:57:41 +05:30 |
|
Kovid Goyal
|
aed0611fb8
|
Avoid double trailing RET
|
2024-02-25 09:57:40 +05:30 |
|
Kovid Goyal
|
5a5e31c38b
|
Also zero upper at start of function
|
2024-02-25 09:57:40 +05:30 |
|
Kovid Goyal
|
db2e0e816d
|
Fix mixing of register types in the same function
|
2024-02-25 09:57:40 +05:30 |
|
Kovid Goyal
|
a298781b85
|
DRYer
|
2024-02-25 09:57:40 +05:30 |
|
Kovid Goyal
|
d5cd9ef2ca
|
...
|
2024-02-25 09:57:40 +05:30 |
|
Kovid Goyal
|
da31db3212
|
...
|
2024-02-25 09:57:40 +05:30 |
|
Kovid Goyal
|
601c4ad4df
|
Fix some typos
|
2024-02-25 09:57:40 +05:30 |
|
Kovid Goyal
|
68d800d4fa
|
make clean should clean generated asm as well
|
2024-02-25 09:57:40 +05:30 |
|
Kovid Goyal
|
9fc3db1dd1
|
Work on C0 index func
|
2024-02-25 09:57:40 +05:30 |
|
Kovid Goyal
|
161eae78b6
|
Make generated asm_* files world readable
|
2024-02-25 09:57:40 +05:30 |
|
Kovid Goyal
|
77cfd44f24
|
More efficient clearing of register to all zeros or all ones
|
2024-02-25 09:57:39 +05:30 |
|
Kovid Goyal
|
59be7213cf
|
Make set1_epi8 more general
|
2024-02-25 09:57:39 +05:30 |
|
Kovid Goyal
|
d60dacbd09
|
Implement > and < intrinsics for vector registers
|
2024-02-25 09:57:39 +05:30 |
|
Kovid Goyal
|
82b7b4fcce
|
Make a reusable template for generating ASM index functions with different tests
|
2024-02-25 09:57:39 +05:30 |
|
Kovid Goyal
|
4e6138d785
|
Generate SIMD code during build
|
2024-02-25 09:57:39 +05:30 |
|
Kovid Goyal
|
de8c1e0206
|
Work on porting SIMD vt parser to Go for the kittens
|
2024-02-25 09:57:39 +05:30 |
|