* This creates working code for most tests
* New templates for dual-type comparisons and binary operations
* Also starting unary operation shape analysis, so we can complete number operations
* Also 3.13 compatibility work for No-GIL
* Was dependent on module variable performance and covered int rather than str
* Added both int and str tests that more clearly separate the operation in question.
* The new black removes a few leading newlines in blocks, so many
files changed in that way.
* The "pygments" package has a vulnerability, but updating it and the
restructured text checking tools did not actually matter at
all.
* Bumped needed version for development to 3.8, since black does
not do 3.7 anymore, and it's old enough.
* Small immutable constants get their own code that is faster for small
sizes.
* Medium-sized ones get code that is only given the size as a hint, but
takes its items from a source list.
* For repeated lists use a dedicated helper for all sizes, even faster.
* Only large constant lists get copied with generic code now.
* This should still retain or even improve C scalability, while we
get a good performance boost out of this, 2%-20% depending on the
cases.
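The list constant cases above can be exercised with a small benchmark. This is only a sketch: the function names are hypothetical, the sizes are picked for illustration since the actual thresholds are not stated here, and the large case is left out for brevity.

```python
import timeit


def small_list():
    # A small list constant made of immutable items.
    return [1, 2, 3]


def medium_list():
    # A medium-sized list constant; 16 items is an assumed size.
    return [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]


def repeated_list():
    # A list constant that repeats a single value.
    return [0, 0, 0, 0, 0, 0, 0, 0]


if __name__ == "__main__":
    # Each call has to hand out a fresh list, so this measures the copy cost.
    for func in (small_list, medium_list, repeated_list):
        seconds = timeit.timeit(func, number=100000)
        print("%s: %.4f s" % (func.__name__, seconds))
```

Running the same file with CPython and as a compiled binary gives a rough comparison for each case.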
* With a bit of guidance, many checks can be avoided; the added example
speeds up from 137% to 201% with Python3.10.
* Add our own "TUPLE_COPY" as well, as that is used in this deep copy.
* Make sure specialized calls also raise SystemError for C functions in
case they return NULL.
* Also make sure we use call code that takes advantage of tuple size
knowledge for constant positional arguments.
* Check tuple size to catch errors in using the POSARGS tuple variants.
* Add specialization for keywords only and mixed calls for large gains
as dictionaries might be avoided.
* Merge CompiledCodeHelpers and CompiledFunction so their code can
share internals more easily.
* For unknown called objects, use "tp_vectorcall" if available for 3.8
or higher.
* Pass around argument and keyword value vectors as constant values, since
these are not modified, which might help compiler optimization.
* Avoid creating the constant tuple used to make the call when it is
needed to call non-compiled code.
* This gains 10% on instance creations with 6 arguments, but should be
most important when calling C functions.
* Also avoids tuples in cases where the internal API allows it, namely
3.7 exactly.
* We don't have an alternative that could demonstrate performance of
actually assigning, because for many operations, etc. void is now
a different operation.
* These have a different reference count to consider for extending
the value, pass this as an argument.
* This also splits unary and binary operation code generation, which
was overdue too.
* Avoids assigning back to module dict if the value didn't change.
* More work is needed for this to be actually effective for all
helper kinds, and esp. for Python3 unicode.
* For some in-place operations, support for writing module variables
has been added, including avoiding unnecessary write backs.
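Why a write back to the module dict can sometimes be skipped is visible from plain Python semantics. In the sketch below (variable and function names are hypothetical), an in-place operation on a list keeps the same object, while one on an int produces a new one. That the compiled code uses exactly this identity check is an assumption.

```python
values = []
count = 1000


def extend_values():
    global values
    before = values
    values += [1]  # list.__iadd__ mutates in place and returns the same object
    # Same object afterwards, so the module variable already holds it.
    return values is before


def bump_count():
    global count
    before = count
    count += 1  # int is immutable, the result is a different object
    # Different object afterwards, so the module variable must be updated.
    return count is before


if __name__ == "__main__":
    print("list unchanged identity:", extend_values())
    print("int unchanged identity:", bump_count())
```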
* Also changes the defaults: for standalone mode on non-Windows, use no
suffix, and follow platform conventions inside the ".dist" folder.
* For accelerated mode, use ".exe" on Windows only, and change to ".bin"
on others, which still avoids collisions, but is less confusing and
can now be overridden with the "-o" option.
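The naming defaults described above can be summarized in a short sketch. The function names are hypothetical and not the actual implementation. The ".exe" suffix for standalone mode on Windows is an assumption, as the notes only state the non-Windows standalone case.

```python
import os


def default_output_suffix(standalone, windows):
    if windows:
        # ".exe" for accelerated mode is stated above; for standalone
        # mode on Windows it is an assumption.
        return ".exe"
    # Non-Windows: no suffix for standalone, ".bin" for accelerated mode.
    return "" if standalone else ".bin"


def output_filename(main_name, standalone, windows=None, override=None):
    # An explicit name, as given with "-o", wins over the defaults.
    if override is not None:
        return override
    if windows is None:
        windows = os.name == "nt"
    return main_name + default_output_suffix(standalone, windows)


if __name__ == "__main__":
    print(output_filename("prog", standalone=False, windows=False))
    print(output_filename("prog", standalone=True, windows=False))
    print(output_filename("prog", standalone=False, windows=True))
```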
* Deleting read only files is an extra hurdle on Windows, and this
avoids it.
* Copying file permissions is not needed and only wastes time. A
few places were already using "copyfile", but many were using
"copy" for no reason.
* This should handle cases where overwriting DLLs failed on Windows
due to not being able to delete the ".dist" folder.
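The difference between the two "shutil" functions can be shown with a read-only source file. This is a minimal sketch with made-up file names: "shutil.copy" carries the permission bits over, so the copy is read-only too, while "shutil.copyfile" copies only the contents.

```python
import os
import shutil
import stat
import tempfile


def compare_copy_modes():
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "source.dll")
        with open(source, "wb") as output:
            output.write(b"payload")
        os.chmod(source, stat.S_IREAD)  # make the source read-only

        copied = os.path.join(tmp, "copy.dll")
        copyfiled = os.path.join(tmp, "copyfile.dll")
        shutil.copy(source, copied)  # contents plus permission bits
        shutil.copyfile(source, copyfiled)  # contents only

        result = {
            "copy": bool(os.stat(copied).st_mode & stat.S_IWUSR),
            "copyfile": bool(os.stat(copyfiled).st_mode & stat.S_IWUSR),
        }

        # Restore write permission so the cleanup cannot fail on Windows.
        for path in (source, copied):
            os.chmod(path, stat.S_IREAD | stat.S_IWRITE)
        return result


if __name__ == "__main__":
    # Reports whether each copy is writable by its owner.
    print(compare_copy_modes())
```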
* Only part of the tools and test code was working with Python2.6, and
some parts still had their own copies of that helper function.
* This should enable more tools to work with Python 2.6 as their
driver.
* Added support for memory measurement to the common code as
a mode.
* No need to avoid unstripped builds when producing the binary,
as it doesn't affect what "size" will say.
* Make output optional.
* Allow copying out the valgrind log file to a given filename, so
kcachegrind can find it.
* This is to allow sharing it with the code that wants to
render it for the speedcenter web site.
* Also move valgrind testing tool to that name space.
* Use proper temp file name instead of derived name from test case.
* Make running valgrind to get a tick report a common tool code.
* Move "my_print" test function to utils code for reuse in tools.
* Added context manager to get temp filename to use.
* This aims at making construct running a proper tool too,
and at merging it with the interactive numbers / kcachegrind launching
tool version.
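A context manager that hands out a temp filename, as mentioned in the tooling items above, could look roughly like this. The name "temporary_filename" and its signature are hypothetical and not the helper's actual API.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def temporary_filename(suffix=""):
    # mkstemp creates the file securely; only the name is needed here.
    handle, filename = tempfile.mkstemp(suffix=suffix)
    os.close(handle)
    try:
        yield filename
    finally:
        # The user may have removed or renamed the file already.
        if os.path.exists(filename):
            os.remove(filename)


if __name__ == "__main__":
    with temporary_filename(suffix=".log") as log_filename:
        # For example, pass this name to a tool that writes its log file.
        with open(log_filename, "w") as log_file:
            log_file.write("ticks: 12345\n")
        print("log written to", log_filename)
```

The caller gets a proper unique name instead of one derived from the test case, and the file is removed when the block exits.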