The `__future__` we relied on is now, where the 3 specific things are
all included [since Python 3.0](https://docs.python.org/3/library/__future__.html):
* absolute_import
* print_function
* unicode_literals
* division
These import statements are no-ops and are no longer necessary.
* Replace lhs/rhs with a/b for clarity of documentation and to match concrete ops.
* Concretize additional SIMDMask operations:
.&=, .|=, .^=, .==, .!=
Also reflect documentation changes back to generic implementations.
Adds concrete overloads of the following SIMD operations:
- Comparisons: .==, .!=, .<, .<=, .>, .>=
- Logical operations on masks: .!, .&, .^, .|
- Integer arithmetic: &+, &-, &, &+=, &-=, &=
This makes some simple benchmarks 10-100x faster, which is basically a no-brainer, while staying away from the most heavily used operators, so hopefully doesn't impact compilation performance too badly.