view README @ 27:d00b95f95dd1 default tip
impl/arm/neon: it compiles again, but is untested
author | Paper <paper@tflc.us> |
---|---|
date | Mon, 25 Nov 2024 00:33:02 -0500 |
parents | e26874655738 |
vec - a tiny SIMD vector library written in C99

it comes with an extremely basic API that is similar to other intrinsics
libraries; each type is in the exact same format:

    v[sign][bits]x[size]
        where `sign' is either nothing (for signed) or `u' (for unsigned),
        `bits' is the bit size of the integer format,
        and `size' is how many integers are in the vector

vec provides types for 64-bit, 128-bit, 256-bit, and 512-bit SIMD intrinsics
on processors where vec has an implementation, and falls back to array-based
implementations where it does not.

to initialize vec, you MUST call `vec_init()' when your program starts up.

note that `vec_init()' is NOT thread-safe, and things can and will
blow up if you call it simultaneously from different threads (e.g. if you
try to initialize it lazily, only when you first need it... please just
initialize it on startup so you don't have to worry about that!!!)

all of the vector types have many operations that are prefixed with the
name of the type and an underscore, for example:

    vint8x16 vint8x16_splat(int8_t x)
    - creates a vint8x16 where all of the values are filled
      with the value of `x'

the currently supported operations are:

    v[u]intAxB splat([u]intA_t x)
        creates a vector with all of the values filled
        with the value of `x'

    v[u]intAxB load(const [u]intA_t x[B])
        copies the values from the memory address stored at `x';
        the address is NOT required to be aligned

    void store(v[u]intAxB vec, [u]intA_t x[B])
        copies the values from the vector into the memory address
        stored at `x'

        like with load(), this does not require address alignment

    v[u]intAxB add(v[u]intAxB vec1, v[u]intAxB vec2)
        adds the values of `vec1' and `vec2' and returns the result

    v[u]intAxB sub(v[u]intAxB vec1, v[u]intAxB vec2)
        subtracts the values of `vec2' from `vec1' and returns the result

    v[u]intAxB mul(v[u]intAxB vec1, v[u]intAxB vec2)
        multiplies the values of `vec1' and `vec2' together and
        returns the result

    v[u]intAxB div(v[u]intAxB vec1, v[u]intAxB vec2)
        divides the values in `vec1' by the corresponding values
        in `vec2'.

        dividing by zero is considered defined behavior and should
        result in a zero; if this doesn't happen it's considered a bug

    v[u]intAxB and(v[u]intAxB vec1, v[u]intAxB vec2)
        bitwise AND (&) of the values in both vectors

    v[u]intAxB or(v[u]intAxB vec1, v[u]intAxB vec2)
        bitwise OR (|) of the values in both vectors

    v[u]intAxB xor(v[u]intAxB vec1, v[u]intAxB vec2)
        bitwise XOR (^) of the values in both vectors

    v[u]intAxB rshift(v[u]intAxB vec1, vuintAxB vec2)
        arithmetic right shift of the values in vec1 by
        the corresponding values in vec2

    v[u]intAxB lshift(v[u]intAxB vec1, vuintAxB vec2)
        arithmetic left shift of the values in vec1 by
        the corresponding values in vec2

    v[u]intAxB lrshift(v[u]intAxB vec1, vuintAxB vec2)
        logical right shift of the values in vec1 by
        the corresponding values in vec2

    v[u]intAxB avg(v[u]intAxB vec1, v[u]intAxB vec2)
        returns the average of the values in both vectors,
        i.e. div(add(vec1, vec2), splat(2))

there are also a number of comparisons possible:

    v[u]intAxB cmplt(v[u]intAxB vec1, v[u]intAxB vec2)
        turns on all bits of the corresponding value in
        the result vector if the value in `vec1' is
        less than the corresponding value in `vec2',
        else all of the bits are turned off.

    v[u]intAxB cmpgt(v[u]intAxB vec1, v[u]intAxB vec2)
        turns on all bits of the corresponding value in
        the result vector if the value in `vec1' is
        greater than the corresponding value in `vec2',
        else all of the bits are turned off.

    v[u]intAxB cmpeq(v[u]intAxB vec1, v[u]intAxB vec2)
        turns on all bits of the corresponding value in
        the result vector if the value in `vec1' is
        equal to the corresponding value in `vec2',
        else all of the bits are turned off.

    v[u]intAxB cmple(v[u]intAxB vec1, v[u]intAxB vec2)
        turns on all bits of the corresponding value in
        the result vector if the value in `vec1' is
        less than or equal to the corresponding value in `vec2',
        else all of the bits are turned off.

    v[u]intAxB cmpge(v[u]intAxB vec1, v[u]intAxB vec2)
        turns on all bits of the corresponding value in
        the result vector if the value in `vec1' is
        greater than or equal to the corresponding value in `vec2',
        else all of the bits are turned off.
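
as a rough illustration of the semantics documented above, here is a minimal
scalar sketch of a few operations on a 4-lane unsigned 32-bit vector. this is
NOT vec's implementation or API: the `ref_u32x4' type and every `ref_'
function are hypothetical names made up for this example. only the behavior
(unaligned load/store, division by zero giving zero, comparisons turning on
all bits) is taken from the descriptions above.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of a 4-lane unsigned 32-bit vector.
 * The ref_ names are illustrative only and are not part of vec. */
typedef struct { uint32_t lane[4]; } ref_u32x4;

/* splat: every lane is filled with the value of x */
static ref_u32x4 ref_u32x4_splat(uint32_t x)
{
    ref_u32x4 v;
    for (int i = 0; i < 4; i++) v.lane[i] = x;
    return v;
}

/* load: memcpy places no alignment requirement on x */
static ref_u32x4 ref_u32x4_load(const uint32_t x[4])
{
    ref_u32x4 v;
    memcpy(v.lane, x, sizeof v.lane);
    return v;
}

/* store: like load, no alignment requirement on x */
static void ref_u32x4_store(ref_u32x4 v, uint32_t x[4])
{
    memcpy(x, v.lane, sizeof v.lane);
}

/* add: lane-wise sum (unsigned arithmetic wraps on overflow in C) */
static ref_u32x4 ref_u32x4_add(ref_u32x4 a, ref_u32x4 b)
{
    ref_u32x4 r;
    for (int i = 0; i < 4; i++) r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

/* div: a lane divided by zero yields zero, as the README specifies */
static ref_u32x4 ref_u32x4_div(ref_u32x4 a, ref_u32x4 b)
{
    ref_u32x4 r;
    for (int i = 0; i < 4; i++)
        r.lane[i] = (b.lane[i] != 0) ? a.lane[i] / b.lane[i] : 0;
    return r;
}

/* and: lane-wise bitwise AND */
static ref_u32x4 ref_u32x4_and(ref_u32x4 a, ref_u32x4 b)
{
    ref_u32x4 r;
    for (int i = 0; i < 4; i++) r.lane[i] = a.lane[i] & b.lane[i];
    return r;
}

/* cmplt: all bits on where a < b, all bits off otherwise */
static ref_u32x4 ref_u32x4_cmplt(ref_u32x4 a, ref_u32x4 b)
{
    ref_u32x4 r;
    for (int i = 0; i < 4; i++)
        r.lane[i] = (a.lane[i] < b.lane[i]) ? UINT32_MAX : 0;
    return r;
}
```

in this model, and(v, cmplt(v, limit)) keeps the lanes of `v' that are below
`limit' and zeroes the rest, which is the usual reason for returning
all-bits-on masks instead of 1. with vec itself, the equivalent calls would
presumably be named vuint32x4_load, vuint32x4_cmplt, and so on, following the
naming scheme above, after a single vec_init() at startup.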