vec: README annotate

annotate README @ 20:627d548b23c8

impl/generic: fix load and store implementations
this caused a segmentation fault under AltiVec, but it went under the radar on x86 because my main PC supports all of the non-generic vector implementations.

author	Paper <paper@tflc.us>
date	Thu, 21 Nov 2024 21:19:11 +0000
parents	e05c257c6a23
children	e26874655738

rev	line source
0 02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	1 vec - a tiny SIMD vector header-only library written in C99
02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	2
15 e05c257c6a23 : huge refactor, add many new x86 intrinsics and the like Paper <paper@tflc.us>* parents: 2 diff changeset	3 it comes with an extremely basic API that is similar to other intrinsics
e05c257c6a23 : huge refactor, add many new x86 intrinsics and the like Paper <paper@tflc.us>* parents: 2 diff changeset	4 libraries; each type is in the exact same format:
0 02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	5
15 e05c257c6a23 : huge refactor, add many new x86 intrinsics and the like Paper <paper@tflc.us>* parents: 2 diff changeset	6 v[sign]int[bits]x[size]
e05c257c6a23 : huge refactor, add many new x86 intrinsics and the like Paper <paper@tflc.us>* parents: 2 diff changeset	7 where `sign' is either nothing (for signed) or `u' (for unsigned),
e05c257c6a23 : huge refactor, add many new x86 intrinsics and the like Paper <paper@tflc.us>* parents: 2 diff changeset	8 `bits' is the bit size of the integer format,
e05c257c6a23 : huge refactor, add many new x86 intrinsics and the like Paper <paper@tflc.us>* parents: 2 diff changeset	9 and `size' is how many integers are in the vector
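as a small illustration of the naming scheme, the sketch below assembles a type name from its three parts. `type_name' is a hypothetical helper made up for this example; it is not part of vec.

```c
#include <stdio.h>

/* Hypothetical helper, not part of vec: builds a type name such as
 * "vuint8x16" from its sign, bit size, and lane count. Returns what
 * snprintf returns: the length of the full name without the NUL. */
static int type_name(char *buf, size_t len, int is_unsigned, int bits, int size)
{
    return snprintf(buf, len, "v%sint%dx%d",
                    is_unsigned ? "u" : "", bits, size);
}
```

for example, an unsigned 8-bit type with 16 lanes comes out as `vuint8x16', and a signed 32-bit type with 4 lanes as `vint32x4'.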
0 02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	10
15 e05c257c6a23 : huge refactor, add many new x86 intrinsics and the like Paper <paper@tflc.us>* parents: 2 diff changeset	11 vec provides types for 64-bit, 128-bit, 256-bit, and 512-bit SIMD intrinsics
e05c257c6a23 : huge refactor, add many new x86 intrinsics and the like Paper <paper@tflc.us>* parents: 2 diff changeset	12 on processors where vec has an implementation and falls back to array-based
e05c257c6a23 : huge refactor, add many new x86 intrinsics and the like Paper <paper@tflc.us>* parents: 2 diff changeset	13 implementations where it does not.
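to give a rough idea of what an array-based fallback can look like, here is a minimal sketch in plain C. the `my_' names and the struct layout are assumptions made for this example; vec's real generic implementation may be organized differently.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an array-based 128-bit vector holding
 * sixteen signed 8-bit lanes. */
typedef struct { int8_t lanes[16]; } my_vint8x16;

/* Fill every lane with the same value. */
static my_vint8x16 my_vint8x16_splat(int8_t x)
{
    my_vint8x16 v;
    for (int i = 0; i < 16; i++)
        v.lanes[i] = x;
    return v;
}

/* memcpy places no alignment requirement on `x'. */
static my_vint8x16 my_vint8x16_load(const int8_t x[16])
{
    my_vint8x16 v;
    memcpy(v.lanes, x, sizeof v.lanes);
    return v;
}

/* Copy the lanes back out; again no alignment is needed. */
static void my_vint8x16_store(my_vint8x16 v, int8_t x[16])
{
    memcpy(x, v.lanes, sizeof v.lanes);
}
```

a load followed by a store round-trips the sixteen values unchanged, whatever the alignment of the source and destination arrays.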
e05c257c6a23 : huge refactor, add many new x86 intrinsics and the like Paper <paper@tflc.us>* parents: 2 diff changeset	14
e05c257c6a23 : huge refactor, add many new x86 intrinsics and the like Paper <paper@tflc.us>* parents: 2 diff changeset	15 all of these have many operations that are prefixed with the name of the
e05c257c6a23 : huge refactor, add many new x86 intrinsics and the like Paper <paper@tflc.us>* parents: 2 diff changeset	16 type and an underscore, for example:
0 02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	17
02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	18 vint8x16 vint8x16_splat(int8_t x)
02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	19 - creates a vint8x16 where all of the values are filled
02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	20 with the value of `x'
02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	21
02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	22 the current supported operations are:
02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	23
02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	24 v[u]intAxB splat([u]intA_t x)
02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	25 creates a vector where all of the values are filled with
02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	26 the value of `x'
02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	27
02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	28 v[u]intAxB load(const [u]intA_t x[B])
02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	29 copies the values from the memory address stored at `x';
02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	30 the address is NOT required to be aligned
02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	31
02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	32 void store(v[u]intAxB vec, [u]intA_t x[B])
02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	33 copies the values from the vector into the memory address
02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	34 stored at `x'
02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	35
02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	36 like with load(), this does not require address alignment
02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	37
02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	38 v[u]intAxB add(v[u]intAxB vec1, v[u]intAxB vec2)
02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	39 adds the value of `vec1' and `vec2' and returns it
02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	40
02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	41 v[u]intAxB sub(v[u]intAxB vec1, v[u]intAxB vec2)
02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	42 subtracts the value of `vec2' from `vec1' and returns it
02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	43
02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	44 v[u]intAxB mul(v[u]intAxB vec1, v[u]intAxB vec2)
02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	45 multiplies the values of `vec1' and `vec2' together and
02a517e4c492 : initial commit Paper <paper@paper.us.eu.org>* parents: diff changeset	46 returns it
2 f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	47
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	48 v[u]intAxB div(v[u]intAxB vec1, v[u]intAxB vec2)
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	49 divides vec1 by the values in vec2. dividing by zero is
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	50 considered defined behavior and should result in a zero;
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	51 if this doesn't happen it's considered a bug
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	52
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	53 v[u]intAxB and(v[u]intAxB vec1, v[u]intAxB vec2)
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	54 bitwise AND (&) of the values in both vectors
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	55
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	56 v[u]intAxB or(v[u]intAxB vec1, v[u]intAxB vec2)
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	57 bitwise OR (\|) of the values in both vectors
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	58
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	59 v[u]intAxB xor(v[u]intAxB vec1, v[u]intAxB vec2)
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	60 bitwise XOR (^) of the values in both vectors
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	61
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	62 v[u]intAxB rshift(v[u]intAxB vec1, vuintAxB vec2)
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	63 arithmetic right shift of the values in vec1 by
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	64 the corresponding values in vec2
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	65
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	66 v[u]intAxB lshift(v[u]intAxB vec1, vuintAxB vec2)
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	67 arithmetic left shift of the values in vec1 by
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	68 the corresponding values in vec2
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	69
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	70 v[u]intAxB lrshift(v[u]intAxB vec1, vuintAxB vec2)
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	71 logical right shift of the values in vec1 by
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	72 the corresponding values in vec2
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	73
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	74 v[u]intAxB avg(v[u]intAxB vec1, v[u]intAxB vec2)
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	75 returns the average of the values in both vectors
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	76 i.e., div(add(vec1, vec2), splat(2))
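
the division and shift rules above are easiest to see on a single lane. the following is a scalar sketch of those rules, not vec's code: the `lane_' names are made up, and the handling of INT32_MIN / -1 and of shift counts of 32 or more is an assumption, since the text above does not cover those cases.

```c
#include <stdint.h>

/* One lane of div(): a zero divisor yields zero instead of trapping. */
static int32_t lane_div(int32_t a, int32_t b)
{
    if (b == 0)
        return 0;
    /* INT32_MIN / -1 overflows in C; returning INT32_MIN here is
     * purely this sketch's choice. */
    if (a == INT32_MIN && b == -1)
        return INT32_MIN;
    return a / b;
}

/* One lane of rshift(): arithmetic, so the sign is preserved.
 * Avoids the implementation-defined `>>' on negative values. */
static int32_t lane_rshift(int32_t a, uint32_t n)
{
    if (n > 31)
        n = 31; /* clamping is an assumption */
    if (a >= 0)
        return (int32_t)((uint32_t)a >> n);
    /* ~(uint32_t)a equals -a - 1, which is non-negative */
    return -1 - (int32_t)(~(uint32_t)a >> n);
}

/* One lane of lrshift(): logical, so zeros are shifted in from the
 * top. Works on the raw bit pattern of the lane. */
static uint32_t lane_lrshift(uint32_t bits, uint32_t n)
{
    return (n > 31) ? 0 : bits >> n; /* result for n > 31 is an assumption */
}
```

shifting -8 right by one gives -4 with the arithmetic shift, while the logical shift of the same bit pattern (0xFFFFFFF8) gives 0x7FFFFFFC.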
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	77
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	78 there are also a number of comparisons possible:
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	79
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	80 v[u]intAxB cmplt(v[u]intAxB vec1, v[u]intAxB vec2)
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	81 turns on all bits of the corresponding value in
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	82 the result vector if the value in `vec1' is less
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	83 than the corresponding value in `vec2', else all
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	84 of the bits are turned off.
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	85
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	86 v[u]intAxB cmpgt(v[u]intAxB vec1, v[u]intAxB vec2)
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	87 turns on all bits of the corresponding value in
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	88 the result vector if the value in `vec1' is greater
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	89 than the corresponding value in `vec2', else all
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	90 of the bits are turned off.
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	91
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	92 v[u]intAxB cmpeq(v[u]intAxB vec1, v[u]intAxB vec2)
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	93 turns on all bits of the corresponding value in
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	94 the result vector if the value in `vec1' is equal
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	95 to the corresponding value in `vec2', else all
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	96 of the bits are turned off.
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	97
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	98 v[u]intAxB cmple(v[u]intAxB vec1, v[u]intAxB vec2)
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	99 turns on all bits of the corresponding value in
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	100 the result vector if the value in `vec1' is less
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	101 than or equal to the corresponding value in `vec2',
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	102 else all of the bits are turned off.
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	103
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	104 v[u]intAxB cmpge(v[u]intAxB vec1, v[u]intAxB vec2)
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	105 turns on all bits of the corresponding value in
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	106 the result vector if the value in `vec1' is greater
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	107 than or equal to the corresponding value in `vec2',
f12b5dd4e18c : many new operations and a real test suite Paper <paper@tflc.us>* parents: 0 diff changeset	108 else all of the bits are turned off.
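
because a comparison returns all-ones or all-zeros per value, its result can be fed straight into and/or to pick between two inputs without branching. the sketch below shows this on a single unsigned 8-bit lane; the `lane_' functions are hypothetical scalar stand-ins, not vec functions.

```c
#include <stdint.h>

/* One lane of cmplt(): all bits on (0xFF) when a < b, else all off. */
static uint8_t lane_cmplt(uint8_t a, uint8_t b)
{
    return (a < b) ? UINT8_MAX : 0;
}

/* Branch-free select built from and/or: yields `x' where the mask is
 * all ones and `y' where it is all zeros. */
static uint8_t lane_select(uint8_t mask, uint8_t x, uint8_t y)
{
    return (uint8_t)((mask & x) | (~mask & y));
}

/* Lane-wise minimum as an example use of a comparison mask. */
static uint8_t lane_min(uint8_t a, uint8_t b)
{
    return lane_select(lane_cmplt(a, b), a, b);
}
```

the same pattern applied to whole vectors would combine cmplt(), and(), and or() in place of the scalar operators.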
15 e05c257c6a23 : huge refactor, add many new x86 intrinsics and the like Paper <paper@tflc.us>* parents: 2 diff changeset	109
e05c257c6a23 : huge refactor, add many new x86 intrinsics and the like Paper <paper@tflc.us>* parents: 2 diff changeset	110 to initialize vec, you MUST call `vec_init()' when your program starts up.
e05c257c6a23 : huge refactor, add many new x86 intrinsics and the like Paper <paper@tflc.us>* parents: 2 diff changeset	111
e05c257c6a23 : huge refactor, add many new x86 intrinsics and the like Paper <paper@tflc.us>* parents: 2 diff changeset	112 note that `vec_init()' is NOT thread-safe, and things can and will
e05c257c6a23 : huge refactor, add many new x86 intrinsics and the like Paper <paper@tflc.us>* parents: 2 diff changeset	113 blow up if you call it simultaneously from different threads (i.e. you
e05c257c6a23 : huge refactor, add many new x86 intrinsics and the like Paper <paper@tflc.us>* parents: 2 diff changeset	114 try to only initialize it when you need to... please just initialize
e05c257c6a23 : huge refactor, add many new x86 intrinsics and the like Paper <paper@tflc.us>* parents: 2 diff changeset	115 it on startup so you don't have to worry about that!!!)
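
a rough sketch of the recommended ordering follows. `fake_vec_init', `backend_ready', and `app_startup' are stand-ins invented for this example so that it compiles without vec's header; in a real program the call would be `vec_init()' itself.

```c
#include <stdbool.h>

/* Stand-in for vec_init(); like the real function is documented to
 * be, this is NOT thread-safe. */
static bool backend_ready = false;

static void fake_vec_init(void)
{
    backend_ready = true;
}

/* Hypothetical startup helper: initialize first, while the program
 * is still single-threaded, and only then start any threads that
 * use vectors. Returns 0 on success. */
static int app_startup(void)
{
    fake_vec_init(); /* in a real program: vec_init(); */
    /* ... spawn worker threads and do vector work from here on ... */
    return backend_ready ? 0 : -1;
}
```

calling the startup helper once at the top of main(), before any thread is created, avoids the race described above.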
