annotate README @ 10:d1d5d767004c
chore: merge diverging branches
author | Paper <paper@tflc.us> |
---|---|
date | Mon, 18 Nov 2024 15:44:09 -0500 |
parents | f12b5dd4e18c |
children | e05c257c6a23 |
vec - a tiny SIMD vector header-only library written in C99

it comes with an extremely basic (and somewhat lacking) API,
where there are eight supported vector types, all 128-bit:

	vint8x16  - 16 signed 8-bit integers
	vint16x8  - 8 signed 16-bit integers
	vint32x4  - 4 signed 32-bit integers
	vint64x2  - 2 signed 64-bit integers
	vuint8x16 - 16 unsigned 8-bit integers
	vuint16x8 - 8 unsigned 16-bit integers
	vuint32x4 - 4 unsigned 32-bit integers
	vuint64x2 - 2 unsigned 64-bit integers

all of these have many operations that are prefixed with the
name of the type and an underscore, for example:

	vint8x16 vint8x16_splat(int8_t x)
	- creates a vint8x16 where all of the values are filled
	  with the value of `x'

the currently supported operations are:

	v[u]intAxB splat([u]intA_t x)
		creates a vector with all of the values filled with
		the value of `x'

	v[u]intAxB load(const [u]intA_t x[B])
		copies the values from the memory that `x' points to;
		the address is NOT required to be aligned

	void store(v[u]intAxB vec, [u]intA_t x[B])
		copies the values from the vector into the memory that
		`x' points to

		like with load(), this does not require address alignment

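to illustrate what "not required to be aligned" means for load() and
store(), here is a scalar model in plain C. it does not use vec itself:
`model_u32x4' and its two functions are hypothetical names made up for
this sketch, and how vec implements these operations internally is not
described in this README.

```c
#include <stdint.h>
#include <string.h>

/* hypothetical stand-in for vuint32x4: four 32-bit lanes, 128 bits */
typedef struct { uint32_t lane[4]; } model_u32x4;

/* memcpy makes no assumption about 16-byte alignment of `x', so any
 * valid uint32_t pointer with four readable elements works */
static model_u32x4 model_u32x4_load(const uint32_t x[4])
{
	model_u32x4 v;
	memcpy(v.lane, x, sizeof v.lane);
	return v;
}

/* same idea in the other direction: copy the four lanes out to `x' */
static void model_u32x4_store(model_u32x4 v, uint32_t x[4])
{
	memcpy(x, v.lane, sizeof v.lane);
}
```

in this model, loading from `buf + 1' of a uint32_t array works just
as well as loading from `buf'.
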
	v[u]intAxB add(v[u]intAxB vec1, v[u]intAxB vec2)
		adds the values of `vec1' and `vec2' and returns the result

	v[u]intAxB sub(v[u]intAxB vec1, v[u]intAxB vec2)
		subtracts the values of `vec2' from `vec1' and returns the
		result

	v[u]intAxB mul(v[u]intAxB vec1, v[u]intAxB vec2)
		multiplies the values of `vec1' and `vec2' together and
		returns the result
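
add, sub and mul all work lane by lane: value i of the result depends
only on value i of each input. a scalar sketch of that, using
hypothetical `model_*' names rather than vec's own types, could look
like the following. overflow behavior is not specified in this README;
the model uses unsigned 32-bit lanes, which wrap around in C.

```c
#include <stdint.h>

/* hypothetical stand-in for vuint32x4 */
typedef struct { uint32_t lane[4]; } model_u32x4;

static model_u32x4 model_u32x4_add(model_u32x4 a, model_u32x4 b)
{
	model_u32x4 r;
	for (int i = 0; i < 4; i++)
		r.lane[i] = a.lane[i] + b.lane[i];
	return r;
}

static model_u32x4 model_u32x4_sub(model_u32x4 a, model_u32x4 b)
{
	model_u32x4 r;
	for (int i = 0; i < 4; i++)
		r.lane[i] = a.lane[i] - b.lane[i]; /* vec2 from vec1 */
	return r;
}

static model_u32x4 model_u32x4_mul(model_u32x4 a, model_u32x4 b)
{
	model_u32x4 r;
	for (int i = 0; i < 4; i++)
		r.lane[i] = a.lane[i] * b.lane[i];
	return r;
}
```
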
2
f12b5dd4e18c
*: many new operations and a real test suite
Paper <paper@tflc.us>
parents:
0
diff
changeset
|

	v[u]intAxB div(v[u]intAxB vec1, v[u]intAxB vec2)
		divides the values in vec1 by the corresponding values in
		vec2. dividing by zero is considered defined behavior and
		should result in a zero; if this doesn't happen it's
		considered a bug

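the divide-by-zero rule can be sketched with a scalar model. the names
below are hypothetical and this is not vec's implementation, only an
illustration of the documented result. the INT32_MIN / -1 case is not
covered by this README and is left unhandled here.

```c
#include <stdint.h>

/* hypothetical stand-in for vint32x4 */
typedef struct { int32_t lane[4]; } model_i32x4;

/* lane-wise division; a zero divisor yields zero in that lane
 * instead of trapping, as the README describes for div() */
static model_i32x4 model_i32x4_div(model_i32x4 a, model_i32x4 b)
{
	model_i32x4 r;
	for (int i = 0; i < 4; i++)
		r.lane[i] = b.lane[i] != 0 ? a.lane[i] / b.lane[i] : 0;
	return r;
}
```

dividing {10, -9, 7, 0} by {2, 3, 0, 0} in this model gives
{5, -3, 0, 0}.
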
	v[u]intAxB and(v[u]intAxB vec1, v[u]intAxB vec2)
		bitwise AND (&) of the values in both vectors

	v[u]intAxB or(v[u]intAxB vec1, v[u]intAxB vec2)
		bitwise OR (|) of the values in both vectors

	v[u]intAxB xor(v[u]intAxB vec1, v[u]intAxB vec2)
		bitwise XOR (^) of the values in both vectors

	v[u]intAxB rshift(v[u]intAxB vec1, vuintAxB vec2)
		arithmetic right shift of the values in vec1 by
		the corresponding values in vec2

	v[u]intAxB lshift(v[u]intAxB vec1, vuintAxB vec2)
		arithmetic left shift of the values in vec1 by
		the corresponding values in vec2

	v[u]intAxB lrshift(v[u]intAxB vec1, vuintAxB vec2)
		logical right shift of the values in vec1 by
		the corresponding values in vec2

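the difference between rshift and lrshift only shows up for negative
values. the per-lane sketch below uses hypothetical function names and
assumes a shift count smaller than 32; what vec does for larger counts
is not stated in this README.

```c
#include <stdint.h>

/* arithmetic right shift: the sign bit is copied into the vacated
 * bits. written so that a negative value is never right-shifted
 * directly, since that is implementation-defined in C99 */
static int32_t model_rshift(int32_t x, unsigned n)
{
	return x < 0 ? ~(~x >> n) : x >> n;
}

/* logical right shift: zeros are shifted in regardless of the sign
 * bit. the result is returned as unsigned to avoid converting an
 * out-of-range value back to a signed type */
static uint32_t model_lrshift(int32_t x, unsigned n)
{
	return (uint32_t)x >> n;
}
```

for -8 shifted by 1, the arithmetic model gives -4 while the logical
model gives 0x7FFFFFFC; for non-negative inputs the two agree.
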
	v[u]intAxB avg(v[u]intAxB vec1, v[u]intAxB vec2)
		returns the average of the values in both vectors
		i.e., div(add(vec1, vec2), splat(2))

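a scalar sketch of avg for 8-bit unsigned lanes follows. the names are
hypothetical, and two choices here are assumptions rather than
documented behavior: the sum is computed in a wider type so it cannot
overflow, and the division truncates.

```c
#include <stdint.h>

/* hypothetical stand-in for vuint8x16 */
typedef struct { uint8_t lane[16]; } model_u8x16;

static model_u8x16 model_u8x16_avg(model_u8x16 a, model_u8x16 b)
{
	model_u8x16 r;
	for (int i = 0; i < 16; i++) {
		/* widen first: the sum is at most 510, which fits */
		unsigned sum = (unsigned)a.lane[i] + b.lane[i];
		r.lane[i] = (uint8_t)(sum / 2); /* truncating (assumed) */
	}
	return r;
}
```
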
there are also a number of comparisons possible:

	v[u]intAxB cmplt(v[u]intAxB vec1, v[u]intAxB vec2)
		turns on all bits of the corresponding value in
		the result vector if the value in `vec1' is less
		than the corresponding value in `vec2', else all
		of the bits are turned off.

	v[u]intAxB cmpgt(v[u]intAxB vec1, v[u]intAxB vec2)
		turns on all bits of the corresponding value in
		the result vector if the value in `vec1' is greater
		than the corresponding value in `vec2', else all
		of the bits are turned off.

	v[u]intAxB cmpeq(v[u]intAxB vec1, v[u]intAxB vec2)
		turns on all bits of the corresponding value in
		the result vector if the value in `vec1' is equal
		to the corresponding value in `vec2', else all
		of the bits are turned off.

	v[u]intAxB cmple(v[u]intAxB vec1, v[u]intAxB vec2)
		turns on all bits of the corresponding value in
		the result vector if the value in `vec1' is less
		than or equal to the corresponding value in `vec2',
		else all of the bits are turned off.

	v[u]intAxB cmpge(v[u]intAxB vec1, v[u]intAxB vec2)
		turns on all bits of the corresponding value in
		the result vector if the value in `vec1' is greater
		than or equal to the corresponding value in `vec2',
		else all of the bits are turned off.
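
because a comparison result is either all ones or all zeros per value,
it can be used directly as a bit mask. the scalar sketch below models
cmplt and then uses the mask to pick values without branching. all
names are hypothetical, and the `min' helper is only an illustration
of one common use of such masks, not an operation listed above.

```c
#include <stdint.h>

/* hypothetical stand-in for vint32x4 */
typedef struct { int32_t lane[4]; } model_i32x4;

/* cmplt model: a lane is all ones (-1) when a < b there, else zero */
static model_i32x4 model_i32x4_cmplt(model_i32x4 a, model_i32x4 b)
{
	model_i32x4 r;
	for (int i = 0; i < 4; i++)
		r.lane[i] = a.lane[i] < b.lane[i] ? -1 : 0;
	return r;
}

/* lane-wise minimum built from the mask: (a & m) | (b & ~m) */
static model_i32x4 model_i32x4_min(model_i32x4 a, model_i32x4 b)
{
	model_i32x4 m = model_i32x4_cmplt(a, b);
	model_i32x4 r;
	for (int i = 0; i < 4; i++)
		r.lane[i] = (a.lane[i] & m.lane[i]) | (b.lane[i] & ~m.lane[i]);
	return r;
}
```

with real vectors the same selection would be spelled with the and, or
and xor operations instead of a per-lane loop.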