Mercurial > vec
annotate src/impl/x86/avx512f.c @ 25:92156fe32755
impl/ppc/altivec: update to new implementation
the signed average function is wrong; it needs to round up the number
when only one of them is odd, but that doesn't necessarily seem to be
true because altivec is weird, and that's what we need to emulate the
quirks for. ugh.
also the altivec backend uses the generic functions instead of fallbacks
because it does indeed use the exact same memory structure as the generic
implementation...
author | Paper <paper@tflc.us> |
---|---|
date | Sun, 24 Nov 2024 11:15:59 +0000 |
parents | e49e70f7012f |
children |
rev | line source |
---|---|
23
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
1 /** |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
2 * vec - a tiny SIMD vector library in C99 |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
3 * |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
4 * Copyright (c) 2024 Paper |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
5 * |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
6 * Permission is hereby granted, free of charge, to any person obtaining a copy |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
7 * of this software and associated documentation files (the "Software"), to deal |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
8 * in the Software without restriction, including without limitation the rights |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
9 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
10 * copies of the Software, and to permit persons to whom the Software is |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
11 * furnished to do so, subject to the following conditions: |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
12 * |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
13 * The above copyright notice and this permission notice shall be included in all |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
14 * copies or substantial portions of the Software. |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
15 * |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
16 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
17 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
18 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
19 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
20 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
21 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
22 * SOFTWARE. |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
23 **/ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
24 |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
25 #include "vec/impl/x86/avx512f.h" |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
26 #include "vec/impl/generic.h" |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
27 |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
28 #include <immintrin.h> |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
29 |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
30 // this is a stupid amount of work just to do these operations, is it really worth it ? |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
31 // also same note in avx2.c applies here, these do not handle sign bits properly, which |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
32 // isn't that big of a deal for regular arithmetic operations, but matters quite a bit |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
33 // when doing things like arithmetic shifts. |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
34 #define VEC_AVX512F_OPERATION_8x64(op, sign) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
35 do { \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
36 union v##sign##int8x64_impl_data *vec1d = (union v##sign##int8x64_impl_data *)&vec1; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
37 union v##sign##int8x64_impl_data *vec2d = (union v##sign##int8x64_impl_data *)&vec2; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
38 \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
39 /* unpack and operate */ \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
40 __m512i dst_1 = _mm512_##op##_epi32(_mm512_srli_epi32(_mm512_slli_epi32(vec1d->avx512f, 24), 24), _mm512_srli_epi32(_mm512_slli_epi32(vec2d->avx512f, 24), 24)); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
41 __m512i dst_2 = _mm512_##op##_epi32(_mm512_srli_epi32(_mm512_slli_epi32(vec1d->avx512f, 16), 24), _mm512_srli_epi32(_mm512_slli_epi32(vec2d->avx512f, 16), 24)); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
42 __m512i dst_3 = _mm512_##op##_epi32(_mm512_srli_epi32(_mm512_slli_epi32(vec1d->avx512f, 8), 24), _mm512_srli_epi32(_mm512_slli_epi32(vec2d->avx512f, 8), 24)); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
43 __m512i dst_4 = _mm512_##op##_epi32(_mm512_srli_epi32(vec1d->avx512f, 24), _mm512_srli_epi32(vec2d->avx512f, 24)); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
44 \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
45 /* repack */ \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
46 vec1d->avx512f = _mm512_or_si512( \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
47 _mm512_or_si512( \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
48 _mm512_srli_epi32(_mm512_slli_epi32(dst_1, 24), 24), \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
49 _mm512_srli_epi32(_mm512_slli_epi32(dst_2, 24), 16) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
50 ), \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
51 _mm512_or_si512( \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
52 _mm512_srli_epi32(_mm512_slli_epi32(dst_3, 24), 8), \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
53 _mm512_slli_epi32(dst_4, 24) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
54 ) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
55 ); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
56 \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
57 return vec1d->vec; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
58 } while (0) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
59 |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
60 #define VEC_AVX512F_OPERATION_16x32(op, sign) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
61 do { \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
62 union v##sign##int16x32_impl_data *vec1d = (union v##sign##int16x32_impl_data *)&vec1; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
63 union v##sign##int16x32_impl_data *vec2d = (union v##sign##int16x32_impl_data *)&vec2; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
64 \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
65 /* unpack and operate; it would be nice if we had an _m512_andi_epi32... */ \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
66 __m512i dst_1 = _mm512_##op##_epi32(_mm512_srli_epi32(_mm512_slli_epi32(vec1d->avx512f, 16), 16), _mm512_srli_epi32(_mm512_slli_epi32(vec2d->avx512f, 16), 16)); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
67 __m512i dst_2 = _mm512_##op##_epi32(_mm512_srli_epi32(vec1d->avx512f, 16), _mm512_srli_epi32(vec2d->avx512f, 16)); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
68 \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
69 /* repack */ \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
70 vec1d->avx512f = _mm512_or_si512( \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
71 _mm512_srli_epi32(_mm512_slli_epi32(dst_1, 16), 16), \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
72 _mm512_slli_epi32(dst_2, 16) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
73 ); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
74 return vec1d->vec; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
75 } while (0) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
76 |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
77 #define VEC_AVX512F_ADD_8x64(sign) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
78 VEC_AVX512F_OPERATION_8x64(add, sign) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
79 |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
80 #define VEC_AVX512F_ADD_16x32(sign) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
81 VEC_AVX512F_OPERATION_16x32(add, sign) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
82 |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
83 #define VEC_AVX512F_ADD_32x16(sign) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
84 do { \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
85 union v##sign##int32x16_impl_data *vec1d = (union v##sign##int32x16_impl_data *)&vec1; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
86 union v##sign##int32x16_impl_data *vec2d = (union v##sign##int32x16_impl_data *)&vec2; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
87 \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
88 vec1d->avx512f = _mm512_add_epi32(vec1d->avx512f, vec2d->avx512f); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
89 return vec1d->vec; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
90 } while (0) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
91 |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
92 #define VEC_AVX512F_ADD_64x8(sign) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
93 do { \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
94 union v##sign##int64x8_impl_data *vec1d = (union v##sign##int64x8_impl_data *)&vec1; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
95 union v##sign##int64x8_impl_data *vec2d = (union v##sign##int64x8_impl_data *)&vec2; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
96 \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
97 vec1d->avx512f = _mm512_add_epi64(vec1d->avx512f, vec2d->avx512f); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
98 return vec1d->vec; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
99 } while (0) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
100 |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
101 #define VEC_AVX512F_SUB_8x64(sign) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
102 VEC_AVX512F_OPERATION_8x64(sub, sign) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
103 |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
104 #define VEC_AVX512F_SUB_16x32(sign) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
105 VEC_AVX512F_OPERATION_16x32(sub, sign) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
106 |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
107 #define VEC_AVX512F_SUB_32x16(sign) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
108 do { \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
109 union v##sign##int32x16_impl_data *vec1d = (union v##sign##int32x16_impl_data *)&vec1; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
110 union v##sign##int32x16_impl_data *vec2d = (union v##sign##int32x16_impl_data *)&vec2; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
111 \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
112 vec1d->avx512f = _mm512_sub_epi32(vec1d->avx512f, vec2d->avx512f); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
113 return vec1d->vec; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
114 } while (0) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
115 |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
116 #define VEC_AVX512F_SUB_64x8(sign) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
117 do { \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
118 union v##sign##int64x8_impl_data *vec1d = (union v##sign##int64x8_impl_data *)&vec1; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
119 union v##sign##int64x8_impl_data *vec2d = (union v##sign##int64x8_impl_data *)&vec2; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
120 \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
121 vec1d->avx512f = _mm512_sub_epi64(vec1d->avx512f, vec2d->avx512f); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
122 return vec1d->vec; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
123 } while (0) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
124 |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
125 #define VEC_AVX512F_MUL_8x64(sign) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
126 VEC_AVX512F_OPERATION_8x64(mullo, sign) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
127 |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
128 #define VEC_AVX512F_MUL_16x32(sign) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
129 VEC_AVX512F_OPERATION_16x32(mullo, sign) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
130 |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
131 #define VEC_AVX512F_MUL_32x16(sign) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
132 do { \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
133 union v##sign##int32x16_impl_data *vec1d = (union v##sign##int32x16_impl_data *)&vec1; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
134 union v##sign##int32x16_impl_data *vec2d = (union v##sign##int32x16_impl_data *)&vec2; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
135 \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
136 vec1d->avx512f = _mm512_mullo_epi32(vec1d->avx512f, vec2d->avx512f); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
137 return vec1d->vec; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
138 } while (0) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
139 |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
140 #define VEC_AVX512F_MUL_64x8(sign) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
141 do { \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
142 union v##sign##int64x8_impl_data *vec1d = (union v##sign##int64x8_impl_data *)&vec1; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
143 union v##sign##int64x8_impl_data *vec2d = (union v##sign##int64x8_impl_data *)&vec2; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
144 \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
145 __m512i ac = _mm512_mul_epu32(vec1d->avx512f, vec2d->avx512f); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
146 __m512i b = _mm512_srli_epi64(vec1d->avx512f, 32); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
147 __m512i bc = _mm512_mul_epu32(b, vec2d->avx512f); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
148 __m512i d = _mm512_srli_epi64(vec2d->avx512f, 32); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
149 __m512i ad = _mm512_mul_epu32(vec1d->avx512f, d); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
150 __m512i hi = _mm512_add_epi64(bc, ad); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
151 hi = _mm512_slli_epi64(hi, 32); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
152 \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
153 vec1d->avx512f = _mm512_add_epi64(hi, ac); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
154 return vec1d->vec; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
155 } while (0) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
156 |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
157 #define VEC_AVX512F_LSHIFT_8x64(sign) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
158 VEC_AVX512F_OPERATION_8x64(sllv, sign) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
159 |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
160 #define VEC_AVX512F_LSHIFT_16x32(sign) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
161 VEC_AVX512F_OPERATION_16x32(sllv, sign) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
162 |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
163 #define VEC_AVX512F_LSHIFT_32x16(sign) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
164 do { \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
165 union v##sign##int32x16_impl_data *vec1d = (union v##sign##int32x16_impl_data *)&vec1; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
166 union v##sign##int32x16_impl_data *vec2d = (union v##sign##int32x16_impl_data *)&vec2; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
167 \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
168 vec1d->avx512f = _mm512_sllv_epi32(vec1d->avx512f, vec2d->avx512f); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
169 return vec1d->vec; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
170 } while (0) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
171 |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
172 #define VEC_AVX512F_LSHIFT_64x8(sign) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
173 do { \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
174 union v##sign##int64x8_impl_data *vec1d = (union v##sign##int64x8_impl_data *)&vec1; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
175 union v##sign##int64x8_impl_data *vec2d = (union v##sign##int64x8_impl_data *)&vec2; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
176 \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
177 vec1d->avx512f = _mm512_sllv_epi64(vec1d->avx512f, vec2d->avx512f); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
178 return vec1d->vec; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
179 } while (0) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
180 |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
181 #define VEC_AVX512F_lRSHIFT_8x64(sign) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
182 VEC_AVX512F_OPERATION_8x64(srlv, sign) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
183 |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
184 #define VEC_AVX512F_lRSHIFT_16x32(sign) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
185 VEC_AVX512F_OPERATION_16x32(srlv, sign) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
186 |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
187 #define VEC_AVX512F_aRSHIFT_8x64(sign) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
188 do { \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
189 return v##sign##int8x64_generic_rshift(vec1, vec2); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
190 } while (0) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
191 |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
192 #define VEC_AVX512F_aRSHIFT_16x32(sign) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
193 do { \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
194 return v##sign##int16x32_generic_rshift(vec1, vec2); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
195 } while (0) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
196 |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
197 #define VEC_AVX512F_RSHIFT_8x64(sign, aORl) VEC_AVX512F_##aORl##RSHIFT_8x64(sign) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
198 #define VEC_AVX512F_RSHIFT_16x32(sign, aORl) VEC_AVX512F_##aORl##RSHIFT_16x32(sign) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
199 |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
200 #define VEC_AVX512F_RSHIFT_32x16(sign, aORl) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
201 do { \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
202 union v##sign##int32x16_impl_data *vec1d = (union v##sign##int32x16_impl_data *)&vec1; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
203 union v##sign##int32x16_impl_data *vec2d = (union v##sign##int32x16_impl_data *)&vec2; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
204 \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
205 vec1d->avx512f = _mm512_sr##aORl##v_epi32(vec1d->avx512f, vec2d->avx512f); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
206 return vec1d->vec; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
207 } while (0) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
208 |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
209 #define VEC_AVX512F_RSHIFT_64x8(sign, aORl) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
210 do { \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
211 union v##sign##int64x8_impl_data *vec1d = (union v##sign##int64x8_impl_data *)&vec1; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
212 union v##sign##int64x8_impl_data *vec2d = (union v##sign##int64x8_impl_data *)&vec2; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
213 \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
214 vec1d->avx512f = _mm512_sr##aORl##v_epi64(vec1d->avx512f, vec2d->avx512f); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
215 return vec1d->vec; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
216 } while (0) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
217 |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
218 #define VEC_AVX512F_uRSHIFT_8x64(sign, aORl) VEC_AVX512F_RSHIFT_8x64(sign, l) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
219 #define VEC_AVX512F_uRSHIFT_16x32(sign, aORl) VEC_AVX512F_RSHIFT_16x32(sign, l) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
220 #define VEC_AVX512F_uRSHIFT_32x16(sign, aORl) VEC_AVX512F_RSHIFT_32x16(sign, l) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
221 #define VEC_AVX512F_uRSHIFT_64x8(sign, aORl) VEC_AVX512F_RSHIFT_64x8(sign, l) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
222 |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
223 #define VEC_AVX512F_DEFINE_OPERATIONS_SIGN(sign, bits, size) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
224 union v##sign##int##bits##x##size##_impl_data { \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
225 v##sign##int##bits##x##size vec; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
226 __m512i avx512f; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
227 }; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
228 \ |
24
e49e70f7012f
impl/x86: add static assertions for alignment and size
Paper <paper@tflc.us>
parents:
23
diff
changeset
|
229 VEC_STATIC_ASSERT(VEC_ALIGNOF(__m512i) <= VEC_ALIGNOF(v##sign##int##bits##x##size), "vec: v" #sign "int" #bits "x" #size " alignment needs to be expanded to fit intrinsic type size"); \ |
e49e70f7012f
impl/x86: add static assertions for alignment and size
Paper <paper@tflc.us>
parents:
23
diff
changeset
|
230 VEC_STATIC_ASSERT(sizeof(__m512i) <= sizeof(v##sign##int##bits##x##size), "vec: v" #sign "int" #bits "x" #size " needs to be expanded to fit intrinsic type size"); \ |
e49e70f7012f
impl/x86: add static assertions for alignment and size
Paper <paper@tflc.us>
parents:
23
diff
changeset
|
231 \ |
23
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
232 static v##sign##int##bits##x##size v##sign##int##bits##x##size##_avx512f_load_aligned(const vec_##sign##int##bits in[size]) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
233 { \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
234 union v##sign##int##bits##x##size##_impl_data vec; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
235 vec.avx512f = _mm512_load_si512((const __m512i *)in); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
236 return vec.vec; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
237 } \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
238 \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
239 static v##sign##int##bits##x##size v##sign##int##bits##x##size##_avx512f_load(const vec_##sign##int##bits in[size]) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
240 { \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
241 union v##sign##int##bits##x##size##_impl_data vec; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
242 vec.avx512f = _mm512_loadu_si512((const __m512i *)in); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
243 return vec.vec; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
244 } \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
245 \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
246 static void v##sign##int##bits##x##size##_avx512f_store_aligned(v##sign##int##bits##x##size vec, vec_##sign##int##bits out[size]) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
247 { \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
248 _mm512_store_si512((__m512i *)out, ((union v##sign##int##bits##x##size##_impl_data *)&vec)->avx512f); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
249 } \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
250 \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
251 static void v##sign##int##bits##x##size##_avx512f_store(v##sign##int##bits##x##size vec, vec_##sign##int##bits out[size]) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
252 { \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
253 _mm512_storeu_si512((__m512i *)out, ((union v##sign##int##bits##x##size##_impl_data *)&vec)->avx512f); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
254 } \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
255 \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
256 static v##sign##int##bits##x##size v##sign##int##bits##x##size##_avx512f_add(v##sign##int##bits##x##size vec1, v##sign##int##bits##x##size vec2) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
257 { \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
258 VEC_AVX512F_ADD_##bits##x##size(sign); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
259 } \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
260 \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
261 static v##sign##int##bits##x##size v##sign##int##bits##x##size##_avx512f_sub(v##sign##int##bits##x##size vec1, v##sign##int##bits##x##size vec2) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
262 { \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
263 VEC_AVX512F_SUB_##bits##x##size(sign); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
264 } \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
265 \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
266 static v##sign##int##bits##x##size v##sign##int##bits##x##size##_avx512f_mul(v##sign##int##bits##x##size vec1, v##sign##int##bits##x##size vec2) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
267 { \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
268 VEC_AVX512F_MUL_##bits##x##size(sign); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
269 } \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
270 \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
271 static v##sign##int##bits##x##size v##sign##int##bits##x##size##_avx512f_and(v##sign##int##bits##x##size vec1, v##sign##int##bits##x##size vec2) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
272 { \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
273 union v##sign##int##bits##x##size##_impl_data *vec1d = (union v##sign##int##bits##x##size##_impl_data *)&vec1; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
274 union v##sign##int##bits##x##size##_impl_data *vec2d = (union v##sign##int##bits##x##size##_impl_data *)&vec2; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
275 \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
276 vec1d->avx512f = _mm512_and_si512(vec1d->avx512f, vec2d->avx512f); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
277 return vec1d->vec; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
278 } \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
279 \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
280 static v##sign##int##bits##x##size v##sign##int##bits##x##size##_avx512f_or(v##sign##int##bits##x##size vec1, v##sign##int##bits##x##size vec2) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
281 { \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
282 union v##sign##int##bits##x##size##_impl_data *vec1d = (union v##sign##int##bits##x##size##_impl_data *)&vec1; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
283 union v##sign##int##bits##x##size##_impl_data *vec2d = (union v##sign##int##bits##x##size##_impl_data *)&vec2; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
284 \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
285 vec1d->avx512f = _mm512_or_si512(vec1d->avx512f, vec2d->avx512f); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
286 return vec1d->vec; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
287 } \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
288 \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
289 static v##sign##int##bits##x##size v##sign##int##bits##x##size##_avx512f_xor(v##sign##int##bits##x##size vec1, v##sign##int##bits##x##size vec2) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
290 { \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
291 union v##sign##int##bits##x##size##_impl_data *vec1d = (union v##sign##int##bits##x##size##_impl_data *)&vec1; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
292 union v##sign##int##bits##x##size##_impl_data *vec2d = (union v##sign##int##bits##x##size##_impl_data *)&vec2; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
293 \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
294 vec1d->avx512f = _mm512_xor_si512(vec1d->avx512f, vec2d->avx512f); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
295 return vec1d->vec; \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
296 } \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
297 \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
298 static v##sign##int##bits##x##size v##sign##int##bits##x##size##_avx512f_lshift(v##sign##int##bits##x##size vec1, vuint##bits##x##size vec2) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
299 { \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
300 VEC_AVX512F_LSHIFT_##bits##x##size(sign); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
301 } \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
302 \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
303 static v##sign##int##bits##x##size v##sign##int##bits##x##size##_avx512f_rshift(v##sign##int##bits##x##size vec1, vuint##bits##x##size vec2) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
304 { \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
305 VEC_AVX512F_##sign##RSHIFT_##bits##x##size(sign, a); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
306 } \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
307 \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
308 static v##sign##int##bits##x##size v##sign##int##bits##x##size##_avx512f_lrshift(v##sign##int##bits##x##size vec1, vuint##bits##x##size vec2) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
309 { \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
310 VEC_AVX512F_RSHIFT_##bits##x##size(sign, l); \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
311 } \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
312 \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
313 const v##sign##int##bits##x##size##_impl v##sign##int##bits##x##size##_impl_avx512f = { \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
314 v##sign##int##bits##x##size##_generic_splat, \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
315 v##sign##int##bits##x##size##_avx512f_load_aligned, \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
316 v##sign##int##bits##x##size##_avx512f_load, \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
317 v##sign##int##bits##x##size##_avx512f_store_aligned, \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
318 v##sign##int##bits##x##size##_avx512f_store, \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
319 v##sign##int##bits##x##size##_avx512f_add, \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
320 v##sign##int##bits##x##size##_avx512f_sub, \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
321 v##sign##int##bits##x##size##_avx512f_mul, \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
322 v##sign##int##bits##x##size##_generic_div, \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
323 v##sign##int##bits##x##size##_generic_avg, \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
324 v##sign##int##bits##x##size##_avx512f_and, \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
325 v##sign##int##bits##x##size##_avx512f_or, \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
326 v##sign##int##bits##x##size##_avx512f_xor, \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
327 v##sign##int##bits##x##size##_generic_not, \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
328 v##sign##int##bits##x##size##_avx512f_lshift, \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
329 v##sign##int##bits##x##size##_avx512f_rshift, \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
330 v##sign##int##bits##x##size##_avx512f_lrshift, \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
331 v##sign##int##bits##x##size##_generic_cmplt, \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
332 v##sign##int##bits##x##size##_generic_cmple, \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
333 v##sign##int##bits##x##size##_generic_cmpeq, \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
334 v##sign##int##bits##x##size##_generic_cmpge, \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
335 v##sign##int##bits##x##size##_generic_cmpgt, \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
336 }; |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
337 |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
338 #define VEC_AVX512F_DEFINE_OPERATIONS(bits, size) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
339 VEC_AVX512F_DEFINE_OPERATIONS_SIGN(u, bits, size) \ |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
340 VEC_AVX512F_DEFINE_OPERATIONS_SIGN( , bits, size) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
341 |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
342 VEC_AVX512F_DEFINE_OPERATIONS(8, 64) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
343 VEC_AVX512F_DEFINE_OPERATIONS(16, 32) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
344 VEC_AVX512F_DEFINE_OPERATIONS(32, 16) |
e26874655738
*: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff
changeset
|
345 VEC_AVX512F_DEFINE_OPERATIONS(64, 8) |