annotate src/impl/x86/avx512f.c @ 25:92156fe32755

impl/ppc/altivec: update to new implementation the signed average function is wrong; it needs to round up the number when only one of them is odd, but that doesn't necessarily seem to be true because altivec is weird, and that's what we need to emulate the quirks for. ugh. also the altivec backend uses the generic functions instead of fallbacks because it does indeed use the exact same memory structure as the generic implementation...
author Paper <paper@tflc.us>
date Sun, 24 Nov 2024 11:15:59 +0000
parents e49e70f7012f
children
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
23
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
1 /**
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
2 * vec - a tiny SIMD vector library in C99
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
3 *
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
4 * Copyright (c) 2024 Paper
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
5 *
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
6 * Permission is hereby granted, free of charge, to any person obtaining a copy
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
7 * of this software and associated documentation files (the "Software"), to deal
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
8 * in the Software without restriction, including without limitation the rights
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
9 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
10 * copies of the Software, and to permit persons to whom the Software is
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
11 * furnished to do so, subject to the following conditions:
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
12 *
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
13 * The above copyright notice and this permission notice shall be included in all
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
14 * copies or substantial portions of the Software.
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
15 *
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
16 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
17 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
18 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
19 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
20 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
21 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
22 * SOFTWARE.
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
23 **/
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
24
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
25 #include "vec/impl/x86/avx512f.h"
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
26 #include "vec/impl/generic.h"
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
27
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
28 #include <immintrin.h>
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
29
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
30 // this is a stupid amount of work just to do these operations, is it really worth it ?
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
31 // also same note in avx2.c applies here, these do not handle sign bits properly, which
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
32 // isn't that big of a deal for regular arithmetic operations, but matters quite a bit
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
33 // when doing things like arithmetic shifts.
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
34 #define VEC_AVX512F_OPERATION_8x64(op, sign) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
35 do { \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
36 union v##sign##int8x64_impl_data *vec1d = (union v##sign##int8x64_impl_data *)&vec1; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
37 union v##sign##int8x64_impl_data *vec2d = (union v##sign##int8x64_impl_data *)&vec2; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
38 \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
39 /* unpack and operate */ \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
40 __m512i dst_1 = _mm512_##op##_epi32(_mm512_srli_epi32(_mm512_slli_epi32(vec1d->avx512f, 24), 24), _mm512_srli_epi32(_mm512_slli_epi32(vec2d->avx512f, 24), 24)); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
41 __m512i dst_2 = _mm512_##op##_epi32(_mm512_srli_epi32(_mm512_slli_epi32(vec1d->avx512f, 16), 24), _mm512_srli_epi32(_mm512_slli_epi32(vec2d->avx512f, 16), 24)); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
42 __m512i dst_3 = _mm512_##op##_epi32(_mm512_srli_epi32(_mm512_slli_epi32(vec1d->avx512f, 8), 24), _mm512_srli_epi32(_mm512_slli_epi32(vec2d->avx512f, 8), 24)); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
43 __m512i dst_4 = _mm512_##op##_epi32(_mm512_srli_epi32(vec1d->avx512f, 24), _mm512_srli_epi32(vec2d->avx512f, 24)); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
44 \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
45 /* repack */ \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
46 vec1d->avx512f = _mm512_or_si512( \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
47 _mm512_or_si512( \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
48 _mm512_srli_epi32(_mm512_slli_epi32(dst_1, 24), 24), \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
49 _mm512_srli_epi32(_mm512_slli_epi32(dst_2, 24), 16) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
50 ), \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
51 _mm512_or_si512( \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
52 _mm512_srli_epi32(_mm512_slli_epi32(dst_3, 24), 8), \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
53 _mm512_slli_epi32(dst_4, 24) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
54 ) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
55 ); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
56 \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
57 return vec1d->vec; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
58 } while (0)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
59
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
60 #define VEC_AVX512F_OPERATION_16x32(op, sign) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
61 do { \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
62 union v##sign##int16x32_impl_data *vec1d = (union v##sign##int16x32_impl_data *)&vec1; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
63 union v##sign##int16x32_impl_data *vec2d = (union v##sign##int16x32_impl_data *)&vec2; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
64 \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
65 /* unpack and operate; it would be nice if we had an _m512_andi_epi32... */ \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
66 __m512i dst_1 = _mm512_##op##_epi32(_mm512_srli_epi32(_mm512_slli_epi32(vec1d->avx512f, 16), 16), _mm512_srli_epi32(_mm512_slli_epi32(vec2d->avx512f, 16), 16)); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
67 __m512i dst_2 = _mm512_##op##_epi32(_mm512_srli_epi32(vec1d->avx512f, 16), _mm512_srli_epi32(vec2d->avx512f, 16)); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
68 \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
69 /* repack */ \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
70 vec1d->avx512f = _mm512_or_si512( \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
71 _mm512_srli_epi32(_mm512_slli_epi32(dst_1, 16), 16), \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
72 _mm512_slli_epi32(dst_2, 16) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
73 ); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
74 return vec1d->vec; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
75 } while (0)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
76
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
77 #define VEC_AVX512F_ADD_8x64(sign) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
78 VEC_AVX512F_OPERATION_8x64(add, sign)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
79
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
80 #define VEC_AVX512F_ADD_16x32(sign) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
81 VEC_AVX512F_OPERATION_16x32(add, sign)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
82
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
83 #define VEC_AVX512F_ADD_32x16(sign) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
84 do { \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
85 union v##sign##int32x16_impl_data *vec1d = (union v##sign##int32x16_impl_data *)&vec1; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
86 union v##sign##int32x16_impl_data *vec2d = (union v##sign##int32x16_impl_data *)&vec2; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
87 \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
88 vec1d->avx512f = _mm512_add_epi32(vec1d->avx512f, vec2d->avx512f); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
89 return vec1d->vec; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
90 } while (0)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
91
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
92 #define VEC_AVX512F_ADD_64x8(sign) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
93 do { \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
94 union v##sign##int64x8_impl_data *vec1d = (union v##sign##int64x8_impl_data *)&vec1; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
95 union v##sign##int64x8_impl_data *vec2d = (union v##sign##int64x8_impl_data *)&vec2; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
96 \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
97 vec1d->avx512f = _mm512_add_epi64(vec1d->avx512f, vec2d->avx512f); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
98 return vec1d->vec; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
99 } while (0)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
100
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
101 #define VEC_AVX512F_SUB_8x64(sign) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
102 VEC_AVX512F_OPERATION_8x64(sub, sign)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
103
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
104 #define VEC_AVX512F_SUB_16x32(sign) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
105 VEC_AVX512F_OPERATION_16x32(sub, sign)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
106
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
107 #define VEC_AVX512F_SUB_32x16(sign) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
108 do { \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
109 union v##sign##int32x16_impl_data *vec1d = (union v##sign##int32x16_impl_data *)&vec1; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
110 union v##sign##int32x16_impl_data *vec2d = (union v##sign##int32x16_impl_data *)&vec2; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
111 \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
112 vec1d->avx512f = _mm512_sub_epi32(vec1d->avx512f, vec2d->avx512f); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
113 return vec1d->vec; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
114 } while (0)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
115
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
116 #define VEC_AVX512F_SUB_64x8(sign) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
117 do { \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
118 union v##sign##int64x8_impl_data *vec1d = (union v##sign##int64x8_impl_data *)&vec1; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
119 union v##sign##int64x8_impl_data *vec2d = (union v##sign##int64x8_impl_data *)&vec2; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
120 \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
121 vec1d->avx512f = _mm512_sub_epi64(vec1d->avx512f, vec2d->avx512f); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
122 return vec1d->vec; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
123 } while (0)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
124
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
125 #define VEC_AVX512F_MUL_8x64(sign) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
126 VEC_AVX512F_OPERATION_8x64(mullo, sign)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
127
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
128 #define VEC_AVX512F_MUL_16x32(sign) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
129 VEC_AVX512F_OPERATION_16x32(mullo, sign)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
130
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
131 #define VEC_AVX512F_MUL_32x16(sign) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
132 do { \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
133 union v##sign##int32x16_impl_data *vec1d = (union v##sign##int32x16_impl_data *)&vec1; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
134 union v##sign##int32x16_impl_data *vec2d = (union v##sign##int32x16_impl_data *)&vec2; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
135 \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
136 vec1d->avx512f = _mm512_mullo_epi32(vec1d->avx512f, vec2d->avx512f); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
137 return vec1d->vec; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
138 } while (0)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
139
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
140 #define VEC_AVX512F_MUL_64x8(sign) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
141 do { \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
142 union v##sign##int64x8_impl_data *vec1d = (union v##sign##int64x8_impl_data *)&vec1; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
143 union v##sign##int64x8_impl_data *vec2d = (union v##sign##int64x8_impl_data *)&vec2; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
144 \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
145 __m512i ac = _mm512_mul_epu32(vec1d->avx512f, vec2d->avx512f); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
146 __m512i b = _mm512_srli_epi64(vec1d->avx512f, 32); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
147 __m512i bc = _mm512_mul_epu32(b, vec2d->avx512f); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
148 __m512i d = _mm512_srli_epi64(vec2d->avx512f, 32); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
149 __m512i ad = _mm512_mul_epu32(vec1d->avx512f, d); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
150 __m512i hi = _mm512_add_epi64(bc, ad); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
151 hi = _mm512_slli_epi64(hi, 32); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
152 \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
153 vec1d->avx512f = _mm512_add_epi64(hi, ac); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
154 return vec1d->vec; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
155 } while (0)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
156
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
157 #define VEC_AVX512F_LSHIFT_8x64(sign) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
158 VEC_AVX512F_OPERATION_8x64(sllv, sign)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
159
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
160 #define VEC_AVX512F_LSHIFT_16x32(sign) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
161 VEC_AVX512F_OPERATION_16x32(sllv, sign)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
162
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
163 #define VEC_AVX512F_LSHIFT_32x16(sign) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
164 do { \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
165 union v##sign##int32x16_impl_data *vec1d = (union v##sign##int32x16_impl_data *)&vec1; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
166 union v##sign##int32x16_impl_data *vec2d = (union v##sign##int32x16_impl_data *)&vec2; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
167 \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
168 vec1d->avx512f = _mm512_sllv_epi32(vec1d->avx512f, vec2d->avx512f); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
169 return vec1d->vec; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
170 } while (0)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
171
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
172 #define VEC_AVX512F_LSHIFT_64x8(sign) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
173 do { \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
174 union v##sign##int64x8_impl_data *vec1d = (union v##sign##int64x8_impl_data *)&vec1; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
175 union v##sign##int64x8_impl_data *vec2d = (union v##sign##int64x8_impl_data *)&vec2; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
176 \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
177 vec1d->avx512f = _mm512_sllv_epi64(vec1d->avx512f, vec2d->avx512f); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
178 return vec1d->vec; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
179 } while (0)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
180
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
181 #define VEC_AVX512F_lRSHIFT_8x64(sign) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
182 VEC_AVX512F_OPERATION_8x64(srlv, sign)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
183
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
184 #define VEC_AVX512F_lRSHIFT_16x32(sign) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
185 VEC_AVX512F_OPERATION_16x32(srlv, sign)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
186
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
187 #define VEC_AVX512F_aRSHIFT_8x64(sign) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
188 do { \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
189 return v##sign##int8x64_generic_rshift(vec1, vec2); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
190 } while (0)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
191
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
192 #define VEC_AVX512F_aRSHIFT_16x32(sign) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
193 do { \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
194 return v##sign##int16x32_generic_rshift(vec1, vec2); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
195 } while (0)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
196
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
197 #define VEC_AVX512F_RSHIFT_8x64(sign, aORl) VEC_AVX512F_##aORl##RSHIFT_8x64(sign)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
198 #define VEC_AVX512F_RSHIFT_16x32(sign, aORl) VEC_AVX512F_##aORl##RSHIFT_16x32(sign)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
199
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
200 #define VEC_AVX512F_RSHIFT_32x16(sign, aORl) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
201 do { \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
202 union v##sign##int32x16_impl_data *vec1d = (union v##sign##int32x16_impl_data *)&vec1; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
203 union v##sign##int32x16_impl_data *vec2d = (union v##sign##int32x16_impl_data *)&vec2; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
204 \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
205 vec1d->avx512f = _mm512_sr##aORl##v_epi32(vec1d->avx512f, vec2d->avx512f); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
206 return vec1d->vec; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
207 } while (0)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
208
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
209 #define VEC_AVX512F_RSHIFT_64x8(sign, aORl) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
210 do { \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
211 union v##sign##int64x8_impl_data *vec1d = (union v##sign##int64x8_impl_data *)&vec1; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
212 union v##sign##int64x8_impl_data *vec2d = (union v##sign##int64x8_impl_data *)&vec2; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
213 \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
214 vec1d->avx512f = _mm512_sr##aORl##v_epi64(vec1d->avx512f, vec2d->avx512f); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
215 return vec1d->vec; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
216 } while (0)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
217
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
218 #define VEC_AVX512F_uRSHIFT_8x64(sign, aORl) VEC_AVX512F_RSHIFT_8x64(sign, l)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
219 #define VEC_AVX512F_uRSHIFT_16x32(sign, aORl) VEC_AVX512F_RSHIFT_16x32(sign, l)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
220 #define VEC_AVX512F_uRSHIFT_32x16(sign, aORl) VEC_AVX512F_RSHIFT_32x16(sign, l)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
221 #define VEC_AVX512F_uRSHIFT_64x8(sign, aORl) VEC_AVX512F_RSHIFT_64x8(sign, l)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
222
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
223 #define VEC_AVX512F_DEFINE_OPERATIONS_SIGN(sign, bits, size) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
224 union v##sign##int##bits##x##size##_impl_data { \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
225 v##sign##int##bits##x##size vec; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
226 __m512i avx512f; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
227 }; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
228 \
24
e49e70f7012f impl/x86: add static assertions for alignment and size
Paper <paper@tflc.us>
parents: 23
diff changeset
229 VEC_STATIC_ASSERT(VEC_ALIGNOF(__m512i) <= VEC_ALIGNOF(v##sign##int##bits##x##size), "vec: v" #sign "int" #bits "x" #size " alignment needs to be expanded to fit intrinsic type size"); \
e49e70f7012f impl/x86: add static assertions for alignment and size
Paper <paper@tflc.us>
parents: 23
diff changeset
230 VEC_STATIC_ASSERT(sizeof(__m512i) <= sizeof(v##sign##int##bits##x##size), "vec: v" #sign "int" #bits "x" #size " needs to be expanded to fit intrinsic type size"); \
e49e70f7012f impl/x86: add static assertions for alignment and size
Paper <paper@tflc.us>
parents: 23
diff changeset
231 \
23
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
232 static v##sign##int##bits##x##size v##sign##int##bits##x##size##_avx512f_load_aligned(const vec_##sign##int##bits in[size]) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
233 { \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
234 union v##sign##int##bits##x##size##_impl_data vec; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
235 vec.avx512f = _mm512_load_si512((const __m512i *)in); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
236 return vec.vec; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
237 } \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
238 \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
239 static v##sign##int##bits##x##size v##sign##int##bits##x##size##_avx512f_load(const vec_##sign##int##bits in[size]) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
240 { \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
241 union v##sign##int##bits##x##size##_impl_data vec; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
242 vec.avx512f = _mm512_loadu_si512((const __m512i *)in); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
243 return vec.vec; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
244 } \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
245 \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
246 static void v##sign##int##bits##x##size##_avx512f_store_aligned(v##sign##int##bits##x##size vec, vec_##sign##int##bits out[size]) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
247 { \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
248 _mm512_store_si512((__m512i *)out, ((union v##sign##int##bits##x##size##_impl_data *)&vec)->avx512f); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
249 } \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
250 \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
251 static void v##sign##int##bits##x##size##_avx512f_store(v##sign##int##bits##x##size vec, vec_##sign##int##bits out[size]) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
252 { \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
253 _mm512_storeu_si512((__m512i *)out, ((union v##sign##int##bits##x##size##_impl_data *)&vec)->avx512f); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
254 } \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
255 \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
256 static v##sign##int##bits##x##size v##sign##int##bits##x##size##_avx512f_add(v##sign##int##bits##x##size vec1, v##sign##int##bits##x##size vec2) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
257 { \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
258 VEC_AVX512F_ADD_##bits##x##size(sign); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
259 } \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
260 \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
261 static v##sign##int##bits##x##size v##sign##int##bits##x##size##_avx512f_sub(v##sign##int##bits##x##size vec1, v##sign##int##bits##x##size vec2) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
262 { \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
263 VEC_AVX512F_SUB_##bits##x##size(sign); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
264 } \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
265 \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
266 static v##sign##int##bits##x##size v##sign##int##bits##x##size##_avx512f_mul(v##sign##int##bits##x##size vec1, v##sign##int##bits##x##size vec2) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
267 { \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
268 VEC_AVX512F_MUL_##bits##x##size(sign); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
269 } \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
270 \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
271 static v##sign##int##bits##x##size v##sign##int##bits##x##size##_avx512f_and(v##sign##int##bits##x##size vec1, v##sign##int##bits##x##size vec2) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
272 { \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
273 union v##sign##int##bits##x##size##_impl_data *vec1d = (union v##sign##int##bits##x##size##_impl_data *)&vec1; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
274 union v##sign##int##bits##x##size##_impl_data *vec2d = (union v##sign##int##bits##x##size##_impl_data *)&vec2; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
275 \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
276 vec1d->avx512f = _mm512_and_si512(vec1d->avx512f, vec2d->avx512f); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
277 return vec1d->vec; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
278 } \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
279 \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
280 static v##sign##int##bits##x##size v##sign##int##bits##x##size##_avx512f_or(v##sign##int##bits##x##size vec1, v##sign##int##bits##x##size vec2) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
281 { \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
282 union v##sign##int##bits##x##size##_impl_data *vec1d = (union v##sign##int##bits##x##size##_impl_data *)&vec1; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
283 union v##sign##int##bits##x##size##_impl_data *vec2d = (union v##sign##int##bits##x##size##_impl_data *)&vec2; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
284 \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
285 vec1d->avx512f = _mm512_or_si512(vec1d->avx512f, vec2d->avx512f); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
286 return vec1d->vec; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
287 } \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
288 \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
289 static v##sign##int##bits##x##size v##sign##int##bits##x##size##_avx512f_xor(v##sign##int##bits##x##size vec1, v##sign##int##bits##x##size vec2) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
290 { \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
291 union v##sign##int##bits##x##size##_impl_data *vec1d = (union v##sign##int##bits##x##size##_impl_data *)&vec1; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
292 union v##sign##int##bits##x##size##_impl_data *vec2d = (union v##sign##int##bits##x##size##_impl_data *)&vec2; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
293 \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
294 vec1d->avx512f = _mm512_xor_si512(vec1d->avx512f, vec2d->avx512f); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
295 return vec1d->vec; \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
296 } \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
297 \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
298 static v##sign##int##bits##x##size v##sign##int##bits##x##size##_avx512f_lshift(v##sign##int##bits##x##size vec1, vuint##bits##x##size vec2) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
299 { \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
300 VEC_AVX512F_LSHIFT_##bits##x##size(sign); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
301 } \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
302 \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
303 static v##sign##int##bits##x##size v##sign##int##bits##x##size##_avx512f_rshift(v##sign##int##bits##x##size vec1, vuint##bits##x##size vec2) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
304 { \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
305 VEC_AVX512F_##sign##RSHIFT_##bits##x##size(sign, a); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
306 } \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
307 \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
308 static v##sign##int##bits##x##size v##sign##int##bits##x##size##_avx512f_lrshift(v##sign##int##bits##x##size vec1, vuint##bits##x##size vec2) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
309 { \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
310 VEC_AVX512F_RSHIFT_##bits##x##size(sign, l); \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
311 } \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
312 \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
313 const v##sign##int##bits##x##size##_impl v##sign##int##bits##x##size##_impl_avx512f = { \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
314 v##sign##int##bits##x##size##_generic_splat, \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
315 v##sign##int##bits##x##size##_avx512f_load_aligned, \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
316 v##sign##int##bits##x##size##_avx512f_load, \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
317 v##sign##int##bits##x##size##_avx512f_store_aligned, \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
318 v##sign##int##bits##x##size##_avx512f_store, \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
319 v##sign##int##bits##x##size##_avx512f_add, \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
320 v##sign##int##bits##x##size##_avx512f_sub, \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
321 v##sign##int##bits##x##size##_avx512f_mul, \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
322 v##sign##int##bits##x##size##_generic_div, \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
323 v##sign##int##bits##x##size##_generic_avg, \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
324 v##sign##int##bits##x##size##_avx512f_and, \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
325 v##sign##int##bits##x##size##_avx512f_or, \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
326 v##sign##int##bits##x##size##_avx512f_xor, \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
327 v##sign##int##bits##x##size##_generic_not, \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
328 v##sign##int##bits##x##size##_avx512f_lshift, \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
329 v##sign##int##bits##x##size##_avx512f_rshift, \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
330 v##sign##int##bits##x##size##_avx512f_lrshift, \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
331 v##sign##int##bits##x##size##_generic_cmplt, \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
332 v##sign##int##bits##x##size##_generic_cmple, \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
333 v##sign##int##bits##x##size##_generic_cmpeq, \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
334 v##sign##int##bits##x##size##_generic_cmpge, \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
335 v##sign##int##bits##x##size##_generic_cmpgt, \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
336 };
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
337
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
338 #define VEC_AVX512F_DEFINE_OPERATIONS(bits, size) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
339 VEC_AVX512F_DEFINE_OPERATIONS_SIGN(u, bits, size) \
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
340 VEC_AVX512F_DEFINE_OPERATIONS_SIGN( , bits, size)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
341
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
342 VEC_AVX512F_DEFINE_OPERATIONS(8, 64)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
343 VEC_AVX512F_DEFINE_OPERATIONS(16, 32)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
344 VEC_AVX512F_DEFINE_OPERATIONS(32, 16)
e26874655738 *: huge refactor, new major release (hahaha)
Paper <paper@tflc.us>
parents:
diff changeset
345 VEC_AVX512F_DEFINE_OPERATIONS(64, 8)