comparison README @ 39:f9ca85d2f14c
*: rearrange some things; add avx512bw support
author | Paper <paper@tflc.us> |
---|---|
date | Sat, 26 Apr 2025 15:31:39 -0400 |
parents | fd42f9b1b95e |
children | 55cadb1fac4b |
38:fd42f9b1b95e | 39:f9ca85d2f14c |
---|---|
136 To use vec, simply include `vec/vec.h` in your program. If you would like | 136 To use vec, simply include `vec/vec.h` in your program. If you would like |
137 your program to also be able to run on older systems, you can create | 137 your program to also be able to run on older systems, you can create |
138 multiple translation units and pass different command line arguments | 138 multiple translation units and pass different command line arguments |
139 to the compiler to enable SSE2/AVX2/Altivec etc, and detect the vector | 139 to the compiler to enable SSE2/AVX2/Altivec etc, and detect the vector |
140 modes the CPU supports at runtime. vec provides an optional public API | 140 modes the CPU supports at runtime. vec provides an optional public API |
141 specifically for this use-case within `vec/impl/cpu.h`; bear in mind | 141 specifically for this use-case within `vec/cpu.h`; bear in mind though |
142 though that it is not thread-safe, so if your program is multithreaded | 142 that it is not thread-safe, so if your program is multithreaded you'll want |
143 you'll want to cache the results on startup. | 143 to cache the results on startup. |
144 | 144 |
145 The CPU vector detection API is extremely simple, and self-explanatory. | 145 The CPU vector detection API is extremely simple, and self-explanatory. |
146 You call `vec_get_CPU_features()', and it returns a bit-mask of the | 146 You call `vec_get_CPU_features()', and it returns a bit-mask of the |
147 values within the enum placed above the function definition. From there, | 147 values within the enum placed above the function definition. From there, |
148 you can test for each value specifically. | 148 you can test for each value specifically. |
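The caching advice above can be sketched roughly as follows. This is only an illustration, not vec's own code: `demo_get_cpu_features()` is a hypothetical stand-in for `vec_get_CPU_features()`, the `DEMO_CPU_HAS_*` names are invented (the real flag names are in the enum above the function definition in the header), and the `uint32_t` return type is an assumption.

```c
#include <stdint.h>

/* HYPOTHETICAL flag names, for illustration only. The real values are
 * the enum members defined next to vec_get_CPU_features(). */
enum {
    DEMO_CPU_HAS_SSE2     = 1u << 0,
    DEMO_CPU_HAS_AVX2     = 1u << 1,
    DEMO_CPU_HAS_AVX512BW = 1u << 2
};

/* Stand-in for vec_get_CPU_features(); pretends the CPU supports
 * SSE2 and AVX2, but not AVX-512BW. */
static uint32_t demo_get_cpu_features(void)
{
    return DEMO_CPU_HAS_SSE2 | DEMO_CPU_HAS_AVX2;
}

/* Filled in once by demo_init(); only read afterwards. */
static uint32_t demo_cached_features;

/* Call once at startup, before any other thread exists, so the
 * non-thread-safe detection routine never runs concurrently. */
void demo_init(void)
{
    demo_cached_features = demo_get_cpu_features();
}

/* Nonzero if every bit in `flags` is set in the cached mask. */
int demo_cpu_has(uint32_t flags)
{
    return (demo_cached_features & flags) == flags;
}

/* Pick the widest supported backend. In a real program each name
 * would map to a translation unit built with different flags. */
const char *demo_pick_backend(void)
{
    if (demo_cpu_has(DEMO_CPU_HAS_AVX512BW)) return "avx512bw";
    if (demo_cpu_has(DEMO_CPU_HAS_AVX2))     return "avx2";
    if (demo_cpu_has(DEMO_CPU_HAS_SSE2))     return "sse2";
    return "generic";
}
```

In a real program you would call `demo_init()` at the top of `main()`, before spawning any threads; after that, every thread can read the cached mask without touching the detection routine again.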
175 | 175 |
176 /* no need to free the aligned array -- it is always on the stack */ | 176 /* no need to free the aligned array -- it is always on the stack */ |
177 | 177 |
178 The heap-based API is based off the good old C malloc API: | 178 The heap-based API is based off the good old C malloc API: |
179 | 179 |
180 /* heap allocation stuff is only defined here: */ | |
181 #include "vec/mem.h" | |
182 | |
180 vec_int32 *q = vec_malloc(1024 * sizeof(vec_int32)); | 183 vec_int32 *q = vec_malloc(1024 * sizeof(vec_int32)); |
181 | 184 |
182 /* q is now aligned, and ready for use with a vector aligned load | 185 /* q is now aligned, and ready for use with a vector aligned load |
183 * function. */ | 186 * function. */ |
184 vint32x16_load_aligned(q); | 187 vint32x16_load_aligned(q); |
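For readers wondering what "aligned" buys you here, below is a minimal sketch of one common way to build an aligned allocator on top of plain `malloc`. It is not vec's implementation of `vec_malloc`; the names `demo_aligned_malloc` and `demo_aligned_free` are hypothetical, and the 64-byte alignment is an assumption (enough for 512-bit vectors).

```c
#include <stdint.h>
#include <stdlib.h>

/* ASSUMPTION: 64 bytes, i.e. the width of one 512-bit vector. */
#define DEMO_ALIGNMENT 64

/* Over-allocate, round the address up to the next multiple of
 * DEMO_ALIGNMENT, and stash the pointer malloc() returned just
 * below the aligned block so it can be freed later. */
void *demo_aligned_malloc(size_t size)
{
    unsigned char *raw;
    uintptr_t start, aligned;

    /* guard against overflow in the size computation below */
    if (size > SIZE_MAX - DEMO_ALIGNMENT - sizeof(void *))
        return NULL;

    raw = malloc(size + DEMO_ALIGNMENT + sizeof(void *));
    if (!raw)
        return NULL;

    /* leave room for the saved pointer, then round up */
    start   = (uintptr_t)raw + sizeof(void *);
    aligned = (start + DEMO_ALIGNMENT - 1)
            & ~(uintptr_t)(DEMO_ALIGNMENT - 1);

    ((void **)aligned)[-1] = raw;
    return (void *)aligned;
}

/* Free a block obtained from demo_aligned_malloc(); NULL is a no-op. */
void demo_aligned_free(void *ptr)
{
    if (ptr)
        free(((void **)ptr)[-1]);
}
```

A pointer returned by this sketch must only be released with `demo_aligned_free()`, never with plain `free()`, since it does not point at the start of the underlying allocation.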