comparison README @ 39:f9ca85d2f14c
*: rearrange some things; add avx512bw support
| author | Paper <paper@tflc.us> |
|---|---|
| date | Sat, 26 Apr 2025 15:31:39 -0400 |
| parents | fd42f9b1b95e |
| children | 55cadb1fac4b |
comparison
| 38:fd42f9b1b95e | 39:f9ca85d2f14c |
|---|---|
| 136 To use vec, simply include `vec/vec.h` in your program. If you would like | 136 To use vec, simply include `vec/vec.h` in your program. If you would like |
| 137 your program to also be able to run on older systems, you can create | 137 your program to also be able to run on older systems, you can create |
| 138 multiple translation units and pass different command line arguments | 138 multiple translation units and pass different command line arguments |
| 139 to the compiler to enable SSE2/AVX2/Altivec etc, and detect the vector | 139 to the compiler to enable SSE2/AVX2/Altivec etc, and detect the vector |
| 140 modes the CPU supports at runtime. vec provides an optional public API | 140 modes the CPU supports at runtime. vec provides an optional public API |
| 141 specifically for this use-case within `vec/impl/cpu.h`; bear in mind | 141 specifically for this use-case within `vec/cpu.h`; bear in mind though |
| 142 though that it is not thread-safe, so if your program is multithreaded | 142 that it is not thread-safe, so if your program is multithreaded you'll want |
| 143 you'll want to cache the results on startup. | 143 to cache the results on startup. |
| 144 | 144 |
| 145 The CPU vector detection API is extremely simple, and self-explanatory. | 145 The CPU vector detection API is extremely simple, and self-explanatory. |
| 146 You call `vec_get_CPU_features()', and it returns a bit-mask of the | 146 You call `vec_get_CPU_features()', and it returns a bit-mask of the |
| 147 values within the enum placed above the function definition. From there, | 147 values within the enum placed above the function definition. From there, |
| 148 you can test for each value specifically. | 148 you can test for each value specifically. |
| 175 | 175 |
| 176 /* no need to free the aligned array -- it is always on the stack */ | 176 /* no need to free the aligned array -- it is always on the stack */ |
| 177 | 177 |
| 178 The heap-based API is based off the good old C malloc API: | 178 The heap-based API is based off the good old C malloc API: |
| 179 | 179 |
| 180 /* heap allocation stuff is only defined here: */ | |
| 181 #include "vec/mem.h" | |
| 182 | |
| 180 vec_int32 *q = vec_malloc(1024 * sizeof(vec_int32)); | 183 vec_int32 *q = vec_malloc(1024 * sizeof(vec_int32)); |
| 181 | 184 |
| 182 /* q is now aligned, and ready for use with a vector aligned load | 185 /* q is now aligned, and ready for use with a vector aligned load |
| 183 * function. */ | 186 * function. */ |
| 184 vint32x16_load_aligned(q); | 187 vint32x16_load_aligned(q); |
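
To make concrete what "aligned" means for the heap example above, here is a minimal sketch that uses only standard C11 and does not touch vec at all. It is not how `vec_malloc` is implemented. The names `example_alloc_int32`, `example_is_aligned` and `EXAMPLE_ALIGNMENT` are made up for illustration, and the 64-byte alignment is an assumption derived from the width of a `vint32x16` (16 lanes of 32 bits).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed alignment: 16 lanes * 4 bytes = 64 bytes. Hypothetical constant,
 * not taken from vec. */
#define EXAMPLE_ALIGNMENT 64

/* Returns nonzero if p sits on a multiple of `alignment` bytes. */
int example_is_aligned(const void *p, size_t alignment)
{
	return ((uintptr_t)p % alignment) == 0;
}

/* Allocates room for `count` 32-bit integers on an aligned boundary using
 * C11 aligned_alloc(). Sketch only: there is no overflow check on the size
 * computation. Release the result with free(). */
int32_t *example_alloc_int32(size_t count)
{
	size_t size;

	assert(count > 0);

	/* C11 wants the size to be a multiple of the alignment, so round up */
	size = count * sizeof(int32_t);
	size = (size + EXAMPLE_ALIGNMENT - 1) / EXAMPLE_ALIGNMENT * EXAMPLE_ALIGNMENT;

	return aligned_alloc(EXAMPLE_ALIGNMENT, size);
}
```

A pointer that passes this kind of check is what an aligned load expects. Plain `malloc` only guarantees alignment suitable for the fundamental C types, which is usually less than a wide vector needs.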

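
The README text in this revision advises multithreaded programs to cache the CPU detection results on startup, because the detection API is not thread-safe. One way to do that is sketched below. To keep the sketch compilable without vec, `example_detect_features()` is a hypothetical stand-in for `vec_get_CPU_features()`, and the `EXAMPLE_CPU_HAS_*` bits are invented; the real values are the ones in the enum placed above the function in `vec/cpu.h`.

```c
#include <assert.h>
#include <stdint.h>

/* Invented feature bits, for illustration only. */
enum {
	EXAMPLE_CPU_HAS_SSE2 = 1u << 0,
	EXAMPLE_CPU_HAS_AVX2 = 1u << 1
};

/* Hypothetical stand-in for vec_get_CPU_features(). It pretends that the
 * CPU supports SSE2 and nothing else. */
static uint32_t example_detect_features(void)
{
	return EXAMPLE_CPU_HAS_SSE2;
}

static uint32_t cached_features;
static int cached_features_ready;

/* Call exactly once on startup, before any other thread is created. */
void example_features_init(void)
{
	cached_features = example_detect_features();
	cached_features_ready = 1;
}

/* Only reads the cached mask, so threads started after the init call can
 * use it concurrently. Returns nonzero if every bit in `feature` is set. */
int example_has_feature(uint32_t feature)
{
	assert(cached_features_ready);
	return (cached_features & feature) == feature;
}
```

In a real program, `main` would call the init function first and then pick which translation unit (SSE2, AVX2, and so on) to dispatch to based on the cached mask.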