Commit 3cdd5e5

jan-wassenberg authored and copybara-github committed
Fix loop iteration in GeluMulToBF16
Also attempt to speed up builders (parallel)

PiperOrigin-RevId: 613092863
1 parent c8b9675 commit 3cdd5e5


2 files changed, +2 -2 lines changed

.github/workflows/build.yml

Lines changed: 1 addition & 1 deletion
@@ -44,7 +44,7 @@ jobs:
  -D CMAKE_CXX_COMPILER_LAUNCHER=ccache

  - name: Build
-   run: cmake --build ${{ github.workspace }}/build --preset ${{ matrix.preset }} --config ${{ matrix.build_type }}
+   run: cmake --build ${{ github.workspace }}/build --preset ${{ matrix.preset }} --config ${{ matrix.build_type }} -j 4

  - name: Archive production artifacts
    uses: actions/upload-artifact@v4

ops.h

Lines changed: 1 addition & 1 deletion
@@ -241,7 +241,7 @@ static HWY_NOINLINE HWY_MAYBE_UNUSED void GeluMulToBF16(

   size_t i = 0;
   if (size >= 2 * NF) {
-    for (; i < size - 2 * NF; i += 2 * NF) {
+    for (; i <= size - 2 * NF; i += 2 * NF) {
       const VF mul0 = hn::LoadU(df, mul + i);
       const VF mul1 = hn::LoadU(df, mul + i + NF);
       const VF g0 = hn::Mul(mul0, Gelu(df, hn::LoadU(df, gelu_in + i)));
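
To illustrate the off-by-one that the ops.h change addresses, here is a minimal scalar sketch, not the Highway code itself. MainLoopEnd and its block parameter (a stand-in for 2 * NF) are hypothetical names introduced only for this illustration. It reports how many elements a blocked loop covers under the old condition (<) and the new one (<=).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns the index at which a blocked main loop stops,
// i.e. how many elements it has covered. `block` stands in for 2 * NF.
// `inclusive` selects the new condition (<=) or the old one (<).
size_t MainLoopEnd(size_t size, size_t block, bool inclusive) {
  assert(block > 0);  // a zero step would never terminate
  size_t i = 0;
  // Same guard as in the diff: without it, size - block would wrap around
  // for unsigned size_t whenever size < block.
  if (size >= block) {
    const size_t last_start = size - block;  // last index where a full block fits
    while (inclusive ? (i <= last_start) : (i < last_start)) {
      // The real loop body would process elements [i, i + block) here.
      i += block;
    }
  }
  return i;
}
```

With block = 8, MainLoopEnd(16, 8, false) returns 8 while MainLoopEnd(16, 8, true) returns 16: under the old condition the last full block is not handled by the main loop whenever size is an exact multiple of the block size. For size = 20 both variants return 16, so the difference only shows up at exact multiples. How the elements after the main loop are handled is not visible in this diff.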
