Winch: i8x16.shuffle for x64 with AVX #9959

Merged Jan 15, 2025 (10 commits)

@jeffcharles (Contributor) commented Jan 9, 2025

Part of #8093. Implements i8x16.shuffle on x64 with AVX extensions.

@jeffcharles jeffcharles requested review from a team as code owners January 9, 2025 16:39
@jeffcharles jeffcharles requested review from fitzgen and removed request for a team January 9, 2025 16:39
@jeffcharles jeffcharles changed the title from "i8x16.shuffle for x64 with AVX512" to "Winch: i8x16.shuffle for x64 with AVX512" Jan 9, 2025
@saulecabrera (Member)

I can take this review as well.

@saulecabrera saulecabrera requested review from saulecabrera and removed request for a team and fitzgen January 9, 2025 17:23
@github-actions github-actions bot added the winch Winch issues or pull requests label Jan 9, 2025

github-actions bot commented Jan 9, 2025

Subscribe to Label Action

cc @saulecabrera

This issue or pull request has been labeled: "winch"

Thus the following users have been cc'd because of the following labels:

  • saulecabrera: winch

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

@jeffcharles jeffcharles changed the title from "Winch: i8x16.shuffle for x64 with AVX512" to "Winch: i8x16.shuffle for x64 with AVX" Jan 13, 2025
Comment on lines 1263 to 1290
        if self.flags.has_avx() {
            // Use `vpshufb` with `lanes` to set the lanes in `lhs` and `rhs`
            // separately to either the selected index or 0.
            // Then use `vpor` to combine `lhs` and `rhs` into `dst`.
            // Setting the most significant bit in the mask's lane to 1 will
            // result in corresponding lane in the destination register being
            // set to 0. 0x80 sets the most significant bit to 1.
            let mut mask_lhs: [u8; 16] = [0x80; 16];
            let mut mask_rhs: [u8; 16] = [0x80; 16];
            for i in 0..lanes.len() {
                if lanes[i] < 16 {
                    mask_lhs[i] = lanes[i];
                } else {
                    mask_rhs[i] = lanes[i] - 16;
                }
            }
            let mask_lhs = self.asm.add_constant(&mask_lhs);
            let mask_rhs = self.asm.add_constant(&mask_rhs);

            self.asm.xmm_vpshufb_rrm(dst, lhs, &mask_lhs);
            let scratch = writable!(regs::scratch_xmm());
            self.asm.xmm_vpshufb_rrm(scratch, rhs, &mask_rhs);
            self.asm.vpor(dst, dst.to_reg(), scratch.to_reg());
        } else {
            bail!(CodeGenError::UnimplementedForNoAvx)
        }
        Ok(())
    }
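
As a rough illustration of the approach described in the code comment above, the following self-contained Rust sketch models the two `vpshufb` steps and the final `vpor` in plain software and compares the result against the straightforward definition of `i8x16.shuffle`. The function names (`pshufb`, `build_masks`, `shuffle_via_masks`, `shuffle_reference`) are hypothetical and not part of Winch; the sketch also assumes every lane index is below 32, which Wasm validation guarantees for this instruction.

```rust
/// Software model of `vpshufb` on one 128-bit vector: a mask byte with its
/// most significant bit set produces 0; otherwise the low 4 bits of the mask
/// byte select a byte from `src`.
fn pshufb(src: [u8; 16], mask: [u8; 16]) -> [u8; 16] {
    let mut out = [0u8; 16];
    for i in 0..16 {
        if mask[i] & 0x80 == 0 {
            out[i] = src[(mask[i] & 0x0f) as usize];
        }
    }
    out
}

/// Split the 16 shuffle immediates into one mask per operand, mirroring the
/// loop in the diff: lanes that do not select from an operand stay at 0x80.
fn build_masks(lanes: [u8; 16]) -> ([u8; 16], [u8; 16]) {
    let mut mask_lhs = [0x80u8; 16];
    let mut mask_rhs = [0x80u8; 16];
    for (i, &lane) in lanes.iter().enumerate() {
        if lane < 16 {
            mask_lhs[i] = lane;
        } else {
            // Assumes lane < 32, so the result is in 0..16.
            mask_rhs[i] = lane - 16;
        }
    }
    (mask_lhs, mask_rhs)
}

/// Hypothetical helper: emulate the emitted sequence (shuffle each operand
/// with its own mask, then OR the two partial results together).
fn shuffle_via_masks(lhs: [u8; 16], rhs: [u8; 16], lanes: [u8; 16]) -> [u8; 16] {
    let (mask_lhs, mask_rhs) = build_masks(lanes);
    let a = pshufb(lhs, mask_lhs);
    let b = pshufb(rhs, mask_rhs);
    let mut out = [0u8; 16];
    for i in 0..16 {
        out[i] = a[i] | b[i];
    }
    out
}

/// Reference semantics of `i8x16.shuffle`: each lane indexes into the
/// 32-byte concatenation of `lhs` followed by `rhs`.
fn shuffle_reference(lhs: [u8; 16], rhs: [u8; 16], lanes: [u8; 16]) -> [u8; 16] {
    let mut out = [0u8; 16];
    for i in 0..16 {
        let idx = lanes[i] as usize;
        out[i] = if idx < 16 { lhs[idx] } else { rhs[idx - 16] };
    }
    out
}
```

Because each output lane is zeroed in exactly one of the two partial results, the OR never mixes bytes from both operands, which is why the two-shuffle sequence should agree with the reference definition.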
Member

One small suggestion to improve readability: since the then-branch is somewhat lengthy, could we invert the check and return early when there's no AVX support?

if !self.flags.has_avx() {
  bail!(...);
}

// ....
Ok(())
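
For illustration, the early-return shape suggested here might look like the following self-contained sketch. `Flags`, the one-variant `CodeGenError`, and the free function `shuffle` are stand-ins invented for this example, and a plain `return Err(...)` replaces `bail!` so the snippet has no dependencies; none of this is the code that was merged.

```rust
/// Stand-in for the error type named in the diff (only one variant modeled).
#[derive(Debug, PartialEq)]
enum CodeGenError {
    UnimplementedForNoAvx,
}

/// Hypothetical stand-in for the ISA flags consulted by the macro assembler.
struct Flags {
    avx: bool,
}

impl Flags {
    fn has_avx(&self) -> bool {
        self.avx
    }
}

/// Guard first, then the main path at a single indentation level.
fn shuffle(flags: &Flags) -> Result<(), CodeGenError> {
    if !flags.has_avx() {
        return Err(CodeGenError::UnimplementedForNoAvx);
    }

    // ... mask construction and instruction emission would go here ...
    Ok(())
}
```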

@saulecabrera saulecabrera added this pull request to the merge queue Jan 15, 2025
Merged via the queue into bytecodealliance:main with commit ba950f2 Jan 15, 2025
39 checks passed