pulley: Implement interpreter-to-host calls #9665
This commit is an initial stab at implementing interpreter-to-host communication in Pulley. The basic problem is that Pulley needs the ability to call back into Wasmtime to implement tasks such as `memory.grow`, imported functions, etc. For native platforms this is a simple `call_indirect` operation in Cranelift but the story for Pulley must be different because it's effectively switching from interpreted code to native code.

The initial idea for this in bytecodealliance#9651 is replaced here and looks mostly similar but with a few changes. The overall structure of how this works is:

* A new `call_indirect_host` opcode is added to Pulley.
* Function signatures that can be called from Pulley bytecode are statically enumerated at build-time.
* This enables the implementation of `call_indirect_host` to take an immediate of which signature is being used and cast the function pointer to the right type.
* A new pulley-specific relocation is added to Cranelift for this opcode.
* `RelocDistance::Far` calls to a name trigger the use of `call_indirect_host`.
* The relocation is filled in by Wasmtime after compilation where the signature number is inserted.
* A new `NS_*` value for user-function namespaces is reserved in `wasmtime-cranelift` for this new namespace of functions.
* Code generation for Pulley in `wasmtime-cranelift` now has Pulley-specific handling of the wasm-to-host transition where all previous `call_indirect` instructions are replaced with a call to a "backend intrinsic" which gets lowered to a `call_indirect_host`.

Note that most of this still isn't hooked up everywhere in Wasmtime. That means that the testing here is pretty light at this time. It'll require a fair bit more work to get everything fully integrated from Wasmtime in Pulley. This is expected to be one of the significant remaining chunks of work and should help unblock future testing (or make those diffs smaller ideally).
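To make the "immediate selects the signature, then cast the function pointer" idea concrete, here is a minimal sketch in Rust. It is not the Pulley implementation: `HostSig`, the two example signatures, the `args` slice standing in for interpreter registers, and the helper functions are all hypothetical names chosen for illustration.

```rust
use std::mem;

/// Hypothetical ids for the host-callable signatures. In the design above
/// these would be enumerated statically at build time.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum HostSig {
    /// `extern "C" fn(i32) -> i32`
    I32ToI32,
    /// `extern "C" fn(i32, i32) -> i32`
    I32I32ToI32,
}

impl HostSig {
    /// Decode the immediate carried by the (hypothetical) instruction.
    fn from_u32(raw: u32) -> Option<Self> {
        match raw {
            0 => Some(HostSig::I32ToI32),
            1 => Some(HostSig::I32I32ToI32),
            _ => None,
        }
    }
}

/// Sketch of a `call_indirect_host` handler: decode the immediate, cast the
/// untyped function pointer to the matching type, and call it.
///
/// # Safety
/// `func` must point to a function whose real type matches `sig_id`.
unsafe fn call_indirect_host(sig_id: u32, func: *const (), args: &[i32]) -> Option<i32> {
    match HostSig::from_u32(sig_id)? {
        HostSig::I32ToI32 => {
            let a = *args.first()?;
            let f = unsafe { mem::transmute::<*const (), extern "C" fn(i32) -> i32>(func) };
            Some(f(a))
        }
        HostSig::I32I32ToI32 => {
            let a = *args.first()?;
            let b = *args.get(1)?;
            let f = unsafe { mem::transmute::<*const (), extern "C" fn(i32, i32) -> i32>(func) };
            Some(f(a, b))
        }
    }
}

// Two example "host" functions and helpers that erase their types.
extern "C" fn host_double(x: i32) -> i32 {
    x * 2
}

extern "C" fn host_add(a: i32, b: i32) -> i32 {
    a + b
}

fn erase1(f: extern "C" fn(i32) -> i32) -> *const () {
    f as *const ()
}

fn erase2(f: extern "C" fn(i32, i32) -> i32) -> *const () {
    f as *const ()
}
```

Calling `unsafe { call_indirect_host(0, erase1(host_double), &[21]) }` dispatches through the one-argument arm. An unknown id or a missing argument yields `None` instead of calling through a mistyped pointer.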
0ea4d86 to 1c9e235
This issue or pull request has been labeled: "cranelift", "cranelift:area:machinst", "pulley", "wasmtime:api"
Makes sense to get something working.
This commit is a continuation of the plan of implementing host calls in Pulley through bytecodealliance#9665, bytecodealliance#9675, and bytecodealliance#9693. Here the `Compiler::call_indirect_host` method is updated to take a new type, `HostCall`, which indicates what type of host call is being performed. This is then serialized to a 32-bit integer which will be present in the Pulley instruction being generated. This 32-bit integer will then be used to perform a dispatch (the dispatch is left for a future PR with more Pulley integration).

This new `HostCall` structure is defined with `BuiltinFunctionIndex` internally. Additionally a new `ComponentBuiltinFunctionIndex` is added to enumerate the same set of indexes for components as well. Along the way the split between component transcoders and builtins was removed, and they're now all lumped together in one macro for builtins (no need to have two separate macros).

This new `HostCall` is used to implement the `call_indirect_host` instruction for Pulley to fill out an unimplemented piece of code.
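As a rough illustration of serializing a `HostCall` to a 32-bit integer, the sketch below packs each kind of host call into one flat index space and decodes it again. The variant set, the `Import` variant, the two count constants, and the packing order are assumptions made for this example. The index types are local stand-ins that only share names with the ones in the commit message.

```rust
/// Local stand-ins for the index types named in the commit message.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct BuiltinFunctionIndex(u32);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ComponentBuiltinFunctionIndex(u32);

// Assumed counts, purely for illustration.
const NUM_BUILTINS: u32 = 4;
const NUM_COMPONENT_BUILTINS: u32 = 3;

/// Hypothetical shape of a `HostCall`: which kind of host call is being made.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum HostCall {
    /// Hypothetical variant for a call to an imported host function.
    Import,
    Builtin(BuiltinFunctionIndex),
    ComponentBuiltin(ComponentBuiltinFunctionIndex),
}

impl HostCall {
    /// Flatten into the 32-bit immediate stored in the instruction.
    fn index(self) -> u32 {
        match self {
            HostCall::Import => 0,
            HostCall::Builtin(i) => 1 + i.0,
            HostCall::ComponentBuiltin(i) => 1 + NUM_BUILTINS + i.0,
        }
    }

    /// Inverse of `index`, as a dispatcher might use it. Returns `None` for
    /// values outside the enumerated range.
    fn from_index(raw: u32) -> Option<Self> {
        if raw == 0 {
            Some(HostCall::Import)
        } else if raw < 1 + NUM_BUILTINS {
            Some(HostCall::Builtin(BuiltinFunctionIndex(raw - 1)))
        } else if raw < 1 + NUM_BUILTINS + NUM_COMPONENT_BUILTINS {
            Some(HostCall::ComponentBuiltin(ComponentBuiltinFunctionIndex(
                raw - 1 - NUM_BUILTINS,
            )))
        } else {
            None
        }
    }
}
```

Because every variant maps to a distinct integer, the interpreter only needs the one immediate to recover both the kind of call and the specific builtin.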
/// Pulley - call a host function indirectly where the embedder resolving
/// this relocation needs to fill in the expected signature.
PulleyCallIndirectHost,
Does the signature actually need to be resolved at reloc time? Can't it be done at compile time and embedded in the instruction itself?
The address of any host function obviously needs to be resolved at reloc time (this is a bit of an aside because my understanding is that we aren't actually embedding any host function addresses in the Pulley bytecode); however, the signature doesn't seem like it should need to be resolved at reloc time.
I think what you're thinking is already done actually, but the phrasing here is ambiguous. The "reloc time" technically happens twice -- once when linking things into artifacts and again when loading the artifacts. Putting the signature into the instruction happens in the first of these, during linking time. The relocation here is needed because the `UserExternalName` isn't available during compilation, only after the compile has finished, so that level of relocation processing is required to stuff it in.
Otherwise though there's no runtime relocation when we load the bytecode itself, it's all frozen and loaded as-is from disk or the compile artifact.
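The link-time step described in this reply can be pictured as patching a 32-bit immediate into already-emitted bytecode. The sketch below assumes a hypothetical `HostCallReloc` record and a little-endian 4-byte immediate. Neither is taken from the PR, and the real relocation handling in Wasmtime may look different.

```rust
/// Hypothetical relocation record: where the immediate lives in the code
/// buffer and which host-call id should be written there.
struct HostCallReloc {
    /// Byte offset of the 4-byte immediate within the code buffer.
    offset: usize,
    /// Id resolved after compilation, once the callee's name is known.
    host_call_id: u32,
}

/// Patch every relocation into `code`. After this runs the bytes are final
/// and can be stored or loaded as-is with no further fixups.
fn apply_host_call_relocs(code: &mut [u8], relocs: &[HostCallReloc]) -> Result<(), String> {
    for reloc in relocs {
        let end = reloc
            .offset
            .checked_add(4)
            .ok_or_else(|| format!("relocation offset {} overflows", reloc.offset))?;
        let slot = code
            .get_mut(reloc.offset..end)
            .ok_or_else(|| format!("relocation at {} is out of bounds", reloc.offset))?;
        // Assumption: the immediate is encoded little-endian.
        slot.copy_from_slice(&reloc.host_call_id.to_le_bytes());
    }
    Ok(())
}
```

An out-of-range relocation is reported as an error and never writes past the buffer, which keeps a malformed record from corrupting neighboring instructions.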
Ahhhh it is the function's id/code that is being reloc'd at link time here? That makes sense to me. When I read "signature" I was thinking "parameter and result types" and perhaps "calling convention", which happens to align with `cranelift_codegen::ir::Signature`.
Can we replace "signature" with "code" or "id" in these bits?
Good point yeah, this is also something that changed halfway through the design and I didn't get around to updating all the docs
_tmp: Writable<Reg>,
info: CallInfo<()>,
) -> SmallVec<[Self::I; 2]> {
match dest {
Should we also check the calling conventions at all here? That was what I was (hackily) using to distinguish between pulley-to-pulley and pulley-to-host before. I like reloc-distance better but maybe we should be asserting that pulley-to-pulley always uses `tail` and pulley-to-host always uses `systemv` (which is a bit of a lie) or something like that?
I think that's reasonable yeah, I'll try to go back and add some assertions.
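A minimal sketch of the assertions discussed in this thread might look like the following. The enums are local stand-ins that only share names with Cranelift's `RelocDistance` and calling conventions, and `CallKind` and `classify_call` are hypothetical. The actual lowering code in the PR is structured differently.

```rust
/// Local stand-in for the calling conventions mentioned in the review.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum CallConv {
    Tail,
    SystemV,
}

/// Local stand-in for the near/far distinction used to pick the call form.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum RelocDistance {
    Near,
    Far,
}

/// Hypothetical result of classifying a call site.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum CallKind {
    PulleyToPulley,
    PulleyToHost,
}

/// Choose the call form from the relocation distance, and assert that the
/// calling convention agrees with that choice.
fn classify_call(distance: RelocDistance, conv: CallConv) -> CallKind {
    match distance {
        RelocDistance::Near => {
            assert_eq!(conv, CallConv::Tail, "pulley-to-pulley calls should use `tail`");
            CallKind::PulleyToPulley
        }
        RelocDistance::Far => {
            assert_eq!(conv, CallConv::SystemV, "pulley-to-host calls should use `systemv`");
            CallKind::PulleyToHost
        }
    }
}
```

The relocation distance stays the deciding input, and the calling convention acts only as a consistency check, so a mismatch panics early instead of silently emitting the wrong call instruction.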
Handling review comments from bytecodealliance#9665 and fully updating documentation to reflect the mid-PR design shift to the currently-landed state.
* Enumerate all host calls in `wasmtime_environ::HostCall`
* Rename `max` to `len`