-
-
Notifications
You must be signed in to change notification settings - Fork 421
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stack alloc reduce #4191
base: main
Are you sure you want to change the base?
Stack alloc reduce #4191
Conversation
~20%: stack overflow occurs after 77 rec fn calls VS 63 before changes of the commit
before this change stack size in `expression!:parse_boxed` was 0x118, after the changes -- 0xC8 ~ 30% less; stack size of `ExponentialExpression:parse_boxed` also reduced from 0xE0 to 0x90 ~ 35% less; And before all commits in the branch: * `expression!:parse_boxed` was 0x368, so ~75% less; * `ExponentialExpression:parse_boxed` was 0x360, so ~85% less;
from 0x4D8 to 0x1A8 (~65% less)
inlined for `BitwiseORExpression` & `ShiftExpression` (top called of parsed expression which generated by `expression!`); Looks ugly, need to make more readable & generate unwrapping by the macro itself
I would like to see the performance impact of this one, in release. Would you mind running the v8 combined benchmarks? cargo run -p boa_cli --release -- -O $SRC/boa-dev/data/bench/bench-v8/combined.js (with |
Current results of the bench.
Previous(checkout 35b44d9) results of the bench.
Seems like a noise error. And it's pretty predictable, because as I can understand the bench tests the perf of js execution, and PR's changes affect only the parser stage. So is there a bench to test perf of the parser stage? |
Nice! I'll review when I have some time this weekend. |
/// A struct which represents errors encountered during parsing an expression. | ||
/// Contains boxed `ErrorInner` to reduce memory allocation on the stack. | ||
#[derive(Debug)] | ||
pub struct Error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this make a difference? Note that the previous Error
is rather small (32 bytes?), and this code should not clone errors themselves anyway.
BTW, still reviewing the rest, I haven't forgotten about you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previous Error
(that now named ErrorInner
) when compiled with 8 bytes sized pointer has a size of 72 bytes:
Expected
variant has 3 fat pointers(2 * sizeof(ptr)
each) + Span(16 bytes
) = 3 * 2 * 8 + 16
= 64 bytes
+ (because it's not only variant of the enum) its maximum align (8 bytes
) = 72 bytes
; In case of 4 bytes sized pointer = 3 * 2 * 4 + 16 + 4(align)
= 44 bytes
.
For Result<T, E>
the allocated stack size will be:
Result<Box<Expr>, ErrorInner>
--> 72 bytes (no additional align because of null pointer optimization).Result<Box<Expr>, Error>
--> 16 bytes.
So, to me it seems ok to have an extra boxed value in case of an error, but save at least 56 bytes on stack per a call(with low optimization it for sure will be more than 56 bytes because of stack allocation for each call in different execution branches). ... (function()( ...
has ~20 calls per one recursion, so this changes at least save 56 * 20 = 1120 bytes per one recursive repetition of the expression.
Another question I would have is whether the new |
Hmm... |
This Pull Request closes #4089
Reduce the stack size. Mostly affected expression with nested parenthesizes.
Some examples & comparison charts
n
without stack overflow
n
without stack overflow
opt-level = 0
)opt-level = 1
)n
without stack overflow
n
without stack overflow
opt-level = 0
)opt-level = 1
)Now there no stack overflow when running the
boa_tester
in debug(cargo run -p boa_tester -- run -v -o test262
). The problem was heretest262\test\language\statements\function\S13.2.1_A1_T1.js
, so you can runcargo run -p boa_tester -- run -s test/language/statements/function/
to test the fix. For more details see the issue.The reduction performed by 3 operations:
profile
s and allow to reduce the stack size by ~ 3 times;parse
calls was unwrapped. Also affected allprofile
s;parse
function. Affected onlyprofile.dev
(opt-level = 0
) where the problem arose. Allow to reduce the stack size by ~ 2 times (and with boxing it becomes ~ 6 times).There are some ugly changes that need to be discussed & maybe rewritten to a less ugly:
All of the changes reduce the stack size, so even if they are ugly, they are still not meaningless :/