Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stack alloc reduce #4191

Open
wants to merge 14 commits into
base: main
Choose a base branch
from
Open

Conversation

Nikita-str
Copy link
Contributor

This Pull Request closes #4089

Reduce the stack size. Mostly affected expression with nested parenthesizes.

Some examples & comparison charts
fn parenthesized_fn_test() {
    let n = ...;
    let js = "(function(){ ".repeat(n) + &" })()".repeat(n);

    let interner = &mut Interner::default();
    crate::Parser::new(crate::Source::from_bytes(&js))
        .parse_script(&boa_ast::scope::Scope::new_global(), interner)
        .expect("failed to parse");
}
profile prev max n
without stack overflow
new max n
without stack overflow
x
dev (opt-level = 0) 11 79 ~6
test (opt-level = 1) 50 160 ~3
release 50 217 ~4

fn parenthesized_sum_test() {
    let n = ...;
    let js = "(1+".repeat(n) + "41" + &")".repeat(n);

    let interner = &mut Interner::default();
    crate::Parser::new(crate::Source::from_bytes(&js))
        .parse_script(&boa_ast::scope::Scope::new_global(), interner)
        .expect("failed to parse");
}
profile prev max n
without stack overflow
new max n
without stack overflow
x
dev (opt-level = 0) 30 200 ~6.5
test (opt-level = 1) 105 390 ~3.5
release 105 510 ~4.5

Now there no stack overflow when running the boa_tester in debug(cargo run -p boa_tester -- run -v -o test262 ). The problem was here test262\test\language\statements\function\S13.2.1_A1_T1.js, so you can run cargo run -p boa_tester -- run -s test/language/statements/function/ to test the fix. For more details see the issue.


The reduction performed by 3 operations:

  1. boxing of large objects. It's affected all profiles and allow to reduce the stack size by ~ 3 times;
  2. In one place nested parse calls was unwrapped. Also affected all profiles;
  3. Splitting parse function. Affected only profile.dev (opt-level = 0) where the problem arose. Allow to reduce the stack size by ~ 2 times (and with boxing it becomes ~ 6 times).

There are some ugly changes that need to be discussed & maybe rewritten to a less ugly:

All of the changes reduce the stack size, so even if they are ugly, they are still not meaningless :/

~20%: stack overflow occurs after 77 rec fn calls VS 63 before changes of the commit
before this change stack size in `expression!:parse_boxed` was 0x118, after the changes -- 0xC8 ~ 30% less;
stack size of `ExponentialExpression:parse_boxed` also reduced from 0xE0 to 0x90 ~ 35% less;

And before all commits in the branch:
* `expression!:parse_boxed` was 0x368, so ~75% less;
* `ExponentialExpression:parse_boxed` was 0x360, so ~85% less;
from 0x4D8 to 0x1A8 (~65% less)
inlined for `BitwiseORExpression` & `ShiftExpression` (top called of parsed expression which generated by `expression!`);

Looks ugly, need to make more readable & generate unwrapping by the macro itself
Copy link

codecov bot commented Feb 27, 2025

Codecov Report

Attention: Patch coverage is 71.97232% with 162 lines in your changes missing coverage. Please review.

Project coverage is 52.78%. Comparing base (6ddc2b4) to head (e324a18).
Report is 385 commits behind head on main.

Files with missing lines Patch % Lines
core/parser/src/parser/expression/primary/mod.rs 45.45% 60 Missing ⚠️
core/parser/src/parser/expression/mod.rs 76.08% 22 Missing ⚠️
...ore/parser/src/parser/expression/assignment/mod.rs 68.65% 21 Missing ⚠️
...parser/src/parser/expression/left_hand_side/mod.rs 52.27% 21 Missing ⚠️
...ser/src/parser/expression/left_hand_side/member.rs 62.96% 10 Missing ⚠️
core/parser/src/parser/expression/update.rs 91.11% 4 Missing ⚠️
core/ast/src/expression/call.rs 62.50% 3 Missing ⚠️
core/ast/src/expression/operator/binary/mod.rs 62.50% 3 Missing ⚠️
...e/parser/src/parser/expression/assignment/yield.rs 40.00% 3 Missing ⚠️
...arser/src/parser/expression/left_hand_side/call.rs 70.00% 3 Missing ⚠️
... and 10 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4191      +/-   ##
==========================================
+ Coverage   47.24%   52.78%   +5.54%     
==========================================
  Files         476      488      +12     
  Lines       46892    52236    +5344     
==========================================
+ Hits        22154    27575    +5421     
+ Misses      24738    24661      -77     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@hansl
Copy link
Contributor

hansl commented Feb 28, 2025

I would like to see the performance impact of this one, in release. Would you mind running the v8 combined benchmarks?

cargo run -p boa_cli --release -- -O $SRC/boa-dev/data/bench/bench-v8/combined.js

(with $SRC being wherever your root for boa-dev/data repo is)

@Nikita-str
Copy link
Contributor Author

Current results of the bench.
PROGRESS Richards
RESULT Richards 72.8
PROGRESS DeltaBlue
RESULT DeltaBlue 64.4
PROGRESS Encrypt
PROGRESS Decrypt
RESULT Crypto 144
PROGRESS RayTrace
RESULT RayTrace 196
PROGRESS Earley
PROGRESS Boyer
RESULT EarleyBoyer 229
PROGRESS RegExp
RESULT RegExp 61.0
PROGRESS Splay
RESULT Splay 355
PROGRESS NavierStokes
RESULT NavierStokes 337
SCORE 147
undefined
Previous(checkout 35b44d9) results of the bench.
PROGRESS Richards
RESULT Richards 76.0
PROGRESS DeltaBlue
RESULT DeltaBlue 64.8
PROGRESS Encrypt
PROGRESS Decrypt
RESULT Crypto 140
PROGRESS RayTrace
RESULT RayTrace 190
PROGRESS Earley
PROGRESS Boyer
RESULT EarleyBoyer 230
PROGRESS RegExp
RESULT RegExp 61.2
PROGRESS Splay
RESULT Splay 357
PROGRESS NavierStokes
RESULT NavierStokes 334
SCORE 147
undefined

Seems like a noise error. And it's pretty predictable, because as I can understand the bench tests the perf of js execution, and PR's changes affect only the parser stage. So is there a bench to test perf of the parser stage?

@hansl
Copy link
Contributor

hansl commented Feb 28, 2025

Nice! I'll review when I have some time this weekend.

/// A struct which represents errors encountered during parsing an expression.
/// Contains boxed `ErrorInner` to reduce memory allocation on the stack.
#[derive(Debug)]
pub struct Error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this make a difference? Note that the previous Error is rather small (32 bytes?), and this code should not clone errors themselves anyway.

BTW, still reviewing the rest, I haven't forgotten about you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previous Error (that now named ErrorInner) when compiled with 8 bytes sized pointer has a size of 72 bytes:
Expected variant has 3 fat pointers(2 * sizeof(ptr) each) + Span(16 bytes) = 3 * 2 * 8 + 16 = 64 bytes + (because it's not only variant of the enum) its maximum align (8 bytes) = 72 bytes; In case of 4 bytes sized pointer = 3 * 2 * 4 + 16 + 4(align) = 44 bytes.

For Result<T, E> the allocated stack size will be:

  • Result<Box<Expr>, ErrorInner> --> 72 bytes (no additional align because of null pointer optimization).
  • Result<Box<Expr>, Error> --> 16 bytes.

So, to me it seems ok to have an extra boxed value in case of an error, but save at least 56 bytes on stack per a call(with low optimization it for sure will be more than 56 bytes because of stack allocation for each call in different execution branches). ... (function()( ... has ~20 calls per one recursion, so this changes at least save 56 * 20 = 1120 bytes per one recursive repetition of the expression.

@hansl
Copy link
Contributor

hansl commented Mar 8, 2025

Another question I would have is whether the new _boxed constructors are necessary, instead of simply replacing the regular constructors. WDYT?

@Nikita-str
Copy link
Contributor Author

Another question I would have is whether the new _boxed constructors are necessary, instead of simply replacing the regular constructors. WDYT?

Hmm... new_boxed shows that the return value is not Self but Box<Self>; and sometimes there is another code which does not need Box<T>.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

boa_tester stack overflows in debug mode
2 participants