Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flyweight wrapping error race condition #146

Closed
jfallows opened this issue Jan 15, 2023 · 2 comments · Fixed by #161
Closed

Flyweight wrapping error race condition #146

jfallows opened this issue Jan 15, 2023 · 2 comments · Fixed by #161
Assignees
Labels
bug Something isn't working

Comments

@jfallows
Copy link
Contributor

jfallows commented Jan 15, 2023

Describe the bug
Unexpected exception while wrapping header in HTTP 1.1 response for 404 not found due to stream RESET.

org.agrona.concurrent.AgentTerminationException: java.lang.IndexOutOfBoundsException: limit=40 is beyond maxLimit=39
2023-01-15 12:06:01     at io.aklivity.zilla.runtime.engine/io.aklivity.zilla.runtime.engine.internal.registry.DispatchAgent.doWork(DispatchAgent.java:586)
2023-01-15 12:06:01     at org.agrona.core/org.agrona.concurrent.AgentRunner.doDutyCycle(AgentRunner.java:291)
2023-01-15 12:06:01     at org.agrona.core/org.agrona.concurrent.AgentRunner.run(AgentRunner.java:164)
2023-01-15 12:06:01     at java.base/java.lang.Thread.run(Thread.java:833)
2023-01-15 12:06:01 Caused by: java.lang.IndexOutOfBoundsException: limit=40 is beyond maxLimit=39
2023-01-15 12:06:01     at io.aklivity.zilla.runtime.binding.http/io.aklivity.zilla.runtime.binding.http.internal.types.Flyweight.checkLimit(Flyweight.java:67)
2023-01-15 12:06:01     at io.aklivity.zilla.runtime.binding.http/io.aklivity.zilla.runtime.binding.http.internal.types.String8FW.wrap(String8FW.java:70)
2023-01-15 12:06:01     at io.aklivity.zilla.runtime.binding.http/io.aklivity.zilla.runtime.binding.http.internal.types.HttpHeaderFW.wrap(HttpHeaderFW.java:28)
2023-01-15 12:06:01     at io.aklivity.zilla.runtime.binding.http/io.aklivity.zilla.runtime.binding.http.internal.types.HttpHeaderFW.wrap(HttpHeaderFW.java:8)
2023-01-15 12:06:01     at io.aklivity.zilla.runtime.binding.http/io.aklivity.zilla.runtime.binding.http.internal.types.Flyweight.wrap(Flyweight.java:60)
2023-01-15 12:06:01     at io.aklivity.zilla.runtime.binding.http/io.aklivity.zilla.runtime.binding.http.internal.types.Array32FW.matchFirst(Array32FW.java:90)
2023-01-15 12:06:01     at io.aklivity.zilla.runtime.binding.http/io.aklivity.zilla.runtime.binding.http.internal.stream.HttpServerFactory$HttpServer.doEncodeHeaders(HttpServerFactory.java:2266)
2023-01-15 12:06:01     at io.aklivity.zilla.runtime.binding.http/io.aklivity.zilla.runtime.binding.http.internal.stream.HttpServerFactory$HttpServer$HttpExchange.onRequestReset(HttpServerFactory.java:2826)
2023-01-15 12:06:01     at io.aklivity.zilla.runtime.binding.http/io.aklivity.zilla.runtime.binding.http.internal.stream.HttpServerFactory$HttpServer$HttpExchange.onExchange(HttpServerFactory.java:2788)
2023-01-15 12:06:01     at io.aklivity.zilla.runtime.engine/io.aklivity.zilla.runtime.engine.internal.registry.DispatchAgent.handleReadInitial(DispatchAgent.java:1039)
2023-01-15 12:06:01     at io.aklivity.zilla.runtime.engine/io.aklivity.zilla.runtime.engine.internal.registry.DispatchAgent.handleRead(DispatchAgent.java:967)
2023-01-15 12:06:01     at io.aklivity.zilla.runtime.engine/io.aklivity.zilla.runtime.engine.internal.concurent.ManyToOneRingBuffer.read(ManyToOneRingBuffer.java:181)
2023-01-15 12:06:01     at io.aklivity.zilla.runtime.engine/io.aklivity.zilla.runtime.engine.internal.registry.DispatchAgent.doWork(DispatchAgent.java:580)
2023-01-15 12:06:01     ... 3 more
2023-01-15 12:06:01     Suppressed: java.lang.Exception: [engine/data#1]        [0x010100000000000d] streams=[consumeAt=0x00000c78 (0x0000000000000c78), produceAt=0x00000c78 (0x0000000000000c78)]
2023-01-15 12:06:01             at io.aklivity.zilla.runtime.engine/io.aklivity.zilla.runtime.engine.internal.registry.DispatchAgent.doWork(DispatchAgent.java:584)
2023-01-15 12:06:01             ... 3 more
2023-01-15 12:06:01 org.agrona.concurrent.AgentTerminationException: java.lang.IndexOutOfBoundsException: limit=40 is beyond maxLimit=39
2023-01-15 12:06:01     at io.aklivity.zilla.runtime.engine/io.aklivity.zilla.runtime.engine.internal.registry.DispatchAgent.doWork(DispatchAgent.java:586)
2023-01-15 12:06:01     at org.agrona.core/org.agrona.concurrent.AgentRunner.doDutyCycle(AgentRunner.java:291)
2023-01-15 12:06:01     at org.agrona.core/org.agrona.concurrent.AgentRunner.run(AgentRunner.java:164)
2023-01-15 12:06:01     at java.base/java.lang.Thread.run(Thread.java:833)
2023-01-15 12:06:01 Caused by: java.lang.IndexOutOfBoundsException: limit=40 is beyond maxLimit=39
2023-01-15 12:06:01     at io.aklivity.zilla.runtime.binding.http/io.aklivity.zilla.runtime.binding.http.internal.types.Flyweight.checkLimit(Flyweight.java:67)
2023-01-15 12:06:01     at io.aklivity.zilla.runtime.binding.http/io.aklivity.zilla.runtime.binding.http.internal.types.String8FW.wrap(String8FW.java:70)
2023-01-15 12:06:01     at io.aklivity.zilla.runtime.binding.http/io.aklivity.zilla.runtime.binding.http.internal.types.HttpHeaderFW.wrap(HttpHeaderFW.java:28)
2023-01-15 12:06:01     at io.aklivity.zilla.runtime.binding.http/io.aklivity.zilla.runtime.binding.http.internal.types.HttpHeaderFW.wrap(HttpHeaderFW.java:8)
2023-01-15 12:06:01     at io.aklivity.zilla.runtime.binding.http/io.aklivity.zilla.runtime.binding.http.internal.types.Flyweight.wrap(Flyweight.java:60)
2023-01-15 12:06:01     at io.aklivity.zilla.runtime.binding.http/io.aklivity.zilla.runtime.binding.http.internal.types.Array32FW.matchFirst(Array32FW.java:90)
2023-01-15 12:06:01     at io.aklivity.zilla.runtime.binding.http/io.aklivity.zilla.runtime.binding.http.internal.stream.HttpServerFactory$HttpServer.doEncodeHeaders(HttpServerFactory.java:2263)
2023-01-15 12:06:01     at io.aklivity.zilla.runtime.binding.http/io.aklivity.zilla.runtime.binding.http.internal.stream.HttpServerFactory$HttpServer$HttpExchange.onRequestReset(HttpServerFactory.java:2826)
2023-01-15 12:06:01     at io.aklivity.zilla.runtime.binding.http/io.aklivity.zilla.runtime.binding.http.internal.stream.HttpServerFactory$HttpServer$HttpExchange.onExchange(HttpServerFactory.java:2788)
2023-01-15 12:06:01     at io.aklivity.zilla.runtime.engine/io.aklivity.zilla.runtime.engine.internal.registry.DispatchAgent.handleReadInitial(DispatchAgent.java:1039)
2023-01-15 12:06:01     at io.aklivity.zilla.runtime.engine/io.aklivity.zilla.runtime.engine.internal.registry.DispatchAgent.handleRead(DispatchAgent.java:967)
2023-01-15 12:06:01     at io.aklivity.zilla.runtime.engine/io.aklivity.zilla.runtime.engine.internal.concurent.ManyToOneRingBuffer.read(ManyToOneRingBuffer.java:181)
2023-01-15 12:06:01     at io.aklivity.zilla.runtime.engine/io.aklivity.zilla.runtime.engine.internal.registry.DispatchAgent.doWork(DispatchAgent.java:580)
2023-01-15 12:06:01     ... 3 more
2023-01-15 12:06:01     Suppressed: java.lang.Exception: [engine/data#2]        [0x020200000000000d] streams=[consumeAt=0x00000c78 (0x0000000000000c78), produceAt=0x00000c78 (0x0000000000000c78)]
2023-01-15 12:06:01             at io.aklivity.zilla.runtime.engine/io.aklivity.zilla.runtime.engine.internal.registry.DispatchAgent.doWork(DispatchAgent.java:584)
2023-01-15 12:06:01             ... 3 more
2023-01-15 12:06:01 stopped

To Reproduce
Steps to reproduce the behavior:

  1. Use http.echo example
  2. In zilla.json, change http_server0 exit to "none"
  3. Start the docker stack per README
  4. echo "Hello, world" > /tmp/postfile
  5. Use ab -v 4 -n 20 -c 20 -p /tmp/postfile -T "text/plain" -m "POST" http://localhost:8080/ to trigger bug
    Expected behavior
    Zilla should not trigger an exception.

Additional context
This is triggered because when stream RESET occurs, in this case due to exit binding not found, we automatically send a 404 not found response, based on static field HEADERS_404_NOT_FOUND in HttpServerFactory.

During encode, the HTTP 1.1 server factory uses the matchFirst method on the headers, which mutates the internal header item flyweight, causing a race condition across threads.

Note: this bug no longer reproduces when zilla is configured via stack.yml to use one thread only, via -w 1 command line argument which sets workers to 1.

@jfallows jfallows added the bug Something isn't working label Jan 15, 2023
@jfallows
Copy link
Contributor Author

Suggested fix, change static fields such as HEADERS_404_NOT_FOUND on HttpServerFactory to non-static so there is a separate instance of them per instance of HttpServerFactory, which has exactly one instance per worker thread.

@attilakreiner
Copy link
Contributor

I bumped into a very similar error case, with an almost identical stacktrace, just with the Http2* classes. Posting the stacktrace here FTR:

org.agrona.concurrent.AgentTerminationException: java.lang.IndexOutOfBoundsException: limit=40 is beyond maxLimit=39
    at [email protected]/io.aklivity.zilla.runtime.engine.internal.registry.DispatchAgent.doWork(DispatchAgent.java:585)
    at org.agrona.core/org.agrona.concurrent.AgentRunner.doDutyCycle(AgentRunner.java:291)
    at org.agrona.core/org.agrona.concurrent.AgentRunner.run(AgentRunner.java:164)
    at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.lang.IndexOutOfBoundsException: limit=40 is beyond maxLimit=39
    at [email protected]/io.aklivity.zilla.runtime.binding.http.internal.types.Flyweight.checkLimit(Flyweight.java:67)
    at [email protected]/io.aklivity.zilla.runtime.binding.http.internal.types.String8FW.wrap(String8FW.java:70)
    at [email protected]/io.aklivity.zilla.runtime.binding.http.internal.types.HttpHeaderFW.wrap(HttpHeaderFW.java:28)
    at [email protected]/io.aklivity.zilla.runtime.binding.http.internal.types.HttpHeaderFW.wrap(HttpHeaderFW.java:8)
    at [email protected]/io.aklivity.zilla.runtime.binding.http.internal.types.Flyweight.wrap(Flyweight.java:60)
    at [email protected]/io.aklivity.zilla.runtime.binding.http.internal.types.Array32FW.forEach(Array32FW.java:67)
    at [email protected]/io.aklivity.zilla.runtime.binding.http.internal.stream.HttpServerFactory$Http2HeadersEncoder.encodeHeaders(HttpServerFactory.java:6464)
    at [email protected]/io.aklivity.zilla.runtime.binding.http.internal.stream.HttpServerFactory$Http2Server.lambda$doEncodeHeaders$27(HttpServerFactory.java:5263)
    at [email protected]/io.aklivity.zilla.runtime.binding.http.internal.codec.Http2HeadersFW$Builder.headers(Http2HeadersFW.java:231)
    at [email protected]/io.aklivity.zilla.runtime.binding.http.internal.stream.HttpServerFactory$Http2Server.doEncodeHeaders(HttpServerFactory.java:5263)
    at [email protected]/io.aklivity.zilla.runtime.binding.http.internal.stream.HttpServerFactory$Http2Server$Http2Exchange.onRequestReset(HttpServerFactory.java:5648)
    at [email protected]/io.aklivity.zilla.runtime.binding.http.internal.stream.HttpServerFactory$Http2Server$Http2Exchange.onExchange(HttpServerFactory.java:5606)
    at [email protected]/io.aklivity.zilla.runtime.engine.internal.registry.DispatchAgent.handleReadInitial(DispatchAgent.java:1038)
    at [email protected]/io.aklivity.zilla.runtime.engine.internal.registry.DispatchAgent.handleRead(DispatchAgent.java:966)
stopped
    at [email protected]/io.aklivity.zilla.runtime.engine.internal.concurent.ManyToOneRingBuffer.read(ManyToOneRingBuffer.java:181)
    at [email protected]/io.aklivity.zilla.runtime.engine.internal.registry.DispatchAgent.doWork(DispatchAgent.java:579)
    ... 3 more
    Suppressed: java.lang.Exception: [engine/data#2]    [0x020200000000000b] streams=[consumeAt=0x000036c8 (0x00000000000036c8), produceAt=0x00003778 (0x0000000000003778)]
        at [email protected]/io.aklivity.zilla.runtime.engine.internal.registry.DispatchAgent.doWork(DispatchAgent.java:583)
        ... 3 more

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants