I recently installed sysdig on a test cluster. As it happens, it's the same cluster I run load tests on. While sysdig was running I started a load test. Initially the kangal controller timed out creating Kubernetes resources, so I increased the Kubernetes client timeout.
After that, the kangal controller was still unable to create all of the Kubernetes resources on the first pass, but it succeeded on the second attempt. The error and stack trace are included.
Feb 12 09:30:50.961 kangal-controller E0212 14:30:50.108353 1 loadtest.go:472] there is a conflict with loadtest 'loadtest-coiling-lightningbug' between datastore and cache. it might be because object has been removed or modified in the datastore
Feb 12 09:30:50.961 kangal-controller Created JMeter resources
Feb 12 09:30:40.866 kangal-controller Created pods with test data
Feb 12 09:30:10.769 kangal-controller Remote custom data enabled, creating PVC
Feb 12 09:29:55.762 kangal-controller E0212 14:29:54.895207 1 loadtest.go:309] error syncing 'loadtest-coiling-lightningbug': client rate limiter Wait returned an error: context deadline exceeded, requeuing
Feb 12 09:29:55.762 kangal-controller error syncing loadtest, re-queuing
Feb 12 09:29:55.762 kangal-controller Error on creating new JMeter service
Feb 12 09:29:55.762 kangal-controller Created pods with test data
Feb 12 09:29:15.659 kangal-controller Remote custom data enabled, creating PVC
Feb 12 09:29:00.590 kangal-controller Created new namespace
I uninstalled sysdig and Kubernetes API response times were much peppier. With sysdig removed, the kangal controller also succeeds on its first pass. I'm already in touch with sysdig support regarding the problem. Clearly they have some work to do. But maybe kangal does as well?
Solution?
I'm not really sure what the expectation of flow control is... Should this be the exclusive province of cluster admins? Should charts offer some guidance for their apps? Should kangal include a priority level configuration and flow schema for its service account?
What do folks think?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This seems related, so posting here. We've been noticing an issue where load tests wouldn't start due to this error:
E0306 10:26:24.096471 1 loadtest.go:309] error syncing 'loadtest-ns': Post "https://172.20.0.1:443/api/v1/namespaces?timeout=5s": net/http: request canceled (Client.Timeout exceeded while awaiting headers), requeuing
We've now found the root cause of the delay in creating a namespace: it was the webhook amazon-cloudwatch-observability-mutating-webhook-configuration. Either removing the webhook or reducing its timeout to 3 seconds fixed the loadtest issue.
It would be nice to be able to configure the timeout for namespace creation on the kangal side, for example 15 seconds rather than 5, as I'm not aware of any functional requirement for it to complete within 5 seconds. Obviously you'd still want to find the root cause, but I'm not sure kangal load tests need to stop working when the issue does occur.