# Seeing these vulnerability issues on Dogfood every night #25090
Comments
Digging deeper on this: the deletion should only be picking up files that exist, so if we're attempting to delete files that we can't find, something else may be at play. Still troubleshooting this.
For #25090. Not 100% sure why we're seeing this issue, but this will drop the error severity. Even if a delete fails and leaves a file behind, we'll still pick the correct (latest) MSRC file for each OS, so this is low-risk.
@rfairburn Why might these files be getting deleted between the time we calculate the delta of files to download/delete and the time we actually delete them? We're consistently seeing this once per day: one file fails to delete in one hour (~1:22a UTC) and one fails the next hour (~2:22a UTC). Given that we run the vulns cron hourly, this seems odd. Before downgrading the error (see the associated PR), I want to know why we're seeing this.
Does Dogfood already have the fix that prevents alerts from repeating every time a cron runs until the service is restarted? I'm pretty sure that made it into the RC, but I'm not sure whether that version of the RC has been applied to Dogfood yet. This could have been a one-off for any number of reasons (for example, we don't have persistent or shared storage at all, since containers are intended to be as stateless as possible), but it would alert on every cron interval with the same error if the alerting fix has not been deployed yet.
The info above was from CloudWatch Logs, not alerts, and it only happens 2x per day (but it happens 2x every day), so I don't think it's a one-off, nor is it related to the repeating-alerts thing...I think?
…ilds of the same version of Windows

For #25090.
@rfairburn Your theory on this being due to multiple matches to the same MSRC file to delete was a sound one. Different builds of the same version of Windows is the culprit here (which is why you didn't see this in every environment). The included PR fixes that issue; thanks for the assist here! |
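The fix described above can be sketched in Go: if two Windows builds of the same OS version both resolve to the same stale MSRC file, a naive delete loop tries to remove it twice and fails the second time. De-duplicating the delete list first avoids that. The function and path names below are illustrative assumptions, not Fleet's actual implementation.

```go
package main

import "fmt"

// dedupeDeletes removes duplicate paths from a delete list so the
// same MSRC file is never deleted twice. Hypothetical helper, not
// Fleet's real sync code.
func dedupeDeletes(paths []string) []string {
	seen := make(map[string]struct{}, len(paths))
	out := make([]string, 0, len(paths))
	for _, p := range paths {
		if _, ok := seen[p]; ok {
			continue // two builds of the same Windows version mapped here
		}
		seen[p] = struct{}{}
		out = append(out, p)
	}
	return out
}

func main() {
	// Illustrative: two builds resolve to the same stale bulletin file.
	deletes := []string{
		"msrc/Windows_11-old.json",
		"msrc/Windows_10-old.json",
		"msrc/Windows_11-old.json",
	}
	fmt.Println(dedupeDeletes(deletes)) // each path appears once
}
```

With the list de-duplicated, the second build's match no longer triggers a delete of an already-removed file, which matches the "multiple matches to the same MSRC file" diagnosis above.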
…ilds of the same version of Windows (#27060)

For #25090.

# Checklist for submitter

If some of the following don't apply, delete the relevant line. <!-- Note that API documentation changes are now addressed by the product design team. -->

- [x] Changes file added for user-visible changes in `changes/`, `orbit/changes/` or `ee/fleetd-chrome/changes`. See [Changes files](https://github.com/fleetdm/fleet/blob/main/docs/Contributing/Committing-Changes.md#changes-files) for more information.
- [x] Input data is properly validated, `SELECT *` is avoided, SQL injection is prevented (using placeholders for values in statements)
- [x] Added/updated automated tests
- [x] A detailed QA plan exists on the associated ticket (if it isn't there, work with the product group's QA engineer to add it)
- [x] Manual QA for all new/changed functionality
@lucasmrod: It does seem benign, so we should log the error as debug and continue the for loop.
fleet/server/vulnerabilities/msrc/sync.go
Line 106 in 7ac39e2
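The "log as debug and continue" approach suggested above can be sketched as follows. `removeIfPresent` and the debug print are hypothetical stand-ins, not the actual code at the referenced line in `sync.go`.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// removeIfPresent deletes path but treats "file does not exist" as
// benign: it logs at debug level (fmt.Printf stands in for a real
// logger here) and returns nil so the caller's loop can continue.
func removeIfPresent(path string) error {
	if err := os.Remove(path); err != nil {
		if errors.Is(err, os.ErrNotExist) {
			fmt.Printf("debug: %s already gone, skipping\n", path)
			return nil
		}
		return err // anything else (permissions, etc.) is still an error
	}
	return nil
}

func main() {
	// A file deleted by an earlier match no longer errors the sync.
	if err := removeIfPresent("already-deleted.json"); err != nil {
		fmt.Println("unexpected error:", err)
	}
}
```

This keeps the original error severity only for genuinely unexpected failures, while the already-deleted case (the duplicate-match scenario diagnosed in this issue) is downgraded to a debug log.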
Repro/QA
Repro will show the above error; the fixed version won't.