dbus authentication regression #9553

Closed
ajeddeloh opened this issue Jul 9, 2018 · 6 comments
Labels
needs-reporter-feedback ❓ There's an unanswered question, the reporter needs to answer

Comments

@ajeddeloh
Contributor

systemd version the issue has been seen with
239
Bug is not present in 238

Used distribution
Gentoo, Container Linux

Expected behaviour you didn't see
No dbus authentication failures

Unexpected behaviour you saw
go-systemd gets dbus: authentication protocol error when trying to wait on a device unit.
[email protected] fails on container linux

Steps to reproduce the problem
Run the Ignition blackbox tests on a system with systemd 239.
I also tried to reproduce it with the go-systemd tests, but they all seemed to pass. Sorry I don't have a cleaner repro.

Other notes
This was originally reported as part of #9456 (see specifically this comment)
On Container Linux we've seen failures in our integration tests.
I've also hit it when running Ignition blackbox tests (which use go-systemd) on my host system.

cc @lucab

@poettering
Member

poettering commented Jul 13, 2018

What shall I make of "dbus: authentication protocol error"? That's not useful at all... Please provide details on what precisely this error means in go-systemd. Why do you believe there's a bug in systemd here, rather than in go-systemd? Can you reproduce this with any other dbus client, for example "busctl"?

Is this a client that talks to dbus-daemon, or one that uses the private socket of PID 1 and talks directly to it? Is there anything else in the logs?

Is this reproducible on other distributions? What makes you think this is an upstream systemd bug rather than one in ContainerLinux?

@poettering poettering added the needs-reporter-feedback ❓ There's an unanswered question, the reporter needs to answer label Jul 13, 2018
@filbranden
Member

@ajeddeloh Try using dbus-monitor to see the DBus messages while you can reproduce the error.

That message is coming from the auth code in godbus, but unfortunately it's not detailed enough to help troubleshoot it...

Running dbus-monitor (perhaps dbus-monitor --system as root, assuming you're talking to pid1) might help shed some light on that.

@ajeddeloh
Contributor Author

@poettering I haven't been able to repro with anything but go-systemd as used by Ignition (which uses godbus, a dbus library for Go). I know that's not exactly helpful. It should be using dbus, not the private socket. I haven't seen anything interesting in the systemd logs (with debug log level).

@filbranden I don't see the details of the auth handshake in the dbus-monitor --system output. Is there some special trick to get it? I did, however, hack up the godbus code to log what it's receiving, and it looks like it's getting ERROR back when it sends AUTH, instead of REJECTED <list of methods>. It's worth noting this doesn't happen every time, only occasionally. The godbus code hasn't been changed in a while, so if there's an error there, it's a pretty old one.

@poettering
Member

@ajeddeloh hmm, can you paste a full strace of the negotiation phase of the connection?

Do you have anything in the logs from dbus-daemon's side?

@filbranden
Member

Ok so I managed to figure out where this is failing.

Ignition is trying to connect to systemd directly, not through DBus. This connects to the /run/systemd/private Unix socket.

By connecting directly to this Unix socket and replaying the authentication protocol as implemented in the Go code, I can reproduce this issue using netcat alone. On systemd v239 I get "ERROR":

$ echo -ne '\0AUTH\r\n' | sudo nc -U /run/systemd/private
ERROR

On older systemd (v233 on a Debian Stretch) I get "REJECTED" instead:

$ echo -ne '\0AUTH\r\n' | sudo nc -U /run/systemd/private
REJECTED EXTERNAL ANONYMOUS

The authentication code is prepared to deal with "REJECTED", but not with "ERROR".
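
To make the difference between the two replies concrete, here is a minimal Python sketch of the same probe. It is not the godbus code; the helper names `probe_auth` and `classify_reply` are made up for illustration, and the only things taken from this thread are the bytes sent and the two replies observed above.

```python
import socket


def probe_auth(sock):
    """Send a NUL byte plus a bare AUTH command and return the first reply line."""
    # Same bytes as the netcat one-liner: '\0AUTH\r\n'.
    sock.sendall(b"\0AUTH\r\n")
    buf = b""
    # Read until the first CRLF-terminated line arrives (or the peer closes).
    while b"\r\n" not in buf:
        chunk = sock.recv(256)
        if not chunk:
            break
        buf += chunk
    return buf.split(b"\r\n", 1)[0].decode("ascii", "replace")


def classify_reply(line):
    """Hypothetical helper: sort a reply line into the cases seen in this issue."""
    parts = line.split()
    if parts and parts[0] == "REJECTED":
        # The server lists the mechanisms it supports; the client can pick one.
        return ("rejected", parts[1:])
    if parts and parts[0] == "ERROR":
        # The server did not accept the command at all.
        return ("error", [])
    return ("unexpected", parts)
```

To try it against a real system, one would create a `socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)`, connect it to /run/systemd/private as root, and pass it to `probe_auth`. A client that only handles the "rejected" case will treat the "error" case as a protocol failure, which matches the message reported at the top of this issue.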

I looked quickly at the systemd DBus auth code and didn't see anything interesting there (it looks like most of that code hasn't been touched since 2013), but hopefully this is enough of a hint for you to pick it up and figure out what caused this change and what's the best path forward.

Cheers!
Filipe

@filbranden
Member

Looks like there's an off-by-one in line_begins introduced in d27b725... Investigating further.
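
For readers unfamiliar with this class of bug: the netcat repro sends a bare AUTH with nothing after it, which is exactly the boundary case a "does this line begin with this word" check has to get right. The Python sketch below is purely illustrative and is not systemd's C code. Both functions are hypothetical, and the faulty variant is only an assumption about how such an off-by-one could look.

```python
def line_begins_faulty(line: bytes, word: bytes) -> bool:
    """Hypothetical faulty check: insists on a space after the word."""
    if not line.startswith(word):
        return False
    rest = line[len(word):]
    # Off-by-one: a line that is exactly the word has no byte left to
    # inspect, so it is wrongly reported as not matching.
    return rest[:1] == b" "


def line_begins(line: bytes, word: bytes) -> bool:
    """Accept the word when it ends the line or is followed by a space."""
    if not line.startswith(word):
        return False
    rest = line[len(word):]
    return rest == b"" or rest[:1] == b" "
```

With these definitions, `AUTH EXTERNAL` matches under both variants and `AUTHX` under neither, but a bare `AUTH` matches only under the second one. A server using the faulty variant would fail to recognize a bare AUTH, which is one way it could end up replying ERROR instead of REJECTED.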


3 participants