Bug #11671

pam_krb5 fails to establish and keep forwardable kerberos ticket for a keyboard interaction ssh session

Added by Adam Stylinski 3 months ago. Updated 2 months ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Start date:
Due date:
% Done: 0%
Estimated time:
Difficulty: Medium
Tags:

Description

It appears that logging in from the local console allows me to establish a kerberos ticket that I can use, but logging in with keyboard-interactive SSH doesn't. GSS-based login with SSH does, but there the ticket is simply forwarded, so that's somewhat expected. I cranked up PAM's debug flags (in both pam.conf and pam_debug), and managed to come up with a reasonable hypothesis for what's going on.

It seems that OpenSSH forks a child process, which calls pam_get_data on the SUNW-KRB5-AUTH-DATA key, finds that it's absent in the newly created process, and populates it with the important PAM environment variables (mainly the ticket cache path). The problem seems to occur when a second child process is forked, possibly from a shared parent, for the setcred portion of PAM. pam_krb5's setcred expects the key SUNW-KRB5-AUTH-DATA to exist and errors out if it doesn't. When logging in through the login service, this key is present every time it's looked up. When logging in from the console, right after setcred is called a kerberos ticket is added and a warning is registered with ktkt_warnd. With keyboard-interactive SSH login, setcred terminates early (though the PAM stack still sees it as successful) with a "kmd get failed" error.

The relevant pieces of code:
/usr/src/lib/pam_modules_krb5/krb5_authenticate.c:211
This is where the SUNW-KRB5-AUTH-DATA module data in pam_krb5 is calloc'd and populated; the environment variables are passed in later.
/usr/src/lib/pam_modules_krb5/krb5_setcred.c:109
This is where that key is queried. The lookup in the PAM handle's heap-backed linked list fails to find the key, causing the "kmd get failed" error.
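
The hypothesis above can be illustrated without PAM at all. The following is a minimal sketch in Python, not the pam_krb5 code: a plain dict stands in for the PAM handle's module-data list, the cache path is a made-up value, and `lookup_after_auth_in_child` is a hypothetical helper name. It only shows that an entry stored by a forked child never becomes visible in the parent, which is the effect described for SUNW-KRB5-AUTH-DATA.

```python
import os

MODULE_DATA_KEY = "SUNW-KRB5-AUTH-DATA"


def lookup_after_auth_in_child(module_data):
    """Store the key in a forked child, then look it up in the parent.

    Returns (child_saw_key, parent_sees_key).
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: stands in for the process that runs the auth step.
        os.close(read_fd)
        # Hypothetical payload; the real module data is a C struct.
        module_data[MODULE_DATA_KEY] = {"KRB5CCNAME": "FILE:/tmp/krb5cc_demo"}
        found = MODULE_DATA_KEY in module_data
        os.write(write_fd, b"1" if found else b"0")
        os.close(write_fd)
        os._exit(0)

    # Parent: stands in for the process that later runs setcred.
    os.close(write_fd)
    child_saw_key = os.read(read_fd, 1) == b"1"
    os.close(read_fd)
    os.waitpid(pid, 0)
    # The child's write went to its own copy of the heap, so the
    # parent's dict is unchanged.
    parent_sees_key = MODULE_DATA_KEY in module_data
    return child_saw_key, parent_sees_key


if __name__ == "__main__":
    print(lookup_after_auth_in_child({}))
```

The child finds the key it just stored, while the parent (and any sibling forked from it afterwards) does not.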

I've attached the two logs, from console, and from keyboard-interactive sshd. It's important to note that kerberized login (GSS) with sshd works, as does keyboard interactive for authentication. However, establishing the session's kerberos ticket fails (klist comes up empty unless the user had previously established a ticket with kinit or logged into the console).

I'm not sure what the right fix is: either the kmd needs to be populated in the parent process, so that after the fork it doesn't end up only in a child's copy-on-write copy of that list, or pam_krb5's setcred method needs to re-establish this module data in the same way auth does.
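
To make the second option concrete, here is a rough sketch in Python rather than a patch. Everything in it is hypothetical: `FakePamHandle`, the function names, and in particular the assumption that the PAM environment (with KRB5CCNAME) reaches the process running setcred while the module data does not. A real implementation may also need more than the cache path to rebuild its context.

```python
MODULE_DATA_KEY = "SUNW-KRB5-AUTH-DATA"


class FakePamHandle:
    """Toy stand-in for a PAM handle: module data plus PAM environment."""

    def __init__(self):
        self.data = {}  # per-process module data (lost across processes)
        self.env = {}   # PAM environment (assumed to be forwarded)


def authenticate(handle, ccache):
    # Auth step: create the module data and export the cache path.
    handle.data[MODULE_DATA_KEY] = {"ccache": ccache}
    handle.env["KRB5CCNAME"] = ccache


def handle_seen_by_setcred_process(auth_handle):
    # Assumption: only the environment survives into the other process.
    other = FakePamHandle()
    other.env = dict(auth_handle.env)
    return other


def setcred_strict(handle):
    # Behaviour described in this report: give up if the key is missing.
    if MODULE_DATA_KEY not in handle.data:
        return "kmd get failed"
    return "ok"


def setcred_with_fallback(handle):
    # Proposed behaviour: rebuild the module data when it is missing.
    kmd = handle.data.get(MODULE_DATA_KEY)
    if kmd is None:
        ccache = handle.env.get("KRB5CCNAME")
        if ccache is None:
            return "kmd get failed"
        handle.data[MODULE_DATA_KEY] = {"ccache": ccache}
    return "ok"
```

With a handle produced by `handle_seen_by_setcred_process`, `setcred_strict` reports the failure while `setcred_with_fallback` recovers the cache path from the environment.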

The relevant portion


Files

fail (74.4 KB): Failure mode (keyboard-interactive SSH). Adam Stylinski, 2019-09-10 03:41 PM
success (57.9 KB): Success mode (/dev/console login). Adam Stylinski, 2019-09-10 03:41 PM
pam.conf (4.55 KB): My pam configuration (for good measure). Adam Stylinski, 2019-09-10 03:42 PM
fail.log (74.4 KB): Failure mode (keyboard-interactive SSH, with .log extension). Adam Stylinski, 2019-09-10 05:44 PM
success.log (57.9 KB): Success mode (/dev/console login, with .log extension). Adam Stylinski, 2019-09-10 05:44 PM

History

#1

Updated by Adam Stylinski 3 months ago

Ahhh, I can't edit the bug name. That should be:
pam_krb5 fails to establish and keep forwardable kerberos tickets for a keyboard-interactive ssh session

#2

Updated by Adam Stylinski 3 months ago

Uploading the same attachments with file extensions, so that they are previewable in the browser

#3

Updated by Adam Stylinski 3 months ago

Some more info I found:
https://github.com/rra/pam-krb5/blob/master/setcred.c#L264

That's how other PAM modules are handling the exact same issue. On Freenode, it was suggested a pamd service could also, to some extent, solve this issue.

#4

Updated by Cullum Smith 3 months ago

I just hit this bug as well while following the AD integration instructions on the wiki. If you follow those instructions, you're led to set up ldapclient in self/gssapi mode, so all NSS lookups are broken after login until you kinit.

This took me much longer to figure out than I'd like - I had disabled the name-service-cache while attempting to troubleshoot the issue, which caused a root-owned /tmp/krb5cc_10000 containing the ticket cache for the host principal to be created every time I logged in with my kerberos account. I believe this happened because NSS lookups started failing right after the initial PAM auth. This led to various confusing failures and even some core dumps when creating new ssh sessions / login shells after privileges were dropped to my UID.

The root user can use the machine keytab to query AD via nss_ldap, but as soon as privileges are dropped in the PAM stack, all passwd NSS queries fail since there's no accessible krb5cc to make the LDAP queries over GSSAPI. If you have the name-service-cache service enabled, then the warm cache entry from the initial pam auth is enough to let PAM finish, but NSS lookups for non-cached users will fail.
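
The cases in the paragraph above can be written down as a small decision function. This is a deliberately simplified model of the behaviour as described in this comment, not of the actual nss_ldap or name-service-cache code; `passwd_lookup` and its parameters are hypothetical names.

```python
def passwd_lookup(user, *, euid, has_user_ccache, nscd_cache):
    """Return where a passwd lookup is answered from, or None if it fails."""
    if user in nscd_cache:
        # A warm name-service-cache entry answers without touching LDAP.
        return "cache"
    if euid == 0:
        # Root can read the machine keytab and query AD over GSSAPI.
        return "ldap (host keytab)"
    if has_user_ccache:
        # After kinit (or a forwarded ticket) the user has a usable ccache.
        return "ldap (user ccache)"
    # Privileges dropped and no accessible krb5cc: the lookup fails.
    return None
```

For example, with the cache disabled (`nscd_cache=set()`) and no ticket, any lookup made after privileges are dropped returns `None`, which matches the failures seen right after the initial PAM auth.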

I'm going to try disabling privilege separation in sshd_config as a workaround - if that doesn't work I'll probably admit defeat and go with samba/winbind.

#5

Updated by Gordon Ross 3 months ago

FYI, what we recommend with AD servers is to use a proxy account (some ordinary user account in AD) and set up the LDAP client like this:

# Example. Provide these five variables:
dn="blue.contoso.com" 
dc="dc=blue,dc=contoso,dc=com" 
pa="test" 
pw="password" 
srv=10.10.0.10

ldapclient manual \
-a credentialLevel=proxy \
-a authenticationMethod=simple \
-a proxyDN="cn=$pa,cn=Users,$dc" \
-a proxyPassword="$pw" \
-a defaultSearchBase="$dc" \
-a domainName="$dn" \
-a defaultServerList="$srv" \
-a attributeMap=passwd:gecos=cn \
-a attributeMap=passwd:homedirectory=unixHomeDirectory \
-a objectClassMap=group:posixGroup=group \
-a objectClassMap=passwd:posixAccount=user \
-a objectClassMap=shadow:shadowAccount=user \
-a serviceSearchDescriptor="passwd:cn=users,${dc}?sub" \
-a serviceSearchDescriptor="group:cn=users,${dc}?sub" 

And then use nsswitch.ldap of course.

#6

Updated by Cullum Smith 3 months ago

Thanks Gordon. Falling back to a simple LDAP bind (rather than self) will definitely fix the NSS issues I described, but "self" would be more secure (since kerberos can encrypt the LDAP traffic, assuming no LDAPS).

I believe credentialLevel=self works fine, barring this one case of SSHing to the box without a forwarded ticket or pre-existing credentials cache.

In any case, even if AD isn't used, this interaction of OpenSSH and the illumos pam_krb5 still forces users to run kinit and type their password a second time, which seems unfortunate. I'll try to make a naive patch this weekend which re-creates the context in krb5_setcred.c and see what happens.

#7

Updated by Gordon Ross 3 months ago

It's a misconception that "simple" means cleartext. Have a look at the traffic.
(It's encrypted, similar to how it is with "self".)
Credential type "self" is a huge hassle, and really doesn't buy you much.

#8

Updated by Cullum Smith 3 months ago

Do you mean that "simple" traffic is encrypted in all cases or just when using "tls:simple"? I was just going by what the man page said:

For simple, be aware that the bind password will be sent in the clear to the LDAP server.

I'm not sure what the mechanism would be for encrypting traffic if you aren't using GSSAPI or TLS.

#9

Updated by Gordon Ross 3 months ago

In the example ldapclient command I posted above, with -a credentialLevel=proxy -a authenticationMethod=simple,
it is indeed using gssapi. It uses the credentials for the proxy user, no matter who is asking NSS LDAP for something,
where "self" uses the credentials of the user calling the NSS functions.

#10

Updated by Cullum Smith 2 months ago

Thanks for the info. In any case, I don't want to derail the comments since this is a separate issue from the one originally reported - namely that OpenSSH calls setcred in a different process from the authentication, so the kerberos context is lost. As a result, pam_krb5 cannot create or renew a user's credentials cache when called from OpenSSH.

Various distributions encountered this issue as far back as 2004:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=342157
https://bugzilla.mindrot.org/show_bug.cgi?id=688

Most of the Linux distros, as well as FreeBSD, package Eyrie's pam_krb5, which has the hack linked above to work around this issue. I suspect that is why there aren't many reports of this in the wild.

#11

Updated by Cullum Smith 2 months ago

FYI:

ChallengeResponseAuthentication no

in sshd_config resolves the issue. Unfortunately this precludes you from using fancier auth schemes like s/key and 2fa, but I'm satisfied with this workaround.

For keyboard-interactive authentication, sshd uses a privileged monitor process and an unprivileged auth process, and the latter does the initial PAM auth step. With password authentication, this subprocess is not used.

Disabling privilege separation would also solve the problem, but that's not possible in modern OpenSSH.

#12

Updated by Adam Stylinski 2 months ago

It's a bit strange that privsep isn't used when ChallengeResponseAuthentication is set to no. Shouldn't it use privilege separation regardless?

In any case, having a pamd service is the long-term fix for this, I think. The short-term fix would be to hack together another context for setcred, though doing so may require requesting a ticket a second time. I haven't dug far enough to see which parts of the auth context it actually uses for that stage.
