Overview

From 2013 to 2017, I worked for Avalon Consulting, LLC as a Hadoop consultant. During that time I worked with a lot of clients and secured (TLS/SSL, LDAP, Kerberos, etc.) quite a few Hadoop clusters on both Hortonworks and Cloudera distributions. There have been a few posts out there about debugging Kerberos problems, like @steveloughran’s “Hadoop and Kerberos: The Madness beyond the Gate”. This post covers a few of the tips I’ve collected over the years that apply to Kerberos in general as well as to Apache Hadoop.

Increase kinit verbosity

By default, kinit doesn’t display any debug information and typically comes back with an obscure error on failure. The following command enables verbose trace logging to standard out, which can help with debugging.

KRB5_TRACE=/dev/stdout kinit -V
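For example, to trace a ticket request for a specific principal (the principal here is only a placeholder):

KRB5_TRACE=/dev/stdout kinit -V alice@EXAMPLE.COM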

Java Kerberos/KRB5 and SPNEGO Debug System Properties

The Java internal classes that deal with Kerberos have system properties that turn on debug logging. These properties produce a lot of output, so they should only be turned on while diagnosing a problem and then turned off. They can also be combined if necessary.

The first property enables debug logging for the core Kerberos machinery and can help with misconfigured KDC servers, krb5.conf issues, and other problems.

-Dsun.security.krb5.debug=true

The second property is specifically for debugging SPNEGO authentication against a Kerberos-secured web endpoint. SPNEGO can be hard to debug, but this flag enables additional logging that helps.

-Dsun.security.spnego.debug=true

These properties can be set with the *_OPTS variables for Apache Hadoop and related components, as in the example below:

HADOOP_OPTS="-Dsun.security.krb5.debug=true -Dsun.security.spnego.debug=true"
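Depending on how your hadoop-env.sh composes these variables, the setting can usually also be applied inline for a one-off command; the command below is just an illustration:

HADOOP_OPTS="-Dsun.security.krb5.debug=true" hdfs dfs -ls /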

Hadoop Command Line Debug Logging

Most of the Apache Hadoop command line tools (e.g. hdfs, hadoop, yarn) use the same underlying logging mechanism, Log4j. Log4j doesn’t allow log levels to be adjusted dynamically at runtime, but the logger can be configured before a command runs. Hadoop exposes the root logger as the environment variable HADOOP_ROOT_LOGGER, which can be used to change the logging of a single command without editing log4j.properties.

HADOOP_ROOT_LOGGER=DEBUG,console hdfs ...
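For example, running a simple listing with debug output on the console:

HADOOP_ROOT_LOGGER=DEBUG,console hdfs dfs -ls /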

Debugging Hadoop Users and Groups

Users in Apache Hadoop are typically authenticated through Kerberos as explained here. Once a user is authenticated, the resulting username is used to determine group membership. Groups in Apache Hadoop can be configured in a variety of ways through Hadoop Groups Mappings. Verifying what Apache Hadoop thinks your user and groups are is critical for setting up security correctly.

The first command takes a user principal and returns the username derived from the configured hadoop.security.auth_to_local rules.

hadoop org.apache.hadoop.security.HadoopKerberosName USER_PRINCIPAL
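For example, with the default auth_to_local rules a principal in the cluster’s realm maps to its first component (the principal below is only a placeholder):

hadoop org.apache.hadoop.security.HadoopKerberosName alice@EXAMPLE.COM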

The second command takes a username and determines the groups associated with it, using the configured Hadoop Groups Mappings.

hdfs groups USERNAME
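Run with no argument, the same command reports the groups of the currently authenticated user; with an argument, it resolves the named user (the username below is a placeholder):

hdfs groups
hdfs groups alice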

The third command uses the currently authenticated user and prints out that user’s UGI (UserGroupInformation). It can also take a principal and keytab and print information about the resulting UGI.

hadoop org.apache.hadoop.security.UserGroupInformation
hadoop org.apache.hadoop.security.UserGroupInformation "PRINCIPAL" "KEYTAB"
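For example, to check the UGI a service would get when logging in from a keytab (the principal and keytab path below are placeholders):

hadoop org.apache.hadoop.security.UserGroupInformation "hdfs/node1.example.com@EXAMPLE.COM" "/etc/security/keytabs/hdfs.service.keytab"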

The fourth command, KDiag, is relatively new; it was introduced with HADOOP-12426 and first released in Apache Hadoop 2.8.0. It wraps several debugging checks into one command and flags common Kerberos-related misconfigurations.

# Might also set HADOOP_JAAS_DEBUG=true and set the log level 'org.apache.hadoop.security=DEBUG'
hadoop org.apache.hadoop.security.KDiag
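For example, combining KDiag with the debug settings mentioned above for a maximally verbose run:

HADOOP_JAAS_DEBUG=true HADOOP_ROOT_LOGGER=DEBUG,console hadoop org.apache.hadoop.security.KDiag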

Conclusion

More than half the battle of dealing with Kerberos and distributed systems is knowing where to look and what logs to generate. With the right logs, it becomes possible to debug the problem and resolve it quickly.