Integrating Advanced Audit with Aggregated Logging in OpenShift 3.11

Advanced audit is a feature that logs requests at the API server level. When enabled, these logs are written to a file on the master node but are not picked up by the EFK stack in OpenShift. In this post, we will walk through the advanced audit feature in OpenShift Container Platform 3.11 and make the modifications needed to integrate it with aggregated logging.

Prerequisites

To continue, you must have a running OpenShift 3.11 cluster with logging enabled. The cluster I used for this demo was a single-node, RPM-based installation with the ovs-subnet SDN plugin enabled.

Overview

Before we begin working with advanced audit, let’s dive into the features it has to offer. Advanced audit is an enhancement over the older basic audit. Whereas basic audit logged all requests to an audit file, advanced audit allows administrators to write a policy file that specifies the subset of requests to be logged. This is useful for administrators who are only interested in certain interactions. Advanced audit also offers a webhook that can be used to send audit logs to a log aggregator over HTTP or HTTPS.

Integration

Now that we know more about advanced audit, let’s integrate this feature with aggregated logging. We will first write a simple policy file to specify which requests get logged. Then, I will demonstrate two different ways to integrate with aggregated logging: one using hostPath mounts and the other using the webhook.

Policy File

In this example, we will write a simple policy file to log when an authenticated user creates or deletes an OpenShift project.

Write the policy file below to /etc/origin/master/audit-policy.yaml.

apiVersion: audit.k8s.io/v1beta1
kind: Policy
rules:
  - level: Metadata
    userGroups: ["system:authenticated:oauth"]
    verbs: ["create", "delete"]
    resources:
      - group: "project.openshift.io"
        resources: ["projectrequests", "projects"]
    omitStages:
      - RequestReceived

By default, advanced audit logs two events per request: one when the API request is received and another when it is completed. In this example, we log only when the request is completed and omit the events that signal when the request was received, as specified by the omitStages key.
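
With this policy, each matching request produces a single JSON event at the Metadata level. As an illustration only (the field values below are placeholders, not output captured from a cluster), a project-creation event written in the audit.k8s.io/v1beta1 format looks roughly like this:

{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1beta1",
  "metadata": { "creationTimestamp": "2019-01-01T12:00:00Z" },
  "level": "Metadata",
  "auditID": "00000000-0000-0000-0000-000000000000",
  "stage": "ResponseComplete",
  "requestURI": "/apis/project.openshift.io/v1/projectrequests",
  "verb": "create",
  "user": {
    "username": "developer",
    "groups": ["system:authenticated:oauth", "system:authenticated"]
  },
  "sourceIPs": ["10.0.0.1"],
  "objectRef": {
    "resource": "projectrequests",
    "apiGroup": "project.openshift.io",
    "apiVersion": "v1"
  },
  "responseStatus": { "metadata": {}, "code": 201 },
  "requestReceivedTimestamp": "2019-01-01T12:00:00.000000Z",
  "stageTimestamp": "2019-01-01T12:00:00.100000Z"
}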

Now that the policy file is created, continue to Hostpath Mounts or Webhook, depending on your preferred strategy of integration. Then, continue to Test it out to see the integration in action.

Hostpath Mounts

Fluentd is deployed as a daemonset that schedules a replica on every node labeled logging-infra-fluentd. With this label on the master nodes, fluentd runs as a container on the same node that produces the audit logs, which means fluentd can be made aware of the logs by creating hostPath mounts on the API server and the fluentd daemonset.
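
Before continuing, you can confirm that the master node actually carries the fluentd label; in a default 3.11 logging deployment the daemonset schedules onto nodes labeled logging-infra-fluentd=true:

oc get nodes -l logging-infra-fluentd=true

If the master is not listed, add the label (substitute your master's node name for the placeholder):

oc label node <master-node> logging-infra-fluentd=true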

Create a directory called /etc/origin/master/audit on the master. This directory will contain audit logs produced by advanced audit and will be used to mount a hostPath on the API server and fluentd deployment.

mkdir /etc/origin/master/audit

Modify the fluentd daemonset to mount this directory to its running instances.

oc set volume ds/logging-fluentd --add --mount-path=/etc/origin/master/audit --name=audit --type=hostPath --path=/etc/origin/master/audit -n openshift-logging
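
You can verify that the mount was added by listing the volumes on the daemonset:

oc set volume ds/logging-fluentd -n openshift-logging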

Now, let’s modify /etc/origin/master/master-config.yaml to enable the auditing feature.

...
auditConfig:
  auditFilePath: /etc/origin/master/audit/audit-ocp.log
  enabled: true
  maximumFileRetentionDays: 10
  maximumFileSizeMegabytes: 10
  maximumRetainedFiles: 10
  logFormat: json
  policyFile: /etc/origin/master/audit-policy.yaml
...

Restart the master for the changes to take effect.

master-restart api
master-restart controllers
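
Once the API server comes back up, you can confirm that auditing is active by watching the audit log file. Note that the file may not appear until the first request matching the policy has been logged:

tail -f /etc/origin/master/audit/audit-ocp.log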

Now it’s time to tell fluentd where to find the audit logs and how to pass them along to Elasticsearch. Edit the logging-fluentd configmap and add the following include under the sources section of the fluent.conf file.

oc edit cm/logging-fluentd -n openshift-logging

## sources
...
@include configs.d/user/input-audit.conf
##

While still editing the configmap, add the following configuration to a separate file called input-audit.conf.

...
  input-audit.conf: |
    <source>
      @type tail
      @id audit-ocp
      path /etc/origin/master/audit/audit-ocp.log
      pos_file /etc/origin/master/audit/audit.pos
      tag audit.requests
      format json
    </source>
    <match audit**>
      @type copy
      <store>
        @type elasticsearch
        log_level debug
        host "#{ENV['OPS_HOST']}"
        port "#{ENV['OPS_PORT']}"
        scheme https
        ssl_version TLSv1_2
        index_name .audit

        user fluentd
        password changeme

        client_key "#{ENV['OPS_CLIENT_KEY']}"
        client_cert "#{ENV['OPS_CLIENT_CERT']}"
        ca_file "#{ENV['OPS_CA']}"

        type_name com.redhat.ocp.audit

        reload_connections "#{ENV['ES_RELOAD_CONNECTIONS'] || 'false'}"
        reload_after "#{ENV['ES_RELOAD_AFTER'] || '100'}"
        sniffer_class_name "#{ENV['ES_SNIFFER_CLASS_NAME'] || 'Fluent::ElasticsearchSimpleSniffer'}"
        reload_on_failure false
        flush_interval "#{ENV['ES_FLUSH_INTERVAL'] || '5s'}"
        max_retry_wait "#{ENV['ES_RETRY_WAIT'] || '300'}"
        disable_retry_limit true
        buffer_type file
        buffer_path '/var/lib/fluentd/buffer-output-es-auditlog'
        buffer_queue_limit "#{ENV['BUFFER_QUEUE_LIMIT'] || '1024' }"
        buffer_chunk_limit "#{ENV['BUFFER_SIZE_LIMIT'] || '1m' }"
        buffer_queue_full_action "#{ENV['BUFFER_QUEUE_FULL_ACTION'] || 'exception'}"

        request_timeout 2147483648
      </store>
    </match>
...

Force a redeploy of fluentd to reload the configuration.

oc delete po -l component=fluentd -n openshift-logging
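
Watch for the replacement pod to reach the Running state before continuing:

oc get po -n openshift-logging -l component=fluentd -w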

The cluster is now properly configured to integrate advanced audit with aggregated logging. Continue to Test it out to see this in action.

Webhook

An alternative to hostPath mounts is the webhook native to advanced audit, in which a kubeconfig-style file tells the API server where to forward the audit logs. In this example we will forward the logs to the EFK stack, but the webhook could also be used to send logs to an aggregator external to OpenShift.

First, modify the fluentd configmap to accept HTTP input and to pass the logs to Elasticsearch.

oc edit cm/logging-fluentd -n openshift-logging

## sources
...
@include configs.d/user/input-audit.conf
##

While still editing the configmap, add the following configuration.

...
  input-audit.conf: |
    <source>
      @type http
      @id audit-ocp
      port 9880
    </source>
    <match audit**>
      @type copy
      <store>
        @type elasticsearch
        log_level debug
        host "#{ENV['OPS_HOST']}"
        port "#{ENV['OPS_PORT']}"
        scheme https
        ssl_version TLSv1_2
        index_name .audit

        user fluentd
        password changeme

        client_key "#{ENV['OPS_CLIENT_KEY']}"
        client_cert "#{ENV['OPS_CLIENT_CERT']}"
        ca_file "#{ENV['OPS_CA']}"

        type_name com.redhat.ocp.audit

        reload_connections "#{ENV['ES_RELOAD_CONNECTIONS'] || 'false'}"
        reload_after "#{ENV['ES_RELOAD_AFTER'] || '100'}"
        sniffer_class_name "#{ENV['ES_SNIFFER_CLASS_NAME'] || 'Fluent::ElasticsearchSimpleSniffer'}"
        reload_on_failure false
        flush_interval "#{ENV['ES_FLUSH_INTERVAL'] || '5s'}"
        max_retry_wait "#{ENV['ES_RETRY_WAIT'] || '300'}"
        disable_retry_limit true
        buffer_type file
        buffer_path '/var/lib/fluentd/buffer-output-es-auditlog'
        buffer_queue_limit "#{ENV['BUFFER_QUEUE_LIMIT'] || '1024' }"
        buffer_chunk_limit "#{ENV['BUFFER_SIZE_LIMIT'] || '1m' }"
        buffer_queue_full_action "#{ENV['BUFFER_QUEUE_FULL_ACTION'] || 'exception'}"

        request_timeout 2147483648
      </store>
    </match>
...

Force a redeploy of fluentd to reload the configuration.

oc delete po -l component=fluentd -n openshift-logging

When the new pod is ready, determine the IP address of the new logging-fluentd pod.

oc describe po -n openshift-logging $(oc get po -n openshift-logging | grep fluentd | awk '{print $1}') | grep IP
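
On a single-node cluster like the one used here, the pod IP can also be pulled directly with a jsonpath query:

oc get po -n openshift-logging -l component=fluentd -o jsonpath='{.items[0].status.podIP}'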

Next, create the webhook kubeconfig at /etc/origin/master/audit-webhook.yaml.

clusters:
  - name: fluentd
    cluster:
      certificate-authority: ""
      server: http://<IP-of-fluentd-pod>:9880/audit.request

users:
  - name: api-server
    user:
      client-certificate: ""
      client-key: ""

current-context: webhook
contexts:
- context:
    cluster: fluentd
    user: api-server
  name: webhook
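
Before wiring the API server to this endpoint, you can optionally sanity-check that the fluentd HTTP input is reachable. The in_http plugin accepts records posted as a json= form field, and the audit.request path becomes the fluentd tag, so a test record sent this way should be routed to the .audit index just like a real audit event:

curl -s -X POST -d 'json={"sanityCheck":"audit-webhook"}' http://<IP-of-fluentd-pod>:9880/audit.request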

Modify /etc/origin/master/master-config.yaml to enable the auditing feature.

...
auditConfig:
  auditFilePath: /var/log/audit-ocp.log
  enabled: true
  maximumFileRetentionDays: 10
  maximumFileSizeMegabytes: 10
  maximumRetainedFiles: 10
  logFormat: json
  policyFile: /etc/origin/master/audit-policy.yaml
  webHookKubeConfig: /etc/origin/master/audit-webhook.yaml
  webHookMode: blocking
...

Restart the master for the changes to take effect.

master-restart api
master-restart controllers

Test it out

Let’s see what this looks like once everything is configured properly. Use the oc binary to create a new project.

oc new-project test-project
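
Before looking in Kibana, you can confirm that the API server wrote the event by checking the audit file defined in auditConfig (/etc/origin/master/audit/audit-ocp.log for the hostPath approach, or /var/log/audit-ocp.log for the webhook approach):

grep projectrequests /etc/origin/master/audit/audit-ocp.log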

Fluentd will now collect the audit log and pass it along to Elasticsearch. To easily view this in Kibana, create a new index pattern for the audit logs. In Kibana, go to Management->Index Patterns->Create Index Pattern and enter “.audit*” as the index pattern. Select metadata.creationTimestamp as the Time-field name and click Create. (This field will read as items.metadata.creationTimestamp with the webhook method.)
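
If the .audit* pattern does not match any indices yet, you can check whether the documents reached Elasticsearch at all. This assumes the es_util helper that ships with the OpenShift logging Elasticsearch image and the component=es label on the Elasticsearch pods:

oc exec -n openshift-logging -c elasticsearch $(oc get po -n openshift-logging -l component=es -o jsonpath='{.items[0].metadata.name}') -- es_util --query=".audit*/_count?pretty"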

Navigate to the “Discover” tab and select the new index from the drop-down box. You should see an entry with a variety of fields, including the user that initiated the request, the type of resource that was created, and the IP address from which the request originated. An additional event will be logged when this project is deleted, as specified by the policy file.

oc delete project test-project

After performing these actions, your “Discover” page in Kibana should show audit entries for both the project creation and the project deletion.

Summary

Advanced audit allows administrators to see all requests flowing in and out of their OpenShift cluster. Integrating this feature with aggregated logging will allow administrators to be able to visualize this data without needing to stand up additional infrastructure. Whether your strategy is to use hostpath mounts or the webhook, integrating advanced audit with aggregated logging is a simple way to begin auditing your cluster!

Austin Dewey

Austin Dewey is a DevOps engineer focused on delivering a streamlined developer experience on cloud and container technologies. Austin started his career with Red Hat’s consulting organization, where he helped drive success at many different Fortune 500 companies by automating deployments on Red Hat’s Kubernetes-based PaaS, OpenShift Container Platform. Currently, Austin works at fintech startup Prime Trust, building automation to scale financial infrastructure and support developers on Kubernetes and AWS. Austin is the author of "Learn Helm", a book focused on packaging and delivering applications to Kubernetes, and he enjoys writing about open source technologies at his blog in his free time, austindewey.com.
