GCP, Terraform, Elastic, GCS, Kubernetes

Ingest GCP Logs into Elastic using the GCS Input

In this post we’re going to use the Filebeat GCS input to copy log files from a GCP Storage bucket to an Elasticsearch instance running on a Kubernetes cluster.

First, we’ll create a GCP Storage bucket and, using a Cloud Logging log router, redirect the logs generated by our project to the bucket.

We’re going to use a local Kubernetes cluster to deploy three components: an Elasticsearch instance, a Kibana instance, and a Beat (Filebeat) instance. Additionally, an Elasticsearch ingest pipeline is needed to pre-process the timestamp of the log entries.

Log entries are saved to Cloud Storage buckets in hourly batches. It might take from 2 to 3 hours before the first entries begin to appear. [1]

The Elasticsearch ingest pipeline is needed because, by default, the Filebeat agent uses the ingestion time as the event timestamp instead of the time at which the log entry was generated.

ECK Installation

We are going to deploy a local Kubernetes cluster with an Elasticsearch and a Kibana instance to filter, store, and visualize the logs. We’ll use Filebeat to ship the logs to the Elasticsearch server. The best way to install an ELK stack in our cluster is using ECK (Elastic Cloud on Kubernetes); follow the official ECK quickstart guides to deploy the operator, an Elasticsearch cluster, and a Kibana instance.
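
As a minimal sketch, assuming ECK version 2.16.1 (check the quickstart for the current version number), the operator install looks like this:

# Install the ECK CRDs and operator (the version below is an assumption; see the quickstart)
kubectl create -f https://download.elastic.co/downloads/eck/2.16.1/crds.yaml
kubectl apply -f https://download.elastic.co/downloads/eck/2.16.1/operator.yaml

# Wait for the operator before applying the Elasticsearch and Kibana manifests from the guides
kubectl -n elastic-system rollout status statefulset/elastic-operator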

We’ll deploy the Filebeat agent later.

Deploying the Google Cloud Logs Router

TL;DR: Source Code (check the examples module)

We’ll use the terraform-google-log-export Terraform module to create the log router and the GCS bucket. This module will create several resources on our behalf: the log router (sink) itself, a Cloud Logging service account, the permissions needed to allow that service account to write logs into the bucket, and of course the Storage bucket itself.

We set a lifecycle rule to delete the files from the bucket after 30 days.

module "log_export" {
  source                 = "terraform-google-modules/log-export/google"
  version                = "~> 10.0"
  destination_uri        = module.destination.destination_uri
  filter                 = var.filter
  log_sink_name          = "${local.prefix}-logsink-gcs"
  parent_resource_id     = var.project_id
  parent_resource_type   = "project"
  unique_writer_identity = var.unique_writer_identity
}

// change to single region
module "destination" {
  source                   = "terraform-google-modules/log-export/google//modules/storage"
  version                  = "~> 10.0"
  project_id               = var.project_id
  storage_bucket_name      = "${local.prefix}-logsink-bucket"
  log_sink_writer_identity = module.log_export.writer_identity
  force_destroy            = var.force_destroy_bucket
  lifecycle_rules = [
    {
      action = {
        type = "Delete"
      }
      condition = {
        age        = 30
        with_state = "ANY"
      }
    }
  ]
}

We’ll also create a service account for the Filebeat agent, and a bucket-level IAM binding allowing it to access the objects within the logs bucket.

resource "google_service_account" "beat_service_account" {
  account_id   = "${local.prefix}-beat"
  display_name = "Service Account used by Filebeat to pull the Logs"
}


resource "google_storage_bucket_iam_member" "beat_service_account_binding" {
  bucket = module.destination.resource_name
  role = "roles/storage.objectUser"
  member =  "serviceAccount:${google_service_account.beat_service_account.email}"
}

We create an instance of the Terraform module and deploy it:

Note that the filter defaults to severity > "INFO". This filter was excluding a lot of logs in my recently created project, so in this case we’ll use the filter severity >= "INFO" to test and route more logs.

locals {
  project_id = "perfect-crawler-464423-g1"
}

module "logs" {
  source               = "../"
  project_id           = local.project_id
  filter               = <<-EOT
    severity >= "INFO"
  EOT
  force_destroy_bucket = true
}
# Apply the code
tofu init
tofu apply -auto-approve

If the apply is successful, the outputs will include the email of the service account created for the Filebeat instance to pull the logs, and the name of the GCS bucket.
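
Once Cloud Logging has delivered the first hourly batch, you can confirm that objects are landing in the bucket; the bucket name below is a placeholder for the one reported in the Tofu outputs:

# List the exported log objects (replace GCS_BUCKET_NAME with the bucket name from the outputs)
gcloud storage ls --recursive gs://GCS_BUCKET_NAME/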

Elasticsearch Pipeline

Now, go to the Ingest Pipelines page in the Kibana console (Stack Management > Ingest Pipelines) and create a pipeline with the following processors:

[
  {
    "script": {
      "source": "ctx['actual_timestamp'] = ctx['gcs']['storage']['object']['json_data'][0]['timestamp']"
    }
  },
  {
    "date": {
      "field": "actual_timestamp",
      "formats": ["ISO8601"]
    }
  },
  {
    "remove": {
      "field": "actual_timestamp",
      "ignore_missing": true
    }
  }
]
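
Alternatively, the same processors can be registered through the Elasticsearch API instead of the UI. A sketch with curl, where the pipeline name gcs-timestamp and the ELASTIC_URL / ELASTIC_PASSWORD variables are placeholders for your deployment:

# Create the ingest pipeline via the API (pipeline name, URL and credentials are placeholders)
# The elastic user password can be read from the ECK-created secret (ELASTIC_SERVER_NAME-es-elastic-user)
curl -k -u "elastic:${ELASTIC_PASSWORD}" -X PUT "${ELASTIC_URL}/_ingest/pipeline/gcs-timestamp" \
  -H 'Content-Type: application/json' \
  -d '{
    "description": "Use the original GCP log timestamp instead of the ingestion time",
    "processors": [
      { "script": { "source": "ctx[\"actual_timestamp\"] = ctx[\"gcs\"][\"storage\"][\"object\"][\"json_data\"][0][\"timestamp\"]" } },
      { "date": { "field": "actual_timestamp", "formats": ["ISO8601"] } },
      { "remove": { "field": "actual_timestamp", "ignore_missing": true } }
    ]
  }'

If you create the pipeline this way, use the same name as ELASTIC_PIPELINE_NAME in the Filebeat manifest below.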

Filebeat Agent

We need the name of the bucket and a service account JSON key to set up the Filebeat agent. The bucket name is in the Tofu module outputs. The service account JSON key can be created with gcloud; we don’t create it with Tofu because the key would be stored in plain text in our state.

gcloud iam service-accounts keys create key.json \
  --iam-account=SERVICE_ACCOUNT_EMAIL \
  --project=PROJECT_ID

We store the JSON key in a Kubernetes secret so it can be mounted into our Filebeat agent, and then delete it from our filesystem to avoid leaking it.

# Use the --namespace flag if needed
kubectl create secret generic SECRET_NAME --from-file=key.json
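
Once the secret exists in the cluster, remove the local copy of the key:

# The key now lives only in the Kubernetes secret
rm key.json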

Then we deploy the Filebeat instance. Replace:

  • ELASTIC_SERVER_NAME: the name of the Elasticsearch Kubernetes object
  • PROJECT_ID: the GCP project ID
  • GCS_BUCKET_NAME: the bucket name
  • SECRET_NAME: the Kubernetes secret name
  • ELASTIC_PIPELINE_NAME: the Elastic ingest pipeline name
## https://www.elastic.co/docs/deploy-manage/deploy/cloud-on-k8s/quickstart-beats
apiVersion: beat.k8s.elastic.co/v1beta1
kind: Beat
metadata:
  name: gcs-input
spec:
  type: filebeat
  version: 8.16.1
  elasticsearchRef:
    name: ELASTIC_SERVER_NAME
  config:
    filebeat.inputs:
      - type: gcs
        id: gcs-input
        enabled: true
        project_id: PROJECT_ID
        auth.credentials_file.path: /etc/sa/key.json
        retry:
          max_attempts: 3
          initial_backoff_duration: 10s
          max_backoff_duration: 60s
          backoff_multiplier: 2
        parse_json: true
        buckets:
          - name: GCS_BUCKET_NAME
            max_workers: 3
            poll: true
            poll_interval: 30s
            timestamp_epoch: 1630444800
        pipeline: ELASTIC_PIPELINE_NAME
    setup.ilm.enabled: false
    output.elasticsearch:
      indices:
        - index: "filebeat-gcs-%{[agent.version]}"
  daemonSet:
    podTemplate:
      spec:
        dnsPolicy: ClusterFirstWithHostNet
        hostNetwork: true
        securityContext:
          runAsUser: 0
        containers:
          - name: filebeat
            volumeMounts:
              - name: varlogcontainers
                mountPath: /var/log/containers
              - name: varlogpods
                mountPath: /var/log/pods
              - name: varlibdockercontainers
                mountPath: /var/lib/docker/containers
              - name: SECRET_NAME
                readOnly: true
                mountPath: "/etc/sa"
        volumes:
          - name: varlogcontainers
            hostPath:
              path: /var/log/containers
          - name: varlogpods
            hostPath:
              path: /var/log/pods
          - name: varlibdockercontainers
            hostPath:
              path: /var/lib/docker/containers
          - name: SECRET_NAME
            secret:
              secretName: SECRET_NAME
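
Assuming the manifest above was saved as filebeat-gcs.yaml (the filename is arbitrary), apply it:

kubectl apply -f filebeat-gcs.yaml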

We check the status of the Beat:

$ kubectl get beat
NAME        HEALTH   AVAILABLE   EXPECTED   TYPE       VERSION   AGE
gcs-input   green    1           1          filebeat   8.16.1    30h

If we already have files in our bucket, we should see data in the Elastic index almost immediately. If we don’t, we have to wait for the Cloud Logging hourly batch (~1 hour).
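
A quick way to confirm that documents are arriving is to count them in the Filebeat index; the URL and credentials below are placeholders for your deployment:

# Count the documents shipped by the Beat (URL and credentials are placeholders)
curl -k -u "elastic:${ELASTIC_PASSWORD}" "${ELASTIC_URL}/filebeat-gcs-*/_count?pretty"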

Go to the Kibana console and query the logs (Observability > Logs > Logs Explorer):

[Screenshot: the ingested GCP logs in the Kibana Logs Explorer]

That’s all, folks!

Feel free to open an issue in the repo if something is broken!

References:

  1. https://cloud.google.com/logging/docs/export/storage