GCP, Terraform, GCS, Cloud Armor, GCP Load balancing

Cloud Armor Rate Limit rules on a GCS service backend

Hi there! In this post we are going to explore a way to use Cloud Armor rate-limiting actions on a GCP load balancer backend bucket target.

As per the Cloud Armor documentation, the only supported policy for a backend bucket in a GCP load balancer is the edge security policy, which only supports the allow and deny actions. This is an inconvenience when we are, for example, serving files from a public bucket and want to use Cloud Armor to temporarily ban clients based on the number of requests they make; without that protection, a noisy client can cause unexpected outbound traffic and, hence, a spike in our billing. However, backend services do support rate limiting, and with a workaround we can serve the bucket through one. Let’s see.

We have several options when creating a backend service for a GCP global external Application Load Balancer. From the GCP documentation:

Each backend service supports one of the following backend combinations:

  • All instance group backends: One or more managed, unmanaged, or a combination of managed and unmanaged instance group backends
  • All zonal NEGs: One or more GCE_VM_IP_PORT type zonal NEGs
  • All hybrid connectivity NEGs: One or more NON_GCP_PRIVATE_IP_PORT type NEGs
  • A combination of zonal and hybrid NEGs: GCE_VM_IP_PORT and NON_GCP_PRIVATE_IP_PORT type NEGs
  • All serverless NEGs: One or more App Engine, Cloud Run, or Cloud Run functions resources
  • One global internet NEG for an external backend
  • Private Service Connect NEGs:
    • Google APIs: a single Private Service Connect NEG
    • Managed services: one or more Private Service Connect NEGs

Aha! One of the options is a Private Service Connect Network Endpoint Group. Private Service Connect (PSC) is a service that allows us to consume managed services or Google APIs from our private network (VPC) without leaving the Google Cloud network.

So, we can use a network endpoint group pointing to the Google Storage API, serve it through a backend service instead of a backend bucket, and apply rate limiting to that backend service. In essence, it is just a redirection from the load balancer backend to the Google Storage API.

Setup

TL;DR Repo link

We need the following resources:

  • An external Global Load balancer
  • A PSC Network Endpoint Group
  • A bucket
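
The snippets below also reference a few shared values through var.* and local.*. They are not part of the resources shown here, so the following is a minimal sketch of what they could look like (names and values are illustrative; the domain and rate-limit numbers are the ones used later in the test):

variable "project_id" { type = string }
variable "region"     { type = string }

locals {
  prefix      = "cloudarmor-demo"              # illustrative prefix for resource names
  domain      = "cloudarmor.acloudfrog.com"    # domain used by the test script below
  bucket_name = "${local.prefix}-files-bucket" # must match the name of the bucket created later

  # Rate-limit settings consumed by the Cloud Armor policy
  cloud_armor_ban_requests_interval = "60"
  cloud_armor_ban_requests_count    = "20"
  cloud_armor_ban_duration_sec      = "60"
}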

We don’t need a dedicated Private Service Connect (PSC) endpoint resource. Instead, we create a Network Endpoint Group (NEG) resource of type PRIVATE_SERVICE_CONNECT targeting the Storage API, storage.googleapis.com:

resource "google_compute_region_network_endpoint_group" "psc_neg_service_attachment" {
  name                  = "${local.prefix}-psc-neg"
  region                = var.region
  network_endpoint_type = "PRIVATE_SERVICE_CONNECT"
  psc_target_service    = "storage.googleapis.com"
}

Then we create the backend service with the PSC NEG as its backend. As you can see, the security_policy argument attaches the Cloud Armor policy to the backend service. We’ll explain the security policy in a later block.

resource "google_compute_backend_service" "default" {
  provider = google-beta
  name                            = "${local.prefix}-backend-service"
  enable_cdn                      = false
  security_policy = google_compute_security_policy.policy.id
  load_balancing_scheme = "EXTERNAL_MANAGED"
  log_config {
    enable = true
    sample_rate = 1.0
    optional_mode = "INCLUDE_ALL_OPTIONAL"
  }
  protocol = "HTTP"
  backend {
    group = google_compute_region_network_endpoint_group.psc_neg_service_attachment.id
    capacity_scaler = 1
  }
}

To speed up the process, we’ll use the load balancer Terraform module to create the load balancer frontend. We will create the URL map as a separate resource to define our host and path rules.

The frontend submodule can create an SSL certificate on our behalf. This is optional; it could be a plain HTTP frontend too.

module "lb-frontend" {
  source  = "terraform-google-modules/lb-http/google//modules/frontend"
  version = "13.0.1"

  project_id    = var.project_id
  name          = "${local.prefix}-global-lb"
  create_url_map = false
  url_map_resource_uri = google_compute_url_map.url_map.id
  ssl = true
  managed_ssl_certificate_domains = [ local.domain ]
}
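
One practical detail not covered by the snippets: for the Google-managed certificate to provision, the domain has to resolve to the load balancer’s public IP. A sketch of the DNS record, assuming a Cloud DNS managed zone named my-zone and assuming the frontend module exposes the address through an external_ip output (both names are assumptions, check your zone and the module outputs; if your DNS lives elsewhere, create the equivalent A record there):

resource "google_dns_record_set" "lb_a_record" {
  name         = "${local.domain}."                # trailing dot required by Cloud DNS
  type         = "A"
  ttl          = 300
  managed_zone = "my-zone"                         # hypothetical managed zone name
  rrdatas      = [module.lb-frontend.external_ip]  # assumes the module outputs the LB IP as external_ip
}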

For the URL map we have just a host rule for our domain and a route rule for our files path, which will match the path https://mydomain/files/* and route the request to our backend service. We need a url_rewrite action because, as we mentioned before, the PSC NEG targets storage.googleapis.com, so a request to https://mydomain/files/abc would be forwarded as https://storage.googleapis.com/files/abc through the PSC NEG, which is not the Storage API URI for our bucket. The url_rewrite action therefore replaces the /files prefix with the name of our bucket, transparently for the client. The flow is something like this:

https://mydomain/files/objectid --> url_rewrite --> https://mydomain/mybucketname/objectid --> PSC NEG --> https://storage.googleapis.com/mybucketname/objectid

resource "google_compute_url_map" "url_map" {
  name        = "${local.prefix}-url-map"
  description = "a description"

  default_service = google_compute_backend_service.default.id

  host_rule {
    hosts        = [local.domain]
    path_matcher = "api"
  }

  path_matcher {
    name            = "api"
    default_service = google_compute_backend_service.default.id

    route_rules {
      priority = 1
      match_rules {
        prefix_match = "/files"
      }
      service = google_compute_backend_service.default.id
      route_action {
        url_rewrite {
          path_prefix_rewrite = "/${local.bucket_name}"
        }
      }
    }
  }
  path_matcher {
    name            = "otherpaths"
    default_service = google_compute_backend_service.default.id
  }
}

The bucket

module "gcs" {
  source  = "terraform-google-modules/cloud-storage/google//modules/simple_bucket"
  version = "10.0"

  project_id    = var.project_id
  location      = var.region
  name          = "${local.prefix}-files-bucket"
  force_destroy = true
  iam_members   = [{ member = "allUsers", role = "roles/storage.objectViewer" }]
}
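
The test at the end fetches an object called hello-world, so we need at least one object in the bucket. A minimal sketch of uploading it with Terraform, assuming the simple_bucket module exposes the bucket name through a name output (you could also upload it with gcloud storage cp):

resource "google_storage_bucket_object" "hello_world" {
  name    = "hello-world"      # matches the path used in the test script: /files/hello-world
  bucket  = module.gcs.name    # assumes the module outputs the bucket name as `name`
  content = "Hello, world!\n"
}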

And finally, the Cloud Armor security policy. For rate limiting, you can do the following [1]:

  1. Throttle requests per client based on a threshold you configure.
  2. Temporarily ban clients that exceed a request threshold that you set for a configured duration.

We will filter the traffic coming to our backend using the rate_limit_threshold.count parameter. Once a client reaches the threshold, the exceed_action response is returned to it and its IP is banned for ban_duration_sec seconds. After that time, the client again gets rate_limit_threshold.count requests per interval.

resource "google_compute_security_policy" "policy" {
    name        = "${local.prefix}-rateban-policy"
    description = "ban IPs based on a count and a interval in seconds"
    rule {
      action   = "rate_based_ban"
      priority = "2147483647"
      match {
          versioned_expr = "SRC_IPS_V1"
          config {
              src_ip_ranges = ["*"]
          }
      }
      description = "default rule"
      rate_limit_options {
        ban_duration_sec = local.cloud_armor_ban_duration_sec
        conform_action = "allow"
        exceed_action = "deny(403)"
        enforce_on_key = ""
        enforce_on_key_configs {
          enforce_on_key_type = "IP"
        }
        rate_limit_threshold {
          interval_sec = local.cloud_armor_ban_requests_interval
          count = local.cloud_armor_ban_requests_count
        }
      }
    }
}
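
For completeness: if we only wanted option 1 above (throttling clients above the threshold, without a temporary ban), the rule is almost identical. A sketch of that variant using the same illustrative locals; throttle has no ban_duration_sec, and here excess requests get a 429 instead of a 403:

resource "google_compute_security_policy" "throttle_policy" {
  name        = "${local.prefix}-throttle-policy"
  description = "throttle clients that exceed the request threshold"

  rule {
    action   = "throttle"
    priority = "2147483647"
    match {
      versioned_expr = "SRC_IPS_V1"
      config {
        src_ip_ranges = ["*"]
      }
    }
    description = "default rule"
    rate_limit_options {
      conform_action = "allow"
      exceed_action  = "deny(429)"   # requests above the threshold are rejected until the interval resets
      enforce_on_key = ""
      enforce_on_key_configs {
        enforce_on_key_type = "IP"
      }
      rate_limit_threshold {
        interval_sec = local.cloud_armor_ban_requests_interval
        count        = local.cloud_armor_ban_requests_count
      }
    }
  }
}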

Let’s check if the policy is working. With the parameters:

  • cloud_armor_ban_requests_interval = "60"
  • cloud_armor_ban_requests_count = "20"
  • cloud_armor_ban_duration_sec = "60"

That means that one IP can make 20 requests within a 60-second interval; after that, it will be banned for 60 seconds. While it is banned, the backend will respond with a 403 HTTP code.

Deploy the OpenTofu code, and use the following script to test:

#!/bin/bash
URL="https://cloudarmor.acloudfrog.com/files/hello-world"

for i in $(seq 1 30);
do
    echo "Sending request ${i}, Response Code: $(curl -s -o response.txt -w "%{response_code}" "${URL}")"
    sleep 2
done

We got the result:

$ ./test_rule.sh
Sending request 1, Response Code: 200
Sending request 2, Response Code: 200
Sending request 3, Response Code: 200
Sending request 4, Response Code: 200
Sending request 5, Response Code: 200
Sending request 6, Response Code: 200
Sending request 7, Response Code: 200
Sending request 8, Response Code: 200
Sending request 9, Response Code: 200
Sending request 10, Response Code: 200
Sending request 11, Response Code: 200
Sending request 12, Response Code: 200
Sending request 13, Response Code: 200
Sending request 14, Response Code: 200
Sending request 15, Response Code: 200
Sending request 16, Response Code: 200
Sending request 17, Response Code: 200
Sending request 18, Response Code: 200
Sending request 19, Response Code: 200
Sending request 20, Response Code: 200
Sending request 21, Response Code: 200
Sending request 22, Response Code: 200
Sending request 23, Response Code: 403
Sending request 24, Response Code: 403
Sending request 25, Response Code: 403
Sending request 26, Response Code: 403
Sending request 27, Response Code: 403
Sending request 28, Response Code: 403
Sending request 29, Response Code: 403
Sending request 30, Response Code: 403

It works! After ~20 requests, Cloud Armor banned my IP.

That’s all, folks!

[1] https://cloud.google.com/armor/docs/security-policy-overview#rate-limiting-rules