Basic config examples for AI Semantic Cache - Plugin

Basic configuration examples

Enable on a service

Enable on a route

Enable on a consumer

Enable on a consumer group

Enable globally

The following examples provide some typical configurations for enabling the ai-semantic-cache plugin on a service.

The examples appear in this order: Kong Admin API, Konnect API, Kubernetes, Declarative (YAML), and Konnect Terraform.

Make the following request:

curl -X POST http://localhost:8001/services/{serviceName|Id}/plugins \
    --header "accept: application/json" \
    --header "Content-Type: application/json" \
    --data '
    {
      "name": "ai-semantic-cache",
      "config": {
        "embeddings": {
          "model": {
            "provider": "openai",
            "name": "text-embedding-3-large"
          }
        },
        "vectordb": {
          "strategy": "redis",
          "dimensions": 3072,
          "threshold": 0.1,
          "distance_metric": "cosine",
          "redis": {
            "host": "exampleredis.com",
            "port": 80
          }
        }
      }
    }
    '

Replace {serviceName|Id} with the ID or name of the service that this plugin configuration will target.
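The placeholder in the request URL can also be filled in from a shell variable, which makes the same request easier to reuse for the other scopes on this page. The following is a minimal sketch rather than part of the official examples: the function name plugins_url and the KONG_ADMIN_URL variable are hypothetical, and the default address is simply the one assumed in the request above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Kong): print the Admin API endpoint
# used to create a plugin in a given scope.
# KONG_ADMIN_URL is an assumed variable; it defaults to the address used above.
KONG_ADMIN_URL="${KONG_ADMIN_URL:-http://localhost:8001}"

plugins_url() {
  local scope="$1" id="${2:-}"
  case "$scope" in
    global)
      # A global plugin is not attached to any entity.
      printf '%s/plugins/\n' "$KONG_ADMIN_URL"
      ;;
    services|routes|consumers|consumer_groups)
      # Scoped plugins need the name or ID of the target entity.
      if [ -z "$id" ]; then
        echo "missing name or ID for scope: $scope" >&2
        return 1
      fi
      printf '%s/%s/%s/plugins\n' "$KONG_ADMIN_URL" "$scope" "$id"
      ;;
    *)
      echo "unknown scope: $scope" >&2
      return 1
      ;;
  esac
}
```

With the JSON body from the request above saved to a file (here assumed to be ai-semantic-cache.json), the call becomes curl -X POST "$(plugins_url services my-service)" --header "Content-Type: application/json" --data @ai-semantic-cache.json, where my-service stands in for your own service name or ID.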

Make the following request, substituting your own access token, region, control plane ID, and service ID:

curl -X POST \
https://{us|eu}.api.konghq.com/v2/control-planes/{controlPlaneId}/core-entities/services/{serviceId}/plugins \
    --header "accept: application/json" \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer TOKEN" \
    --data '{"name":"ai-semantic-cache","config":{"embeddings":{"model":{"provider":"openai","name":"text-embedding-3-large"}},"vectordb":{"strategy":"redis","dimensions":3072,"threshold":0.1,"distance_metric":"cosine","redis":{"host":"exampleredis.com","port":80}}}}'

See the Konnect API reference to learn about region-specific URLs and personal access tokens.
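Because the Konnect request needs a region, a control plane ID, and a token, it can help to assemble the URL from variables and keep the token out of the command line history. The sketch below is illustrative only: the function name konnect_plugins_url and the KONNECT_TOKEN variable are hypothetical, and the URL layout is copied from the request above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Konnect tooling): print the Konnect
# endpoint used to create a plugin.
# Usage: konnect_plugins_url REGION CONTROL_PLANE_ID [ENTITY_TYPE ENTITY_ID]
konnect_plugins_url() {
  local region="$1" cp_id="$2" entity="${3:-}" entity_id="${4:-}"
  local base="https://${region}.api.konghq.com/v2/control-planes/${cp_id}/core-entities"
  if [ -z "$entity" ]; then
    # No entity given: endpoint for a global plugin on the control plane.
    printf '%s/plugins/\n' "$base"
  else
    # Entity given (for example: services SERVICE_ID): scoped endpoint.
    printf '%s/%s/%s/plugins\n' "$base" "$entity" "$entity_id"
  fi
}
```

A request could then pass "$(konnect_plugins_url us "$CONTROL_PLANE_ID" services "$SERVICE_ID")" as the URL and --header "Authorization: Bearer ${KONNECT_TOKEN}" as the authorization header, assuming those three variables were exported beforehand.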

First, create a KongPlugin resource:

echo "
apiVersion: configuration.konghq.com/v1
kind: KongPlugin
metadata:
  name: ai-semantic-cache-example
plugin: ai-semantic-cache
config:
  embeddings:
    model:
      provider: openai
      name: text-embedding-3-large
  vectordb:
    strategy: redis
    dimensions: 3072
    threshold: 0.1
    distance_metric: cosine
    redis:
      host: exampleredis.com
      port: 80
" | kubectl apply -f -

Next, apply the KongPlugin resource to a service by annotating the service as follows:

kubectl annotate service SERVICE_NAME konghq.com/plugins=ai-semantic-cache-example

Replace SERVICE_NAME with the name of the service that this plugin configuration will target. You can see your available services by running kubectl get service.

Note: The KongPlugin resource only needs to be defined once and can be applied to any service, consumer, or route in the namespace. If you want the plugin to be available cluster-wide, create the resource as a KongClusterPlugin instead of KongPlugin.
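To illustrate the note above, the same manifest can be produced with either kind from a single template. This is a rough sketch under stated assumptions: render_plugin is a hypothetical function name, and the ingress class annotation added for KongClusterPlugin mirrors the global example on this page rather than a verified requirement of your cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the example plugin manifest with a chosen kind.
# Usage: render_plugin KIND [NAME]
render_plugin() {
  local kind="$1" name="${2:-ai-semantic-cache-example}" extra=""
  case "$kind" in
    KongPlugin) ;;
    KongClusterPlugin)
      # Assumption: reuse the annotation shown in this page's global example.
      extra='  annotations:
    kubernetes.io/ingress.class: kong
'
      ;;
    *)
      echo "unsupported kind: $kind" >&2
      return 1
      ;;
  esac
  # The config values below are copied from the KongPlugin example above.
  cat <<EOF
apiVersion: configuration.konghq.com/v1
kind: $kind
metadata:
  name: $name
${extra}plugin: ai-semantic-cache
config:
  embeddings:
    model:
      provider: openai
      name: text-embedding-3-large
  vectordb:
    strategy: redis
    dimensions: 3072
    threshold: 0.1
    distance_metric: cosine
    redis:
      host: exampleredis.com
      port: 80
EOF
}
```

The output can be piped to kubectl in the same way as the echo command above, for example render_plugin KongClusterPlugin | kubectl apply -f -.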

Add this section to your declarative configuration file:

plugins:
- name: ai-semantic-cache
  service: SERVICE_NAME|ID
  config:
    embeddings:
      model:
        provider: openai
        name: text-embedding-3-large
    vectordb:
      strategy: redis
      dimensions: 3072
      threshold: 0.1
      distance_metric: cosine
      redis:
        host: exampleredis.com
        port: 80

Replace SERVICE_NAME|ID with the ID or name of the service that this plugin configuration will target.

Prerequisite: Configure your Personal Access Token

terraform {
  required_providers {
    konnect = {
      source  = "kong/konnect"
    }
  }
}

provider "konnect" {
  personal_access_token = "kpat_YOUR_TOKEN"
  server_url            = "https://us.api.konghq.com/"
}

Add the following to your Terraform configuration to create a Konnect Gateway Plugin:

resource "konnect_gateway_plugin_ai_semantic_cache" "my_ai_semantic_cache" {
  enabled = true

  config = {
    embeddings = {
      model = {
        provider = "openai"
        name = "text-embedding-3-large"
      }
    }
    vectordb = {
      strategy = "redis"
      dimensions = 3072
      threshold = 0.1
      distance_metric = "cosine"
      redis = {
        host = "exampleredis.com"
        port = 80
      }
    }
  }

  control_plane_id = konnect_gateway_control_plane.my_konnect_cp.id
  service = {
    id = konnect_gateway_service.my_service.id
  }
}

The following examples provide some typical configurations for enabling the ai-semantic-cache plugin on a route.

The examples appear in this order: Kong Admin API, Konnect API, Kubernetes, Declarative (YAML), and Konnect Terraform.

Make the following request:

curl -X POST http://localhost:8001/routes/{routeName|Id}/plugins \
    --header "accept: application/json" \
    --header "Content-Type: application/json" \
    --data '
    {
      "name": "ai-semantic-cache",
      "config": {
        "embeddings": {
          "model": {
            "provider": "openai",
            "name": "text-embedding-3-large"
          }
        },
        "vectordb": {
          "strategy": "redis",
          "dimensions": 3072,
          "threshold": 0.1,
          "distance_metric": "cosine",
          "redis": {
            "host": "exampleredis.com",
            "port": 80
          }
        }
      }
    }
    '

Replace {routeName|Id} with the ID or name of the route that this plugin configuration will target.

Make the following request, substituting your own access token, region, control plane ID, and route ID:

curl -X POST \
https://{us|eu}.api.konghq.com/v2/control-planes/{controlPlaneId}/core-entities/routes/{routeId}/plugins \
    --header "accept: application/json" \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer TOKEN" \
    --data '{"name":"ai-semantic-cache","config":{"embeddings":{"model":{"provider":"openai","name":"text-embedding-3-large"}},"vectordb":{"strategy":"redis","dimensions":3072,"threshold":0.1,"distance_metric":"cosine","redis":{"host":"exampleredis.com","port":80}}}}'

See the Konnect API reference to learn about region-specific URLs and personal access tokens.

First, create a KongPlugin resource:

echo "
apiVersion: configuration.konghq.com/v1
kind: KongPlugin
metadata:
  name: ai-semantic-cache-example
plugin: ai-semantic-cache
config:
  embeddings:
    model:
      provider: openai
      name: text-embedding-3-large
  vectordb:
    strategy: redis
    dimensions: 3072
    threshold: 0.1
    distance_metric: cosine
    redis:
      host: exampleredis.com
      port: 80
" | kubectl apply -f -

Next, apply the KongPlugin resource to an ingress by annotating the ingress as follows:

kubectl annotate ingress INGRESS_NAME konghq.com/plugins=ai-semantic-cache-example

Replace INGRESS_NAME with the name of the ingress that this plugin configuration will target. You can see your available ingresses by running kubectl get ingress.

Note: The KongPlugin resource only needs to be defined once and can be applied to any service, consumer, or route in the namespace. If you want the plugin to be available cluster-wide, create the resource as a KongClusterPlugin instead of KongPlugin.

Add this section to your declarative configuration file:

plugins:
- name: ai-semantic-cache
  route: ROUTE_NAME|ID
  config:
    embeddings:
      model:
        provider: openai
        name: text-embedding-3-large
    vectordb:
      strategy: redis
      dimensions: 3072
      threshold: 0.1
      distance_metric: cosine
      redis:
        host: exampleredis.com
        port: 80

Replace ROUTE_NAME|ID with the ID or name of the route that this plugin configuration will target.

Prerequisite: Configure your Personal Access Token

terraform {
  required_providers {
    konnect = {
      source  = "kong/konnect"
    }
  }
}

provider "konnect" {
  personal_access_token = "kpat_YOUR_TOKEN"
  server_url            = "https://us.api.konghq.com/"
}

Add the following to your Terraform configuration to create a Konnect Gateway Plugin:

resource "konnect_gateway_plugin_ai_semantic_cache" "my_ai_semantic_cache" {
  enabled = true

  config = {
    embeddings = {
      model = {
        provider = "openai"
        name = "text-embedding-3-large"
      }
    }
    vectordb = {
      strategy = "redis"
      dimensions = 3072
      threshold = 0.1
      distance_metric = "cosine"
      redis = {
        host = "exampleredis.com"
        port = 80
      }
    }
  }

  control_plane_id = konnect_gateway_control_plane.my_konnect_cp.id
  route = {
    id = konnect_gateway_route.my_route.id
  }
}

The following examples provide some typical configurations for enabling the ai-semantic-cache plugin on a consumer.

The examples appear in this order: Kong Admin API, Konnect API, Kubernetes, Declarative (YAML), and Konnect Terraform.

Make the following request:

curl -X POST http://localhost:8001/consumers/{consumerName|Id}/plugins \
    --header "accept: application/json" \
    --header "Content-Type: application/json" \
    --data '
    {
      "name": "ai-semantic-cache",
      "config": {
        "embeddings": {
          "model": {
            "provider": "openai",
            "name": "text-embedding-3-large"
          }
        },
        "vectordb": {
          "strategy": "redis",
          "dimensions": 3072,
          "threshold": 0.1,
          "distance_metric": "cosine",
          "redis": {
            "host": "exampleredis.com",
            "port": 80
          }
        }
      }
    }
    '

Replace {consumerName|Id} with the ID or name of the consumer that this plugin configuration will target.

Make the following request, substituting your own access token, region, control plane ID, and consumer ID:

curl -X POST \
https://{us|eu}.api.konghq.com/v2/control-planes/{controlPlaneId}/core-entities/consumers/{consumerId}/plugins \
    --header "accept: application/json" \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer TOKEN" \
    --data '{"name":"ai-semantic-cache","config":{"embeddings":{"model":{"provider":"openai","name":"text-embedding-3-large"}},"vectordb":{"strategy":"redis","dimensions":3072,"threshold":0.1,"distance_metric":"cosine","redis":{"host":"exampleredis.com","port":80}}}}'

See the Konnect API reference to learn about region-specific URLs and personal access tokens.

First, create a KongPlugin resource:

echo "
apiVersion: configuration.konghq.com/v1
kind: KongPlugin
metadata:
  name: ai-semantic-cache-example
plugin: ai-semantic-cache
config:
  embeddings:
    model:
      provider: openai
      name: text-embedding-3-large
  vectordb:
    strategy: redis
    dimensions: 3072
    threshold: 0.1
    distance_metric: cosine
    redis:
      host: exampleredis.com
      port: 80
" | kubectl apply -f -

Next, apply the KongPlugin resource to a consumer by annotating the KongConsumer object as follows:

kubectl annotate KongConsumer CONSUMER_NAME konghq.com/plugins=ai-semantic-cache-example

Replace CONSUMER_NAME with the name of the consumer that this plugin configuration will target. You can see your available consumers by running kubectl get KongConsumer.

To learn more about KongConsumer objects, see Provisioning Consumers and Credentials.

Note: The KongPlugin resource only needs to be defined once and can be applied to any service, consumer, or route in the namespace. If you want the plugin to be available cluster-wide, create the resource as a KongClusterPlugin instead of KongPlugin.

Add this section to your declarative configuration file:

plugins:
- name: ai-semantic-cache
  consumer: CONSUMER_NAME|ID
  config:
    embeddings:
      model:
        provider: openai
        name: text-embedding-3-large
    vectordb:
      strategy: redis
      dimensions: 3072
      threshold: 0.1
      distance_metric: cosine
      redis:
        host: exampleredis.com
        port: 80

Replace CONSUMER_NAME|ID with the ID or name of the consumer that this plugin configuration will target.

Prerequisite: Configure your Personal Access Token

terraform {
  required_providers {
    konnect = {
      source  = "kong/konnect"
    }
  }
}

provider "konnect" {
  personal_access_token = "kpat_YOUR_TOKEN"
  server_url            = "https://us.api.konghq.com/"
}

Add the following to your Terraform configuration to create a Konnect Gateway Plugin:

resource "konnect_gateway_plugin_ai_semantic_cache" "my_ai_semantic_cache" {
  enabled = true

  config = {
    embeddings = {
      model = {
        provider = "openai"
        name = "text-embedding-3-large"
      }
    }
    vectordb = {
      strategy = "redis"
      dimensions = 3072
      threshold = 0.1
      distance_metric = "cosine"
      redis = {
        host = "exampleredis.com"
        port = 80
      }
    }
  }

  control_plane_id = konnect_gateway_control_plane.my_konnect_cp.id
  consumer = {
    id = konnect_gateway_consumer.my_consumer.id
  }
}

The following examples provide some typical configurations for enabling the ai-semantic-cache plugin on a consumer group.

The examples appear in this order: Kong Admin API, Konnect API, Kubernetes, Declarative (YAML), and Konnect Terraform.

Make the following request:

curl -X POST http://localhost:8001/consumer_groups/{consumerGroupName|Id}/plugins \
    --header "accept: application/json" \
    --header "Content-Type: application/json" \
    --data '
    {
      "name": "ai-semantic-cache",
      "config": {
        "embeddings": {
          "model": {
            "provider": "openai",
            "name": "text-embedding-3-large"
          }
        },
        "vectordb": {
          "strategy": "redis",
          "dimensions": 3072,
          "threshold": 0.1,
          "distance_metric": "cosine",
          "redis": {
            "host": "exampleredis.com",
            "port": 80
          }
        }
      }
    }
    '

Replace {consumerGroupName|Id} with the ID or name of the consumer group that this plugin configuration will target.

Make the following request, substituting your own access token, region, control plane ID, and consumer group ID:

curl -X POST \
https://{us|eu}.api.konghq.com/v2/control-planes/{controlPlaneId}/core-entities/consumer_groups/{consumerGroupId}/plugins \
    --header "accept: application/json" \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer TOKEN" \
    --data '{"name":"ai-semantic-cache","config":{"embeddings":{"model":{"provider":"openai","name":"text-embedding-3-large"}},"vectordb":{"strategy":"redis","dimensions":3072,"threshold":0.1,"distance_metric":"cosine","redis":{"host":"exampleredis.com","port":80}}}}'

See the Konnect API reference to learn about region-specific URLs and personal access tokens.

First, create a KongPlugin resource:

echo "
apiVersion: configuration.konghq.com/v1
kind: KongPlugin
metadata:
  name: ai-semantic-cache-example
plugin: ai-semantic-cache
config:
  embeddings:
    model:
      provider: openai
      name: text-embedding-3-large
  vectordb:
    strategy: redis
    dimensions: 3072
    threshold: 0.1
    distance_metric: cosine
    redis:
      host: exampleredis.com
      port: 80
" | kubectl apply -f -

Next, apply the KongPlugin resource to a consumer group by annotating the KongConsumerGroup object as follows:

kubectl annotate KongConsumerGroup CONSUMER_GROUP_NAME konghq.com/plugins=ai-semantic-cache-example

Replace CONSUMER_GROUP_NAME with the name of the consumer group that this plugin configuration will target. You can see your available consumer groups by running kubectl get KongConsumerGroup.

Note: The KongPlugin resource only needs to be defined once and can be applied to any service, consumer, consumer group, or route in the namespace. If you want the plugin to be available cluster-wide, create the resource as a KongClusterPlugin instead of KongPlugin.

Add this section to your declarative configuration file:

plugins:
- name: ai-semantic-cache
  consumer_group: CONSUMER_GROUP_NAME|ID
  config:
    embeddings:
      model:
        provider: openai
        name: text-embedding-3-large
    vectordb:
      strategy: redis
      dimensions: 3072
      threshold: 0.1
      distance_metric: cosine
      redis:
        host: exampleredis.com
        port: 80

Replace CONSUMER_GROUP_NAME|ID with the ID or name of the consumer group that this plugin configuration will target.

Prerequisite: Configure your Personal Access Token

terraform {
  required_providers {
    konnect = {
      source  = "kong/konnect"
    }
  }
}

provider "konnect" {
  personal_access_token = "kpat_YOUR_TOKEN"
  server_url            = "https://us.api.konghq.com/"
}

Add the following to your Terraform configuration to create a Konnect Gateway Plugin:

resource "konnect_gateway_plugin_ai_semantic_cache" "my_ai_semantic_cache" {
  enabled = true

  config = {
    embeddings = {
      model = {
        provider = "openai"
        name = "text-embedding-3-large"
      }
    }
    vectordb = {
      strategy = "redis"
      dimensions = 3072
      threshold = 0.1
      distance_metric = "cosine"
      redis = {
        host = "exampleredis.com"
        port = 80
      }
    }
  }

  control_plane_id = konnect_gateway_control_plane.my_konnect_cp.id
  consumer_group = {
    id = konnect_gateway_consumer_group.my_consumer_group.id
  }
}

A plugin that is not associated with any service, route, consumer, or consumer group is considered global and runs on every request.

In self-managed Kong Gateway Enterprise, the plugin applies to every entity in a given workspace.
In self-managed Kong Gateway (OSS), the plugin applies to your entire environment.
In Konnect, the plugin applies to every entity in a given control plane.

Read the Plugin Precedence section for more information.

The following examples provide some typical configurations for enabling the AI Semantic Cache plugin globally.

The examples appear in this order: Kong Admin API, Konnect API, Kubernetes, Declarative (YAML), and Konnect Terraform.

Make the following request:

curl -X POST http://localhost:8001/plugins/ \
    --header "accept: application/json" \
    --header "Content-Type: application/json" \
    --data '
    {
      "name": "ai-semantic-cache",
      "config": {
        "embeddings": {
          "model": {
            "provider": "openai",
            "name": "text-embedding-3-large"
          }
        },
        "vectordb": {
          "strategy": "redis",
          "dimensions": 3072,
          "threshold": 0.1,
          "distance_metric": "cosine",
          "redis": {
            "host": "exampleredis.com",
            "port": 80
          }
        }
      }
    }
    '

Make the following request, substituting your own access token, region, and control plane ID:

curl -X POST \
https://{us|eu}.api.konghq.com/v2/control-planes/{controlPlaneId}/core-entities/plugins/ \
    --header "accept: application/json" \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer TOKEN" \
    --data '{"name":"ai-semantic-cache","config":{"embeddings":{"model":{"provider":"openai","name":"text-embedding-3-large"}},"vectordb":{"strategy":"redis","dimensions":3072,"threshold":0.1,"distance_metric":"cosine","redis":{"host":"exampleredis.com","port":80}}}}'

See the Konnect API reference to learn about region-specific URLs and personal access tokens.

Create a KongClusterPlugin resource and label it as global:

apiVersion: configuration.konghq.com/v1
kind: KongClusterPlugin
metadata:
  name: <global-ai-semantic-cache>
  annotations:
    kubernetes.io/ingress.class: kong
  labels:
    global: "true"
config:
  embeddings:
    model:
      provider: openai
      name: text-embedding-3-large
  vectordb:
    strategy: redis
    dimensions: 3072
    threshold: 0.1
    distance_metric: cosine
    redis:
      host: exampleredis.com
      port: 80
plugin: ai-semantic-cache

Add a plugins entry in the declarative configuration file:

plugins:
- name: ai-semantic-cache
  config:
    embeddings:
      model:
        provider: openai
        name: text-embedding-3-large
    vectordb:
      strategy: redis
      dimensions: 3072
      threshold: 0.1
      distance_metric: cosine
      redis:
        host: exampleredis.com
        port: 80

Prerequisite: Configure your Personal Access Token

terraform {
  required_providers {
    konnect = {
      source  = "kong/konnect"
    }
  }
}

provider "konnect" {
  personal_access_token = "kpat_YOUR_TOKEN"
  server_url            = "https://us.api.konghq.com/"
}

Add the following to your Terraform configuration to create a Konnect Gateway Plugin:

resource "konnect_gateway_plugin_ai_semantic_cache" "my_ai_semantic_cache" {
  enabled = true

  config = {
    embeddings = {
      model = {
        provider = "openai"
        name = "text-embedding-3-large"
      }
    }
    vectordb = {
      strategy = "redis"
      dimensions = 3072
      threshold = 0.1
      distance_metric = "cosine"
      redis = {
        host = "exampleredis.com"
        port = 80
      }
    }
  }

  control_plane_id = konnect_gateway_control_plane.my_konnect_cp.id
}