You are browsing unreleased documentation.
Configuration
This plugin is compatible with DB-less mode.
Compatible protocols
The AI Semantic Cache plugin is compatible with the following protocols:
grpc
, grpcs
, http
, https
Parameters
Here's a list of all the parameters which can be used in this plugin's configuration:
-
name or plugin
string requiredThe name of the plugin, in this case
ai-semantic-cache
.- If using the Kong Admin API, Konnect API, declarative configuration, or decK files, the field is
name
. - If using the KongPlugin object in Kubernetes, the field is
plugin
.
- If using the Kong Admin API, Konnect API, declarative configuration, or decK files, the field is
-
instance_name
stringAn optional custom name to identify an instance of the plugin, for example
ai-semantic-cache_my-service
.The instance name shows up in Kong Manager and in Konnect, so it's useful when running the same plugin in multiple contexts, for example, on multiple services. You can also use it to access a specific plugin instance via the Kong Admin API.
An instance name must be unique within the following context:
- Within a workspace for Kong Gateway Enterprise
- Within a control plane or control plane group for Konnect
- Globally for Kong Gateway (OSS)
-
service.name or service.id
stringThe name or ID of the service the plugin targets. Set one of these parameters if adding the plugin to a service through the top-level
/plugins
endpoint. Not required if using/services/{serviceName|Id}/plugins
. -
route.name or route.id
stringThe name or ID of the route the plugin targets. Set one of these parameters if adding the plugin to a route through the top-level
/plugins
endpoint. Not required if using/routes/{routeName|Id}/plugins
. -
consumer.name or consumer.id
stringThe name or ID of the consumer the plugin targets. Set one of these parameters if adding the plugin to a consumer through the top-level
/plugins
endpoint. Not required if using/consumers/{consumerName|Id}/plugins
. -
consumer_group.name or consumer_group.id
stringThe name or ID of the consumer group the plugin targets. If set, the plugin will activate only for requests where the specified group has been authenticated
/plugins
endpoint. Not required if using/consumer_groups/{consumerGroupName|Id}/plugins
. -
enabled
boolean default:true
Whether this plugin will be applied.
-
config
record required-
message_countback
number default:1
between:1
1000
Number of messages in the chat history to Vectorize/Cache
-
ignore_system_prompts
boolean default:false
Ignore and discard any system prompts when Vectorizing the request
-
ignore_assistant_prompts
boolean default:false
Ignore and discard any assistant prompts when Vectorizing the request
-
ignore_tool_prompts
boolean default:false
Ignore and discard any tool prompts when Vectorizing the request
-
stop_on_failure
boolean required default:false
Halt the LLM request process in case of a caching system failure
-
cache_ttl
integer default:300
TTL in seconds of cache entities. Must be a value greater than 0.
-
cache_control
boolean required default:false
When enabled, respect the Cache-Control behaviors defined in RFC7234.
-
exact_caching
boolean required default:false
When enabled, a first check for exact query will be done. It will impact DB size
-
embeddings
record required-
auth
record-
header_name
string referenceableIf AI model requires authentication via Authorization or API key header, specify its name here.
-
header_value
string referenceable encryptedSpecify the full auth header value for ‘header_name’, for example ‘Bearer key’ or just ‘key’.
-
param_name
string referenceableIf AI model requires authentication via query parameter, specify its name here.
-
param_value
string referenceable encryptedSpecify the full parameter value for ‘param_name’.
-
param_location
string Must be one of:query
,body
Specify whether the ‘param_name’ and ‘param_value’ options go in a query string, or the POST form/JSON body.
-
azure_use_managed_identity
boolean default:false
Set true to use the Azure Cloud Managed Identity (or user-assigned identity) to authenticate with Azure-provider models.
-
azure_client_id
string referenceableIf azure_use_managed_identity is set to true, and you need to use a different user-assigned identity for this LLM instance, set the client ID.
-
azure_client_secret
string referenceable encryptedIf azure_use_managed_identity is set to true, and you need to use a different user-assigned identity for this LLM instance, set the client secret.
-
azure_tenant_id
string referenceableIf azure_use_managed_identity is set to true, and you need to use a different user-assigned identity for this LLM instance, set the tenant ID.
-
gcp_use_service_account
boolean default:false
Use service account auth for GCP-based providers and models.
-
gcp_service_account_json
string referenceable encryptedSet this field to the full JSON of the GCP service account to authenticate, if required. If null (and gcp_use_service_account is true), Kong will attempt to read from environment variable
GCP_SERVICE_ACCOUNT
.
-
aws_access_key_id
string referenceable encryptedSet this if you are using an AWS provider (Bedrock) and you are authenticating using static IAM User credentials. Setting this will override the AWS_ACCESS_KEY_ID environment variable for this plugin instance.
-
aws_secret_access_key
string referenceable encryptedSet this if you are using an AWS provider (Bedrock) and you are authenticating using static IAM User credentials. Setting this will override the AWS_SECRET_ACCESS_KEY environment variable for this plugin instance.
-
allow_override
boolean default:false
If enabled, the authorization header or parameter can be overridden in the request by the value configured in the plugin.
-
-
model
record required-
provider
string required Must be one of:openai
,mistral
AI provider format to use for embeddings API
-
name
string requiredModel name to execute.
-
options
recordKey/value settings for the model
-
upstream_url
stringupstream url for the embeddings
-
-
-
-
vectordb
record required-
strategy
string required Must be one of:redis
which vector database driver to use
-
dimensions
integer requiredthe desired dimensionality for the vectors
-
threshold
number requiredthe default similarity threshold for accepting semantic search results (float)
-
distance_metric
string required Must be one of:cosine
,euclidean
the distance metric to use for vector searches
-
redis
record required-
host
string default:127.0.0.1
A string representing a host name, such as example.com.
-
port
integer default:6379
between:0
65535
An integer representing a port number between 0 and 65535, inclusive.
-
connect_timeout
integer default:2000
between:0
2147483646
An integer representing a timeout in milliseconds. Must be between 0 and 2^31-2.
-
send_timeout
integer default:2000
between:0
2147483646
An integer representing a timeout in milliseconds. Must be between 0 and 2^31-2.
-
read_timeout
integer default:2000
between:0
2147483646
An integer representing a timeout in milliseconds. Must be between 0 and 2^31-2.
-
username
string referenceableUsername to use for Redis connections. If undefined, ACL authentication won’t be performed. This requires Redis v6.0.0+. To be compatible with Redis v5.x.y, you can set it to
default
.
-
password
string referenceable encryptedPassword to use for Redis connections. If undefined, no AUTH commands are sent to Redis.
-
sentinel_username
string referenceableSentinel username to authenticate with a Redis Sentinel instance. If undefined, ACL authentication won’t be performed. This requires Redis v6.2.0+.
-
sentinel_password
string referenceable encryptedSentinel password to authenticate with a Redis Sentinel instance. If undefined, no AUTH commands are sent to Redis Sentinels.
-
database
integer default:0
Database to use for the Redis connection when using the
redis
strategy
-
keepalive_pool_size
integer default:256
between:1
2147483646
The size limit for every cosocket connection pool associated with every remote server, per worker process. If neither
keepalive_pool_size
norkeepalive_backlog
is specified, no pool is created. Ifkeepalive_pool_size
isn’t specified butkeepalive_backlog
is specified, then the pool uses the default value. Try to increase (e.g. 512) this value if latency is high or throughput is low.
-
keepalive_backlog
integer between:0
2147483646
Limits the total number of opened connections for a pool. If the connection pool is full, connection queues above the limit go into the backlog queue. If the backlog queue is full, subsequent connect operations fail and return
nil
. Queued operations (subject to set timeouts) resume once the number of connections in the pool is less thankeepalive_pool_size
. If latency is high or throughput is low, try increasing this value. Empirically, this value is larger thankeepalive_pool_size
.
-
sentinel_master
stringSentinel master to use for Redis connections. Defining this value implies using Redis Sentinel.
-
sentinel_role
string Must be one of:master
,slave
,any
Sentinel role to use for Redis connections when the
redis
strategy is defined. Defining this value implies using Redis Sentinel.
-
sentinel_nodes
array of typerecord
len_min:1
Sentinel node addresses to use for Redis connections when the
redis
strategy is defined. Defining this field implies using a Redis Sentinel. The minimum length of the array is 1 element.-
host
string required default:127.0.0.1
A string representing a host name, such as example.com.
-
port
integer default:6379
between:0
65535
An integer representing a port number between 0 and 65535, inclusive.
-
-
cluster_nodes
array of typerecord
len_min:1
Cluster addresses to use for Redis connections when the
redis
strategy is defined. Defining this field implies using a Redis Cluster. The minimum length of the array is 1 element.-
ip
string required default:127.0.0.1
A string representing a host name, such as example.com.
-
port
integer default:6379
between:0
65535
An integer representing a port number between 0 and 65535, inclusive.
-
-
ssl
boolean default:false
If set to true, uses SSL to connect to Redis.
-
ssl_verify
boolean default:false
If set to true, verifies the validity of the server SSL certificate. If setting this parameter, also configure
lua_ssl_trusted_certificate
inkong.conf
to specify the CA (or server) certificate used by your Redis server. You may also need to configurelua_ssl_verify_depth
accordingly.
-
server_name
stringA string representing an SNI (server name indication) value for TLS.
-
cluster_max_redirections
integer default:5
Maximum retry attempts for redirection.
-
connection_is_proxied
boolean default:false
If the connection to Redis is proxied (e.g. Envoy), set it
true
. Set thehost
andport
to point to the proxy address.
-
-
-