# Autoscaling with KEDA

[KEDA (Kubernetes Event-Driven Autoscaling)](https://keda.sh/) provides flexible, event-driven autoscaling for Fusion workloads as an alternative to the default Kubernetes Horizontal Pod Autoscaler (HPA).

Traditional HPA-based autoscaling relies primarily on CPU and memory utilization. KEDA extends this by enabling you to scale workloads based on events, schedules, and business metrics, allowing you to align infrastructure capacity more closely with actual demand.

## Benefits

KEDA provides these advantages for Fusion workloads:

* **Event-driven autoscaling** - Scale in response to external signals such as Prometheus metrics, queue depth, or pipeline execution load.
* **Scheduled scaling** - Automatically scale workloads based on predictable traffic patterns, such as business hours or peak usage windows.
* **Scale-to-zero capability** - Reduce infrastructure costs by scaling workloads down to zero during off-hours or periods of inactivity.
* **Improved operational efficiency** - Align infrastructure capacity with real business demand instead of relying solely on CPU or memory thresholds.

<Frame caption="KEDA combines scheduled scaling (cron trigger) with reactive scaling (CPU trigger) to align infrastructure capacity with actual demand throughout the day.">
  <img src="https://mintcdn.com/lucidworks/XQs4aU_9BEAAXe_9/assets/images/diagrams/keda_autoscaling_24h.png?fit=max&auto=format&n=XQs4aU_9BEAAXe_9&q=85&s=3922cfeb53bcc04c2c758dfa908dc860" alt="KEDA autoscaling over a 24-hour period" width="2400" height="1200" data-path="assets/images/diagrams/keda_autoscaling_24h.png" />
</Frame>

## Supported services

The following Fusion services support KEDA autoscaling:

* `api-gateway`
* `query-pipeline`
* `fusion-indexing`

## Before you begin

Before configuring KEDA for Fusion services, ensure you have:

* KEDA version 2.18.3 or later installed in your Kubernetes cluster
* Access to the Fusion Helm charts
* Ability to provide a custom values file for your Fusion deployment
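
To check the version requirement above, you can compare the installed KEDA version (for example, the APP VERSION column reported by `helm list -n keda`) against the minimum. The helper below is a minimal sketch: the function name `keda_version_ok` is hypothetical, and it assumes GNU `sort -V` is available for version ordering.

```bash
#!/usr/bin/env bash
# Minimum KEDA version stated in the prerequisites.
MIN_KEDA_VERSION="2.18.3"

# keda_version_ok INSTALLED
# Hypothetical helper: succeeds (exit status 0) when INSTALLED >= MIN_KEDA_VERSION.
# A leading "v" is stripped; ordering relies on GNU `sort -V` (version sort).
keda_version_ok() {
  local installed="${1#v}"
  local lowest
  lowest="$(printf '%s\n%s\n' "$MIN_KEDA_VERSION" "$installed" | sort -V | head -n 1)"
  # If the minimum sorts first (or both are equal), the installed version is new enough.
  [ "$lowest" = "$MIN_KEDA_VERSION" ]
}

# Example: pass the version string you observed in your cluster.
if keda_version_ok "2.18.3"; then
  echo "KEDA version is supported"
else
  echo "KEDA version is too old"
fi
```

Plain string comparison would rank `2.9.0` above `2.18.3`, which is why the sketch uses version sort instead.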

If KEDA isn't already installed in your cluster, install it using Helm:

```bash theme={"dark"}
# Add the KEDA Helm repository
helm repo add kedacore https://kedacore.github.io/charts

# Update your local Helm chart repository cache
helm repo update

# Install KEDA
helm install keda kedacore/keda --version 2.18.3 --namespace keda --create-namespace
```

For more information, see the [official KEDA documentation](https://keda.sh/docs/latest/deploy/).

## How KEDA autoscaling works

Fusion Helm charts support two mutually exclusive autoscaling mechanisms per workload:

1. **Kubernetes HPA** (default) - Scales based on CPU and memory metrics.
2. **KEDA ScaledObject** - Scales based on events, schedules, and custom metrics.

<Warning>
  Only one autoscaling mechanism can be active per workload at a time. You can't use both HPA and KEDA for the same service.
</Warning>

However, you can mix autoscaling approaches across different services. For example, use HPA for `api-gateway` and KEDA for `query-pipeline`.

### Autoscaling parameters

Each Fusion service has three key autoscaling parameters:

| Parameter                  | Description                         | Default |
| -------------------------- | ----------------------------------- | ------- |
| `autoscaling.enabled`      | Master switch to enable autoscaling | `false` |
| `autoscaling.hpa.enabled`  | Enables HPA autoscaling             | `true`  |
| `autoscaling.keda.enabled` | Enables KEDA autoscaling            | `false` |

### Default behavior

When `autoscaling.enabled` is `true` and you don't modify other settings, Fusion creates an HPA by default. This preserves backward compatibility with existing deployments.

### Mutual exclusion

The Helm chart enforces these rules:

* If `autoscaling.enabled` is `false` → No autoscaling (neither HPA nor KEDA), regardless of the other two settings
* If `autoscaling.enabled` is `true`, `hpa.enabled` is `true`, and `keda.enabled` is `false` → HPA autoscaling
* If `autoscaling.enabled` is `true`, `hpa.enabled` is `false`, and `keda.enabled` is `true` → KEDA autoscaling
* If both `hpa.enabled` and `keda.enabled` are `true` → No autoscaling (conflict resolution)
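
The rules above can be read as a small decision table. The following sketch expresses them as a shell function so you can sanity-check a planned combination of values before deploying. The function name `autoscaler_mode` is hypothetical and is not part of the Fusion charts; the case where both `hpa.enabled` and `keda.enabled` are `false` is not covered by the rules, so treating it as "none" is an assumption.

```bash
#!/usr/bin/env bash
# autoscaler_mode AUTOSCALING_ENABLED HPA_ENABLED KEDA_ENABLED
# Hypothetical helper: prints "hpa", "keda", or "none" for a combination of values.
autoscaler_mode() {
  local enabled="$1" hpa="$2" keda="$3"
  if [ "$enabled" != "true" ]; then
    echo "none"   # master switch is off
  elif [ "$hpa" = "true" ] && [ "$keda" != "true" ]; then
    echo "hpa"
  elif [ "$hpa" != "true" ] && [ "$keda" = "true" ]; then
    echo "keda"
  else
    # Both true is the documented conflict case.
    # Both false is assumed (not documented) to mean no autoscaling.
    echo "none"
  fi
}

autoscaler_mode true true false    # chart defaults with autoscaling on, prints: hpa
autoscaler_mode true false true    # prints: keda
autoscaler_mode true true true     # conflict, prints: none
```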

<Info>
  KEDA implements scaling for a ScaledObject by creating and managing its own HPA, named with the prefix `keda-hpa-`. This is expected behavior and indicates that KEDA is managing autoscaling correctly.
</Info>

## Enable KEDA

You can enable KEDA for the `api-gateway`, `query-pipeline`, or `fusion-indexing` services by editing the Fusion values file before a Fusion deployment or upgrade, as described in the following sections.

### Update the Fusion values file

The configuration for each Fusion service is the same, as shown below:

<CodeGroup>
  ```yaml api-gateway theme={"dark"}
  api-gateway:
    autoscaling:
      enabled: true
      hpa:
        enabled: false  # Disable HPA
      keda:
        enabled: true   # Enable KEDA
  ```

  ```yaml query-pipeline theme={"dark"}
  query-pipeline:
    autoscaling:
      enabled: true
      hpa:
        enabled: false  # Disable HPA
      keda:
        enabled: true   # Enable KEDA
  ```

  ```yaml fusion-indexing theme={"dark"}
  fusion-indexing:
    autoscaling:
      enabled: true
      hpa:
        enabled: false  # Disable HPA
      keda:
        enabled: true   # Enable KEDA
  ```
</CodeGroup>

### Deploy or upgrade Fusion

After preparing your custom values file (for example, `fusion-values.yaml`), deploy or upgrade Fusion:

```bash theme={"dark"}
helm upgrade --install <RELEASE_NAME> <FUSION_CHART_PATH> -f fusion-values.yaml
```

Replace `<RELEASE_NAME>` and `<FUSION_CHART_PATH>` with your specific values.

<Tip>
  To find your existing Fusion release name, run `helm list -n <namespace>`. The chart path can be a repository reference (for example, `lucidworks/fusion`) or a local path to the chart directory.
</Tip>

### Verify the configuration

After deployment, verify that KEDA is active for the configured services.

<AccordionGroup>
  <Accordion title="Verify api-gateway">
    Check for the ScaledObject:

    ```bash theme={"dark"}
    kubectl get scaledobject -n <FUSION_NAMESPACE> api-gateway
    ```

    The `api-gateway` ScaledObject should appear in the output.

    Verify that no conflicting HPA exists:

    ```bash theme={"dark"}
    kubectl get hpa -n <FUSION_NAMESPACE> api-gateway
    ```

    No HPA should exist for `api-gateway`, except those prefixed with `keda-hpa-` (which are managed by KEDA).
  </Accordion>

  <Accordion title="Verify query-pipeline">
    Check for the ScaledObject:

    ```bash theme={"dark"}
    kubectl get scaledobject -n <FUSION_NAMESPACE> query-pipeline
    ```

    The `query-pipeline` ScaledObject should appear in the output.

    Verify that no conflicting HPA exists:

    ```bash theme={"dark"}
    kubectl get hpa -n <FUSION_NAMESPACE> query-pipeline
    ```

    No HPA should exist for `query-pipeline`, except those prefixed with `keda-hpa-` (which are managed by KEDA).
  </Accordion>

  <Accordion title="Verify fusion-indexing">
    Check for the ScaledObject:

    ```bash theme={"dark"}
    kubectl get scaledobject -n <FUSION_NAMESPACE> fusion-indexing
    ```

    The `fusion-indexing` ScaledObject should appear in the output.

    Verify that no conflicting HPA exists:

    ```bash theme={"dark"}
    kubectl get hpa -n <FUSION_NAMESPACE> fusion-indexing
    ```

    No HPA should exist for `fusion-indexing`, except those prefixed with `keda-hpa-` (which are managed by KEDA).
  </Accordion>
</AccordionGroup>

<Info>
  The HPA that KEDA creates is named `keda-hpa-` followed by your release name and service name. For example: `keda-hpa-fusion-query-pipeline`. This HPA is expected and is managed by KEDA.
</Info>
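
If you enabled KEDA for more than one service, the per-service checks above can be combined into a single loop. This is a minimal sketch: the function name `check_keda` and the `KUBECTL` override variable are hypothetical conveniences, and the resource names are assumed to match the service names as in the commands above.

```bash
#!/usr/bin/env bash
# check_keda NAMESPACE SERVICE...
# Hypothetical helper: runs the ScaledObject and HPA checks for each service.
# Set KUBECTL="echo kubectl" to print the commands instead of running them.
check_keda() {
  local ns="$1"
  shift
  local svc
  for svc in "$@"; do
    # A ScaledObject is expected for every KEDA-enabled service.
    ${KUBECTL:-kubectl} get scaledobject -n "$ns" "$svc"
    # A chart-managed HPA with the plain service name is not expected.
    ${KUBECTL:-kubectl} get hpa -n "$ns" "$svc"
  done
}

# Usage (replace <FUSION_NAMESPACE> with your namespace):
#   check_keda <FUSION_NAMESPACE> api-gateway query-pipeline fusion-indexing
```

For a KEDA-enabled service, the first command should list the ScaledObject and the second should report that no HPA with that name was found.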

### Custom configuration parameters

You can customize the scaling behavior, metadata, and the triggers KEDA monitors using the parameters in the table below.

In each parameter name, replace `[service-name]` with `api-gateway`, `query-pipeline`, or `fusion-indexing`.

| Parameter                                               | Description                                                              | Default  |
| ------------------------------------------------------- | ------------------------------------------------------------------------ | -------- |
| `[service-name].autoscaling.keda.labels`                | Labels to add to the ScaledObject.                                       | `{}`     |
| `[service-name].autoscaling.keda.annotations`           | Annotations to add to the ScaledObject.                                  | `{}`     |
| `[service-name].autoscaling.keda.pollingInterval`       | Interval in seconds at which KEDA checks triggers.                       | `30`     |
| `[service-name].autoscaling.keda.cooldownPeriod`        | Seconds to wait after the last active trigger before scaling to zero.    | `300`    |
| `[service-name].autoscaling.keda.initialCooldownPeriod` | Seconds after ScaledObject creation before `cooldownPeriod` starts.      | `0`      |
| `[service-name].autoscaling.keda.idleReplicaCount`      | Replicas to maintain when triggers are inactive (enables scale-to-zero). | Disabled |
| `[service-name].autoscaling.keda.minReplicas`           | Minimum number of replicas.                                              | `1`      |
| `[service-name].autoscaling.keda.maxReplicas`           | Maximum number of replicas.                                              | `5`      |
| `[service-name].autoscaling.keda.advanced`              | Advanced HPA behavior configuration.                                     | `{}`     |
| `[service-name].autoscaling.keda.fallback`              | Replica count to fall back to when a trigger repeatedly fails.           | `{}`     |
| `[service-name].autoscaling.keda.triggers`              | List of KEDA triggers (required).                                        | `[]`     |

<Warning>
  At least one trigger is required when KEDA is enabled.
  Without triggers, KEDA can't determine when to scale your workload.
</Warning>

This example demonstrates a custom KEDA configuration for `query-pipeline` with these components:

* **Cron trigger** - Holds at least 6 replicas during business hours (Monday-Friday, 7:00 AM - 6:00 PM Central Time).
* **CPU trigger** - Adds replicas, up to the `maxReplicas` limit of 15, when average CPU utilization exceeds 60%.

```yaml expandable theme={"dark"}
query-pipeline:
  autoscaling:
    enabled: true
    hpa:
      enabled: false
    keda:
      enabled: true
      pollingInterval: 40  # Check triggers every 40 seconds
      minReplicas: 2
      maxReplicas: 15
      triggers:
        # Scheduled scaling
        - type: cron
          metadata:
            timezone: "America/Chicago"
            start: "0 7 * * 1-5"      # 7:00 AM weekdays
            end: "0 18 * * 1-5"       # 6:00 PM weekdays
            desiredReplicas: "6"
        # Reactive scaling
        - type: cpu
          metricType: Utilization
          metadata:
            value: "60"  # Scale when CPU exceeds 60%
```

For more information about available KEDA triggers, see the [KEDA scalers documentation](https://keda.sh/docs/latest/scalers/).
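
To build intuition for how the two triggers in the example combine, the sketch below reproduces the arithmetic in simplified form. It assumes the standard Kubernetes HPA rule for utilization metrics (current replicas × current utilization ÷ target, rounded up) and that the highest recommendation across triggers wins, clamped to `minReplicas` and `maxReplicas`. It ignores the HPA tolerance band and stabilization windows, and it represents an inactive cron window as 0, which is a simplification. The function names are hypothetical.

```bash
#!/usr/bin/env bash
# cpu_desired CURRENT_REPLICAS CURRENT_UTIL_PERCENT TARGET_UTIL_PERCENT
# Ceiling of current * utilization / target, using integer arithmetic.
cpu_desired() {
  echo $(( ($1 * $2 + $3 - 1) / $3 ))
}

# effective_replicas CRON_DESIRED CPU_DESIRED MIN_REPLICAS MAX_REPLICAS
# Takes the larger recommendation, then clamps it to the configured bounds.
effective_replicas() {
  local n="$1"
  if [ "$2" -gt "$n" ]; then n="$2"; fi
  if [ "$n" -lt "$3" ]; then n="$3"; fi
  if [ "$n" -gt "$4" ]; then n="$4"; fi
  echo "$n"
}

# Business hours: 6 replicas running at 90% CPU against the 60% target.
cpu="$(cpu_desired 6 90 60)"          # 9
effective_replicas 6 "$cpu" 2 15      # prints: 9

# Off-hours: 2 replicas at 30% CPU, cron window inactive (0 in this sketch).
cpu="$(cpu_desired 2 30 60)"          # 1
effective_replicas 0 "$cpu" 2 15      # prints: 2
```

In the first case the CPU trigger asks for more than the cron schedule, so it wins. In the second case both recommendations fall below `minReplicas`, so the floor of 2 applies.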

## Troubleshooting

This section provides troubleshooting steps for some common issues with KEDA configuration.

<AccordionGroup>
  <Accordion title="No ScaledObject created">
    1. Verify that `autoscaling.enabled` is `true`.
    2. Verify that `autoscaling.keda.enabled` is `true`.
    3. Verify that `autoscaling.hpa.enabled` is `false`.
    4. Check that KEDA is installed: `kubectl get pods -n keda`.
    5. Review Helm deployment logs for errors.
  </Accordion>

  <Accordion title="Both HPA and ScaledObject exist for the same workload">
    1. Check your configuration for conflicting settings.
    2. Ensure that `hpa.enabled` and `keda.enabled` aren't both set to `true`.
    3. Delete the unwanted resource manually if necessary.
  </Accordion>

  <Accordion title="KEDA not scaling as expected">
    1. Verify that triggers are configured correctly.
    2. Check KEDA operator logs: `kubectl logs -n keda -l app=keda-operator`
    3. Describe the ScaledObject to see its status: `kubectl describe scaledobject <name> -n <namespace>`
    4. Verify that the trigger source (metrics endpoint, queue, and so on) is accessible.
  </Accordion>
</AccordionGroup>