Spark versions in Fusion 5

Fusion 5 ships with different versions of Apache Spark depending on the release:
  • Fusion 5.9.10 and later: Ships with Spark 3.4.1 by default
  • Fusion 5.6.x through 5.9.9: Ships with Spark 3.2.2
  • Fusion 5.5.x: Ships with Spark 3.2.1
  • Fusion 5.4.x and earlier: Ships with Spark 2.4
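
The mapping above can be sketched as a small lookup. This is a minimal illustration only: the function name `default_spark_version` is hypothetical and is not part of Fusion, and the sketch assumes plain numeric version strings such as `5.9.10`.

```python
def default_spark_version(fusion_version: str) -> str:
    """Return the Spark version a Fusion 5 release ships with, per the list above.

    Hypothetical helper for illustration; not a Fusion API.
    """
    parts = [int(p) for p in fusion_version.split(".")]
    # Pad missing components so that "5.9" is treated as 5.9.0.
    major, minor, patch = (parts + [0, 0])[:3]
    if major != 5:
        raise ValueError("this sketch only covers Fusion 5 releases")
    if (minor, patch) >= (9, 10):
        return "3.4.1"  # Fusion 5.9.10 and later
    if minor >= 6:
        return "3.2.2"  # Fusion 5.6.x through 5.9.9
    if minor == 5:
        return "3.2.1"  # Fusion 5.5.x
    return "2.4"        # Fusion 5.4.x and earlier
```

Comparing `(minor, patch)` as a tuple avoids the string-comparison trap where `"5.9.10"` sorts before `"5.9.9"`.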

Configure the Spark version

In Fusion 5.9.12 and later, you can configure which Spark version to use. This flexibility helps maintain compatibility with legacy Python and Scala environments, especially for applications that depend on specific Spark runtime behaviors. To switch from Spark 3.4.1 to Spark 3.2.2, update the fusion-spark image for both the driver and the executor in the job-launcher configmap:
driver:
  container:
    image: example.com/dir/fusion-spark-3.2.2
executor:
  container:
    image: example.com/dir/fusion-spark-3.2.2
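
Because the fragment above sets the image in two places, a quick consistency check can catch a half-applied change. The sketch below assumes the fragment has already been parsed into a Python dictionary with the same keys. The helper names are hypothetical, and treating a driver/executor mismatch as an error is an assumption of this example, not a documented Fusion rule.

```python
def spark_images(config: dict) -> tuple:
    """Return the (driver, executor) image names from a parsed config fragment."""
    driver = config["driver"]["container"]["image"]
    executor = config["executor"]["container"]["image"]
    return driver, executor


def images_match(config: dict) -> bool:
    """True when the driver and executor point at the same image (assumed goal)."""
    driver, executor = spark_images(config)
    return driver == executor


# Same structure as the fragment above, written as a Python dict.
config = {
    "driver": {"container": {"image": "example.com/dir/fusion-spark-3.2.2"}},
    "executor": {"container": {"image": "example.com/dir/fusion-spark-3.2.2"}},
}

if not images_match(config):
    raise SystemExit("driver and executor images differ")
```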

Python version requirements

The Spark version you use determines the Python version required for custom Python jobs:
  • Spark 3.4.1: Requires Python 3.10
  • Spark 3.2.2: Compatible with Python 3.7.3
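
A custom Python job could verify its interpreter against the list above before doing any work. This is a sketch under stated assumptions: the names `PYTHON_FOR_SPARK` and `python_matches_spark` are hypothetical, and the check compares only the major and minor version (3.10 and 3.7), which may be stricter or looser than what your environment actually needs.

```python
import sys

# Python (major, minor) for each Spark version, taken from the list above.
PYTHON_FOR_SPARK = {
    "3.4.1": (3, 10),
    "3.2.2": (3, 7),
}


def python_matches_spark(spark_version, python_version=None):
    """Return True if the interpreter's major.minor matches the listed version.

    python_version defaults to the running interpreter; pass a tuple to test
    another version. Only major.minor is compared (an assumption of this sketch).
    """
    if python_version is None:
        python_version = sys.version_info[:2]
    expected = PYTHON_FOR_SPARK.get(spark_version)
    if expected is None:
        raise ValueError(f"no Python version listed for Spark {spark_version}")
    return tuple(python_version[:2]) == expected
```

For example, `python_matches_spark("3.4.1")` checks the running interpreter, while `python_matches_spark("3.2.2", (3, 7, 3))` checks an explicit version tuple.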

Azure Blob Storage compatibility

The Spark 3.4.1 runtime is incompatible with Azure Blob Storage when accessed via the deprecated wasbs:// protocol due to a Jetty version conflict. Use Spark 3.2.2 instead if your jobs rely on wasbs://.
In the long term, we recommend migrating to the abfs:// protocol for Azure Blob Storage access, which is fully supported in Spark 3.4.1.
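
To find out whether a job depends on wasbs://, you can scan the storage paths it reads or writes. The sketch below uses only the Python standard library. The function names and the example paths are hypothetical placeholders, not values from a real deployment.

```python
from urllib.parse import urlparse


def uses_wasbs(path: str) -> bool:
    """True if the path uses the deprecated wasbs:// scheme."""
    # urlparse lowercases the scheme, so "WASBS://..." is also detected.
    return urlparse(path).scheme == "wasbs"


def paths_using_wasbs(paths):
    """Return the subset of paths that still rely on wasbs://."""
    return [p for p in paths if uses_wasbs(p)]


# Hypothetical example paths, for illustration only.
job_paths = [
    "wasbs://container@account.blob.core.windows.net/data",
    "abfs://container@account.dfs.core.windows.net/data",
]

for path in paths_using_wasbs(job_paths):
    print(f"still on wasbs://, use Spark 3.2.2 or migrate: {path}")
```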
