Run stages asynchronously
Some stages allow asynchronous processing for work that may take a while, keeping the overall pipeline moving. Fusion does not block the request thread while the stage is waiting. Instead, it waits for the stage's result only at the point where it is needed to finish the response.

When to use asynchronous processing
Use asynchronous processing when:
- The stage calls out to a remote system.
- The stage reads or writes large files or blobs.
- The stage can run in parallel with other stages.

Avoid asynchronous processing when:
- The stage does only local work and typically finishes in under a couple of milliseconds.
- The stage coordinates tasks that must happen one at a time.
Benefits of asynchronous processing
Asynchronous processing provides:
- Lower latency by overlapping slower calls with other work.
- Greater resilience because if one asynchronous call is slow or errors out, Fusion can time it out without stalling the whole pipeline.
- Easier error recognition and resolution because timeouts and cancellations are scoped to the stage.
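These benefits follow from a general pattern: start slow work early and await its result only at the point it is needed. A minimal sketch of that pattern in Python, as an illustration of the general technique only (the stage names are hypothetical; this is not Fusion's internal implementation):

```python
import asyncio

async def remote_stage(name: str, delay: float) -> str:
    # Stand-in for a stage that calls out to a remote system.
    await asyncio.sleep(delay)
    return f"{name} result"

async def pipeline() -> tuple[str, str]:
    # Start the slow stage without blocking the rest of the pipeline...
    slow = asyncio.create_task(remote_stage("enrich", 0.05))
    local = "local stage result"   # ...other stages keep running meanwhile.
    return local, await slow       # Wait only when the result is needed.

local, enriched = asyncio.run(pipeline())
```

The slow call and the local work overlap, so total latency approaches the slower of the two rather than their sum.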
Configure a Lucidworks AI Gateway integration
- On the Integrations tab, click your integration. If you don’t see your integration, contact your Lucidworks representative.
- Download or copy the YAML code and paste it into a file called account.yaml. The file for a single integration should look similar to this one:

  For a configuration with multiple integrations, it should look like this:

  Non-admin users must have the following permissions to use Lucidworks AI integrations: PUT,POST,GET:/LWAI-ACCOUNT-NAME/**, where LWAI-ACCOUNT-NAME must match the value of fusion.lwai.account[n].name in the integration YAML.
- Apply the file to your Fusion configuration. For example:
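The exact YAML is produced by the Integrations tab and is not reproduced here. As a minimal hypothetical sketch of the shape implied by the fusion.lwai.account[n].name property referenced above (the account name and structure are placeholders, not the actual file):

```yaml
fusion:
  lwai:
    account:
      # name must match the LWAI-ACCOUNT-NAME used in the
      # PUT,POST,GET:/LWAI-ACCOUNT-NAME/** permission
      - name: LWAI-ACCOUNT-NAME
```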
Managed Fusion
These Managed Fusion pipeline stages provide interfaces for configuring your Lucidworks AI models in Managed Fusion pipelines.
- Vector search stages. These are the foundational stages that enable vector search features.
- Chunker stage. This stage asynchronously breaks down large text documents and vectorizes the chunks.
- RAG Bridge stage. This stage extracts relevant snippets from source documents used in RAG responses.
- Generative AI (Gen-AI) stages. See Generative AI for conceptual information about Gen-AI features.
Configure the LWAI Prediction index stage
Non-admin Fusion users must have the PUT,POST,GET:/LWAI-ACCOUNT-NAME/** permission in Fusion, where LWAI-ACCOUNT-NAME is the Lucidworks AI API Account Name defined in Lucidworks AI Gateway when this stage is configured.

To configure this stage:
- Sign in to Fusion and click Indexing > Index Pipelines.
- Click Add+ to add a new pipeline.
- Enter the name in Pipeline ID.
- Click Add a new pipeline stage.
- In the AI section, click LWAI Prediction.
- In the Label field, enter a unique identifier for this stage.
- In the Condition field, enter a script that results in true or false, which determines if the stage should process.
- In the Account Name field, select the Lucidworks AI API account name defined in Lucidworks AI Gateway.
- In the Use Case field, select the Lucidworks AI use case to associate with this stage.
- To generate a list of the use cases for your organization, see Use Case API.
- If the Call Asynchronously? check box is selected, see available use cases described in Async Prediction API.
- If the Call Asynchronously? check box is not selected, see available use cases described in Prediction API.
- In the Model field, select the Lucidworks AI model to associate with this stage.
  Your Fusion account name must match the name of the account that you selected in the Account Name dropdown.
  For more information about models, see:
- In the Input context variable field, enter the name of the context variable to be used as input. Template expressions are supported.
- In the Destination field name and context output field, enter the name that will be used as both the field name in the document where the prediction is written and the context variable that contains the prediction.
  - If the Call Asynchronously? check box is selected and a value is entered in this field:
    - {destination name}_t is the full response.
    - In the document:
      - _lw_ai_properties_ss contains the Lucidworks account, the boolean setting for async, the use case, the input for the call, and the collection.
      - _lw_ai_request_count is the number of GET requests by predictionId, and _lw_ai_success_count is the number of responses without errors. These two fields are used for debugging only. The most useful measure is the ratio of _lw_ai_success_count to _lw_ai_request_count; tune the deployment so this ratio is as close as possible to 1.0.
      - enriched_ss contains the use case. This can be used as a boolean value to verify that the use case indexed successfully.
  - If the Call Asynchronously? check box is not selected and a value is entered in this field:
    - {destination name}_t is the full response.
  - If no value is entered in this field (regardless of the Call Asynchronously? check box setting):
    - _lw_ai_{use case}_t is the response.response object, which is the raw model output.
    - _lw_ai_{use case}_response_s is the full response.
- In the Use Case Configuration section, click the + sign to enter the parameter name and value to send to Lucidworks AI. The useCaseConfig parameter is only applicable to certain use cases.
  - If the Call Asynchronously? check box is selected, useCaseConfig information for each applicable use case is described in Async Prediction API.
  - If the Call Asynchronously? check box is not selected, useCaseConfig information for each applicable use case is described in Prediction API.
- In the Model Configuration section, click the + sign to enter the parameter name and value to send to Lucidworks AI. Several modelConfig parameters are common to generative AI use cases.
  - If the Call Asynchronously? check box is selected, modelConfig information is described in Async Prediction API.
  - If the Call Asynchronously? check box is not selected, modelConfig information is described in Prediction API.
- In the API Key field, enter the secret value specified in the external model.
  - For OpenAI models, "apiKey" is the value in the model's "[OPENAI_API_KEY]" field. For more information, see Authentication API keys.
  - For Azure OpenAI models, "apiKey" is the value generated by Azure in either the model's KEY1 or KEY2 field. For requirements to use Azure models, see Generative AI models.
  - For Google VertexAI models, "apiKey" is the value in the model's "[BASE64_ENCODED_GOOGLE_SERVICE_ACCOUNT_KEY]" field. For more information, see Create and delete Google service account keys.
- To run the API call asynchronously, select the Call Asynchronously? check box to specify that the stage uses the Lucidworks AI Async Prediction API endpoints. If this is selected, the API call does not block the pipeline while waiting for a response from Lucidworks AI. If the check box is not selected, the API call uses the Prediction API, which blocks the pipeline until a response is received from Lucidworks AI, and performance of other API calls can be impacted.
- In the Maximum Asynchronous Call Tries field, enter the maximum number of times to send an asynchronous API call before the system generates a failure error.
- Select the Fail on Error checkbox to generate an exception if an error occurs while generating a prediction for a document.
- Click Save.
Additional requirements
Additional requirements to use async calls include:
- Use a V2 connector. Only V2 connectors work for this task; other options, such as PBL or V1 connectors, do not.
- Remove the Apache Tika stage from your parser because it can cause datasource failures with the following error: "The following components failed: [class com.lucidworks.connectors.service.components.job.processor.DefaultDataProcessor : Only Tika Container parser can support Async Parsing.]"
- Replace the Solr Indexer stage with the Solr Partial Update Indexer stage with the following settings:
  - Enable Concurrency Control set to off
  - Reject Update if Solr Document is not Present set to off
  - Process All Pipeline Doc Fields set to on
  - Allow reserved fields set to on
  - A parameter with Update Type, Field Name & Value in Updates
Configure the LWAI Prediction query stage
Non-admin Fusion users must have the PUT,POST,GET:/LWAI-ACCOUNT-NAME/** permission in Fusion, where LWAI-ACCOUNT-NAME is the Lucidworks AI API Account Name defined in Lucidworks AI Gateway when this stage is configured.

To configure this stage:
- Sign in to Fusion and click Querying > Query Pipelines.
- Click Add+ to add a new pipeline.
- Enter the name in Pipeline ID.
- Click Add a new pipeline stage.
- In the AI section, click LWAI Prediction.
- In the Label field, enter a unique identifier for this stage.
- In the Condition field, enter a script that results in true or false, which determines if the stage should process.
- Select Asynchronous Execution Config if you want to run this stage asynchronously. If this field is enabled, complete the following fields:
  - Select Enable Async Execution. Fusion automatically assigns an Async ID value to this stage. Change this to a more memorable string that describes the asynchronous stages you are merging, such as signals or access_control.
  - Copy the Async ID value.
- In the Account Name field, enter your Lucidworks AI API Account Name as defined in the Lucidworks AI Gateway Service.
- In the Use Case field, select the Lucidworks AI use case to associate with this stage.
  - To generate a list of the use cases for your organization, see Use Case API.
  - The available use cases are described in Prediction API.
- In the Use Case Configuration section, click the + sign to enter the parameter name and value to send to Lucidworks AI.
  - The useCaseConfig parameter is only applicable to certain use cases. For more information, see the Async Prediction API and the Prediction API.
  - The memoryUuid parameter is required in the Standalone Query Rewriter use case, and is optional in the RAG use case. For more information, see Prediction API.
- In the Model field, select the Lucidworks AI model to associate with this stage.
  If you do not see any model names and you are a non-admin Fusion user, verify with a Fusion administrator that your user account has these permissions: PUT,POST,GET:/LWAI-ACCOUNT-NAME/**
  Your Fusion account name must match the name of the account that you selected in the Account Name dropdown.
  For more information about models, see:
- In the Model Configuration section, click the + sign to enter the parameter name and value to send to Lucidworks AI. Several modelConfig parameters are common to generative AI use cases. For more information, see Prediction API.
- In the Input context variable field, enter the name of the context variable to be used as input. Template expressions are supported.
- In the Destination variable name and context output field, enter the name that will be used as both the query response header in the prediction results and the context variable that contains the prediction.
  - If a value is entered in this field:
    - {destination name}_t is the full response.
    - In the context, _lw_ai_properties_ss contains the Lucidworks account, the boolean setting for async, the use case, the input for the call, and the collection.
  - If no value is entered in this field:
    - _lw_ai_{use case}_t is the response.response object, which is the raw model output.
    - _lw_ai_{use case}_response_s is the full response.
- Grounding Options is only used for the RAG use case, and connects model output to the data source. This provides more trustworthy responses in a scalable, cost-efficient manner. If selected, enter appropriate values in the following options:
- In the Grounding Documents Location field, enter the location where response documents are stored for use cases that support grounding via attached documents.
- In the Grounding Documents Key field, enter the key in the context variable that contains the grounding documents. If the value of the Grounding Documents Location field is SolrResponse, the value in this field is ignored and the response documents are used.
- In the Number of Grounding Documents field, enter the number of documents to include in the RAG request.
- In the Document Field Mappings section, enter the LW AI Document field name and its corresponding Response document field name to map from input documents to the fields accepted by the Prediction API RAG use case. The fields are described in the Prediction API.
  If information is not entered in this section, the default mappings are used.
  - The body and source fields are required: body maps to description_t (the document content) and source maps to link_t (the URL/ID of the document).
  - The title and date fields are optional: title maps to title_t (the title of the document) and date maps to _lw_file_modified_tdt (the creation date of the document in epoch time format).
- Select the Fail on Error checkbox to generate an exception if an error occurs during this stage.
- In the API Key field, enter the secret value specified in the external model.
  - For OpenAI models, "apiKey" is the value in the model's "[OPENAI_API_KEY]" field. For more information, see Authentication API keys.
  - For Azure OpenAI models, "apiKey" is the value generated by Azure in either the model's KEY1 or KEY2 field. For requirements to use Azure models, see Generative AI models.
  - For Google VertexAI models, "apiKey" is the value in the model's "[BASE64_ENCODED_GOOGLE_SERVICE_ACCOUNT_KEY]" field. For more information, see Create and delete Google service account keys.
- Click Save.
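The default Document Field Mappings described above can be summarized as a plain dictionary (the constant name is ours; the field names come from the defaults listed in the step):

```python
# Default LW AI Document field -> Response document field mappings,
# used when the Document Field Mappings section is left empty.
DEFAULT_RAG_FIELD_MAPPINGS = {
    "body": "description_t",          # required: document content
    "source": "link_t",               # required: URL/ID of the document
    "title": "title_t",               # optional: document title
    "date": "_lw_file_modified_tdt",  # optional: date in epoch time format
}

required = {k for k in DEFAULT_RAG_FIELD_MAPPINGS if k in ("body", "source")}
```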
Configure the LWAI Vectorize pipeline
Configure the pipeline
To add the Lucidworks AI (LWAI) Vectorize index pipeline:
- Sign in to Managed Fusion and click Indexing > Index Pipelines.
- Select the default LWAI-vectorize pipeline.
- Configure the following stages included in the default pipeline.
Field Mapping
The Field Mapping stage customizes mapping of the fields in an index pipeline document to fields in the Solr schema.

To configure this stage for the index pipeline:
- In the Label field, enter a unique identifier for this stage or leave blank to use the default value.
- In the Condition field, enter a script that results in true or false, which determines if the stage should process, or leave blank.
- Select the Allow System Fields Mapping? checkbox to map system fields in this stage.
- In the Field Retention section, enter specific fields to either keep or delete.
- In the Field Value Updates section, enter specific fields and then designate the value to either add to the field, or set on the field. When a value is added, any values previously on the field are retained. When a value is set, any values previously on the field are overwritten by the new value entered.
- In the Field Translations section, enter specific fields to either move or copy to a different field. When a field is moved, the values from the source field are moved over to the target field and the source field is removed. When a field is copied, the values from the source field are copied over to the target field and the source field is retained.
- Select the Unmapped Fields checkbox to specify the operation on the fields not mapped in the previous sections. Select the Keep checkbox to keep all unmapped fields. This is the only option you need to select for the LWAI-vectorize stage.
- Click Save.
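The add/set and move/copy semantics described in the Field Value Updates and Field Translations steps can be sketched with plain dictionaries (function names are ours; Fusion's internal document model will differ):

```python
def set_value(doc: dict, field: str, value) -> None:
    doc[field] = [value]                      # set: previous values are overwritten

def add_value(doc: dict, field: str, value) -> None:
    doc.setdefault(field, []).append(value)   # add: previous values are retained

def copy_field(doc: dict, src: str, dst: str) -> None:
    doc[dst] = list(doc.get(src, []))         # copy: source field is retained

def move_field(doc: dict, src: str, dst: str) -> None:
    doc[dst] = doc.pop(src, [])               # move: source field is removed

doc = {"color": ["red"]}
add_value(doc, "color", "blue")      # color keeps "red" and gains "blue"
set_value(doc, "size", "L")          # size is overwritten with ["L"]
copy_field(doc, "size", "size_copy") # size is retained
move_field(doc, "color", "colour")   # color is removed
```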
Solr Dynamic Field Name Mapping
The Solr Dynamic Field Name Mapping stage maps pipeline document fields to Solr dynamic fields.
- In the Label field, enter a unique identifier for this stage or leave blank to use the default value.
- In the Condition field, enter a script that results in true or false, which determines if the stage should process, or leave blank.
- Select the Duplicate Single-Valued Fields as Multi-Valued Fields checkbox to enable indexing of field data into both single-valued and multi-valued Solr fields. For example, if this option is selected, the phone field is indexed into both the phone_s single-valued field and the phone_ss multi-valued field. If this option is not selected, the phone field is indexed into only the phone_s single-valued field.
- In the Field Not To Map section, enter the names of the fields that should not be mapped by this stage.
- Select the Text Fields Advanced Indexing checkbox to enable indexing of text data that doesn't exceed a specific maximum length into both tokenized and non-tokenized fields. For example, if this option is selected, the name text field with a value of John Smith is indexed into both the name_t and name_s fields, allowing relevant search using the name_t field (by matching a Smith query) and also proper faceting and sorting using the name_s field (using John Smith for sorting or faceting). If this option is not selected, the name text field is indexed into only the name_t text field by default.
- In the Max Length for Advanced Indexing of Text Fields field, enter a value that determines how many characters of the incoming text are indexed. For example, 100.
- Click Save.
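The suffix behavior in the examples above (phone indexed into phone_s/phone_ss, name into name_t/name_s) can be sketched as follows. This is a simplified illustration; the real stage's mapping rules are richer:

```python
def dynamic_targets(field: str, value: str, is_text: bool,
                    duplicate_multivalued: bool, advanced_text: bool,
                    max_len: int = 100) -> list[str]:
    """Return the dynamic Solr field names a value would be indexed into."""
    if is_text:
        targets = [f"{field}_t"]              # tokenized text field
        if advanced_text and len(value) <= max_len:
            targets.append(f"{field}_s")      # non-tokenized, for sort/facet
        return targets
    targets = [f"{field}_s"]                  # single-valued field
    if duplicate_multivalued:
        targets.append(f"{field}_ss")         # multi-valued duplicate
    return targets
```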
LWAI Vectorize Field
The LWAI Vectorize stage invokes a Lucidworks AI model to encode a string field to a vector representation. This stage is skipped if the field to encode doesn't exist or is null on the pipeline document.
- In the Label field, enter a unique identifier for this stage.
- In the Condition field, enter a script that results in true or false, which determines if the stage should process.
- In the Account Name field, select the Lucidworks AI API account name defined in Lucidworks AI Gateway.
  If you do not see your account name or you are unsure which one to select, contact the Managed Fusion team at Lucidworks.
- In the Model field, select the Lucidworks AI model to use for encoding.
  If you do not see your model name or you are unsure which one to select, contact the Managed Fusion team at Lucidworks.
  For more information, see:
- In the Source field, enter the name of the string field whose value should be submitted to the model for encoding. If the field is blank or does not exist, this stage is not processed. Template expressions are supported.
- In the Destination field, enter the name of the field where the vector value from the model response is saved.
  If a value is entered in this field, the following information is added to the document:
  - {Destination Field}_b is a boolean value that indicates whether the vector has been indexed.
  - {Destination Field} is the vector field.
- In the Use Case Configuration section, click the + sign to enter the parameter name and value to send to Lucidworks AI. The useCaseConfig parameter that is common to embedding use cases is dataType, but each use case may have other parameters. The value for the query stage is query.
- In the Model Configuration section, click the + sign to enter the parameter name and value to send to Lucidworks AI. Several modelConfig parameters are common to generative AI use cases. For more information, see Prediction API.
- Select the Fail on Error checkbox to generate an exception if an error occurs while generating a prediction for a document.
- Click Save.
- Index data using the new pipeline. Verify the vector field is indexed by confirming the field is present in documents.
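The verification in the last step can be scripted against fetched documents. A minimal sketch using the {Destination Field} and {Destination Field}_b outputs described above (the function name and plain-dict document shape are assumptions for illustration):

```python
def vector_indexed(doc: dict, dest_field: str) -> bool:
    """True when the vectorize stage wrote both the vector field and
    its companion boolean flag to the document."""
    return bool(doc.get(f"{dest_field}_b")) and dest_field in doc

ok = vector_indexed(
    {"body_vector": [0.1, 0.2], "body_vector_b": True},  # hypothetical doc
    "body_vector",
)
```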
Solr Indexer
The Solr Indexer stage transforms a Managed Fusion pipeline document into a Solr document, and sends it to Solr for indexing into a collection.

To configure this stage for the index pipeline:
- In the Label field, enter a unique identifier for this stage or leave blank to use the default value.
- In the Condition field, enter a script that results in true or false, which determines if the stage should process, or leave blank.
- Select the Map to Solr Schema checkbox to select and add static and dynamic fields to map in this stage.
- Select the Add a field listing all document fields checkbox to add the _lw_fields_ss multi-valued field to the document, which lists all fields that are being sent to Solr.
- In the Additional Date Formats section, enter date formats to include in this stage.
- In the Additional Update Request Parameters section, enter the parameter names and values to update the request parameters.
- Select the Buffer Documents and Send Them To Solr in Batches checkbox to process the documents in batches for this stage.
- In the Buffer Size field, enter the number of documents in a batch before sending the batch to Solr. If no value is specified, the default value for this search cluster is used.
- In the Buffer Flush Interval (milliseconds) field, enter the maximum number of milliseconds to hold the batch before sending the batch to Solr. If no value is specified, the default value for this search cluster is used.
- Select the Allow expensive request parameters checkbox to allow commit=true and optimize=true to be passed to Solr when specified as request parameters coming into this pipeline. Document commands that specify commit or optimize are still respected even if this checkbox is not selected.
- Select the Unmapped Fields Mapping checkbox to specify the information for all of the fields not mapped in the previous sections.
- In the Source Field, enter the name of the unmapped field to be mapped.
- In the Target Field, enter the name of the Solr field to which the unmapped field is mapped.
- In the Operation field, select how the field is mapped. The options are:
- Add the unmapped field to the Solr field.
- Copy the unmapped field to the Solr field and retain the value in the Source field.
- Delete the unmapped field.
- Keep the unmapped field and do not map it to a Solr field.
- Move (replace) the Solr field value with the unmapped field Source value and remove the value from the Source field.
- Set the value of the unmapped field to the value in the Solr field.
- Click Save.
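The Buffer Size and Buffer Flush Interval settings above describe a standard size-or-time batching policy. A sketch of that policy (class and parameter names are ours, not Fusion's):

```python
import time

class DocBuffer:
    """Flush a batch when it reaches `size` docs or `interval_ms` elapses."""

    def __init__(self, size: int, interval_ms: int, send) -> None:
        self.size = size
        self.interval = interval_ms / 1000.0
        self.send = send                      # callable that ships a batch to Solr
        self.docs: list = []
        self.last_flush = time.monotonic()

    def add(self, doc) -> None:
        self.docs.append(doc)
        if (len(self.docs) >= self.size
                or time.monotonic() - self.last_flush >= self.interval):
            self.flush()

    def flush(self) -> None:
        if self.docs:
            self.send(self.docs)
        self.docs = []
        self.last_flush = time.monotonic()

batches = []
buf = DocBuffer(size=2, interval_ms=60_000, send=batches.append)
for doc in ({"id": 1}, {"id": 2}, {"id": 3}):
    buf.add(doc)
buf.flush()   # ship the remainder at end of input
```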
Order the stages
For the pipeline to operate correctly, the stages must be in the following order:

When you have ordered the stages, click Save.

Self-hosted Fusion
These Fusion pipeline stages provide interfaces for configuring your Lucidworks AI models in Fusion pipelines.
- Vector search stages.
- Chunker stage. This stage asynchronously breaks down large text documents and vectorizes the chunks.
- RAG Bridge stage. This stage extracts relevant snippets from source documents used in RAG responses.
- Generative AI stages. See Generative AI for conceptual information about Gen-AI features.
Configure the LWAI Prediction index stage
Non-admin Fusion users must have the PUT,POST,GET:/LWAI-ACCOUNT-NAME/** permission in Fusion, where LWAI-ACCOUNT-NAME is the Lucidworks AI API Account Name defined in Lucidworks AI Gateway when this stage is configured.

To configure this stage:
- Sign in to Fusion and click Indexing > Index Pipelines.
- Click Add+ to add a new pipeline.
- Enter the name in Pipeline ID.
- Click Add a new pipeline stage.
- In the AI section, click LWAI Prediction.
- In the Label field, enter a unique identifier for this stage.
- In the Condition field, enter a script that results in true or false, which determines if the stage should process.
- In the Account Name field, select the Lucidworks AI API account name defined in Lucidworks AI Gateway.
- In the Use Case field, select the Lucidworks AI use case to associate with this stage.
- To generate a list of the use cases for your organization, see Use Case API.
- If the Call Asynchronously? check box is selected, see available use cases described in Async Prediction API.
- If the Call Asynchronously? check box is not selected, see available use cases described in Prediction API.
- In the Model field, select the Lucidworks AI model to associate with this stage.
  Your Fusion account name must match the name of the account that you selected in the Account Name dropdown.
  For more information about models, see:
- In the Input context variable field, enter the name of the context variable to be used as input. Template expressions are supported.
- In the Destination field name and context output field, enter the name that will be used as both the field name in the document where the prediction is written and the context variable that contains the prediction.
  - If the Call Asynchronously? check box is selected and a value is entered in this field:
    - {destination name}_t is the full response.
    - In the document:
      - _lw_ai_properties_ss contains the Lucidworks account, the boolean setting for async, the use case, the input for the call, and the collection.
      - _lw_ai_request_count is the number of GET requests by predictionId, and _lw_ai_success_count is the number of responses without errors. These two fields are used for debugging only. The most useful measure is the ratio of _lw_ai_success_count to _lw_ai_request_count; tune the deployment so this ratio is as close as possible to 1.0.
      - enriched_ss contains the use case. This can be used as a boolean value to verify that the use case indexed successfully.
  - If the Call Asynchronously? check box is not selected and a value is entered in this field:
    - {destination name}_t is the full response.
  - If no value is entered in this field (regardless of the Call Asynchronously? check box setting):
    - _lw_ai_{use case}_t is the response.response object, which is the raw model output.
    - _lw_ai_{use case}_response_s is the full response.
- In the Use Case Configuration section, click the + sign to enter the parameter name and value to send to Lucidworks AI. The useCaseConfig parameter is only applicable to certain use cases.
  - If the Call Asynchronously? check box is selected, useCaseConfig information for each applicable use case is described in Async Prediction API.
  - If the Call Asynchronously? check box is not selected, useCaseConfig information for each applicable use case is described in Prediction API.
- In the Model Configuration section, click the + sign to enter the parameter name and value to send to Lucidworks AI. Several modelConfig parameters are common to generative AI use cases.
  - If the Call Asynchronously? check box is selected, modelConfig information is described in Async Prediction API.
  - If the Call Asynchronously? check box is not selected, modelConfig information is described in Prediction API.
- In the API Key field, enter the secret value specified in the external model.
  - For OpenAI models, "apiKey" is the value in the model's "[OPENAI_API_KEY]" field. For more information, see Authentication API keys.
  - For Azure OpenAI models, "apiKey" is the value generated by Azure in either the model's KEY1 or KEY2 field. For requirements to use Azure models, see Generative AI models.
  - For Google VertexAI models, "apiKey" is the value in the model's "[BASE64_ENCODED_GOOGLE_SERVICE_ACCOUNT_KEY]" field. For more information, see Create and delete Google service account keys.
- To run the API call asynchronously, select the Call Asynchronously? check box to specify that the stage uses the Lucidworks AI Async Prediction API endpoints. If this is selected, the API call does not block the pipeline while waiting for a response from Lucidworks AI. If the check box is not selected, the API call uses the Prediction API, which blocks the pipeline until a response is received from Lucidworks AI, and performance of other API calls can be impacted.
- In the Maximum Asynchronous Call Tries field, enter the maximum number of times to send an asynchronous API call before the system generates a failure error.
- Select the Fail on Error checkbox to generate an exception if an error occurs while generating a prediction for a document.
- Click Save.
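The Maximum Asynchronous Call Tries setting above bounds a retry loop. A generic sketch of that behavior (the `fetch` callable and all names are hypothetical illustrations, not the Fusion implementation):

```python
import time

def poll_async_prediction(fetch, max_tries: int, delay_s: float = 1.0):
    """Call `fetch` until it returns a non-None result, or raise after
    `max_tries` attempts, mirroring the failure error described above."""
    for _ in range(max_tries):
        result = fetch()
        if result is not None:
            return result
        time.sleep(delay_s)
    raise TimeoutError(f"no async response after {max_tries} tries")

# Usage with a fake fetch that succeeds on the third attempt:
attempts = []
def fake_fetch():
    attempts.append(1)
    return {"response": "done"} if len(attempts) >= 3 else None

result = poll_async_prediction(fake_fetch, max_tries=5, delay_s=0)
```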
Additional requirements
Additional requirements to use async calls include:
- Use a V2 connector. Only V2 connectors work for this task; other options, such as PBL or V1 connectors, do not.
- Remove the Apache Tika stage from your parser because it can cause datasource failures with the following error: "The following components failed: [class com.lucidworks.connectors.service.components.job.processor.DefaultDataProcessor : Only Tika Container parser can support Async Parsing.]"
- Replace the Solr Indexer stage with the Solr Partial Update Indexer stage with the following settings:
  - Enable Concurrency Control set to off
  - Reject Update if Solr Document is not Present set to off
  - Process All Pipeline Doc Fields set to on
  - Allow reserved fields set to on
  - A parameter with Update Type, Field Name & Value in Updates
Configure the LWAI Prediction query stage
Non-admin Fusion users must have the PUT,POST,GET:/LWAI-ACCOUNT-NAME/** permission in Fusion, where LWAI-ACCOUNT-NAME is the Lucidworks AI API Account Name defined in Lucidworks AI Gateway when this stage is configured.

To configure this stage:
- Sign in to Fusion and click Querying > Query Pipelines.
- Click Add+ to add a new pipeline.
- Enter the name in Pipeline ID.
- Click Add a new pipeline stage.
- In the AI section, click LWAI Prediction.
- In the Label field, enter a unique identifier for this stage.
- In the Condition field, enter a script that results in true or false, which determines if the stage should process.
- Select Asynchronous Execution Config if you want to run this stage asynchronously. If this field is enabled, complete the following fields:
  - Select Enable Async Execution. Fusion automatically assigns an Async ID value to this stage. Change this to a more memorable string that describes the asynchronous stages you are merging, such as signals or access_control.
  - Copy the Async ID value.
- In the Account Name field, select the name of the Lucidworks AI integration defined when the integration was created.
- In the Use Case field, select the Lucidworks AI use case to associate with this stage.
  - To generate a list of the use cases for your organization, see Use Case API.
  - The available use cases are described in Prediction API.
- In the Model field, select the Lucidworks AI model to associate with this stage.
  If you do not see any model names and you are a non-admin Fusion user, verify with a Fusion administrator that your user account has these permissions: PUT,POST,GET:/LWAI-ACCOUNT-NAME/**
  Your Fusion account name must match the name of the account that you selected in the Account Name dropdown.
  For more information about models, see:
- In the Input context variable field, enter the name of the context variable to be used as input. Template expressions are supported.
-
In the Destination variable name and context output field, enter the name that will be used as both the query response header in the prediction results and the context variable that contains the prediction.
- If a value is entered in this field:
-
{destination name}_tis the full response. -
In the context:
_lw_ai_properties_sscontains the Lucidworks account, boolean setting for async, use case, input for the call, and the collection.
- If no value is entered in this field:
-
_lw_ai_{use case}_tis theresponse.responseobject, which is the raw model output. -
_lw_ai_{use case}_response_sis the full response.
-
Select the Include Response Documents? check box to include the response documents in the Lucidworks AI request. This option is only available for certain use cases. If this is selected, run the Solr Query stage to ensure documents exist before running the LWAI Prediction query stage.
Response documents must be included in the RAG use case, which supports attaching a maximum of 3 response documents. To prevent errors, enter all of the entries described in the Document Field Mappings section. -
In the Document Field Mappings section, enter the LW AI Document field name and its corresponding Response document field name to map from input documents to the fields accepted by the Prediction API RAG use case. The fields are described in the Prediction API.
If information is not entered in this section, the default mappings are used.
- The `body` and `source` fields are required.
  - `body` - `description_t` is the document content.
  - `source` - `link_t` is the URL/ID of the document.
- The `title` and `date` fields are optional.
  - `title` - `title_t` is the title of the document.
  - `date` - `_lw_file_modified_tdt` is the creation date of the document in epoch time format.
-
In the Use Case Configuration section, click the + sign to enter the parameter name and value to send to Lucidworks AI.
- The `useCaseConfig` parameter is only applicable to certain use cases. For more information, see the Async Prediction API and the Prediction API.
- The `memoryUuid` parameter is required in the Standalone Query Rewriter use case, and is optional in the RAG use case.
-
In the Model Configuration section, click the + sign to enter the parameter name and value to send to Lucidworks AI. Several `modelConfig` parameters are common to generative AI use cases. For more information, see Prediction API.
-
In the API Key field, enter the secret value specified in the external model. For:
- OpenAI models, `"apiKey"` is the value in the model's `"[OPENAI_API_KEY]"` field. For more information, see Authentication API keys.
- Azure OpenAI models, `"apiKey"` is the value generated by Azure in either the model's `"[KEY1]"` or `"[KEY2]"` field. For requirements to use Azure models, see Generative AI models.
- Google VertexAI models, `"apiKey"` is the value in the model's `"[BASE64_ENCODED_GOOGLE_SERVICE_ACCOUNT_KEY]"` field. For more information, see Create and delete Google service account keys.
- Select the Fail on Error checkbox to generate an exception if an error occurs during this stage.
- Click Save.
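For illustration, the default Document Field Mappings described above imply a response document shaped like the following. This is a sketch only: the field values are invented, and the check mirrors the rule that `body` and `source` are required while `title` and `date` are optional.

```javascript
// Illustration only: a response document shaped to match the default
// Document Field Mappings for the RAG use case. All values are invented.
var responseDoc = {
  description_t: "Body text of the document sent to the RAG use case.", // maps to body (required)
  link_t: "https://example.com/docs/1",                                 // maps to source (required)
  title_t: "Example document",                                          // maps to title (optional)
  _lw_file_modified_tdt: 1718000000000                                  // maps to date (optional), epoch time
};

// body and source are required, so fail fast if either mapping is empty.
var hasRequiredRagFields = Boolean(responseDoc.description_t) && Boolean(responseDoc.link_t);
console.log("required RAG fields present: " + hasRequiredRagFields);
```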
Configure the LWAI Vectorize pipeline
Configure the pipeline
To add the Lucidworks AI (LWAI) Vectorize index pipeline:
- Sign in to Fusion and click Indexing > Index Pipelines.
- Select the default LWAI-vectorize pipeline.
- Configure the following stages included in the default pipeline.
Field Mapping
The Field Mapping stage customizes the mapping of fields in an index pipeline document to fields in the Solr schema.

To configure this stage for the index pipeline:
- In the Label field, enter a unique identifier for this stage or leave blank to use the default value.
- In the Condition field, enter a script that results in true or false, which determines if the stage should process, or leave blank.
- Select the Allow System Fields Mapping? checkbox to map system fields in this stage.
- In the Field Retention section, enter specific fields to either keep or delete.
- In the Field Value Updates section, enter specific fields and then designate the value to either add to the field, or set on the field. When a value is added, any values previously on the field are retained. When a value is set, any values previously on the field are overwritten by the new value entered.
- In the Field Translations section, enter specific fields to either move or copy to a different field. When a field is moved, the values from the source field are moved over to the target field and the source field is removed. When a field is copied, the values from the source field are copied over to the target field and the source field is retained.
- Select the Unmapped Fields checkbox to specify the operation on the fields not mapped in the previous sections. Select the Keep checkbox to keep all unmapped fields. This is the only option you need to select for the LWAI-vectorize stage.
- Click Save.
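The add/set and move/copy behaviors described above can be sketched with plain objects. This is a sketch only: Fusion's pipeline documents are Java objects, and the helper names here are hypothetical.

```javascript
// Sketch of the Field Value Updates and Field Translations semantics
// described above, using plain objects in place of Fusion pipeline
// documents. Helper names are hypothetical.
function addValue(doc, field, value) {
  // "Add" retains any values previously on the field.
  doc[field] = (doc[field] === undefined ? [] : [].concat(doc[field])).concat([value]);
  return doc;
}

function setValue(doc, field, value) {
  // "Set" overwrites any values previously on the field.
  doc[field] = value;
  return doc;
}

function moveField(doc, source, target) {
  // "Move" transfers the values and removes the source field.
  doc[target] = doc[source];
  delete doc[source];
  return doc;
}

function copyField(doc, source, target) {
  // "Copy" transfers the values and retains the source field.
  doc[target] = doc[source];
  return doc;
}

var added = addValue({ tag: "a" }, "tag", "b");                  // tag: ["a", "b"]
var set = setValue({ tag: "a" }, "tag", "b");                    // tag: "b"
var moved = moveField({ author: "Ada" }, "author", "author_s");  // source removed
var copied = copyField({ author: "Ada" }, "author", "author_s"); // source retained
console.log(JSON.stringify({ added: added, set: set, moved: moved, copied: copied }));
```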
Solr Dynamic Field Name Mapping
The Solr Dynamic Field Name Mapping stage maps pipeline document fields to Solr dynamic fields.

- In the Label field, enter a unique identifier for this stage or leave blank to use the default value.
- In the Condition field, enter a script that results in true or false, which determines if the stage should process, or leave blank.
- Select the Duplicate Single-Valued Fields as Multi-Valued Fields checkbox to enable indexing of field data into both single-valued and multi-valued Solr fields. For example, if this option is selected, the `phone` field is indexed into both the `phone_s` single-valued field and the `phone_ss` multi-valued field. If this option is not selected, the `phone` field is indexed into only the `phone_s` single-valued field.
- In the Field Not To Map section, enter the names of the fields that should not be mapped by this stage.
- Select the Text Fields Advanced Indexing checkbox to enable indexing of text data that doesn't exceed a specific maximum length into both tokenized and non-tokenized fields. For example, if this option is selected, the `name` text field with a value of John Smith is indexed into both the `name_t` and `name_s` fields, allowing relevant search using the `name_t` field (by matching a Smith query) as well as proper faceting and sorting using the `name_s` field (using John Smith for sorting or faceting). If this option is not selected, the `name` text field is indexed into only the `name_t` text field by default.
- In the Max Length for Advanced Indexing of Text Fields field, enter a value that determines how many characters of the incoming text are indexed. For example, 100.
- Click Save.
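The length rule described above can be sketched as follows. This is an illustrative model only, not Fusion's implementation: the field names follow the `name_t`/`name_s` example above.

```javascript
// Sketch of the advanced-indexing behavior described above: text is always
// indexed into the tokenized *_t field, and additionally into the
// non-tokenized *_s field only when Text Fields Advanced Indexing is
// enabled and the value does not exceed the configured maximum length.
function dynamicFieldsFor(name, value, advancedIndexing, maxLength) {
  var fields = {};
  fields[name + "_t"] = value; // tokenized text field, always produced
  if (advancedIndexing && value.length <= maxLength) {
    fields[name + "_s"] = value; // non-tokenized field for faceting/sorting
  }
  return fields;
}

var shortFields = dynamicFieldsFor("name", "John Smith", true, 100);
var longFields = dynamicFieldsFor("name", new Array(201).join("x"), true, 100);
console.log(JSON.stringify(shortFields));      // both name_t and name_s
console.log(Object.keys(longFields).join(",")); // only name_t
```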
LWAI Vectorize Field
The LWAI Vectorize stage invokes a Lucidworks AI model to encode a string field to a vector representation. This stage is skipped if the field to encode doesn't exist or is null on the pipeline document.

- In the Label field, enter a unique identifier for this stage.
- In the Condition field, enter a script that results in true or false, which determines if the stage should process.
-
In the Account Name field, select the Lucidworks AI API account name defined in Lucidworks AI Gateway.
If your account name does not appear in the list or you are unsure which one to select, check your Lucidworks AI Gateway configuration. -
In the Model field, select the Lucidworks AI model to use for encoding.
If your model does not appear in the list or you are unsure which one to select, check your Lucidworks AI Gateway configuration.
For more information, see Prediction API.
- In the Source field, enter the name of the string field whose value should be submitted to the model for encoding. If the field is blank or does not exist, this stage is not processed. Template expressions are supported.
-
In the Destination field, enter the name of the field where the vector value from the model response is saved.
- If a value is entered in this field, the following information is added to the document:
  - `{Destination Field}_b` is a boolean indicating whether the vector has been indexed.
  - `{Destination Field}` is the vector field.
-
In the Use Case Configuration section, click the + sign to enter the parameter name and value to send to Lucidworks AI. The `useCaseConfig` parameter that is common to embedding use cases is `dataType`, but each use case may have other parameters. The value for the query stage is `query`.
In the Model Configuration section, click the + sign to enter the parameter name and value to send to Lucidworks AI. Several `modelConfig` parameters are common to generative AI use cases. For more information, see Prediction API.
- Select the Fail on Error checkbox to generate an exception if an error occurs while generating a prediction for a document.
- Click Save.
- Index data using the new pipeline. Verify the vector field is indexed by confirming the field is present in documents.
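The verification step above can be sketched in plain JavaScript: given a document returned from the collection, confirm the Destination field and its boolean companion are present. `body_v` is a hypothetical field name, and `doc` is a stand-in for a real indexed document.

```javascript
// Quick post-indexing check, sketched with a stand-in document. "body_v"
// is a hypothetical Destination field name; real documents come from a
// query against the collection.
var destination = "body_v";
var doc = { id: "doc-1", body_v: [0.12, -0.34, 0.56], body_v_b: true };

var hasVector = Array.isArray(doc[destination]) && doc[destination].length > 0;
var indexedFlag = doc[destination + "_b"] === true;
console.log(destination + " present: " + hasVector + ", indexed flag: " + indexedFlag);
```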
Solr Indexer
The Solr Indexer stage transforms a Fusion pipeline document into a Solr document, and sends it to Solr for indexing into a collection.

To configure this stage for the index pipeline:
- In the Label field, enter a unique identifier for this stage or leave blank to use the default value.
- In the Condition field, enter a script that results in true or false, which determines if the stage should process, or leave blank.
- Select the Map to Solr Schema checkbox to select and add static and dynamic fields to map in this stage.
-
Select the Add a field listing all document fields checkbox to add the `_lw_fields_ss` multi-valued field to the document, which lists all fields that are being sent to Solr.
- In the Additional Date Formats section, enter date formats to include in this stage.
- In the Additional Update Request Parameters section, enter the parameter names and values to update the request parameters.
- Select the Buffer Documents and Send Them To Solr in Batches checkbox to process the documents in batches for this stage.
- In the Buffer Size field, enter the number of documents in a batch before sending the batch to Solr. If no value is specified, the default value for this search cluster is used.
- In the Buffer Flush Interval (milliseconds) field, enter the maximum number of milliseconds to hold the batch before sending the batch to Solr. If no value is specified, the default value for this search cluster is used.
-
Select the Allow expensive request parameters checkbox to allow `commit=true` and `optimize=true` to be passed to Solr when specified as request parameters coming into this pipeline. Document commands that specify commit or optimize are still respected even if this checkbox is not selected.
-
Select the Unmapped Fields Mapping checkbox to specify the information for all of the fields not mapped in the previous sections.
- In the Source Field, enter the name of the unmapped field to be mapped.
- In the Target Field, enter the name of the Solr field to which the unmapped field is mapped.
- In the Operation field, select how the field is mapped. The options are:
- Add the unmapped field to the Solr field.
- Copy the unmapped field to the Solr field and retain the value in the Source field.
- Delete the unmapped field.
- Keep the unmapped field and do not map it to a Solr field.
- Move (replace) the Solr field value with the unmapped field Source value and remove the value from the Source field.
- Set the value of the unmapped field to the value in the Solr field.
- Click Save.
Order the stages
For the pipeline to operate correctly, the stages must be in the following order:

When you have ordered the stages, click Save.

Debugging
If your pipeline is not producing the desired result, these debugging tips can help you identify the source of the issue and resolve it. For general pipeline debugging, these procedures are helpful:
- View parameters to confirm that the right ones are being passed from one stage to the next.
- Enable Fail on Error to generate an exception when a pipeline error occurs.
- Verify field definitions to confirm that they are accurate.
- Check query parsers to ensure that the right ones are enabled.
- Confirm vector usage in queries to make sure that vectors are being generated and used.
View parameters
When debugging a pipeline, it helps to see the parameters that are being passed to or from each stage. There are several ways to expose those parameters:
Add a Logging stage to view parameters
- In Fusion UI, navigate to Indexing > Index Pipelines (for index pipelines) or Querying > Query Pipelines (for query pipelines).
- Click Add a new pipeline stage.
- Select Logging from the Troubleshooting section.
- In the Label field, enter a descriptive name (for example, “Debug After Vectorize”).
- Set the detailed property to `true` to print the full Request or PipelineDocument object.
- Place the Logging stage after the stage you want to debug.
- Click Save.
- Run your pipeline and check the appropriate log file for your pipeline type:
- Query pipelines: `https://FUSION_HOST/var/log/api/api.log`
- Index pipelines: `https://FUSION_HOST/var/log/api/fusion-indexing.log`
Use JavaScript to inspect context variables
- Add a JavaScript stage to your pipeline.
- Use the `ctx` variable to inspect context data:
- Check logs at `https://FUSION_HOST/var/log/api/api.log`.
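A minimal sketch of the kind of JavaScript stage body used to inspect context data. In Fusion, `ctx` and `logger` are injected by the pipeline; the stand-in definitions below exist only so the sketch runs outside Fusion, and depending on the Fusion version `ctx` may be a Java Map (exposing `keySet()`) rather than a plain object.

```javascript
// Stand-ins for the objects Fusion injects into a JavaScript stage.
var logger = { info: function (msg) { console.log(msg); } };
var ctx = { vector: [0.12, -0.34], otherVar: "example" }; // stand-in values

// Log each context variable name with a short preview of its value.
var lines = Object.keys(ctx).map(function (key) {
  var preview = JSON.stringify(ctx[key]);
  if (preview.length > 80) {
    preview = preview.substring(0, 80) + "...";
  }
  return "ctx." + key + " = " + preview;
});
lines.forEach(function (line) { logger.info(line); });
```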
Use debug info in Query Workbench
- Navigate to Querying > Query Workbench.
- Enter a test query and click Search.
- Click the Debug tab. The debug view displays the following information:
- Request parameters
- Pipeline stage execution details
- Response data including `responseHeader` and `debug.explain`
- Switch to View As: JSON to see the full response structure.
Enable Fail on Error
The Fail on Error setting determines whether silent failures can occur in your pipeline. By enabling Fail on Error during development, testing, and troubleshooting, you ensure that configuration issues, authentication problems, or model errors are immediately visible rather than producing incomplete or incorrect data that can be difficult to troubleshoot later.
Configure the Fail on Error setting in LWAI stages
- In Fusion UI, navigate to your pipeline (Index or Query).
- Click the LWAI stage you want to configure (for example, LWAI Vectorize Field or LWAI Vectorize Query).
- Locate the Fail on Error checkbox at the bottom of the stage configuration.
- Select the checkbox to enable the following behaviors:
- Stop pipeline processing and throw an exception on errors
- Get immediate feedback when LWAI models fail or are misconfigured
- Guarantee data quality by preventing indexing of documents without vectors
- Click Save.
Test the Fail on Error configuration
- Trigger an intentional error (for example, use an invalid model name or account).
- Verify that the pipeline fails (when Fail on Error is enabled) or continues (when Fail on Error is disabled).
- Review logs at `https://FUSION_HOST/var/log/api/api.log` for error messages.
Verify field definitions
For Neural Hybrid Search pipelines, accurate field definitions are critical. Use the following procedures to verify your field configuration:
Inspect indexed data for vector fields
- Navigate to Querying > Query Workbench.
- Enter a wildcard query: `*:*`
- In the fl (field list) parameter, add your vector field name (for example, `text_v`, `text_v_b`).
- Click Search.
- Verify the following conditions:
  - The vector field contains array values (for example, `[0.123, -0.456, ...]`).
  - The boolean field `{field}_b` is `true` for vectorized documents.
Verify that Vector Query Field matches Destination Field
- In your query pipeline, open your Neural Hybrid Query or Hybrid Query stage.
- Note the Vector Query Field value (for example, `text_v`).
- In your index pipeline, open your vectorization stage (LWAI Vectorize Field, LWAI Batch Vectorize, or Ray/Seldon Vectorize Field).
- Verify that the Destination field matches the Vector Query Field from your query pipeline.
Verify the Source field configuration
- In your vectorization index stage, verify that the Source field meets the following requirements:
- Points to a valid string field in your documents
- Uses correct template expression syntax if needed (for example, `<doc.getFirstFieldValue("title")>`)
- Test the configuration by indexing a sample document and confirming that the configured Source field has content.
Re-index data when vector fields are missing
- Stop the datasource job if it is running.
- Verify that your index pipeline includes a vectorization stage.
- Optional: Clear the collection if you want to start fresh.
- Re-run the datasource job.
- Verify that vector fields appear in the indexed documents.
Check query parsers
Neural Hybrid Search requires specific query parsers in your Solr configuration. Use the following procedures to verify and configure them:
Verify that required query parsers exist in solrconfig.xml
| Query Stage | Required Parser |
|---|---|
| Chunking Neural Hybrid Query | _lw_chunk_wrap |
| Neural Hybrid Query | neuralHybrid |
| Hybrid Query (5.9.9 and earlier) | xVecSim |
- In Fusion UI, navigate to System > Solr Admin.
- Select your collection.
- Click Files > solrconfig.xml.
- Search for `<queryParser` tags.
- Verify that you have entries similar to the following:
Add missing query parsers to solrconfig.xml
- In the solrconfig.xml editor, locate an appropriate insertion point. Typically, you can add parsers after other `<queryParser>` entries or before `</config>`.
- Add the following snippet:
- Click Save.
- Reload the collection by completing the following steps:
- Navigate to Collections.
- Select your collection.
- Click Reload.
Test the query parser configuration
- Navigate to Query Workbench.
- Run a test query using your hybrid query stage.
- Click the Debug tab and check for any parser-related errors.
- If the configuration is correct, the `debug.parsedquery` field displays the hybrid query.
Confirm vector usage in queries
To ensure that vectors are being used in your Neural Hybrid Search queries, use the following verification and troubleshooting procedures:
Check the query response for vector syntax
- Navigate to Querying > Query Workbench.
- Enter a test query and click Search.
- Click View As: JSON to see the full response.
- Look for the following indicators in the response:
  - `responseHeader.params.q`: Should contain vector query syntax (for example, `{!neuralHybrid ...}` or `{!knn ...}`)
  - `debug.explain`: Should reference vector similarity calculations
  - `debug.parsedquery`: Should show the parsed hybrid query
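Once the JSON response is saved, these indicators can also be checked programmatically. A sketch with a made-up response object (real responses come from Query Workbench's JSON view):

```javascript
// Sketch: checking a parsed query response for the vector-query
// indicators listed above. The response object is invented for
// illustration; {!knn ...} and {!neuralHybrid ...} are the markers
// the document describes.
var response = {
  responseHeader: { params: { q: "{!knn f=text_v topK=10}[0.12, -0.34]" } }
};

var q = (response.responseHeader.params || {}).q || "";
var usesVectorSyntax = q.indexOf("{!knn") !== -1 || q.indexOf("{!neuralHybrid") !== -1;
console.log("vector query syntax detected: " + usesVectorSyntax);
```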
Use Solr's explain feature to view scoring details
- In Query Workbench, add the debug parameter: `debug=true`
- Click Search.
- In the Debug tab, examine the explain field for each document.
- Look for the following information:
- Vector similarity scores
- Neural hybrid score calculations
- Both lexical and semantic components
Verify that vectors are available in pipeline context
- Add a JavaScript query stage after your LWAI Vectorize Query stage.
- Add the following code to the stage:
- Check logs at `https://FUSION_HOST/var/log/api/api.log`.
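A sketch of the kind of check such a stage might perform. In Fusion, `ctx` and `logger` are injected by the pipeline; the stand-ins below only make the sketch runnable outside Fusion. The context variable name `vector` is the documented default output of the LWAI Vectorize Query stage.

```javascript
// Stand-ins for the objects Fusion injects into a JavaScript query stage.
var logger = {
  info: function (m) { console.log(m); },
  warn: function (m) { console.log(m); }
};
var ctx = { vector: [0.12, -0.34, 0.56] }; // stand-in context

// Confirm the LWAI Vectorize Query stage produced a non-empty vector.
var vector = ctx["vector"];
var vectorPresent = Array.isArray(vector) && vector.length > 0;
if (vectorPresent) {
  logger.info("Vector present with " + vector.length + " dimensions");
} else {
  logger.warn("No vector in context; check the LWAI Vectorize Query stage configuration");
}
```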
Troubleshoot missing or incorrect vector information
| Issue | Possible solutions |
|---|---|
| No vector in response | Verify that the LWAI Vectorize Query stage runs before the hybrid stage. Verify that the Output context variable is set (default: vector). Ensure that LWAI Gateway is configured and accessible. |
| Query uses only lexical search | Verify that the hybrid query stage has the correct Vector Input (for example, <ctx.vector>). Verify that the Vector Query Field matches your indexed vector field. Ensure that the vectorization stage completed successfully. |