# Get a model

> Get the details about a specific model.


## OpenAPI

````yaml /api-reference/saas/machine-learning-models.json get /customers/{CUSTOMER_ID}/ai/models/{MODEL_ID}
openapi: 3.1.0
info:
  title: Lucidworks AI Models API
  version: '1.0'
  description: >-
    The Lucidworks AI Models API is used to discover available pre-trained and
    custom models, as well as train and deploy those models.


    The endpoints require an authentication token with scope
    `machinelearning.model`.
  contact:
    name: Lucidworks
    url: https://lucidworks.com/
    email: support@lucidworks.com
  termsOfService: https://lucidworks.com/legal/developer-license-agreement/
  license:
    name: Lucidworks
    url: https://lucidworks.com/legal/developer-license-agreement/
servers:
  - url: https://api.lucidworks.com
security: []
tags:
  - name: Manage models
    description: View and create models.
  - name: Manage deployments
    description: View and create custom model deployments.
paths:
  /customers/{CUSTOMER_ID}/ai/models/{MODEL_ID}:
    parameters:
      - schema:
          type: string
        name: CUSTOMER_ID
        in: path
        required: true
        description: Unique identifier derived from confidential client information.
      - schema:
          type: string
        name: MODEL_ID
        in: path
        required: true
        description: >-
          Unique identifier for the model. Use `GET
          /customers/{CUSTOMER_ID}/ai/models` to get the list of models and
          their IDs.
    get:
      tags:
        - Manage models
      summary: Get a model
      description: Get the details about a specific model.
      operationId: get-modelId
      parameters:
        - schema:
            type: boolean
          in: query
          name: metrics
          description: Specifies whether information about the metrics is returned in the response.
      requestBody:
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/getModelId'
      responses:
        '200':
          description: OK
          content:
            application/json:
              schema:
                anyOf:
                  - $ref: '#/components/schemas/responseGetModelIdShared'
                  - $ref: '#/components/schemas/responseGetModelIdCustom'
components:
  schemas:
    getModelId:
      title: getModelId
      type: object
      description: >-
        GET
        https://api.lucidworks.dev/customers/{CUSTOMER_ID}/ai/models/{MODEL_ID}
      properties:
        id:
          type: string
          description: >-
            The identifier of the model. For:


            * Pre-trained models, the value options are `multilinguallm`,
            `text-encoder`, or `clip-encoder`.


            * Custom models, the value is the universally unique identifier
            (UUID) that is the primary key for the model.
      required:
        - id
    responseGetModelIdShared:
      title: responseGetModelIdShared
      type: object
      properties:
        id:
          type: string
          description: >-
            The identifier of the model. The value options are `multilinguallm`,
            `text-encoder`, or `clip-encoder`.
        modelType:
          type: string
          description: >-
            The name of the `modelType`. The value is the same as the `id`
            value. The value options are `multilinguallm`, `text-encoder`, or
            `clip-encoder`.
        description:
          type: string
          description: The description of the model.
        state:
          type: string
          description: >-
            This field specifies the current status of the model. The only value
            is `AVAILABLE`. The model was successfully trained and is ready to
            be deployed.
    responseGetModelIdCustom:
      title: responseGetModelIdCustom
      type: object
      properties:
        id:
          type: string
          description: >-
            The identifier of the model. The value is the universally unique
            identifier (UUID) that is the primary key for the model.
        name:
          type: string
          description: The user-friendly name of the model.
        modelType:
          type: string
          description: The type of the custom model, such as `ecommerce-rnn`.
        category:
          type: string
          description: The category of the model. For custom models, the value is `CUSTOM`.
        description:
          type: string
          description: The description of the model.
        region:
          type: string
          description: The geographic region specified when the custom model is deployed.
        vectorSize:
          type: integer
          description: >-
            The number of dimensions in the vectors that the custom model
            produces. This field only applies to custom models.
        trainingData:
          $ref: '#/components/schemas/trainingData'
          description: The location of the training data in Google Cloud Storage (GCS).
        config:
          $ref: '#/components/schemas/modelConfig'
          description: The configuration parameters passed to the model training job.
        state:
          type: string
          description: >-
            This field specifies the current status of the custom model. The
            value can be:

            * `TRAINING` - The custom model is being trained.

            * `TRAINING_FAILED` - The model training failed.

            * `AVAILABLE` - The model was successfully trained and is ready to
            be deployed.
        trainingStarted:
          type: string
          description: >-
            The date and time the training started. This field only applies to
            custom models.
          format: date-time
        trainingCompleted:
          type: string
          description: >-
            The date and time the training completed. This field only applies to
            custom models.
          format: date-time
        trainingMetrics:
          type: object
          description: >-
            The metrics produced by the model training job, including a
            summary, per-epoch metrics, and final metrics.
        deployments:
          type: array
          description: >-
            This array is only returned for deployed custom models, so these
            fields are not included for pre-trained models.
          items:
            type: object
            properties:
              id:
                type: string
                description: >-
                  The identifier for the deployed custom model. The value is the
                  universally unique identifier (UUID) that is the primary key
                  for the model.
              region:
                type: string
                description: >-
                  The geographic region specified when the custom model is
                  deployed.
              state:
                type: string
                description: >
                  This field specifies the current status of the custom model
                  deployment. Value options include:


                  * `DEPLOYING` - The model is in the process of being deployed.


                  * `DEPLOYED` - The model is deployed and available for
                  predictions.


                  * `DEPLOY_FAILED` - The model failed to deploy.


                  * `DELETING` - The model deployment is being deleted. The
                  `custom_model_deployment` record is also deleted if the
                  deployment is successfully deleted.


                  * `DELETE_FAILED` - The model deployment deletion failed. The
                  model is still deployed and available for predictions.
      examples:
        - id: c604c6eb-5589-4911-8840-939021113bff
          name: example model
          modelType: ecommerce-rnn
          category: CUSTOM
          description: Custom model tuned for e-commerce training
          region: us-iowa
          vectorSize: 256
          trainingData:
            catalog: >-
              gs://ml-platform-model-parameters-region/customer-test/test-data/l_index.parquet
            signals: >-
              gs://ml-platform-model-parameters-region/customer-test/test-data/l_query.parquet
          config:
            dataset_config: mlp_ecommerce
            trainer_config: mlp_ecommerce
            trainer_config.num_epochs: 1
          state: AVAILABLE
          trainingStarted: '2023-06-14T15:28:40.201Z'
          trainingCompleted: '2023-06-14T15:36:55.320Z'
          trainingMetrics:
            summary:
              best_epoch: 1
              index_size: 3885
              vector_size: 256
              training_time: 45.730143308639526
              num_trn_queries: 17730
              num_val_queries: 1969
              num_unique_training_pairs: 41380
            epoch_metrics:
              hit:
                trn:
                  '1':
                    - 0.22955815134586086
                  '3':
                    - 0.4154393092940579
                  '5':
                    - 0.5073641442356526
                  '10':
                    - 0.6140172676485526
                val:
                  '1':
                    - 0.21736922295581512
                  '3':
                    - 0.4245810055865922
                  '5':
                    - 0.510411376333164
                  '10':
                    - 0.6069070594210259
              map:
                trn:
                  '1':
                    - 0.22955815134586086
                  '3':
                    - 0.27385587720783217
                  '5':
                    - 0.29456097285706195
                  '10':
                    - 0.3109160985416306
                val:
                  '1':
                    - 0.21736922295581512
                  '3':
                    - 0.2779752835618753
                  '5':
                    - 0.29659288414874957
                  '10':
                    - 0.31074798141986865
              mrr:
                trn:
                  '1':
                    - 0.22955815134586086
                  '3':
                    - 0.3121719993228371
                  '5':
                    - 0.3333248687997291
                  '10':
                    - 0.3479900763420316
                val:
                  '1':
                    - 0.21736922295581512
                  '3':
                    - 0.3103944472659555
                  '5':
                    - 0.3298713390892162
                  '10':
                    - 0.34282711391649967
              ndcg:
                trn:
                  '1':
                    - 0.19706682592715863
                  '3':
                    - 0.29341225996373577
                  '5':
                    - 0.3320502228116081
                  '10':
                    - 0.36721667192191393
                val:
                  '1':
                    - 0.18814826421287148
                  '3':
                    - 0.2952855597323055
                  '5':
                    - 0.3329328847559609
                  '10':
                    - 0.3656142086267142
              recall:
                trn:
                  '1':
                    - 0.1762454094867557
                  '3':
                    - 0.34738050817763444
                  '5':
                    - 0.4339384181765327
                  '10':
                    - 0.5392753552805389
                val:
                  '1':
                    - 0.17493309084726633
                  '3':
                    - 0.36196991256560396
                  '5':
                    - 0.4451409616903886
                  '10':
                    - 0.5429232985058972
            final_metrics:
              hit:
                trn:
                  '1': 0.22955815134586086
                  '3': 0.4154393092940579
                  '5': 0.5073641442356526
                  '10': 0.6140172676485526
                val:
                  '1': 0.21736922295581512
                  '3': 0.4245810055865922
                  '5': 0.510411376333164
                  '10': 0.6069070594210259
              map:
                trn:
                  '1': 0.22955815134586086
                  '3': 0.27385587720783217
                  '5': 0.29456097285706195
                  '10': 0.3109160985416306
                val:
                  '1': 0.21736922295581512
                  '3': 0.2779752835618753
                  '5': 0.29659288414874957
                  '10': 0.31074798141986865
              mrr:
                trn:
                  '1': 0.22955815134586086
                  '3': 0.3121719993228371
                  '5': 0.3333248687997291
                  '10': 0.3479900763420316
                val:
                  '1': 0.21736922295581512
                  '3': 0.3103944472659555
                  '5': 0.3298713390892162
                  '10': 0.34282711391649967
              ndcg:
                trn:
                  '1': 0.19706682592715863
                  '3': 0.29341225996373577
                  '5': 0.3320502228116081
                  '10': 0.36721667192191393
                val:
                  '1': 0.18814826421287148
                  '3': 0.2952855597323055
                  '5': 0.3329328847559609
                  '10': 0.3656142086267142
              recall:
                trn:
                  '1': 0.1762454094867557
                  '3': 0.34738050817763444
                  '5': 0.4339384181765327
                  '10': 0.5392753552805389
                val:
                  '1': 0.17493309084726633
                  '3': 0.36196991256560396
                  '5': 0.4451409616903886
                  '10': 0.5429232985058972
    trainingData:
      title: trainingData
      type: object
      description: >-
        Both the `catalog index file` and `signals query file` are required, and
        must contain the same `pkid` field, which links relevant queries to
        documents.
      properties:
        catalog:
          type: string
          description: >-
            The location of the catalog of the training data in Google Cloud
            Storage (GCS).


            The catalog file contains documents (products) that will be
            searched. The file must have a `pkid` (product key ID) column which
            contains the document ID or product ID. The `pkid` is a unique value
            for each document, so entries with a duplicate `pkid` are filtered
            out. However, since not every `pkid` entry is associated with a
            query, there may be entries in the `catalog index file` that are not
            associated with a `signals query entry`.


            The index file content format differs based on the type of model
            to be trained, such as a general model or an eCommerce model.

            - The general index file format contains:
                - `pkid` - The unique product key ID. Required field. This must match an entry in the `signals query file`. 
                - `text` - A freeform text field.
            - The eCommerce index file format contains:
                - `pkid` - The unique product key ID. Required field. This must match an entry in the `signals query file`. 
                - `name` - The freeform text field that contains the product name.
        signals:
          type: string
          description: >-
            The location of signals in the training data in Google Cloud Storage
            (GCS).


            The signals file must have a `pkid` (product key ID) column which
            refers to the relevant document or product ID. The file may contain
            multiple duplicates of any `pkid` because each document could be
            associated with several relevant queries.


            NOTE: For evaluation purposes, 10% of unique queries (50 minimum and
            5000 maximum) are automatically sampled into a validation set from
            the training query file.


            - The general query file format contains:
                - `pkid` - The unique product key ID. Required field. This must match an entry in the `catalog index file`. 
                - `query` - A freeform text field.
            - The eCommerce query file format contains:
                - `pkid` - The unique product key ID. Required field. This must match an entry in the `catalog index file`. 
                - `query` - A freeform text field.
                - `aggr_count` - The number of documents that match the query criteria. In most cases, this value is used as a weight and must be greater than zero (0). If you do not use weights or there is no value, set this value to `1`. The weight is used for training pairs sampling and to compute normalized discounted cumulative gain (NDCG) metrics. If all values are `1.0`, binary NDCG is computed.
      required:
        - catalog
        - signals
    modelConfig:
      title: modelConfig
      type: object
      properties:
        dataset_config:
          type: string
          description: >-
            The options for the dataset format used for training are:

            - `mlp_general` - This is used for the general recurrent neural
            network (RNN) model type.

            - `mlp_ecommerce` - This is used for an eCommerce RNN model type.
          example: mlp_ecommerce
        trainer_config:
          type: string
          description: >-
            The options for the trainer type used for training are:

            - `mlp_general` - This is used for the general recurrent neural
            network (RNN) model type.

            - `mlp_ecommerce` - This is used for an eCommerce RNN model type.
          example: mlp_ecommerce
        trainer_config/text_processor_config:
          type: string
          description: >-
            This determines which type of tokenization and embedding is used as
            the base for the recurrent neural network (RNN) model. For example,
            word or byte-pair encoding (BPE).

            The word text processor defaults to English, and uses word-based
            tokenization and English pre-trained word embeddings. The maximum
            word vocabulary size is 100000.

            The BPE versions use the same tokenization, but different vocabulary
            sizes:
              * `bpe_*_small` embeddings have 10000 vocabulary tokens
              * `bpe_*_large` embeddings have 100000 vocabulary tokens
              * `bpe_multi` multilingual embeddings have 320000 vocabulary tokens

            The options for text processors are:

            - English
                - `word_en` (default)
                - `bpe_en_small`
                - `bpe_en_large`
                - `all_minilm_l6`
                - `e5_small_v2`
                - `e5_base_v2`
                - `e5_large_v2`
                - `gte_small`
                - `gte_base`
                - `gte_large`
                - `snowflake_arctic_embed_xs`
            - Multilingual
                - `bpe_multi`
                - `multilingual_e5_small`
                - `multilingual_e5_base`
                - `multilingual_e5_large`
            - Bulgarian
                - `bpe_bg_small`
                - `bpe_bg_large`
            - German
                - `bpe_de_small`
                - `bpe_de_large`
            - Spanish
                - `bpe_es_small`
                - `bpe_es_large`
            - French
                - `bpe_fr_small`
                - `bpe_fr_large`
            - Italian
                - `bpe_it_small`
                - `bpe_it_large`
            - Japanese
                - `bpe_ja_small`
                - `bpe_ja_large`
            - Korean
                - `bpe_ko_small`
                - `bpe_ko_large`
            - Dutch
                - `bpe_nl_small`
                - `bpe_nl_large`
            - Romanian
                - `bpe_ro_small`
                - `bpe_ro_large`
            - Chinese
                - `bpe_zh_small`
                - `bpe_zh_large`
            - Custom
                - `word_custom`
                - `bpe_custom`
          example: word_en
        trainer_config.encoder_config.rnn_names_list:
          type: array
          description: >-
            This determines which bi-directional recurrent neural network (RNN)
            layers are used. Options include `gru` and `lstm`.
          items:
            type: string
            example: gru
        trainer_config.encoder_config.rnn_units_list:
          type: array
          description: >-
            The number of units for each recurrent neural network (RNN) layer.


            IMPORTANT: `trainer_config.encoder_config.rnn_units_list` must
            contain the same number of entries as
            `trainer_config.encoder_config.rnn_names_list`, with one unit count
            for each RNN layer.


            Because this is a bi-directional RNN, the encoder's vector size is
            twice the number of units in the last layer. For example, if the
            last layer has 128 units, the output vector size is 256.
          items:
            type: integer
            example: 128
        trainer_config.trn_batch_size:
          type: integer
          description: >-
            The batch size to be used for a single model training update. By
            default, an appropriate batch size is automatically determined based
            on the dataset size. If the field is set to `null`, the batch size
            is also automatically determined based on the dataset size.
        trainer_config.num_epochs:
          type: integer
          description: >-
            The number of epochs the training data must complete. An epoch is a
            full cycle where training data passes through the designated
            algorithms. During one epoch, the model processes all the training
            data examples (queries and index documents) at least one time.
          example: 1
          minimum: 1
          maximum: 64
        trainer_config.monitor_patience:
          type: integer
          description: >-
            The number of epochs that training continues without improvement in
            the monitored validation metric before it stops. The best model
            state based on the monitored validation metric is used as the final
            model.


            * For the general RNN, the `mrr@3` metric is monitored and the
            `monitor_patience` default value is 8. 


            * For the eCommerce RNN, the `ndcg@5` metric is monitored and the
            `monitor_patience` default value is 16.
          example: 8
        trainer_config.encoder_config.emb_spdp:
          type: number
          description: >-
            This field provides a regularization effect, which helps prevent
            the model from overfitting the training data. The regularization is
            applied between the token embeddings layer and the first recurrent
            neural network (RNN) layer.
          example: 0.3
          format: float
        trainer_config.encoder_config.emb_trainable:
          type: boolean
          description: >-
            This field determines if fine-tuning of the token embeddings, such
            as word or byte-pair encoding (BPE) token vectors, is enabled. If
            enabled, fine-tuning can improve the quality of the model when
            queries contain little natural language and training is negatively
            affected as a result. Because the embeddings layer is the largest
            layer in the network, fine-tuning it requires enough training data
            to prevent overfitting.

````
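
The operation above can be exercised with a short client. The following is a minimal sketch, not an official client: the function names `build_get_model_request` and `summarize_model` are hypothetical, and sending the token as a `Bearer` header is an assumption, because the specification only states that a token with scope `machinelearning.model` is required. The request is built but not sent, so the example runs without network access.

```python
import json
import urllib.parse
import urllib.request

# Taken from the `servers` entry in the specification.
BASE_URL = "https://api.lucidworks.com"


def build_get_model_request(customer_id, model_id, token, metrics=None):
    """Build (but do not send) a GET request for a single model."""
    path = "/customers/{}/ai/models/{}".format(
        urllib.parse.quote(customer_id, safe=""),
        urllib.parse.quote(model_id, safe=""),
    )
    url = BASE_URL + path
    if metrics is not None:
        # `metrics` is the boolean query parameter defined in the specification.
        url += "?" + urllib.parse.urlencode({"metrics": str(bool(metrics)).lower()})
    # Assumption: the token is passed as a Bearer token.
    headers = {"Authorization": "Bearer " + token, "Accept": "application/json"}
    return urllib.request.Request(url, headers=headers, method="GET")


def summarize_model(model):
    """Summarize a decoded response body (pre-trained or custom model)."""
    # Only the custom model schema defines `deployments`, so default to empty.
    deployments = model.get("deployments", [])
    return {
        "id": model["id"],
        # `AVAILABLE` means the model is trained and ready to be deployed.
        "ready": model.get("state") == "AVAILABLE",
        # Regions where a deployment is available for predictions.
        "deployed_regions": [
            d["region"] for d in deployments if d.get("state") == "DEPLOYED"
        ],
    }


if __name__ == "__main__":
    request = build_get_model_request(
        "my-customer", "text-encoder", "MY_TOKEN", metrics=True
    )
    print(request.get_method(), request.full_url)

    # A hand-written sample body shaped like the pre-trained model response.
    sample_body = '{"id": "text-encoder", "modelType": "text-encoder", "state": "AVAILABLE"}'
    print(summarize_model(json.loads(sample_body)))
```

To send the request, pass the returned object to `urllib.request.urlopen` and decode the body with `json.load`. The path segments are percent-encoded so that an identifier containing reserved characters does not alter the URL structure.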