Job configuration specifications
Train a Smart Answers Supervised Model
GPU | CPU |
|
|
model_training_input
. Otherwise you can use it directly from the cloud storage.
word_custom
or bpe_custom
. This trains Word2vec on the provided data and specified fields. It might be useful in cases when your content includes unusual or domain-specific vocabulary.
If you have content in addition to the query/response pairs that can be used to train the model, then specify it in the Texts Data Path.
When you use the pre-trained embeddings, the log shows the percentage of processed vocabulary words. If this value is high, then try using custom embeddings.
The job trains a few (configurable) RNN layers on top of word embeddings or fine-tunes a BERT model on the provided training data. The result model uses an attention mechanism to average word embeddings to obtain the final single dense vector for the content.
Encoder output dim size:
line. You might need this information when creating collections in Milvus.random_*
dynamic field defined in its managed-schema.xml
. This field is required for sampling the data. If it is not present, add the following entry to the managed-schema.xml
alongside other dynamic fields <dynamicField name="random_*" type="random"/>
and <fieldType class=“solr.RandomSortField” indexed=“true” name=“random”/> alongside other field types.