Tokenization by MODEL_ID
import requests

# Replace {MODEL_ID} with the name of the embedding model, for example e5-small-v2.
url = "https://application_id.applications.lucidworks.com/ai/tokenization/{MODEL_ID}"

payload = {
    "batch": [{"text": "Mr. and Mrs. Dursley and O'Malley, of number four, Privet Drive, were proud to say that they were perfectly normal, thank you very much"}],
    "useCaseConfig": {"dataType": "passage"},
    "modelConfig": {
        "vectorQuantizationMethod": "max-scale",
        "dimReductionSize": 256
    }
}
headers = {"Content-Type": "application/json"}

response = requests.post(url, json=payload, headers=headers)
print(response.text)

{
  "generatedTokens": [
    {
      "tokens": [
        "\"[CLS]\", \"query\", \":\", \"mr\", \".\", \"and\", \"mrs\", \".\", \"du\", \"##rs\", \"##ley\", \"and\", \"o\", \"'\", \"malley\", \",\", \"of\", \"number\", \"four\", \",\", \"pri\", \"##vet\", \"drive\", \",\", \"were\", \"proud\", \"to\", \"say\", \"that\", \"they\", \"were\", \"perfectly\", \"normal\", \",\", \"thank\", \"you\", \"very\", \"much\", \".\", \"[SEP]\""
      ]
    }
  ],
  "tokensUsed": {
    "inputTokens": 40,
    "promptTokens": 148,
    "completionTokens": 0,
    "totalTokens": 175
  }
}
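In the sample response above, each entry in "tokens" is a single string of quoted, comma-separated values rather than a list of separate tokens. The following is a minimal sketch of one way to split such an entry into individual tokens, assuming live responses have the same shape as the sample. The function name parse_tokens is hypothetical and is not part of the API.

```python
import json


def parse_tokens(response_body: dict) -> list[list[str]]:
    """Return one list of token strings per item in "generatedTokens".

    Assumption: each "tokens" entry is either a plain token or a string of
    quoted, comma-separated tokens, as in the sample response.
    """
    parsed = []
    for item in response_body.get("generatedTokens", []):
        tokens = []
        for entry in item.get("tokens", []):
            try:
                # Wrapping the entry in brackets turns it into a JSON array.
                decoded = json.loads(f"[{entry}]")
            except json.JSONDecodeError:
                # Not a quoted, comma-separated run: keep it as one token.
                decoded = [entry]
            if not all(isinstance(token, str) for token in decoded):
                # Decoded to something other than strings: keep the raw entry.
                decoded = [entry]
            tokens.extend(decoded)
        parsed.append(tokens)
    return parsed


# Shortened stand-in for the sample response shown above.
sample = {
    "generatedTokens": [
        {"tokens": ['"[CLS]", "mr", ".", "du", "##rs", "##ley", ",", "[SEP]"']}
    ]
}
print(parse_tokens(sample))
```

With requests, the parsed body would come from response.json() instead of the hard-coded sample.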
For the pre-trained and custom embedding use cases, the tokenization request sends text to the embedding model specified by modelId (the model name) and returns the results in a format supported by embedding models.
POST /ai/tokenization/{MODEL_ID}
Headers
The authentication and authorization access token.
Content-Type
application/json
Example:
"application/json"
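The request sample on this page sets only Content-Type, while the header list also mentions an access token. The sketch below shows one way to build both headers. The header name Authorization and the Bearer scheme are assumptions, because this page does not state how the token is passed, and build_headers is a hypothetical helper name.

```python
def build_headers(access_token: str) -> dict[str, str]:
    """Build request headers for the tokenization endpoint."""
    return {
        # Assumption: the token is sent as a Bearer token in "Authorization".
        # This page does not name the header or the scheme.
        "Authorization": f"Bearer {access_token}",
        # Stated on this page: the request body is JSON.
        "Content-Type": "application/json",
    }


headers = build_headers("ACCESS_TOKEN")  # placeholder, not a real token
print(sorted(headers))
```

The resulting dictionary can be passed as the headers argument of requests.post.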
Path Parameters
MODEL_ID
The name of the pre-trained or custom embedding model.
Example:
"e5-small-v2"
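The request URL carries the model name in its last path segment. As a small sketch, the URL can be assembled from the application ID and the model name like this. The function name tokenization_url is hypothetical, and percent-encoding the model name is a defensive choice made here rather than something this page requires.

```python
from urllib.parse import quote


def tokenization_url(application_id: str, model_id: str) -> str:
    """Build the tokenization endpoint URL for one embedding model."""
    # Percent-encode the model name so it stays a single path segment.
    encoded_model = quote(model_id, safe="")
    return (
        f"https://{application_id}.applications.lucidworks.com"
        f"/ai/tokenization/{encoded_model}"
    )


print(tokenization_url("application_id", "e5-small-v2"))
# prints https://application_id.applications.lucidworks.com/ai/tokenization/e5-small-v2
```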
Body
application/json