Class HuggingFaceServiceSettings

java.lang.Object
co.elastic.clients.elasticsearch.inference.HuggingFaceServiceSettings
All Implemented Interfaces:
JsonpSerializable

@JsonpDeserializable public class HuggingFaceServiceSettings extends Object implements JsonpSerializable
See Also:
  • Field Details

  • Method Details

    • of

    • apiKey

      public final String apiKey()
      Required - A valid access token for your Hugging Face account. You can create or find your access tokens on the Hugging Face settings page.

      IMPORTANT: You need to provide the API key only once, when you create the inference model. The get inference endpoint API does not return your API key.

      API name: api_key

    • rateLimit

      @Nullable public final RateLimitSetting rateLimit()
      This setting helps to minimize the number of rate limit errors returned from Hugging Face. By default, the hugging_face service sets the number of requests allowed per minute to 3000 for all supported tasks. Hugging Face does not publish a universal rate limit, so actual limits may vary. Adjust this value based on the capacity and limits of your specific deployment environment.

      API name: rate_limit
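
      To make the default concrete, the sketch below converts a requests-per-minute budget into the average spacing between requests. RateLimitMath and averageInterval are hypothetical names used only for illustration; they are not part of the Elasticsearch Java client, and the code says nothing about how the service enforces the limit internally.

```java
import java.time.Duration;

// Hypothetical helper for illustration; not part of the Elasticsearch Java client.
final class RateLimitMath {
    // Default requests per minute for the hugging_face service, as stated above.
    static final int DEFAULT_REQUESTS_PER_MINUTE = 3000;

    // Average spacing between requests that exactly uses up the per-minute budget.
    static Duration averageInterval(int requestsPerMinute) {
        if (requestsPerMinute <= 0) {
            throw new IllegalArgumentException("requestsPerMinute must be positive");
        }
        return Duration.ofMinutes(1).dividedBy(requestsPerMinute);
    }

    public static void main(String[] args) {
        // 60,000 ms / 3000 requests = 20 ms between requests on average.
        System.out.println(averageInterval(DEFAULT_REQUESTS_PER_MINUTE).toMillis() + " ms");
    }
}
```

      If your deployment tolerates fewer requests than the default, the same arithmetic shows how much slower the steady-state request rate becomes after lowering the value.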

    • url

      public final String url()
      Required - The URL endpoint to use for the requests. For completion and chat_completion tasks, the deployed model must be compatible with the Hugging Face Chat Completion interface (see the linked external documentation for details), and the endpoint URL must include /v1/chat/completions.

      If the model supports the OpenAI Chat Completion schema, a toggle should appear in the interface. Enabling this toggle does not change the model's behavior; it only reveals the full endpoint URL (which should include /v1/chat/completions) that you need when configuring the inference endpoint in Elasticsearch. If the model does not support this schema, the toggle may not be shown.

      API name: url
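
      The path requirement above can be checked before the settings are sent. The following is a minimal sketch of such a pre-flight check; HuggingFaceUrlCheck and isUsableFor are hypothetical names, the example host is a placeholder, and the client itself is not known to perform this validation.

```java
import java.net.URI;
import java.util.Objects;
import java.util.Set;

// Hypothetical pre-flight check for illustration; not part of the Elasticsearch Java client.
final class HuggingFaceUrlCheck {
    // Task types for which the URL must include the chat completions path.
    private static final Set<String> CHAT_TASKS = Set.of("completion", "chat_completion");
    private static final String CHAT_PATH = "/v1/chat/completions";

    // Returns true if the URL is absolute and, for chat-style tasks, includes CHAT_PATH.
    static boolean isUsableFor(String taskType, String url) {
        Objects.requireNonNull(taskType, "taskType");
        if (url == null) {
            return false;
        }
        URI uri;
        try {
            uri = URI.create(url);
        } catch (IllegalArgumentException e) {
            return false; // not a syntactically valid URI
        }
        if (uri.getScheme() == null || uri.getHost() == null) {
            return false; // relative or host-less URIs cannot be an endpoint
        }
        if (!CHAT_TASKS.contains(taskType)) {
            return true; // no path requirement is stated for other task types
        }
        String path = uri.getPath();
        return path != null && path.contains(CHAT_PATH);
    }
}
```

      For example, isUsableFor("chat_completion", "https://example.invalid/v1/chat/completions") passes the check, while the same host without the path does not.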

    • modelId

      @Nullable public final String modelId()
      The name of the Hugging Face model to use for the inference task. For completion and chat_completion tasks, this field is optional but may be required for certain models, particularly when using serverless inference endpoints. For the text_embedding task, do not include this field; otherwise, the request will fail.

      API name: model_id
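
      The rules for api_key, url, and model_id can be summarized as a small builder for the settings body. This is a sketch under stated assumptions: HuggingFaceSettingsSketch and serviceSettings are hypothetical names, the map only mirrors the API names listed on this page, and it does not use or replace HuggingFaceServiceSettings.Builder.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the Elasticsearch Java client.
final class HuggingFaceSettingsSketch {
    // Builds a map keyed by the API names documented on this page.
    static Map<String, Object> serviceSettings(String taskType, String apiKey,
                                               String url, String modelId) {
        if (apiKey == null || url == null) {
            throw new IllegalArgumentException("api_key and url are required");
        }
        // model_id must be omitted for text_embedding, otherwise the request fails.
        if ("text_embedding".equals(taskType) && modelId != null) {
            throw new IllegalArgumentException("model_id must not be set for text_embedding");
        }
        Map<String, Object> settings = new LinkedHashMap<>();
        settings.put("api_key", apiKey);
        settings.put("url", url);
        if (modelId != null) {
            settings.put("model_id", modelId); // optional for completion and chat_completion
        }
        return settings;
    }
}
```

      Rejecting the invalid combination locally gives an earlier and clearer error than waiting for the request to fail.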

    • serialize

      public void serialize(jakarta.json.stream.JsonGenerator generator, JsonpMapper mapper)
      Serialize this object to JSON.
      Specified by:
      serialize in interface JsonpSerializable
    • serializeInternal

      protected void serializeInternal(jakarta.json.stream.JsonGenerator generator, JsonpMapper mapper)
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • rebuild

      Returns:
      New HuggingFaceServiceSettings.Builder initialized with field values of this instance
    • setupHuggingFaceServiceSettingsDeserializer

      protected static void setupHuggingFaceServiceSettingsDeserializer(ObjectDeserializer<HuggingFaceServiceSettings.Builder> op)