Overview of the Generative AI functionality in Danswer
Danswer supports a large range of LLM hosting services as well as local and custom model hosting, including: OpenAI, Azure OpenAI, HuggingFace, Anthropic, Replicate, AWS Bedrock, Cohere, and many others.
For each of these, Danswer also supports multiple model options, such as gpt-3.5-turbo, gpt-4, and text-davinci-003 for OpenAI, and so on.
Danswer also supports locally hosted LLMs through Ollama and GPT4All.
Finally, Danswer supports custom model hosting servers that don’t conform to standard APIs. For this option, however, you are required to implement a small LLM class that calls out to your custom model hosting server.
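As a rough illustration, the small class might look something like the sketch below, assuming a simple HTTP model server. The class name, endpoint, and response field here are hypothetical stand-ins, not Danswer's actual custom-LLM interface:

```python
# A hedged sketch of a small LLM class for a custom model hosting server.
# CustomModelServerLLM, the endpoint, and the "generated_text" response field
# are all hypothetical; adapt them to Danswer's real interface and to whatever
# request/response shape your server expects.
import requests


class CustomModelServerLLM:
    def __init__(self, endpoint: str, timeout: int = 60):
        self._endpoint = endpoint  # e.g. "http://my-model-server:9000/generate"
        self._timeout = timeout

    def invoke(self, prompt: str) -> str:
        # Forward the prompt to the custom server and return its completion.
        response = requests.post(
            self._endpoint, json={"inputs": prompt}, timeout=self._timeout
        )
        response.raise_for_status()
        return response.json()["generated_text"]
```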
Note: Support for most of these LLMs is provided through the LiteLLM Langchain integration, and they are configured accordingly (see the following sections for some examples).
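For context, LiteLLM wraps the different providers behind a single OpenAI-style call, roughly like the minimal sketch below (the model string and prompt are examples only, and provider credentials are read from env vars such as OPENAI_API_KEY):

```python
from litellm import completion

# LiteLLM routes the request to the right provider based on the model string,
# e.g. "gpt-3.5-turbo" for OpenAI or "claude-2" for Anthropic.
response = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Summarize what Danswer does."}],
)
print(response["choices"][0]["message"]["content"])
```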
The Large Language Models are used to interpret the contents of the most relevant documents retrieved via Search. These models extract the useful knowledge from your documents and generate the AI Answer.
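Conceptually, the flow looks something like the sketch below, where the retrieved document contents are packed into the prompt the model answers from. This is an illustration only, not Danswer's actual prompt-building code, and all names in it are made up:

```python
# Illustrative only: pack retrieved documents into an answer-generation prompt.
def build_answer_prompt(question: str, retrieved_docs: list[tuple[str, str]]) -> str:
    # retrieved_docs holds (title, content) pairs returned by Search.
    context = "\n\n".join(
        f"DOCUMENT: {title}\n{content}" for title, content in retrieved_docs
    )
    return (
        "Answer the question using only the reference documents below.\n\n"
        f"{context}\n\n"
        f"QUESTION: {question}\nANSWER:"
    )
```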
By default, Danswer uses GPT-3.5-Turbo from OpenAI. This is the most accessible model/hosting service, since it does not require an access approval process the way GPT-4 or Llama2 variants do. OpenAI also hosts the models behind an API, which makes them easy to use and much more cost-efficient than hosting a model yourself on dedicated hardware.
All Danswer Gen AI configs are done through deployment environment variables. For docker compose, this means overwriting the default values in the .env file during deployment. For kubernetes, this means updating the service deployment yaml files (specifically, the api_server and background services).
The environment variables that impact the Gen AI models are as follows:

- GEN_AI_MODEL_PROVIDER: the model hosting service to use, e.g. openai, azure, huggingface, etc.
- GEN_AI_MODEL_VERSION: the specific model to use, e.g. gpt-4, Falcon-180B-Chat-GPTQ, etc.
- GEN_AI_API_KEY: the API key used to authenticate with the model hosting service, if one is required.
- GEN_AI_API_ENDPOINT: the endpoint of the model hosting service, e.g. https://danswer.openai.azure.com/ for Azure OpenAI.

A minimal .env example is shown below; see the next sections for some examples of how to configure the different options.
As always, don’t hesitate to reach out to the Danswer team if you have any questions or issues.