Refer to Model Configs for how to set the environment variables for your particular deployment.

Note: While we support self-hosted LLMs, you will get significantly better responses with a more powerful model like GPT-4.
```bash
# Pick any model served by FastChat
GEN_AI_MODEL_VERSION=vicuna-7b-v1.5
# Hint: To point Docker containers to http://localhost, use http://host.docker.internal
# Don't forget to include the /v1 below
GEN_AI_API_ENDPOINT=http://<your-FastChat-server>/v1
GEN_AI_LLM_PROVIDER_TYPE=openai  # Since it's an OpenAI compatible API

# Let's also make some changes to accommodate the weaker locally hosted LLM
QA_TIMEOUT=120  # Set a longer timeout, running models on CPU can be slow
# Always run search, never skip
DISABLE_LLM_CHOOSE_SEARCH=True
# Don't use LLM for reranking, the prompts aren't properly tuned for these models
DISABLE_LLM_CHUNK_FILTER=True
# Don't try to rephrase the user query, the prompts aren't properly tuned for these models
DISABLE_LLM_QUERY_REPHRASE=True
# Don't use LLM to automatically discover time/source filters
DISABLE_LLM_FILTER_EXTRACTION=True
# Use only 1 section from the documents and do not require quotes
QA_PROMPT_OVERRIDE=weak
```
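Before wiring the endpoint into your deployment, it can help to confirm that the FastChat server is actually reachable and speaking the OpenAI-compatible API. The sketch below is a minimal sanity check, assuming the FastChat API server is listening at `http://localhost:8000` and serving `vicuna-7b-v1.5`; substitute your own host, port, and model name.

```bash
# Minimal check against FastChat's OpenAI-compatible chat completions route.
# host/port and model name here are assumptions -- adjust to your setup.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "vicuna-7b-v1.5",
        "messages": [{"role": "user", "content": "Say hello"}]
      }'
```

If this returns a JSON response with a `choices` field, the same base URL (with the `/v1` suffix) is what you should place in `GEN_AI_API_ENDPOINT`.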