RAG support in Helix
Patch fix - 0.9.1 fixes streaming in the UI for plain inference sessions.
0.9 release notes
We now support RAG in Helix. You can upload documents and perform RAG over them from the homepage.
We have also renamed "inference" and "finetune" to the more generic and user-friendly "chat" and "learn".
The default Learn mode is now RAG, because it's much, much faster than fine-tuning. RAG is better at retrieving specific facts, whereas fine-tuning is better at answering general questions about the documents uploaded.
You can still fine-tune: either choose fine-tuning from the app homepage or use the settings button.
Using RAG and fine-tuned data sources in Helix Apps
You can now also specify RAG and fine-tune data sources in a Helix App's helix.yaml to customize an assistant with a RAG data source or a fine-tuned LLM. To do this, run a RAG or fine-tune session, which now creates a "data source ID". Retrieve the rag_source_data_entity_id from the info button in a RAG session, like this:
"rag_source_data_entity_id": "c6cc22d3-23a6-4b2d-acdd-6f561158e0c0",
And place it in a helix.yaml file in a GitHub repo like this:
name: My Test Helix RAG App
description: This is a test Helix RAG app
assistants:
- name: My Example RAG Assistant
  description: This is an example assistant with a rag source
  rag_source_id: 8b4ff837-b42e-41d2-a5cd-fc7f6c26e08f
Then proceed to use Helix Apps as described in the Helix Apps documentation.
This rag_source_id can also be overridden as an API parameter when making an API call.
You can do the same with fine-tune data sources, which are named finetune_data_entity_id in the info panel and specified in the helix.yaml as lora_id.
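As a rough sketch of the fine-tune case, a helix.yaml might look something like the following. The app and assistant names are made up for illustration, the ID is a placeholder rather than a real data entity, and the placement of lora_id alongside the other assistant fields is an assumption based on where rag_source_id sits in the RAG example.

```yaml
# Hypothetical helix.yaml for an assistant backed by a fine-tuned LLM.
name: My Test Helix Fine-tune App
description: This is a test Helix app using a fine-tuned model
assistants:
- name: My Example Fine-tuned Assistant
  description: This is an example assistant with a fine-tuned LLM
  # Placeholder: replace with the finetune_data_entity_id shown in the
  # info panel of your own fine-tune session.
  # Assumption: lora_id lives at the same level as rag_source_id.
  lora_id: 00000000-0000-0000-0000-000000000000
```

Only the lora_id key and the finetune_data_entity_id it comes from are taken from the release notes; everything else mirrors the RAG example and should be adjusted to your own app.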
PRs in this release
Fix/bump msg limit by @rusenask in #303
fix(llamaindex): use text for filename in database to accommodate long filenames by @philwinder in #304
Feature/basic data entities by @lukemarsden in #300