RAG support in Helix

Patch fix - 0.9.1 fixes streaming in the UI for plain inference sessions.

0.9 release notes

We now support RAG in Helix. You can upload documents and perform RAG over them from the homepage:

We have also renamed "inference" and "finetune" to the more generic and user-friendly "chat" and "learn":

The default Learn mode is now RAG, because it's much, much faster than fine-tuning. RAG is better at retrieving specific facts, whereas fine-tuning is better at answering general questions about the documents uploaded.

You can still fine-tune: either choose fine-tuning from the app homepage, or use the settings button:

Using RAG and fine-tuned data sources in Helix Apps

You can now also specify RAG and fine-tune data sources in a Helix App's helix.yaml to customize an assistant with a RAG data source or a fine-tuned LLM. To do this, run a RAG or fine-tune session, which now creates a "data source ID". Retrieve the rag_source_data_entity_id from the info button in a RAG session, like this:

"rag_source_data_entity_id": "c6cc22d3-23a6-4b2d-acdd-6f561158e0c0",

And place it in a helix.yaml file in a GitHub repo like this:

    name: My Test Helix RAG App
    description: This is a test Helix RAG app
    assistants:
    - name: My Example RAG Assistant
      description: This is an example assistant with a rag source
      rag_source_id: 8b4ff837-b42e-41d2-a5cd-fc7f6c26e08f

Then proceed to use Helix Apps as documented here

This rag_source_id can also be overridden as an API parameter when making an API call.

You can do the same with fine-tune data sources, which are named finetune_data_entity_id in the info panel and specified in the helix.yaml as lora_id.
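By analogy with the RAG example above, a fine-tuned assistant might be declared roughly as follows. This is a sketch, not a confirmed schema: it assumes lora_id sits on the assistant in the same position as rag_source_id, the app and assistant names are invented for illustration, and the ID is a placeholder to replace with the finetune_data_entity_id from your own session's info panel.

```yaml
# Hypothetical helix.yaml for an assistant backed by a fine-tuned LLM.
# Assumption: lora_id is set per assistant, mirroring rag_source_id.
name: My Test Helix Fine-tune App
description: This is a test Helix fine-tune app
assistants:
- name: My Example Fine-tuned Assistant
  description: This is an example assistant with a fine-tuned LLM
  # Placeholder value: paste the finetune_data_entity_id from the info panel.
  lora_id: "REPLACE_WITH_FINETUNE_DATA_ENTITY_ID"
```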

PRs in this release

Fix/bump msg limit by @rusenask in #303

fix(llamaindex): use text for filename in database to accommodate long filenames by @philwinder in #304

Feature/basic data entities by @lukemarsden in #300