I'm a software engineer with experience shipping projects that use AI models via APIs, but I had no hands-on experience with the specifics of training models. My goal was to take a local LLM and have it answer questions based on a custom dataset, a common task that turned out to have more nuance than I initially expected. This post documents my process, the incorrect assumptions I started with, and the working RAG (Retrieval-Augmented Generation) pipeline I ended up with.
Note: I didn't write the code presented in this post myself. The entire process was an exercise in prompting; the scripts and conclusions are the result of asking an LLM, such as ChatGPT, a series of basic questions.
Attempt 1: Context Stuffing with Ollama's Modelfile
My first attempt used Ollama. I found a Python script that appeared to "fine-tune" a model by taking text files as input. I ran it on a few documents, and it quickly produced a new model that could answer questions...
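Here's a minimal sketch of what a script like this plausibly does under the hood. I never inspected the original, so this is an assumption on my part; the `data/` directory, the `llama3` base model, and the `custom-model` name are placeholders of mine. The key idea, hinted at by the section title: rather than training anything, it concatenates the text files into a Modelfile's `SYSTEM` prompt and registers the result with `ollama create`.

```python
import subprocess
from pathlib import Path

# Read every "training" document. No training happens here: the files
# are simply concatenated into one big block of text.
docs = "\n\n".join(p.read_text() for p in sorted(Path("data").glob("*.txt")))

# Build a Modelfile that stuffs the documents into the system prompt.
# FROM names the base model; SYSTEM sets a fixed prompt that Ollama
# prepends to every conversation with the new model.
modelfile = f'''FROM llama3
SYSTEM """Answer questions using only these documents:

{docs}
"""
'''
Path("Modelfile").write_text(modelfile)

# `ollama create` repackages the base model with the new system prompt
# under a new name. It finishes in seconds because no weights change.
subprocess.run(["ollama", "create", "custom-model", "-f", "Modelfile"], check=True)
```

If that's roughly what my script was doing, the speed makes sense: the model only "knows" the documents because they ride along in every prompt, which also means they compete for the model's limited context window.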