While large language models (LLMs) are capable of incredible feats of summarization and translation, deploying them in mission-critical applications is beset with problems, even for the largest tech companies in the world.
While they are trained on huge volumes of data, LLMs are still limited by their training data and the quality of the prompt. Even then, there is always a chance that the model will “hallucinate” and make things up when it doesn’t know the correct answer.
Enter retrieval-augmented generation (RAG), a fast-emerging technique for solving these problems. Let’s dig in and look at what it does, where it’s effective, and the limitations and costs of employing it.