How to use retrieval-augmented generation (RAG) effectively
![](https://cdn.prod.website-files.com/62796ab9647626cbab663f42/67aced5c81ba682c3d5a4479_Frame%2015.png)
Companies across industries can use retrieval-augmented generation (RAG) to build powerful, reliable, and personalized AI features in their products.
We’ll break these use cases down by highlighting how real-world companies use this approach to power their AI features and products.
But first, let’s align on the RAG workflow.
How RAG works
Put simply, RAG is a process that allows large language models (LLMs) to fetch and use context to generate more accurate and helpful outputs.
The RAG workflow consists of a few steps:
![Visualization of how RAG works](https://cdn.prod.website-files.com/62796ab9647626cbab663f42/67aceb17d28273eda297ceb7_Screenshot%202025-02-12%20at%201.40.10%E2%80%AFPM.png)
- Your source content (e.g., internal documents) gets embedded, or turned into vectors, by an embedding model and stored in a vector database
- A specific query (potentially from a user) gets embedded with the same model
- The vector database is searched for the embeddings that are most semantically similar to the embedded query, and the content behind those embeddings, along with the original query, gets fed to an LLM
- The LLM uses this retrieved context and the query to generate an output. In many cases, the outputs include links to sources in case the user wants to learn more
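To make the workflow concrete, here's a minimal sketch in Python. The embedding model, example documents, and the call_llm() placeholder are purely illustrative; swap in whichever embedding model, vector database, and LLM provider you actually use.

```python
# Minimal RAG sketch (illustrative only). Assumes sentence-transformers and numpy
# are installed; call_llm() is a stand-in for whatever LLM API you use.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # any embedding model works here

# 1. Embed your source documents and store the vectors (your "vector database").
documents = [
    "PTO policy: employees accrue 20 days of paid time off per year.",
    "Expense policy: submit receipts within 30 days of purchase.",
]
doc_vectors = model.encode(documents, normalize_embeddings=True)

# 2. Embed the incoming query with the same model.
query = "How many vacation days do I get?"
query_vector = model.encode([query], normalize_embeddings=True)[0]

# 3. Retrieve the most semantically similar documents. With normalized vectors,
#    cosine similarity reduces to a dot product.
scores = doc_vectors @ query_vector
top_indices = np.argsort(scores)[::-1][:2]
retrieved = [documents[i] for i in top_indices]

# 4. Feed the retrieved text plus the original query to the LLM.
context = "\n".join(retrieved)
prompt = (
    "Answer the question using only the context below.\n\n"
    f"Context:\n{context}\n\nQuestion: {query}"
)
# answer = call_llm(prompt)  # hypothetical LLM call; swap in your provider's SDK
```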
RAG use cases
Here are just a few impactful ways that companies currently use RAG to build differentiated AI products and features.
Assembly
Assembly, an employee recognition platform, offers an enterprise AI search solution (dubbed “DoraAI”) that’s built on RAG.
Essentially, once a user asks DoraAI a question (e.g., what’s the company’s PTO policy), the query gets embedded.
The most semantically similar embeddings get identified in the vector database (e.g., those for the company’s PTO policy documents), and the corresponding documents, along with the original query, get fed to the LLM Assembly uses, which generates a response for the user (e.g., “The company’s PTO policy is…”).
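Conceptually, the generation step might look something like the sketch below. The document text, source URL, and the call_llm() placeholder are hypothetical; this isn't Assembly's actual implementation, just an illustration of feeding retrieved documents (with their source links) to an LLM.

```python
# Illustrative generation step for an internal knowledge search (all names,
# documents, and the call_llm() stand-in are hypothetical).
def build_answer_prompt(query: str, retrieved: list[dict]) -> str:
    """Combine retrieved documents and the original question into one prompt,
    numbering each source so the answer can cite it."""
    context = "\n\n".join(
        f"[{i + 1}] {doc['text']} (source: {doc['url']})"
        for i, doc in enumerate(retrieved)
    )
    return (
        "You are an internal knowledge assistant. Answer using only the context "
        "below and cite the numbered sources you used.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

# In practice, `retrieved` comes from the vector search described above.
docs = [{"text": "Employees accrue 20 PTO days per year.", "url": "https://intranet.example.com/pto"}]
prompt = build_answer_prompt("What's the company's PTO policy?", docs)
# answer = call_llm(prompt)  # hypothetical LLM call
```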
Here’s more on how Assembly’s enterprise AI search works and how they power it via Merge:
Juicebox
Juicebox is an AI-powered platform that helps recruiters identify and connect with high-fit candidates.
RAG enables a key workflow in their product: Once a user selects a job they want to find candidates for within Juicebox, the platform will embed the job description for that role and search a vector database for semantically similar candidate profiles.
The LLM can then use the retrieved profiles and the original job description to refine, rank, and enrich the results, surfacing thousands of ideal candidates for the role.
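One way to implement a retrieve-then-refine flow like this is sketched below. The profile fields and the call_llm() placeholder are assumptions for the example, not Juicebox's actual implementation.

```python
# Illustrative retrieve-then-rank sketch (fields and call_llm() are hypothetical).
def build_ranking_prompt(job_description: str, profiles: list[dict]) -> str:
    """Ask the LLM to rank retrieved candidate profiles against the original
    job description text."""
    profile_block = "\n".join(f"- {p['name']}: {p['summary']}" for p in profiles)
    return (
        "Rank the following candidates by fit for this role and briefly explain "
        f"each ranking.\n\nJob description:\n{job_description}\n\n"
        f"Candidate profiles:\n{profile_block}"
    )

# `profiles` would come from a vector search over candidate-profile embeddings.
profiles = [{"name": "Candidate A", "summary": "8 years of backend engineering in fintech."}]
prompt = build_ranking_prompt("Senior backend engineer, payments infrastructure.", profiles)
# ranking = call_llm(prompt)  # hypothetical LLM call
```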
Here’s more from Juicebox’s CEO and co-founder:
Ema
Ema is a Universal AI Employee that lets customers deploy AI agents to perform all kinds of tasks across the business—from troubleshooting and resolving customer issues to helping employees access company information and perform actions based on that information.
For example, Ema can help sales reps use their Proposal Writer AI Agent to create proposals for prospective customers.
The query could be whatever inputs the rep gives the agent. The agent can then identify semantically similar content in a vector database, such as previous proposals drafted for similar prospects.
From there, the agent can use those relevant documents, along with the initial query, to generate a highly relevant and accurate proposal.
![How Ema's Proposal Writer AI Agent works](https://cdn.prod.website-files.com/62796ab9647626cbab663f42/67acead8ebd5b6b2d60ec1d7_AD_4nXe73SxPl7m4DmBpisu_CovJidIzUu_JCFnhJX73bDkF70RKMj6mrNW1HtwkypCDf2pKIHB3REFiTqpnWitULaxtsqrzhgMyP81Vtl37xGOKJ0skmyIhVzM7mBtp_syP8NIHpyK6nw.png)
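A proposal-drafting step along these lines might look like the following sketch. The retrieval filter, field names, and call_llm() placeholder are assumptions for illustration, not Ema's actual implementation.

```python
# Illustrative proposal-drafting step (fields and call_llm() are hypothetical).
def build_proposal_prompt(rep_inputs: str, example_proposals: list[str]) -> str:
    """Combine the rep's inputs with retrieved prior proposals so the LLM can
    mirror their structure and tone."""
    examples = "\n\n---\n\n".join(example_proposals)
    return (
        "Draft a proposal for the prospect described below, following the style "
        "and structure of the example proposals.\n\n"
        f"Prospect details:\n{rep_inputs}\n\n"
        f"Example proposals:\n{examples}"
    )

# `example_proposals` would come from a vector search over past proposals,
# optionally filtered by metadata such as the prospect's industry or deal size.
example_proposals = ["Proposal for Acme Corp (manufacturing): ..."]
prompt = build_proposal_prompt("Mid-market manufacturer, 500 seats, needs SSO.", example_proposals)
# proposal = call_llm(prompt)  # hypothetical LLM call
```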
How to implement RAG effectively
While the best practices for RAG can vary based on your use cases, there are generally a few worth following across the board.
Normalize the majority of your data before embedding it
Normalized data, which is data that’s standardized and transformed into a consistent format across multiple systems or platforms, is critical for RAG.
![How file creation date fields can be normalized](https://cdn.prod.website-files.com/62796ab9647626cbab663f42/67a3aed073b506325a7e7809_AD_4nXdzsfgxAmxn6FMMuE5A2_qJR3blk74kgL8SciZBKVP1nBHt9vOwsukTI9GWMLQ0HUX80nYMPbnf5a_LGeOX6y45Ind6BaALlRDpaVv_usFEAEl0wzktV0iVrpwTIbQd5rkPAQVCkA.png)
The process of normalizing data lets you remove sensitive and duplicate information. This, in turn, helps the LLM that uses the data avoid generating outputs that include compromising or redundant information.
![How normalization can remove sensitive data](https://cdn.prod.website-files.com/62796ab9647626cbab663f42/67a23e9966ead6fcee07e828_AD_4nXcphYE3L-VVwhvxOOfLL_g0XTVZxACR4q7S6knhLYjkAsOBtSKdXx6VKwMvMCG9MgGhziSGcN_L_m-wfK91VCM_HEX_FMHpEIQFpTkCuRGZiVdsTN8IZaRtTkmDcqFxayyFBrsjfw.png)
Just as important, since normalized data is free of unnecessary information and consistently formatted, an embedding model is more likely to embed it accurately.
This means a search with the embedded query can surface the most relevant embeddings, and the LLM that receives the corresponding context can then generate precise outputs.
![How normalized data leads to more accurate outputs](https://cdn.prod.website-files.com/62796ab9647626cbab663f42/67acead8b5a078afd5a2bd7b_AD_4nXdItXhwZZrMwlPmL8JjRLdV8xd4uSkBNh9kdhkDhum66wo7rLC1UgCZ8jy3leMSOZK4bdgg-8vwPZy7FgMoTfzPP583fBLnw62fUEbltsGj02x8KD6PBY4y2FTH2cUAJQz6e_zT.png)
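As a concrete example, a lightweight normalization pass before embedding might standardize date formats, redact obvious sensitive values, and drop exact duplicates. The field names and redaction pattern below are assumptions for illustration:

```python
# Illustrative normalization pass before embedding (field names and the
# redaction pattern are assumptions for the example).
import re
from datetime import datetime

EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def normalize_record(record: dict, date_format: str) -> dict:
    """Map a source system's record into one consistent shape."""
    return {
        # Standardize differently named and formatted date fields into ISO 8601.
        "created_at": datetime.strptime(record["created"], date_format).date().isoformat(),
        # Redact obvious sensitive values before the text is ever embedded.
        "text": EMAIL_PATTERN.sub("[REDACTED EMAIL]", record["body"]),
        "title": record["title"].strip(),
    }

def normalize_all(records: list[dict], date_format: str) -> list[dict]:
    normalized = [normalize_record(r, date_format) for r in records]
    # Drop exact duplicates so redundant chunks never reach the vector database.
    seen, deduped = set(), []
    for r in normalized:
        key = (r["title"], r["text"])
        if key not in seen:
            seen.add(key)
            deduped.append(r)
    return deduped
```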
Use raw data for edge cases
Your customers might use unique types of data to support their specific business processes.
![Examples of custom fields across systems of record](https://cdn.prod.website-files.com/62796ab9647626cbab663f42/67acec0152c48428bae1a93b_Screenshot%202025-02-12%20at%201.43.58%E2%80%AFPM.png)
It wouldn’t make sense to normalize data like this, but you may want to incorporate it into your RAG workflow when it’s relevant to a use case you support for a customer.
In short, it’s worth paying close attention to your customers’ data sets so that you can identify and account for these scenarios in your RAG implementation.
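For instance, you might fold a customer's raw custom fields into the text that gets embedded, but only for tenants that actually need them. The field names in this sketch are hypothetical:

```python
# Illustrative sketch: append raw custom fields to the text that gets embedded,
# only when a given customer is configured to use them (field names are hypothetical).
def build_embeddable_text(normalized: dict, raw: dict, include_custom_fields: bool) -> str:
    text = f"{normalized['title']}\n{normalized['text']}"
    if include_custom_fields:
        # Raw, non-normalized fields unique to this customer's workflow.
        custom = raw.get("custom_fields", {})
        extras = "\n".join(f"{name}: {value}" for name, value in custom.items())
        if extras:
            text = f"{text}\n{extras}"
    return text

# Example: a staffing customer tracks a "shift_preference" field no one else uses.
text = build_embeddable_text(
    {"title": "Worker profile", "text": "Warehouse associate, Denver."},
    {"custom_fields": {"shift_preference": "Nights"}},
    include_custom_fields=True,
)
```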
Leverage a unified API platform
A unified API platform allows you to add hundreds of integrations to your product through a single integration build. These integrations also span several software categories—from file storage to CRM to ticketing to ERP solutions.
![How a unified API solution can work](https://cdn.prod.website-files.com/62796ab9647626cbab663f42/671bdd04ebbcbfd23dd7cc67_AD_4nXehAsV4PlCaB9ENDqTPoZrefLrbwaNZRi-K_LTJ6w_LuJwkKz63n0qE9vrodpSvdLtaKzpZBcO-Sv9UOMSf5MVTw9KryqEd9eVsQ0Y8s--idq190Wg2PznVbdEW4jC8X_5KCqjNE7S8P4nC_uaeFePrixWb.png)
The platform can also normalize customer data automatically through its defined data models and access raw data from customers’ applications as needed.
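In practice, pulling already-normalized records from a unified API and feeding them into your embedding pipeline can be as simple as the sketch below. The endpoint URL, headers, and response shape here are hypothetical; check your provider's documentation for the real ones.

```python
# Illustrative sketch of fetching normalized records from a unified API
# (the endpoint, headers, and response shape are hypothetical).
import requests

def fetch_normalized_files(api_key: str, account_token: str) -> list[dict]:
    response = requests.get(
        "https://api.example-unified-api.com/filestorage/v1/files",  # hypothetical endpoint
        headers={
            "Authorization": f"Bearer {api_key}",
            "X-Account-Token": account_token,  # identifies the customer's linked account
        },
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["results"]

# Each record arrives in one consistent data model regardless of the underlying
# file storage tool, so it can flow straight into the normalize/chunk/embed steps above.
```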
Taken together, these capabilities let your product support any RAG use case effectively.
{{this-blog-only-cta}}