
This tutorial was written by Kevin Kimani, a passionate developer and technical writer who enjoys explaining complicated concepts in a simple way. You can check out his GitHub or X account to connect and see more of his work!


As applications powered by artificial intelligence (AI) continue to reshape almost all industries, combining intelligent functionality with robust security is more important than ever. Retrieval-augmented generation (RAG) enhances the accuracy and relevance of AI-generated responses by integrating external knowledge into the process. Combining Descope for authentication, Supabase for the backend infrastructure, and pgvector for embedding storage and retrieval allows developers to create scalable, high-performance apps prioritizing functionality and security.

In this two-part series, you’ll learn how to combine all these tools to build a secure and intelligent RAG application. In this first part, you will set up the backend using Supabase and utilize pgvector to manage embeddings. In the second part of this series, you will integrate Descope for authentication as a custom Security Assertion Markup Language (SAML) provider and leverage the Supabase Row-Level Security (RLS) to implement granular permissions to make sure that data is accessible only to authorized users.

You can skip to the second part of the series here.

RAG app overview

You’re going to build a RAG app that allows users to query information on the Descope website, such as product info and documentation. The application will have two roles: developer and marketer. Users with the marketer role will be able to query product info–related data, while users with the developer role will be able to query information from the docs.

Here’s an overview of the whole process you’ll follow in this first part of the series:

  1. Preprocess the knowledge base (in this case, Descope developer docs) and generate embeddings for each of these pages.

  2. Store the page content, along with the generated embeddings in Supabase.

  3. Prompt the user for input.

  4. Generate an embedding for the user's input and use it to perform a similarity search against the embeddings stored in the database.

  5. Return the most similar pages and pass them to the OpenAI API to generate a response to the user's query.

Prerequisites

To complete this tutorial, you need the following:

Setting up Supabase as your backend

Supabase is an open source backend-as-a-service built on PostgreSQL that offers features like authentication, real-time capabilities, and RLS. Its integration with pgvector and support for SAML-based single sign-on (SSO) via Descope make it ideal for securely managing embeddings in this RAG application.

To create a new Supabase app, navigate to the New Project page, provide all the required details, and click Create new project:

Fig: Providing new project details

Once the new project is created, you need to enable the pgvector extension so that you can store and query embeddings. To do this, select SQL Editor on the sidebar, paste the following SQL code inside the editor, and execute it by clicking the Run button:
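A minimal statement to enable the extension is a single line (the `if not exists` guard makes it safe to re-run):

```sql
-- Enable pgvector so the database can store and query embeddings
create extension if not exists vector;
```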

Fig: Enabling vector extension

The extension is now enabled. You can go ahead and create a table to store your embeddings.

Configuring pgvector for embeddings

pgvector is an open source PostgreSQL extension that adds support for storing and querying embeddings directly in the database, which eliminates the need for a separate vector database. With pgvector, you can perform vector similarity searches to find data points that are most similar to a given vector. This is achieved using distance metrics, such as Euclidean distance, cosine similarity, or inner product.

You also need to create a table in the database that stores embeddings alongside other relevant information. To do this, paste the following SQL query inside the SQL Editor:
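A sketch of the `documents` table follows; the `role`, `content`, and `embeddings` columns match the description below, while `url` and `created_at` are assumed names for the remaining columns:

```sql
create table documents (
  id bigserial primary key,              -- unique identifier for each document
  url text,                              -- source page the content was scraped from
  role text not null,                    -- 'marketer' or 'developer'; drives access control
  content text not null,                 -- original page text used to create the embedding
  embeddings vector(384),                -- 384 dimensions to match Supabase/gte-small
  created_at timestamptz default now()   -- when the row was inserted
);
```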

This SQL query creates a table called documents with six columns. All the columns are explained in the snippet's comments, but the most important ones here are role, content, and embeddings. The role column is crucial for defining access permissions, ensuring only authorized users can interact with specific data. The content column stores the original document text that is used to create the embedding and is passed to the AI model. The embeddings column is defined with the VECTOR data type and a size of 384, specifying the number of dimensions each vector holds. This size is set to 384 to align with the output dimensions of the Supabase/gte-small model, which you’ll use to generate embeddings.

Highlight the query in the editor and execute it by clicking the Run selected button.

Next, you need to perform a similarity search over the embeddings stored in your database to find those most closely related to the user’s query embedding. To do this, paste this code inside the SQL Editor to create a database function:
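A sketch of such a function, assuming cosine distance via the `<=>` operator (similarity is 1 minus the distance):

```sql
create or replace function get_relevant_docs(
  query_embedding vector(384),  -- the vector representing the query
  threshold float,              -- minimum similarity score to be considered relevant
  docs_count int                -- number of documents to return
)
returns table (id bigint, content text, similarity float)
language sql stable
as $$
  select
    documents.id,
    documents.content,
    1 - (documents.embeddings <=> query_embedding) as similarity
  from documents
  where 1 - (documents.embeddings <=> query_embedding) > threshold
  order by similarity desc
  limit docs_count;
$$;
```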

This code defines a SQL function named get_relevant_docs that retrieves relevant documents based on their similarity to a given query embedding. It takes three parameters:

  • query_embedding is the vector representing the query.

  • threshold is the minimum similarity score for a document to be considered relevant.

  • docs_count is the number of documents to return.

The function calculates the similarity between each document’s embedding and the query embedding using the <=> operator, filters out documents whose similarity falls below the threshold, and returns the top docs_count documents ordered by similarity. The results include each document’s ID, content, and calculated similarity.

You will call this function later on from the client code.

Remember to run this query to create the function.

Generating and inserting embeddings into the database

Before inserting data into the database, you need to create a Node.js project, which allows you to use the supabase-js library to insert data and query the database.

To create a Node.js project, create a folder named descope-supabase-rag-app, open it in the terminal, and execute the following command:
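For example, you can scaffold the project with default settings:

```shell
npm init -y
```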

Add "type": "module" into the package.json to ensure Node.js treats all .js files in your project as ES modules. This eliminates the need for .mjs extensions and allows you to use modern import and export syntax seamlessly throughout the project.
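After this change, your package.json should contain something like the following (other generated fields omitted):

```json
{
  "name": "descope-supabase-rag-app",
  "type": "module"
}
```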

Install the relevant dependencies using the following command:
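One way to install everything in a single command:

```shell
npm install dotenv @supabase/supabase-js @xenova/transformers openai puppeteer p-limit strip-indent
```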

Here’s how you use each of these dependencies:

  • dotenv is for loading environment variables from a .env file.

  • @supabase/supabase-js is used to interact with Supabase.

  • @xenova/transformers is for using Transformer models for embedding generation.

  • openai is used to interact with OpenAI APIs.

  • puppeteer is for scraping the Descope website for data.

  • p-limit is used to run multiple async functions with limited concurrency.

  • strip-indent is used to remove leading white space from each line in a string.

You also need to define which data you want to scrape from the Descope website. For this, create a file named lib/urls.js in the project root folder. This file holds all the URLs to be scraped. Add the following code to this file:
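A sketch of lib/urls.js; the specific URLs here are placeholders, so substitute the exact pages you want to scrape:

```javascript
// lib/urls.js
// Marketing-related pages, queryable by users with the 'marketer' role
export const marketerUrls = [
  { role: 'marketer', url: 'https://www.descope.com/' },
  { role: 'marketer', url: 'https://www.descope.com/product' },
];

// Developer documentation pages, queryable by users with the 'developer' role
export const devUrls = [
  { role: 'developer', url: 'https://docs.descope.com/' },
  { role: 'developer', url: 'https://docs.descope.com/getting-started' },
];

// Combined list for centralized access during scraping
export const urls = [...marketerUrls, ...devUrls];
```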

This code defines two arrays, marketerUrls and devUrls, each containing objects with a role and a URL. The URLs in marketerUrls are for marketing-related resources on Descope, while devUrls contains links to technical documentation for developers. Both arrays are combined into one urls array for centralized access.

After the URLs are scraped, you need to insert the data into Supabase. To authenticate your requests so you can interact with the Supabase API, create a new file named .env in the project root folder and add the following code:
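A sketch of the .env file; the variable names SUPABASE_URL and SUPABASE_ANON_KEY are assumptions, so keep them consistent with whatever names your client code reads:

```
SUPABASE_URL=<your-project-url>
SUPABASE_ANON_KEY=<your-anon-key>
```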

Replace the placeholder values with the actual values that are available on your project’s API settings on the Supabase dashboard, which you can access by selecting Project Settings > API on the sidebar:

Fig: Retrieving Supabase credentials

The project URL is available under Project URL, while the project anon key is available under Project API Keys.

Create a file named supabase.js in the lib folder and add the following code to create a Supabase client instance that is exported for use in other parts of the application:
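A minimal lib/supabase.js, assuming the environment variable names used earlier:

```javascript
// lib/supabase.js
import 'dotenv/config';
import { createClient } from '@supabase/supabase-js';

// Single shared client instance, reused across the app
export const supabase = createClient(
  process.env.SUPABASE_URL,
  process.env.SUPABASE_ANON_KEY
);
```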

Your app also needs a function for generating embeddings. Create a new file named extractor.js in the lib folder and add the following code:
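A sketch of lib/extractor.js based on the behavior described below; the exact return shape is an assumption:

```javascript
// lib/extractor.js
import { pipeline } from '@xenova/transformers';

// Create a feature-extraction pipeline backed by the Supabase/gte-small model.
// The model is downloaded and cached on first use.
const extractorPromise = pipeline('feature-extraction', 'Supabase/gte-small');

export async function runExtractor(text) {
  try {
    const extractor = await extractorPromise;
    // Mean pooling and normalization produce one fixed-size vector per input
    const output = await extractor(text, { pooling: 'mean', normalize: true });
    const embedding = Array.from(output.data);
    return { embedding, size: embedding.length }; // size should be 384
  } catch (err) {
    console.error('Embedding extraction failed:', err);
    throw err;
  }
}
```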

This code imports the pipeline function from the @xenova/transformers package and creates an extractor using the Supabase/gte-small model for feature extraction. The runExtractor function uses this extractor to generate embeddings for a given text input. It processes the text, applies mean pooling and normalization to the resulting embeddings, and returns the embedding data along with its size. If an error occurs during the extraction process, it logs the error and throws it.

Next, create a script that scrapes the pages and uploads the data to Supabase. Create a file named scrape-and-upload.js in the project root folder and add the following code:
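A condensed sketch of scrape-and-upload.js; the column names in the insert match the table sketched earlier, and the concurrency limit and batch size are arbitrary choices:

```javascript
// scrape-and-upload.js
import puppeteer from 'puppeteer';
import pLimit from 'p-limit';
import { urls } from './lib/urls.js';
import { runExtractor } from './lib/extractor.js';
import { supabase } from './lib/supabase.js';

// Split an array into batches of `size` for chunked inserts
function chunkArray(arr, size) {
  const chunks = [];
  for (let i = 0; i < arr.length; i += size) {
    chunks.push(arr.slice(i, i + size));
  }
  return chunks;
}

// Scrape one page, generate its embedding, and return a row for the documents table
async function scrapePage(browser, { role, url }) {
  const page = await browser.newPage();
  try {
    await page.goto(url, { waitUntil: 'networkidle2' });
    const content = await page.evaluate(() => document.body.innerText);
    const { embedding } = await runExtractor(content);
    return { role, url, content, embeddings: embedding };
  } finally {
    await page.close();
  }
}

async function scrapeAndPushData() {
  const browser = await puppeteer.launch();
  const limit = pLimit(3); // scrape at most three pages concurrently
  try {
    const rows = await Promise.all(
      urls.map((entry) => limit(() => scrapePage(browser, entry)))
    );
    // Insert in chunks to keep each request small
    for (const chunk of chunkArray(rows, 10)) {
      const { error } = await supabase.from('documents').insert(chunk);
      if (error) throw error;
    }
    console.log(`Inserted ${rows.length} documents.`);
  } catch (err) {
    console.error('Scrape-and-upload failed:', err);
  } finally {
    await browser.close();
  }
}

scrapeAndPushData();
```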

This code scrapes web pages, extracts content, generates embeddings for the text, and pushes the data to a Supabase database. It defines a utility function, chunkArray, to split the scraped data into manageable batches, and reuses the runExtractor function to generate embeddings for a given text. The scrapePage function navigates to a given URL, scrapes the page content, generates embeddings, and returns the results. The main scrapeAndPushData function orchestrates the scraping process using concurrency limits, collects the scraped data, and pushes it to the Supabase database in chunks. If any errors occur during the process, they are logged, and the function ensures the browser is closed after completion.

Execute the code using the command node scrape-and-upload.js.

Querying and retrieving similar vectors based on embeddings

Now that you have all the necessary data for the RAG app from the Descope website in the database, you can try to retrieve similar vectors based on embeddings. To do this, create a new file in the project root folder named retrieve-relevant-docs.js and add the following code:
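A sketch of retrieve-relevant-docs.js; the query string, threshold, and document count are example values to tune for your data:

```javascript
// retrieve-relevant-docs.js
import { runExtractor } from './lib/extractor.js';
import { supabase } from './lib/supabase.js';

const query = 'How do I set up SAML SSO with Descope?'; // example query

async function retrieveRelevantDocs() {
  // Embed the query with the same model used for the documents
  const { embedding } = await runExtractor(query);

  // Call the get_relevant_docs database function created earlier
  const { data, error } = await supabase.rpc('get_relevant_docs', {
    query_embedding: embedding,
    threshold: 0.5, // minimum similarity score
    docs_count: 3,  // number of documents to return
  });

  if (error) {
    console.error('Similarity search failed:', error);
    return;
  }
  console.log(data);
}

retrieveRelevantDocs();
```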

This code imports the runExtractor function to generate embeddings for a query and uses Supabase’s rpc method to call the database function named get_relevant_docs that you created earlier to retrieve documents related to the query. It first generates embeddings for the input query. It then passes the embeddings, a similarity threshold, and a limit on the number of documents to Supabase to get relevant documents. If successful, it logs the retrieved documents; otherwise, it handles and logs any errors.

Execute the code using the command node retrieve-relevant-docs.js. You should then get three documents that are related to your query.

Building the RAG component

At this point, you’ve set up everything except the RAG component. As mentioned, RAG enhances language model responses by incorporating external knowledge. You’ve already prepared a knowledge base and implemented a way to fetch relevant documents based on user queries. Now, you’ll pass these documents to the OpenAI GPT-3.5 Turbo model, enabling it to generate more accurate and contextually relevant responses.

Navigate to the API keys page on your OpenAI developer dashboard and click the + Create new secret key button to launch the Create new secret key modal. Fill out the required details, click the Create secret key button, and copy your secret key:

Fig: Obtaining OpenAI secret key

Open the .env file in the project root folder and paste the following code, replacing the placeholder value with the secret key you just copied:
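The line to add to .env; the variable name OPENAI_API_KEY is an assumption, so match whatever name your client code reads:

```
OPENAI_API_KEY=<your-openai-secret-key>
```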

Create a new file named openai.js in the lib folder. Then add the following code to initialize an OpenAI client instance with the secret key from environment variables and export it for use in other parts of the application:
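A minimal lib/openai.js, assuming the OPENAI_API_KEY variable name used in .env:

```javascript
// lib/openai.js
import 'dotenv/config';
import OpenAI from 'openai';

// Shared OpenAI client, reused across the app
export const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
```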

Create a file named generate_response.js in the project root folder and add the following code to pass the contextually relevant documents to the OpenAI model and generate a response:
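A sketch of generate_response.js; the example query, threshold, and prompt wording are assumptions, while the fallback answer matches the behavior described below:

```javascript
// generate_response.js
import stripIndent from 'strip-indent';
import { runExtractor } from './lib/extractor.js';
import { supabase } from './lib/supabase.js';
import { openai } from './lib/openai.js';

const query = 'How do I create a Descope flow?'; // example query

async function generateResponse() {
  try {
    // Embed the query and fetch the most similar documents
    const { embedding } = await runExtractor(query);
    const { data: docs, error } = await supabase.rpc('get_relevant_docs', {
      query_embedding: embedding,
      threshold: 0.5,
      docs_count: 3,
    });
    if (error) throw error;

    // Concatenate the retrieved documents into a single context block
    const context = docs.map((doc) => doc.content).join('\n---\n');

    const completion = await openai.chat.completions.create({
      model: 'gpt-3.5-turbo',
      messages: [
        {
          role: 'system',
          content: stripIndent(`
            You are a helpful assistant. Answer the user's question using only
            the context below. If the context does not contain the answer, say
            "Sorry, I can't help with that."

            Context:
            ${context}
          `),
        },
        { role: 'user', content: query },
      ],
    });

    console.log(completion.choices[0].message.content);
  } catch (err) {
    console.error('Failed to generate response:', err);
  }
}

generateResponse();
```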

This code first generates embeddings for a user query using the runExtractor function. It then uses the Supabase rpc function to fetch relevant documents from the database based on the generated embeddings. The retrieved documents are concatenated to form the context for the response. The code then formats the context and user query into a message for the OpenAI API to generate a response using the GPT-3.5 Turbo model. The assistant’s response is based on the relevant documents, and if the context does not provide an answer, it defaults to saying, “Sorry, I can’t help with that.” Finally, any errors in the process are logged.

Run this code using the command node generate_response.js. You will then get an output similar to the following:

If you change the query to something like “Why is the sky blue?”, you should get the following output:

This confirms that the RAG application is working as expected!

Conclusion

In this blog, you prepared your Supabase database for embedding data, used pgvector to store and retrieve embeddings, and created a database function to retrieve relevant documents. These documents were then passed to the language model as context, enabling it to generate accurate and contextually relevant responses.

In the next part of the series, you will learn how to integrate Descope as a SAML provider for secure login and access control, implement RLS, and build the frontend for the RAG application. Subscribe to our blog or follow us on LinkedIn and Bluesky to stay up to date when the next blog is published!