This tutorial was written by Kevin Kimani, a passionate developer and technical writer who enjoys explaining complicated concepts in a simple way. You can check out his GitHub or X account to connect and see more of his work!
As applications powered by artificial intelligence (AI) continue to reshape almost all industries, combining intelligent functionality with robust security is more important than ever. Retrieval-augmented generation (RAG) enhances the accuracy and relevance of AI-generated responses by integrating external knowledge into the process. Combining Descope for authentication, Supabase for the backend infrastructure, and pgvector for embedding storage and retrieval allows developers to create scalable, high-performance apps prioritizing functionality and security.
In this two-part series, you’ll learn how to combine all these tools to build a secure and intelligent RAG application. In this first part, you will set up the backend using Supabase and utilize pgvector to manage embeddings. In the second part of this series, you will integrate Descope for authentication as a custom Security Assertion Markup Language (SAML) provider and leverage the Supabase Row-Level Security (RLS) to implement granular permissions to make sure that data is accessible only to authorized users.
You can skip to the second part of the series here.
RAG app overview
You’re going to build a RAG app that allows users to query information on the Descope website, such as product info and documentation. The application will have two roles: developer and marketer. Users with the marketer role will be able to query product info–related data, while users with the developer role will be able to query information from the docs.
Here’s an overview of the whole process you’ll follow in this first part of the series:
Preprocess the knowledge base (in this case, Descope developer docs) and generate embeddings for each of these pages.
Store the page content, along with the generated embeddings in Supabase.
Prompt the user for input.
Generate embeddings for the user input and use it to perform a similarity search against the embeddings in the database.
Return the most similar pages and pass them to OpenAI API to generate a response for the user’s query.
Prerequisites
To complete this tutorial, you need the following:
A basic understanding of pgvector
Familiarity with AI concepts, like embeddings and vector similarity search
OpenAI developer account with credits
Node.js installed on your local machine
Setting up Supabase as your backend
Supabase is an open source backend-as-a-service built on PostgreSQL that offers features like authentication, real-time capabilities, and RLS. Its integration with pgvector and support for SAML-based single sign-on (SSO) via Descope make it ideal for securely managing embeddings in this RAG application.
To create a new Supabase app, navigate to the New Project page, provide all the required details, and click Create new project:

Once the new project is created, you need to enable the pgvector extension so that you can store and query embeddings. To do this, select SQL Editor on the sidebar, paste the following SQL code inside the editor, and execute it by clicking the Run button:
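A minimal version of this snippet is a single statement (the if not exists guard makes it safe to rerun):

```sql
-- Enable the pgvector extension so the database can store and query vector embeddings
create extension if not exists vector;
```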

The extension is now enabled. You can go ahead and create a table to store your embeddings.
Configuring pgvector for embeddings
pgvector is an open source PostgreSQL extension that adds support for storing and querying embeddings directly in the database, which eliminates the need for a separate vector database. With pgvector, you can perform vector similarity searches to find data points that are most similar to a given vector. This is achieved using distance metrics, such as Euclidean distance, cosine similarity, or inner product.
You also need to create a table in the database that stores embeddings alongside other relevant information. To do this, paste the following SQL query inside the SQL Editor:
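The original snippet isn't reproduced here, but a table matching the description below could look like the following sketch. The role, content, and embeddings columns come straight from the text; the id, url, and created_at columns are assumptions made to round out the six columns:

```sql
create table documents (
  id bigserial primary key,             -- unique identifier for each row
  role text not null,                   -- 'developer' or 'marketer'; used for access control later
  url text,                             -- source page the content was scraped from (assumed column)
  created_at timestamptz default now(), -- insertion timestamp (assumed column)
  content text not null,                -- original page text used to build the embedding
  embeddings vector(384)                -- 384-dimensional embedding from Supabase/gte-small
);
```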
This SQL query creates a table called documents with six columns. All the columns are explained in the snippet, but the most important ones here are role, content, and embeddings. The role column is crucial for defining access permissions, ensuring only authorized users can interact with specific data. The content column stores the original document text that is used to create the embedding and passed to the AI model. The embeddings column is defined with the VECTOR data type and a size of 384, specifying the number of dimensions each vector holds. This size is set to 384 to align with the output dimensions of the Supabase/gte-small model, which you’ll use to generate embeddings.
Highlight the query in the editor and execute it by clicking the Run selected button.
Next, you need to perform a similarity search over the embeddings stored in your database to find those most closely related to the user’s query embedding. To do this, paste this code inside the SQL Editor to create a database function:
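The function body isn't shown here; a sketch consistent with the description that follows, using the cosine distance operator <=>, might look like this:

```sql
create or replace function get_relevant_docs(
  query_embedding vector(384), -- embedding of the user's query
  threshold float,             -- minimum similarity score to keep a document
  docs_count int               -- maximum number of documents to return
)
returns table (id bigint, content text, similarity float)
language sql stable
as $$
  select
    documents.id,
    documents.content,
    1 - (documents.embeddings <=> query_embedding) as similarity
  from documents
  where 1 - (documents.embeddings <=> query_embedding) > threshold
  order by documents.embeddings <=> query_embedding
  limit docs_count;
$$;
```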
This code defines a SQL function named get_relevant_docs that retrieves relevant documents based on their similarity to a given query embedding. It takes three parameters:
query_embedding is the vector representing the query.
threshold is the minimum similarity score for a document to be considered relevant.
docs_count is the number of documents to return.
The function calculates the similarity between each document’s embedding and the query embedding using the <=> operator, filters out documents below the similarity threshold, and then returns the top docs_count documents, ordered by similarity. The results include each document’s ID, content, and calculated similarity.
You will call this function later on from the client code.
Remember to run this query to create the function.
Generating and inserting embeddings into the database
Before inserting data into the database, you need to create a Node.js project, which allows you to use the supabase-js library to insert data and query the database.
To create a Node.js project, create a folder named descope-supabase-rag-app, open it in the terminal, and execute the following command:
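The exact command isn't shown, but initializing a package with default settings is typically:

```bash
npm init -y
```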
Add "type": "module" into the package.json file to ensure Node.js treats all .js files in your project as ES modules. This eliminates the need for .mjs extensions and allows you to use modern import and export syntax seamlessly throughout the project.
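For example, the relevant part of package.json would look something like this (other fields omitted):

```json
{
  "type": "module"
}
```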
Install the relevant dependencies using the following command:
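Based on the dependency list that follows, the command would be along these lines:

```bash
npm install dotenv @supabase/supabase-js @xenova/transformers openai puppeteer p-limit strip-indent
```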
Here’s how you use each of these dependencies:
dotenv is for loading environment variables from a .env file.
@supabase/supabase-js is used to interact with Supabase.
@xenova/transformers is for using Transformer models for embedding generation.
openai is used to interact with OpenAI APIs.
puppeteer is for scraping the Descope website for data.
p-limit is used to run multiple async functions with limited concurrency.
strip-indent is used to remove leading white space from each line in a string.
You also need to define which data you want to scrape from the Descope website. For this, create a file named lib/urls.js in the project root folder. This file holds all the URLs to be scraped. Add the following code to this file:
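The original file isn't reproduced here; a shortened sketch with illustrative placeholder URLs (the real file lists many more pages per role) could look like this:

```javascript
// lib/urls.js
// Illustrative subset of URLs; replace or extend with the pages you want to scrape
export const marketerUrls = [
  { role: 'marketer', url: 'https://www.descope.com/' },
];

export const devUrls = [
  { role: 'developer', url: 'https://docs.descope.com/' },
];

// Combine both arrays so the scraper can iterate over a single list
export const urls = [...marketerUrls, ...devUrls];
```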
This code defines two arrays, marketerUrls and devUrls, each containing objects with a role and a URL. The URLs in marketerUrls are for marketing-related resources on Descope, while devUrls contains links to technical documentation for developers. Both arrays are combined into one urls array for centralized access.
After the URLs are scraped, you need to insert the data into Supabase. To authenticate your requests so you can interact with the Supabase API, create a new file named .env in the project root folder and add the following code:
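The variable names below are assumptions; any names work as long as they match what you read in lib/supabase.js:

```
SUPABASE_URL=<your-project-url>
SUPABASE_ANON_KEY=<your-anon-key>
```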
Replace the placeholder values with the actual values that are available on your project’s API settings on the Supabase dashboard, which you can access by selecting Project Settings > API on the sidebar:

The project URL is available under Project URL, while the project anon key is available under Project API Keys.
Create a file named supabase.js in the lib folder and add the following code to create a Supabase client instance that is exported for use in other parts of the application:
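A minimal version of this module, assuming the environment variable names from the .env sketch above, might look like this:

```javascript
// lib/supabase.js
import 'dotenv/config';
import { createClient } from '@supabase/supabase-js';

// Create a single Supabase client instance shared across the app
export const supabase = createClient(
  process.env.SUPABASE_URL,
  process.env.SUPABASE_ANON_KEY
);
```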
Your app also needs a function for generating embeddings. Create a new file named extractor.js in the lib folder and add the following code:
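A sketch of this module, matching the description below (the returned { embedding, size } shape is an assumption that the later scripts rely on), could look like this:

```javascript
// lib/extractor.js
import { pipeline } from '@xenova/transformers';

// Load the feature-extraction pipeline once, using the Supabase/gte-small model
const extractor = await pipeline('feature-extraction', 'Supabase/gte-small');

export async function runExtractor(text) {
  try {
    // Generate an embedding with mean pooling and normalization
    const output = await extractor(text, { pooling: 'mean', normalize: true });
    // output.data is a typed array of 384 floats; output.size is its length
    return { embedding: Array.from(output.data), size: output.size };
  } catch (error) {
    console.error('Error generating embedding:', error);
    throw error;
  }
}
```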
This code imports the pipeline function from the @xenova/transformers package and creates an extractor using the Supabase/gte-small model for feature extraction. The runExtractor function uses this extractor to generate embeddings for a given text input. It processes the text, applies mean pooling and normalization to the resulting embeddings, and returns the embedding data along with its size. If an error occurs during the extraction process, it logs the error and throws it.
Next, create a script that scrapes the pages and uploads the data to Supabase. Create a file named scrape-and-upload.js in the project root folder and add the following code:
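The full script isn't shown here; the following is a compact sketch under the assumptions made so far (the modules sketched above, a url column on the documents table, and placeholder values for concurrency and batch size):

```javascript
// scrape-and-upload.js
import puppeteer from 'puppeteer';
import pLimit from 'p-limit';
import stripIndent from 'strip-indent';
import { supabase } from './lib/supabase.js';
import { urls } from './lib/urls.js';
import { runExtractor } from './lib/extractor.js';

// Split an array into chunks so rows can be inserted in manageable batches
function chunkArray(array, size) {
  const chunks = [];
  for (let i = 0; i < array.length; i += size) {
    chunks.push(array.slice(i, i + size));
  }
  return chunks;
}

// Scrape one page, generate its embedding, and return a row for the documents table
async function scrapePage(browser, { role, url }) {
  const page = await browser.newPage();
  await page.goto(url, { waitUntil: 'networkidle2' });
  const text = stripIndent(await page.evaluate(() => document.body.innerText));
  await page.close();
  const { embedding } = await runExtractor(text);
  return { role, url, content: text, embeddings: embedding };
}

async function scrapeAndPushData() {
  const browser = await puppeteer.launch();
  try {
    // Limit concurrency so only a few pages are scraped at a time
    const limit = pLimit(3);
    const rows = await Promise.all(
      urls.map((entry) => limit(() => scrapePage(browser, entry)))
    );

    // Insert the scraped rows into Supabase in chunks
    for (const chunk of chunkArray(rows, 10)) {
      const { error } = await supabase.from('documents').insert(chunk);
      if (error) console.error('Insert error:', error);
    }
  } catch (error) {
    console.error('Scraping failed:', error);
  } finally {
    await browser.close();
  }
}

await scrapeAndPushData();
```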
This code scrapes web pages, extracts content, generates embeddings for the text, and pushes the data to a Supabase database. It defines a utility function, chunkArray, used to split data into manageable batches, and imports runExtractor to generate embeddings for a given text. The scrapePage function navigates to a given URL, scrapes the page content, generates embeddings, and returns the results. The main scrapeAndPushData function orchestrates the scraping process using concurrency limits, collects the scraped data, and pushes it to the Supabase database in chunks. If any errors occur during the process, they are logged, and the function ensures the browser is closed after completion.
Execute the code using the command node scrape-and-upload.js.
Querying and retrieving similar vectors based on embeddings
Now that you have all the necessary data for the RAG app from the Descope website in the database, you can try to retrieve similar vectors based on embeddings. To do this, create a new file in the project root folder named retrieve-relevant-docs.js and add the following code:
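A sketch consistent with the description below; the example query and the 0.8 similarity threshold are illustrative values:

```javascript
// retrieve-relevant-docs.js
import { supabase } from './lib/supabase.js';
import { runExtractor } from './lib/extractor.js';

const query = 'How do I add a Descope flow to my app?'; // illustrative query

try {
  // Generate an embedding for the query text
  const { embedding } = await runExtractor(query);

  // Call the get_relevant_docs database function created earlier
  const { data, error } = await supabase.rpc('get_relevant_docs', {
    query_embedding: embedding,
    threshold: 0.8,
    docs_count: 3,
  });

  if (error) throw error;
  console.log(data);
} catch (error) {
  console.error('Error retrieving relevant documents:', error);
}
```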
This code imports the runExtractor function to generate embeddings for a query and uses Supabase’s rpc method to call the database function named get_relevant_docs that you created earlier to retrieve documents related to the query. It first generates embeddings for the input query. It then passes the embeddings, a similarity threshold, and a limit on the number of documents to Supabase to get relevant documents. If successful, it logs the retrieved documents; otherwise, it handles and logs any errors.
Execute the code using the command node retrieve-relevant-docs.js. You should then get three documents that are related to your query.
Building the RAG component
At this point, you’ve set up everything except the RAG component. As mentioned, RAG enhances language model responses by incorporating external knowledge. You’ve already prepared a knowledge base and implemented a way to fetch relevant documents based on user queries. Now, you pass these documents to the OpenAI GPT-3.5 Turbo model, enabling it to generate more accurate and contextually relevant responses.
Navigate to the API keys page on your OpenAI developer dashboard and click the + Create new secret key button to launch the Create new secret key modal. Fill out the required details, click Create secret key, and copy your secret key:

Open the .env file in the project root folder and paste the following code, replacing the placeholder value with the secret key you just copied:
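The variable name below is an assumption; it just needs to match what lib/openai.js reads:

```
OPENAI_API_KEY=<your-openai-secret-key>
```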
Create a new file named openai.js in the lib folder. Then add the following code to initialize an OpenAI client instance with the secret key from environment variables and export it for use in other parts of the application:
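A minimal version of this module might look like the following:

```javascript
// lib/openai.js
import 'dotenv/config';
import OpenAI from 'openai';

// Create a single OpenAI client instance shared across the app
export const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
```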
Create a file named generate_response.js in the project root folder and add the following code to pass the contextually relevant documents to the OpenAI model and generate a response:
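The full script isn't reproduced here; a sketch that follows the description below (the example query, threshold, and prompt wording are illustrative) could look like this:

```javascript
// generate_response.js
import { supabase } from './lib/supabase.js';
import { runExtractor } from './lib/extractor.js';
import { openai } from './lib/openai.js';

const query = 'How does Descope handle passwordless authentication?'; // illustrative query

try {
  // Generate an embedding for the user's query
  const { embedding } = await runExtractor(query);

  // Fetch the most relevant documents from Supabase
  const { data: documents, error } = await supabase.rpc('get_relevant_docs', {
    query_embedding: embedding,
    threshold: 0.8,
    docs_count: 3,
  });
  if (error) throw error;

  // Concatenate the retrieved documents into a single context string
  const context = documents.map((doc) => doc.content).join('\n---\n');

  // Ask GPT-3.5 Turbo to answer using only the provided context
  const completion = await openai.chat.completions.create({
    model: 'gpt-3.5-turbo',
    messages: [
      {
        role: 'system',
        content:
          'Answer the question using only the provided context. If the context does not contain the answer, reply: "Sorry, I can\'t help with that."',
      },
      { role: 'user', content: `Context:\n${context}\n\nQuestion: ${query}` },
    ],
  });

  console.log(completion.choices[0].message.content);
} catch (error) {
  console.error('Error generating response:', error);
}
```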
This code first generates embeddings for a user query using the runExtractor function. It then uses the Supabase rpc method to fetch relevant documents from the database based on the generated embeddings. The retrieved documents are concatenated to form the context for the response. The code then formats the context and user query into a message for the OpenAI API to generate a response using the GPT-3.5 Turbo model. The assistant’s response is based on the relevant documents, and if the context does not provide an answer, it defaults to saying, “Sorry, I can’t help with that.” Finally, any errors in the process are logged.
Run this code using the command node generate_response.js. You will then get an output similar to the following:
If you change the query to something like “Why is the sky blue?”, you should get the following output:
This confirms that the RAG application is working as expected!
Conclusion
In this blog, you prepared your Supabase database for embedding data, used pgvector to store and retrieve embeddings, and created a database function to retrieve relevant documents. These documents were then passed to the language model as context, enabling it to generate accurate and contextually relevant responses.
In the next part of the series, you will learn how to integrate Descope as a SAML provider for secure login and access control, implement RLS, and build the frontend for the RAG application. Subscribe to our blog or follow us on LinkedIn and Bluesky to stay up to date when the next blog is published!