Gouranga's Tech Blog: 23ai

Jul 29, 2025

RAG (Retrieval-Augmented Generation) explained

RAG (Retrieval-Augmented Generation) is a framework in Artificial Intelligence that combines retrieval-based methods with generative models (like GPT or T5) to produce more accurate, informative, and context-aware responses.

RAG by NVIDIA:
Retrieval-augmented generation is a technique for enhancing the accuracy and reliability of generative AI models with information from specific and relevant data sources. In other words, it fills a gap in how LLMs work. Under the hood, LLMs are neural networks, typically measured by how many parameters they contain. An LLM’s parameters essentially represent the general patterns of how humans use words to form sentences.

RAG by Google Cloud:
RAG (Retrieval-Augmented Generation) is an AI framework that combines the strengths of traditional information retrieval systems (such as search and databases) with the capabilities of generative large language models (LLMs). By combining your data and world knowledge with LLM language skills, grounded generation is more accurate, up-to-date, and relevant to your specific needs.

RAG by AWS:
Retrieval-Augmented Generation (RAG) is the process of optimizing the output of a large language model, so it references an authoritative knowledge base outside of its training data sources before generating a response. Large Language Models (LLMs) are trained on vast volumes of data and use billions of parameters to generate original output for tasks like answering questions, translating languages, and completing sentences. RAG extends the already powerful capabilities of LLMs to specific domains or an organization's internal knowledge base, all without the need to retrain the model. It is a cost-effective approach to improving LLM output so it remains relevant, accurate, and useful in various contexts.

Maybe I can explain RAG as follows:
Retrieval-Augmented Generation (RAG) is a hybrid AI architecture.
It augments a language model with an external knowledge base (retrieved at runtime).
This helps the model access up-to-date or domain-specific information that may not be in its training data.

How does it work?
Query → User asks a question.
Retrieval → A retriever (like FAISS, Elasticsearch, etc.) fetches relevant documents from a knowledge base.
Augmentation → Retrieved documents are appended to the prompt for a generative model.
Generation → The generative model (e.g., GPT, T5) produces an answer using both the query and the retrieved context.
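
The four steps above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the knowledge base is a hypothetical in-memory list, retrieval uses simple word-overlap cosine similarity instead of a real retriever such as FAISS, and generate() is a placeholder where a real LLM call would go.

```python
import math
import re
from collections import Counter

# Hypothetical in-memory knowledge base; a real system would use a vector store.
DOCUMENTS = [
    "The Health Secure Plan covers cancer treatment up to the insured sum.",
    "Motor insurance claims must be filed within 30 days of the incident.",
    "Travel insurance covers lost baggage and trip cancellation.",
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, documents, k=1):
    """Retrieval: return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    score = lambda d: cosine_similarity(q, Counter(tokenize(d)))
    return sorted(documents, key=score, reverse=True)[:k]

def augment(query, context_docs):
    """Augmentation: place the retrieved text in front of the question."""
    context = "\n".join(f"- {d}" for d in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

def generate(prompt):
    """Generation: placeholder only (assumption); a real system calls an LLM here."""
    return f"[LLM answer based on a prompt of {len(prompt)} characters]"

query = "What does my health plan cover for cancer?"   # Query
docs = retrieve(query, DOCUMENTS, k=1)                  # Retrieval
prompt = augment(query, docs)                           # Augmentation
answer = generate(prompt)                               # Generation
print(prompt)
print(answer)
```

The point of the sketch is the shape of the flow: the model never sees the whole knowledge base, only the few passages the retriever picked for this question.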

Real-World Use Case Example
Use Case: Customer Support Bot for an Insurance Company

Problem:
The chatbot should answer policy-related questions like:
“What does my health insurance plan cover for cancer?”
A traditional model might say:
“Health insurance may or may not cover cancer based on the plan.”

A RAG-based chatbot:
Retrieves the actual policy documents from the company’s internal knowledge base.
Generates an answer like:
“According to your Health Secure Plan (Policy ID: XXZZ2025), cancer benefits cover up to ₹500,000, including pre- and post-hospitalization expenses.”

Personalized
Accurate
Trustworthy

When to Use RAG?

When information is frequently changing (e.g., news, prices, health guidelines).
When custom domain knowledge is critical.
When you want to combine accuracy with natural language fluency.

Agentic RAG
Agentic AI and LLM agents have been around for a few months, and the idea is simple: AI systems that don’t just give answers but actively assist with tasks, adapt to new information, and work independently when needed. The challenge, though, is making these systems reliable and up-to-date, especially in areas where the information changes all the time.

RAG was one of the first approaches to tackle this problem. It combines two things: the ability to fetch real-time, relevant data (retrieval) and the power to generate responses using that data (generation). As soon as agentic AI became a focus, RAG quickly stood out as a natural fit. It gave AI systems the ability to stay current and respond with information that made sense for the situation.

What are RAG agents?
RAG agents are AI tools designed to do more than retrieve and generate—they’re built for doing specific tasks. Think of them as goal-oriented assistants that know where to find the information they need and how to use it. Instead of generic answers, they’re tailored for real-world situations.

For example:
A RAG agent in customer support doesn’t only tell you the refund policy; it finds the exact details for your specific order.
In healthcare, a RAG agent doesn’t only summarize medical studies; it pulls the most relevant research based on a patient’s case.
So, where a plain LLM-based RAG setup only answers questions, RAG agents fit into workflows and make decisions based on fresh, relevant data.

RAG agent frameworks (Most popular in 2025)
Some well-known agentic RAG frameworks are LangChain, LlamaIndex Agents, DB-GPT, Qdrant RAG Eval, MetaGPT, RAGapp, GPT-RAG by Azure, IBM Granite, Langflow, AgentGPT, CrewAI, etc.

RAG vs. Semantic Search

Feature	RAG (Retrieval-Augmented Generation)	Semantic Search
Definition	Combines document retrieval with a generative LLM to answer queries in natural language	Retrieves documents or chunks that semantically match a query (no generation)
Output	Synthesized, human-like response using retrieved content	List of relevant documents or passages
Use Case	Question answering, summarization, chatbots	Search engines, document finders, internal knowledge lookup
Model Role	Uses both retriever + generator (e.g., vector search + GPT)	Only uses retriever (e.g., vector embeddings + similarity search)
Latency	Higher (retrieval + generation step)	Lower (just retrieval)
Explainability	Moderately explainable (shows retrieved docs and generation)	High explainability (results are directly retrieved)
Example Prompt	"Summarize all recent changes in our HR policy."	"Show me documents related to recent HR policy updates."

Important update as of 2 October 2024:

The cohere.command, cohere.command-light, and meta.llama-2-70b-chat models (and a few more) have been retired over the last few months. Any application that still uses these models will fail with an error such as “Entity with key <model-name> not found” or “Error 400 <model-name> does not support TextGeneration”. Please see the list of models that are already retired or due to retire soon.

Jul 27, 2025

Vector Search and Vector Index - working steps (demo)

-- Setup Oracle 23ai

For the database setup, see the post "Oracle Database 23ai Free version setup on Linux - rpm based (simplified)".

A short introduction to vector databases and vector search:

- Vector Database

A vector database is a specialized type of database designed to store, index, and query data represented as high-dimensional vectors, also known as vector embeddings. These embeddings are numerical representations of various data types, such as text, images, audio, or other complex objects, generated by machine learning models such as embedding models and large language models (LLMs).

- Vector Search:

Vector search is a method of information retrieval that leverages these vector embeddings to find data points that are semantically similar to a given query. Instead of relying on exact keyword matches, vector search calculates the "distance" or "similarity" between the query vector and the vectors stored in the database. This allows for more intelligent and context-aware searches, enabling use cases like:

1) Semantic Search
2) Image and Audio Recognition
3) Retrieval-Augmented Generation (RAG)
4) Recommendation Systems

Vector databases and vector search are crucial components in modern AI applications, facilitating efficient and effective handling of unstructured and semi-structured data for various machine learning tasks.
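
To make "distance" concrete, here is a small Python sketch of two common metrics, cosine distance and Euclidean distance. The three-dimensional vectors are made up for illustration; real embeddings have hundreds of dimensions.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity: 0 means same direction, larger means less similar."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def euclidean_distance(u, v):
    """Straight-line distance between the two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Made-up "embeddings" (assumption: values chosen only for illustration).
query = [1.0, 0.0, 1.0]
doc_a = [2.0, 0.0, 2.0]   # points in the same direction as the query
doc_b = [0.0, 1.0, 0.0]   # orthogonal to the query

d_a = cosine_distance(query, doc_a)
d_b = cosine_distance(query, doc_b)
print(d_a, d_b, euclidean_distance(query, doc_a))
```

Cosine distance ignores vector length, so doc_a is treated as a perfect match even though it is twice as long as the query, while Euclidean distance still reports a gap between them.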

One of the most important uses of vector search in AI is Retrieval-Augmented Generation (RAG). RAG is a technique that enhances the performance of large language models (LLMs) by combining them with an external knowledge base. Instead of relying solely on their pre-trained knowledge, RAG systems retrieve relevant information from external sources and incorporate it into the generation process, resulting in more accurate, informed, and up-to-date responses.

Note: Create your own file system and directories to test the scenarios.

Let us start the POC: load an embedding model, generate vectors, and test vector search.

Step#01 - create a directory to hold the model, download it, and unzip the model into that directory.

$ mkdir -p /u04/23ai_activity/models
$ cd /u04/23ai_activity/models
$ wget https://adwc4pm.objectstorage.us-ashburn-1.oci.customer-oci.com/p/VBRD9P8ZFWkKvnfhrWxkpPe8K03-JIoM5h_8EJyJcpE80c108fuUjg7R5L5O7mMZ/n/adwc4pm/b/OML-Resources/o/all_MiniLM_L12_v2_augmented.zip

$ unzip -oq all_MiniLM_L12_v2_augmented.zip

Step#02 - Connect to the database, create a test user, create a directory object pointing to the physical directory, and grant access to it for our test user.

SQL> alter session set container=FREEPDB1;
Session altered.

- Create a user to test the scenarios (use a strong password for the user)

create user if not exists testuser1 identified by testuser1 quota unlimited on users;
grant create session, db_developer_role, create mining model to testuser1;
create or replace directory model_dir as '/u04/23ai_activity/models';
grant read, write on directory model_dir to testuser1;

Step#03 - Now load the model into the database using the DBMS_VECTOR package.

sqlplus /nolog
conn testuser1/testuser1@//localhost:1521/freepdb1

SQL>
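begin
  -- Drop any previously loaded copy of the model so the load can be re-run.
  dbms_vector.drop_onnx_model (
    model_name => 'ALL_MINILM_L12_V2',
    force      => true);
  -- Load the ONNX file from the MODEL_DIR directory object into the database.
  dbms_vector.load_onnx_model (
    directory  => 'model_dir',
    file_name  => 'all_MiniLM_L12_v2.onnx',
    model_name => 'ALL_MINILM_L12_V2');
end;
/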
PL/SQL procedure successfully completed.

- Now see the model information in the USER_MINING_MODELS view.

column model_name format a30
column algorithm format a10
column mining_function format a15
select model_name, algorithm, mining_function
from user_mining_models
where model_name = 'ALL_MINILM_L12_V2';
MODEL_NAME                     ALGORITHM  MINING_FUNCTION
------------------------------ ---------- ---------------
ALL_MINILM_L12_V2              ONNX       EMBEDDING

Step#04 - Generating Vectors (VECTOR Data Type)

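Generate vectors using the VECTOR_EMBEDDING function. In the following example we generate a vector for the text "Quick test". As you can see, the resulting vector is large compared with the text: an embedding model returns a fixed number of dimensions (384 for all-MiniLM-L12-v2) no matter how short the input is. The output below is truncated.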

set lines 160
set long 1000000
select vector_embedding(all_minilm_l12_v2 using 'Quick test' as data) AS my_vector;
MY_VECTOR
----------------------------------------------------------------------------------------------------------------------------------------------------------------
[-3.86444256E-002,7.27762803E-002,-6.99377712E-003,-7.29618035E-003,8.81512091E-003,-6.36086613E-002,4.39666817E-003,-4.20215651E-002,-1.32307202E-001,-5.837616
05E-003,-1.3236966E-002,-1.62914731E-002,6.54898351E-003,-4.983522E-002,-1.98450536E-002,-4.69920225E-002,1.03937663E-001,-8.96753445E-002,-2.77861813E-003,4.13
947664E-002,-6.51626661E-002,-1.09901905E-001,-8.73053819E-003,2.533352E-002,-1.42030632E-002,-2.42071245E-002,1.91591978E-002,4.93748812E-003,6.30869251E-003,-
1.24127813E-001,-7.17297941E-003,3.7317384E-002,4.97635901E-002,4.52162437E-002,1.49683114E-002,-2.21795831E-002,-3.67936082E-002,-6.20233943E-004,7.16803819E-0
02,5.33913262E-003,1.92087106E-002,-9.91346017E-002,3.90679464E-002,2.22725421E-002,5.04363105E-002,1.81943253E-002,5.34031466E-002,1.44161871E-002,-1.99907795E
-002,-1.20323608E-002,-2.63888389E-002,-4.14667316E-002,6.2473774E-002,-4.68838662E-002,1.16748568E-002,-2.43180972E-002,-3.11982706E-002,-7.5750039E-003,2.2546
6359E-002,-4.17359956E-002,1.23237111E-002,4.31706831E-002,-7.83750787E-002,1.24918511E-002,5.42060807E-002,4.33742851E-002,2.52278382E-003,-1.15482137E-002,-9.
98658361E-004,-2.12613102E-002,1.00960173E-002,3.17986645E-002,-1.13146752E-002,-1.26893371E-002,2.66182758E-002,-7.50683714E-003,-3.70341949E-002,1.94851588E-0
02,-2.9213747E-002,-2.61210538E-002,2.86212545E-002,-9.15900841E-002,1.50552345E-002,-4.98168021E-002,2.29324233E-002,7.82517716E-003,4.22972552E-002,3.37974802
E-002,-4.2345725E-002,-6.32970557E-002,3.84949856E-002,-1.93851739E-002,1.96233811E-003,-3.91593436E-004,7.80334743E-003,5.63595518E-002,4.45814878E-002,-4.9701
6348E-002,1.36384079E-002,2.76547611E-001,6.3580215E-002,-1.69337653E-002,-3.25948671E-002,2.74621621E-002,-1.84809547E-002,-3.58916223E-002,3.18280957E-003,-3.

===========
EXAMPLE-2
===========
Step#1 - Download the CSV file below from Hugging Face into the directory that MODEL_DIR points to.

$ cd /u04/23ai_activity/models
$ wget "https://huggingface.co/datasets/ygorgeurts/movie-quotes/resolve/main/movie_quotes.csv?download=true" -O movie_quotes.csv

Step#2 - Connect to the database and create a new table from the CSV file of movie quotes.

sqlplus /nolog
conn testuser1/testuser1@//localhost:1521/freepdb1
drop table if exists movie_quotes purge;
-- Build the table from the CSV through an inline external table; SKIP 1 skips the header row.
create table movie_quotes as
select movie_quote, movie, movie_type, movie_year
from external (
  (
    movie_quote varchar2(400),
    movie       varchar2(200),
    movie_type  varchar2(50),
    movie_year  number(4)
  )
  type oracle_loader
  default directory model_dir
  access parameters (
    records delimited by newline
    skip 1
    badfile model_dir
    logfile model_dir:'movie_quotes_ext_tab_%a_%p.log'
    discardfile model_dir
    fields csv with embedded terminated by ',' optionally enclosed by '"'
    missing field values are null
    (
      movie_quote char(400),
      movie,
      movie_type,
      movie_year
    )
  )
  location ('movie_quotes.csv')
  reject limit unlimited
);


Table created.
SQL>

SQL> desc movie_quotes
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 MOVIE_QUOTE                                        VARCHAR2(400)
 MOVIE                                              VARCHAR2(200)
 MOVIE_TYPE                                         VARCHAR2(50)
 MOVIE_YEAR                                         NUMBER(4)
SQL>

Step#3 - Add a new column to hold the vector data for each movie quote, using the new VECTOR data type.

SQL> alter table movie_quotes add (
movie_quote_vector vector
);
Table altered.
SQL> desc movie_quotes
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 MOVIE_QUOTE                                        VARCHAR2(400)
 MOVIE                                              VARCHAR2(200)
 MOVIE_TYPE                                         VARCHAR2(50)
 MOVIE_YEAR                                         NUMBER(4)
 MOVIE_QUOTE_VECTOR                                 VECTOR(*, *)
SQL>

Step#4 - Populate the new column by generating vectors from the movie quotes.

SQL> update movie_quotes
set movie_quote_vector = vector_embedding(ALL_MINILM_L12_V2 using movie_quote as data);
732 rows updated.
SQL> commit;
Commit complete.

Step#5 - Vector Search using VECTOR_DISTANCE

Now perform a search using the VECTOR_DISTANCE function.
It accepts two vectors and, as the name suggests, returns the distance between them.
Because the model embeds text by meaning, we expect the distance to be smaller
for two texts that are semantically similar.
In the following examples we create a vector from our search text and order the output
of the query by the distance between that vector and each quote's vector.
No metric is passed, so VECTOR_DISTANCE uses its default, cosine distance.
First we ask for "Films with motivational speaking in them".

SQL> variable search_text varchar2(100);
SQL> exec :search_text := 'Films with motivational speaking in them';
PL/SQL procedure successfully completed.
SQL>
set pages 100
set linesize 200
column movie format a50
column movie_quote format a100
SELECT vector_distance(movie_quote_vector, (vector_embedding(all_minilm_l12_v2 using :search_text as data))) as distance,
movie,
movie_quote
FROM movie_quotes
order by 1
fetch approximate first 5 rows only;

Output:

DISTANCE   MOVIE                          MOVIE_QUOTE
---------- ------------------------------ ------------------------------------------------------------
6.786E-001 Once Upon a Time in Hollywood  That was the best acting i've ever seen in my whole life.
6.979E-001 Dead Poets Society             You must strive to find your own voice because the longer you wait to begin, the less likely you are going to find it at all.
7.169E-001 The Pursuit of Happyness       Walk that walk and go forward all the time. Don't just talk that talk, walk it and go forward. Also, the walk didn't have to be long strides; baby steps counted too. Go forward.
7.186E-001 Joker                          My mother always tells me to smile and put on a happy face. She told me I had a purpose to bring laughter and joy to the world.
7.234E-001 Blazing Saddles                Men, you are about to embark on a great crusade to stamp out runaway decency in the west. Now you men will only be risking your lives, whilst I will be risking an almost certain Academy Award nomination for Best Supporting Actor.

Next we ask for "Films about war".
-------------------------------

variable search_text varchar2(100);
exec :search_text := 'Films about war';
set linesize 200
column movie format a50
column movie_quote format a100
SELECT vector_distance(movie_quote_vector, (vector_embedding(all_minilm_l12_v2 using :search_text as data))) as distance,
movie,
movie_quote
FROM movie_quotes
order by 1
fetch approximate first 5 rows only;

DISTANCE   MOVIE                          MOVIE_QUOTE
---------- ------------------------------ ------------------------------------------------------------
5.682E-001 Dr. Strangelove                Gentlemen, you can't fight in here! This is the War Room!
6.346E-001 Blazing Saddles                Men, you are about to embark on a great crusade to stamp out runaway decency in the west. Now you men will only be risking your lives, whilst I will be risking an almost certain Academy Award nomination for Best Supporting Actor.
6.587E-001 Fury                           Ideals are peaceful; history is violent.
7.243E-001 The Kill Team                  You give me your loyalty, and I'll guarantee that each and every one of you will have a chance to be a warrior, to actually be a part of history.
7.253E-001 Dr. No                         Bond. James Bond
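
Conceptually, without a vector index, a query like the two above compares the search vector with every row, sorts by distance, and keeps the first five. The Python sketch below mirrors that "order by distance, fetch first k" pattern. The labels and two-dimensional vectors are hypothetical, and this is not how the database implements the scan internally.

```python
import heapq
import math

def cosine_distance(u, v):
    """1 - cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def top_k(query_vec, rows, k):
    """rows is a list of (label, vector); return the k nearest as (distance, label)."""
    scored = ((cosine_distance(query_vec, vec), label) for label, vec in rows)
    return heapq.nsmallest(k, scored)   # like ORDER BY distance FETCH FIRST k ROWS

# Hypothetical rows (assumption: made-up labels and vectors).
rows = [
    ("quote A", [1.0, 0.0]),
    ("quote B", [0.0, 1.0]),
    ("quote C", [1.0, 1.0]),
]
results = top_k([1.0, 0.1], rows, k=2)
for distance, label in results:
    print(f"{distance:.3f} {label}")
```

Every row is visited once, so the cost grows linearly with the table size. That is fine for 732 quotes and is the reason vector indexes exist for larger tables.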

Step#6 - (Optional) Create a Vector Index

If we want to create a vector index, we must set the VECTOR_MEMORY_SIZE parameter for the root container. The amount of memory we need depends on the size and complexity of the data being indexed.

conn / as sysdba
SQL> show parameter vector_memory_size;
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
vector_memory_size big integer 0

alter system set vector_memory_size = 1G scope=spfile;
shutdown immediate;
startup;

-- There are two types of vector indexes:
-- 1) Neighbor Partition Vector Index (IVF)
-- 2) In-Memory Neighbor Graph Vector Index (HNSW)

conn testuser1/testuser1@//localhost:1521/freepdb1
drop index if exists movie_quotes_vector_idx;

-- Neighbor Partition Vector Index

create vector index movie_quotes_vector_idx on movie_quotes(movie_quote_vector) organization neighbor partitions
distance cosine
with target accuracy 95;

drop index if exists movie_quotes_vector_idx;

-- In-Memory Neighbor Graph Vector Index

create vector index movie_quotes_vector_idx on movie_quotes(movie_quote_vector) organization inmemory neighbor graph
distance cosine
with target accuracy 95;
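
To give a feel for what a neighbor partition index does, here is a toy Python sketch of the general idea: group vectors around centroids, then search only the partitions closest to the query. It is an illustration under assumptions, not Oracle's implementation. The centroids are fixed by hand (a real index learns them by clustering) and the `probes` parameter is a hypothetical name for "how many partitions to scan".

```python
import math

def dist(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def build_partitions(vectors, centroids):
    """Assign each vector to the partition of its nearest centroid."""
    partitions = {i: [] for i in range(len(centroids))}
    for vec in vectors:
        nearest = min(range(len(centroids)), key=lambda i: dist(vec, centroids[i]))
        partitions[nearest].append(vec)
    return partitions

def search(query, centroids, partitions, k, probes=1):
    """Approximate search: scan only the `probes` partitions closest to the query."""
    order = sorted(range(len(centroids)), key=lambda i: dist(query, centroids[i]))
    candidates = [v for i in order[:probes] for v in partitions[i]]
    return sorted(candidates, key=lambda v: dist(query, v))[:k]

# Hand-picked centroids and made-up data (assumption, for illustration only).
centroids = [[0.0, 0.0], [10.0, 10.0]]
vectors = [[0.0, 1.0], [1.0, 0.0], [4.0, 4.0], [9.0, 10.0], [10.0, 10.0]]
partitions = build_partitions(vectors, centroids)

query = [5.5, 5.5]
approx_hit = search(query, centroids, partitions, k=1, probes=1)
better_hit = search(query, centroids, partitions, k=1, probes=2)
print(approx_hit, better_hit)
```

With one probe the search misses the true nearest neighbor [4.0, 4.0], because that vector lives in the other partition; with two probes it finds it. This speed-versus-accuracy trade-off is the kind of thing a target accuracy setting is meant to control.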

Note: Vectors are big. This has to be taken into account from both a storage perspective, and a memory perspective, especially when creating vector indexes.
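
A rough back-of-the-envelope estimate shows why. The Python sketch below assumes 384 dimensions per vector (the output size of all-MiniLM-L12-v2) stored as 4-byte FLOAT32 values, and ignores any per-row or index overhead, so treat the numbers as a lower bound rather than the real on-disk size.

```python
def vector_bytes(dimensions, bytes_per_dimension=4):
    """Raw size of one vector, assuming fixed-width numbers (4 bytes = FLOAT32)."""
    return dimensions * bytes_per_dimension

def table_vector_bytes(rows, dimensions, bytes_per_dimension=4):
    """Raw size of the vector column for the whole table."""
    return rows * vector_bytes(dimensions, bytes_per_dimension)

DIMENSIONS = 384   # assumption: output size of all-MiniLM-L12-v2
ROWS = 732         # rows updated in the movie_quotes example

per_vector = vector_bytes(DIMENSIONS)
total = table_vector_bytes(ROWS, DIMENSIONS)
print(f"{per_vector} bytes per vector, {total / 1024 / 1024:.2f} MiB for {ROWS} rows")
print(f"{table_vector_bytes(10_000_000, DIMENSIONS) / 1024**3:.1f} GiB for 10 million rows")
```

A quote of at most 400 characters ends up with about 1.5 KB of vector data next to it, and the same arithmetic at ten million rows is already in the gigabytes before any index is built.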

-----
-- Search the query again
variable search_text varchar2(100);
exec :search_text := 'Films about war';
set linesize 200
column movie format a50
column movie_quote format a100
SELECT vector_distance(movie_quote_vector, (vector_embedding(all_minilm_l12_v2 using :search_text as data))) as distance,
movie,
movie_quote
FROM movie_quotes
order by 1
fetch approximate first 5 rows only;

With only 732 rows you will see little difference; the benefit of the index shows up when you have a large amount of data.

Follow my blog and share with others.

Jul 24, 2025

Oracle Database 23ai Free version setup on Linux - rpm based (simplified)

- rpms to download

oracle-database-free-23ai-1.0-1.el8.x86_64.rpm

oracle-database-preinstall-23ai-1.0-2.el8.x86_64.rpm

-- run the pre-install rpm

# dnf -y install oracle-database-preinstall-23ai-1.0-2.el8.x86_64.rpm

-- verify

# rpm -qa | grep -i database

oracle-database-preinstall-23ai-1.0-2.el8.x86_64

-- Install database via rpm

# dnf -y install oracle-database-free-23ai-1.0-1.el8.x86_64.rpm

Note: the last lines of the output should look like this:

Installed:

oracle-database-free-23ai-1.0-1.x86_64

Complete!

-- Now verify:

# rpm -qa | grep -i database

oracle-database-free-23ai-1.0-1.x86_64

oracle-database-preinstall-23ai-1.0-2.el8.x86_64

-- Oracle Database 23ai Creation and Configuration

# cd /opt/oracle/product/23ai/dbhomeFree

# /etc/init.d/oracle-free-23ai configure

e.g.,

# /etc/init.d/oracle-free-23ai configure

Specify a password to be used for database accounts. Oracle recommends that the password entered should be at least 8 characters in length, contain at least 1 uppercase character, 1 lower case character and 1 digit [0-9]. Note that the same password will be used for SYS, SYSTEM and PDBADMIN accounts:
Password cannot be null. Enter password:
The password you entered contains invalid characters. Enter password:
Confirm the password:
Configuring Oracle Listener.
Listener configuration succeeded.
Configuring Oracle Database FREE.
Enter SYS user password:
**********
Enter SYSTEM user password:
********
Enter PDBADMIN User Password:
************
Prepare for db operation
7% complete
Copying database files
29% complete
Creating and starting Oracle instance
30% complete
33% complete
36% complete
39% complete
43% complete
Completing Database Creation
47% complete
49% complete
50% complete
Creating Pluggable Databases
54% complete
71% complete
Executing Post Configuration Actions
93% complete
Running Custom Scripts
100% complete
Database creation complete. For details check the logfiles at:
/opt/oracle/cfgtoollogs/dbca/FREE.
Database Information:
Global Database Name:FREE
System Identifier(SID):FREE
Look at the log file "/opt/oracle/cfgtoollogs/dbca/FREE/FREE.log" for further details.
Connect to Oracle Database using one of the connect strings:
Pluggable database: ugbu-mum-98/FREEPDB1
Multitenant container database: ugbu-mum-98
#

-- Database status check
# /etc/init.d/oracle-free-23ai status
Status of the Oracle FREE 23ai service:
LISTENER status: RUNNING
FREE Database status: RUNNING
#
-- verify pmon
$ ps -ef | grep pmon
oracle 39598 1 0 07:16 ? 00:00:00 db_pmon_FREE
oracle 44957 44346 0 07:36 pts/0 00:00:00 grep --color=auto pmon

-- Check the database

$ sqlplus / as sysdba
SQL*Plus: Release 23.0.0.0.0 - for Oracle Cloud and Engineered Systems on Thu Oct 24 07:22:07 2024
Version 23.5.0.24.07
Copyright (c) 1982, 2024, Oracle. All rights reserved.
Connected to:
Oracle Database 23ai Free Release 23.0.0.0.0 - Develop, Learn, and Run for Free
Version 23.5.0.24.07
SQL>

Note: The non-container (non-CDB) architecture is no longer supported (it was desupported in 21c), so a 23ai database is always a container database (CDB).

SQL> show con_name
CON_NAME
------------------------------
CDB$ROOT
SQL> select name,open_mode,database_role,log_mode from v$database;
NAME      OPEN_MODE            DATABASE_ROLE    LOG_MODE
--------- -------------------- ---------------- ------------
FREE      READ WRITE           PRIMARY          NOARCHIVELOG
SQL> select BANNER from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 23ai Free Release 23.0.0.0.0 - Develop, Learn, and Run for Free
SQL>
SQL> show pdbs;
    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 FREEPDB1                       READ WRITE NO
SQL> alter session set container=FREEPDB1;
Session altered.
SQL>

-- To connect from a client, an application, or Oracle SQL Developer, add an entry to the tnsnames.ora file, e.g.:

FREEPDB1 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = <your-server-ip/host>)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = FREEPDB1)
    )
  )
