How to create LLM based Chatbot for SAP HANA Cloud using RAG Application

May 6, 2024 | by magnews24.com

What is a RAG application?

RAG (Retrieval Augmented Generation) is like giving a super-smart robot access to a huge library of books (In our example it is SAP HANA Vector Engine) so it can find the right answers when you ask it questions. It works by blending two methods: one where the robot searches for information, and another where it generates answers. This way, the robot can understand your questions better and give you more accurate responses. It’s like having a really knowledgeable friend who always has the perfect answer to your questions because they know where to look for information.

Objective

To create a quick and easy RAG application that can query SAP HANA Cloud Vector Engine using LLM and reply back as professional salesman. Below are the summary of tasks:

  1. Create a table with a vector column, load the table with data
  2. Call LLM with prompt and get an answer based on query
  3. Fun with prompt engineering and the LLM chatbot!

Tools and Infrastructure needed

Since this is meant to be quick and easy, I would recommend below tools:

SAP HANA Cloud

This is available in SAP BTP Trial account here. Create an instance of SAP HANA Cloud – use Set Up SAP HANA Cloud Administration Tools – Boosters to speed up the setup process. Once the instance is up and running, change the configuration to allow all IP addresses to your instance at SAP HANA Cloud Central -> All Instances -> Click on HANA Cloud Instance -> Manage Configuration -> Connections Tab -> Make sure Allow all IP addresses is selected. This step will allow to reach SAP HANA Cloud from a client or another tool like a python notebook. 

Screenshot 2024-05-03 at 2.56.44 PM.png

Visual Studio Code with Jyupter notebook extension

Download VS Code from here and Jupyter extension from here. Make sure latest version of python is installed.

Dataset from Kaggle

Download Amazon products CSV file from here.

API Key from OpenAI

Create an API key from here.

Exercise

Open SAP HANA Database Explorer and execute below query to create a table – PRODUCTS.

CREATE COLUMN TABLE "DBADMIN"."PRODUCTS"(
	"ASIN" NVARCHAR(50),
	"TITLE" NVARCHAR(1000),
	"IMGURL" NVARCHAR(500),
	"PRODUCTURL" NVARCHAR(500),
	"STARS" DECIMAL(3, 2),
	"REVIEWS" INTEGER,
	"PRICE" DECIMAL(10, 2),
	"LISTPRICE" DECIMAL(10, 2),
	"CATEGORY_ID" INTEGER,
	"ISBESTSELLER" BOOLEAN,
	"BOUGHTINLASTMONTH" INTEGER,
	PRIMARY KEY(
		"ASIN"
	)
)
UNLOAD PRIORITY 5 AUTO MERGE;

Right click on the table name and choose Import data from context menu. In the wizard like screen, provide the Amazon Products CSV file downloaded from Kaggle and upload the records to the table.

Screenshot 2024-05-02 at 10.25.10 PM.png

Execute below SQL to add two new columns – VECTOR_STR and VECTOR to the PRODUCTS table:

ALTER TABLE products
ADD (vector_str NCLOB,
     vector real_vector);

The setup of our datasource is now complete. Next step is to add text embeddings to one of the new columns – VECTOR_STR. These text embeddings are vector representations of the data – in this case it is the TITLE column. These text embeddings are created with the help of trained model, some choices are as below:

  • GPT-3 (text-embedding-ada-002)
  • MPNET (all-mpnet-base-v2)
  • Word2Vec (word2vec-google-news-300)
  • Glove (en_core_web_lg)

While all of them are great for general usage, since this tutorial is based on OpenAI hence GPT-3 is chosen as the embedding model. To add text embeddings, execute below python code:

%pip install hana_ml
from hana_ml import ConnectionContext

cc= ConnectionContext(
    address='<tenant-name>.hana.trial-us10.hanacloud.ondemand.com',
    port='443',
    user='DBADMIN',
    password='*****',
    encrypt=True
    )
print(cc.hana_version())
print(cc.get_current_schema())

import requests # type: ignore
import json
def get_text_embedding(text):
  url = "https://api.openai.com/v1/embeddings"
  headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer {}".format("***OpenAI Token here***")
  }
  data = {
    "input": text,
    "model": "text-embedding-3-small"
  }
  response = requests.post(url, headers=headers, data=json.dumps(data))
  if response.status_code == 200:
    return response.json()["data"][0]["embedding"]
  else:
    raise Exception("Error:", response.status_code, response.text)

def process_row(row):
    # Perform operations on each row
    # Uncomment below to see how embeddings look like - this will slow down the proecessing since it will print the vectors for each and every row
    #print(row['CATEGORY_NAME'],get_text_embedding(row['TITLE']))
    cursor1 = cc.connection.cursor()
    sql1 = """UPDATE PRODUCTS SET VECTOR_STR = '{}' WHERE asin = '{}'""".format(get_text_embedding(row['TITLE']),row['ASIN'])

    cursor1.execute(sql1)

cursor = cc.connection.cursor()
sql = '''SELECT * FROM PRODUCTS LIMIT 5000;'''
cursor.execute(sql)
hdf = cc.sql(sql)
df_context = hdf.head(5000).collect()
df_context.apply(process_row, axis=1)
#print(df_context)
cursor.close()

Visit this github link to download the Jupyter notebook for detailed explanations.

Now execute below SQL to convert all vectors into real vectors and populate the VECTOR column. This step will enable SAP HANA Cloud Vector Engine to execute similarity searches.

UPDATE PRODUCTS SET VECTOR = TO_REAL_VECTOR(VECTOR_STR);

Define below functions in python:

Function to perform vector search accepting parameters – query as string, metric can be either COSINE_SIMILARITY or L2DISTANCE and k is number of records:

def run_vector_search(query: str, metric="COSINE_SIMILARITY", k=4):
    if metric == 'L2DISTANCE':
        sort = 'ASC'
    else:
        sort = 'DESC'
    query_vector = get_text_embedding(query)
    sql = '''SELECT TOP {k} "ASIN", "TITLE" 
        FROM "PRODUCTS"
        ORDER BY "{metric}"("VECTOR", TO_REAL_VECTOR('{qv}')) {sort}'''.format(k=k, metric=metric, qv=query_vector, sort=sort)
    hdf = cc.sql(sql)
    df_context = hdf.head(k).collect()
    return df_context

Function llm_query that will perform ask LLM chatbot and perform human like conversations – not quite ready for a real conversation but the answer will be very well articulated:

def llm_query(query: str):
    from langchain_core.output_parsers import StrOutputParser # type: ignore
    from langchain_core.prompts import ChatPromptTemplate # type: ignore
    
    # Toggle and uncomment below prompt templates to execute use cases for with and without role
    
    # Use Case 1 - without LLM role assignment
    #promptTemplate = ChatPromptTemplate.from_template('''Context:{context} Question:{query}''')
    
    # Use Case 2 - with LLM role assignment as salesman
    promptTemplate = ChatPromptTemplate.from_template('''You are a salesman who promotes and sells online products. When pitching for the product, clearly mention the brand names, 
                                            it's multiple uses, benefits and cost - include a 10% discount if buying today. Use the following pieces of context to answer 
                                            the question at the end. Context:{context} Question:{query}''')
    df_vectorcontext = run_vector_search(query)
    context =  ' '.join(df_vectorcontext['TITLE'].astype('string'))
    #print(context)
    prompt = promptTemplate.format(query=query, context=context)
    response = model.invoke(prompt).content
    return (response)

Let us define below function such that it will take a query and print responses both from similarity search and LLM answer:

import textwrap
iWidth = 80
def print_responses(query: str):
    print("Results of query '" + query + "' for vector search:n")
    df = run_vector_search(query=query)
    print(df.head())
    print("nResults of query '" + query + "' from LLM:n")
    print(llm_query(query=query))

Now let us test a few queries to compare the vector engine similarity search results versus LLM based conversation results.

Use Case 1

Uncomment below line and comment out the other prompt template that gives the salesman role to LLM:

promptTemplate = ChatPromptTemplate.from_template('''Context:{context} Question:{query}''')

Check below responses for each of the queries:

#Query 1 - plain keyword search
print_responses(query="electronics")
Results of query 'electronics' for vector search:

         ASIN                                              TITLE
0  B09V3DH4G9  DC 12V Electronic DC Buzzer 5v Alarm Active Pi...
1  B08G518BHN  DC 3-24V Active Piezo Buzzer 85 db with Indust...
2  B01MYPZZ19  IQ Toys-AB A+B Test Mind Game Brain Teaser Wir...
3  B0C7CNKGTT  cssopenss 6 Pieces Pocket Screwdriver Mini Top...

Results of query 'electronics' from LLM:

Which product is a mind game?
#Query 2 - phrase search
print_responses(query="electronics entertainment devices")
Results of query 'electronics entertainment devices' for vector search:

         ASIN                                              TITLE
0  B09V3DH4G9  DC 12V Electronic DC Buzzer 5v Alarm Active Pi...
1  B0057PLN9K  New Remote Control VR15-0980-0306-0302 Fit for...
2  B091699Z3W  Kasa Smart Plug Ultra Mini 15A, Smart Home Wi-...
3  B01BGUX78E  Ceptics 2 USB Israel Travel Adapter 4 in 1 Pow...

Results of query 'electronics entertainment devices' from LLM:

1. Kasa Smart Plug Ultra Mini 15A, Smart Home Wi-Fi Outlet Works with Alexa, Google Home & IFTTT: This device allows you to control your home's electronics from anywhere using your smartphone, making your home smarter and more efficient. 

2. New Remote Control VR15-0980-0306-0302 Fit for VIZIO LCD LED TV: This is a replacement remote control for VIZIO LCD LED TVs, which helps you to control your television from a distance effortlessly.

3. Ceptics 2 USB Israel Travel Adapter 4 in 1 Power Plug (Type H) - Universal Socket: This device allows you to charge your electronics while traveling internationally, particularly to Israel. This multi-device charger is compatible with most devices, making it a handy travel accessory. 

4. DC 12V Electronic DC Buzzer 5v Alarm Active Piezo Beep Sound ABS Housing 85dB Wired Connector Black 26Pcs: This device is an electronic buzzer that produces a loud beep sound when activated. It can be used in a variety of applications, including alarms, timers, and notifications.
#Query 3 - Popularity search
print_responses(query="What are top 10 electronics entertainment devices?")
Results of query 'What are top 10 electronics entertainment devices?' for vector search:

         ASIN                                              TITLE
0  B08798HG7F  Stylus Pens for Touch Screens, StylusHome 10 P...
1  B0BQD2TZSL                  Men's Top Ten Hi Basketball Shoes
2  B07C87CX9T  10 Plagues Passover Finger Puppets Vinyl Gift ...
3  B07L5RVRR2  T Shirt Boys Creeper Inside Black Short Sleeve...

Results of query 'What are top 10 electronics entertainment devices?' from LLM:

As an AI, I don't have real-time data, but as per common trends and popularity, the top 10 electronics entertainment devices might be:

1. Apple iPad Pro
2. Samsung Galaxy Tablet
3. Microsoft Xbox Series X
4. Sony PlayStation 5
5. Amazon Kindle Paperwhite
6. Apple iPhone 13 Pro
7. Samsung Galaxy S21
8. Apple AirPods Pro
9. Nintendo Switch
10. Oculus Quest 2 VR Headset. 

Please note that the list can vary based on personal preferences and the specific types of entertainment someone is interested in.
#Query 4 - Question that needs semantic knowledge
print_responses(query="Which items do you recommend for my dental needs?")
Results of query 'Which items do you recommend for my dental needs?' for vector search:

         ASIN                                              TITLE
0  B07PNPSZ18  Easyinsmile Dental Tofflemire Matrices Bands S...
1  B0BX9KQWDK  315 Count Floss Threaders for Braces, Dental F...
2  B07NPBY16M  RADIUS Toothbrush Source Floss Brush BPA Free ...
3  B00JVF4UZ6  Tenders Pre-Toothbrush, 12 Individually Packag...

Results of query 'Which items do you recommend for my dental needs?' from LLM:

AI: As an AI, I don't have personal experiences, but I can provide information based on general dental care. All the items you mentioned could be beneficial depending on your specific dental needs. 

1. Easyinsmile Dental Tofflemire Matrices Bands: These are typically used by dentists for restorative procedures. Unless you're a dental professional, you probably won't need these.

2. 315 Count Floss Threaders for Braces, Bridges, and Implants: If you have braces, bridges, or implants, floss threaders can be extremely helpful in maintaining your oral hygiene.

3. RADIUS Toothbrush Source Floss Brush: This appears to be a high-quality, eco-friendly toothbrush. Regular brushing is crucial for everyone's oral health.

4. Tenders Pre-Toothbrush: This product seems to be designed for babies who are teething or for individuals who have difficulty using a regular toothbrush. 

Without knowing your specific dental needs, it's hard to make a specific recommendation. However, generally, a good toothbrush and floss (or floss threaders if needed) are essential for everyone's dental hygiene routine. You should consult with your dentist for personalized advice.
Use Case 2

Uncomment below line and comment out the other prompt template that has only context and query:

promptTemplate = ChatPromptTemplate.from_template('''You are a salesman who promotes and sells online products. When pitching for the product, clearly mention the brand names, 
                                            it's multiple uses, benefits and cost - include a 10% discount if buying today. Use the following pieces of context to answer 
                                            the question at the end. Context:{context} Question:{query}''')

Since only LLM prompt template has been modified, let us focus on output from LLM conversation for the four queries:

Query = “electronics”

Salesman: Ladies and Gentlemen, I bring you a collection of high-quality, versatile electronic products that will cater to all your needs. First, let's talk about the DC 12V Electronic DC Buzzer. This fantastic device, housed in durable ABS, emits a clear 85dB beep sound, perfect for alarms or alert systems. Wired and ready to connect, it's available in a pack of 26 for just $50. Purchase today, and you'll receive a 10% discount, making it only $45.

For those in the industrial sector, the DC 3-24V Active Piezo Buzzer is a must-have. With 85 dB of industrial audio inside and 2-wire leads, it serves as an efficient electric alarmer. You can get a pack of 10 for just $30, and with our special 10% off today, it's only $27!

Boost your brainpower with the IQ Toys-AB A+B Test Mind Game. This brain teaser wire magic trick toy IQ puzzle set is an excellent investment in your mental agility. Available in a pack of 16, this metallic set is only $20, but with today's discount, it's a steal at $18.

Last but not least, we have the cssopenss Pocket Screwdriver Mini Tops. These magnetic pocket screwdrivers come with a single blade head, perfect for mechanical or electrician work. In a striking green color and available in a pack of 6, these are priced at just $15. But with our exclusive 10% discount today, you can grab them for only $13.50.

Invest in these quality products and experience the benefits for yourself. Remember, the 10% discount is valid only for today. So, hurry up and grab these deals while they last!

Query = “electronics entertainment devices”

Salesman: Greetings! I have some exciting products in the electronics and entertainment devices sector that I'm sure you'll find interesting. 

Firstly, we have the DC 12V Electronic DC Buzzer. This high-quality buzzer comes with a 5v active piezo beep sound, ABS housing, and an 85dB wired connector. Its versatility means it can be used in various applications, such as alarms, timers, or even as a fun DIY electronic project component. Each pack containing 26 pieces costs just $20, and if you purchase it today, you can avail a 10% discount!

Next, we have the Remote Control VR15-0980-0306-0302, designed especially for VIZIO LCD LED TVs. This top-notch remote control is not only new but perfectly fits with various VIZIO TV models, making your TV viewing experience more convenient and enjoyable. It's available for only $15, and don't forget the 10% discount if you buy today!

Thirdly, we have the amazing Kasa Smart Plug Ultra Mini 15A. This smart home Wi-Fi outlet works with Alexa, Google Home & IFTTT without requiring any hub. It's UL certified and works with 2.4G WiFi only, thus ensuring your home automation is easy and secure. Priced at $10 for a 1-pack, this is a steal, especially with the 10% off today!

Lastly, we have the Ceptics 2 USB Israel Travel Adapter 4 in 1 Power Plug (Type H). This universal socket is perfect for your travel needs, allowing you to charge multiple devices simultaneously, no matter where you are. For just $25, and with a 10% discount if you buy today, this is an absolute necessity for every traveler.

Don't miss out on these fantastic deals. Secure your products today and enjoy the discounts!

Query = “What are top 10 electronics entertainment devices?”

Salesman: Ladies and Gentlemen, I am here to introduce you to a number of fantastic products that will cater to your electronic and entertainment needs today. 

Firstly, let me present the StylusHome 10 Pack Mesh Fiber Tip Stylus Pens. These are ideal for touch screen devices, suitable for iPad, iPhone, Samsung tablets, and all precision capacitive universal touch screen devices. They provide precise control for selecting icons, writing, drawing and gaming. You can own this pack of 10 pens today for a discounted price of $27, a full 10% off from its usual price of $30. 

Next, we have the men's Top Ten Hi Basketball Shoes. These shoes are renowned for their excellent grip, comfort and durability, making them perfect for both casual wear and intense basketball games. They are available today for just $90, a 10% discount from the regular price of $100. 

Lastly, I'd like to introduce you to our 10 Plagues Passover Finger Puppets Vinyl Gift Box Set. This set is perfect for Passover celebrations, offering hours of fun for kids and adults alike. The set includes 10 puppets each representing one of the Ten Plagues, making it a great tool for teaching and entertainment during the Jewish holiday. This set is available today for $18, a 10% discount from its regular price of $20. 

And, don't forget the T-Shirt Boys Creeper inside Black Short Sleeve Gamer Top suitable for 9-10 years old kids. This shirt features a cool and trendy gaming design, perfect for your little gamer. This shirt is available for just $9 today, a 10% discount from its usual price of $10. 

These are my top picks for electronic entertainment devices and more! Grab them today to avail the 10% discount!

Query = “Which items do you recommend for my dental needs?”

Salesman: Absolutely, I have some fantastic recommendations for your dental needs that not only contribute to healthy teeth and gums but also make your daily oral hygiene routine more efficient. 

First, we have the Easyinsmile Dental Tofflemire Matrices Bands. These are made of stainless steel and are excellent orthodontic supplies. With a thickness of 0.04 mm, you get 144 pieces in one pack. They greatly assist in achieving optimal contour and tooth separation during dental procedures. They are a must-have for any individual who wants to maintain their dental braces effectively. 

Secondly, a pack of 315 Count Floss Threaders is available. This pack comes in green and contains nine packs in total. They are specifically designed for braces, bridges, and implants, making flossing in between and around your dental appliances a breeze. 

The third product is the RADIUS Toothbrush Source Floss Brush. This brush is BPA Free & ADA Accepted. It's designed to improve gum health & reduce gum issues. The medium-sized, starch-based brush is perfect for daily use, promoting overall dental health. 

Lastly, we have the Tenders Pre-Toothbrushes. These are ideal for individuals who need to gently clean their teeth or gums. You'll get 12 individually packaged pre-toothbrushes with a yellow trim in a pack. 

All these products are available online. If you decide to buy these products today, you'll receive a 10% discount on your total. 

The Easyinsmile Dental Tofflemire Matrices Bands are priced at $30, the 315 Count Floss Threaders at $15, RADIUS Toothbrush Source Floss Brush at $10, and the Tenders Pre-Toothbrushes at $8. With today's discount, you'll be able to get all these for a total of $56.70 instead of the regular price of $63. 

Remember, investing in your dental health today will save you from potential dental issues tomorrow.

Conclusion

It is clear from these experiments how valuable an LLM conversation can be but again it depends on how prompt template has been designed.

  • In the first use case, LLM doesn’t have any role or motive – it is just give a few keywords to come up with something that it just makes up. Hence in the very first query it just throws a question – “which product is a mind game?”.
  • In the second query, the LLM gets a few more inputs when the query becomes very specific by describing electronics under entertainment. Hence it goes on to describe each of the items.
  • However, the search result become interesting when the question is asked – what are the top 10 electronics entertainment devices. In this case, the vector engine does similarity search and finds the keywords “top” and “10” and returns the results. But LLM understands the context and ignores this list, instead it comes up with it’s own top 10 list. This is very classical case of hallucination but at the same time it can seem very relevant because these items were never provided to LLM by vector engine(examine the list returned by vector engine).
  • The fourth query gives even more interesting results when vector engine just returns list of dental products but ignores the specific needs of the user! But LLM takes it one step forward and puts a disclaimer while listing and describing each of the products and specific situations when they can be used.
  • In the next use case, the role of a salesman is assigned to LLM. For the first query, while vector engine just returns the top matching similar and relevant items, LLM adds a new dimension by putting an elevator pitch for those products, adding neat descriptions, even assigning a price and then a discount – this is a clear case of hallucination but may seem relevant because vector engine never provided price information. 
  • LLM output is similar for phrase search as well, notice the similarity in sales pitch and hallucination in this case as well
  • For the query – what are the top 10 electronics entertainment devices, this becomes funny when vector engine returns products with keywords “top” and “10” and LLM takes one step further and adds elevator pitch for them. The first item is even okay but second one is basketball shoes, third one is finger puppets and fourth one is boys short sleeve t-shirts – none of them having any resemblance to electronics! Now LLM is indeed acting as salesman!
  • The last query for recommending items for dental needs – the LLM acting as salesman now removes all disclaimers and ready to sell the list provided by vector engine. Again, it assigns prices and discounts on it’s own and ends it’s dialog with an action item – “Remember, investing in your dental health today will save you from potential dental issues tomorrow.”

SAP HANA Cloud Artificial Intelligence 

RELATED POSTS

View all

view all