Iqbal Fakhriza
Mar 14
In this tutorial, we'll walk you through the entire process of building a powerful LLM chatbot, starting from scratch—without any cost involved! By the end of this guide, you'll have your own custom chatbot powered by a large language model that you can deploy and interact with.
Before we dive into creating the LLM chatbot, let's make sure we have everything we need. We'll use the following tools and libraries: Firecrawl for scraping web content, Groq for fast LLM inference, and Gradio for building the web interface.
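If you're running this in Google Colab or a fresh environment, install the libraries first. A minimal setup cell, assuming the standard PyPI package names (firecrawl-py, groq, gradio):

```python
# In a Colab/Jupyter cell, the leading "!" runs a shell command.
!pip install firecrawl-py groq gradio
```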
In this step, we will use Firecrawl, a powerful tool that allows us to scrape content from the web. For this example, we'll scrape articles from Tech in Asia; you can swap in any site that fits your own needs later.
To use Firecrawl, you'll need an API key from Firecrawl, which allows you to authenticate your requests. Once you have the key, we can set up the scraper, define the target URL, and specify the data format we need (in this case, markdown and HTML).
Go to Firecrawl and sign in to your account. You'll find the API key Firecrawl provides on your dashboard. Run the cell below, paste the key into the input prompt, and press Enter.
import getpass
FIRECRAWL_API_KEY = getpass.getpass("Firecrawl API Key: ")
In this part, we initialize the Firecrawl app and begin scraping, specifying the formats we want the data in. We request both markdown and HTML, and we'll work with the markdown output in this tutorial.
from firecrawl import FirecrawlApp
# initialize firecrawl
app = FirecrawlApp(api_key=FIRECRAWL_API_KEY)
# begin scraping Tech in Asia
tech_in_asia = app.scrape_url(
    'https://www.techinasia.com/',
    params={'formats': ['markdown', 'html']},
)
markdown = tech_in_asia["markdown"]
print(markdown)
The variable "context" below structures the scraped data so it can be passed to our language model as context later on. We wrap the scraped content in tags for clarity.
context = f"""The following are Tech in Asia \n\n<tech_in_asia>{markdown}</tech_in_asia>\n"""
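The tag-wrapping step above can be generalized into a small helper, so that scraping a different site only requires changing the source name and tag. This is just a sketch; `wrap_context` is a hypothetical helper, not part of Firecrawl:

```python
def wrap_context(source_name: str, tag: str, content: str) -> str:
    """Wrap scraped content in XML-style tags so the model can tell
    the reference material apart from the user's question."""
    return f"The following are {source_name} \n\n<{tag}>{content}</{tag}>\n"

# Example with placeholder content; in the tutorial, pass the
# `markdown` string returned by Firecrawl instead.
context = wrap_context("Tech in Asia", "tech_in_asia", "...scraped markdown...")
```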
Now that we have our scraped data, it's time to run our LLM with fast, efficient inference using Groq, a platform designed for accelerating AI models. In this step, we'll use it to power our chatbot.
You'll need to provide your own Groq API key for authentication. Log in to your Groq account, go to the API Keys menu, create a key, and copy it. Run the cell below, paste the key into the input prompt, and press Enter.
from groq import Groq
import getpass
GROQ_API_KEY = getpass.getpass("Groq API Key: ")
client = Groq(api_key=GROQ_API_KEY)
We define a system message that tells the model how to behave. In this case, our assistant will focus on answering questions about the articles, and its answers will be short and straight to the point.
system_message = f"""
You're a helpful intelligence assistant.
You're going to look at tech-related content such as tools, programming languages, courses, and tech prices.
Your answer is always extremely short and straight to the point.
"""
def get_response(query: str, context: str):
    completion = client.chat.completions.create(
        model="llama-3.2-11b-vision-preview",
        messages=[
            {"role": "system", "content": system_message},
            {"role": "user", "content": context},
            {"role": "user", "content": query},
        ],
        temperature=0,
        max_tokens=1470,
        top_p=1,
        stream=False,
        stop=None,
    )
    return completion.choices[0].message.content
Let's test it with a simple question. I limited the amount of scraped text included in the context to 3,000 characters to avoid rate-limit issues.
short_context = context[:3000]
answer = get_response(
    query="what is the average price of graphic card?",
    context=short_context,
)
print(answer)
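If you still hit rate limits, a common pattern is to retry with exponential backoff instead of (or in addition to) truncating the context. A minimal sketch, assuming any exception from the call may be a rate-limit error; in production you'd catch Groq's specific rate-limit exception. `with_retry` is a hypothetical helper, not part of the Groq SDK:

```python
import time

def with_retry(fn, *args, retries=3, base_delay=2.0):
    """Call fn(*args), retrying with exponential backoff on failure."""
    for attempt in range(retries):
        try:
            return fn(*args)
        except Exception:
            if attempt == retries - 1:
                raise  # give up after the last attempt
            time.sleep(base_delay * (2 ** attempt))  # wait 2s, 4s, 8s, ...
```

You'd then call it as `with_retry(get_response, query, short_context)` in place of a direct `get_response` call.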
In this step, we will use Gradio to create a simple, interactive web interface for our LLM chatbot. Gradio makes it easy to deploy machine learning models and allows users to interact with them through a browser.
import gradio as gr

# Gradio handler function
def chat_interface(query, context=""):
    try:
        response = get_response(query, context)
        return response
    except Exception as e:
        return f"Error: {str(e)}"

# Gradio interface
iface = gr.Interface(
    fn=chat_interface,
    inputs=[
        gr.Textbox(label="Your Query", placeholder="Type your query here..."),
        gr.Textbox(label="Context (Optional)", placeholder="Provide context here (optional)..."),
    ],
    outputs=gr.Textbox(label="Assistant Response"),
    title="LLM Chat Interface",
    description="Chat with the LLM using your query and optional context.",
)

# For Google Colab, use share=True
iface.launch(pwa=True, share=True)
You’ve now explored the key concepts, but there’s more under the hood! 🚀 To see the full version of the code—plus some extra optimizations and hidden gems—check it out on Google Colab at this link: here. Dive in, experiment, and let AI do the heavy lifting!