Create Your Own LLM for Free: A Complete Guide to Building Your Own LLM Chatbot from Scratch

Iqbal Fakhriza

Mar 14


In this tutorial, we'll walk you through the entire process of building a powerful LLM chatbot, starting from scratch—without any cost involved! By the end of this guide, you'll have your own custom chatbot powered by a large language model that you can deploy and interact with.

Prerequisites

Before we dive into creating the LLM chatbot, let’s make sure we have everything we need. The following tools and libraries will help us build our model:

  • Google Colab: The primary environment for running and experimenting with your code.
  • Firecrawl: A Python library for scraping text data from the web, which we'll feed to the model as context.
  • Groq: A platform with a fast inference API for open LLMs, which we'll use to run the model powering our chatbot.


Step 1: Setting Up Your Google Colab Environment
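
First, open a new notebook at colab.research.google.com; everything in this guide runs in ordinary Colab cells, so there's nothing to set up locally. Start by installing the libraries we'll use. The package names below are the standard PyPI names for each tool (firecrawl-py is Firecrawl's Python SDK):

# install the SDKs used in this tutorial (run this in a Colab cell)
!pip install firecrawl-py groq gradio

With the dependencies in place, we can move on to collecting data.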



Step 2: Collecting Data with Firecrawl

In this step, we will use Firecrawl, a powerful tool that lets us scrape content from the web. For this example, we'll scrape articles from Tech in Asia; you can swap in a site of your own later.

To use Firecrawl, you'll need a Firecrawl API key to authenticate your requests. Once you have the key, we can set up the scraper, define the target URL, and specify the data formats we need (in this case, markdown and HTML).

Go to Firecrawl and sign in to your account. Copy the API key shown on your dashboard, paste it into the input that appears after running the cell below, and press Enter.

import getpass

FIRECRAWL_API_KEY = getpass.getpass("Firecrawl API Key: ")

In this part, we initialize the Firecrawl app and scrape the page in the formats we want. We'll request both markdown and HTML, then use the markdown output as our context.

from firecrawl import FirecrawlApp

# initialize Firecrawl with the API key entered above
app = FirecrawlApp(api_key=FIRECRAWL_API_KEY)

# scrape Tech in Asia, requesting both markdown and HTML
tech_in_asia = app.scrape_url(
    'https://www.techinasia.com/',
    params={'formats': ['markdown', 'html']})

# keep the markdown version for our chatbot's context
markdown = tech_in_asia["markdown"]
print(markdown)

The variable "context" below is set up to structure the scraped data in a way that can be used to train our language model later on. We wrap the scraped content in tags for clarity.

context = f"""The following are Tech in Asia \n\n<tech_in_asia>{markdown}</tech_in_asia>\n"""


Step 3: Optimizing with Groq

Now that we have our scraped data, it's time to generate answers with a large language model. Groq is a platform that serves open models (such as Llama) through a fast inference API, and in this step we'll use it to power our chatbot.

You'll need to provide your own Groq API key for authentication. Log in to your account at Groq, go to the API Keys menu, create a key, then copy it and paste it into the prompt after running the cell below. Don't forget to press Enter :))

import getpass

from groq import Groq

GROQ_API_KEY = getpass.getpass("Groq API Key: ")

# create a Groq client authenticated with your key
client = Groq(api_key=GROQ_API_KEY)
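
As a quick sanity check, you can list the models your key can access. This assumes your version of the groq SDK exposes the OpenAI-style models endpoint (client.models.list()); if it doesn't, you can safely skip this cell.

# optional: list the models available to your API key
models = client.models.list()
for m in models.data:
    print(m.id)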

We define a system message that tells the model how to behave. In this case, our assistant will focus on answering questions about the scraped articles, and its answers will be short and straight to the point.

system_message = f"""

You're a helpful intelligence assistant.

You're going to look at tech-related content such as tools, programming languages, courses, and tech prices.

Your answer is always extremely short and straight to the point.

"""

def get_response(query: str, context: str):
    # send the system prompt, the scraped context, and the user's question
    completion = client.chat.completions.create(
        model="llama-3.2-11b-vision-preview",
        messages=[
            {"role": "system", "content": system_message},
            {"role": "user", "content": context},
            {"role": "user", "content": query},
        ],
        temperature=0,    # deterministic output
        max_tokens=1470,  # cap the length of the response
        top_p=1,
        stream=False,
        stop=None,
    )
    return completion.choices[0].message.content

Let's test it with a simple question. I limited the scraped text included in the context to 3,000 characters to avoid rate-limit issues.

short_context = context[:3000]

answer = get_response(
    query="what is the average price of graphic card?",
    context=short_context,
)
print(answer)
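
For a more chat-like feel, you may want the answer to appear token by token rather than all at once. The sketch below is a streaming variant of get_response, assuming the same client and system_message defined above; the Groq client follows the OpenAI-style streaming interface, where each chunk carries a small text delta.

def get_response_stream(query: str, context: str):
    # same request as get_response, but with stream=True
    stream = client.chat.completions.create(
        model="llama-3.2-11b-vision-preview",
        messages=[
            {"role": "system", "content": system_message},
            {"role": "user", "content": context},
            {"role": "user", "content": query},
        ],
        temperature=0,
        max_tokens=1470,
        stream=True,
    )
    for chunk in stream:
        # the delta can be None on the final chunk, so guard before printing
        piece = chunk.choices[0].delta.content
        if piece:
            print(piece, end="", flush=True)

get_response_stream("summarize the headlines", short_context)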


Final part: Building a Chat Interface with Gradio

In this step, we will use Gradio to create a simple, interactive web interface for our LLM chatbot. Gradio makes it easy to deploy machine learning models and allows users to interact with them through a browser.

import gradio as gr

# wrapper function for Gradio
def chat_interface(query, context=""):
    try:
        response = get_response(query, context)
        return response
    except Exception as e:
        return f"Error: {str(e)}"

# the Gradio interface
iface = gr.Interface(
    fn=chat_interface,
    inputs=[
        gr.Textbox(label="Your Query", placeholder="Type your query here..."),
        gr.Textbox(label="Context (Optional)", placeholder="Provide context here (optional)..."),
    ],
    outputs=gr.Textbox(label="Assistant Response"),
    title="LLM Chat Interface",
    description="Chat with the LLM using your query and optional context.",
)

# in Google Colab, use share=True to get a public link
iface.launch(pwa=True, share=True)
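
If you'd rather have a multi-turn chat window than a single query box, Gradio's ChatInterface is a drop-in option. The sketch below is a minimal variant, assuming get_response and short_context from the earlier steps; it ignores the conversation history for simplicity and always answers against the scraped context.

# a minimal multi-turn chat UI; history is ignored for simplicity
def chat_fn(message, history):
    return get_response(query=message, context=short_context)

demo = gr.ChatInterface(fn=chat_fn, title="Tech in Asia Assistant")
demo.launch(share=True)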

You’ve now explored the key concepts, but there’s more under the hood! 🚀 To see the full version of the code—plus some extra optimizations and hidden gems—check it out on Google Colab at this link: here. Dive in, experiment, and let AI do the heavy lifting!