URL: https://dev.to/flux8labs/building-minyut-an-embeddable-rag-chatbot-in-one-script-tag-c1h

Building Minyut: An Embeddable RAG Chatbot in One Script Tag


A client needed their customers to be able to query a 40-page policy document without reading through it.

We built the first version of what became Minyut in a weekend.

It used a basic embedding approach, answered from an OpenAI endpoint, and, in testing, confidently responded to questions that had no answer in the document at all.

It made things up.

Fluently.

Completely wrong.

That was the founding problem.

Every document chatbot we tested made things up. Not because the models were bad, but because they weren't constrained to answer only from the documents.

Minyut is built around a single architectural decision:

Every answer must come from uploaded content—or the chatbot says, "I don't know."

Everything else follows from that constraint.

Today, Minyut processes queries for chatbots embedded on Webflow sites, WordPress installs, Shopify stores, React apps, and plain HTML pages.

Documents are stored in Supabase's Mumbai region, and the widget can be embedded with a single script tag in under ten minutes.

Here's how it's built.

The Problem That Refused to Go Away: Knowledge Isolation

Standard AI chatbots answer from their training data.

For general knowledge, that's exactly what you want.

For support chatbots, legal documents, policy manuals, product specifications, and consultancy websites, an inaccurate answer is a liability issue rather than a minor inconvenience.

A chatbot that invents policy details isn't a support tool.

It's a liability.

The solution is Retrieval-Augmented Generation (RAG).

At query time:

  1. The user's question becomes a vector embedding.
  2. Relevant document chunks are retrieved.
  3. Only those chunks are sent to the language model.
  4. The model answers using the retrieved context.

If the answer isn't present in the uploaded documents, the chatbot says so.

The language model can only answer as well as the passages you retrieve.

Good retrieval is most of the battle.
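The four query-time steps can be sketched roughly as follows. This is a minimal illustration, not Minyut's actual code: `embed`, `search`, and `generate` are hypothetical injected functions standing in for the embedding API, the vector search, and the language model, and the `MIN_SIMILARITY` cutoff is an assumed value used here to decide when to refuse.

```javascript
// Hypothetical sketch of the query-time flow; names are illustrative only.
const FALLBACK = "I don't know.";
const MIN_SIMILARITY = 0.35; // assumed cutoff, would need tuning per corpus

// Build a prompt that contains only the retrieved chunks plus the question.
function buildPrompt(question, chunks) {
  const context = chunks
    .map((chunk, i) => `[${i + 1}] ${chunk.text}`)
    .join('\n\n');
  return [
    'Answer using ONLY the context below.',
    `If the context does not contain the answer, reply exactly: ${FALLBACK}`,
    '',
    'Context:',
    context,
    '',
    `Question: ${question}`,
  ].join('\n');
}

async function answer(question, { embed, search, generate }) {
  const queryVector = await embed(question); // step 1: question -> vector
  const hits = await search(queryVector); // step 2: retrieve chunks
  const relevant = hits.filter((hit) => hit.score >= MIN_SIMILARITY);
  if (relevant.length === 0) return FALLBACK; // nothing to ground an answer on
  const prompt = buildPrompt(question, relevant); // step 3: only those chunks
  return generate(prompt); // step 4: model answers from context
}
```

Refusing before the model is even called, when nothing relevant was retrieved, is one way to keep the "I don't know" behaviour from depending on the prompt alone.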

The Chunking and Embedding Pipeline

Documents arrive as:

  • PDF
  • Markdown
  • Plain text

File limits:

  • Free plan: 5 MB
  • Paid plans: 25 MB

After extraction, documents are chunked.

Attempt 1: Sentence-Level Chunks

Each sentence became its own chunk.

Retrieval was precise but context disappeared.

Example:

Question: What is the refund window?

Retrieved:

Refunds are processed in 7 days.

Technically correct.

Practically useless.

Attempt 2: Full Paragraphs

Context improved.

Retrieval consistency did not.

Short and long paragraphs behaved very differently during similarity search.

Final Approach: Fixed Chunks With Overlap

Current strategy:

  • 600-token chunks
  • 80-token overlap

The overlap means a sentence that crosses a chunk boundary still appears whole in at least one chunk, as long as it is shorter than the 80-token overlap.

For Minyut's document types, answer quality improved significantly.
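A minimal sketch of fixed-size chunking with overlap, using the 600/80 figures above as defaults. It assumes whitespace-separated words as a stand-in for tokens, since the tokenizer is not named here; `chunkTokens` and `chunkText` are hypothetical helper names.

```javascript
// Sketch of fixed-size chunking with overlap.
// Assumption: whitespace-separated words stand in for real model tokens.

function chunkTokens(tokens, size = 600, overlap = 80) {
  if (overlap >= size) throw new RangeError('overlap must be smaller than size');
  const step = size - overlap; // each new chunk starts this far after the last
  const chunks = [];
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + size));
    if (start + size >= tokens.length) break; // this chunk reached the end
  }
  return chunks;
}

function chunkText(text, size = 600, overlap = 80) {
  const tokens = text.split(/\s+/).filter(Boolean);
  return chunkTokens(tokens, size, overlap).map((chunk) => chunk.join(' '));
}
```

With these defaults, a 1,200-token document yields three chunks starting at tokens 0, 520, and 1,040, and the last 80 tokens of each chunk repeat as the first 80 of the next.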

Each chunk is embedded using:

sentence-transformers/all-MiniLM-L6-v2

via the HuggingFace Inference API.

The model gives us:

  • 384-dimensional vectors
  • Fast indexing, since the vectors are small
  • Strong semantic search performance

Vectors are stored in PostgreSQL inside Supabase, using the pgvector extension with an HNSW index.
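Conceptually, the HNSW index is an approximate shortcut for a nearest-neighbour ranking over the stored vectors. The brute-force version of that ranking is sketched below. It assumes cosine similarity as the metric (the distance function used in production is not stated here), and `cosineSimilarity` and `topK` are hypothetical helper names. Real vectors would have 384 dimensions; the logic is the same at any length.

```javascript
// Brute-force nearest-neighbour ranking, the exact computation that an
// HNSW index approximates. Assumption: cosine similarity is the metric.

function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new RangeError('vector lengths differ');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vector has no direction
  return dot / Math.sqrt(normA * normB);
}

// rows: [{ id, text, embedding }]; returns the k most similar rows with scores.
function topK(queryVector, rows, k = 3) {
  return rows
    .map((row) => ({ ...row, score: cosineSimilarity(queryVector, row.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

Scanning every row like this is fine for a handful of documents; an index matters once the chunk count grows.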

The Widget: One Script Tag, No CSS Conflicts

The hard problem wasn't loading a script.

It was ensuring the widget worked everywhere.

Different host websites bring:

  • Different CSS frameworks
  • Different z-index rules
  • Different positioning systems

Our first approach used scoped CSS.

It failed repeatedly.

Examples:

  • WordPress themes overriding positioning
  • Global CSS affecting widget layout
  • Z-index conflicts hiding the chat button

The solution was Shadow DOM.

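The widget creates an encapsulated DOM tree.

Host page selectors cannot match elements inside it.

Widget styles cannot leak out.

Inherited properties such as font-family and color can still cross the boundary from the host element, so they are worth resetting at the root of the widget.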

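// Create a host element and attach it to the page.
const host = document.createElement('div');
document.body.appendChild(host);

// Everything the widget renders goes inside this shadow root.
const shadow = host.attachShadow({
  mode: 'open'
});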

Everything lives inside the shadow root.

Style conflicts effectively disappear.

The widget is delivered as a single async script:

<script async src="https://minyut.com/widget.js"></script>

Advanced users can control behavior through:

window.__minyut__

including:

  • Opening the widget
  • Closing the widget
  • Prefilling messages
  • Listening to events
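One possible shape for such a control object is sketched below. The method names (`open`, `close`, `prefill`, `on`), the event names, and `createWidgetApi` itself are all illustrative assumptions, not Minyut's documented API; only the four capabilities come from the list above.

```javascript
// Hypothetical shape for a widget control object. Method and event names
// are illustrative and are not taken from Minyut's documentation.

function createWidgetApi() {
  const listeners = new Map(); // event name -> Set of handlers
  const state = { isOpen: false, draft: '' };

  function emit(event, payload) {
    for (const handler of listeners.get(event) ?? []) handler(payload);
  }

  return {
    open() {
      if (state.isOpen) return; // already open, no duplicate event
      state.isOpen = true;
      emit('open');
    },
    close() {
      if (!state.isOpen) return;
      state.isOpen = false;
      emit('close');
    },
    prefill(message) {
      state.draft = String(message);
      emit('prefill', state.draft);
    },
    on(event, handler) {
      if (!listeners.has(event)) listeners.set(event, new Set());
      listeners.get(event).add(handler);
      return () => listeners.get(event).delete(handler); // unsubscribe
    },
    getState() {
      return { ...state }; // copy, so callers cannot mutate internals
    },
  };
}

// In a browser, the widget script could expose it roughly like this:
// window.__minyut__ = createWidgetApi();
```

Returning an unsubscribe function from `on` keeps host pages from leaking handlers when their own components unmount.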

Infrastructure

Minyut's stack is intentionally simple.

Supabase

Handles:

  • PostgreSQL
  • Authentication
  • Storage
  • pgvector
  • Edge Functions

Netlify

Hosts:

  • Marketing website
  • Dashboard
  • Widget CDN

Razorpay

Handles subscription billing.

HuggingFace

Provides embedding generation.

Groq

Handles language model inference.

Storage Security

Documents are stored in:

  • Private buckets
  • Account-scoped access
  • No cross-account visibility

BYOK

Bring Your Own Key support allows users to connect:

  • Groq
  • OpenAI

Keys are encrypted using AES-256.
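A minimal sketch of what encrypting a user's API key at rest could look like with Node's built-in crypto module. Only "AES-256" comes from the text above; the GCM mode, the 12-byte random IV, the `iv | tag | ciphertext` storage layout, and the function names are assumptions made for illustration.

```javascript
const nodeCrypto = require('node:crypto');

// Assumptions: AES-256 in GCM mode, a fresh random 96-bit IV per value,
// and a 32-byte master key supplied by the caller (e.g. from a secret store).

function encryptApiKey(plaintext, masterKey) {
  const iv = nodeCrypto.randomBytes(12);
  const cipher = nodeCrypto.createCipheriv('aes-256-gcm', masterKey, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  const tag = cipher.getAuthTag(); // 16-byte authentication tag
  // Store IV and tag alongside the ciphertext so decryption is self-contained.
  return Buffer.concat([iv, tag, ciphertext]).toString('base64');
}

function decryptApiKey(encoded, masterKey) {
  const raw = Buffer.from(encoded, 'base64');
  const iv = raw.subarray(0, 12);
  const tag = raw.subarray(12, 28);
  const ciphertext = raw.subarray(28);
  const decipher = nodeCrypto.createDecipheriv('aes-256-gcm', masterKey, iv);
  decipher.setAuthTag(tag); // final() throws if the data was tampered with
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString('utf8');
}
```

An authenticated mode such as GCM means a corrupted or swapped ciphertext fails loudly at decryption rather than yielding a garbage key.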

Because embeddings and inference both run on hosted APIs, we never need to operate GPU infrastructure ourselves.

At Minyut's current scale, that's exactly the tradeoff we want.

The Pricing Decision: No Token Meters

The most common complaint about chatbot SaaS products isn't quality.

It's billing uncertainty.

Usage-based pricing creates anxiety.

A traffic spike should not become a surprise invoice.

Minyut uses fixed monthly plans.

Users receive notifications at:

  • 80% usage
  • 100% usage

Nothing fails silently.
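The threshold check behind those notifications can be sketched in a few lines. The 80% and 100% figures come from the list above; the function name `dueAlerts` and the idea of tracking which alerts were already sent are assumptions for illustration.

```javascript
// Hypothetical helper: which usage alerts are newly due for an account.
const ALERT_PERCENTS = [80, 100];

function dueAlerts(used, limit, alreadySent = []) {
  if (limit <= 0) throw new RangeError('limit must be positive');
  // Integer comparison avoids floating-point edge cases at exact thresholds.
  return ALERT_PERCENTS.filter(
    (percent) => used * 100 >= limit * percent && !alreadySent.includes(percent)
  );
}
```

On a 500-query plan, the first alert becomes due at query 400 and the second at query 500; passing the already-sent list keeps each alert from repeating on later queries.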

Tinkerers (Free)

  • 1 chatbot
  • 3 documents
  • 100 queries/month

Starter ($5/month)

  • 3 chatbots
  • 10 documents
  • 500 queries/month

Pro ($12/month)

  • 10 chatbots
  • Unlimited documents
  • 2,000 queries/month
  • Custom domains
  • Analytics
  • Priority support

BYOK ($3/month)

Unlimited queries through the user's own OpenAI or Groq key.

We handle:

  • Storage
  • Dashboard
  • Bandwidth
  • Infrastructure

Users pay model providers directly.

Three Things That Would Have Saved Us Time

1. Chunking Matters More Than Models

We spent weeks comparing language models.

The bigger factor was chunk size and overlap.

Fix retrieval before optimizing inference.

2. Shadow DOM Solves Widget CSS Problems

Scoped CSS eventually breaks.

Shadow DOM doesn't.

Once we switched, CSS-related issues effectively disappeared.

3. Design for BYOK Early

The users who want BYOK are often the most engaged.

They build real systems.

Supporting them from the start avoids painful architectural changes later.

Conclusion

Minyut started as an attempt to solve a simple problem:

How do you build a chatbot that answers only from documents and refuses to invent information?

The answer ended up being a combination of:

  • Retrieval-Augmented Generation
  • Careful chunking
  • Semantic search
  • Shadow DOM isolation
  • A simple deployment model

The result is a chatbot that can be embedded on almost any website using a single script tag and answer only from uploaded content.

That's exactly what we set out to build.


Frequently Asked Questions

What file types does Minyut support?

  • PDF
  • Markdown (.md)
  • Plain text (.txt)

Will it answer questions not present in my documents?

No.

Minyut is designed specifically for document-grounded responses.

If the information isn't present in uploaded content, the chatbot says it doesn't know.

Which platforms does the embed support?

The widget has been tested on:

  • WordPress
  • Webflow
  • Shopify
  • Framer
  • React
  • Next.js
  • Plain HTML

Because it uses Shadow DOM isolation, it works reliably across virtually any platform that permits custom JavaScript.