Paste Your URL.
Get a Chatbot Trained on It.
How to train an AI chatbot on your website
You train an AI chatbot on your website by pointing a chatbot builder at your URL. The platform crawls your pages, extracts the readable text, splits it into chunks, generates embeddings, and indexes everything in a vector database. When a user asks a question, the chatbot retrieves the most relevant chunks and passes them to the language model as context for the answer. This pattern is called retrieval-augmented generation (RAG).
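To make the retrieval step concrete, here is a minimal sketch. It assumes a toy word-count "embedding" in place of a real embedding model and a plain Python list in place of a vector database. The function names (`embed`, `cosine`, `retrieve`) and the example paths are illustrative only, not Chatmount's API.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts. A real system calls an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, index, k=2):
    """Return the k (url, text) pairs most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[2]), reverse=True)
    return [(url, text) for url, text, _ in ranked[:k]]


# Hypothetical pages standing in for a crawled site.
pages = {
    "https://your-site.com/pricing": "Pricing plans start with a free tier and scale by page count.",
    "https://your-site.com/docs/install": "Install the widget by adding one script tag to your site.",
    "https://your-site.com/blog/launch": "We launched a new dashboard for analytics this spring.",
}
index = [(url, text, embed(text)) for url, text in pages.items()]

top = retrieve("How do I install the widget script?", index, k=1)
print(top[0][0])
```

The retrieved text is what gets handed to the language model as context. Swapping the toy `embed` for a real model changes the quality of the ranking, not the shape of the flow.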
In Chatmount the flow is: paste URL, click train, done. The crawler discovers pages via your sitemap.xml or by following internal links. JavaScript-rendered content (Next.js, React, Vue) is rendered properly so single-page apps work too. Indexing typically takes 2-5 minutes for sites under 500 pages.
You never set up a vector database, choose an embedding model, or write a chunking algorithm. You paste a URL.
A real website crawler, not a one-page scraper
Six things that separate a real website-trained chatbot from a weekend prototype.
Auto-crawl every page
Paste a single URL. Chatmount discovers and crawls your site recursively, respecting your robots.txt and any crawl-depth limits you set.
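A crawl-depth limit can be pictured as a breadth-first walk that stops expanding links once a page is too many clicks from the start URL. The sketch below assumes an in-memory link map instead of real HTTP requests, and it leaves out the robots.txt check. The `crawl` function and the `site` data are hypothetical.

```python
from collections import deque


def crawl(start, get_links, max_depth):
    """Breadth-first crawl that never follows links from pages at max_depth."""
    seen = {start}
    order = [start]
    queue = deque([(start, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth == max_depth:
            continue  # deep enough: keep this page, do not expand its links
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                order.append(link)
                queue.append((link, depth + 1))
    return order


# Hypothetical site structure: page path -> links found on that page.
site = {
    "/": ["/pricing", "/blog"],
    "/blog": ["/blog/post-1", "/"],
    "/blog/post-1": ["/blog/post-1/comments"],
}

found = crawl("/", lambda url: site.get(url, []), max_depth=2)
print(found)
```

With `max_depth=2`, the comments page is three clicks from the start and is never visited.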
Sitemap-aware indexing
Has a sitemap.xml? Chatmount uses it to find pages. No sitemap? The crawler discovers pages via internal links. Either way, every public page can be indexed.
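Here is a rough sketch of that "sitemap first, links as fallback" logic using only the Python standard library. It works on strings you have already downloaded, so no network access is involved. The function names and sample markup are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text):
    """Collect every <loc> entry from a sitemap.xml document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS) if loc.text]


class LinkCollector(HTMLParser):
    """Gather href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def internal_links(page_url, html):
    """Resolve links against page_url and keep only same-host URLs, without duplicates."""
    parser = LinkCollector()
    parser.feed(html)
    host = urlparse(page_url).netloc
    links = []
    for href in parser.hrefs:
        absolute = urljoin(page_url, href).split("#")[0]  # drop fragments
        if urlparse(absolute).netloc == host and absolute not in links:
            links.append(absolute)
    return links


def discover(start_url, sitemap_xml, start_html):
    """Prefer the sitemap when one exists; otherwise fall back to links on the page."""
    if sitemap_xml:
        return urls_from_sitemap(sitemap_xml)
    return internal_links(start_url, start_html)


sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://your-site.com/</loc></url>
  <url><loc>https://your-site.com/pricing</loc></url>
</urlset>"""
html = '<a href="/docs">Docs</a> <a href="https://other.example/x">Out</a> <a href="/docs#top">Top</a>'

with_sitemap = discover("https://your-site.com/", sitemap, html)
without_sitemap = discover("https://your-site.com/", None, html)
```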
Source-cited answers
Every chatbot answer can include the URL of the page the answer came from. Citations make answers verifiable and leave less room for hallucinated facts, and visitors can click through to read more.
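One common way to get source-cited answers is to number the retrieved passages in the prompt and map the model's `[n]` markers back to URLs. The sketch below assumes that approach. The prompt wording, the function names, and the sample answer (a hand-written stand-in for model output) are all hypothetical.

```python
import re


def build_prompt(question, chunks):
    """chunks: list of (url, text) pairs from retrieval, numbered for citation."""
    lines = ["Answer using only the sources below. Cite sources as [n].", ""]
    for n, (url, text) in enumerate(chunks, start=1):
        lines.append(f"[{n}] {url}")
        lines.append(text)
        lines.append("")
    lines.append(f"Question: {question}")
    return "\n".join(lines)


def cited_urls(answer, chunks):
    """Map [n] markers in the answer back to source URLs, ignoring invalid numbers."""
    urls = []
    for n in re.findall(r"\[(\d+)\]", answer):
        i = int(n) - 1
        if 0 <= i < len(chunks) and chunks[i][0] not in urls:
            urls.append(chunks[i][0])
    return urls


chunks = [
    ("https://your-site.com/pricing", "Plans scale by page count."),
    ("https://your-site.com/docs", "Add one script tag to embed the widget."),
]
prompt = build_prompt("How is pricing calculated?", chunks)

# Stand-in for a model response; [7] does not match any source and is dropped.
answer = "Plans scale by page count [1]. See also [7]."
sources = cited_urls(answer, chunks)
```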
Scheduled re-training
Set the chatbot to re-crawl your site daily, weekly, or monthly. New pages get indexed automatically. Removed content gets cleaned up.
Page-level control
Exclude specific URLs (like the admin area or marketing-only pages). Include specific subdomains. Throttle to be polite to your origin server.
Respects your site
User-agent identifies as Chatmount. Honors robots.txt. Configurable crawl rate so you never see a spike from us in your analytics.
The flow, in three steps
1. Paste your URL
https://your-site.com. That's all.
2. Click train
Crawler discovers pages, extracts text, chunks, embeds, indexes. 2-5 minutes for a typical site.
3. Chat or embed
Test in the playground. Copy the one-line embed script. Done.
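The "chunks" part of step 2 is the piece people most often ask about. A minimal sketch, assuming fixed-size word windows with a small overlap so that a sentence cut at a boundary still appears whole in one chunk. The sizes are illustrative defaults, not Chatmount's actual settings.

```python
def chunk_words(text, size=200, overlap=40):
    """Split text into windows of `size` words, each sharing `overlap` words with the previous one."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
pieces = chunk_words(sample, size=4, overlap=1)
print(pieces)
```

Each chunk is then embedded and indexed on its own, which is why retrieval can return a specific passage instead of a whole page.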
What gets indexed (and what doesn't)
Indexed by default
- Marketing site pages
- Blog posts and articles
- Product / pricing pages
- Help center articles
- Public docs
- JS-rendered content
Skipped by default
- Anything in robots.txt disallow
- Pages with noindex meta
- Auth-gated dashboards
- PDF / image-only pages (use PDF training instead)
- URL patterns you exclude in settings
- Off-domain external links
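Two of the skip rules above are easy to picture in code: the noindex meta tag and non-HTML responses such as PDFs. This is a sketch under the assumption that the crawler checks the robots meta tag and the Content-Type header. The helper names are made up for this example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Detect <meta name="robots" content="... noindex ...">."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {k: (v or "") for k, v in attrs}
        if attr.get("name", "").lower() == "robots":
            directives = [d.strip().lower() for d in attr.get("content", "").split(",")]
            if "noindex" in directives:
                self.noindex = True


def has_noindex(html):
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


def is_html(content_type):
    """True when the Content-Type header value describes an HTML page."""
    return content_type.split(";")[0].strip().lower() == "text/html"


def should_index(html, content_type):
    """Index only HTML pages that do not opt out with noindex."""
    return is_html(content_type) and not has_noindex(html)


normal_page = should_index("<p>Hello</p>", "text/html; charset=utf-8")
opted_out = should_index('<meta name="robots" content="noindex, nofollow">', "text/html")
pdf_file = should_index("", "application/pdf")
```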
Mix sources for the best chatbot
Most teams train one chatbot on multiple sources: the website crawl for breadth, PDFs for technical depth, Q&A pairs for hand-tuned FAQ answers.
Frequently Asked Questions
Everything teams ask before pointing a chatbot at their website.
How do I train an AI chatbot on my website?
Paste your website URL into Chatmount and click train. The crawler walks your pages (using your sitemap.xml if available, otherwise following internal links), extracts the readable text, splits it into chunks, generates embeddings, and indexes everything. The chatbot starts answering from your content as soon as indexing finishes — usually 2-5 minutes for sites under 500 pages.
Will the crawler hit my server hard?
No. Chatmount's crawler runs at a polite default rate (one request every few seconds) and respects your robots.txt. You can throttle it further or set crawl-depth limits if you want to be even gentler. The crawler identifies itself as 'Chatmount' in the User-Agent header so you can see it in your access logs.
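For a feel of what a polite crawl rate means in practice, here is a small throttle sketch. The 3-second interval is an assumed value for "every few seconds", and the fake clock exists only so the example runs instantly. None of this is Chatmount's actual implementation.

```python
import time


class Throttle:
    """Enforce a minimum gap between consecutive requests."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.clock = clock   # injectable so the demo below does not really pause
        self.sleep = sleep
        self._last = None    # time of the previous request, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self.clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self._last = now


class FakeTime:
    """Simulated clock that records requested sleeps instead of pausing."""

    def __init__(self):
        self.now = 0.0
        self.naps = []

    def clock(self):
        return self.now

    def sleep(self, seconds):
        self.naps.append(seconds)
        self.now += seconds


fake = FakeTime()
throttle = Throttle(min_interval=3.0, clock=fake.clock, sleep=fake.sleep)
for _ in range(3):
    throttle.wait()  # a real crawler would fetch one page after each wait
```

The first request goes out immediately and each later one waits out the remainder of the interval, so your server sees a steady trickle instead of a burst.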
Can the chatbot stay in sync as I publish new content?
Yes. You can schedule re-crawls daily, weekly, or monthly. New pages are indexed automatically; removed content is cleaned up. Re-training is incremental — only changed pages get re-embedded — so it's fast and doesn't run up your monthly credits.
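Incremental re-training is often built on content fingerprints: hash each page's text, compare against the hash stored from the last crawl, and only re-embed when they differ. The sketch below assumes that approach with SHA-256. The function names and sample pages are hypothetical.

```python
import hashlib


def fingerprint(text):
    """Stable hash of a page's extracted text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_recrawl(previous, current):
    """
    previous: url -> hash stored at the last crawl
    current:  url -> text extracted during this crawl
    Returns (urls to re-embed, urls to delete, new hash table).
    """
    new_hashes = {url: fingerprint(text) for url, text in current.items()}
    to_embed = sorted(url for url, h in new_hashes.items() if previous.get(url) != h)
    to_delete = sorted(url for url in previous if url not in new_hashes)
    return to_embed, to_delete, new_hashes


previous = {
    "/a": fingerprint("old a"),
    "/b": fingerprint("same b"),
    "/gone": fingerprint("removed page"),
}
current = {"/a": "new a", "/b": "same b", "/new": "hello"}

to_embed, to_delete, stored = plan_recrawl(previous, current)
```

Here `/b` is untouched, so it costs nothing on the re-crawl. Only the changed page and the new page are re-embedded, and the removed page is dropped from the index.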
What if my site is JavaScript-heavy (Next.js, React, Vue)?
Chatmount's crawler renders JavaScript so single-page apps and JS-rendered content work correctly. Server-rendered pages and statically generated content work even better — they index faster and use fewer crawl credits.
Can I exclude specific pages from training?
Yes. You can: (1) Add disallow rules in your robots.txt for the Chatmount crawler. (2) Specify URL patterns to exclude in the chatbot settings (e.g., /admin/*, /private/*). (3) Use the noindex meta tag — Chatmount respects it.
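Options (1) and (2) can be sketched with Python's standard `urllib.robotparser` and `fnmatch` modules. This assumes the crawler's robots.txt token is `Chatmount` and that exclude patterns are shell-style globs matched against the URL path. Both are assumptions for illustration, and the noindex option is not covered here.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser


def make_filter(robots_txt, exclude_patterns, user_agent="Chatmount"):
    """Build a function that says whether a URL may be crawled."""
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())  # parse text we already have, no network call

    def allowed(url):
        path = urlparse(url).path or "/"
        # Settings-level exclusions are checked first.
        if any(fnmatchcase(path, pattern) for pattern in exclude_patterns):
            return False
        # Then the site's own robots.txt rules.
        return robots.can_fetch(user_agent, url)

    return allowed


robots_txt = """User-agent: Chatmount
Disallow: /private/
"""
allowed = make_filter(robots_txt, exclude_patterns=["/admin/*"])

blog_ok = allowed("https://your-site.com/blog/hello")
admin_ok = allowed("https://your-site.com/admin/users")
private_ok = allowed("https://your-site.com/private/notes")
```

Note that the glob `*` in `fnmatch` also matches slashes, so `/admin/*` covers nested paths such as `/admin/users/42`.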
How does this differ from training on a sitemap or PDF?
A website crawl gives you breadth and freshness: it follows links and stays current. A sitemap gives you a curated list of pages. A PDF gives you static, point-in-time content. Most teams combine sources: a site crawl for marketing and blog content, PDFs for product manuals, and Q&A pairs for FAQ-style answers.
What about large sites — 1,000+ pages?
Chatmount handles large sites well, but your plan's page limit matters. Maximum pages per crawl: Free 200, Go 1,000, Plus 5,000, Pro 20,000, Enterprise unlimited. You can also exclude entire subdirectories to focus the crawl on the content that actually matters for chatbot answers.
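If you know roughly how many pages you need indexed, the limits above reduce to a simple lookup. The helper below is a hypothetical convenience for estimating which plan fits. It uses only the page limits quoted in this answer.

```python
# Page limits per crawl as listed above; Enterprise has no limit.
PLAN_LIMITS = [("Free", 200), ("Go", 1_000), ("Plus", 5_000), ("Pro", 20_000)]


def smallest_plan(page_count):
    """Return the first plan whose crawl limit covers page_count."""
    for name, limit in PLAN_LIMITS:
        if page_count <= limit:
            return name
    return "Enterprise"


print(smallest_plan(1_500))
```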
Will the chatbot answer questions about content I haven't indexed?
By default the chatbot answers only from your trained content and admits when it doesn't know. You can configure looser behavior (a general-knowledge fallback for off-topic questions) or stricter behavior (refusing anything not in your sources).
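One way to picture these three behaviors is a decision made on the best retrieval score. The sketch below is one possible reading, not a description of Chatmount's internals. The mode names, the 0.35 threshold, and the action labels are all hypothetical.

```python
from enum import Enum


class Mode(Enum):
    STRICT = "strict"
    DEFAULT = "default"
    LOOSE = "loose"


def decide(best_score, mode, threshold=0.35):
    """Pick an action from the best similarity score between the question and indexed content."""
    if best_score >= threshold:
        return "answer_from_sources"
    if mode is Mode.LOOSE:
        return "answer_from_general_knowledge"
    if mode is Mode.STRICT:
        return "refuse"
    return "say_dont_know"


on_topic = decide(0.80, Mode.DEFAULT)
off_topic_default = decide(0.10, Mode.DEFAULT)
off_topic_loose = decide(0.10, Mode.LOOSE)
off_topic_strict = decide(0.10, Mode.STRICT)
```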
Train a chatbot on your site in 5 minutes
7-day free trial. Paste your URL, click train, embed the script. Your content, conversational.