ResearchBot: Internal Knowledge Base RAG Agent

SaaS

Next.js, Node.js, OpenAI API, pgvector, PostgreSQL

76%

Search Time Saved

Vs. manual cross-tool searching

92%

Answer Accuracy

Validated against known-correct answers

100%

Citation Coverage

Of claims backed by a linked source

240+

Weekly Active Users

Across the organization

Illustrative Project

ResearchBot is an illustrative example demonstrating a RAG-based internal knowledge agent pattern, not a completed client engagement.

Overview

ResearchBot lets employees ask a question in plain language and get an answer synthesized from the organization's own internal knowledge: documentation, past support tickets, and Slack discussion history. Each answer cites its original sources, so employees no longer have to search manually across four or five disconnected internal tools.

The Challenge

Internal knowledge was scattered. Some lived in a wiki, some in resolved support tickets that held the actual answer to a recurring question, and some only in Slack threads that nobody had documented elsewhere. The system needed to search across fundamentally different content types and produce a coherent, trustworthy answer, not just a list of possibly relevant links.

Architecture & Technical Decisions

Unified Embedding Index Across Source Types

Documents, ticket resolutions, and Slack threads were each processed into embeddings and stored in a unified vector index, with metadata preserving the source type and a link back to the original. This let a single query search across all sources simultaneously rather than requiring separate searches per tool.
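To make the unified index concrete, here is a minimal in-memory sketch. The type names, field names, and toy three-dimensional embeddings are all hypothetical; a real deployment would get embeddings from a model and, given the stack listed above, would presumably store them in PostgreSQL with pgvector and let the database do the nearest-neighbor ordering. What the sketch shows is one ranked search over every source type, with the source type and link carried along as metadata.

```typescript
// Hypothetical schema for illustration only, not ResearchBot's actual one.
type SourceType = "wiki" | "ticket" | "slack";

interface IndexedChunk {
  id: string;
  sourceType: SourceType; // preserved so results can be labeled or filtered
  sourceUrl: string;      // link back to the original page, ticket, or thread
  text: string;
  embedding: number[];    // assumed to come from an embedding model
}

interface SearchHit {
  chunk: IndexedChunk;
  score: number;
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embedding dimensions must match");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// One query ranks chunks from all source types together.
function searchIndex(
  index: IndexedChunk[],
  queryEmbedding: number[],
  topK: number
): SearchHit[] {
  return index
    .map((chunk) => ({
      chunk,
      score: cosineSimilarity(chunk.embedding, queryEmbedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Toy data: the URLs and vectors are made up.
const exampleIndex: IndexedChunk[] = [
  { id: "w1", sourceType: "wiki", sourceUrl: "https://wiki.example/keys", text: "Key rotation policy", embedding: [1, 0, 0] },
  { id: "t1", sourceType: "ticket", sourceUrl: "https://tickets.example/42", text: "Resolved: key rotation failed", embedding: [0.9, 0.1, 0] },
  { id: "s1", sourceType: "slack", sourceUrl: "https://slack.example/t/7", text: "Lunch plans thread", embedding: [0, 1, 0] },
];
const exampleHits = searchIndex(exampleIndex, [1, 0, 0], 2);
```

In this toy run the wiki chunk and the ticket chunk outrank the unrelated Slack chunk, and each hit still carries the `sourceType` and `sourceUrl` needed to render a citation.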

Chunking Strategy Tailored Per Source Type

Wiki documents were chunked by section to preserve coherent context. Slack threads were chunked by conversation rather than by individual message, since a single message rarely contains a complete answer on its own. Tailoring the chunking to each source type measurably improved retrieval relevance over a one-size-fits-all approach.
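The two chunking rules described here can be sketched as follows. This is an illustration under assumptions: wiki pages are assumed to use Markdown-style `#` headings, Slack messages are assumed to carry a `threadId`, and every name below is hypothetical.

```typescript
interface Chunk {
  sourceType: "wiki" | "slack";
  text: string;
}

// Assumed message shape; a real Slack export has more fields.
interface SlackMessage {
  threadId: string;
  author: string;
  text: string;
}

// Split a wiki page at heading lines so each section stays together.
// Assumes Markdown-style headings ("# ", "## ", ...).
function chunkWikiBySection(page: string): Chunk[] {
  const chunks: Chunk[] = [];
  let current: string[] = [];
  for (const line of page.split("\n")) {
    const isHeading = /^#{1,6}\s/.test(line);
    if (isHeading && current.length > 0) {
      chunks.push({ sourceType: "wiki", text: current.join("\n").trim() });
      current = [];
    }
    current.push(line);
  }
  if (current.length > 0) {
    chunks.push({ sourceType: "wiki", text: current.join("\n").trim() });
  }
  return chunks.filter((c) => c.text.length > 0);
}

// Group messages by thread so a question and its replies form one chunk.
function chunkSlackByThread(messages: SlackMessage[]): Chunk[] {
  const threads = new Map<string, string[]>();
  for (const m of messages) {
    const lines = threads.get(m.threadId) ?? [];
    lines.push(`${m.author}: ${m.text}`);
    threads.set(m.threadId, lines);
  }
  return Array.from(threads.values()).map((lines) => ({
    sourceType: "slack" as const,
    text: lines.join("\n"),
  }));
}
```

Grouping by thread keeps a question next to its eventual answer, which is the property the per-message alternative loses.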

Mandatory Citation Enforcement

The generation prompt requires every factual claim in the answer to be tied to a specific retrieved source, and the system displays the cited sources alongside the answer. If the retrieval step doesn't find anything sufficiently relevant, the system explicitly says so rather than generating an answer from general knowledge that might not reflect the company's actual current practice.
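One way to sketch this behavior is a relevance gate in front of the prompt, plus a post-check on the generated answer. The threshold value, the prompt wording, the `[n]` citation format, and all function names here are assumptions for illustration; the sentence splitter is deliberately naive and would mis-split text containing URLs or abbreviations.

```typescript
interface RetrievedSource {
  id: number;    // number used in [n] citations
  url: string;
  text: string;
  score: number; // retrieval similarity, higher is more relevant
}

// Hypothetical cutoff; a real value would be tuned on evaluation data.
const MIN_RELEVANCE = 0.75;

const NOT_FOUND_MESSAGE =
  "I couldn't find anything in the internal knowledge base that answers this.";

// Returns null when nothing is relevant enough, so the caller can show
// NOT_FOUND_MESSAGE instead of calling the model at all.
function buildPrompt(
  question: string,
  sources: RetrievedSource[],
  minScore: number = MIN_RELEVANCE
): string | null {
  const relevant = sources.filter((s) => s.score >= minScore);
  if (relevant.length === 0) return null;
  const context = relevant
    .map((s) => `[${s.id}] (${s.url})\n${s.text}`)
    .join("\n\n");
  return [
    "Answer using ONLY the numbered sources below.",
    "Every factual sentence must include a citation such as [1].",
    "If the sources do not answer the question, reply exactly: NOT_FOUND",
    "",
    "Sources:",
    context,
    "",
    `Question: ${question}`,
  ].join("\n");
}

// Post-check: list sentences that cite none of the supplied source ids.
function findUncitedSentences(answer: string, validIds: number[]): string[] {
  const rawSentences: string[] = answer.match(/[^.!?]+[.!?]*/g) ?? [];
  const sentences = rawSentences
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
  return sentences.filter((sentence) => {
    const markers: string[] = sentence.match(/\[\d+\]/g) ?? [];
    const citedIds = markers.map((m) => Number(m.slice(1, -1)));
    return !citedIds.some((id) => validIds.indexOf(id) !== -1);
  });
}
```

A prompt instruction alone cannot guarantee compliance, so a check like `findUncitedSentences` gives the application a way to reject or regenerate an answer that contains unsupported sentences.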

Unified vector index across docs, tickets, and Slack history with source-type metadata
Per-source-type chunking strategy rather than one generic approach
Mandatory citation requirement enforced in the generation prompt
Explicit 'not found' response when retrieval confidence is low, rather than guessing

Results

76% reduction in time employees spent searching across multiple internal tools for an answer
92% answer accuracy validated against a set of questions with known-correct answers
100% of factual claims in generated answers were backed by a linked, verifiable source
240+ weekly active users across the organization within the first quarter after launch
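For readers who want to see how accuracy and citation-coverage figures of this kind are typically computed, here is a small sketch. It assumes each evaluation answer has already been graded (by a person or a script) against a known-correct answer and that its claims have been counted; the record shape and the sample numbers are hypothetical and are not the data behind the figures above.

```typescript
// Hypothetical record for one graded evaluation question.
interface GradedAnswer {
  question: string;
  correct: boolean;     // matched the known-correct answer
  claims: number;       // factual claims in the generated answer
  citedClaims: number;  // claims backed by a linked source
}

interface EvalSummary {
  accuracy: number;         // fraction of answers graded correct
  citationCoverage: number; // fraction of claims with a citation
}

function summarize(results: GradedAnswer[]): EvalSummary {
  if (results.length === 0) {
    throw new Error("No graded answers to summarize");
  }
  const correct = results.filter((r) => r.correct).length;
  const claims = results.reduce((n, r) => n + r.claims, 0);
  const cited = results.reduce((n, r) => n + r.citedClaims, 0);
  return {
    accuracy: correct / results.length,
    // With no claims at all there is nothing uncited, so report full coverage.
    citationCoverage: claims === 0 ? 1 : cited / claims,
  };
}

// Made-up sample: 3 of 4 answers correct, 9 of 10 claims cited.
const sampleSummary = summarize([
  { question: "q1", correct: true, claims: 3, citedClaims: 3 },
  { question: "q2", correct: true, claims: 2, citedClaims: 2 },
  { question: "q3", correct: false, claims: 4, citedClaims: 3 },
  { question: "q4", correct: true, claims: 1, citedClaims: 1 },
]);
```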

What I Learned

Retrieval quality, not generation quality, determined almost all of the system's perceived usefulness. The work that mattered most was getting the chunking strategy right for each different content type and being disciplined about citation — an answer with no clear source, even if factually correct, got far less trust from users than a shorter answer with a clear, clickable citation attached.

Related Projects

ComplianceWatch — Regulatory Filing Monitoring Agent

An agent that monitors regulatory filing deadlines across multiple jurisdictions, drafts filing checklists from source requirements, and proactively alerts the compliance team — replacing a manually maintained spreadsheet of deadlines.

Node.js, NestJS, Claude API, PostgreSQL, +1

-100% Missed Deadline Risk

OnboardAI — Automated Employee Onboarding Workflow Agent

An agent that orchestrates the multi-step, multi-system employee onboarding process — provisioning accounts, scheduling training, and answering new-hire questions — that previously required a coordinator manually tracking a checklist across five tools.

Next.js, NestJS, OpenAI API, PostgreSQL, +1

12hrs Coordinator Hours Saved

QABot — Bug Triage & Reproduction Agent for Engineering Teams

An agent that reviews incoming bug reports, attempts reproduction using available context, checks for duplicates, assigns severity, and routes to the right engineering team with relevant logs attached.

Node.js, NestJS, Claude API, PostgreSQL, +1

70% Triage Time Reduced