tRAGar playground

initializing…
chunks:  |  dim:  |  model:
Ingested documents
No documents yet.
Embedder
Fast 6-gram hash. Deterministic, no download, but not truly semantic — synonyms won't score well.
Chunking
Split on one or more blank lines. Natural for prose and markdown.
Filter out chunks shorter than this
Chunk preview
Paste text in the Ingest tab, then click preview.
Display
Dim results below X% of top score
30%
Instance (read-only)
namespace: playground
store:
embedder: playground-hash-v1
dim: 384
To change namespace or store, use reset storage in the Ingest tab.
k = 5
Results will appear here.
No chunks yet. Ingest some text first.
Run a query first, then switch here to see the score distribution.
Chunks

Before indexing, each document is split into chunks — smaller pieces of text that can be matched independently. Smaller chunks give more precise results; larger chunks preserve more context.

Strategy   | How it splits               | Best for
Blank-line | One or more empty lines     | Prose, markdown, docs
Sentence   | Sentence-ending punctuation | Dense text, articles
Fixed size | Every N characters          | Code, logs, structured data

Configure the strategy in the Config tab. Use Preview current text to see how your document would be split before ingesting.
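The blank-line strategy (with the minimum-length filter from Config) can be sketched like this — function name and defaults are illustrative, not the playground's actual API:

```typescript
// Split text on one or more blank lines, then drop chunks that are too short.
function chunkByBlankLines(text: string, minLength = 20): string[] {
  return text
    .split(/\n\s*\n+/)                      // one or more blank lines
    .map((chunk) => chunk.trim())
    .filter((chunk) => chunk.length >= minLength); // min-length filter
}
```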

Embeddings

Each chunk is converted into a vector — an array of numbers (here, 384 dimensions). Similar texts produce similar vectors. This encoding captures meaning, not just keywords.

The playground supports a hash embedder (fast, no download, not semantic) and real transformer models via the Config → Embedder selector. The hash embedder counts 6-character n-gram hashes, which captures surface patterns but not synonyms or paraphrases. Switching to all-MiniLM-L6-v2 or better will dramatically improve recall. Each model uses its own namespace so your data isn't lost when switching.
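A character n-gram hash embedder can be sketched as follows — the hash function (FNV-1a here) and bucketing are illustrative, not the playground's exact scheme:

```typescript
// Hash each 6-character n-gram into one of `dim` buckets and count hits,
// then L2-normalize so cosine similarity reduces to a dot product.
function hashEmbed(text: string, dim = 384, n = 6): Float32Array {
  const vec = new Float32Array(dim);
  const s = text.toLowerCase();
  for (let i = 0; i + n <= s.length; i++) {
    let h = 2166136261; // FNV-1a offset basis
    for (let j = i; j < i + n; j++) {
      h = Math.imul(h ^ s.charCodeAt(j), 16777619);
    }
    vec[(h >>> 0) % dim] += 1; // bucket count for this n-gram
  }
  let normSq = 0;
  for (let i = 0; i < dim; i++) normSq += vec[i] * vec[i];
  const norm = Math.sqrt(normSq);
  if (norm > 0) for (let i = 0; i < dim; i++) vec[i] /= norm;
  return vec;
}
```

Because identical n-grams always hash to the same bucket, the embedder is deterministic — but "car" and "automobile" share no n-grams, which is exactly why synonyms score poorly.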

Click any result or chunk to expand its embedding bar chart — the top-64 active dimensions, showing which features the embedder fired on.
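Selecting the most active dimensions for that chart amounts to ranking by absolute value — a simple sketch (names illustrative):

```typescript
// Return the top-n dimensions of a vector, ranked by absolute magnitude.
function topDims(vec: number[], n = 64): { dim: number; value: number }[] {
  return vec
    .map((value, dim) => ({ dim, value }))
    .sort((a, b) => Math.abs(b.value) - Math.abs(a.value))
    .slice(0, n);
}
```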

Cosine similarity (score)

Retrieval ranks chunks by the cosine similarity between the query vector and each chunk vector. The score is between −1 and 1; higher is more similar.

Scores near 1.0 mean near-identical vectors. Scores near 0.0 mean orthogonal (unrelated). Negative scores never occur with the hash embedder: all of its vector values are non-negative, so no dot product can be negative.
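The scoring formula itself is small enough to show in full:

```typescript
// Cosine similarity: dot product divided by the product of the norms.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}
```

If both vectors are already L2-normalized (as embedders typically emit them), the denominator is 1 and the score is just the dot product.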

The Score dim threshold in Config dims results that score less than X% of the top hit — useful for filtering out low-relevance chunks without hard-coding a cutoff.
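An illustrative version of that relative threshold (the playground's internals may differ):

```typescript
// Mark results scoring below `pct` of the top hit as dimmed.
function dimBelow<T extends { score: number }>(
  results: T[],           // assumed sorted best-first
  pct = 0.3               // the 30% default from Config
): (T & { dimmed: boolean })[] {
  const top = results[0]?.score ?? 0;
  return results.map((r) => ({ ...r, dimmed: r.score < pct * top }));
}
```

Because the cutoff is relative to the best hit, it adapts to each query instead of requiring a fixed absolute score.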

k (top-k)

The k slider in the Results tab controls how many results to return. Larger k = higher recall (more chunks returned), but may include lower-relevance hits.
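Top-k selection is just a rank-and-truncate — a minimal sketch:

```typescript
// Sort chunks by score (descending) and keep the best k.
function topK<T extends { score: number }>(items: T[], k: number): T[] {
  return [...items].sort((a, b) => b.score - a.score).slice(0, k);
}
```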

The Histogram tab shows the full score distribution across all chunks — not just top-k — which helps you choose a good k value and understand how well-separated the relevant chunks are.
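Binning scores for such a histogram can be sketched like this (bin count and range are assumptions, not the playground's exact settings):

```typescript
// Count scores into `bins` equal-width buckets over [0, 1],
// clamping out-of-range values into the edge buckets.
function histogram(scores: number[], bins = 10): number[] {
  const counts = new Array(bins).fill(0);
  for (const s of scores) {
    const i = Math.min(bins - 1, Math.max(0, Math.floor(s * bins)));
    counts[i]++;
  }
  return counts;
}
```

A clear gap between a small cluster of high bins and the bulk of low bins suggests the relevant chunks are well separated, and k can be set to the size of that cluster.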

Storage (OPFS / IndexedDB)

tRAGar persists chunks to the Origin Private File System (OPFS) — a fast, sandboxed, browser-native file store. Chunks survive page reloads within the same origin.

If OPFS is unavailable (an older browser, or a non-secure HTTP context), the library automatically falls back to IndexedDB and emits a StoreFallback warning shown in the log.
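The feature detection behind that fallback might look roughly like this — a hedged sketch, not the library's actual check:

```typescript
// Probe for OPFS support; fall back to IndexedDB if the probe fails.
async function pickStore(): Promise<"opfs" | "indexeddb"> {
  try {
    const nav = (globalThis as any).navigator;
    if (nav?.storage?.getDirectory) {
      await nav.storage.getDirectory(); // throws when OPFS is unavailable
      return "opfs";
    }
  } catch {
    // fall through to the IndexedDB fallback
  }
  return "indexeddb";
}
```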

The store namespace (here: playground) isolates data. Two pages with different namespaces don't share chunks. Use reset storage to wipe the current namespace.

Keyboard shortcuts
Key   | Action
Enter | Run query (when query input focused)