L2 semantic chatbot cache
Repeat-style chatbot questions are served from the L2 semantic cache (bge-m3 embeddings, cosine similarity threshold 0.92) instead of calling the LLM. This cuts cost by ~35% on FAQ-heavy traffic.
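The lookup described above can be sketched roughly as follows. This is a minimal illustration, not the shipped implementation: `SemanticCache`, `answer_with_cache`, the `embed` callable, and `call_llm` are hypothetical names, the linear scan stands in for whatever vector index is actually used, and `embed` is assumed to wrap a bge-m3 embedding call. Only the 0.92 cosine threshold comes from the description.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class SemanticCache:
    """Hypothetical in-memory semantic cache keyed by question embeddings."""

    def __init__(self, embed: Callable[[str], Vector], threshold: float = 0.92):
        # `embed` is an assumption: any function mapping text to a vector,
        # e.g. a wrapper around a bge-m3 model.
        self.embed = embed
        self.threshold = threshold
        self.entries: List[Tuple[Vector, str]] = []

    def lookup(self, question: str) -> Optional[str]:
        """Return the stored answer of the most similar question, if it clears the threshold."""
        query = self.embed(question)
        best_score, best_answer = -1.0, None
        for vector, answer in self.entries:
            score = cosine_similarity(query, vector)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None

    def store(self, question: str, answer: str) -> None:
        """Remember the answer under the question's embedding."""
        self.entries.append((self.embed(question), answer))


def answer_with_cache(
    question: str, cache: SemanticCache, call_llm: Callable[[str], str]
) -> str:
    """Serve from the cache on a hit; otherwise call the LLM and store the result."""
    cached = cache.lookup(question)
    if cached is not None:
        return cached
    answer = call_llm(question)
    cache.store(question, answer)
    return answer
```

In this sketch a rephrased question whose embedding scores at least 0.92 against a stored one returns the stored answer without an LLM call; anything below the threshold falls through to the LLM and is cached for next time.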