Monitoring Discourse AI

Monitoring and evaluating LLMs is critical. As Hamel Husain puts it:

> I started working with language models five years ago when I led the team that created CodeSearchNet, a precursor to GitHub CoPilot. Since then, I’ve seen many successful and unsuccessful approaches to building LLM products. I’ve found that unsuccessful products almost always share a common root cause: a failure to create robust evaluation systems.

— https://hamel.dev/blog/posts/evals/

If Discourse AI is to power business-critical LLM tasks, I think support for monitoring and tracing tools like LangSmith should be a priority.

Using LangSmith is as simple as running `yarn add langchain langsmith` and adding a few environment variables.
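To make that concrete, here is a minimal sketch using the `langsmith` JavaScript SDK. The project name and the `summarizeTopic` function are placeholders I made up for illustration, not anything from Discourse AI; tracing itself is driven entirely by the standard `LANGCHAIN_*` environment variables and the `traceable` wrapper.

```typescript
// Standard LangSmith configuration, set in the environment
// (the project name "discourse-ai" is just an example):
//
//   LANGCHAIN_TRACING_V2=true
//   LANGCHAIN_API_KEY=<your LangSmith API key>
//   LANGCHAIN_PROJECT=discourse-ai
//
// With those set, wrapping an async function in traceable() is enough to
// record a run (inputs, outputs, latency) in LangSmith.
import { traceable } from "langsmith/traceable";

const summarizeTopic = traceable(
  async (topicText: string) => {
    // Call whatever LLM you like here; the wrapper doesn't care.
    return `summary of: ${topicText.slice(0, 40)}...`;
  },
  { name: "summarize-topic" }
);

summarizeTopic("Example Discourse topic body").then(console.log);
```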

Has the Discourse team thought about how we could configure LLM tracing? Also, any thoughts on how we could implement this before discourse-ai officially supports it?
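While waiting for first-class support, one interim idea (only a sketch, not something Discourse ships) is to sit a tiny OpenAI-compatible passthrough in front of the endpoint Discourse AI is configured to call, and let LangSmith trace each forwarded completion. The port, the run name, and holding the upstream API key in the proxy are all my assumptions, and this version ignores streaming responses entirely.

```typescript
// Rough sketch of an interim approach: an OpenAI-compatible passthrough
// that Discourse AI could be pointed at, with LangSmith tracing each
// forwarded chat completion. Assumes non-streaming requests and that the
// proxy holds the upstream key (so the secret never appears in trace inputs).
import http from "node:http";
import { traceable } from "langsmith/traceable";

const UPSTREAM = "https://api.openai.com/v1/chat/completions";

// Each call to this wrapped function shows up as an "llm" run in LangSmith,
// provided LANGCHAIN_TRACING_V2 / LANGCHAIN_API_KEY are set as above.
const forwardCompletion = traceable(
  async (body: Record<string, unknown>) => {
    const res = await fetch(UPSTREAM, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      },
      body: JSON.stringify(body),
    });
    return await res.json();
  },
  { name: "discourse-ai-chat-completion", run_type: "llm" }
);

http
  .createServer(async (req, res) => {
    // Read the JSON body Discourse AI sends, forward it, and return the
    // upstream completion unchanged.
    const chunks: Buffer[] = [];
    for await (const chunk of req) chunks.push(chunk as Buffer);
    const body = JSON.parse(Buffer.concat(chunks).toString() || "{}");

    const completion = await forwardCompletion(body);

    res.writeHead(200, { "Content-Type": "application/json" });
    res.end(JSON.stringify(completion));
  })
  .listen(8787, () => console.log("LangSmith tracing proxy on :8787"));
```

Streaming completions would need handling before this could carry real Discourse AI traffic, and a proper solution clearly belongs inside the plugin itself.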


