Handpicked and reviewed AI applications to supercharge your workflow — updated daily.
Page 5 of 12 · 1,004 tools
Cleanlab automatically finds and fixes label errors in datasets using confident learning algorithms. Studies show 3-10% of labels…
Vellum is an AI product development platform with prompt versioning, side-by-side comparisons, and evaluation workflows. Product and engineering…
Labelbox is a data-centric AI platform for labeling training data, managing datasets, and evaluating model quality. Used by…
PromptLayer is a platform for tracking, managing, and evaluating LLM prompts in production. Log every prompt and completion,…
Langfuse is an open-source LLM engineering platform for observability, testing, and prompt management. Debug production AI issues, evaluate…
Helicone provides one-line LLM observability — add a single line to your OpenAI calls and get full logging,…
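The "one line" Helicone refers to is typically swapping your OpenAI base URL for a logging proxy endpoint. A minimal stdlib-only sketch of that pattern, assuming an OpenAI-compatible gateway at `oai.helicone.ai` authenticated via a `Helicone-Auth` header (both taken from Helicone's public docs; the `build_request` helper is hypothetical):

```python
import os

# Direct endpoint vs. logging-proxy endpoint (assumption: the proxy is
# OpenAI-compatible and forwards requests while recording them).
OPENAI_BASE = "https://api.openai.com/v1"
PROXY_BASE = "https://oai.helicone.ai/v1"

def build_request(path: str, use_proxy: bool = True) -> dict:
    """Hypothetical helper: assemble the URL and headers for an API call,
    optionally routed through the observability proxy."""
    base = PROXY_BASE if use_proxy else OPENAI_BASE
    headers = {"Authorization": f"Bearer {os.environ.get('OPENAI_API_KEY', '')}"}
    if use_proxy:
        # Extra header identifying you to the proxy for logging.
        headers["Helicone-Auth"] = f"Bearer {os.environ.get('HELICONE_API_KEY', '')}"
    return {"url": f"{base}{path}", "headers": headers}

req = build_request("/chat/completions")
```

In practice you would pass the proxy URL as `base_url` when constructing your OpenAI client, so no other application code changes.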
Opik by Comet is an open-source LLM evaluation framework for testing AI application quality at scale. Automated evaluation…
Braintrust is an enterprise AI evaluation platform for measuring, improving, and shipping AI applications. Logging, evaluation datasets, prompt…
Phoenix by Arize is an open-source AI observability library for ML engineers. Traces LLM and embedding applications, visualizes…
TruLens is an open-source framework for evaluating and tracking LLM applications. Feedback functions assess truthfulness, harmlessness, and helpfulness…
We review every submission within 24–48 hours. Free listing, no strings attached.