AI Workflows can display confidence scores and AI reasoning during workflow execution and moderation.
These features help teams better understand:
how certain the AI is about generated output
why the AI generated specific results
which products may require manual review
Confidence scoring and reasoning improve transparency and help maintain quality control inside automated workflows.
What are confidence scores?
Confidence scores indicate how certain the AI is about the generated result.
Higher confidence usually means:
clear source data
strong contextual information
reliable extraction or generation
Lower confidence may indicate:
incomplete product data
ambiguous descriptions
weak source information
uncertain classifications
Confidence scores help reviewers identify which outputs may require closer inspection.
What is AI reasoning?
AI reasoning explains why the AI generated specific output.
Reasoning provides visibility into:
extraction logic
classification decisions
translation choices
generated content behavior
This helps moderators better understand how the AI reached a result.
Why confidence scores matter
Large workflows may process:
thousands of products
multiple languages
incomplete supplier catalogs
inconsistent product structures
Reviewing every product manually is often unrealistic.
Confidence scores help teams:
prioritize moderation effort
focus on uncertain outputs
automate high-confidence results
improve workflow efficiency
Why AI reasoning matters
AI reasoning improves transparency and trust inside automation pipelines.
Without reasoning, reviewers only see the generated result.
With reasoning, reviewers also understand:
what information the AI used
how conclusions were reached
why specific values were selected
This improves moderation quality and debugging capabilities.
Where confidence scores appear
Confidence scores may appear during:
Attribute Extraction
Content Enrichment
Translation
Category Mapping
Validation workflows
Quality scoring workflows
Scores are typically visible inside:
test results
moderation screens
workflow result views
Where AI reasoning appears
AI reasoning may be visible inside:
workflow test results
moderation interfaces
action result screens
Reasoning is often accessible by opening the generated result details.
Example confidence score
Example:
Flavor extraction:
Flavor → Salmon
Confidence score → 96%
This indicates the AI is highly certain, based on the source information, that the product has a salmon-based flavor.
Example AI reasoning
Example reasoning:
"The product description references salmon based dry cat food intended for adult cats, therefore Flavor was assigned to Salmon and Lifecycle to Adult."
This helps reviewers understand why the extraction was generated.
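Put together, the two examples above can be pictured as a single result record. The sketch below is illustrative only: ExtractionResult and its field names are assumptions made for this article, not the actual workflow output format.

```python
from dataclasses import dataclass

@dataclass
class ExtractionResult:
    """One extracted attribute value plus its confidence and reasoning (hypothetical format)."""
    attribute: str     # extracted attribute, e.g. "Flavor"
    value: str         # generated value, e.g. "Salmon"
    confidence: float  # certainty as a fraction, e.g. 0.96 for 96%
    reasoning: str     # explanation of why the value was selected

# The salmon example above expressed as one record.
result = ExtractionResult(
    attribute="Flavor",
    value="Salmon",
    confidence=0.96,
    reasoning="The product description references salmon-based dry cat food "
              "intended for adult cats, therefore Flavor was assigned to Salmon.",
)
```

Keeping the score and the reasoning next to the generated value lets moderators see all three together during review.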
High-confidence outputs
High-confidence outputs often contain:
clear product descriptions
structured source data
strong contextual signals
unambiguous terminology
These outputs may require less manual review.
Low-confidence outputs
Low-confidence outputs may occur when:
supplier content is incomplete
descriptions are vague
products contain conflicting information
categories overlap heavily
translations lack context
These outputs usually benefit from manual moderation.
Using confidence scores during moderation
Moderators can use confidence scores to prioritize review work.
Example strategy:
High-confidence outputs → lighter review
Medium-confidence outputs → standard moderation
Low-confidence outputs → detailed inspection
This helps scale moderation more efficiently.
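One way a team might apply this strategy programmatically is a simple threshold mapping. The Python sketch below is a hypothetical illustration rather than product functionality, and the cut-off values are assumptions to be tuned per workflow.

```python
def review_tier(confidence: float) -> str:
    """Map a confidence score (0.0-1.0) to a moderation tier.

    The 0.85 / 0.60 cut-offs are assumptions; tune them per workflow.
    """
    if confidence >= 0.85:
        return "lighter review"       # high confidence
    if confidence >= 0.60:
        return "standard moderation"  # medium confidence
    return "detailed inspection"      # low confidence


print(review_tier(0.96))  # lighter review
print(review_tier(0.54))  # detailed inspection
```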
Confidence scores are indicators, not guarantees
A high confidence score does not always mean the output is correct.
Similarly, low confidence does not always mean the output is wrong.
Confidence scores should be used as:
review indicators
prioritization tools
workflow guidance
They should not be treated as absolute truth.
Improving confidence scores
Confidence scores often improve when:
source data becomes cleaner
prompts become more specific
workflows become more targeted
extracted attributes provide more context
Workflow optimization usually improves both confidence and output quality over time.
Improving AI reasoning quality
Better prompts often produce:
clearer reasoning
more transparent logic
stronger contextual explanations
Well-structured workflows also improve reasoning reliability.
Common reasons for low confidence
Low confidence scores may be caused by:
missing descriptions
weak supplier content
incomplete attributes
vague product names
unclear category structures
Improving source data quality often improves workflow performance significantly.
Best practices for using confidence scores
Prioritize low-confidence reviews
Focus moderation effort on:
uncertain outputs
edge cases
complex products
multilingual content
This improves operational efficiency.
Combine confidence with moderation
Confidence scoring works best when combined with:
human review
workflow moderation
prompt optimization
This creates safer automation pipelines.
Use category specific workflows
Different categories often produce different confidence behavior.
Examples:
Fashion products
Electronics
Pet food
Technical equipment
Focused workflows improve extraction reliability.
Improve prompts continuously
Weak prompts often produce:
lower confidence
vague reasoning
inconsistent output
Prompt optimization is an important part of workflow management.
Example moderation flow
Example:
A webshop uses Attribute Extraction to detect:
Flavor
Lifecycle
Workflow results:
Product A → 98% confidence
Product B → 54% confidence
Moderators focus manual review on Product B because the AI is less certain about the extraction.
Reasoning helps reviewers understand why the AI selected the generated values.
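A rough sketch of how this triage could be automated is shown below; the field names, threshold, and reasoning strings are invented for illustration and are not actual workflow output.

```python
# Hypothetical workflow results for the two products in the example.
results = [
    {"product": "Product A", "confidence": 0.98,
     "reasoning": "Description clearly states a salmon flavor for adult cats."},
    {"product": "Product B", "confidence": 0.54,
     "reasoning": "Flavor inferred from an ambiguous ingredient list."},
]

REVIEW_THRESHOLD = 0.85  # assumed cut-off; adjust to your catalog

# Route uncertain results to manual review, least certain first, and surface
# the reasoning so moderators can see why each value was selected.
needs_review = sorted(
    (r for r in results if r["confidence"] < REVIEW_THRESHOLD),
    key=lambda r: r["confidence"],
)
for r in needs_review:
    print(f'{r["product"]} ({r["confidence"]:.0%}): {r["reasoning"]}')
# Product B (54%): Flavor inferred from an ambiguous ingredient list.
```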
Why confidence scoring and reasoning are important
Confidence scores and AI reasoning help businesses:
scale moderation safely
improve transparency
reduce manual review effort
identify weak source data
improve workflow quality over time
These features make AI Workflows more understandable, controllable and scalable for large catalog operations.