Back to prompts
AI Masterybeginner
0.0

Grade Any AI Output — Know If It's Actually Good or Just Sounds Good

AI can sound confident while being wrong. This prompt turns AI into its own quality checker.

Copy & Paste this prompt
You are an AI output quality analyst. Your job is to critically evaluate AI-generated content — not praise it.

I'll give you an AI output. Grade it ruthlessly on these dimensions:

1. ACCURACY (0-10) — Are the facts correct? Any hallucinations or made-up information? Flag anything that needs verification.
2. COMPLETENESS (0-10) — Does it fully answer the question? What's missing?
3. ACTIONABILITY (0-10) — Can someone actually USE this output? Or is it vague motivational fluff?
4. ORIGINALITY (0-10) — Is this generic copy-paste thinking, or does it show genuine insight?
5. STRUCTURE (0-10) — Is it well-organized? Easy to scan? Properly formatted?

Then provide:
- OVERALL GRADE (A+ to F)
- TOP 3 WEAKNESSES — Be specific, not polite
- REWRITE SUGGESTION — Show how the worst section should have been written
- VERDICT — "Would a human expert approve this?" (Yes / With edits / No / Dangerous)

The AI output to evaluate:
[PASTE THE AI OUTPUT HERE]

Original question/prompt that produced it:
[PASTE THE ORIGINAL PROMPT]
#quality-check#fact-checking#ai-evaluation#critical-thinking

Works with

chatgptclaudegemini

💡 Pro Tips

  • Use this BEFORE publishing or acting on any AI-generated content
  • Run it on a different AI model than the one that created the output
  • Keep a log of common AI failure patterns you discover

✨ Example Output

ACCURACY: 6/10 — Paragraph 3 claims 'React is faster than Vue in all benchmarks' which is incorrect. Vue 3's compiler optimizations outperform React in several scenarios...

ACTIONABILITY: 4/10 — Says 'optimize your code' but never shows HOW. Classic AI fluff.

OVERALL GRADE: C+
VERDICT: With edits — A human expert would flag 3 inaccuracies and demand specifics.

🧠 Why This Works

Using AI as its own quality evaluator exploits the model's ability to apply rubrics consistently. By defining explicit grading criteria — accuracy, completeness, clarity, actionability — you transform subjective quality assessment into structured, repeatable evaluation.

📅 When to Use This Prompt

Use after generating any important AI output before publishing or acting on it. Critical for content creators fact-checking articles, developers reviewing AI-generated code, or researchers validating AI-synthesized information.

🎯 What You'll Get

You get a detailed scorecard with ratings across multiple quality dimensions, specific weaknesses identified, and concrete improvement suggestions. Unreliable claims are flagged with confidence levels so you know exactly what to verify.

🔗 Related Prompts

Coding & Development

Senior Engineer Code Review

Get a thorough code review as if a senior engineer is looking at your PR — bugs, patterns, performance, and suggestions.

code-reviewbest-practicessecurity
4.9
intermediate
Decision MakingPremium

Cognitive Bias Detector

Identify cognitive biases affecting your decisions with debiasing techniques.

cognitive-biaspsychologycritical-thinking
4.6
intermediate