How Assay works — AI contribution detection for GitHub

Scoring model

Every submission produces a score from 0 (certainly human) to 100 (certainly AI agent). The scoring engine runs three tiers of signals in sequence:

    score = 0

    // Tier 1: zero false positive
    // Any single signal firing = high confidence
    if (promptLeakage) score = 95

    // Tier 2: low false positive
    // Requires multiple signals before raising score
    lowFpCount = fired signals in this tier
    if (lowFpCount >= 3) score = max(score, 85)
    if (lowFpCount >= 2) score = max(score, 70)
    if (lowFpCount == 1) score = max(score, 40)

    // Tier 3: medium false positive
    // Score boosters only, never sufficient alone to flag
    mediumBoost = sum(each fired signal × 0.15)
    score = min(score + mediumBoost, 95)

    // Final
    score = clamp(score, 0, 100)
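The pseudocode above can be transcribed into Python roughly as follows. This is a minimal sketch, not Assay's actual implementation: the function name `assay_score` and its parameters are hypothetical, and the 0.15 weight is applied literally as written in the pseudocode (its intended unit is not stated, so it is exposed as a parameter).

```python
def assay_score(prompt_leakage: bool, low_fp_count: int, medium_count: int,
                medium_weight: float = 0.15) -> float:
    """Combine the three signal tiers into a score between 0 and 100.

    prompt_leakage: whether the Tier 1 (zero-FP) signal fired.
    low_fp_count:   number of Tier 2 (low-FP) signals that fired.
    medium_count:   number of Tier 3 (medium-FP) signals that fired.
    medium_weight:  per-signal boost; 0.15 is taken literally from the
                    pseudocode (assumption: its unit is not specified).
    """
    score = 0.0

    # Tier 1: a zero-false-positive signal is high confidence by itself.
    if prompt_leakage:
        score = 95.0

    # Tier 2: the floor rises with the number of low-FP signals that fired.
    if low_fp_count >= 3:
        score = max(score, 85.0)
    elif low_fp_count == 2:
        score = max(score, 70.0)
    elif low_fp_count == 1:
        score = max(score, 40.0)

    # Tier 3: each fired medium-FP signal adds a boost; the boosted
    # score is capped at 95.
    score = min(score + medium_count * medium_weight, 95.0)

    # Final clamp to the documented 0-100 range.
    return max(0.0, min(score, 100.0))
```

Under these assumptions, `assay_score(False, 2, 0)` returns `70.0` and `assay_score(True, 0, 0)` returns `95.0`.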

The Tier 3 boost is capped: it can never lift a score above 95. A submission with perfect AI vocabulary and sentence uniformity but no Tier 1 or Tier 2 signals will score at most in the 40–60 range, which is marked "uncertain," not "agent."

Signal reference

Signal	Tier	What it detects
Prompt leakage	Zero FP	Unfilled template tokens in submission text — `{{first_name}}`, `[INSERT COMPANY]`. Only AI agents send these.
Heartbeat cadence	Low FP	Submissions from the same author arriving at machine-consistent intervals (30, 60, or 120 minutes, each within ± 3 minutes).
Send-time anomaly	Low FP	Unnaturally even distribution of submission times — humans submit in irregular bursts, agents optimize for exact times.
Ghost author	Low FP	Username patterns consistent with algorithmic generation (high digit ratio, low vowel ratio) combined with no prior contribution history.
Superhuman speed	Low FP	Responses to comments arriving within seconds, consistently across a thread.
Sentence uniformity	Medium	Coefficient of variation in sentence length below 0.25. Human writing averages CV > 0.40. LLMs produce suspiciously consistent sentence lengths.
LLM vocabulary	Medium	Three or more characteristic LLM phrases ("I hope this finds you well," "please don't hesitate," "I'd be happy to") in a single submission.
AI opener formula	Medium	Research hook + pivot phrase + hard CTA structure in the opening paragraph. Fires when 3 of 5 sub-indicators are present.
Greeting formula	Medium	AI-generated greeting and closing phrases found in the first 100 and last 150 characters. Fires when 2+ patterns match.
Structural template	Medium	2–5 paragraph structure with personalization hook in first paragraph and CTA in last, 80–250 words, no casual language.
Personalization ratio	Medium	Generic pitch body with inserted personalization tokens — the pattern of AI SDR tools that merge variable fields into templates.
No human artifacts	Medium	Absence of typos, colloquialisms, formatting inconsistencies, or sentence fragments typical of real human writing.
Reply mirroring	Medium	Replies that address every point from the prior message in the same order — a systematic LLM response pattern.
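
Three of the signals in the table are specific enough to sketch in code: prompt leakage, heartbeat cadence, and sentence uniformity. The following is a rough illustration rather than Assay's actual detectors. All function names are hypothetical, the regular expression is modeled only on the two example tokens shown above, and the minimum sample sizes and the use of population standard deviation are assumptions.

```python
import re
import statistics
from datetime import datetime, timedelta

# Illustrative pattern only, modeled on `{{first_name}}` and `[INSERT COMPANY]`.
TEMPLATE_TOKEN = re.compile(r"\{\{\s*\w+\s*\}\}|\[INSERT [A-Z _]+\]")


def has_prompt_leakage(text: str) -> bool:
    """True if the text contains an unfilled template token."""
    return TEMPLATE_TOKEN.search(text) is not None


def has_heartbeat_cadence(times: list[datetime], tolerance_min: float = 3.0) -> bool:
    """True if every gap between consecutive submissions lies within the
    tolerance of the same base interval (30, 60, or 120 minutes)."""
    if len(times) < 3:  # assumption: at least two gaps are needed
        return False
    ordered = sorted(times)
    gaps = [(b - a).total_seconds() / 60 for a, b in zip(ordered, ordered[1:])]
    return any(
        all(abs(gap - base) <= tolerance_min for gap in gaps)
        for base in (30, 60, 120)
    )


def has_uniform_sentences(text: str, cv_threshold: float = 0.25) -> bool:
    """True if the coefficient of variation (stdev / mean) of sentence
    length, measured in words, is below the threshold."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 3:  # assumption: too few sentences to measure variation
        return False
    cv = statistics.pstdev(lengths) / statistics.mean(lengths)
    return cv < cv_threshold


# Usage: four submissions spaced roughly one hour apart.
start = datetime(2024, 1, 1, 9, 0)
hourly = [start + timedelta(minutes=m) for m in (0, 61, 119, 180)]
print(has_heartbeat_cadence(hourly))
print(has_prompt_leakage("Hi {{first_name}}, quick question"))
```

The template pattern is kept deliberately narrow, since a signal meant to have zero false positives should not fire on ordinary bracketed text such as `[brackets]` in a commit message.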

Thresholds

0–39: No comment posted. Submission passes silently.
40–60: Comment posted with "Possibly AI-generated" finding and `possibly-ai-generated` label.
61–100: Comment posted with "Likely AI-generated" finding and `ai-generated` label.
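
The bands above amount to a simple lookup from score to outcome. A minimal sketch follows, assuming the bands apply exactly as listed, so a score of 40 falls in the "Possibly AI-generated" band. The function name `classify` is hypothetical, and treating fractional scores between 60 and 61 as "Likely AI-generated" is an assumption the text does not settle.

```python
from typing import Optional, Tuple


def classify(score: float) -> Optional[Tuple[str, str]]:
    """Return (finding, label) for a score, or None when no comment is posted."""
    if score < 40:
        # 0-39: the submission passes silently.
        return None
    if score <= 60:
        # 40-60: flagged as possibly AI-generated.
        return ("Possibly AI-generated", "possibly-ai-generated")
    # Above 60 (assumption for fractional scores): flagged as likely AI-generated.
    return ("Likely AI-generated", "ai-generated")
```

For instance, `classify(39)` returns `None`, while `classify(70)` returns `("Likely AI-generated", "ai-generated")`.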

The 40-point threshold is conservative by design. A single low-FP signal firing raises the score to 40 but does not trigger a comment. Two low-FP signals — or a zero-FP signal — are required before Assay says anything.

False positives

The tiered signal architecture is designed to make false positives rare. Medium-FP signals cannot flag a submission on their own — they only boost a score already elevated by Tier 1 or Tier 2 signals.

If Assay incorrectly flags a human contributor, you can dismiss the label manually. Feedback on false positives is welcome — contact us.

Privacy

Assay processes the text content of submissions to compute a score. Submission text is not stored after scoring. Author usernames are not retained. No submission data is used for training or shared with third parties.

Full privacy policy →