Question 1

What is copyright.sh?

Accepted Answer

copyright.sh is a content licensing platform that enables fair compensation for creators when their content is used to train AI models. Think of it as "ASCAP/BMI for the AI age" - we ensure creators get paid when AI companies use their work.

We provide a simple meta tag system for content protection, real-time usage tracking with HMAC verification, and automatic payments to creators when AI companies license their content.

Question 2

How is this different from traditional licensing?

Accepted Answer

Traditional licensing requires complex negotiations, legal paperwork, and often takes months. copyright.sh provides:

• Instant setup with a simple meta tag
• Transparent pricing set by creators
• Real-time payments (daily vs quarterly)
• Global coverage (not limited to specific regions)
• 100% revenue share to creators vs ~60-70% in traditional licensing

Question 3

Is this legal?

Accepted Answer

Yes, copyright.sh is built on established copyright law. Content creators own the copyright to their work and have the legal right to license it on their terms.

We're also supporting 35+ ongoing lawsuits against AI companies that use content without permission. Our system provides the legal documentation and proof needed for copyright enforcement.

Question 4

How much does it cost?

Accepted Answer

For Creators: Free to join. You keep 100% of licensing fees. We charge AI companies an additional 10-15% platform fee for our compliance and tracking services.

For AI Companies: Pricing follows a multi-axis model based on jurisdiction, risk tier, and content class. Rates start at $0.01 / 1K tokens for open-web public-domain–equivalent text and scale up to $50 / 1K tokens for embargoed newsroom or premium book content. There are no minimum commitments, monthly fees, or hidden charges: you pay only for the exact tokens you train on.

Question 5

How do I get started as a creator?

Accepted Answer

Getting started takes just 5 minutes:

1. Sign up for a free account
2. Add our meta tag to your website
3. Set your rates and start earning when AI companies use your content

Question 6

How much can I earn?

Accepted Answer

Earnings depend on your content quality, traffic, and rates. Based on our launch projections, creators can expect:

• Tech bloggers: $85-$140/month
• Researchers: $200-$312/month
• News sites: $150-$250/month
• Authors: $50-$400/month

Use our earnings calculator on the pricing page for a personalized estimate.

Question 7

What rates should I set?

Accepted Answer

Popular rate ranges by content type:

• News articles: $0.01-$0.50 per 1K tokens
• Blog posts: $0.10-$1.00 per 1K tokens
• Technical docs: $0.50-$5.00 per 1K tokens
• Research papers: $1.00-$10.00 per 1K tokens
• Books/premium content: $30-$100 per 1K tokens

You can adjust rates anytime based on demand and results.

Question 8

When and how do I get paid?

Accepted Answer

You get paid daily via Stripe once you earn $10 or more. Payments are automatic - no invoicing or paperwork required.

We support multiple currencies (EUR, USD, GBP) and provide detailed earning reports for tax purposes.

Question 9

Can I opt out anytime?

Accepted Answer

Yes, you can opt out anytime by removing the meta tag from your website. There are no contracts or commitments.

You'll receive any pending payments, and AI companies will no longer be able to license your content through our platform.

Question 10

Why should we license content instead of using fair use?

Accepted Answer

Commercial AI training is likely not fair use. With 35+ active lawsuits seeking billions in damages, the legal risk is enormous. Licensing provides:

• Legal protection from copyright claims
• Higher quality, curated training data
• Brand protection: avoid 'AI theft' accusations
• Regulatory compliance for the EU AI Act and similar laws

Question 11

How do we integrate with our training pipeline?

Accepted Answer

Integration takes under an hour:

1. Register for API access
2. Check licenses before using content via our API
3. Use licensed content in your training with automatic billing

We provide SDKs for Python, JavaScript, and Go, plus comprehensive API documentation.
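The "check licenses before using content" step can be pictured as a filter in front of the training pipeline. The sketch below is only an illustration: it does not use the real SDK or API, and `License`, `LicenseRegistry`, `lookup`, and `select_licensed` are hypothetical names, with an in-memory dictionary standing in for the API call.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class License:
    """Hypothetical license record; field names are assumptions."""
    url: str
    rate_per_1k_usd: float


class LicenseRegistry:
    """In-memory stand-in for a license lookup service (not the real API)."""

    def __init__(self, licenses: Iterable[License]) -> None:
        self._by_url = {lic.url: lic for lic in licenses}

    def lookup(self, url: str) -> Optional[License]:
        # Returns the license for a URL, or None when no license is known.
        return self._by_url.get(url)


def select_licensed(urls: Iterable[str], registry: LicenseRegistry):
    """Split candidate URLs into licensed ones and ones to skip."""
    licensed, skipped = [], []
    for url in urls:
        if registry.lookup(url) is not None:
            licensed.append(url)
        else:
            skipped.append(url)
    return licensed, skipped


registry = LicenseRegistry([License("https://example.com/a", 0.50)])
licensed, skipped = select_licensed(
    ["https://example.com/a", "https://example.com/b"], registry
)
print(licensed, skipped)
```

Only the URLs in `licensed` would move on to training; the ones in `skipped` never enter the corpus, so nothing unlicensed is billed or used.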

Question 12

What does licensing cost vs lawsuit risk?

Accepted Answer

Our pricing is still trivial compared to litigation exposure:

• Open-web text: $0.05 – $0.50 / 1K tokens
• Professional & journalistic: $1 – $5 / 1K tokens
• Embargoed newsroom (breaking inference) & premium books (training): $10 – $50 / 1K tokens

At the top rate of $50 / 1K tokens, training on 10 M tokens of premium news therefore costs ~$500K (10,000 × $50), still a rounding error next to nine- and ten-figure damages claims (e.g. The New York Times' $1 B+ claim).
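The arithmetic behind that figure can be checked with a few lines. This is a minimal sketch using only the rates and the $1 B claim quoted above; the function names are illustrative and not part of any billing API.

```python
def licensing_cost_usd(tokens: int, rate_per_1k_usd: float) -> float:
    """Cost of licensing `tokens` tokens at a per-1K-token rate."""
    return tokens / 1000 * rate_per_1k_usd


def share_of_claim_percent(cost_usd: float, claim_usd: float) -> float:
    """Licensing cost expressed as a percentage of a damages claim."""
    return cost_usd * 100 / claim_usd


TOKENS = 10_000_000  # 10 M tokens, as in the example above

low = licensing_cost_usd(TOKENS, 10)   # bottom of the $10-$50 premium tier
high = licensing_cost_usd(TOKENS, 50)  # top of the premium tier

print(f"Premium tier cost: ${low:,.0f} to ${high:,.0f}")
print(f"Share of a $1B claim: {share_of_claim_percent(high, 1_000_000_000)}%")
```

The ~$500K figure corresponds to the top of the premium tier; at the $10 end of that tier the same 10 M tokens would cost $100K.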

Question 13

Do you provide compliance documentation?

Accepted Answer

Yes, we provide comprehensive compliance documentation:

• HMAC-verified usage logs with cryptographic proof
• Licensing agreements for each piece of content
• Payment receipts showing creator compensation
• Compliance certificates for regulatory audits
• Real-time dashboards for your legal team

Question 14

Is AI training really copyright infringement?

Accepted Answer

This is actively being litigated, but the trend is clear: courts are skeptical of AI companies' fair use claims for commercial training.

Key factors working against fair use:

• Commercial nature: AI companies make billions from training
• Market harm: AI output can replace original creators
• Substantial copying: Entire articles/books used for training
• No transformative purpose: Goal is to reproduce similar content

Question 15

What about robots.txt - isn't that enough?

Accepted Answer

robots.txt is not legally binding and many AI companies ignore it. It's a courtesy convention for web crawlers, not a copyright protection mechanism.

Our meta tag system provides:

• Legal enforceability based on copyright law
• Licensing terms clearly stated
• Payment mechanisms built-in
• Usage tracking for enforcement

Question 16

How do you enforce licensing terms?

Accepted Answer

We enforce licensing through multiple mechanisms:

• API blocking: Non-paying AI companies can't access content
• Legal documentation: Clear evidence for copyright claims
• Collective action: Supporting 35+ ongoing lawsuits
• Public transparency: Naming companies that refuse to pay
• DMCA takedowns: For unauthorized usage

Question 17

What about international copyright law?

Accepted Answer

Copyright protection is recognized internationally through treaties like the Berne Convention. Our system works globally because:

• Universal copyright: No registration required in most countries
• International enforcement: DMCA and equivalent laws worldwide
• EU AI Act compliance: Strict requirements for training data provenance
• Cross-border payments: We handle multi-currency transactions

Question 18

Which regulations do you comply with?

Accepted Answer

We are fully aligned with the EU AI Act (Art. 53 provenance), the U.S. NO FAKES Act draft, and Australia's proposed mandatory licensing scheme. Our policy team tracks 20+ global bills and updates the license generator automatically.

Question 19

Do you have a patent strategy?

Accepted Answer

Yes. We have three provisional patents filed covering (1) bar-second metering for generative audio, (2) cryptographic audit trails for AI training data, and (3) dynamic jurisdictional pricing. Final filings are scheduled for Q3 2025.

Question 20

How does HMAC verification work?

Accepted Answer

HMAC (Hash-based Message Authentication Code) provides cryptographic proof that content was licensed:

1. AI company requests license for specific content
2. We generate HMAC signature using SHA-256 with content URL + timestamp + token count
3. Usage is logged with tamper-proof signature
4. Creators get paid based on verified usage

This prevents fraud and ensures accurate billing for all parties.
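The signing step can be sketched with Python's standard `hmac` and `hashlib` modules. The message layout (fields joined with `|`), the function names, and the demo secret are assumptions made for illustration; the production format may differ.

```python
import hashlib
import hmac


def sign_usage(secret: bytes, content_url: str, timestamp: int, token_count: int) -> str:
    """Return a hex HMAC-SHA256 signature over URL + timestamp + token count."""
    # Assumed message layout: the three fields joined with "|".
    message = f"{content_url}|{timestamp}|{token_count}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_usage(secret: bytes, content_url: str, timestamp: int,
                 token_count: int, signature: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign_usage(secret, content_url, timestamp, token_count)
    return hmac.compare_digest(expected, signature)


secret = b"demo-secret-not-for-production"  # placeholder key for the example
sig = sign_usage(secret, "https://example.com/post", 1735689600, 1200)

print(verify_usage(secret, "https://example.com/post", 1735689600, 1200, sig))
print(verify_usage(secret, "https://example.com/post", 1735689600, 9999, sig))
```

Changing any signed field, such as the token count in the second call, makes verification fail. That is what makes a logged record tamper-evident for anyone who does not hold the secret key.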

Question 21

What's a token and how do you count them?

Accepted Answer

A token is roughly 3-4 characters or about 0.75 words. For example, "Hello world!" is about 3 tokens.

Important: Billing counts tokens after normalization (no HTML/CSS/JS, ads, nav, or boilerplate).

We count tokens using the same methods as major AI companies:

• Text content: Standard tokenization (similar to GPT models)
• Average webpage: ~450 tokens
• Blog post: 800-2,000 tokens
• Research paper: 5,000-15,000 tokens
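For a quick estimate before running a real tokenizer, the rules of thumb above can be applied directly. This is a rough heuristic sketch, not the tokenizer used for billing; the constants take 4 characters per token (the upper end of the 3-4 range) and 0.75 words per token.

```python
import math

CHARS_PER_TOKEN = 4     # upper end of the "3-4 characters" rule of thumb
WORDS_PER_TOKEN = 0.75  # "about 0.75 words" per token


def estimate_tokens_by_chars(text: str) -> int:
    """Estimate token count from character length."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_tokens_by_words(text: str) -> int:
    """Estimate token count from whitespace-separated word count."""
    return math.ceil(len(text.split()) / WORDS_PER_TOKEN)


sample = "Hello world!"
print(estimate_tokens_by_chars(sample), estimate_tokens_by_words(sample))
```

Both heuristics give 3 for "Hello world!", matching the example above. Actual counts depend on the tokenizer and will vary by language and content.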

Question 22

How do you detect when AI companies use content?

Accepted Answer

We use multiple detection methods:

• API integration: Ethical AI companies check licenses before using content
• Web crawling patterns: We monitor for training-specific access patterns
• Model output analysis: Looking for training data 'leakage' in AI responses
• Community reporting: Creators can report suspected unauthorized use

Question 23

What if my website is behind a paywall or login?

Accepted Answer

Our system works with protected content too:

• Meta tag in public areas: Add to login pages, headers, or preview content
• API integration: For sites with APIs that AI companies might access
• Retroactive protection: If your content appears in training data, we can help with licensing claims

The key is making your licensing terms discoverable by AI companies.
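To show what "discoverable" means in practice, here is a sketch of how a crawler could read licensing terms from a public page such as a login screen. The tag name `example-ai-license` and its content value are placeholders invented for this example, not the actual copyright.sh tag format; use the tag from your account for real pages.

```python
from html.parser import HTMLParser
from typing import Optional

# Placeholder tag name for illustration only, NOT the real tag format.
HYPOTHETICAL_TAG_NAME = "example-ai-license"


class LicenseMetaFinder(HTMLParser):
    """Collects the content of the first matching <meta> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.terms: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.terms is not None:
            return
        attributes = dict(attrs)
        if attributes.get("name") == HYPOTHETICAL_TAG_NAME:
            self.terms = attributes.get("content")


def find_license_terms(html: str) -> Optional[str]:
    """Return the licensing terms declared on a page, or None if absent."""
    finder = LicenseMetaFinder()
    finder.feed(html)
    finder.close()
    return finder.terms


login_page = (
    '<html><head><meta name="example-ai-license" content="terms-go-here">'
    "</head><body>Please log in.</body></html>"
)
print(find_license_terms(login_page))
```

Because the tag sits on the public login page, the terms are readable even though the article body stays behind the paywall.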

Question 24

Do you offer search or retrieval-augmented generation (RAG)?

Accepted Answer

Phase 1 focuses on web-first, meta-tag licensing with verified API metering. Full search/RAG and in-context licensing will arrive in Phase 2 (Q4 2025) once the creator corpus exceeds 1 Bn licensed tokens.

Question 25

Do we get charged for HTML, code, or other token-stuffed noise?

Accepted Answer

No. Billing applies to the normalized content layer (human-readable text and structured fields), not raw markup or scaffolding. When content is sourced via partners like Tavily, Brave, or Firecrawl, it's already cleaned: boilerplate, navigation, ads, scripts, and duplicate blocks are removed. We then meter only the resulting content tokens and settle in batches. If your pipeline ingests raw pages, enable our Content Normalization filter to strip non-content prior to tokenization. Outcome: you pay for signal, not noise.
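To illustrate the idea of metering only the content layer, the sketch below strips scripts, styles, and common page scaffolding before any tokens are counted. It is a simplified stand-in built on Python's standard `html.parser`, not the Content Normalization filter itself; the set of skipped tags is an assumption, and it does not attempt ad or duplicate-block detection.

```python
from html.parser import HTMLParser

# Assumed set of non-content containers; a real filter would be more thorough.
SKIPPED_TAGS = {"script", "style", "nav", "header", "footer", "aside"}


class ContentExtractor(HTMLParser):
    """Keeps human-readable text and drops anything inside skipped tags."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only keep non-empty text that is outside every skipped container.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self) -> str:
        return " ".join(self._chunks)


def normalize(html: str) -> str:
    """Return the readable text of a page with markup and scaffolding removed."""
    extractor = ContentExtractor()
    extractor.feed(html)
    extractor.close()
    return extractor.text()


raw = (
    "<html><head><style>p{color:red}</style><script>var x=1;</script></head>"
    "<body><nav><a href='/'>Home</a></nav><p>Licensed article text.</p>"
    "<footer>Copyright</footer></body></html>"
)
print(normalize(raw))
```

Only the paragraph text survives, so a token count taken after this step reflects the article rather than its markup.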

Question 26

How much do AI companies pay for content licensing?

Accepted Answer

The AI licensing market has grown to $816 million annually, with $2.92 billion committed in major publisher deals through November 2025. Average deals are valued at $24 million, though top-tier publishers like News Corp command $250M+ agreements. Anthropic's $1.5 billion settlement established $3,000 per copyrighted work as a legal precedent. Industry pricing ranges from $0.02-$0.10 per 1,000 tokens for text, $0.05-$1 per image, and $1-$4 per video minute.

Question 27

What percentage do creators keep with copyright.sh?

Accepted Answer

Creators receive 100% of licensing fees paid by AI companies. Unlike traditional licensing platforms like ASCAP and BMI, which retain 10-15% of royalties (creators get only 85-90%), copyright.sh charges AI companies an additional 10-15% platform fee for compliance infrastructure and tracking services. This means your $1 license generates $1 in your pocket, while the AI company pays $1.10-$1.15 total. Daily automated payouts via Stripe with full transparency.
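The split works out as follows. This is a minimal sketch of the arithmetic described above, using `Decimal` to avoid floating-point rounding on currency; the function and key names are illustrative only.

```python
from decimal import Decimal


def split_payment(creator_fee_usd: Decimal, platform_fee_rate: Decimal) -> dict:
    """Break down what the creator receives and what the AI company pays."""
    platform_fee = creator_fee_usd * platform_fee_rate
    return {
        "creator_payout": creator_fee_usd,  # creator keeps 100% of the fee
        "platform_fee": platform_fee,       # charged on top, to the AI company
        "ai_company_total": creator_fee_usd + platform_fee,
    }


for rate in (Decimal("0.10"), Decimal("0.15")):
    result = split_payment(Decimal("1.00"), rate)
    print(f"{rate:.0%} fee: creator gets ${result['creator_payout']:.2f}, "
          f"AI company pays ${result['ai_company_total']:.2f}")
```

The platform fee is added to the AI company's bill rather than deducted from the creator's license fee, which is why the payout stays at the full $1.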

Question 28

Are robots.txt blocks effective against AI scrapers?

Accepted Answer

No—robots.txt is increasingly ineffective. Evidence shows 12.9% of bots now ignore robots.txt (up from 3.3% previously), with 26 million robots.txt bypasses recorded in March 2025 alone. Perplexity was caught using undisclosed IP addresses to scrape sites that blocked their official crawler. robots.txt is a voluntary convention for web crawlers, not a copyright protection mechanism with legal enforceability. Effective alternatives include cryptographic verification (copyright.sh HMAC signatures) that creates legal evidence and CDN-level blocking that physically prevents access.
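The voluntary nature of robots.txt is easy to see in code: the check runs inside the crawler, so honoring it is the crawler's choice. The sketch below uses Python's standard `urllib.robotparser`; the bot name `ExampleAIBot` and the rules are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Example rules: block one named AI crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Ask the parsed robots.txt rules whether this agent may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


url = "https://example.com/article"
print(is_allowed(ROBOTS_TXT, "ExampleAIBot", url))  # blocked by the rules
print(is_allowed(ROBOTS_TXT, "OtherBot", url))      # allowed by the rules
```

Nothing forces a crawler to call a check like `is_allowed` before fetching, and a crawler that announces a different user agent falls under the permissive `*` rule. The server still delivers the page either way.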

Question 29

Why is Google traffic dropping for publishers?

Accepted Answer

Google Search traffic has plummeted from 54% to 24% of total web traffic over just two years as AI-powered answer engines replace traditional search. 93% of AI searches now end without users clicking through to source websites. This represents an existential crisis for the advertising-based web model. Publishers are watching ad revenue evaporate while AI companies profit from their content without compensation. 60% of major news sites have implemented robots.txt blocks in response, but licensing infrastructure is critical for publisher survival.

Question 30

Which AI licensing platform is easiest to use?

Accepted Answer

copyright.sh offers the lowest barrier to entry:

• Installation: 60 seconds with our WordPress plugin or a simple meta tag copy-paste
• Configuration: Set one price and you're done
• Coverage: Extends to 43% of the web (all WordPress sites)
• No requirements for CDN integration, XML files, or enterprise sales cycles

For individual creators and small publishers, copyright.sh makes AI licensing accessible to everyone.

Question 31

What's the legal risk of using unlicensed content for AI training?

Accepted Answer

The legal exposure is staggering and growing:

• Anthropic paid $1.5 billion in settlement ($3,000 per work for ~500,000 books)
• The New York Times lawsuit against OpenAI seeks $1 billion+
• Typical statutory damages range from $10-50 million for major copyright cases
• 35+ active lawsuits are seeking billions in aggregate damages

Courts are increasingly skeptical of fair use claims for commercial AI training due to:

• Commercial nature: AI companies make billions from training
• Market harm: AI outputs replace original creators
• Substantial copying: Entire articles/books used
• Lack of transformative purpose: Goal is reproducing similar content

Licensing cost comparison: Training 10M premium tokens costs ~$500K via licensing vs $1B+ potential litigation exposure, a 99.95% cost reduction.

Regulatory pressure is intensifying with EU AI Act Article 53 requiring training data provenance, the U.S. NO FAKES Act draft mandating compensation, and Australia proposing mandatory licensing schemes.

Frequently Asked Questions

Still Have Questions?

After Microsoft’s $357B Loss: Why Content Is the Next Bottleneck
