Building an open index of 69K+ Claude Code skills — insights and lessons

A month ago, a solo developer set out to solve a growing problem in the Claude Code ecosystem: discoverability. Today, their open catalog hosts 69,369 indexed SKILL.md files, each representing a specialized skill for Anthropic’s Claude Code agent. Unlike traditional extensions or plugins, these skills are written in plain Markdown with YAML frontmatter, allowing developers to define behavior in natural language rather than code. The project now functions as a free, neutral index that anyone can use — and its creator is sharing the engineering journey behind it.

Why the Claude Code skill ecosystem needed an open catalog

The core challenge wasn’t technical; it was structural. Before this catalog, finding Claude Code skills relied on word of mouth or scattered repositories. Official directories didn’t exist, and curated lists covered only a few hundred entries. Meanwhile, GitHub’s code search revealed thousands of public repos containing SKILL.md files: a long tail of unindexed, often niche tools. Many skills were high-quality tools with detailed documentation, pricing models, and anti-trigger guidance. Others were minimal stubs labeled "TODO: write a skill that does X." Both kinds were technically indexable, but nothing short of reading each file told them apart. Worse, the format itself was evolving rapidly, with new frontmatter fields like allowed-tools and user-invokable being added monthly. A skill valid yesterday could break tomorrow if it lacked a newly required field.

Most critically, there was no reliable way to build tools on top of the ecosystem. Want to evaluate skills systematically? Build a recommender? Create an installer? You’d have to scrape GitHub yourself, an inefficient and error-prone process. The creator aimed to fix all four issues (poor discoverability, uneven quality, a fast-changing format, and no shared base for tooling) with a single solution: an open, daily-updated catalog with a free API and dataset, built on transparency and anti-rent-seeking principles.

How the miner works: 24 sources, 4 hours, zero paywalls

The catalog runs on a single Python script executed nightly at 1 AM on a Mac mini in the creator’s office. It doesn’t just check GitHub; it pulls from 24 public sources to surface SKILL.md files that others miss. These include GitHub code search (101 query variants), GitHub Topics (31 variants), GitLab, Codeberg, and Hugging Face repositories. It also scans Reddit, Hacker News, Dev.to, YouTube descriptions, and Telegram messages for mentions of repo URLs, treating each post as a plain text blob. The miner uses the Wayback Machine’s CDX API to recover skills from repos that were renamed or deleted, so those skills stay in the index. It also mines the stargazer graphs of known skill repos, because authors who publish one skill often have others hidden in their profile.

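```python
# Example query variants used in mining.
# Caveat: topic:, stars:, and in:readme are GitHub repository-search qualifiers,
# and model: and description: are not documented GitHub search qualifiers,
# so read these strings as illustrative rather than as ready-to-run code-search queries.
queries = [
    'filename:SKILL.md language:python',
    'filename:SKILL.md topic:claude-code-skills stars:>10',
    'filename:SKILL.md model:sonnet-3.5 description:*api*',
    '"claude skill" in:readme language:javascript',
]
```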
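For the social sources, the article only says that posts are scanned as text for repo URLs. A minimal sketch of that step might look like the following; the regular expression, the choice of three hosts, and the function name `extract_repo_urls` are assumptions for illustration, not the creator's code.

```python
import re

# Hypothetical pattern: matches owner/repo URLs on three of the code hosts named above.
REPO_URL = re.compile(
    r"https?://(?:www\.)?(github\.com|gitlab\.com|codeberg\.org)/"
    r"([A-Za-z0-9_.-]+)/([A-Za-z0-9_.-]+)",
    re.IGNORECASE,
)


def extract_repo_urls(text: str) -> list[str]:
    """Return normalized repo URLs found in free text, de-duplicated in order of appearance."""
    seen: set[str] = set()
    found: list[str] = []
    for host, owner, repo in REPO_URL.findall(text):
        repo = repo.rstrip(".")  # drop a sentence-ending period captured by the pattern
        if repo.endswith(".git"):
            repo = repo[: -len(".git")]  # strip a clone-URL suffix
        if not repo:
            continue
        url = f"https://{host.lower()}/{owner}/{repo}"
        key = url.lower()  # compare case-insensitively so Acme/skills and acme/skills collapse
        if key not in seen:
            seen.add(key)
            found.append(url)
    return found


post = (
    "Check https://github.com/Acme/skills.git and http://www.github.com/acme/skills. "
    "Also https://codeberg.org/bob/my-skill/src/branch/main/SKILL.md"
)
candidates = extract_repo_urls(post)
```

Normalizing before de-duplicating matters here because the same repo tends to be linked in several forms (clone URL, deep link to a file, different capitalization) across posts.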

Each source returns candidate URLs, which the miner fetches, validates for YAML correctness, and scores using a content-only admission model. Valid skills are categorized into 10 domains (Engineering, Security, Growth, etc.) and tagged across ~100 orthogonal dimensions like language, AI provider, cloud platform, and integration type. The output is a static HTML page per skill, published the same day it’s discovered. The entire process takes about four hours, with safeguards in place to prevent API rate limit exhaustion.
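The validation step can be pictured roughly as follows. This is a simplified sketch using only the standard library: it handles flat `key: value` frontmatter and treats name and description as required, which is an assumption based on the article calling them "the basics". A real miner would presumably use a full YAML parser; the function names are hypothetical.

```python
REQUIRED_KEYS = ("name", "description")  # assumed minimum set of required fields


def split_frontmatter(text: str):
    """Split a SKILL.md document into (metadata dict, body), or return None if malformed.

    Simplified: only flat `key: value` lines are read; nested or list
    values are skipped rather than parsed as real YAML.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None  # no opening frontmatter delimiter
    end = None
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            end = i
            break
    if end is None:
        return None  # frontmatter never closed
    meta = {}
    for line in lines[1:end]:
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # blank line or YAML comment
        if line[0] in " \t-":
            continue  # nested value or list item: ignored in this sketch
        key, sep, value = line.partition(":")
        if not sep:
            return None  # a top-level line without a colon is not a valid mapping entry
        meta[key.strip()] = value.strip().strip("'\"")
    body = "\n".join(lines[end + 1:])
    return meta, body


def is_valid_skill(text: str) -> bool:
    """True if the frontmatter parses and every required key has a non-empty value."""
    parsed = split_frontmatter(text)
    if parsed is None:
        return False
    meta, _body = parsed
    return all(meta.get(key) for key in REQUIRED_KEYS)


good = "---\nname: pdf-tools\ndescription: Extract text from PDFs\nallowed-tools: Read, Bash\n---\n# PDF tools\nBody"
stub = "---\nname: stub\n---\nTODO"
```

With these inputs, `is_valid_skill(good)` passes and `is_valid_skill(stub)` fails because the description is missing.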

Admission rules: content over popularity, always

The creator’s most opinionated design choice is clear: ranking cannot be bought. No pay-to-list fees. No popularity-based ranking. The catalog admits skills based on signals derived solely from the SKILL.md file itself, which keeps admission independent of who the author is and makes it harder to game. For example:

Anti-trigger discipline: A skill with a "when NOT to use" or "out of scope" section earns +4 per pattern, capped at +16. This rewards authors who think critically about edge cases.

Transparency: Skills that document pricing, rate limits, or expected API costs get +10. Hidden fees or quotas are flagged early.

Frontmatter depth: Beyond the basics like name: and description:, each additional valid field (model:, tags:, version:, etc.) increases the score, capped at 10 distinct keys to prevent abuse.

Structure and substance: A skill with a detailed body (>800 characters), multiple code examples, and clear headings scores higher. Conversely, placeholder text like "// TODO" or generic templated phrases incurs a 5-point penalty.

Crucially, the score never considers GitHub stars, forks, follower counts, or install metrics. A skill authored by a developer with zero followers but a robust anti-trigger section outranks a flashy project by a 50,000-follower influencer if the latter lacks depth. This ensures fairness and prioritizes quality over hype.
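To make the rules above concrete, here is a rough sketch of a content-only scorer. The +4 per anti-trigger pattern (capped at +16), the +10 for transparency, the cap of 10 keys, and the 5-point placeholder penalty come from the article. Everything else is an assumption for illustration: the pattern lists beyond the two quoted phrases, the point value per extra frontmatter key, the structure bonuses, and the function name `score_skill`.

```python
import re

# The first two phrasings are quoted in the article; the rest are hypothetical additions.
ANTI_TRIGGER_PATTERNS = [
    r"when not to use",
    r"out of scope",
    r"do not use",
    r"not intended for",
    r"anti-trigger",
]
TRANSPARENCY = re.compile(r"pricing|rate limits?|api costs?", re.IGNORECASE)
PLACEHOLDER = re.compile(r"//\s*TODO|TODO:", re.IGNORECASE)
FENCE = "`" * 3  # Markdown code-fence marker

# Assumed weights: the article does not state these values.
POINTS_PER_EXTRA_KEY = 1
LONG_BODY_BONUS = 5
CODE_EXAMPLES_BONUS = 3
HEADINGS_BONUS = 2


def score_skill(meta: dict, body: str) -> int:
    """Score a skill from its frontmatter and body only; no popularity inputs exist."""
    score = 0

    # Anti-trigger discipline: +4 per distinct pattern present, capped at +16.
    hits = sum(1 for p in ANTI_TRIGGER_PATTERNS if re.search(p, body, re.IGNORECASE))
    score += min(4 * hits, 16)

    # Transparency: +10 if pricing, rate limits, or API costs are documented.
    if TRANSPARENCY.search(body):
        score += 10

    # Frontmatter depth: keys beyond name/description, counting at most 10.
    extra_keys = [k for k in meta if k not in ("name", "description")]
    score += min(len(extra_keys), 10) * POINTS_PER_EXTRA_KEY

    # Structure and substance (assumed bonuses).
    if len(body) > 800:
        score += LONG_BODY_BONUS
    if body.count(FENCE) // 2 >= 2:
        score += CODE_EXAMPLES_BONUS
    if len(re.findall(r"^#{1,6}\s", body, re.MULTILINE)) >= 2:
        score += HEADINGS_BONUS

    # Placeholder text: 5-point penalty.
    if PLACEHOLDER.search(body):
        score -= 5
    return score


meta = {"name": "x", "description": "y", "model": "sonnet", "version": "1.0"}
body = "## When NOT to use\nOut of scope: billing.\nPricing: free tier, rate limits apply.\n"
detailed = score_skill(meta, body)
stub = score_skill({"name": "x", "description": "y"}, "TODO: write a skill that does X")
```

Because the function signature accepts only the parsed frontmatter and the body text, stars, forks, and follower counts have no path into the score.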

What’s next for the catalog and the ecosystem

The creator emphasizes that this is just the beginning. While the catalog solves discovery and quality variability, a separate evaluation layer is planned for the desktop app’s Pro tier — one that will use the same content-first formula. The goal remains unchanged: build a neutral, transparent foundation for the Claude Code skill ecosystem. No monetization of listings. No paywalls. No ranking algorithms skewed by influence.

As the format matures and more developers contribute skills, the open index could become the de facto standard for skill discovery. But its real value lies in empowering builders — whether they’re evaluating tools, building recommenders, or simply exploring what’s possible with Claude Code. The long tail is no longer invisible.

For now, the catalog continues its nightly crawls, adding new skills daily and refining its admission logic. The next step? Expanding the miner’s reach and deepening the content signals to capture even more nuance in skill quality.

AI summary

Discover how we bulk-indexed more than 69,000 `SKILL.md` files for Claude Code, along with the admission criteria and future opportunities. All the details on open data and the content-focused scoring system.
