Automate Secret Leak Prevention with Git Pre-Commit Hooks

In the rush to ship features, developers sometimes embed API keys or credentials directly into code to test functionality. While the intent may be temporary, the consequences linger indefinitely once those secrets enter version control. A 2023 report from GitGuardian identified more than 10 million exposed secrets in public GitHub repositories, and the volume continues to rise. The damage isn't just embarrassment: compromised keys can trigger massive cloud bills on AWS or grant attackers access to sensitive payment data on Stripe. Cleaning secrets from history requires rewriting commits, force-pushing, and rotating every exposed credential in each service that depends on it. Prevention, therefore, must happen before the commit is ever created.

How pre-commit hooks stop secrets before they reach history

Git’s pre-commit hook runs automatically before each commit. If the hook exits with a non-zero status, the commit is aborted, so a detected secret never enters the repository's history. The approach is to scan staged files for patterns that resemble API keys, tokens, or credentials. When a match is found, the hook reports it and blocks the commit; the developer then removes the secret, or marks the line as a known false positive, and commits again.
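
The blocking behavior can be seen in isolation with a minimal sketch. It assumes git is installed and works in a throwaway repository; the always-failing hook, the file name, and the demo identity are illustrative stand-ins, not part of the real scanner:

```sh
#!/bin/sh
# Sketch: a pre-commit hook that exits non-zero aborts the commit.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
mkdir -p .git/hooks

# Stand-in hook that always rejects; a real hook would scan staged files
printf '#!/bin/sh\necho "blocked by hook"\nexit 1\n' > .git/hooks/pre-commit
chmod +x .git/hooks/pre-commit

echo "hello" > notes.txt
git add notes.txt

# Identity is passed inline so the demo does not depend on global config
if git -c user.name=demo -c user.email=demo@example.com \
     commit -q -m "test" >/dev/null 2>&1; then
  result="committed"
else
  result="blocked"
fi
echo "commit result: $result"

# The repository still has no commit, so HEAD cannot be resolved
if git rev-parse -q --verify HEAD >/dev/null 2>&1; then
  head_exists="yes"
else
  head_exists="no"
fi
echo "HEAD exists: $head_exists"
```

Because the hook runs before the commit object is written, there is nothing to clean up afterwards: the staged change simply stays in the index.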

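To implement this, teams add a script at .git/hooks/pre-commit and make it executable (Git ignores hooks that lack the executable bit). The hook lists staged files, skips binaries, and uses git show ":$file" to read the staged version rather than the working copy, so the scan covers exactly what will be committed even when a file is only partially staged. The check_patterns function it calls is defined in the next section and must appear above the loop in the actual hook file: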

#!/bin/sh
# Pre-commit hook: block secrets from entering git history
set -e

# Added, copied, or modified files in the index (deletions are skipped)
STAGED_FILES=$(git diff --cached --name-only --diff-filter=ACM)
if [ -z "$STAGED_FILES" ]; then
  exit 0
fi

# Split the list on newlines only and disable globbing,
# so paths containing spaces or wildcards are handled intact
IFS='
'
set -f

FOUND=0
for file in $STAGED_FILES; do
  # Skip binary files: --numstat prints "-" instead of line counts for them
  if git diff --cached --numstat -- "$file" | grep -q '^-'; then
    continue
  fi

  # Read only staged content
  CONTENT=$(git show ":$file" 2>/dev/null) || continue

  # Check for known secret patterns (check_patterns reads $CONTENT)
  if check_patterns "$file"; then
    FOUND=1
  fi
done

if [ "$FOUND" -eq 1 ]; then
  echo "COMMIT BLOCKED: potential secrets detected."
  echo "Use a suppression comment to bypass false positives."
  exit 1
fi
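
To see why the hook reads from the index, consider a minimal sketch in a throwaway repository. It assumes git is installed; the file name and the key value are made up for illustration:

```sh
#!/bin/sh
# Sketch: git show ":path" returns the staged blob, not the working copy.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .

# Stage a version of the file that contains a fake, made-up key
echo "API_KEY=FAKE_STAGED_VALUE" > config.txt
git add config.txt

# Edit the working copy afterwards without staging the change
echo "API_KEY=" > config.txt

staged=$(git show ":config.txt")
working=$(cat config.txt)

echo "staged:  $staged"
echo "working: $working"
```

A scanner that read config.txt from disk would see the cleaned-up working copy and miss the value that is actually about to be committed.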

Matching patterns that reveal real-world secrets

The most effective hooks include a curated set of regular expressions derived from actual secret formats. These patterns target common providers like AWS, Stripe, and GitHub, and add a generic check for long quoted alphanumeric strings, a rough proxy for the high-entropy values that often represent API keys or tokens:

check_patterns() {
  file="$1"
  matched=0

  # AWS Access Key ID
  if echo "$CONTENT" | grep -nE 'AKIA[0-9A-Z]{16}' | filter_suppressed; then
    echo " [AWS] $file: AWS Access Key ID"
    matched=1
  fi

  # Stripe secret keys
  if echo "$CONTENT" | grep -nE 'sk_(live|test)_[0-9a-zA-Z]{24,}' | filter_suppressed; then
    echo " [STRIPE] $file: Stripe secret key"
    matched=1
  fi

  # Stripe restricted keys
  if echo "$CONTENT" | grep -nE 'rk_(live|test)_[0-9a-zA-Z]{24,}' | filter_suppressed; then
    echo " [STRIPE] $file: Stripe restricted key"
    matched=1
  fi

  # GitHub personal access tokens
  if echo "$CONTENT" | grep -nE 'ghp_[0-9a-zA-Z]{36}' | filter_suppressed; then
    echo " [GITHUB] $file: GitHub PAT"
    matched=1
  fi

  # Generic high-entropy strings
  if echo "$CONTENT" | grep -nE "['\"][0-9a-zA-Z]{32,}['\"]" | filter_suppressed; then
    echo " [ENTROPY] $file: high-entropy string (>=32 chars)"
    matched=1
  fi

  # Exit status 0 (success) tells the caller that at least one pattern matched
  [ "$matched" -eq 1 ]
}

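The generic check for quoted alphanumeric strings of 32 or more characters is especially valuable because it catches tokens and keys that don’t match known vendor prefixes. Note that it tests length and character set rather than measuring entropy, so it also flags legitimate values such as hex-encoded hashes or UUIDs written without hyphens, which is where suppression becomes essential.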
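
For teams that want an actual entropy measurement on top of the length check, the following is a minimal sketch and not part of the hook above. The shannon_entropy helper, the 3.5 bits-per-character cutoff, and the sample string are all assumptions made for illustration:

```sh
#!/bin/sh
# Hypothetical helper: print the Shannon entropy of a string in bits per character.
shannon_entropy() {
  printf '%s\n' "$1" | awk '
    {
      # Count how often each character occurs
      total += length($0)
      for (i = 1; i <= length($0); i++) count[substr($0, i, 1)]++
    }
    END {
      h = 0
      for (c in count) {
        p = count[c] / total
        h -= p * log(p) / log(2)
      }
      printf "%.2f\n", h
    }'
}

# Assumed cutoff; tune it against your own codebase
THRESHOLD=3.5

# Made-up 32-character sample, not a real credential
candidate="q7Zp2LmX9vRt4KsW8yBn3HcJ6dFg1TaU"
score=$(shannon_entropy "$candidate")

flagged=no
if awk -v s="$score" -v t="$THRESHOLD" 'BEGIN { exit !(s >= t) }'; then
  flagged=yes
fi
echo "entropy=$score flagged=$flagged"
```

A string made of one repeated character scores zero, while random-looking tokens score several bits per character, so a cutoff like this can reduce noise from long but repetitive values. It will not separate hashes from secrets, since both look random.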

Suppressing false positives with intentional comments

No pattern scanner is perfect. Hashes, encoded public keys, or long IDs may trigger false alarms. Rather than disabling the hook entirely, teams use a suppression pragma: pii-ok. If a line contains this marker, the scanner skips it; in the hook, this is the job of the filter_suppressed step that each pattern check pipes its matches through. This keeps the hook effective while minimizing interruptions:

// Test fixture containing a SHA-256 hash - no sensitive data
const EXPECTED_HASH = 'a1b2c3d4e5f6...'; // pii-ok

// This WILL be blocked (no suppression comment)
const STRIPE_KEY = 'sk_live_abc123...';

The rule is straightforward: if a value is confirmed non-sensitive, add pii-ok. If unsure, leave it unsuppressed and let the hook flag it. The minor friction of a false positive pales in comparison to the cost of a leaked secret.
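
The pattern checks call filter_suppressed without showing its body. One possible definition is sketched below; it is an assumption about how such a helper could work, and the sample lines are made up:

```sh
#!/bin/sh
# Hypothetical filter_suppressed: reads grep -n output on stdin, drops lines
# that carry the pii-ok marker, and prints whatever remains. grep -v exits 0
# only if at least one line survives, so the calling "if" fires only for
# unsuppressed matches.
filter_suppressed() {
  grep -v 'pii-ok'
}

# Suppressed line: nothing survives, so the check does not fire
if printf '%s\n' "3:const EXPECTED_HASH = 'a1b2c3'; // pii-ok" | filter_suppressed; then
  first="flagged"
else
  first="suppressed"
fi

# Unsuppressed line: it is printed and the check fires
if printf '%s\n' "7:const KEY = 'sk_live_FAKE';" | filter_suppressed; then
  second="flagged"
else
  second="suppressed"
fi

echo "first=$first second=$second"
```

Matching the bare string pii-ok anywhere on the line keeps the helper independent of each language's comment syntax, at the cost of also honoring the marker inside string literals.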

Extending protection to .env and .htaccess files

Secrets aren’t limited to source code. Environment files and web server configurations frequently contain credentials. Teams should extend pre-commit checks to block .env files entirely and flag .htaccess entries that embed real values:

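# Both checks go inside the for loop; the .htaccess check needs $CONTENT,
# so place it after the git show line

# Block .env files
if echo "$file" | grep -qE '\.env$'; then
  echo " [ENV] $file: .env files must be .gitignored"
  FOUND=1
  continue
fi

# Flag any SetEnv directive in .htaccess that assigns a value
# (POSIX character classes keep the pattern portable across grep versions)
if echo "$file" | grep -qE '\.htaccess$'; then
  if echo "$CONTENT" | grep -nE 'SetEnv[[:space:]]+[^[:space:]]+[[:space:]]+[^[:space:]]+' | filter_suppressed; then
    echo " [HTACCESS] $file: SetEnv with real values"
    FOUND=1
  fi
fi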

The convention is simple: commit sanitized templates like .env.example with placeholder values. The real .env file remains in .gitignore. The same principle applies to .htaccess—keep production credentials out of version control.
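
One way to produce such a template is to derive it from the local file. The sketch below blanks every value instead of writing placeholders, which is a simplification; the make_env_example helper, the temporary directory, and the sample variables are hypothetical:

```sh
#!/bin/sh
# Sketch: derive a committable .env.example from a local .env by blanking values.
make_env_example() {
  # Keep comments and blank lines; reduce KEY=value lines to "KEY="
  sed 's/^\([A-Za-z_][A-Za-z0-9_]*\)=.*/\1=/' "$1"
}

workdir=$(mktemp -d)

# Made-up local settings, not real credentials
printf '%s\n' \
  '# local settings' \
  'DB_PASSWORD=not-a-real-password' \
  'API_TOKEN=FAKE_TOKEN_VALUE' > "$workdir/.env"

make_env_example "$workdir/.env" > "$workdir/.env.example"
cat "$workdir/.env.example"

# Keep the real file out of version control
echo ".env" >> "$workdir/.gitignore"
```

The template still documents which variables the application expects, and it passes the hook because its name does not end in .env.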

Beyond regex: the role of AI in secret detection

Regular expressions excel at catching known patterns, but they struggle with obfuscated or novel secrets. Emerging AI-powered tools analyze context, entropy, and distribution to flag unusual strings that regex might miss. These systems can detect database connection strings, hardcoded JWTs, or custom token formats without relying solely on predefined patterns. Integrating such tools with pre-commit hooks provides a layered defense—combining immediate prevention with adaptive detection.

By implementing a pre-commit hook today, teams can shift from reactive damage control to proactive protection, ensuring secrets never make it into version control in the first place.
