Cloud security tools often prioritize tracking changes over static risks, leaving dangerous blind spots unattended. A recent redesign of a drift detection system revealed how easily overlooked conditions—like long-unrotated secrets—can evade traditional monitoring. The lesson: not all risks are tied to change, and some require evaluating absolute state rather than transitions.
The flaw in change-only detection
Traditional drift detectors excel at identifying dynamic risks by comparing snapshots of cloud environments. A security group opened to 0.0.0.0/0 or an RDS instance flipped to public visibility triggers immediate alerts because these changes represent clear security threats. However, this approach fails when risks stem from inaction rather than activity.
Consider a secret in AWS Secrets Manager that hasn’t rotated in 200 days. Scanning it daily and comparing snapshots yields no differences—because the secret hasn’t changed. Yet its static state—unchanged for over 180 days—represents a critical security risk. The tool’s core logic, designed to flag transitions, is blind to this standing condition. As one engineer noted, "The dangerous state of this secret is precisely the state in which nothing is happening to it."
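The blind spot is easy to reproduce. The sketch below is illustrative only: `compute_diff` is a hypothetical stand-in for the tool's snapshot comparison, and the dates are made up so that the secret is exactly 200 days old.

```python
from datetime import datetime, timezone


def compute_diff(prev: dict, curr: dict) -> dict:
    """Return {field: (old, new)} for every field whose value differs."""
    keys = prev.keys() | curr.keys()
    return {k: (prev.get(k), curr.get(k)) for k in keys if prev.get(k) != curr.get(k)}


# Hypothetical metadata for a secret that was last rotated on 1 January.
last_rotated = datetime(2024, 1, 1, tzinfo=timezone.utc)
yesterday = {"Name": "db-password", "RotationEnabled": True, "LastRotatedDate": last_rotated}
today = dict(yesterday)  # nothing happened to the secret overnight

# The diff-based view: no fields changed, so there is nothing to grade.
changes = compute_diff(yesterday, today)

# The state-based view: the age of the last rotation, measured against "now".
now = datetime(2024, 7, 19, tzinfo=timezone.utc)
age_days = (now - last_rotated).days

print(changes)
print(age_days)
```

The diff comes back empty on every scan, while the age keeps growing. Only the second calculation can ever surface the problem.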
Rebuilding detection for static and dynamic risks
The solution required separating detection mechanisms into two distinct modules: one for change-based risks and another for standing conditions. The change detector retains its original function, comparing old and new snapshots to grade transitions:
    # rules.py: grades a field-level diff (old → new)
    if asset.raw_data_prev:
        changes = _compute_raw_diff(asset.raw_data_prev, asset.raw_data)
        if changes:
            findings.extend(assess(asset.asset_type, changes)['findings'])
            has_change = True

Meanwhile, a new rotation module evaluates current state without relying on prior snapshots:
    # rotation.py: grades a standing condition (now, no diff)
    if (asset.raw_data or {}).get('_resource_type') == 'aws_secretsmanager_secret':
        findings.extend(assess_rotation(asset.raw_data, now, max_age)['findings'])

Both modules output identical finding structures ({'field', 'severity', 'reason'}), so their results merge into a single severity-ranked list. The user interface does not need to know which detection method produced a finding and presents them all uniformly.
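Because the two modules share one finding shape, merging them can be a plain sort. The following is a minimal sketch, not the tool's actual code: `rank_findings`, the `LOW` level, and the sample findings are assumptions made for illustration.

```python
# Severity names from most to least urgent. LOW is assumed here; the ladder
# described in this article only uses MEDIUM, HIGH, and CRITICAL.
SEVERITY_ORDER = ("CRITICAL", "HIGH", "MEDIUM", "LOW")
_RANK = {name: index for index, name in enumerate(SEVERITY_ORDER)}


def rank_findings(*finding_lists):
    """Merge any number of finding lists and sort them most severe first."""
    merged = [finding for findings in finding_lists for finding in findings]
    # Unknown severities sort last; sorted() is stable, so ties keep input order.
    return sorted(merged, key=lambda f: _RANK.get(f["severity"], len(_RANK)))


change_findings = [
    {"field": "ingress", "severity": "HIGH", "reason": "opened to 0.0.0.0/0"},
]
rotation_findings = [
    {"field": "rotation", "severity": "CRITICAL", "reason": "over twice the rotation limit"},
]

ranked = rank_findings(change_findings, rotation_findings)
print([f["severity"] for f in ranked])
```

Nothing in the sort inspects where a finding came from, which is what lets the interface treat both kinds the same way.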
Prioritizing rotation risks with a severity ladder
Rotation posture isn’t binary—it’s a multi-tiered assessment where each rung reflects a different level of urgency. The grading logic breaks down as follows:
- Rotation disabled → HIGH severity: Automatic rotation is turned off entirely. This isn’t just overdue; it’s structurally incapable of ever rotating.
- Enabled but never rotated → MEDIUM severity: The rotation feature is active, but the secret has never rotated, suggesting a misconfigured Lambda or abandoned configuration.
- Overdue past the limit → HIGH severity: The secret has exceeded its maximum rotation age but is not yet past twice that limit.
- Over twice the limit → CRITICAL severity: A secret that has gone more than double its maximum age without rotating (for example, past 180 days against a 90-day limit) indicates a dead process with no oversight.
This hierarchy is implemented as a pure function, free from external dependencies like AWS API calls. The only input required is the secret’s metadata—rotation status, last rotation date, and configured rotation rules—ensuring testability and reliability.
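The ladder above could be written roughly as follows. This is a sketch under stated assumptions rather than the project's real `assess_rotation`: the `max_age_days` parameter name, the `"rotation"` field label, the reason strings, and the order in which the rungs are checked (disabled first, then never rotated, then age) are all guesses. It also assumes `LastRotatedDate` and `now` are both timezone-aware datetimes.

```python
from datetime import datetime, timezone


def assess_rotation(raw_data: dict, now: datetime, max_age_days: int) -> dict:
    """Grade a secret's rotation posture from its metadata alone (no AWS calls)."""
    findings = []

    def add(severity: str, reason: str) -> None:
        findings.append({"field": "rotation", "severity": severity, "reason": reason})

    last_rotated = raw_data.get("LastRotatedDate")
    if not raw_data.get("RotationEnabled", False):
        add("HIGH", "automatic rotation is disabled")
    elif last_rotated is None:
        add("MEDIUM", "rotation is enabled but the secret has never rotated")
    else:
        age_days = (now - last_rotated).days
        if age_days > 2 * max_age_days:
            add("CRITICAL", f"last rotated {age_days} days ago, over twice the {max_age_days}-day limit")
        elif age_days > max_age_days:
            add("HIGH", f"last rotated {age_days} days ago, past the {max_age_days}-day limit")
    return {"findings": findings}


now = datetime(2024, 7, 19, tzinfo=timezone.utc)
stale = {"RotationEnabled": True, "LastRotatedDate": datetime(2024, 1, 1, tzinfo=timezone.utc)}
result = assess_rotation(stale, now, max_age_days=90)
print(result["findings"][0]["severity"])
```

Passing `now` in as an argument, instead of reading the clock inside the function, is what keeps it pure: a test can pin the date and check every rung without mocking anything.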
Avoiding secret exposure in security scans
A primary concern was ensuring the tool never handles sensitive data. By design, AWS Secrets Manager’s ListSecrets API returns only metadata such as RotationEnabled, LastRotatedDate, and RotationRules, without exposing the actual secret value. This approach eliminates the risk of secret leakage while still providing all necessary data for posture assessment:
    sm.get_paginator('list_secrets')
    # Returns: RotationEnabled, LastRotatedDate, NextRotationDate, RotationRules
    # Excludes: SecretString

The team validated this approach with automated tests that verify no secret strings are ever stored in scan records. A moto-backed test suite ensures compliance with this security constraint.
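One way to make the metadata-only guarantee explicit is to copy an allowlist of fields out of each ListSecrets page, so that nothing else can reach a scan record even by accident. The sketch below is an assumption about how that might look, not the tool's collector: `collect_secret_metadata` and `ALLOWED_KEYS` are hypothetical names, and the function takes any iterable of pages so it can be exercised without AWS or moto. With boto3, the pages would come from `boto3.client("secretsmanager").get_paginator("list_secrets").paginate()`.

```python
# Metadata fields worth keeping for posture assessment (an illustrative choice).
ALLOWED_KEYS = (
    "ARN",
    "Name",
    "RotationEnabled",
    "LastRotatedDate",
    "NextRotationDate",
    "RotationRules",
)


def collect_secret_metadata(pages):
    """Build scan records from ListSecrets pages, keeping only allowlisted fields."""
    records = []
    for page in pages:
        for entry in page.get("SecretList", []):
            records.append({key: entry[key] for key in ALLOWED_KEYS if key in entry})
    return records


# A fake page. The real API does not return SecretString here; it is planted
# in the fixture only to show that the allowlist would drop it.
fake_pages = [
    {
        "SecretList": [
            {
                "Name": "db-password",
                "RotationEnabled": False,
                "SecretString": "should-never-be-stored",
            }
        ]
    }
]

records = collect_secret_metadata(fake_pages)
print(records)
```

A test in the spirit of the one described above then reduces to asserting that no record contains a `SecretString` key.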
Addressing the "who did it?" ambiguity
Many risk alerts include an attribution feature that traces changes to specific users via CloudTrail. However, this mechanism breaks down for static risks like overdue secret rotation. Non-events—such as a secret that hasn’t rotated—have no actor to blame. Asking CloudTrail "who caused this secret to not rotate?" yields no results because the absence of an event isn’t attributable to any user.
To prevent misleading interfaces, the attribution button is conditionally enabled only for change-based findings. This ensures users aren’t presented with irrelevant or false information:
    if has_change:
        # Only change-based findings get an actor
        findings.extend(assess(asset.asset_type, changes)['findings'])

Honest limitations and customization
The severity thresholds and criticality cliffs in the rotation module are heuristic, not absolute rules. Teams should adjust settings like SECRET_ROTATION_MAX_AGE_DAYS to align with their compliance and security policies. This flexibility acknowledges that risk tolerance varies across organizations.
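How the setting is read is not described here. As one possibility, the sketch below treats `SECRET_ROTATION_MAX_AGE_DAYS` as an environment variable with a fallback; both the environment-variable mechanism and the 90-day default are assumptions, and `rotation_max_age_days` is a hypothetical helper.

```python
import os

DEFAULT_MAX_AGE_DAYS = 90  # assumed default, not taken from the tool


def rotation_max_age_days(env=None) -> int:
    """Read the rotation limit in days, falling back to the default when unset."""
    env = os.environ if env is None else env
    raw = env.get("SECRET_ROTATION_MAX_AGE_DAYS")
    if raw is None or not raw.strip():
        return DEFAULT_MAX_AGE_DAYS
    value = int(raw)  # raises ValueError on non-numeric input
    if value <= 0:
        raise ValueError("SECRET_ROTATION_MAX_AGE_DAYS must be a positive number of days")
    return value


# A team with a stricter policy sets the limit to 30 days.
print(rotation_max_age_days({"SECRET_ROTATION_MAX_AGE_DAYS": "30"}))
print(rotation_max_age_days({}))
```

Since the critical rung is defined relative to the limit, lowering the limit to 30 days also moves the critical cliff from 180 days to 60.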
Cloud security tools must evolve beyond change detection to account for static risks. By separating dynamic and static evaluations, teams can eliminate blind spots and ensure that no critical condition, whether it arises from a change or from a standing state, slips through the cracks.