The rise of Model Context Protocol (MCP) servers as critical infrastructure demands more rigorous validation than a simple process start and schema check. Many teams mistakenly assume that if an MCP server boots and responds to tools/list with a clean schema, it is ready for production. This oversight can lead to failures in authentication, tenant isolation, environment configuration, or permission scopes—issues that only surface during actual agent interactions.
To address this gap, the latest release, mcp-probe@1.8.0, introduces stricter CI readiness checks that validate not just server availability but the full operational contract an MCP agent will depend on. The update turns what was once a basic smoke test into a production-grade gate that catches subtle but critical failures before deployment.
Why basic CI checks for MCP servers fall short
A common mistake in MCP server validation is equating server initialization with operational readiness. Passing the initialize handshake and advertising expected tools does not guarantee that the server will handle real agent requests correctly. Common failure modes include:
- Broken OAuth flows that require browser redirects unsupported in headless environments
- Tools that return 401 Unauthorized despite correct server startup
- Role-based permission issues where admin credentials work but read-only roles fail
- Workflow configurations that mention MCP probes without actually executing critical boundary checks
These issues often surface as degraded agent behavior rather than outright crashes, making them easy to overlook in superficial validation pipelines.
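To make the gap concrete, the following Python sketch contrasts a catalog request with a real tool call. The method names tools/list and tools/call and the isError result field come from the MCP specification; the helper names and the sample response are hypothetical, and this is not a description of how mcp-probe works internally.

```python
import json
from itertools import count

_ids = count(1)  # simple source of JSON-RPC request ids


def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request body for an MCP method."""
    body = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        body["params"] = params
    return body


def tool_call_outcome(response):
    """Classify a tools/call response as 'pass' or 'fail'.

    A call can fail in two ways: a JSON-RPC level "error" object,
    or a tool result whose "isError" field is true.
    """
    if "error" in response:
        return "fail"
    if response.get("result", {}).get("isError", False):
        return "fail"
    return "pass"


# A server can answer the catalog request cleanly...
list_request = jsonrpc_request("tools/list")
# ...and still reject the first real call an agent makes.
call_request = jsonrpc_request(
    "tools/call",
    {"name": "logs_query", "arguments": {"query": "service:web status:error"}},
)

# Hypothetical response from a server whose token lacks the right scope.
denied = {
    "jsonrpc": "2.0",
    "id": call_request["id"],
    "result": {
        "isError": True,
        "content": [{"type": "text", "text": "401 Unauthorized"}],
    },
}

print(json.dumps(call_request, indent=2))
print(tool_call_outcome(denied))
```

A check that stops at tools/list never sends the second request, so it cannot observe the failure in the sample response.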
Stricter CI gates: what’s new in mcp-probe@1.8.0
The latest update introduces four key enhancements to transform MCP server validation from a basic check into a production-grade CI gate.
1. Warnings can now halt CI pipelines
Previously, mcp-probe treated warnings as non-fatal, allowing pipelines to continue even when issues like auth handoff failures or permission warnings arose. The new --fail-on-warn flag changes this behavior, ensuring that any warning triggers a pipeline failure.
npx @k08200/mcp-probe@latest --config mcp-probe.config.json --github-summary --fail-on-warn
This stricter enforcement is critical because many MCP failures are not hard crashes but subtle degradations that break agent workflows. For example, an OAuth flow that cannot complete in a CI environment may not crash the server but will fail every subsequent agent request that depends on authenticated access.
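The gating policy behind a flag like this can be sketched in a few lines of Python. The status labels ("pass", "warn", "fail") and the function name are assumptions made for illustration; they are not taken from mcp-probe's output format or source.

```python
def ci_exit_code(statuses, fail_on_warn=False):
    """Return 0 if the pipeline may continue, 1 if it must stop.

    `statuses` holds one label per check: "pass", "warn", or "fail"
    (assumed labels, not necessarily the ones mcp-probe reports).
    """
    blocking = {"fail"}
    if fail_on_warn:
        blocking.add("warn")  # strict mode: warnings block too
    return 1 if any(status in blocking for status in statuses) else 0


# Hypothetical run: the server booted, but the OAuth handoff only warned.
run = ["pass", "pass", "warn"]

print(ci_exit_code(run))                     # 0: lenient gate lets it through
print(ci_exit_code(run, fail_on_warn=True))  # 1: strict gate stops the pipeline
```

The only difference between the two modes is which labels count as blocking, which is why a single flag is enough to turn a report into a gate.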
2. Workflow receipt validation ensures actual execution
The doctor command previously checked whether a GitHub Actions workflow included mcp-probe steps, but this did not guarantee the checks were executed with the intended configuration. The updated behavior requires that all critical flags (--github-summary, --fail-on-warn, etc.) appear on the same step that runs the probe.
A valid configuration looks like this:
- run: npx @k08200/mcp-probe@latest --config mcp-probe.config.json --github-summary --fail-on-warn
An invalid configuration spreads flags across multiple steps, making it impossible to verify that the intended checks were actually performed:
- run: npx @k08200/mcp-probe --config mcp-probe.config.json
- run: npx @k08200/mcp-probe ./server.js --github-summary --fail-on-warn
This distinction separates superficial pipeline coverage from meaningful enforcement of production contracts.
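The same-step rule can be illustrated with a small Python sketch that inspects the run commands of a workflow. The function names are hypothetical, and the required set (including --config) is an assumption, since the article names only some of the critical flags. This is not the doctor command's actual logic.

```python
import shlex

# Assumed set of flags that must appear together on one step.
REQUIRED_FLAGS = {"--config", "--github-summary", "--fail-on-warn"}


def step_has_receipt(run_command, required=REQUIRED_FLAGS):
    """True if one run command invokes mcp-probe with every required flag."""
    tokens = shlex.split(run_command)
    invokes_probe = any("mcp-probe" in token for token in tokens)
    return invokes_probe and required.issubset(tokens)


def workflow_has_receipt(run_commands):
    """True if at least one single step carries the full set of flags."""
    return any(step_has_receipt(command) for command in run_commands)


valid = [
    "npx @k08200/mcp-probe@latest --config mcp-probe.config.json "
    "--github-summary --fail-on-warn",
]
split_across_steps = [
    "npx @k08200/mcp-probe --config mcp-probe.config.json",
    "npx @k08200/mcp-probe ./server.js --github-summary --fail-on-warn",
]

print(workflow_has_receipt(valid))               # True
print(workflow_has_receipt(split_across_steps))  # False
```

Checking each step in isolation is the point of the design: the union of flags across steps would pass the second workflow even though no single run enforces the full contract. The sketch matches flags as separate tokens only, so a form such as --config=path would need extra handling.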
3. Tool call coverage now requires meaningful inputs
The tool now supports explicit declarations of expected tool catalogs, including sidecar sample inputs that validate real-world usage patterns. For example, a configuration can specify which tools must be tested and what inputs should trigger them:
{
  "servers": [
    {
      "name": "datadog",
      "target": "
      "transport": "http",
      "headers": {
        "Authorization": "Bearer ${DATADOG_MCP_TOKEN}"
      },
      "expectedTools": ["logs_query"],
      "forbiddenTools": ["delete_dashboard", "rotate_api_key"],
      "toolsFile": "./datadog.tools.json"
    }
  ]
}
When both expectedTools and toolsFile are set, the probe validates not just that the tools are advertised but that meaningful dry-run samples are provided for each tool an agent might depend on. Auto-generated inputs are insufficient because they primarily test schema validation rather than functional readiness.
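As a rough illustration of what such a coverage check involves, the Python sketch below compares an advertised catalog against the declared lists and the sidecar samples. The function name, the report keys, and the metrics_query tool are hypothetical; only expectedTools, forbiddenTools, and the sidecar file come from the configuration above.

```python
def coverage_report(advertised, expected, forbidden, samples):
    """Compare the advertised catalog against the declared contract.

    advertised: tool names returned by tools/list
    expected, forbidden: the expectedTools and forbiddenTools lists
    samples: mapping of tool name to its sidecar entry
    """
    advertised = set(advertised)
    return {
        # Expected tools the server does not advertise at all.
        "missing": sorted(set(expected) - advertised),
        # Tools that must never be exposed but are.
        "forbidden_present": sorted(set(forbidden) & advertised),
        # Expected tools with no hand-written sample input.
        "no_sample": sorted(tool for tool in expected if tool not in samples),
    }


report = coverage_report(
    advertised=["logs_query", "metrics_query", "rotate_api_key"],
    expected=["logs_query", "metrics_query"],
    forbidden=["delete_dashboard", "rotate_api_key"],
    samples={"logs_query": {"input": {"query": "service:web status:error"}}},
)
print(report)
```

In this made-up run the catalog looks complete, yet the report still flags an exposed forbidden tool and an expected tool that would only ever be exercised with auto-generated input.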
4. Sidecar inputs define the real operational contract
Meaningful sidecar inputs are essential for validating that an MCP server behaves as expected in production. For example, a logs_query tool might require a specific query and timeframe to verify that read-only roles work correctly:
{
  "tools": {
    "logs_query": {
      "input": {
        "query": "service:web status:error",
        "timeframe": "1h"
      },
      "expect": {
        "status": "pass",
        "not_error_code": [401, 403],
        "requiredFields": ["source", "freshness"],
        "maxRows": 100
      }
    }
  }
}
For database-backed MCP servers, these assertions validate critical production concerns:
- Do read-only roles function as intended?
- Are row limits enforced to prevent excessive data exposure?
- Are administrative actions properly gated or absent from read-only endpoints?
- Do error responses include structured recovery guidance instead of raw stack traces?
- Do results include provenance fields like source and freshness to ensure traceability?
- Are sensitive data or internals accidentally exposed in responses?
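Several of these questions reduce to mechanical checks on a tool response. The Python sketch below evaluates an expect block like the one above against a simplified response. The response shape (status, error_code, rows), the per-row reading of requiredFields, and the function name are all assumptions made for illustration, not mcp-probe's documented semantics.

```python
def check_expectations(expect, response):
    """Return a list of violations; an empty list means the sample passed.

    `response` uses a simplified, hypothetical shape:
    {"status": "pass", "error_code": None, "rows": [{...}, ...]}
    """
    problems = []
    if "status" in expect and response.get("status") != expect["status"]:
        problems.append(f"status is {response.get('status')!r}")
    if response.get("error_code") in expect.get("not_error_code", []):
        problems.append(f"got error code {response['error_code']}")
    rows = response.get("rows", [])
    max_rows = expect.get("maxRows")
    if max_rows is not None and len(rows) > max_rows:
        problems.append(f"{len(rows)} rows exceeds maxRows={max_rows}")
    # Assumption: every returned row must carry each required field.
    for index, row in enumerate(rows):
        for field in expect.get("requiredFields", []):
            if field not in row:
                problems.append(f"row {index} is missing {field!r}")
    return problems


expect = {
    "status": "pass",
    "not_error_code": [401, 403],
    "requiredFields": ["source", "freshness"],
    "maxRows": 100,
}
good = {
    "status": "pass",
    "error_code": None,
    "rows": [{"source": "logs", "freshness": "2m", "message": "timeout"}],
}
denied = {"status": "fail", "error_code": 403, "rows": []}

print(check_expectations(expect, good))
print(check_expectations(expect, denied))
```

Returning every violation rather than stopping at the first keeps a CI summary useful when a single misconfigured role breaks several expectations at once.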
Getting started with stricter MCP CI validation
Installing mcp-probe is straightforward via npm:
npm install -D @k08200/mcp-probe
Or run it directly in your pipeline:
npx @k08200/mcp-probe@latest doctor
npx @k08200/mcp-probe@latest --config mcp-probe.config.json --github-summary --fail-on-warn
The goal is simple: ensure that MCP servers in CI pass the same contract tests that agents will rely on in production. By treating warnings as failures, validating actual workflow execution, and enforcing meaningful tool call coverage, teams can catch subtle but critical issues before they reach end users.
As MCP adoption grows, the distinction between "the server starts" and "the server is ready" will define the reliability of AI-driven workflows. Stricter CI gates are the first line of defense against the hidden failures that slip through superficial validation.
AI summary
Relying only on `tools/list` output in your MCP servers' CI process is not enough. How can you validate authorization, scopes, and real tool calls with the new features in `mcp-probe`?