Developers at a mid-sized team faced a growing AI bill from GitLab Duo Code Review, averaging $1 for every five merge requests. Seeking a cost-effective alternative, they turned to their existing ChatGPT subscription and built a custom bot that automated code reviews without per-token billing. The project, completed in a weekend, eliminated recurring expenses while preserving deep code analysis and threaded conversations.
The limitations of off-the-shelf AI review tools
GitLab Duo Code Review operates on a pay-per-token model, which quickly adds up for active development teams. At roughly $0.20 per review, the team's bill grew linearly with merge request volume. OpenAI's Codex SDK offered promising automation potential, but it fell short in key areas: strict token-based billing, closed workflows, and no session continuity for follow-up comments. The team needed a solution that could resume conversations, self-host securely, and leverage their existing ChatGPT subscription, without incurring additional costs.
Why a custom bot outperformed standard approaches
Two common Codex integration patterns were considered and rejected. The first ran Codex as a CI job, where each execution incurred token charges and no state persisted between runs. The second relied on stateless `codex exec` commands, which also billed per token and could not support threaded discussions. Both approaches suit environments with strict infrastructure constraints, but they were incompatible with the team's goals: conversation persistence, seamless follow-ups, and cost predictability.
Instead, the team prioritized a service-based architecture that could:
- Maintain conversation state across multiple developer replies
- Run on their existing ChatGPT subscription without per-token billing
- Self-host on their OpenShift cluster for security and control
- Receive model updates automatically as OpenAI released them
With these requirements in place, building a custom solution became the logical path forward.
How the custom bot handles code reviews end to end
The bot’s workflow is intentionally streamlined, relying on a webhook-driven pipeline that processes merge requests through a secure queue. When a developer opens or updates a merge request, GitLab triggers a webhook that Fastify validates and forwards to a BullMQ queue backed by Redis. A single worker processes each job, prepares a secure worktree from a cached repository clone, and invokes Codex with the relevant codebase context.
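The validation step can be sketched as follows. This is an illustrative fragment, not the team's actual code: `isValidWebhook` is a hypothetical helper, and it assumes the standard secret token that GitLab attaches to webhook deliveries in the `X-Gitlab-Token` header.

```typescript
import { timingSafeEqual } from "node:crypto";

// Hypothetical sketch: check the X-Gitlab-Token header that GitLab sends
// with every webhook delivery before a request is allowed to enqueue work.
// A constant-time comparison avoids leaking the secret via timing.
export function isValidWebhook(
  headerToken: string | undefined,
  expectedToken: string,
): boolean {
  if (!headerToken) return false;
  const a = Buffer.from(headerToken);
  const b = Buffer.from(expectedToken);
  // timingSafeEqual throws on length mismatch, so guard lengths first.
  if (a.length !== b.length) return false;
  return timingSafeEqual(a, b);
}
```

In a Fastify handler this check would run first, returning 401 before anything touches the BullMQ queue.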
The results are posted back to GitLab as both a summary comment and inline discussions at specific lines of code. To prevent abuse and ensure compliance with OpenAI’s terms, the system enforces strict concurrency limits—only one Codex seat runs at any time, and rate limits trigger automatic retries with exponential backoff.
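The retry policy described above can be reduced to a delay curve. BullMQ supports single-job concurrency and exponential backoff natively via worker and job options; the sketch below only shows the curve itself, with the base delay and cap as assumed values rather than the team's actual configuration.

```typescript
// Hypothetical sketch of the exponential backoff used after a rate limit:
// delay = base * 2^attempt, capped at a maximum so retries never stall
// a review indefinitely. Base and cap values are illustrative.
export function backoffDelayMs(
  attempt: number, // 0-based retry count
  baseMs = 1_000,
  maxMs = 60_000,
): number {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}
```

The single-seat constraint maps naturally onto a BullMQ `Worker` run with `concurrency: 1`, so only one Codex invocation is ever in flight.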
Secure state management across multiple storage layers
The bot distributes state across three secure storage layers, each serving a distinct purpose:
- Redis: Manages the BullMQ job queue with deterministic job IDs (`review-…-…` and `note-…-…`) to avoid duplicate processing. If Redis fails, subsequent webhooks re-enqueue the request automatically.
- MariaDB: Tracks high-level identifiers in two tables: `reviews` for merge request metadata and `threads` for bot-authored discussions. No code diffs, prompts, or LLM outputs are stored here, limiting exposure in case of a breach.
- Codex sessions: Stored as JSONL files on a persistent volume claim, each session file contains the full conversation context. The bot retains only the thread ID in the database, while the session data remains isolated under namespace-level role-based access control. This separation ensures sensitive content never enters the database, reducing security risks.
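The deduplication trick behind the deterministic job IDs can be sketched as pure functions. The exact components of each ID (project ID, MR IID, note ID) are assumptions for illustration, not the team's actual scheme; what matters is that BullMQ skips adding a job whose `jobId` already exists, so deriving the ID purely from the event makes repeated webhooks idempotent.

```typescript
// Hypothetical sketch: derive job IDs deterministically from the event so
// that re-delivered webhooks map to the same BullMQ jobId and are dropped
// instead of processed twice. The ID components are illustrative.
export function reviewJobId(projectId: number, mrIid: number): string {
  return `review-${projectId}-${mrIid}`;
}

export function noteJobId(projectId: number, noteId: number): string {
  return `note-${projectId}-${noteId}`;
}
```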
Enabling threaded follow-ups with precision
When a developer replies to a bot comment using @mr-codex, the system recognizes the discussion as a follow-up and resumes the existing Codex session. The worker retrieves the session from storage, appends the new prompt, and invokes Codex with the full prior context intact. This approach eliminates redundant token consumption and preserves conversational continuity.
Replies without the @mr-codex trigger are intentionally ignored to prevent unintended bot participation in unrelated discussions. Mentions outside existing bot threads initiate fresh sessions, with subsequent replies correctly resuming the new thread. This design prevents bot-to-bot loops and ensures GitLab discussions remain focused.
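The routing rules above reduce to a small decision function. The names and types here are illustrative; the set of bot-authored thread IDs corresponds to what the `threads` table records in MariaDB.

```typescript
// Hypothetical routing sketch for an incoming GitLab note event:
// - no @mr-codex mention       -> ignore (stay out of unrelated threads)
// - mention in a known bot thread -> resume the stored Codex session
// - mention anywhere else         -> start a fresh session
type NoteAction = "ignore" | "resume" | "start";

export function routeNote(
  body: string,
  discussionId: string,
  botThreadIds: Set<string>, // thread IDs recorded in the database
): NoteAction {
  if (!body.includes("@mr-codex")) return "ignore";
  return botThreadIds.has(discussionId) ? "resume" : "start";
}
```

Because the bot only reacts to explicit mentions, its own comments can never re-trigger it, which is what rules out bot-to-bot loops.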
Overcoming sandbox restrictions in OpenShift clusters
The most challenging hurdle was enabling Codex’s sandboxed modes on an OpenShift cluster. Codex offers three sandbox levels—read-only, workspace-write, and danger-full-access—but the first two rely on bwrap, which requires unprivileged user namespaces. OpenShift’s default restricted-v2 security context constraints block these modes entirely.
After exploring Landlock-based alternatives and custom seccomp profiles, the team concluded that danger-full-access mode was the only viable option. While it disables Codex's own sandboxing, the process still runs inside the cluster's existing RBAC and volume restrictions, which keeps the blast radius acceptable. The setup proved robust enough for production use.
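Assuming the Codex CLI's `--sandbox` flag (which accepts the three levels named above), the worker's invocation might look like the sketch below. `codexArgs` and `runCodexReview` are hypothetical helper names, and the prompt and worktree path are placeholders.

```typescript
import { execFileSync } from "node:child_process";

// Hedged sketch: shell out to the Codex CLI with the only sandbox level
// usable under OpenShift's restricted-v2 SCC. The bwrap-based read-only
// and workspace-write modes need unprivileged user namespaces, which
// restricted-v2 blocks.
export function codexArgs(prompt: string): string[] {
  return ["exec", "--sandbox", "danger-full-access", prompt];
}

export function runCodexReview(worktreeDir: string, prompt: string): string {
  // Run Codex inside the prepared worktree and return its output.
  return execFileSync("codex", codexArgs(prompt), {
    cwd: worktreeDir,
    encoding: "utf8",
  });
}
```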
The road ahead for AI-powered code reviews
The custom bot has already delivered measurable savings and operational flexibility, but its potential extends further. Future improvements could include integrating additional static analysis tools, expanding language support, or refining the prompt engineering for deeper insights. For teams looking to reduce AI tooling costs without sacrificing quality, a self-hosted, state-aware code review bot represents a compelling alternative to proprietary solutions.
As AI-driven development tools evolve, the real value lies in customization. Tools that adapt to existing infrastructure—and existing budgets—will define the next generation of software engineering workflows.