How to Build a Safe Developer Agent with OpenAI Agents SDK

Building your first developer agent can feel overwhelming, especially if you picture a fully autonomous code editor. The most effective agents, however, start by understanding the problem rather than by writing code. A developer agent's first job should be to read the issue, examine the codebase, draft a plan, suggest tests, and summarize its findings for review, all before touching a single line of production code. This approach builds trust and keeps the agent a helpful assistant rather than an uncontrolled system.

This guide walks through creating a cautious developer agent using the OpenAI Agents SDK. The goal isn’t to build a fully autonomous engineer, but a disciplined workflow assistant that can analyze issues, search repositories, and propose changes with clear safety boundaries. The example uses Python, reflecting the SDK’s strong support for the language, and is designed as a learning template rather than a production-ready system.

Why Start with Caution? The Agent’s Role in Real Engineering

A developer agent should act as a senior engineer’s collaborator, not a replacement. Its primary value lies in carefully analyzing problems before any edits are made. The agent should:

Read GitHub issues to understand the problem context
Search the codebase to find relevant files
Create a step-by-step implementation plan
Suggest or run tests to validate assumptions
Generate a pull request summary for human review

Only after this process should it propose edits—and even then, human approval should be required before any write action. This conservative design ensures the agent remains helpful, transparent, and safe to use in real workflows.
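The approval rule can be enforced in code rather than left to the prompt alone. The following is a minimal sketch of such a gate, not part of the Agents SDK: `ProposedEdit`, `apply_edit`, and the `approve` callback are hypothetical names introduced here for illustration, and a real reviewer step would more likely be a pull request than a terminal prompt.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable


@dataclass
class ProposedEdit:
    """Hypothetical container for a change the agent wants to make."""
    relative_path: str
    new_content: str
    reason: str


def apply_edit(edit: ProposedEdit, repo_root: Path,
               approve: Callable[[ProposedEdit], bool]) -> bool:
    """Write the edit only if it stays inside the repo and a human approves it."""
    root = repo_root.resolve()
    target = (root / edit.relative_path).resolve()
    if root not in target.parents:
        raise ValueError("Edit target is outside the repository.")
    if not approve(edit):
        return False  # Rejected: nothing is written.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(edit.new_content)
    return True


def ask_on_console(edit: ProposedEdit) -> bool:
    """Simplest possible reviewer: a yes/no prompt in the terminal."""
    answer = input(f"Apply change to {edit.relative_path}? ({edit.reason}) [y/N] ")
    return answer.strip().lower() == "y"
```

Passing the reviewer in as a callback keeps the write path testable: a test can supply a function that always rejects and confirm that no file appears.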

Setting Up the Development Environment

Start by creating a new Python project in a dedicated directory. Use a virtual environment to isolate dependencies and prevent conflicts:

mkdir developer-agent-demo
cd developer-agent-demo
python -m venv .venv
source .venv/bin/activate  # On Windows, use `.venv\Scripts\activate`
pip install openai-agents

Next, configure your OpenAI API key. This key enables the agent to interact with the model, so keep it secure and avoid committing it to version control:

export OPENAI_API_KEY="your-api-key-here"
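A missing key otherwise surfaces only when the first model call fails, so it can help to check for it at startup. The helper below is a small sketch under that assumption; `require_api_key` is a hypothetical name, not an SDK function.

```python
import os


def require_api_key(env=os.environ) -> str:
    """Return the API key, or raise if it is missing or blank."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before running the agent."
        )
    return key
```

Calling `require_api_key()` at the top of agent.py would stop the script with a clear message before any tool runs.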

Organize your project with a simple structure:

agent.py – The main agent logic
tools.py – Safe tools for file access and code search
github_client.py – Minimal GitHub issue reader
safety.py – Optional safety policies (not used in this example)
target-repo/ – A local repository the agent will analyze

The target-repo directory simulates a real codebase you want the agent to inspect.

Designing Safe Tools: The Agent’s Controlled Toolkit

The most critical part of any agent is its toolset. Tools define what actions the agent can perform, so they must be carefully designed to prevent unintended behavior. In this example, tools are limited to read-only and testing operations—no file writes, deletions, or remote actions.

In tools.py, define a path-checking helper and three tools:

from pathlib import Path
import subprocess
from typing import List

REPO_ROOT = Path("target-repo").resolve()

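def safe_path(relative_path: str) -> Path:
    # Resolve symlinks and ".." first, then confirm the result is still inside the repo.
    path = (REPO_ROOT / relative_path).resolve()
    # A plain string-prefix check would also accept siblings such as "target-repo-old".
    if path != REPO_ROOT and REPO_ROOT not in path.parents:
        raise ValueError("Path is outside the repository.")
    return path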

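def search_codebase(query: str, max_results: int = 20) -> List[str]:
    # "-e" marks the query as the pattern, so a query starting with "-" is not parsed as a grep option.
    result = subprocess.run(
        ["grep", "-R", "-n", "-e", query, "--", str(REPO_ROOT)],
        text=True,
        capture_output=True,
        timeout=30,  # Stop searches that hang on very large trees.
    )
    lines = result.stdout.splitlines()
    return lines[:max_results]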

def read_file(relative_path: str, max_chars: int = 12000) -> str:
    path = safe_path(relative_path)
    if not path.is_file():
        raise FileNotFoundError(relative_path)
    content = path.read_text(errors="replace")
    return content[:max_chars]

def run_tests(test_command: str) -> str:
    allowed_commands = {
        "pytest": ["pytest"],
        "phpunit": ["vendor/bin/phpunit"],
        "npm-test": ["npm", "test"],
    }
    if test_command not in allowed_commands:
        raise ValueError(f"Test command is not allowed: {test_command}")
    result = subprocess.run(
        allowed_commands[test_command],
        cwd=REPO_ROOT,
        text=True,
        capture_output=True,
        timeout=120,
    )
    return result.stdout + "\n" + result.stderr

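These tools enforce strict boundaries: the agent can only search, read, and run approved test commands, and none of the tools writes files, pushes changes, or calls external systems. One caveat is that test commands execute whatever code the repository contains, so run the agent against a repository you trust or inside a sandbox. Within those limits, the design keeps the agent's behavior predictable.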
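Path containment is the boundary most worth testing. The sketch below compares a string-prefix comparison with a check based on `Path.parents`, using a temporary directory in place of target-repo. The names `is_inside` and `prefix_check` and the sibling directory `target-repo-secrets` are made up for this illustration.

```python
import tempfile
from pathlib import Path


def is_inside(root: Path, relative_path: str) -> bool:
    """Return True if relative_path resolves to root or something beneath it."""
    root = root.resolve()
    candidate = (root / relative_path).resolve()
    return candidate == root or root in candidate.parents


def prefix_check(root: Path, relative_path: str) -> bool:
    """The naive string comparison, shown only for contrast."""
    root = root.resolve()
    candidate = (root / relative_path).resolve()
    return str(candidate).startswith(str(root))


with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp) / "target-repo"
    sibling = Path(tmp) / "target-repo-secrets"
    repo.mkdir()
    sibling.mkdir()
    (repo / "app.py").write_text("print('ok')\n")

    results = {
        "app.py": is_inside(repo, "app.py"),
        "../outside.txt": is_inside(repo, "../outside.txt"),
        "../target-repo-secrets/key": is_inside(repo, "../target-repo-secrets/key"),
    }
    # The prefix check accepts the sibling because its path begins with the repo path.
    naive_sibling = prefix_check(repo, "../target-repo-secrets/key")
```

Only `app.py` passes `is_inside`, while `prefix_check` lets the sibling directory through, which is why comparing path components is the safer choice.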

Reading GitHub Issues: Bridging Agents and Repositories

To analyze real issues, the agent needs access to GitHub issue data. A production system would call the GitHub REST API or use a client library; this example instead uses a stub in github_client.py that returns a hard-coded issue:

from dataclasses import dataclass

@dataclass
class GitHubIssue:
    number: int
    title: str
    body: str
    labels: list[str]

def read_issue(issue_number: int) -> GitHubIssue:
    return GitHubIssue(
        number=issue_number,
        title="Fix duplicate invoice reminder emails",
        body="""
        Customers sometimes receive duplicate invoice reminder emails. 
        This seems to happen when the scheduled reminder command and 
        the invoice overdue event run close together.
        """,
        labels=["bug", "billing"],
    )

This mock issue simulates a real-world bug report, allowing the agent to practice reading, parsing, and reasoning about issue content without external dependencies.
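For comparison, a reader backed by the GitHub REST API might look roughly like the sketch below. It uses only the standard library and the public `GET /repos/{owner}/{repo}/issues/{issue_number}` endpoint. The function names, the `token` parameter, and the way the token is supplied are assumptions made for illustration, and error handling and rate limiting are left out.

```python
import json
import urllib.request
from dataclasses import dataclass


@dataclass
class GitHubIssue:
    number: int
    title: str
    body: str
    labels: list


def issue_from_payload(payload: dict) -> GitHubIssue:
    """Map the JSON object returned by the issues endpoint onto the dataclass."""
    return GitHubIssue(
        number=payload["number"],
        title=payload["title"],
        body=payload.get("body") or "",  # body is null when the issue has no description
        labels=[
            label["name"] if isinstance(label, dict) else label
            for label in payload.get("labels", [])
        ],
    )


def fetch_issue(owner: str, repo: str, issue_number: int, token: str) -> GitHubIssue:
    """Fetch one issue over HTTPS. Requires network access and a valid token."""
    request = urllib.request.Request(
        f"https://api.github.com/repos/{owner}/{repo}/issues/{issue_number}",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return issue_from_payload(json.load(response))
```

Keeping the parsing in `issue_from_payload` separate from the network call means the mapping can be tested with a plain dictionary and no credentials.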

Assembling the Agent: Instructions, Tools, and Execution

With the tools and issue reader in place, assemble the agent in agent.py. The agent receives instructions, access to tools, and a workflow to follow:

from agents import Agent, Runner, function_tool
from github_client import read_issue
from tools import search_codebase, read_file, run_tests

@function_tool
def get_github_issue(issue_number: int) -> str:
    issue = read_issue(issue_number)
    return f"""
    Issue #{issue.number}: {issue.title}
    Labels: {', '.join(issue.labels)}
    Body: {issue.body}
    """

@function_tool
def search_repo(query: str) -> str:
    results = search_codebase(query)
    if not results:
        return "No matches found."
    return "\n".join(results)

@function_tool
def read_repo_file(path: str) -> str:
    return read_file(path)

@function_tool
def run_approved_tests(command: str) -> str:
    return run_tests(command)

developer_agent = Agent(
    name="Developer Planning Agent",
    instructions="""
    You are a careful senior software engineering assistant. 
    Your job is to analyze GitHub issues and create safe implementation plans.
    Rules:
    - Do not edit files.
    - Do not invent files you have not inspected.
    - Use repository search before making claims about code.
    - Prefer tests before implementation.
    - Preserve public APIs unless the issue explicitly requires changing them.
    - Explain risks and assumptions.
    - Ask for human approval before any write action.
    - If context is missing, say what is missing.
    
    Output format:
    ## Issue Summary
    ## Relevant Code Areas
    ## Current Behavior Hypothesis
    ## Implementation Plan
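    ## Tests To Add Or Run
    """,
    # Register the wrapped tools; without this list the agent cannot call any of them.
    tools=[get_github_issue, search_repo, read_repo_file, run_approved_tests],
)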

# Runner.run is a coroutine; run_sync is the blocking variant for plain scripts.
result = Runner.run_sync(developer_agent, "Analyze issue #123")
print(result.final_output)

This agent combines strict instructions with controlled tools to perform a safe, guided analysis. It can now read issues, search code, and propose plans without making unauthorized changes.
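Because the output format lives only in the prompt, the model may occasionally omit a section. One possible safeguard, sketched below, is to check the final text for the required headings before handing it to a reviewer. `missing_sections` and `sample_report` are hypothetical names, and the sample text is invented for the demonstration.

```python
REQUIRED_SECTIONS = [
    "## Issue Summary",
    "## Relevant Code Areas",
    "## Current Behavior Hypothesis",
    "## Implementation Plan",
    "## Tests To Add Or Run",
]


def missing_sections(report: str) -> list:
    """Return the required headings that do not appear on their own line."""
    present = {line.strip() for line in report.splitlines()}
    return [heading for heading in REQUIRED_SECTIONS if heading not in present]


# Invented report text that covers only two of the five sections.
sample_report = """## Issue Summary
Duplicate reminder emails.
## Implementation Plan
Add an idempotency check.
"""
gaps = missing_sections(sample_report)
```

A non-empty result could trigger a retry or simply be shown to the reviewer alongside the report.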

The Path Forward: From Learning to Production

This developer agent is a starting point, not a final solution. To evolve it into a production-ready assistant, consider integrating real GitHub API access, replacing grep with semantic code search, and adding more granular approval gates. You could also expand the toolset to support documentation generation or dependency analysis.

The key takeaway is that effective developer agents prioritize safety, transparency, and collaboration over automation. By building with clear boundaries and human oversight, you create a system that enhances engineering workflows while keeping risk contained.
