How to Generate an Audit Trail for AI Agent Actions (With Visual Proof)

You've deployed an AI agent to handle customer refunds. It works perfectly in testing.

But your compliance officer asks: "How do we prove what the agent actually did in the browser?"

You show them text logs from LangSmith or Langfuse. They're not satisfied.

Text logs tell you what the agent claimed to do. Visual proof shows what it actually did.

This is the gap between logs and compliance.

The Problem: Text Logs Aren't Audit Proof

Observability tools (LangSmith, Langfuse, OpenTelemetry) capture:

- Agent decisions
- Tool calls and responses
- Token usage
- Latency metrics

But they don't capture what the agent actually saw or clicked.

Example: Your agent logs say "clicked refund button." But did it? What was on screen? Did the page load correctly?

For compliance reviews (HIPAA, SOC 2, PCI-DSS, EU AI Act, GDPR), a log line alone is weak evidence. Auditors want to see what actually happened on screen.

The Solution: Screenshot After Each Agent Step

Add a screenshot after every agent action:

import json
import os
from datetime import datetime, timezone
from pathlib import Path

import anthropic
import requests

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment


def agent_with_visual_proof(task: str) -> dict:
    """Run agent and capture screenshot proof after each step."""

    audit_trail = {
        "task": task,
        # UTC timestamps keep the trail unambiguous across time zones
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "steps": []
    }

    # Define tools with screenshot capture
    tools = [
        {
            "name": "take_screenshot",
            "description": "Capture current page state for audit trail",
            "input_schema": {
                "type": "object",
                "properties": {
                    "url": {"type": "string", "description": "URL to screenshot"},
                    "reason": {"type": "string", "description": "Why this screenshot matters"}
                },
                "required": ["url", "reason"]
            }
        },
        {
            "name": "process_refund",
            "description": "Process customer refund",
            "input_schema": {
                "type": "object",
                "properties": {
                    "order_id": {"type": "string"},
                    "amount": {"type": "number"}
                },
                "required": ["order_id", "amount"]
            }
        }
    ]

    messages = [
        {
            "role": "user",
            "content": task
        }
    ]

    step_count = 0

    while True:
        response = client.messages.create(
            model="claude-opus-4-5-20251101",
            max_tokens=1024,
            tools=tools,
            messages=messages
        )

        # Stop on end_turn, max_tokens, or anything else that is not a tool request
        if response.stop_reason != "tool_use":
            break

        step_count += 1

        # Record the assistant turn once, before answering its tool calls
        messages.append({
            "role": "assistant",
            "content": response.content
        })

        tool_results = []
        for content_block in response.content:
            if content_block.type != "tool_use":
                continue

            tool_name = content_block.name
            tool_input = content_block.input

            print(f"Step {step_count}: {tool_name}")
            print(f"  Input: {tool_input}")

            # Capture screenshot for audit trail
            if tool_name == "take_screenshot":
                tool_result = capture_screenshot(
                    tool_input["url"],
                    f"step-{step_count}",
                    tool_input["reason"]
                )

            elif tool_name == "process_refund":
                # Placeholder result: a real deployment would call the refund backend here
                tool_result = {
                    "status": "approved",
                    "order_id": tool_input["order_id"],
                    "amount": tool_input["amount"],
                    "reference": f"REF-{step_count}-{tool_input['order_id']}"
                }

                # Screenshot after refund
                proof = capture_screenshot(
                    "https://app.example.com/refunds",
                    f"step-{step_count}-refund-proof",
                    f"Refund {tool_result['reference']} processed"
                )

                # Store the path under the key the summary and report look for
                if "screenshot_path" in proof:
                    tool_result["screenshot_path"] = proof["screenshot_path"]
                else:
                    tool_result["screenshot_error"] = proof.get("error")

            else:
                tool_result = {"error": f"Unknown tool: {tool_name}"}

            # Record in audit trail
            audit_trail["steps"].append({
                "step": step_count,
                "action": tool_name,
                "input": tool_input,
                "result": tool_result,
                "timestamp": datetime.now(timezone.utc).isoformat()
            })

            tool_results.append({
                "type": "tool_result",
                "tool_use_id": content_block.id,
                "content": json.dumps(tool_result)
            })

        # Return every tool result to the model in a single user message
        messages.append({
            "role": "user",
            "content": tool_results
        })

    return audit_trail


def capture_screenshot(url: str, step_id: str, reason: str) -> dict:
    """Capture screenshot via PageBolt API."""
    try:
        response = requests.post(
            "https://pagebolt.dev/api/v1/screenshot",
            headers={
                "x-api-key": os.getenv("PAGEBOLT_API_KEY", ""),
                "Content-Type": "application/json"
            },
            json={
                "url": url,
                "format": "png",
                "width": 1280,
                "height": 720,
                "fullPage": True,
                "blockBanners": True
            },
            timeout=60
        )
    except requests.RequestException as exc:
        # Network failures are reported the same way as HTTP errors
        return {"error": f"Screenshot failed: {exc}"}

    if response.status_code != 200:
        return {"error": f"Screenshot failed: {response.status_code}"}

    # Save screenshot
    filename = f"audit-trail/{step_id}-{datetime.now(timezone.utc).timestamp()}.png"
    Path(filename).parent.mkdir(parents=True, exist_ok=True)

    with open(filename, "wb") as f:
        f.write(response.content)

    return {
        "screenshot_path": filename,
        "reason": reason,
        "url": url
    }


# Run agent with visual audit trail
if __name__ == "__main__":
    audit = agent_with_visual_proof(
        "Process refund for order ORDER-12345 with amount $50"
    )

    # Save audit trail as JSON with screenshot references
    with open("audit-trail.json", "w") as f:
        json.dump(audit, f, indent=2)

    print(f"Audit trail saved with {len(audit['steps'])} steps")
    for step in audit["steps"]:
        print(f"  Step {step['step']}: {step['action']} → {step['result'].get('screenshot_path', 'N/A')}")
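
A failed capture leaves a hole in the evidence, because capture_screenshot returns an error dict instead of raising. The following is a minimal sketch of a retry wrapper around it. The name capture_with_retry, the attempts and delay parameters, and the injected capture callable are illustrative assumptions, not part of the PageBolt API.

```python
import time
from typing import Callable


def capture_with_retry(
    capture: Callable[[str, str, str], dict],
    url: str,
    step_id: str,
    reason: str,
    attempts: int = 3,
    delay: float = 1.0,
) -> dict:
    """Call `capture` until it returns a result without an "error" key.

    `capture` is assumed to behave like capture_screenshot: it returns a dict
    containing either "screenshot_path" or "error".
    """
    result: dict = {"error": "capture was never attempted"}
    for attempt in range(1, attempts + 1):
        result = capture(url, step_id, reason)
        if "error" not in result:
            result["attempts"] = attempt
            return result
        if attempt < attempts:
            time.sleep(delay * attempt)  # linear backoff between tries
    # Keep the final error so the gap stays visible in the audit trail
    result["attempts"] = attempts
    return result
```

Passing the capture function in as an argument keeps the wrapper testable without network access; in the agent loop it could be called as capture_with_retry(capture_screenshot, url, step_id, reason).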

Real Use Case: Autonomous Customer Service Refund

A customer initiates a refund request. The agent then:

1. Screenshots the initial state: the customer data page
2. Retrieves order details: the agent calls the order API
3. Screenshots the order confirmation: verifies customer info
4. Processes the refund: submits the refund form
5. Screenshots the refund confirmation: proof of success
6. Sends the customer notification: an email with the refund ID

Each step has:

- Tool call with input
- Result (API response or form submission)
- Screenshot evidence of what happened on screen

This creates a complete visual audit trail for compliance audits.
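
The three per-step requirements above can be checked mechanically before a trail is handed to an auditor. This is a rough sketch, assuming each step record has the shape used by the agent code (step, action, input, result) and that a screenshot is stored under result["screenshot_path"], as the report code expects. The function name find_evidence_gaps is hypothetical.

```python
def find_evidence_gaps(audit_trail: dict) -> list[str]:
    """Describe every step that lacks a tool input, a result, or a screenshot."""
    gaps = []
    for step in audit_trail.get("steps", []):
        label = f"step {step.get('step', '?')} ({step.get('action', 'unknown')})"
        if not step.get("input"):
            gaps.append(f"{label}: missing tool input")
        result = step.get("result")
        if not isinstance(result, dict) or not result:
            gaps.append(f"{label}: missing result")
        elif not result.get("screenshot_path"):
            gaps.append(f"{label}: missing screenshot")
    return gaps
```

An empty list means every recorded step carries all three pieces of evidence. It does not prove that the screenshots show what the step claims, which still needs a human reviewer.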

Compliance Frameworks: What They Require

| Framework | Requirement | Solution |
|-----------|-------------|----------|
| HIPAA | Audit logs with evidence | Screenshots of patient data access |
| SOC 2 | Detailed access logs | Before/after screenshots of changes |
| PCI-DSS | Transaction proof | Screenshots of payment processing |
| EU AI Act | Decision transparency | Screenshots of agent actions/reasoning |
| GDPR | Data handling proof | Screenshots of data deletion/handling |

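Visual proof gives auditors concrete evidence for each of these requirements. It supplements the access controls and logging these frameworks also expect, and does not replace them.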

Architecture: Visual Audit Trail System

┌─────────────────────────────────────────────────────────┐
│ AI Agent (Claude)                                       │
│ ├─ Observability (LangSmith/Langfuse)                   │
│ ├─ Text logs: "clicked refund button"                   │
│ └─ Visual proof: screenshot after each step             │
└──────────┬──────────────────────────────────────────────┘
           │
           ├─ Store logs in observability platform
           │
           └─ Capture screenshots via PageBolt
              ├─ Screenshot after tool calls
              ├─ Screenshot after form submissions
              └─ Screenshot after navigation
                        │
                        ▼
           ┌──────────────────────────┐
           │ Audit Trail Storage      │
           │ ├─ step-1.png            │
           │ ├─ step-2.png            │
           │ ├─ step-3.png            │
           │ └─ audit-trail.json      │
           └────────────┬─────────────┘
                        │
                        ▼
           ┌──────────────────────────┐
           │ Compliance Report        │
           │ - Text logs              │
           │ - Screenshots            │
           │ - Timeline               │
           │ - Decision points        │
           └──────────────────────────┘
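
The storage layer in the diagram is just a folder of PNG files plus audit-trail.json. One possible way to let an auditor confirm that the images were not altered after the run is to record a digest for each file. The sketch below assumes the audit-trail/ directory used by the capture code; build_manifest and the manifest layout are hypothetical.

```python
import hashlib
from pathlib import Path


def build_manifest(directory: str) -> dict:
    """Map each PNG in `directory` to the SHA-256 hex digest of its bytes."""
    manifest = {}
    for path in sorted(Path(directory).glob("*.png")):
        manifest[path.name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest
```

Saving the returned dict next to audit-trail.json (for example with json.dump) gives a reviewer something to recompute and compare against later.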

Generating Compliance Reports

from html import escape


def generate_audit_report(audit_trail: dict) -> str:
    """Generate HTML report with screenshots for auditors."""

    # Escape every interpolated value so agent-supplied text cannot inject markup
    html = f"""
    <html>
    <head><title>AI Agent Audit Trail</title></head>
    <body>
    <h1>Audit Trail Report</h1>
    <p><strong>Task:</strong> {escape(str(audit_trail['task']))}</p>
    <p><strong>Timestamp:</strong> {escape(str(audit_trail['timestamp']))}</p>

    <h2>Agent Actions</h2>
    """

    for step in audit_trail["steps"]:
        html += f"""
        <div style="border: 1px solid #ccc; margin: 20px 0; padding: 10px;">
        <h3>Step {step['step']}: {escape(str(step['action']))}</h3>
        <p><strong>Input:</strong> {escape(str(step['input']))}</p>
        <p><strong>Result:</strong> {escape(str(step['result']))}</p>
        <p><strong>Time:</strong> {escape(str(step['timestamp']))}</p>
        """

        # Embed the screenshot when the step recorded one
        if isinstance(step['result'], dict) and 'screenshot_path' in step['result']:
            html += f"""
            <h4>Visual Proof</h4>
            <img src="{escape(str(step['result']['screenshot_path']))}" style="max-width: 100%; border: 1px solid #ddd;">
            """

        html += "</div>"

    html += "</body></html>"
    return html


# Generate report (uses the `audit` dict produced by the agent run above)
report_html = generate_audit_report(audit)
with open("audit-report.html", "w", encoding="utf-8") as f:
    f.write(report_html)

print("Audit report generated: audit-report.html")
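
The compliance report in the architecture diagram also lists a timeline. Since every step already carries an ISO 8601 timestamp, a timeline can be derived from the trail itself. This is a small sketch under that assumption; step_timeline is a hypothetical helper, not part of the report code above.

```python
from datetime import datetime


def step_timeline(audit_trail: dict) -> list[tuple[int, float]]:
    """Return (step number, seconds since the trail started) for each step."""
    start = datetime.fromisoformat(audit_trail["timestamp"])
    timeline = []
    for step in audit_trail["steps"]:
        elapsed = datetime.fromisoformat(step["timestamp"]) - start
        timeline.append((step["step"], elapsed.total_seconds()))
    return timeline
```

The offsets could be rendered as an extra column in the HTML report, which makes long pauses between an action and its screenshot easy to spot.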

Pricing

| Plan | Requests/Month | Cost | Best For |
|------|----------------|------|----------|
| Free | 100 | $0 | Testing, low-volume agents |
| Starter | 5,000 | $29 | 10–50 agent runs/month |
| Growth | 25,000 | $79 | 100–500 agent runs/month |
| Scale | 100,000 | $199 | 1000+ agent runs/month |

At 5 screenshots per agent run, the Starter plan's 5,000 requests cover 1,000 agent runs per month.
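
The same arithmetic works for any plan and any screenshot count. The sketch below assumes one API request per screenshot and takes the quotas from the pricing table; runs_covered is a hypothetical helper.

```python
# Monthly request quotas from the pricing table
PLANS = {"Free": 100, "Starter": 5_000, "Growth": 25_000, "Scale": 100_000}


def runs_covered(requests_per_month: int, screenshots_per_run: int) -> int:
    """Whole agent runs that fit in a monthly request quota."""
    if screenshots_per_run <= 0:
        raise ValueError("screenshots_per_run must be positive")
    return requests_per_month // screenshots_per_run


for plan, quota in PLANS.items():
    print(f"{plan}: {runs_covered(quota, 5):,} runs/month at 5 screenshots per run")
```

Retries and before/after captures each count as extra requests, so the real number of runs per plan will be lower when those are used.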

Summary

✅ Text logs from LangSmith/Langfuse document agent decisions
✅ Screenshots from PageBolt document agent actions
✅ Together they create compliance-ready audit trails
✅ Visual proof supports audits under HIPAA, SOC 2, PCI-DSS, the EU AI Act, and GDPR
✅ Generate HTML reports with embedded screenshots
✅ Store alongside observability logs for complete evidence

Get started free: pagebolt.dev — 100 requests/month, no credit card required.

URL: https://dev.to/custodiaadmin/how-to-generate-an-audit-trail-for-ai-agent-actions-with-visual-proof-31a7