Understanding Audits
An audit is Docubat's core feature: an automated test that verifies whether AI models can understand your documentation and use it to write working code. This page explains how audits work, what they test, and how to interpret results.
What is an Audit?
Think of an audit as a comprehensive test where we ask AI models to read your documentation and write working code. Each audit tests multiple combinations of:
- Programming Languages (Python, JavaScript, Java, etc.)
- AI Models (GPT-4, Claude, etc.)
- Your Documentation (APIs, SDKs, tutorials)
The goal is to identify where your documentation might be unclear, incomplete, or difficult for AI to interpret.
How Audits Work
The Audit Process
When you run an audit, here's what happens behind the scenes:
1. Documentation Processing
- Docubat fetches your documentation from the URLs you provided
- Content is processed and organized by programming language
- Documentation is optimized for AI consumption while preserving accuracy
2. Implementation Planning
For each programming language and AI model combination:
- The AI creates an implementation plan based on your task description
- Multiple attempts are made (up to 3 tries per combination)
- Each attempt learns from previous failures
3. Code Generation
- AI models write actual, executable code in the target programming language
- Code follows the task requirements and documentation guidelines
- Generated code includes proper error handling and best practices
4. Execution and Testing
- Generated code runs in secure, isolated cloud environments
- Tests execute with any authentication credentials you provided
- Results are captured, including output, errors, and execution logs
5. Results Analysis
- Actual output is compared against your expected output (a simplified sketch of this comparison appears after this list)
- Code structure is validated against any specified requirements
- Success/failure is determined based on multiple criteria
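To make step 5 concrete, here is a minimal sketch of the kind of check the results analysis boils down to. The field names and criteria below are illustrative assumptions, not Docubat's internal implementation.

```python
# Illustrative only: a simplified version of the output comparison performed
# during results analysis. Field and parameter names are hypothetical.
def analyze_trial(actual_output: str, exit_code: int, expected_phrases: list[str]) -> dict:
    """Compare a trial's captured output against the expected success criteria."""
    ran_cleanly = exit_code == 0                                   # error-free execution
    missing = [p for p in expected_phrases if p.lower() not in actual_output.lower()]
    return {
        "success": ran_cleanly and not missing,                    # both criteria must hold
        "ran_cleanly": ran_cleanly,
        "missing_criteria": missing,                               # expectations not met
    }

# Example: a trial whose generated code printed the created user's ID
result = analyze_trial(
    actual_output="Created user with ID 4821",
    exit_code=0,
    expected_phrases=["created user", "id"],
)
print(result)  # {'success': True, 'ran_cleanly': True, 'missing_criteria': []}
```

In practice the analysis also weighs structural requirements and error handling, which is why clearly defined expected output (see Audit Configuration below) matters.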
What Makes an Audit Succeed?
An audit succeeds when:
- Functional Success: The generated code produces the expected output
- Structural Compliance: Code follows specified patterns and requirements
- Error-Free Execution: Code runs without critical errors
- Output Matching: Results match your defined success criteria
What Makes an Audit Fail?
Common failure reasons include:
- Documentation Gaps: Missing crucial information for implementation
- Ambiguous Instructions: Unclear or conflicting guidance
- Authentication Issues: Problems with API keys or access credentials
- Language-Specific Gaps: Missing examples for specific programming languages
- Outdated Information: Documentation that doesn't match current API behavior
Audit Configuration
Task Definition
The task description is the most critical part of your audit configuration:
Good Task Description:
Create a new user account using our API. The user should have a name,
email, and password. Return the created user's ID and handle any
validation errors appropriately.
Poor Task Description:
Use our API to create a user.
Expected Output
Define clear success criteria:
Specific Expected Output:
Successfully created user with returned user ID (numeric).
Error handling for duplicate emails and invalid passwords.
Proper HTTP status codes (201 for success, 400 for validation errors).
Vague Expected Output:
User creation works.
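For reference, the sketch below shows roughly the code that the good task description and specific expected output above should elicit from an AI model. The base URL, endpoint, and payload fields are hypothetical stand-ins for your own API; treat it as an illustration of the target, not a reference implementation.

```python
# Hypothetical example of generated code for the "create a user account" task.
# The base URL, /users endpoint, and field names are placeholders, not a real API.
import os
import requests

API_BASE = "https://api.example.com"  # placeholder; your documentation defines the real host

def create_user(name: str, email: str, password: str) -> int:
    """Create a user and return the new user's numeric ID."""
    response = requests.post(
        f"{API_BASE}/users",
        json={"name": name, "email": email, "password": password},
        headers={"Authorization": f"Bearer {os.environ.get('API_KEY', '')}"},
        timeout=10,
    )
    if response.status_code == 201:          # success criterion: 201 plus a numeric ID
        return response.json()["id"]
    if response.status_code == 400:          # validation errors: duplicate email, weak password
        raise ValueError(f"Validation failed: {response.json().get('error')}")
    response.raise_for_status()              # surface any other HTTP error
    raise RuntimeError(f"Unexpected status code: {response.status_code}")

if __name__ == "__main__":
    user_id = create_user("Ada Lovelace", "ada@example.com", "s3cure-passw0rd")
    print(f"Successfully created user with ID {user_id}")
```

The more precisely your task and expected output pin down details like these (status codes, required fields, error cases), the easier it is for both the AI and the results analysis to agree on what success means.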
Programming Language Selection
Choose languages strategically:
- Start Small: Begin with 2-3 key languages
- Consider Your Audience: Focus on languages your developers actually use
- Test Officially Supported Languages: Verify that every language you officially support performs well
AI Model Selection
Balance coverage with cost:
- Popular Models: Include models your users are likely to use
- Version Variety: Test both latest and slightly older model versions
- Cost Considerations: More models = higher cost but better coverage
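Taken together, a selection might look like the following illustrative configuration, written here as plain data. The field names and model identifiers are assumptions made for this example, not Docubat's actual schema; the point is that trial count, and therefore cost, grows multiplicatively.

```python
# Hypothetical audit configuration expressed as plain data. Field names and
# model identifiers are illustrative; configure the real values in Docubat.
audit_config = {
    "task": "Create a new user account using our API ...",
    "languages": ["python", "javascript", "java"],   # start with 2-3 key languages
    "models": ["gpt-4", "claude"],                   # models your users actually rely on
    "max_attempts": 3,                               # retries per language/model pair
}

# Every language is paired with every model, so coverage and cost scale together:
combinations = len(audit_config["languages"]) * len(audit_config["models"])
print(combinations)  # 6
```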
Interpreting Results
Success Metrics
Audit results include several key metrics:
- Overall Success Rate: Percentage of language/model combinations that succeeded
- Language-Specific Success: How well each programming language performed
- Model-Specific Success: How different AI models performed
- Error Patterns: Common failure reasons across attempts
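If you script against exported results, these metrics reduce to simple counting over the individual trials. A rough sketch, assuming a hypothetical list of trial records:

```python
# Illustrative metric computation over hypothetical trial records.
from collections import defaultdict

trials = [
    {"language": "python", "model": "gpt-4",  "success": True},
    {"language": "python", "model": "claude", "success": True},
    {"language": "java",   "model": "gpt-4",  "success": False},
    {"language": "java",   "model": "claude", "success": False},
]

overall = sum(t["success"] for t in trials) / len(trials)
print(f"Overall success rate: {overall:.0%}")   # 50%

by_language = defaultdict(list)
for t in trials:
    by_language[t["language"]].append(t["success"])

for language, outcomes in by_language.items():
    print(f"{language}: {sum(outcomes) / len(outcomes):.0%}")   # python: 100%, java: 0%
```

The overall rate alone can hide a language that always fails, which is why the per-language and per-model breakdowns deserve as much attention as the headline number.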
Detailed Trial Information
For each language/model combination, you'll see:
- Generated Code: The actual code the AI produced
- Execution Output: What happened when the code ran
- Error Messages: Any errors encountered during execution
- Token Usage: How many tokens the AI used (affects cost)
- Execution Time: How long the test took to run
Failure Analysis
When audits fail, look for patterns:
- Consistent Failures Across Languages: Likely a documentation issue
- Language-Specific Failures: Missing language-specific examples or guidance
- Model-Specific Failures: Some models may struggle with certain types of tasks
- Authentication Failures: Issues with API keys or access permissions
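A quick way to surface these patterns is to group failures by dimension: per language, per model, and per error reason. The trial records and error strings below are invented for illustration:

```python
# Illustrative pattern spotting: count failures per language and per reason
# to see whether a problem is documentation-wide or language-specific.
from collections import Counter

failed_trials = [
    {"language": "java",   "model": "gpt-4",  "error": "missing installation instructions"},
    {"language": "java",   "model": "claude", "error": "missing installation instructions"},
    {"language": "python", "model": "gpt-4",  "error": "401 Unauthorized"},
]

print(Counter(t["language"] for t in failed_trials).most_common())
# [('java', 2), ('python', 1)]  -> points at a Java-specific documentation gap
print(Counter(t["error"] for t in failed_trials).most_common())
# repeated auth errors would instead point at credentials or authentication docs
```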
Improving Your Documentation
Common Issues and Solutions
Issue: Low Success Rates Across All Languages
Solution: Review your core documentation for clarity and completeness
Issue: Specific Language Always Fails
Solution: Add language-specific examples and installation instructions
Issue: Authentication Errors
Solution: Verify API keys and provide clearer authentication documentation
Issue: Code Structure Failures
Solution: Add code examples and explain expected patterns
Iterative Improvement Process
1. Run Initial Audit: Get baseline results
2. Identify Patterns: Look for common failure reasons
3. Update Documentation: Make targeted improvements
4. Re-run Audit: Test your improvements
5. Repeat: Continue until you achieve acceptable success rates
Advanced Features
Scheduled Audits
Set up recurring audits to:
- Catch documentation drift over time
- Verify that documentation changes don't cause previously passing audits to fail
- Monitor how AI model improvements affect your results
Team Collaboration
- Shared Configurations: Team members can collaborate on audit setups
- Results Sharing: Share audit results with stakeholders
- Role-Based Access: Control who can view, edit, or run audits
Custom Validation
Advanced audit configurations can include:
- Code Structure Requirements: Specify patterns the generated code must follow
- Performance Criteria: Test not just functionality but performance
- Security Validation: Ensure generated code follows security best practices
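As an illustration of what a code structure requirement can check, the sketch below validates generated Python using the standard ast module. The required function name and the try/except requirement are hypothetical examples of such rules, not Docubat's validation engine:

```python
# Illustrative structural validation: confirm that generated Python code
# defines a required function and contains explicit error handling.
import ast

def meets_structure_requirements(source: str, required_function: str) -> bool:
    tree = ast.parse(source)
    has_function = any(
        isinstance(node, ast.FunctionDef) and node.name == required_function
        for node in ast.walk(tree)
    )
    has_error_handling = any(isinstance(node, ast.Try) for node in ast.walk(tree))
    return has_function and has_error_handling

generated = """
def create_user(name, email, password):
    try:
        return 42
    except ValueError:
        return None
"""
print(meets_structure_requirements(generated, "create_user"))  # True
```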
Best Practices
Documentation Preparation
- Keep Documentation Current: Outdated docs lead to failed audits
- Provide Complete Examples: Include full, working code examples
- Test Documentation Manually: Ensure humans can follow your docs successfully
- Include Error Scenarios: Document what happens when things go wrong
Audit Design
- Start Simple: Begin with basic tasks before testing complex scenarios
- Test Incrementally: Build up complexity gradually
- Focus on User Journeys: Test the paths real developers will take
- Consider Edge Cases: Include both happy path and error scenarios
Result Analysis
- Look for Patterns: Don't focus on individual failures
- Consider Your Audience: Weight results based on your actual user base
- Track Trends: Monitor how results change over time
- Act on Results: Use audit feedback to actually improve documentation
Next Steps
- Review our Getting Started guide for step-by-step setup instructions
- Check out Pricing to understand audit costs
- Start with a simple audit to get familiar with the platform
Need help interpreting your audit results? Contact us at lets-get-jam@gmail.com.