Server Quality Checklist

Profile completion
A complete profile improves this server's visibility in search results.

Latest release: v1.0.0
Disambiguation: 5/5
Each tool has a clearly distinct purpose targeting different media types: audio, images, and videos. There is no overlap in functionality, as they handle separate input formats with similar analysis capabilities but different domains.
Naming Consistency: 5/5
All tool names follow a consistent pattern of 'media_type_recognition' using snake_case. This predictable naming scheme makes it easy to understand what each tool does based on its name alone.
Tool Count: 3/5
With only 3 tools, the server feels somewhat thin for a video recognition domain, as it lacks operations like video editing, frame extraction, or metadata retrieval. However, the core recognition functions for audio, images, and videos are covered, making it borderline appropriate.
Completeness: 3/5
The server provides basic recognition for three media types but lacks comprehensive coverage for video processing. There are no tools for operations like video segmentation, object tracking, or format conversion, which are common in video recognition workflows, leaving notable gaps.
Average 2.9/5 across 3 of 3 tools scored.
See the Tool Scores section below for per-tool breakdowns.
- No issues in the last 6 months
- 0 commits in the last 12 weeks
- No stable releases found
- No critical vulnerability alerts
- No high-severity vulnerability alerts
- No code scanning findings
- CI status not available
This repository is licensed under MIT License.
This repository includes a README.md file.
No tool usage detected in the last 30 days. Usage tracking helps demonstrate server value.
Tip: use the "Try in Browser" feature on the server page to seed initial usage.
Add a glama.json file to provide metadata about your server.
This server has been verified by its author.
Add related servers to improve discoverability.

Tool Scores

audio_recognition
Behavior: 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions 'analyze and transcribe' but fails to describe key traits such as processing time, error handling, output format, or any limitations (e.g., file size, supported audio formats). This leaves significant gaps for a tool that performs AI-based analysis.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness: 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's function without unnecessary words. It is appropriately sized and front-loaded, making it easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness: 2/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of AI-based audio analysis, no annotations, and no output schema, the description is incomplete. It lacks details on behavioral aspects, output structure, and usage context, which are critical for an agent to effectively invoke this tool without trial and error.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters: 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, with clear descriptions for all parameters (filepath, modelname, prompt). The description adds no additional semantic context beyond what the schema provides, such as examples or constraints, so it meets the baseline for high schema coverage without compensating value.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose: 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'analyze and transcribe' and the resource 'audio' with the technology 'Google Gemini AI', making the purpose evident. However, it doesn't explicitly differentiate from sibling tools like 'image_recognition' or 'video_recognition' beyond the audio focus, which prevents a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines: 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like sibling tools or other audio processing methods. It lacks context about use cases, prerequisites, or exclusions, leaving the agent to infer usage from the name alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

image_recognition
Behavior: 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. While 'analyze and describe' implies a read-only operation, it doesn't specify whether this requires API keys, has rate limits, handles errors, or what the output format looks like. For a tool with no annotations and no output schema, this leaves significant behavioral gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness: 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise: a single sentence that states the tool's purpose without unnecessary words. It is front-loaded with the core functionality, and every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness: 2/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given that there are no annotations, no output schema, and this is an AI analysis tool with potential behavioral complexities, the description is insufficiently complete. It doesn't explain what kind of analysis or description will be returned, doesn't mention authentication requirements for Google Gemini AI, and provides no context about limitations or error handling.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters: 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all three parameters (filepath, modelname, prompt) with clear descriptions. The tool description adds no additional parameter semantics beyond what's in the schema. According to the rules, when schema coverage is high (>80%), the baseline score is 3 even with no param info in the description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
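
The "schema description coverage" figure cited above can be illustrated with a minimal sketch. It assumes coverage means the fraction of top-level input-schema properties that carry a non-empty description; the review does not state the exact formula, so this definition, the helper name `description_coverage`, and the description strings in `example_schema` are all hypothetical. Only the parameter names (filepath, modelname, prompt) and the 80% threshold come from the review.

```python
from typing import Any


def description_coverage(input_schema: dict[str, Any]) -> float:
    """Fraction of top-level properties that have a non-empty description.

    Assumed definition; returns 0.0 when the schema declares no properties.
    """
    properties = input_schema.get("properties", {})
    if not properties:
        return 0.0
    described = sum(
        1
        for spec in properties.values()
        if isinstance(spec, dict) and str(spec.get("description", "")).strip()
    )
    return described / len(properties)


# Hypothetical schema: parameter names are from the review,
# the description strings are invented for illustration.
example_schema = {
    "type": "object",
    "properties": {
        "filepath": {"type": "string", "description": "Path to the media file."},
        "modelname": {"type": "string", "description": "Gemini model to use."},
        "prompt": {"type": "string", "description": "Instruction for the analysis."},
    },
}

coverage = description_coverage(example_schema)
# The review treats coverage above 80% as "high".
high_coverage = coverage > 0.8
```

With every parameter described, coverage is 1.0, which matches the 100% reported for each tool. A description that adds nothing beyond the schema then stays at the baseline score.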
Purpose: 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Analyze and describe images using Google Gemini AI'. It specifies the verb ('analyze and describe'), resource ('images'), and technology ('Google Gemini AI'), making the purpose unambiguous. However, it doesn't explicitly differentiate from sibling tools like audio_recognition or video_recognition, which would require mentioning it's specifically for images.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines: 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention sibling tools (audio_recognition, video_recognition) or any context for choosing this specific image analysis tool over others. There's no information about prerequisites, limitations, or typical use cases.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
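
The missing sibling guidance noted above can be checked mechanically. The sketch below is a crude substring heuristic, not Glama's scoring method; the function name `missing_sibling_mentions` is hypothetical. The description string and sibling tool names are the ones quoted in this review.

```python
def missing_sibling_mentions(description: str, siblings: list[str]) -> list[str]:
    """Return the sibling tool names that the description never mentions."""
    lowered = description.lower()
    return [name for name in siblings if name.lower() not in lowered]


# Description and sibling names as quoted in the review.
description = "Analyze and describe images using Google Gemini AI"
siblings = ["audio_recognition", "video_recognition"]

missing = missing_sibling_mentions(description, siblings)
```

Here `missing` contains both sibling names, which is consistent with the low Usage Guidelines score: the description gives an agent nothing to tell it when one of the other tools is the better choice.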

video_recognition
Behavior: 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states the tool analyzes and describes videos but doesn't mention critical behavioral aspects like rate limits, authentication requirements, file size limits, supported video formats, processing time, or error handling. The description is too vague about what 'analyze and describe' entails operationally.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness: 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that states the core functionality without unnecessary words. It's appropriately sized and front-loaded with the essential information, making it easy for an agent to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness: 2/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of video analysis (which typically involves format handling, processing time, and potential errors), no annotations, and no output schema, the description is insufficient. It doesn't explain what the tool returns, how to interpret results, or any operational constraints, leaving significant gaps for an AI agent to use it effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters: 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all three parameters thoroughly. The description adds no additional parameter semantics beyond what's in the schema, such as explaining how the prompt interacts with video analysis or model selection trade-offs. Baseline 3 is appropriate when schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose: 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose as analyzing and describing videos using Google Gemini AI, which is specific (verb+resource) and distinguishes it from sibling tools like audio_recognition and image_recognition. However, it doesn't explicitly mention video-specific capabilities beyond the name, leaving some ambiguity about whether it handles all video formats or specific features.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines: 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus its siblings (audio_recognition, image_recognition). It doesn't mention prerequisites, limitations, or alternative scenarios, leaving the agent to infer usage based on tool names alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

GitHub Badge

Glama performs regular codebase and documentation scans to:

- Confirm that the MCP server is working as expected.
- Confirm that there are no obvious security issues.
- Evaluate tool definition quality.

Our badge communicates server capabilities, safety, and installation instructions.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/mario-andreschak/mcp_video_recognition'
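
The same request can be made from Python using only the standard library. This is a minimal sketch: it assumes the endpoint returns a JSON body, which the page does not document, so the response is returned unparsed beyond `json.loads`. The helper names `build_server_url` and `fetch_server` are hypothetical; only the URL pattern comes from the curl command above.

```python
import json
import urllib.request
from typing import Any
from urllib.parse import quote

# Base path taken from the curl example; everything else is illustrative.
BASE_URL = "https://glama.ai/api/mcp/v1/servers"


def build_server_url(owner: str, repo: str) -> str:
    """Build the directory API URL for one server, escaping each path segment."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_server(owner: str, repo: str, timeout: float = 10.0) -> Any:
    """GET the server record and decode it, assuming a JSON response body."""
    request = urllib.request.Request(
        build_server_url(owner, repo),
        headers={"Accept": "application/json"},
        method="GET",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_server("mario-andreschak", "mcp_video_recognition")` issues the same GET as the curl command. Inspect the returned object before relying on any particular field, since the response shape is not described here.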

If you have feedback or need assistance with the MCP directory API, please join our Discord server.

URL: https://glama.ai/mcp/servers/mario-andreschak/mcp_video_recognition/score