Last indexed: 19 April 2025 (0747dc)

Development Guide

This guide provides essential information for developers who want to contribute to or extend the SpeechRecognition library. It covers the development environment setup, testing framework, project structure, and build/deployment process. For information about using the library, see Speech Recognition Library Overview.

Setting Up a Development Environment

To contribute to the SpeechRecognition library, set up a development environment with the prerequisites and dependencies described below.

Prerequisites

Python 3.9 or newer
Git
FFmpeg (for Whisper support)
Operating-system-specific system packages (for audio support)
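The prerequisites above can be checked with a short script before installing anything. The following is a minimal sketch, not part of the project: the function name check_prerequisites is hypothetical, and it assumes that Git and FFmpeg are exposed on PATH as git and ffmpeg. It does not check the operating-system-specific audio packages.

```python
import shutil
import sys

# Minimum interpreter version, taken from the prerequisites list.
MIN_PYTHON = (3, 9)

# Command-line tools from the prerequisites list.
# Assumption: the executables are named "git" and "ffmpeg" on PATH.
REQUIRED_TOOLS = ("git", "ffmpeg")


def check_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return a list of problems; an empty list means every check passed.

    `version_info` and `which` are parameters only so the check can be
    exercised without changing the real environment.
    """
    problems = []
    found = tuple(version_info[:2])
    if found < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]} or newer is required, "
            f"found {found[0]}.{found[1]}"
        )
    for tool in REQUIRED_TOOLS:
        # shutil.which returns None when the executable is not on PATH.
        if which(tool) is None:
            problems.append(f"{tool} was not found on PATH")
    return problems
```

Calling check_prerequisites() with no arguments inspects the current interpreter and PATH; printing each returned string gives a quick report of what is missing.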

Installation for Development

Clone the repository, then install the package in development mode (an editable install) with all extras.

The development setup installs all optional dependencies to ensure you can test all recognition services.

Sources: setup.py setup.cfg .github/workflows/unittests.yml

Project Structure

The SpeechRecognition library is organized into several key components, with recognizers implemented as separate modules.

Sources: setup.py .github/workflows/unittests.yml .github/workflows/lint.yml .github/workflows/rstcheck.yml

Testing Framework

The SpeechRecognition library uses pytest for testing. Tests are located in the tests/ directory and are organized by functionality.

Running Tests

Running the test suite with pytest executes both the tests in the tests/ directory and the doctests in the recognizer modules.
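One way to drive such a run from Python is sketched below. This is an illustration under assumptions, not the project's documented command: the target paths tests/ and speech_recognition/recognizers/ are guesses at the layout, and the helper names are hypothetical. The --doctest-modules flag is pytest's built-in option for collecting doctests from Python modules.

```python
import subprocess
import sys


def build_test_command(targets=("tests/", "speech_recognition/recognizers/")):
    """Build a pytest invocation that also collects doctests.

    Assumption: the default target paths match the repository layout.
    """
    # Running pytest through the current interpreter keeps the run inside
    # the active virtual environment.
    return [sys.executable, "-m", "pytest", "--doctest-modules", *targets]


def run_tests():
    """Run the suite from the repository root and return pytest's exit code."""
    return subprocess.run(build_test_command()).returncode
```

Passing a narrower tuple, for example build_test_command(("tests/",)), limits the run to the unit tests.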

Test Environment

The CI pipeline tests the library on multiple Python versions and operating systems to ensure compatibility:

Python Version	Operating Systems
3.9	Ubuntu
3.10	Ubuntu
3.11	Ubuntu, Windows
3.12	Ubuntu
3.13	Ubuntu

Adding New Tests

When adding new functionality or fixing bugs, you should also add appropriate tests:

For recognizers, add doctests within the module
For core functionality, add unit tests in the tests/ directory
Ensure tests run on all supported platforms
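To illustrate the doctest style mentioned above, here is a minimal sketch. The helper normalize_transcript is hypothetical and does not exist in the library; it only shows how examples embedded in a docstring double as tests, using the standard-library doctest module.

```python
import doctest


def normalize_transcript(text):
    """Collapse runs of whitespace in a transcript and strip both ends.

    Hypothetical helper, used only to show the doctest format.

    >>> normalize_transcript("  hello   world ")
    'hello world'
    >>> normalize_transcript("")
    ''
    """
    return " ".join(text.split())


def run_doctests():
    """Run the docstring examples above and return (failed, attempted)."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # Supplying globs explicitly makes the examples resolvable regardless
    # of how this file was loaded.
    globs = {"normalize_transcript": normalize_transcript}
    for test in finder.find(normalize_transcript, "normalize_transcript", globs=globs):
        runner.run(test)
    results = runner.summarize(verbose=False)
    return results.failed, results.attempted
```

When the test runner is configured to collect doctests, examples like these are picked up automatically, so no separate test file is needed for small, pure helpers.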

Sources: .github/workflows/unittests.yml (lines 59-61)

Adding New Recognizers

The SpeechRecognition library is designed to be extensible. You can add support for new speech recognition services by creating a new recognizer module.

Recognizer Module Structure

Each recognizer typically includes:

A main recognition function (e.g., recognize_service_name)
Helper functions for audio conversion and API communication
Error handling for service-specific exceptions
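The three parts listed above might fit together as in the following sketch. Everything here is illustrative: the service, its JSON response shape, and the send callable are hypothetical, the function signature is simplified compared with the library's real recognizers, and the two exception classes are local stand-ins for the library's RequestError and UnknownValueError so that the example runs without the package installed.

```python
import json


class RequestError(Exception):
    """Stand-in for the library's RequestError (service unreachable or invalid reply)."""


class UnknownValueError(Exception):
    """Stand-in for the library's UnknownValueError (speech could not be transcribed)."""


def audio_to_wav_bytes(audio_data, sample_rate=16000):
    """Helper: convert audio to 16-bit WAV bytes at the rate the service expects.

    Assumption: the hypothetical service wants 16 kHz, 16-bit audio.
    """
    return audio_data.get_wav_data(convert_rate=sample_rate, convert_width=2)


def parse_response(body):
    """Helper: extract the transcript from a hypothetical JSON response body."""
    try:
        payload = json.loads(body)
    except ValueError as exc:
        raise RequestError(f"service returned invalid JSON: {exc}") from exc
    transcript = payload.get("transcript", "") if isinstance(payload, dict) else ""
    if not transcript:
        # An empty result means the service understood nothing in the audio.
        raise UnknownValueError()
    return transcript


def recognize_service_name(audio_data, send):
    """Main entry point: convert audio, call the service, map failures to errors.

    `send` is a hypothetical callable that takes WAV bytes and returns the
    response body as text; a real module would perform the HTTP call itself.
    """
    wav_bytes = audio_to_wav_bytes(audio_data)
    try:
        body = send(wav_bytes)
    except OSError as exc:  # covers timeouts and refused connections
        raise RequestError(f"could not reach service: {exc}") from exc
    return parse_response(body)
```

Keeping conversion and response parsing in separate helpers makes each piece testable without network access, which suits the doctest-based approach used for recognizers.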

Dependencies Management

New recognizers often require additional dependencies:

Add them to setup.cfg under [options.extras_require] with an appropriate group name
Update CI workflows to include testing for the new dependencies
Document the new dependencies in the README
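The shape of an extras group can be seen by parsing a small configuration fragment. The sketch below is an assumption-laden illustration: the group name exampleservice and the requirement example-service-sdk are placeholders, not entries from the project's setup.cfg, and read_extras is a hypothetical helper built on the standard-library configparser.

```python
import configparser

# Hypothetical setup.cfg fragment; the group name and requirement are placeholders.
SAMPLE_SETUP_CFG = """
[options.extras_require]
exampleservice =
    example-service-sdk
"""


def read_extras(cfg_text):
    """Return a mapping from extras group name to its requirement strings."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    section = "options.extras_require"
    if not parser.has_section(section):
        return {}
    return {
        # Each group holds one requirement per indented line.
        group: [line.strip() for line in value.splitlines() if line.strip()]
        for group, value in parser.items(section)
    }
```

A helper like this can also be pointed at the real file's contents to confirm that a newly added group parses as intended before the CI workflows are updated.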

Sources: setup.cfg (lines 1-26)

Build and Deployment Process

The SpeechRecognition library uses standard Python packaging tools for building and distribution.

Building the Package

To build the package, run make distribute. This creates both a source distribution and a wheel in the dist/ directory.

Publishing to PyPI

To publish a new release:

Update the version number in speech_recognition/__init__.py
Build the package with make distribute
Publish to PyPI with make publish
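Before building a release, it can help to confirm that the version bump actually landed. The sketch below assumes the version is stored as a plain __version__ = "..." assignment in speech_recognition/__init__.py; the helper read_version is hypothetical and works on the file's text rather than importing the package.

```python
import re

# Matches a top-level assignment such as: __version__ = "1.2.3"
VERSION_PATTERN = re.compile(
    r'^__version__\s*=\s*["\']([^"\']+)["\']', re.MULTILINE
)


def read_version(init_source):
    """Return the version string found in the given module source text."""
    match = VERSION_PATTERN.search(init_source)
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)
```

Reading the file's text and passing it to read_version avoids importing the package, so the check works even when optional dependencies are not installed.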

Sources: Makefile (lines 10-17), setup.py (lines 37-78)

Code Quality and Standards

The project enforces code quality through automated linting and documentation checks.

Linting

The project uses flake8 for linting, with three checks ignored: E501 (line too long), E701 (multiple statements on one line) and W503 (line break before a binary operator).
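An equivalent flake8 invocation can be assembled from Python as sketched below. The ignored codes come from the text above; the target paths speech_recognition and tests are assumptions about what the project lints, and the helper names are hypothetical.

```python
import subprocess
import sys

# Error codes the project ignores, as listed in this section.
IGNORED_CHECKS = ("E501", "E701", "W503")


def build_lint_command(paths=("speech_recognition", "tests")):
    """Build a flake8 invocation that ignores the project's excluded checks.

    Assumption: the default paths are the directories worth linting.
    """
    ignore_option = "--ignore=" + ",".join(IGNORED_CHECKS)
    return [sys.executable, "-m", "flake8", ignore_option, *paths]


def run_lint():
    """Run flake8 from the repository root and return its exit code."""
    return subprocess.run(build_lint_command()).returncode
```

Note that flake8's --ignore option replaces its default ignore list rather than extending it, so only the three codes given here are suppressed.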

Documentation Standards

Documentation is written in reStructuredText and checked with rstcheck.

For code documentation:

Use docstrings for all public functions, classes, and methods
Include examples in docstrings where appropriate
Keep documentation up-to-date with code changes

Sources: Makefile (lines 1-9), .github/workflows/lint.yml, .github/workflows/rstcheck.yml

Continuous Integration

The project uses GitHub Actions for continuous integration, running tests and code quality checks on each push and pull request.

Workflow Configuration

The CI workflow is defined in the following files:

.github/workflows/unittests.yml: Runs tests on multiple Python versions and platforms
.github/workflows/lint.yml: Runs flake8 for code quality
.github/workflows/rstcheck.yml: Validates RST documentation

System Dependencies in CI

The CI environment installs system dependencies required for testing:

libpulse-dev, libasound2-dev: Required for PocketSphinx
portaudio19-dev: Required for PyAudio
ffmpeg: Required for Whisper

Sources: .github/workflows/unittests.yml (lines 11-61), .github/workflows/lint.yml (lines 11-18), .github/workflows/rstcheck.yml (lines 11-18)

Contribution Guidelines

When contributing to the SpeechRecognition library, please follow these guidelines:

Use Issues: Create an issue describing the bug or feature before submitting a pull request
Testing: Ensure all tests pass and add new tests for new functionality
Documentation: Update documentation to reflect changes
Code Style: Follow existing code style and pass linting checks
Commit Messages: Write clear, descriptive commit messages
Backward Compatibility: Maintain backward compatibility when possible

Pull Request Process

Fork the repository
Create a branch for your changes
Make changes and add tests
Ensure all tests and quality checks pass
Submit a pull request referencing the issue

Sources: .github/workflows/unittests.yml Makefile

URL: https://deepwiki.com/Uberi/speech_recognition/6-development-guide