URL: https://deepwiki.com/Uberi/speech_recognition/6-development-guide

Development Guide

This guide provides essential information for developers who want to contribute to or extend the SpeechRecognition library. It covers the development environment setup, testing framework, project structure, and build/deployment process. For information about using the library, see Speech Recognition Library Overview.

Setting Up a Development Environment

To contribute to the SpeechRecognition library, you'll need to set up a proper development environment with all the necessary dependencies.

Prerequisites

  • Python 3.9 or newer
  • Git
  • FFmpeg (for Whisper support)
  • System libraries for audio support, which vary by operating system (on Ubuntu, the CI environment installs portaudio19-dev for PyAudio)
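
The prerequisites above can be checked with a short preflight script before installing anything. This is a minimal sketch and not part of the repository: the function name check_prerequisites is hypothetical, and it assumes the Git and FFmpeg executables are named git and ffmpeg on your PATH.

```python
import shutil
import sys


def check_prerequisites(required_tools=("git", "ffmpeg"), min_python=(3, 9)):
    """Return a list of human-readable problems; an empty list means all checks passed.

    Hypothetical helper for illustration only.
    """
    problems = []

    # Compare the running interpreter against the minimum supported version.
    if sys.version_info < min_python:
        problems.append(
            "Python {}.{} or newer is required, found {}.{}".format(
                min_python[0], min_python[1],
                sys.version_info[0], sys.version_info[1],
            )
        )

    # shutil.which returns None when an executable is not found on PATH.
    for tool in required_tools:
        if shutil.which(tool) is None:
            problems.append("{} was not found on PATH".format(tool))

    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("missing prerequisite:", problem)
```

The script only reports what it finds; installing the missing pieces is still a manual, platform-specific step.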

Installation for Development

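  1. Clone the repository from https://github.com/Uberi/speech_recognition and change into its directory.
  2. Install the package in development (editable) mode with all extras. The extras groups are defined in setup.cfg under [options.extras_require].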

The development setup installs all optional dependencies to ensure you can test all recognition services.


Sources: setup.py setup.cfg .github/workflows/unittests.yml

Project Structure

The SpeechRecognition library is organized into several key components, with recognizers implemented as separate modules.


Sources: setup.py .github/workflows/unittests.yml .github/workflows/lint.yml .github/workflows/rstcheck.yml

Testing Framework

The SpeechRecognition library uses pytest for testing. Tests are located in the tests/ directory and are organized by functionality.

Running Tests

To run the test suite, use the pytest invocation from the CI workflow (.github/workflows/unittests.yml). It executes both the tests in the tests/ directory and the doctests in the recognizers modules.

Test Environment

The CI pipeline tests the library on multiple Python versions and operating systems to ensure compatibility:

Python Version    Operating Systems
3.9               Ubuntu
3.10              Ubuntu
3.11              Ubuntu, Windows
3.12              Ubuntu
3.13              Ubuntu

Adding New Tests

When adding new functionality or fixing bugs, you should also add appropriate tests:

  1. For recognizers, add doctests within the module
  2. For core functionality, add unit tests in the tests/ directory
  3. Ensure tests run on all supported platforms
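
To illustrate item 1, a doctest lives in the docstring of the function it exercises. The sketch below uses a hypothetical helper, normalize_transcript, which is not part of the library; it only shows the shape of a doctest and one way to run it with the standard doctest module.

```python
import doctest


def normalize_transcript(text):
    """Collapse runs of whitespace and lowercase a transcript.

    Hypothetical helper, used here only to carry the doctests.

    >>> normalize_transcript("  Hello   World ")
    'hello world'
    >>> normalize_transcript("")
    ''
    """
    # str.split() with no arguments splits on any whitespace run and
    # drops leading and trailing whitespace.
    return " ".join(text.split()).lower()


if __name__ == "__main__":
    # Run every doctest found in this module and print a summary.
    print(doctest.testmod())
```

pytest can collect examples like these alongside regular tests when doctest collection is enabled (its --doctest-modules option), so a recognizer's documentation and its tests stay in one place.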

Sources: .github/workflows/unittests.yml:59-61

Adding New Recognizers

The SpeechRecognition library is designed to be extensible. You can add support for new speech recognition services by creating a new recognizer module.

Recognizer Module Structure

Each recognizer typically includes:

  1. A main recognition function (e.g., recognize_service_name)
  2. Helper functions for audio conversion and API communication
  3. Error handling for service-specific exceptions

Dependencies Management

New recognizers often require additional dependencies:

  1. Add them to setup.cfg under [options.extras_require] with an appropriate group name
  2. Update CI workflows to include testing for the new dependencies
  3. Document the new dependencies in the README
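
Because extras are optional, a recognizer cannot assume its dependency is installed. One common pattern, sketched here under assumptions, is to import the dependency only when the recognizer is called and to turn an ImportError into a message that names the extras group to install. The helper name load_optional_dependency, the module name example_sdk, and the extras group example-service are hypothetical, and SetupError is a stand-in exception defined for this sketch.

```python
import importlib


class SetupError(Exception):
    """Stand-in error for a missing optional dependency."""


def load_optional_dependency(module_name, extra_name):
    """Import module_name lazily, or explain which extras group provides it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise SetupError(
            "missing {} module: install it with "
            "`pip install SpeechRecognition[{}]`".format(module_name, extra_name)
        ) from exc


def recognize_with_example_sdk(wav_bytes):
    """Hypothetical recognizer entry point that needs an optional SDK."""
    # The import happens at call time, so users who never call this
    # recognizer do not need the SDK installed.
    sdk = load_optional_dependency("example_sdk", "example-service")
    return sdk.transcribe(wav_bytes)  # hypothetical SDK call
```

With this shape, importing the package never fails because of an extra that was not installed; the error surfaces only when the affected recognizer is used.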

Sources: setup.cfg:1-26

Build and Deployment Process

The SpeechRecognition library uses standard Python packaging tools for building and distribution.

Building the Package

To build the package, run make distribute. This creates both a source distribution and a wheel in the dist/ directory.

Publishing to PyPI

To publish a new release:

  1. Update the version number in speech_recognition/__init__.py
  2. Build the package with make distribute
  3. Publish to PyPI with make publish
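
Step 1 is easy to forget, so a small pre-release check can help. The following is a minimal sketch, not part of the repository: it assumes the version is stored as a string assignment to __version__ and that versions are dotted integers. The function names read_version and is_newer are hypothetical.

```python
import re

# Matches a line such as: __version__ = "1.2.3"
VERSION_PATTERN = re.compile(
    r'^__version__\s*=\s*["\']([^"\']+)["\']', re.MULTILINE
)


def read_version(init_source):
    """Extract the version string from the text of an __init__.py file."""
    match = VERSION_PATTERN.search(init_source)
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)


def is_newer(candidate, released):
    """Compare dotted numeric versions component by component."""
    def as_tuple(version):
        return tuple(int(part) for part in version.split("."))

    # Tuples compare element-wise, so (3, 10, 0) > (3, 9, 9).
    return as_tuple(candidate) > as_tuple(released)
```

A release script could read the file, call read_version on its contents, and stop before building if is_newer returns False against the last published version.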

Sources: Makefile:10-17 setup.py:37-78

Code Quality and Standards

The project enforces code quality through automated linting and documentation checks.

Linting

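The project uses flake8 for linting, with three checks ignored: E501 (line too long), E701 (multiple statements on one line), and W503 (line break before a binary operator). The lint command is defined in the Makefile and also runs in the lint.yml workflow.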

Documentation Standards

Documentation is written in reStructuredText format and checked with rstcheck, which also runs in the rstcheck.yml workflow.

For code documentation:

  • Use docstrings for all public functions, classes, and methods
  • Include examples in docstrings where appropriate
  • Keep documentation up-to-date with code changes

Sources: Makefile:1-9 .github/workflows/lint.yml .github/workflows/rstcheck.yml

Continuous Integration

The project uses GitHub Actions for continuous integration, running tests and code quality checks on each push and pull request.

Workflow Configuration

The CI workflow is defined in the following files:

  • .github/workflows/unittests.yml: Runs tests on multiple Python versions and platforms
  • .github/workflows/lint.yml: Runs flake8 for code quality
  • .github/workflows/rstcheck.yml: Validates RST documentation

System Dependencies in CI

The CI environment installs system dependencies required for testing:

  • libpulse-dev, libasound2-dev: Required for PocketSphinx
  • portaudio19-dev: Required for PyAudio
  • ffmpeg: Required for Whisper

Sources: .github/workflows/unittests.yml:11-61 .github/workflows/lint.yml:11-18 .github/workflows/rstcheck.yml:11-18

Contribution Guidelines

When contributing to the SpeechRecognition library, please follow these guidelines:

  1. Use Issues: Create an issue describing the bug or feature before submitting a pull request
  2. Testing: Ensure all tests pass and add new tests for new functionality
  3. Documentation: Update documentation to reflect changes
  4. Code Style: Follow existing code style and pass linting checks
  5. Commit Messages: Write clear, descriptive commit messages
  6. Backward Compatibility: Maintain backward compatibility when possible

Pull Request Process

  1. Fork the repository
  2. Create a branch for your changes
  3. Make changes and add tests
  4. Ensure all tests and quality checks pass
  5. Submit a pull request referencing the issue

Sources: .github/workflows/unittests.yml Makefile