URL: https://pypi.org/project/pythonaibrain/

pythonaibrain 1.1.9

pip install pythonaibrain

A versatile, plug-and-play Python AI toolkit for building offline intelligent assistants.

Maintainer: DivyanshuSinha

Meta
  • License Expression: LGPL-3.0-or-later AND AGPL-3.0-or-later
  • Author: Divyanshu Sinha
  • Tags: ai, artificial-intelligence, assistant, tts, stt, speech, nlp, nlp-library, text-to-speech, speech-to-text, object-detection, computer-vision, named-entity-recognition, ner, image-generation, offline-ai, math-ai, summarizer, memory
  • Requires: Python >=3.9
  • Provides-Extra: core, tts, stt, camera, itt, context, ner, memory, math, search, pptx, pdf, eye, summarizer, clse, all, dev, docs, zentraa

Project description

Pythonaibrain

Pythonaibrain is a versatile, plug-and-play Python package designed to help you build offline intelligent AI assistants and applications effortlessly. With modules covering speech recognition, text-to-speech, natural language understanding, and more, Pythonaibrain lets you create powerful AI solutions without deep expertise or complex setup. Whether you're a beginner or an experienced developer, get ready to bring your AI ideas to life quickly and efficiently.


Requirements

  • Python 3.9 or later
  • pip 23+

Installation

Minimal install (core package only)

pip install pythonaibrain==1.1.9

Install with specific modules

Pick only what you need:

# Text-to-Speech
pip install "pythonaibrain[tts]==1.1.9"

# Speech-to-Text
pip install "pythonaibrain[stt]==1.1.9"

# Camera + QR/barcode
pip install "pythonaibrain[camera]==1.1.9"

# Image-to-Text (OCR)
pip install "pythonaibrain[itt]==1.1.9"

# AI Brain + NLP
pip install "pythonaibrain[core]==1.1.9"

# Named-Entity Recognition
pip install "pythonaibrain[ner]==1.1.9"

# Memory management
pip install "pythonaibrain[memory]==1.1.9"

# Mathematical AI + CAS
pip install "pythonaibrain[math]==1.1.9"

# Internet Search
pip install "pythonaibrain[search]==1.1.9"

# PowerPoint extraction
pip install "pythonaibrain[pptx]==1.1.9"

# PDF extraction
pip install "pythonaibrain[pdf]==1.1.9"

# Real-time object detection (YOLOv8)
pip install "pythonaibrain[eye]==1.1.9"

# Text summarisation
pip install "pythonaibrain[summarizer]==1.1.9"

# ZENTRAA encrypted chat server + clients
pip install "pythonaibrain[zentraa]==1.1.9"

# CLSE - image generation (⚠ AGPL-3.0-or-later)
pip install "pythonaibrain[clse]==1.1.9"

Install multiple modules at once

pip install "pythonaibrain[core,tts,stt]==1.1.9"
pip install "pythonaibrain[core,ner,memory,summarizer]==1.1.9"

Install everything

pip install "pythonaibrain[all]==1.1.9"

Linux - System Dependencies

Some modules require native system libraries. Install them before running pip:

# STT (Speech-to-Text) - PortAudio
sudo apt install portaudio19-dev python3-pyaudio

# Camera / QR barcode - zbar
sudo apt install libzbar0

Post-install Steps

NLTK data (required by Brain, Context, NER, SummarizerAI)

import pyaitk
pyaitk.InstallNLTKData()

Or manually:

python -m nltk.downloader punkt punkt_tab averaged_perceptron_tagger stopwords wordnet words maxent_ne_chunker

spaCy model (required by NER, optional for Core)

python -m spacy download en_core_web_sm

STT offline engine (optional - pocketsphinx)

pip install pocketsphinx

Verify Installation

import pyaitk

print(pyaitk.get_version())  # 1.1.9

info = pyaitk.get_info()
print(info)

availability = pyaitk.check_module_availability()
for module, ok in availability.items():
    status = "✔" if ok else "✘"
    print(f"  {status} {module}")

Or from the terminal:

pythonaibrain --version
pythonaibrain --modules
pythonaibrain --info
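
For readers wondering how an availability check of this kind can work, here is a minimal standard-library sketch. It is not the implementation behind pyaitk.check_module_availability(); the helper name probe_modules and the module names passed to it are illustrative assumptions.

```python
from importlib.util import find_spec


def probe_modules(names):
    """Return {module_name: bool} without importing the modules themselves."""
    availability = {}
    for name in names:
        try:
            availability[name] = find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec raises when a parent package is missing or malformed.
            availability[name] = False
    return availability


# "json" ships with Python; the second name is deliberately bogus.
report = probe_modules(["json", "surely_not_installed_pkg_xyz"])
for module, ok in report.items():
    print(f"  {'ok' if ok else 'missing'}  {module}")
```

Looking up the import spec instead of importing keeps the check cheap, which matters when optional extras pull in heavy dependencies.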

Import Usage

Each module is imported directly by its submodule path:

import pyaitk.core
import pyaitk.TTS
import pyaitk.STT
import pyaitk.Camera
import pyaitk.NER
import pyaitk.Memory
import pyaitk.MathAI
import pyaitk.Search
import pyaitk.SummarizerAI
import pyaitk.CLSE
import pyaitk.eye
import pyaitk.ITT
import pyaitk.PPTExtract
import pyaitk.Grammar

License Notice

Component | License
Pythonaibrain (all modules) | LGPL-3.0-or-later
pyaitk.CLSE | AGPL-3.0-or-later

If you use pyaitk.CLSE in your application, the AGPL requires you to release your application's source code. See pyaitk/CLSE/LICENSE.txt for full terms.


About Pythonaibrain Package

The Pythonaibrain package consists of the pyaitk (Python AI Toolkit) module and the PyAgent modules. pyaitk provides the methods for creating AI models and chatbots, whereas PyAgent provides GUI and web support for interacting with those models and chatbots, along with control over your device. The package also ships with default pre-trained models.


pyaitk (Python AI ToolKit)

pyaitk provides a range of methods and functions for building an advanced AI.


How to import pyaitk

After installing pythonaibrain, import pyaitk with:

import pyaitk

Core Component

The central orchestration layer of the Pythonaibrain (pyaitk) framework. It wires together intent classification, neural chatbot training, memory, NER, translation, frame analysis, weather, and an optional quantized LLM, all under two primary entry points: Brain and AdvanceBrain.


What Is This?

pyaitk.core is the heart of the pyaitk package, providing:

  • Brain: intent-based chatbot with memory, NER, translation, frame classification, grammar correction, and TTS
  • AdvanceBrain: routes through a local quantized LLM (pythonaibrain-llm) for open-ended generation, with the same interface as Brain
  • VectorizerMode: switchable feature extraction: Binary BoW (NumPy), TF-IDF (scikit-learn), or Gensim TF-IDF
  • IntentsManager: load, save, and extend intents.json files dynamically
  • Frame classification: lightweight neural sentence-type classifier (Statement / Question / Command / Name / ...)
  • NER integration: entity extraction via the NER subsystem, with optional in-call training
  • Translation: GRU seq-to-seq translator (multilingual → English), trained once per process
  • Language detection: keyword-based classifier for English / Hindi / French / Spanish
  • Weather API: city-level weather lookup via OpenWeatherMap
  • configure(): load a .pbcfg file to override all settings before constructing any Brain
  • Structured logging: all subsystems use Python's logging module, with config-driven level and format

Installation

# Optional: AdvanceBrain LLM support
pip install pythonaibrain-llm

Set your OpenWeatherMap API key in a .env file:

weather_api_key=YOUR_KEY_HERE

Quick Start

from pyaitk.core import Brain

with Brain() as brain:
    brain.load()
    print(brain.process_messages("Hello"))

Configuration

Load a .pbcfg file before constructing any Brain to override all settings:

import pyaitk.core as core

core.configure("project.pbcfg")
brain = core.Brain()

configure() updates the global config singleton, applies logging settings, and returns the loaded AppConfig.


VectorizerMode - Feature Extraction

Controls how text is converted to feature vectors for the neural intent classifier.

Mode | Backend | Notes
BOW (default) | Pure NumPy | Binary Bag-of-Words; zero extra deps
TFIDF | scikit-learn | TF-IDF weighting via TfidfVectorizer
GENSIM | Gensim | Requires pip install gensim

from pyaitk.core import Brain, VectorizerMode

brain = Brain(vectorizer_mode=VectorizerMode.TFIDF)
brain = Brain(vectorizer_mode=VectorizerMode.GENSIM)
brain = Brain()  # default BOW
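
To make the default mode concrete, the following is a minimal sketch of binary Bag-of-Words in plain Python. It only illustrates the general technique; pyaitk's own tokenisation and vocabulary handling are not documented here, so the whitespace tokeniser and the helper names build_vocabulary and binary_bow are assumptions.

```python
def build_vocabulary(sentences):
    """Map each distinct lowercase token to a column index, in sorted order."""
    tokens = {tok for s in sentences for tok in s.lower().split()}
    return {tok: i for i, tok in enumerate(sorted(tokens))}


def binary_bow(sentence, vocabulary):
    """Return a 0/1 vector: 1 where a vocabulary token occurs in the sentence."""
    present = set(sentence.lower().split())
    vector = [0] * len(vocabulary)
    for tok in present:
        index = vocabulary.get(tok)
        if index is not None:  # tokens outside the vocabulary are ignored
            vector[index] = 1
    return vector


patterns = ["hello there", "what is the weather", "hello weather"]
vocab = build_vocabulary(patterns)
print(vocab)
# Repeated words still yield 1, and the unseen word "today" is dropped.
print(binary_bow("hello hello weather today", vocab))
```

Binary vectors record presence only, which is why this mode needs no extra dependencies; TF-IDF modes additionally weight tokens by how informative they are across the training patterns.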

Brain - Intent-Based Chatbot

Basic usage

from pyaitk.core import Brain

# Train from scratch
with Brain() as brain:
    brain.train()
    brain.save()
    print(brain.ask("What is the weather?"))

# Load a saved model
with Brain() as brain:
    brain.load()
    response = brain.process_messages("Tell me a joke", grammar=True)
    print(response)

Constructor parameters

Parameter | Type | Default | Description
intents_path | str or None | config | Path to intents.json
condition | bool or None | config | Enable dynamic intent learning from search results
download | bool or None | config | Auto-download NLTK data on init
memory_path | str or None | config | Path for memory persistence file
smart_memory | bool or None | config | Use SmartMemory (semantic search + clustering)
memory_fit_interval | int or None | config | Auto-fit SmartMemory every N memories
config | AppConfig or None | global | Override global config for this instance
vectorizer_mode | VectorizerMode | BOW | Feature extraction strategy
**function_mapping | Any | - | Map intent tags to Python callables

Methods

Method | Returns | Description
train() | None | Parse intents, build features, train neural model
load() | None | Load saved model from paths in config
save() | None | Save model weights and dimension file
process_messages(message, grammar, TTS) | str | Classify intent, respond, remember, optionally speak
talk(message, grammar, TTS) | str | Attempt web search first, fall back to process_messages
ask(query, TTS) | str | talk() + typewriter-style write() output
write(message, set_timer, TTS) | None | Print message character by character with optional TTS
translator(message) | str | Translate input to English via GRU seq-to-seq model
classify_language(message) | str | Detect language: english, hindi, french, spanish
predict_message_type(message) | str | Classify sentence type: Statement / Question / Command / ...
predict_entitie(message, train) | list | Extract named entities via NER pipeline
memorize_user_name(message) | None | Detect and store user name from message
recall_user_name() | str | Retrieve stored user name from memory
search_memory(query, top_k) | list | Semantic or substring search over conversation history
memory_intent(query) | str | Predict intent of query from stored patterns
memory_report() | Any | Return SmartMemory cluster/intent analysis report
export_memory_report(path) | bool | Export memory report as JSON
fit_memory() | bool | Manually retrain SmartMemory summarizer
is_loaded() / is_saved() / is_trained() | bool | State flags
count_query() | int | Total number of queries processed
pyai_say(*msg) | None | Print with PYAI : prefix

Function mapping

Map intent tags to Python callables; each one is called automatically when its intent is predicted:

import subprocess

from pyaitk.core import Brain

def open_calculator():
    # Windows-only: launches the system calculator
    subprocess.Popen("calc.exe")

with Brain(calculator=open_calculator) as brain:
    brain.load()
    brain.process_messages("open calculator")
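
The keyword-argument pattern above can be pictured with a small stand-alone dispatcher. This is a sketch of the general idea only: TinyDispatcher and its handle() method are hypothetical names, and how Brain actually stores and invokes its callables is not shown in this document.

```python
class TinyDispatcher:
    """Toy stand-in for an intent-driven assistant (not the pyaitk Brain)."""

    def __init__(self, **function_mapping):
        # Keyword names are intent tags; values are the callables to run.
        self._handlers = dict(function_mapping)

    def handle(self, predicted_tag):
        handler = self._handlers.get(predicted_tag)
        if handler is None:
            return f"no handler for {predicted_tag!r}"
        return handler()


def open_calculator():
    return "calculator opened"


dispatcher = TinyDispatcher(calculator=open_calculator)
print(dispatcher.handle("calculator"))
print(dispatcher.handle("greeting"))
```

Because the tag is passed as a keyword name, it has to be a valid Python identifier in this sketch; tags containing spaces or hyphens would need to be supplied through an unpacked dictionary instead.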

Memory

Brain uses SmartMemory by default (falls back to plain Memory if the summarizer module is absent). Every process_messages() call automatically stores the exchange.

# Semantic search over history
results = brain.search_memory("weather", top_k=3)

# Export memory analytics
brain.export_memory_report("memory_report.json")

# Manually trigger SmartMemory refit
brain.fit_memory()

AdvanceBrain - LLM-powered Brain

Routes responses through a local quantized LLM from pythonaibrain-llm. Falls back to the intent classifier when advance=False.

from pyaitk.core import AdvanceBrain

with AdvanceBrain() as brain:
    brain.load()
    print(brain.process_messages("What is Python?"))

    # Skip LLM, use intent classifier only
    print(brain.process_messages("Hello", advance=False))

AdvanceBrain has the same train(), load(), save(), translator(), classify_language(), predict_message_type(), and predict_entitie() methods as Brain.

Requires: pip install pythonaibrain-llm. The LLM is lazy-loaded on the first process_messages(advance=True) call.


IntentsManager - Dynamic Intent Management

from pyaitk.core import IntentsManager

im = IntentsManager("intents.json")

# Add or extend an intent
im.add_intent(
    tag="greeting",
    patterns=["Hi", "Hey there"],
    responses=["Hello!", "Hey! How can I help?"],
)

# Add a search-derived intent
im.add_search_intent("best Python books", ["Python Crash Course", "Fluent Python"])
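
The "add or extend" behaviour can be sketched with the standard json module. The file layout used here, a top-level "intents" list of objects with "tag", "patterns" and "responses" keys, is an assumption based on the arguments shown above, not a documented pyaitk schema, and the free function add_intent below is a hypothetical stand-in for the IntentsManager method.

```python
import json
import tempfile
from pathlib import Path


def add_intent(path, tag, patterns, responses):
    """Add a new intent, or extend an existing one with unseen entries."""
    path = Path(path)
    if path.exists():
        data = json.loads(path.read_text(encoding="utf-8"))
    else:
        data = {"intents": []}  # assumed layout, see the note above
    for intent in data["intents"]:
        if intent["tag"] == tag:
            intent["patterns"] += [p for p in patterns if p not in intent["patterns"]]
            intent["responses"] += [r for r in responses if r not in intent["responses"]]
            break
    else:
        # No intent with this tag yet: create it.
        data["intents"].append(
            {"tag": tag, "patterns": list(patterns), "responses": list(responses)}
        )
    path.write_text(json.dumps(data, indent=2), encoding="utf-8")
    return data


demo_path = Path(tempfile.mkdtemp()) / "intents.json"
add_intent(demo_path, "greeting", ["Hi"], ["Hello!"])
result = add_intent(demo_path, "greeting", ["Hi", "Hey there"], ["Hey! How can I help?"])
print(json.dumps(result, indent=2))
```

The second call extends the existing "greeting" entry and skips the duplicate "Hi" pattern, so repeated calls do not bloat the file.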

Utility Functions

Weather

from pyaitk.core import get_weather

weather = get_weather("London")  # e.g. "Clouds"

Requires weather_api_key in .env. Also available: longitude(city), latitude(city), humidity(city), temperature(city).

Frame classification

from pyaitk.core import predict_frame, VectorizerMode

frame = predict_frame("What time is it?")      # -> "Question"
frame = predict_frame("Open the door")         # -> "Command"
frame = predict_frame("The sky is blue")       # -> "Statement"
frame = predict_frame("My name is Divyanshu")  # -> "Name"

Supported frame types: Statement, Question, Command, Answer, Name, Know, Shutdown, Make Dir, Start.

Translation

from pyaitk.core import translate_to_en

text = translate_to_en("tum kaise ho")  # -> "how are you"
text = translate_to_en("hola como estas")

Model trains once per process on a built-in multilingual corpus (Hindi, French, Spanish, English).

Language detection

from pyaitk.core import language_classifier

lang = language_classifier("tum kaise ho")     # -> "hindi"
lang = language_classifier("je m'appelle")     # -> "french"
lang = language_classifier("hola como estas")  # -> "spanish"
lang = language_classifier("Hello there")      # -> "english"

NER

from pyaitk.core import Brain

with Brain() as brain:
    brain.load()
    entities = brain.predict_entitie("Apple was founded by Steve Jobs.")
    print(entities)  # list of Entity objects

Or directly:

from pyaitk.core import predictNER

entities = predictNER("NASA launched Voyager 1.")
entities = predictNER("LeBron James plays for the Lakers.", train=True)  # retrain first

Full Example

from pyaitk.core import Brain, VectorizerMode, configure

# Optional: load a custom config before anything else
configure("myproject.pbcfg")

# Train and save
with Brain(vectorizer_mode=VectorizerMode.TFIDF) as brain:
    brain.train()
    brain.save()

# Load and interact
with Brain() as brain:
    brain.load()

    # Chat
    print(brain.ask("Tell me a joke"))
    print(brain.process_messages("What is the weather in Delhi?"))

    # Language tools
    print(brain.classify_language("tum kaise ho"))    # hindi
    print(brain.translator("mera naam ravi hai"))     # my name is ravi
    print(brain.predict_message_type("Shutdown /s"))  # Shutdown

    # NER
    entities = brain.predict_entitie("Elon Musk founded SpaceX.")
    print(entities)

    # Memory
    brain.memorize_user_name("My name is Divyanshu")
    print(brain.recall_user_name())
    results = brain.search_memory("joke", top_k=3)
    brain.export_memory_report("report.json")

    # Stats
    print(brain.count_query())
    print(brain.is_loaded(), brain.is_saved())

Architecture Overview

Brain / AdvanceBrain
├── ChatbotAssistant ← intent parse → feature matrix → PyTorch neural classifier
│   ├── VectorizerMode ← BOW (NumPy) | TF-IDF (sklearn) | Gensim TF-IDF
│   └── IntentsManager ← load/save/extend intents.json
├── Memory / SmartMemory ← conversation persistence + semantic search
├── NERPipeline ← named entity recognition (spaCy-based)
├── GrammarCorrector ← optional post-processing on responses
├── TTS (speak) ← optional audio output
├── Search ← fallback web search for unknown intents
├── FrameClassifier ← sentence-type classifier (singleton, trains once)
├── GRU Translator ← multilingual → English (singleton, trains once)
├── _PyAILLM ← lazy-loaded quantized LLM (AdvanceBrain only)
└── WeatherAPI ← OpenWeatherMap lookup

TTS - Text To Speech

A fully offline, cross-platform Text-to-Speech module built on the pyttsx3 backend. Works on Windows (SAPI5), macOS (NSSpeechSynthesizer), and Linux (eSpeak) without any network calls.

How to Use

1. One-shot (simplest)

from pyaitk.TTS import TTS

TTS().say("Hello, world!")

2. Context manager (recommended)

The engine is initialised once and shut down cleanly on exit, even if an exception is raised.

from pyaitk.TTS import TTS, TTSConfig

with TTS(TTSConfig(voice="zira", rate=160)) as tts:
    tts.say("Hello!")
    tts.say("How are you?")

3. Module-level convenience function

from pyaitk.TTS import speak

speak("Hello world!")
speak("Bonjour", voice="fiona", rate=140)

4. Save speech to a WAV file

from pyaitk.TTS import TTS, TTSConfig

cfg = TTSConfig(output_path="greeting.wav")
with TTS(cfg) as tts:
    saved_path = tts.save("Hello, this will be saved to a file.")
    print(f"Saved to: {saved_path}")

You can also override the path at call time:

with TTS() as tts:
    tts.save("Custom path example", path="custom_output.wav")

5. List available voices

from pyaitk.TTS import TTS

with TTS() as tts:
    for name in tts.available_voices():
        print(name)

6. Get platform-appropriate voice hints

from pyaitk.TTS import TTS

with TTS() as tts:
    print(tts.platform_voice_hints())
    # e.g. ['david', 'zira', 'mark'] on Windows

Configuration (TTSConfig)

All parameters are optional and fall back to values from your project config (config.tts.*).

Parameter | Type | Description
rate | int | Speech rate in words per minute
volume | float | Volume level from 0.0 (silent) to 1.0 (full)
voice | str | Voice name fragment (fuzzy matched)
default_text | str | Fallback text when say() is called with no argument
output_path | str or None | Default WAV output path for save()

cfg = TTSConfig(rate=180, volume=0.8, voice="samantha")

Voice Matching

Voices are resolved from a short fragment string using a ranked strategy:

  1. Direct substring match: "zira" matches "Microsoft Zira Desktop"
  2. Platform-hint-assisted match: uses known fragments for the current OS
  3. First installed voice: fallback with a warning logged
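
The three steps above can be sketched as a small resolver. This is one plausible reading of the ranked strategy, not the module's actual code: the function name resolve_voice is hypothetical, and step 2 is assumed to mean "try each known fragment for the current OS against the installed voices".

```python
import logging

logger = logging.getLogger(__name__)


def resolve_voice(fragment, installed, platform_hints):
    """Pick an installed voice name for a short fragment, in ranked order."""
    if not installed:
        raise ValueError("no voices installed")
    wanted = fragment.lower()

    # 1. Direct substring match against installed voice names.
    for name in installed:
        if wanted in name.lower():
            return name

    # 2. Hint-assisted match: try each known fragment for this OS in turn.
    for hint in platform_hints:
        for name in installed:
            if hint.lower() in name.lower():
                return name

    # 3. Last resort: first installed voice, with a warning.
    logger.warning("voice %r not found; using %r", fragment, installed[0])
    return installed[0]


voices = ["Microsoft David Desktop", "Microsoft Zira Desktop"]
print(resolve_voice("zira", voices, ["david", "zira", "mark"]))
print(resolve_voice("samantha", voices, ["mark", "zira"]))
print(resolve_voice("samantha", voices, []))
```

Falling back to the first installed voice rather than raising means a macOS-style fragment such as "samantha" still produces speech on Windows, at the cost of a logged warning.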

Platform voice hints:

Voice | Gender | Style | OS
David | Male | en-US | Windows
Mark | Male | en-US | Windows
Zira | Female | en-US | Windows
Alex | Male | en-US | macOS
Samantha | Female | en-US | macOS
Victoria | Female | en-US | macOS
Fred | Male | Robotic | macOS
Daniel | Male | en-GB | macOS
Fiona | Female | en-GB | macOS
English | - | Default eSpeak | Linux
English-US | - | American English | Linux
English-UK | - | British English | Linux
MB-EN1 | - | MBROLA English 1 | Linux
MB-FR1 | - | MBROLA French 1 | Linux

Examples Summary

# Quickest usage
from pyaitk.TTS import speak
speak("Hello!")

# Full control with context manager
from pyaitk.TTS import TTS, TTSConfig
with TTS(TTSConfig(voice="david", rate=175, volume=0.9)) as tts:
    tts.say("Line one.")
    tts.say("Line two.")

# Save to file
with TTS(TTSConfig(voice="samantha")) as tts:
    tts.save("Saved audio example.", path="demo.wav")

# List voices
with TTS() as tts:
    voices = tts.available_voices()
    print(voices)

STT - Speech-to-Text Module

A fully cross-platform Speech-to-Text library supporting both online (Google Speech Recognition) and offline (PocketSphinx) backends, with automatic engine selection based on network availability.

What Is This?

  • Dual engine support: Google Speech Recognition (online) and PocketSphinx (offline), selected automatically
  • Auto engine detection: checks network connectivity at runtime and picks the best available engine
  • Context manager interface: opens the microphone once and releases it cleanly on exit, even on exception
  • Ambient noise calibration: samples the noise floor before each listen to improve accuracy
  • Configurable capture parameters: energy threshold, pause detection, phrase time limits, and timeouts
  • Retry logic: automatically retries on transient service errors with configurable delay
  • Structured logging: all internal events use Python's logging module; no bare print statements in library code
  • Custom exception hierarchy: fine-grained errors (STTAudioError, STTRecognitionError, STTServiceError, STTEngineError) for clean error handling

Installation

pip install "pythonaibrain[stt]==1.1.9"

On Linux, you may also need:

sudo apt install portaudio19-dev python3-pyaudio

How to Use

1. One-shot (simplest)

from pyaitk.STT import STT

stt = STT()
text = stt.listen()
print(text)

2. Context manager (recommended)

The microphone is opened once and released cleanly on exit, even if an exception is raised.

from pyaitk.STT import STT

with STT() as stt:
    text = stt.listen()
    print(text)

3. Force a specific engine

from pyaitk.STT import STT, STTConfig, Engine

# Force offline mode
cfg = STTConfig(preferred_engine=Engine.POCKETSPHINX)
with STT(config=cfg) as stt:
    text = stt.listen()
    print(text)

# Force online Google mode
cfg = STTConfig(preferred_engine=Engine.GOOGLE)
with STT(config=cfg) as stt:
    text = stt.listen()
    print(text)

4. Check which engine is active

from pyaitk.STT import STT

with STT() as stt:
    print(stt.active_engine)  # Engine.GOOGLE or Engine.POCKETSPHINX
    text = stt.listen()

5. Custom configuration

from pyaitk.STT import STT, STTConfig

cfg = STTConfig(
    timeout=8.0,                 # wait up to 8 seconds for speech to start
    phrase_time_limit=15.0,      # cap each utterance at 15 seconds
    pause_threshold=1.0,         # 1 second of silence = end of phrase
    ambient_noise_duration=2.0,  # spend 2 seconds calibrating noise floor
    google_language="en-GB",     # use British English
    max_retries=5,               # retry up to 5 times on service errors
)

with STT(config=cfg) as stt:
    text = stt.listen()
    print(text)

6. Multi-turn listening loop

from pyaitk.STT import STT, STTAudioError, STTRecognitionError

with STT() as stt:
    while True:
        try:
            text = stt.listen()
            print(f"You said: {text}")
            if "stop" in text.lower():
                break
        except STTAudioError:
            print("No speech detected, trying again...")
        except STTRecognitionError:
            print("Couldn't understand that, please repeat.")

Configuration (STTConfig)

All parameters are optional and fall back to values from your project config (config.stt.*).

Parameter | Type | Description
energy_threshold | float or None | Mic sensitivity; None = auto-calibrate
dynamic_energy_threshold | bool | Continuously adjust threshold for ambient noise
pause_threshold | float | Seconds of silence that mark end of phrase
phrase_time_limit | float or None | Hard cap per utterance in seconds
timeout | float or None | Seconds to wait for speech to begin
ambient_noise_duration | float | Seconds spent sampling the noise floor before listening
preferred_engine | Engine or None | Force a specific engine; None = auto-detect
google_language | str | BCP-47 language tag for Google (e.g. "en-US", "fr-FR")
google_api_key | str or None | Google API key; None = free tier
sphinx_language | str | Language/model name for PocketSphinx
max_retries | int | Max retry attempts on transient service errors
retry_delay | float | Seconds to wait between retries
connectivity_host | str | Host used for the network connectivity check
connectivity_port | int | Port used for the connectivity check
connectivity_timeout | float | Timeout for the connectivity probe
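
As an illustration of how max_retries and retry_delay typically interact, here is a minimal retry loop. It is a sketch of the general pattern, not the module's internals: TransientServiceError, call_with_retries and flaky_recognise are hypothetical names, and the sleep function is injectable only so the example runs instantly.

```python
import time


class TransientServiceError(Exception):
    """Stand-in for a recoverable service failure (not a pyaitk class)."""


def call_with_retries(operation, max_retries=3, retry_delay=0.5, sleep=time.sleep):
    """Run operation(); on TransientServiceError retry up to max_retries times."""
    attempt = 0
    while True:
        try:
            return operation()
        except TransientServiceError:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the last error
            attempt += 1
            sleep(retry_delay)


calls = {"n": 0}


def flaky_recognise():
    # Fails twice, then succeeds, to mimic a briefly unavailable service.
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientServiceError("service busy")
    return "hello world"


print(call_with_retries(flaky_recognise, max_retries=5, retry_delay=0.0))
print("attempts used:", calls["n"])
```

In this sketch max_retries counts retries after the first attempt, so max_retries=5 allows up to six calls in total; whether pyaitk counts the same way is not stated in this document.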

Engine Selection

The engine is resolved in this order:

  1. config.preferred_engine: if explicitly set, always used
  2. Network probe: a TCP connection to connectivity_host:connectivity_port is attempted
    • Success → Engine.GOOGLE (online)
    • Failure → Engine.POCKETSPHINX (offline)

Engine | Mode | Requires | Best for
GOOGLE | Online | Internet access | High accuracy, broad language support
POCKETSPHINX | Offline | pocketsphinx | Air-gapped / low-latency use
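
The resolution order above can be sketched with the standard socket module. This is an illustrative reconstruction, not the library's code: the Engine enum here is a local stand-in, is_online and select_engine are hypothetical names, and the demo probes a throwaway local listener so that no real network access is needed.

```python
import socket
from enum import Enum


class Engine(Enum):  # local stand-in, not imported from pyaitk
    GOOGLE = "google"
    POCKETSPHINX = "pocketsphinx"


def is_online(host, port, timeout):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def select_engine(preferred, host, port, timeout=2.0):
    """An explicit preference wins; otherwise the network probe decides."""
    if preferred is not None:
        return preferred
    return Engine.GOOGLE if is_online(host, port, timeout) else Engine.POCKETSPHINX


# Demo against a local listener instead of a real connectivity host.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(5)
port = listener.getsockname()[1]
print(select_engine(None, "127.0.0.1", port))  # probe can connect
listener.close()
print(select_engine(None, "127.0.0.1", port))  # nothing listening any more
print(select_engine(Engine.POCKETSPHINX, "127.0.0.1", port))  # preference skips the probe
```

A plain TCP connect only shows that some host is reachable, not that the speech service itself is healthy, which is presumably why the retry settings exist alongside the probe.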

Exception Hierarchy

STTError
├── STTAudioError: mic could not be opened, or no speech detected in time
├── STTRecognitionError: audio captured but speech was unintelligible
├── STTServiceError: online service was unreachable or returned an error
└── STTEngineError: PocketSphinx failed to init or missing language data

Example error handling:

from pyaitk.STT import STT, STTAudioError, STTRecognitionError, STTServiceError, STTEngineError

try:
    with STT() as stt:
        text = stt.listen()
        print(text)
except STTAudioError as e:
    print(f"Mic issue: {e}")
except STTRecognitionError as e:
    print(f"Could not understand: {e}")
except STTServiceError as e:
    print(f"Service unreachable: {e}")
except STTEngineError as e:
    print(f"Offline engine error: {e}")

Examples Summary

# Quickest usage
from pyaitk.STT import STT
text = STT().listen()

# Context manager with config
from pyaitk.STT import STT, STTConfig, Engine
cfg = STTConfig(preferred_engine=Engine.GOOGLE, google_language="fr-FR", timeout=6.0)
with STT(config=cfg) as stt:
    text = stt.listen()
    print(text)

# Resilient loop with error recovery (runs until interrupted)
from pyaitk.STT import STT, STTAudioError, STTRecognitionError
with STT() as stt:
    while True:
        try:
            print(stt.listen())
        except STTAudioError:
            pass  # timed out, try again
        except STTRecognitionError:
            print("Please repeat.")

PTT (PDF To Text) Function

What Is This?

  • Full document extraction: reads all pages and joins them into a single string
  • Per-page fault tolerance: if a single page fails, extraction continues for the rest
  • Configurable page separator: control how pages are joined in the output string
  • Structured logging: all warnings and errors go through Python's logging module
  • Custom exception: PDFExtractionError wraps PyMuPDF errors for clean upstream handling
  • Input validation: checks for empty paths, missing files, and non-file paths before opening

How to Use

1. Basic extraction

from pyaitk.PTT import extract_text_from_pdf

text = extract_text_from_pdf("document.pdf")
print(text)

2. Custom page separator

text = extract_text_from_pdf("report.pdf", page_separator="\n---\n")
print(text)

3. Custom encoding

text = extract_text_from_pdf("document.pdf", encoding="latin-1")

4. With error handling

from pyaitk.PTT import extract_text_from_pdf, PDFExtractionError

try:
    text = extract_text_from_pdf("document.pdf")
    print(f"Extracted {len(text)} characters.")
except FileNotFoundError:
    print("File not found.")
except ValueError as e:
    print(f"Invalid input: {e}")
except PDFExtractionError as e:
    print(f"Could not read PDF: {e}")

5. Processing multiple files

from pathlib import Path
from pyaitk.PTT import extract_text_from_pdf, PDFExtractionError

results = {}
for pdf_path in Path("./docs").glob("*.pdf"):
    try:
        results[pdf_path.name] = extract_text_from_pdf(str(pdf_path))
    except PDFExtractionError as e:
        print(f"Skipping {pdf_path.name}: {e}")

for name, text in results.items():
    print(f"{name}: {len(text)} characters")

API Reference

extract_text_from_pdf(pdf_path, encoding, page_separator)

Extracts all text from a PDF and returns it as a single string.

Parameter | Type | Default | Description
pdf_path | str | (required) | Path to the PDF file
encoding | str | "utf-8" | Text encoding for extraction
page_separator | str | "\n\n" | String inserted between pages in the output

Returns: str, the full extracted text from all pages.

Raises:

Exception | When
ValueError | pdf_path is None, empty, or not a file path
FileNotFoundError | The specified file does not exist on disk
PDFExtractionError | PDF is corrupted, invalid, or unreadable

Exception Reference

PDFExtractionError
└── Wraps fitz.FileDataError and other PyMuPDF runtime errors

PDFExtractionError is the single catch-all for PDF-level failures. FileNotFoundError and ValueError are raised directly for path/input issues and do not need to be caught as PDFExtractionError.


Behaviour Notes

  • Empty PDF: if the document has zero pages, an empty string "" is returned and a warning is logged.
  • Page-level errors: if one page fails to extract, that page contributes an empty string and extraction continues for the remaining pages. The error is logged.
  • No partial results lost: all successfully extracted pages are still returned even if some pages failed.
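
The behaviour described in these notes can be sketched without any PDF library. In the sketch below each page is modelled as a zero-argument callable that returns its text or raises; join_pages and broken_page are hypothetical names used only for illustration, and the real function reads pages through PyMuPDF instead.

```python
import logging

logger = logging.getLogger(__name__)


def join_pages(page_readers, page_separator="\n\n"):
    """Join per-page text; a failing page contributes an empty string."""
    if not page_readers:
        logger.warning("document has no pages")
        return ""
    texts = []
    for number, read_page in enumerate(page_readers, start=1):
        try:
            texts.append(read_page())
        except Exception as exc:  # keep going: one bad page must not lose the rest
            logger.error("page %d failed: %s", number, exc)
            texts.append("")
    return page_separator.join(texts)


def broken_page():
    raise RuntimeError("corrupt page stream")


pages = [lambda: "first page", broken_page, lambda: "third page"]
print(repr(join_pages(pages, page_separator="\n---\n")))
print(repr(join_pages([])))
```

Because the failed page still contributes an empty string, the separators keep their positions, so a caller splitting on the separator can still tell which page number each chunk came from.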

Examples Summary

# Minimal usage
from pyaitk.PTT import extract_text_from_pdf
text = extract_text_from_pdf("file.pdf")

# Custom separator between pages
text = extract_text_from_pdf("file.pdf", page_separator="\n--- PAGE BREAK ---\n")

# Full error handling
from pyaitk.PTT import extract_text_from_pdf, PDFExtractionError
try:
    text = extract_text_from_pdf("file.pdf")
except FileNotFoundError:
    print("File not found.")
except PDFExtractionError as e:
    print(f"Extraction failed: {e}")

# Batch processing a folder
from pathlib import Path
from pyaitk.PTT import extract_text_from_pdf, PDFExtractionError
for f in Path("./docs").glob("*.pdf"):
    try:
        print(f"{f.name}: {len(extract_text_from_pdf(str(f)))} chars")
    except PDFExtractionError:
        print(f"{f.name}: failed")

PPTXExtractor - PowerPoint Content Extraction Utility

A straightforward utility for extracting text, images, and tables from .pptx PowerPoint files using python-pptx. Processes all slides and organises extracted content by slide number.


What Is This?

PPTXExtractor is a class-based PPTX extraction tool that provides:

  • Text extraction: pulls all non-empty text from every shape across all slides
  • Image extraction: saves embedded images to disk, preserving their original format (PNG, JPEG, etc.)
  • Table extraction: reads all table shapes and returns row/cell data as nested lists
  • All-in-one extraction: a single extract_all() call returns text, images, and tables together
  • Auto output directory: creates the image output folder automatically if it doesn't exist
  • Slide-indexed results: all output is keyed by slide number (1-based) for easy lookup

How to Use

1. Extract everything at once

from pyaitk.PPTExtract import PPTXExtractor

extractor = PPTXExtractor("presentation.pptx")
data = extractor.extract_all()

# data["texts"]  -> {slide_num: [str, ...]}
# data["images"] -> {slide_num: [image_path, ...]}
# data["tables"] -> {slide_num: [[[cell, ...], ...], ...]}

2. Extract text only

extractor = PPTXExtractor("presentation.pptx")
texts = extractor.extract_text()

for slide_num, lines in texts.items():
    print(f"Slide {slide_num}:")
    for line in lines:
        print(f"  {line}")

3. Extract and save images

extractor = PPTXExtractor("presentation.pptx", image_output_dir="my_images")
images = extractor.extract_images()

for slide_num, paths in images.items():
    for path in paths:
        print(f"Slide {slide_num}: saved -> {path}")

Images are saved as slide{N}_image{M}.{ext} inside the output directory.

4. Extract tables

extractor = PPTXExtractor("presentation.pptx")
tables = extractor.extract_tables()

for slide_num, slide_tables in tables.items():
    for table in slide_tables:
        for row in table:
            print("\t".join(row))

5. Iterate all content by slide

extractor = PPTXExtractor("presentation.pptx")
data = extractor.extract_all()

for slide_num in data["texts"]:
    print(f"\n--- Slide {slide_num} ---")

    for text in data["texts"][slide_num]:
        print(f"  Text: {text}")

    for img_path in data["images"].get(slide_num, []):
        print(f"  Image: {img_path}")

    for table in data["tables"].get(slide_num, []):
        for row in table:
            print("  Row:", "\t".join(row))

API Reference

PPTXExtractor(pptx_path, image_output_dir)

Parameter | Type | Default | Description
pptx_path | str | (required) | Path to the .pptx file
image_output_dir | str | "extracted_images" | Directory where extracted images will be saved

The image output directory is created automatically if it does not exist.


extract_text() → dict[int, list[str]]

Returns all non-empty text from every shape across all slides.

{
    1: ["Title of Slide One", "Bullet point A", "Bullet point B"],
    2: ["Slide Two heading", "Some body text"],
}

extract_images() → dict[int, list[str]]

Saves all embedded images to image_output_dir and returns their file paths.

{
    1: ["extracted_images/slide1_image1.png"],
    3: ["extracted_images/slide3_image1.jpeg", "extracted_images/slide3_image2.png"],
}

Filenames follow the pattern: slide{slide_num}_image{shape_num}.{ext}


extract_tables() → dict[int, list[list[list[str]]]]

Returns table data as nested lists: slide → list of tables → list of rows → list of cell strings.

{
    2: [
        [["Header A", "Header B"], ["Row 1A", "Row 1B"], ["Row 2A", "Row 2B"]],
    ]
}

extract_all() → dict

Runs all three extractors and returns a combined dictionary:

{
    "texts": { 1: [...], 2: [...] },
    "images": { 1: [...], 3: [...] },
    "tables": { 2: [...] },
}
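
Since the three result dictionaries are keyed independently, it can be handy to regroup them per slide. The helper below is a sketch that works on plain dictionaries shaped like the return values shown above; merge_by_slide is a hypothetical name and is not part of PPTXExtractor.

```python
def merge_by_slide(data):
    """Regroup {"texts": {...}, "images": {...}, "tables": {...}} per slide."""
    slide_numbers = sorted(set(data["texts"]) | set(data["images"]) | set(data["tables"]))
    return {
        n: {
            "texts": data["texts"].get(n, []),
            "images": data["images"].get(n, []),
            "tables": data["tables"].get(n, []),
        }
        for n in slide_numbers
    }


# Sample data shaped like the documented return values.
sample = {
    "texts": {1: ["Title"], 2: ["Heading"]},
    "images": {
        1: ["extracted_images/slide1_image1.png"],
        3: ["extracted_images/slide3_image1.jpeg"],
    },
    "tables": {2: [[["Header A", "Header B"], ["Row 1A", "Row 1B"]]]},
}
per_slide = merge_by_slide(sample)
print(sorted(per_slide))
print(per_slide[3])
```

Taking the union of keys matters if, as the sample return values suggest, a slide with images but no text appears only under "images": a loop over data["texts"] alone would skip it.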

Examples Summary

from pyaitk.PPTExtract import PPTXExtractor

# Extract everything
data = PPTXExtractor("deck.pptx").extract_all()

# Text only
texts = PPTXExtractor("deck.pptx").extract_text()

# Images saved to a custom folder
images = PPTXExtractor("deck.pptx", image_output_dir="assets").extract_images()

# Tables only
tables = PPTXExtractor("deck.pptx").extract_tables()

# Iterate slide by slide
extractor = PPTXExtractor("deck.pptx")
data = extractor.extract_all()
for slide_num in data["texts"]:
    print(f"Slide {slide_num}:", data["texts"][slide_num])

NER System: Named Entity Recognition

A complete, modular Named Entity Recognition system built on spaCy. Covers the full ML lifecycle: text preprocessing, inference, postprocessing, training with early stopping, evaluation with precision/recall/F1, and persistent entity storage.


What Is This?

This package is a NER system with seven cooperating modules:

| Module | Class / Function | Responsibility |
| --- | --- | --- |
| pipeline.py | NERPipeline | Core inference: single and batch prediction |
| trainer.py | NERTrainer | Training loop with early stopping and model saving |
| evaluator.py | NEREvaluator | Precision / Recall / F1 metrics (exact + partial match) |
| preprocessor.py | TextPreprocessor | Cleans and normalises raw text before inference |
| postprocessor.py | EntityPostprocessor | Filters, deduplicates, and enriches entity outputs |
| entity_store.py | EntityStore | In-memory store with JSONL persistence and analytics |
| logging_config.py | setup_logging() | Configures root logger (console + optional file) |

Quick Start

from pyaitk.NER import NERPipeline, EntityStore

# Load a trained model
pipeline = NERPipeline.from_model_path("models/my_ner")

# Predict
result = pipeline.predict("Apple was founded by Steve Jobs in California.")
for entity in result.entities:
    print(entity.label, entity.text)
# ORG Apple
# PERSON Steve Jobs
# GPE California

# Store results
store = EntityStore(persist_path="entities.jsonl")
store.add(result)
print(store.top_entities("ORG", n=5))

Module Guide

NERPipeline: Inference

from pyaitk.NER import NERPipeline

# From a saved model on disk
pipeline = NERPipeline.from_model_path("models/my_ner")

# From a blank model (for training or rule-based use)
pipeline = NERPipeline.from_blank(lang="en", labels=["ORG", "PERSON", "GPE"])

# Single prediction
result = pipeline.predict("Google acquired DeepMind in 2014.")
print(result.entities) # list of Entity objects
print(result.entity_types) # ['DATE', 'ORG']
print(result.filter_by_label("ORG")) # [Entity(text='Google', ...)]
print(result.processing_time_ms) # e.g. 12.4

# Batch prediction (lazy generator, memory-efficient)
texts = ["Text one.", "Text two.", "Text three."]
for result in pipeline.predict_batch(texts):
    print(result.entities)

# Save model to disk
pipeline.save("models/my_ner_v2")

# Inspect labels
print(pipeline.labels) # ['DATE', 'GPE', 'ORG', 'PERSON']

Constructor options:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| nlp | spacy.Language | (required) | Loaded spaCy model |
| preprocessor | TextPreprocessor | None | Custom preprocessor; defaults to standard config |
| postprocessor | EntityPostprocessor | None | Custom postprocessor; defaults to standard config |
| return_doc | bool | False | Include raw spaCy Doc in results |
| batch_size | int | 64 | Chunk size for nlp.pipe in batch mode |

NERTrainer: Training

from pyaitk.NER import NERTrainer
from pyaitk.NER.trainer import TrainerConfig

TRAIN_DATA = [
    ("Apple was founded by Steve Jobs.", {"entities": [(0, 5, "ORG"), (21, 31, "PERSON")]}),
    ("Google is based in Mountain View.", {"entities": [(0, 6, "ORG"), (19, 32, "GPE")]}),
]
DEV_DATA = [...]

# Train from scratch
trainer = NERTrainer(config=TrainerConfig(n_iter=30, dropout=0.35))
trainer.prepare(TRAIN_DATA)
results = trainer.train(TRAIN_DATA, dev_data=DEV_DATA, output_dir="models/v1")

print(results["best_dev_f1"]) # e.g. 0.8712
print(results["history"]) # list of per-epoch loss + dev scores

# Fine-tune from an existing spaCy model
trainer = NERTrainer(base_model="en_core_web_sm")
trainer.train(TRAIN_DATA, output_dir="models/v1_finetuned")

# Export training data to spaCy v3 DocBin format
import spacy
nlp = spacy.blank("en")
NERTrainer.data_to_docbin(TRAIN_DATA, nlp, "train.spacy")

TrainerConfig fields:

| Field | Default | Description |
| --- | --- | --- |
| lang | "en" | spaCy language code for blank model |
| n_iter | 30 | Maximum training epochs |
| dropout | 0.35 | Dropout rate during training |
| batch_start | 4 | Starting batch size for compounding scheduler |
| batch_compound | 1.001 | Compounding factor for batch size growth |
| eval_every | 5 | Evaluate on dev set every N epochs |
| patience | 5 | Early stopping patience (epochs without improvement) |
| min_delta | 0.001 | Minimum F1 improvement to reset patience counter |
| seed | 42 | Random seed for reproducibility |
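
How patience and min_delta interact is easiest to see in code. The following is a minimal standard-library sketch of patience-based early stopping, not the trainer's actual loop. It assumes the patience counter advances once per dev evaluation (the table above describes it in epochs, so the real unit may differ), and best_f1_with_patience is a hypothetical name.

```python
def best_f1_with_patience(dev_f1_scores, patience=5, min_delta=0.001):
    """Walk a sequence of dev F1 scores and stop once `patience`
    consecutive evaluations fail to improve by at least `min_delta`.

    Returns (best_f1, evaluations_run).
    """
    best_f1 = float("-inf")
    stale = 0            # evaluations since the last real improvement
    evaluations_run = 0
    for f1 in dev_f1_scores:
        evaluations_run += 1
        if f1 - best_f1 >= min_delta:
            best_f1 = f1
            stale = 0    # improvement large enough: reset the counter
        else:
            stale += 1
            if stale >= patience:
                break    # give up: no meaningful gain for `patience` evaluations
    return best_f1, evaluations_run


scores = [0.50, 0.60, 0.6005, 0.59, 0.58]
best, ran = best_f1_with_patience(scores, patience=2, min_delta=0.001)
print(best, ran)
```

Here the third score improves by only 0.0005, which is below min_delta, so it counts as stale and training stops after the fourth evaluation.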

NEREvaluator: Metrics

from pyaitk.NER import NEREvaluator, NERPipeline

pipeline = NERPipeline.from_model_path("models/my_ner")
evaluator = NEREvaluator() # exact match (default)
evaluator_partial = NEREvaluator(partial=True) # any span overlap counts

gold_data = [
    ("Apple was founded by Steve Jobs.", {"entities": [(0, 5, "ORG"), (21, 31, "PERSON")]}),
]

report = evaluator.evaluate(gold_data, pipeline)
print(report)
# ============================================================
# Micro P=0.9200 R=0.8750 F1=0.8970
# Macro F1 = 0.8800
# ------------------------------------------------------------
# Label P R F1 TP FP FN
# ------------------------------------------------------------
# ORG 0.9500 0.9000 0.9245 ...
# PERSON 0.8100 0.8500 0.8295 ...
# ============================================================

print(report.to_dict()) # structured dict for logging / serialisation

# Evaluate pre-computed predictions (no pipeline needed)
gold_spans_list = [[(0, 5, "ORG"), (21, 31, "PERSON")]]
pred_spans_list = [[(0, 5, "ORG"), (21, 31, "PERSON")]]
report = evaluator.evaluate_from_predictions(gold_spans_list, pred_spans_list)

EvaluationReport properties:

| Property | Description |
| --- | --- |
| micro_precision | TP / (TP + FP) across all labels |
| micro_recall | TP / (TP + FN) across all labels |
| micro_f1 | Harmonic mean of micro precision and recall |
| macro_f1 | Unweighted average F1 across all labels |
| per_label | Dict of LabelScore objects keyed by label |
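
To make the difference between micro and macro averaging concrete, here is a small standard-library sketch that computes both from gold and predicted spans using exact matching. It illustrates the formulas in the table and is not the evaluator's implementation; score_exact and prf are hypothetical names.

```python
from collections import Counter


def prf(tp, fp, fn):
    """Precision, recall and F1 from raw counts (0.0 when undefined)."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1


def score_exact(gold_spans_list, pred_spans_list):
    """Exact-match scoring: a prediction counts only if (start, end, label) all agree."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in zip(gold_spans_list, pred_spans_list):
        gold_set, pred_set = set(gold), set(pred)
        for _, _, label in gold_set & pred_set:
            tp[label] += 1
        for _, _, label in pred_set - gold_set:
            fp[label] += 1
        for _, _, label in gold_set - pred_set:
            fn[label] += 1
    labels = sorted(set(tp) | set(fp) | set(fn))
    per_label = {lab: prf(tp[lab], fp[lab], fn[lab]) for lab in labels}
    # Micro: pool the counts over all labels, then compute P/R/F1 once.
    micro = prf(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: average the per-label F1 scores with equal weight.
    macro_f1 = sum(f1 for _, _, f1 in per_label.values()) / len(labels) if labels else 0.0
    return {"micro": micro, "macro_f1": macro_f1, "per_label": per_label}


gold = [[(0, 5, "ORG"), (21, 31, "PERSON")]]
pred = [[(0, 5, "ORG"), (21, 26, "PERSON")]]  # PERSON span ends too early
report = score_exact(gold, pred)
print(report["micro"], report["macro_f1"])
```

The truncated PERSON span is both a false positive and a false negative under exact matching; a partial-match evaluator would count it as a hit instead.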

TextPreprocessor: Input Cleaning

from pyaitk.NER import TextPreprocessor
from pyaitk.NER.preprocessor import PreprocessorConfig

# Default config
preprocessor = TextPreprocessor()
clean = preprocessor.process("<b>Visit https://example.com for info.</b>")
# → "Visit for info."

# Custom config
config = PreprocessorConfig(
    lowercase=True,
    remove_urls=True,
    remove_emails=True,
    remove_html_tags=True,
    normalize_whitespace=True,
    normalize_unicode=True,
    max_length=512,
    custom_patterns=[r"\d{4}-\d{2}-\d{2}"],  # strip ISO dates
)
preprocessor = TextPreprocessor(config)

# Batch
cleaned_texts = preprocessor.process_batch(["Text one.", "Text two."])

PreprocessorConfig fields:

| Field | Default | Description |
| --- | --- | --- |
| lowercase | False | Lowercase the entire text |
| remove_urls | True | Remove http://, https://, and www. URLs |
| remove_emails | False | Remove email addresses (kept by default as NER signal) |
| remove_html_tags | True | Strip HTML tags |
| normalize_whitespace | True | Collapse all whitespace to single spaces |
| normalize_unicode | True | NFC-normalise Unicode |
| max_length | None | Truncate text to this many characters |
| custom_patterns | [] | List of regex strings to remove |
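
For readers who want to see what these switches amount to, the sketch below reimplements the cleaning steps with the standard library only. The regular expressions and the order of the steps are assumptions made for illustration; clean_text is a hypothetical function, and the package's own patterns may differ.

```python
import re
import unicodedata

# Illustrative patterns; deliberately simple rather than exhaustive.
HTML_TAG = re.compile(r"<[^>]+>")
URL = re.compile(r"https?://\S+|www\.\S+")
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")


def clean_text(text, *, lowercase=False, remove_urls=True, remove_emails=False,
               remove_html_tags=True, normalize_whitespace=True,
               normalize_unicode=True, max_length=None, custom_patterns=()):
    """Apply the cleaning steps in a fixed (assumed) order and return the result."""
    if normalize_unicode:
        text = unicodedata.normalize("NFC", text)
    if remove_html_tags:
        text = HTML_TAG.sub(" ", text)
    if remove_urls:
        text = URL.sub(" ", text)
    if remove_emails:
        text = EMAIL.sub(" ", text)
    for pattern in custom_patterns:
        text = re.sub(pattern, " ", text)
    if normalize_whitespace:
        text = " ".join(text.split())
    if lowercase:
        text = text.lower()
    if max_length is not None:
        text = text[:max_length]
    return text


print(clean_text("<b>Visit https://example.com for info.</b>"))
# → "Visit for info."
```

Removed fragments are replaced with a space rather than an empty string so that neighbouring words do not fuse; the whitespace step then collapses the leftovers.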

EntityPostprocessor: Output Cleaning

from pyaitk.NER import EntityPostprocessor
from pyaitk.NER.postprocessor import PostprocessorConfig

config = PostprocessorConfig(
    min_length=2,
    max_length=100,
    allowed_labels={"ORG", "PERSON", "GPE"},
    blocked_labels={"MISC"},
    deduplicate=True,
    merge_adjacent=True,
    strip_punct=True,
    custom_label_map={"PERSON": "PER"},  # rename labels
)
postprocessor = EntityPostprocessor(config)
entities = postprocessor.process(entities)

Processing chain (in order):

  1. Filter by text length (min_length / max_length)
  2. Filter by label (allowed_labels / blocked_labels)
  3. Strip leading/trailing punctuation from entity text (strip_punct)
  4. Remap label names (custom_label_map)
  5. Lowercase labels (lowercase_labels)
  6. Deduplicate identical spans (deduplicate)
  7. Merge back-to-back same-label spans (merge_adjacent, gap ≤ 1 char)
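
The last two steps of the chain are the least obvious, so here is a minimal standard-library sketch of deduplication followed by adjacent-span merging. Span and dedupe_and_merge are hypothetical names used for illustration only, and the one-character gap rule is taken from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    start: int
    end: int
    label: str


def dedupe_and_merge(spans, text, max_gap=1):
    """Drop identical spans, then merge same-label spans separated by <= max_gap chars."""
    unique = sorted(set(spans), key=lambda s: (s.start, s.end))
    merged = []
    for span in unique:
        prev = merged[-1] if merged else None
        if prev and prev.label == span.label and 0 <= span.start - prev.end <= max_gap:
            # Extend the previous span so it covers both pieces and the gap.
            merged[-1] = Span(prev.start, max(prev.end, span.end), prev.label)
        else:
            merged.append(span)
    return [(text[s.start:s.end], s.label) for s in merged]


text = "Steve Jobs founded Apple."
spans = [
    Span(0, 5, "PERSON"),   # "Steve"
    Span(6, 10, "PERSON"),  # "Jobs", one space after "Steve"
    Span(19, 24, "ORG"),    # "Apple"
    Span(19, 24, "ORG"),    # exact duplicate
]
print(dedupe_and_merge(spans, text))
# → [('Steve Jobs', 'PERSON'), ('Apple', 'ORG')]
```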

EntityStore: Persistence & Analytics

from pyaitk.NER import EntityStore

# In-memory only
store = EntityStore()

# With JSONL persistence (appends on each add; loads on init if file exists)
store = EntityStore(persist_path="entities.jsonl")

store.add(result) # add a single NERResult
store.add_batch([result1, result2]) # add multiple results

# Query
store.query(label="ORG") # all ORG records
store.query(text_contains="apple", min_score=0.8) # filtered
store.unique_entities(label="PERSON") # sorted unique names
store.top_entities("ORG", n=10) # [(entity, count), ...]
store.label_distribution() # {"ORG": 42, "PERSON": 31}

# Export
store.to_json("all_entities.json")

# Iterate
for record in store.iter_records():
    print(record["text"], record["label"])

print(len(store)) # total entity count
print(store) # EntityStore(total=132, labels={'ORG': 42, 'PERSON': 31})

Each stored record contains:

| Field | Description |
| --- | --- |
| text | Entity surface form |
| label | Entity type (e.g. ORG, PERSON) |
| start_char | Start character offset in source text |
| end_char | End character offset in source text |
| score | Detection confidence (0.0 to 1.0) |
| source_text | The original text the entity came from |
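
The append-on-add, load-on-init behaviour described above can be pictured with a few lines of standard-library code. This is a simplified stand-in written for illustration; JsonlStore is a hypothetical class, not the package's EntityStore, and it stores plain dicts using the record fields listed in the table.

```python
import json
import os
import tempfile
from collections import Counter


class JsonlStore:
    """Tiny in-memory store that mirrors every added record to a JSONL file."""

    def __init__(self, persist_path=None):
        self.persist_path = persist_path
        self.records = []
        # Reload earlier records if the file already exists.
        if persist_path and os.path.exists(persist_path):
            with open(persist_path, encoding="utf-8") as f:
                self.records = [json.loads(line) for line in f if line.strip()]

    def add(self, record):
        self.records.append(record)
        if self.persist_path:
            # One JSON object per line, appended so earlier data is never rewritten.
            with open(self.persist_path, "a", encoding="utf-8") as f:
                f.write(json.dumps(record, ensure_ascii=False) + "\n")

    def top_entities(self, label, n=10):
        counts = Counter(r["text"] for r in self.records if r["label"] == label)
        return counts.most_common(n)


path = os.path.join(tempfile.mkdtemp(), "entities.jsonl")
store = JsonlStore(path)
for name in ("Apple", "Google", "Apple"):
    store.add({"text": name, "label": "ORG", "start_char": 0, "end_char": len(name),
               "score": 1.0, "source_text": name})

reloaded = JsonlStore(path)  # a fresh instance sees the persisted records
print(reloaded.top_entities("ORG", n=1))
# → [('Apple', 2)]
```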

Logging Setup

from pyaitk.NER.logging_config import setup_logging

setup_logging(level="DEBUG", log_file="ner.log")

| Parameter | Default | Description |
| --- | --- | --- |
| level | "INFO" | Logging level |
| log_file | None | Optional path for file output |
| fmt | "%(asctime)s [%(levelname)s] %(name)s — …" | Log record format string |

Full End-to-End Example

from pyaitk.NER import NERPipeline, NERTrainer, NEREvaluator, EntityStore
from pyaitk.NER.trainer import TrainerConfig
from pyaitk.NER.logging_config import setup_logging

setup_logging(level="INFO", log_file="ner.log")

# 1. Prepare data
TRAIN_DATA = [
    ("Apple was founded by Steve Jobs in California.", {
        "entities": [(0, 5, "ORG"), (21, 31, "PERSON"), (35, 45, "GPE")]
    }),
]
DEV_DATA = TRAIN_DATA # use your actual dev split

# 2. Train
trainer = NERTrainer(config=TrainerConfig(n_iter=20, eval_every=5))
results = trainer.train(TRAIN_DATA, dev_data=DEV_DATA, output_dir="models/v1")
print("Best dev F1:", results["best_dev_f1"])

# 3. Load and predict
pipeline = NERPipeline.from_model_path("models/v1")
result = pipeline.predict("Microsoft acquired GitHub in 2018.")
print(result.entities)

# 4. Evaluate
evaluator = NEREvaluator()
report = evaluator.evaluate(DEV_DATA, pipeline)
print(report)

# 5. Store
store = EntityStore(persist_path="entities.jsonl")
store.add(result)
print(store.top_entities("ORG"))
print(store.label_distribution())

Data Format

Training and evaluation data use spaCy's standard annotation format:

[
    ("Text to annotate.", {"entities": [(start, end, "LABEL"), ...]}),
    ("Google is in Mountain View.", {"entities": [(0, 6, "ORG"), (13, 26, "GPE")]}),
]
  • start / end are character offsets (not token indices)
  • Spans must not overlap
  • Labels are arbitrary strings registered via NERTrainer.prepare() or NERPipeline.from_blank()
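
Off-by-one offsets and overlapping spans are the most common mistakes in hand-written annotations, so it can help to check data before training. The sketch below is a standalone validator that uses only the standard library; check_annotations is a hypothetical helper and not part of the package.

```python
def check_annotations(data):
    """Return a list of problems found in (text, {"entities": [...]}) examples."""
    problems = []
    for i, (text, ann) in enumerate(data):
        spans = sorted(ann.get("entities", []))
        for start, end, label in spans:
            # Offsets are character positions and must lie inside the text.
            if not (0 <= start < end <= len(text)):
                problems.append(f"example {i}: span ({start}, {end}, {label}) is out of range")
        # After sorting, an overlap shows up as a span starting before the previous one ends.
        for (s1, e1, l1), (s2, e2, l2) in zip(spans, spans[1:]):
            if s2 < e1:
                problems.append(f"example {i}: ({s1}, {e1}, {l1}) overlaps ({s2}, {e2}, {l2})")
    return problems


sentence = "Google is in Mountain View."
good = [(sentence, {"entities": [(0, 6, "ORG"), (13, 26, "GPE")]})]
bad = [(sentence, {"entities": [(0, 6, "ORG"), (3, 9, "ORG")]})]

print(sentence[13:26])          # slicing with the offsets shows what a span covers
print(check_annotations(good))  # no problems
print(check_annotations(bad))   # one overlap reported
```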

MathAI: Production-grade Mathematical Expression Solver

A robust symbolic mathematics solver built on SymPy, supporting simplification, equation solving, differentiation, integration, matrix analysis, and Taylor series expansion, with automatic operation detection, input validation, and structured result objects.


What Is This?

MathAI is a symbolic math engine that provides:

  • Auto-detection: process() and MathAI() infer the operation type from the query automatically
  • Symbolic computation: exact symbolic results powered by SymPy (no floating-point approximation unless requested)
  • Numeric evaluation: automatically computes a decimal approximation when the result is a pure number
  • Equation solving: single equations and systems of equations (via x + y = 5 and x - y = 1 syntax)
  • Calculus: differentiation (any order) and definite/indefinite integration
  • Matrix analysis: determinant, trace, rank, inverse, and eigenvalues in one call
  • Taylor/Laurent series: configurable expansion point and order
  • Input validation: rejects empty input, oversized expressions, and dangerous code patterns
  • Structured results: every operation returns a MathResult dataclass with success, result, simplified, numeric, and metadata fields
  • Structured logging: all internal events go through Python's logging module
  • Convenience API: single MathAI(query, operation) function for quick one-liner use

How to Use

1. One-liner convenience function (simplest)

from pyaitk.MathAI import MathAI

print(MathAI("x^2 + 2*x + 1"))
print(MathAI("x^2 - 4 = 0", operation="solve"))
print(MathAI("sin(x)", operation="differentiate"))

2. Auto-detect operation

The MathAI() function (and MathSolver.process()) detects the operation from the query:

  • Contains = → solve
  • Starts with Matrix → matrix operations
  • Starts with diff or derivative(...) → differentiate
  • Starts with int or integrate(...) → integrate
  • Anything else → simplify

from pyaitk.MathAI import MathAI

print(MathAI("x^2 - 9 = 0"))  # → solve
print(MathAI("Matrix([[2, 1], [5, 3]])"))  # → matrix
print(MathAI("sin(x)^2 + cos(x)^2"))  # → simplify
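
The detection rules listed above can be written as a short dispatcher. This is a sketch of the described behaviour, not the package's source: the order in which the rules are tried and the case-insensitive prefix matching are assumptions, and detect_operation is a hypothetical name.

```python
def detect_operation(query: str) -> str:
    """Map a query string to an operation name using the documented rules."""
    q = query.strip()
    lowered = q.lower()
    if "=" in q:
        return "solve"
    if q.startswith("Matrix"):
        return "matrix"
    if lowered.startswith(("diff", "derivative(")):
        return "differentiate"
    # Note: the "int" prefix already covers "integrate(...)".
    if lowered.startswith(("int", "integrate(")):
        return "integrate"
    return "simplify"


for q in ["x^2 - 9 = 0", "Matrix([[2, 1], [5, 3]])", "diff(x^3)", "integrate(x^2)", "sin(x)^2 + cos(x)^2"]:
    print(f"{q!r:32} -> {detect_operation(q)}")
```

Because prefix rules are coarse, passing operation explicitly is the safer choice whenever a query could match more than one rule.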

3. Using MathSolver directly

from pyaitk.mathai import MathSolver

solver = MathSolver()

result = solver.simplify("sin(x)^2 + cos(x)^2")
print(result)

4. Simplification

result = solver.simplify("x^3 + 3*x^2 + 3*x + 1")
print(result.simplified) # (x + 1)**3
print(result.numeric) # set if result is a pure number

5. Solving equations

# Single equation
result = solver.solve_equation("x^2 - 4 = 0")
print(result.result) # [{x: -2}, {x: 2}]

# System of equations (separate with "and")
result = solver.solve_equation("x + y = 10 and x - y = 2")
print(result.result) # [{x: 6, y: 4}]

6. Differentiation

# First derivative (default)
result = solver.differentiate("x^3 + sin(x)")
print(result.result) # 3*x**2 + cos(x)
print(result.simplified) # simplified form

# Higher-order derivative
result = solver.differentiate("x^5", var="x", order=3)
print(result.result) # 60*x**2

7. Integration

# Indefinite integral
result = solver.integrate("x^2 + 3*x")
print(result.result) # x**3/3 + 3*x**2/2

# Definite integral
result = solver.integrate("x^2", var="x", limits=(0, 1))
print(result.result) # 1/3
print(result.numeric) # 0.333333333333333

8. Matrix operations

result = solver.matrix_operations("Matrix([[1, 2], [3, 4]])")
print(result.metadata["determinant"]) # -2
print(result.metadata["inverse"]) # Matrix([[-2, 1], [3/2, -1/2]])
print(result.metadata["eigenvalues"]) # {-sqrt(33)/2 + 5/2: 1, sqrt(33)/2 + 5/2: 1}
print(result.metadata["rank"]) # 2
print(result.metadata["trace"]) # 5

9. Taylor series expansion

# Default: expand around x=0, 6 terms
result = solver.series_expansion("sin(x)")
print(result.result) # x - x**3/6 + x**5/120 + O(x**6)

# Custom point and order
result = solver.series_expansion("exp(x)", var="x", point=0, order=4)
print(result.result) # 1 + x + x**2/2 + x**3/6 + O(x**4)

API Reference

MathAI(query, operation) → str

Top-level convenience function. Returns a formatted string of the result.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query | str | '1*x + 2*x - 3*x' | Mathematical expression or equation |
| operation | str | 'auto' | One of: auto, simplify, solve, differentiate, integrate, matrix, series |

MathSolver methods

| Method | Description |
| --- | --- |
| process(query) | Auto-detect and dispatch to the right operation |
| simplify(expr) | Simplify a symbolic expression |
| solve_equation(expr) | Solve one or more equations |
| differentiate(expr, var, order) | Differentiate to any order |
| integrate(expr, var, limits) | Indefinite or definite integral |
| matrix_operations(expr) | Det, trace, rank, inverse, eigenvalues |
| series_expansion(expr, var, point, order) | Taylor/Laurent series around a point |

MathResult fields

Every method returns a MathResult dataclass:

| Field | Type | Description |
| --- | --- | --- |
| success | bool | True if the operation completed without error |
| operation | str | Name of the operation performed |
| input_expr | str | The original input string |
| result | str or None | Primary result of the operation |
| simplified | str or None | Simplified form (where applicable) |
| numeric | str or None | Decimal evaluation (when result is a pure number) |
| error | str or None | Error message if success is False |
| metadata | dict or None | Extra details (variable, limits, matrix properties, etc.) |

Calling str(result) produces a human-readable summary of all populated fields.


Supported Symbols and Functions

The solver pre-loads a wide set of SymPy functions accessible directly in expressions:

| Category | Available |
| --- | --- |
| Trigonometric | sin, cos, tan, cot, sec, csc, asin, acos, atan, sinh, cosh, tanh |
| Exponential / Log | exp, log, ln |
| Constants | e, pi, I (imaginary unit), oo (infinity) |
| Algebra | sqrt, abs, floor, ceil, factorial, binomial |
| Calculus | diff, integrate, limit, Sum, Product |
| Linear Algebra | Matrix, det, transpose |
| Simplification | factor, expand, simplify, cancel, apart, together |
| Special Functions | gamma, erf |
| Variables | x y z a b c … theta phi alpha beta gamma delta epsilon |

Input Validation

Before any operation, expressions are checked for:

  • Empty or whitespace-only input
  • Expressions exceeding 10,000 characters
  • Forbidden patterns: __, import, eval, exec, compile, open, file

Rejected inputs return a MathResult with success=False and a descriptive error message; no exceptions are raised to the caller.
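
The three checks and the return-instead-of-raise convention can be sketched as follows. This is an illustration using the standard library, not the package's validator: ValidationResult stands in for MathResult, the error wording is invented, and matching the forbidden words on word boundaries is an assumption (the package may use plain substring matching).

```python
import re
from dataclasses import dataclass
from typing import Optional

MAX_LENGTH = 10_000
# "__" anywhere, or one of the listed names as a whole word (assumed matching rule).
FORBIDDEN = re.compile(r"__|\b(?:import|eval|exec|compile|open|file)\b")


@dataclass
class ValidationResult:
    success: bool
    error: Optional[str] = None


def validate_expression(expr: str) -> ValidationResult:
    """Return a result object instead of raising, as the section describes."""
    if not expr or not expr.strip():
        return ValidationResult(False, "Expression is empty")
    if len(expr) > MAX_LENGTH:
        return ValidationResult(False, f"Expression exceeds {MAX_LENGTH} characters")
    match = FORBIDDEN.search(expr)
    if match:
        return ValidationResult(False, f"Forbidden pattern: {match.group(0)!r}")
    return ValidationResult(True)


print(validate_expression("x^2 + 2*x + 1"))
print(validate_expression("__import__('os')"))
```

Callers then branch on success rather than wrapping every call in try/except.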


Examples Summary

from pyaitk.mathai import MathAI, MathSolver

# One-liner API
print(MathAI("x^2 + 2*x + 1"))
print(MathAI("x^2 - 4 = 0", operation="solve"))
print(MathAI("sin(x)", operation="differentiate"))
print(MathAI("x^2", operation="integrate"))
print(MathAI("Matrix([[1, 2], [3, 4]])", operation="matrix"))
print(MathAI("cos(x)", operation="series"))

# Using MathSolver directly
solver = MathSolver()
r = solver.solve_equation("x + y = 10 and x - y = 2")
print(r.result)

r = solver.differentiate("x^5", order=3)
print(r.simplified)

r = solver.integrate("x^2", limits=(0, 1))
print(r.numeric)

r = solver.matrix_operations("Matrix([[1, 2], [3, 4]])")
print(r.metadata)

Prompts for MathAI

Solve normal numeric problems.

1+2+3+4-5(1-55)+10

Solve symbolic mathematics.

X+2Y-X+10Z*(10-100)X

Matrix

Matrix([[1,0],[0,1]])

Matrix([[1,2,3],[2,3,10]])

Trigonometric Functions

sin
sin(0)
sin(30)
sin(45)
sin(60)
sin(90)
...
sin(180)
...
Syntax
sin(<value of theta>)
cos
cos(0)
cos(30)
cos(45)
cos(60)
cos(90)
...
cos(180)
...
Syntax
cos(<value of theta>)
tan
tan(0)
tan(30)
tan(45)
tan(60)
tan(90)
...
tan(180)
...
Syntax
tan(<value of theta>)
cosec
csc(0)
csc(30)
csc(45)
csc(60)
csc(90)
...
csc(180)
...
Syntax
csc(<value of theta>)
sec
sec(0)
sec(30)
sec(45)
sec(60)
sec(90)
...
sec(180)
...
Syntax
sec(<value of theta>)
cot
cot(0)
cot(30)
cot(45)
cot(60)
cot(90)
...
cot(180)
...
Syntax
cot(<value of theta>)

Limit

limit('2X')
limit('2X+3Y')
Syntax
limit(<symbolic and trigonometric values in string format>)

Determinants

det(Matrix([[10]]))
det(Matrix([[10],[20]]))
det(Matrix([[10],[30],[40]]))
det(Matrix([[10],[100],[0],[50]]))
det(Matrix([[10],[97],[95],[1],[99]]))
det(Matrix([[10],[11],[100],[3],[2],[150]]))
det(Matrix([[10,20]]))
det(Matrix([[10,20],[20,30]]))
det(Matrix([[10,20],[60,70],[80,100]]))
det(Matrix([[10,20,30]]))
det(Matrix([[10,20,30],[40,50,60]]))
det(Matrix([[10,20,30],[40,50,60],[90,100,102]]))
det(Matrix([[10,20,30],[40,50,60],[90,100,102],[95,101,1000]]))
...

Log

log(10)
log(20)
log(0)
log(100)
...

Ln

ln(0)
ln(1)
ln(10)
ln(100)
ln(1000)
ln(102)
ln(3)
ln(90)
ln(893)
ln(9)
...

E

e()

↑ Get the value of e.

e(10)
e(3)
e(21)
e(38)
e(0)
...

π (pi)

pi()

↑ Get the value of π.

Square root

Get the square root of a value.

sqrt(10)
sqrt(2)
sqrt(30)
sqrt(28039)
sqrt(19)
sqrt(289843190)
...

Differential

diff('2x')
diff('x')
diff('y')
...

Give all the values in string format.

Integration (∫)

Give all the values in string format.

integrate('dx')
integrate('xdx')
integrate('2xdx')
integrate('2xdy')
integrate('xdy')
integrate('2x+3y-3ydx')
...

Factor

Get all the factors of a number.

factor(10)
factor(213)
factor(389)
factor(1983)
factor(0)
factor(12)
...

ITT: Image-to-Text (OCR) Utility

A minimal, dependency-light OCR utility built on EasyOCR. Extracts text from images in a single function call, with the recognition model loaded once at startup for efficient repeated use.


What Is This?

ITT is a lightweight image-to-text module that provides:

  • Single function API: ITT(image_path) returns all recognised text as a plain string
  • EasyOCR backend: deep learning-based OCR supporting 80+ languages out of the box
  • Model loaded once: the reader is initialised at module level so repeated calls pay no reload cost
  • Multi-language support: pass any EasyOCR-supported language codes at call time
  • Zero boilerplate: no configuration classes, no context managers; just import and call

Note

EasyOCR downloads model weights on first use (~100 MB). An internet connection is required for the initial download; subsequent runs work fully offline.


How to Use

1. Basic usage

from pyaitk.ITT import ITT

text = ITT("screenshot.png")
print(text)

2. Extract text from any image format

EasyOCR supports JPEG, PNG, BMP, TIFF, and more.

text = ITT("photo.jpg")
text = ITT("scan.bmp")
text = ITT("document.tiff")

3. Multi-language recognition

Pass a list of BCP-47 / EasyOCR language codes as the second argument.

# English + French
text = ITT("menu.jpg", languages=["en", "fr"])

# English + Hindi
text = ITT("sign.png", languages=["en", "hi"])

Note: The module-level reader is initialised with ['en'] only. To use other languages reliably, re-initialise easyocr.Reader with the desired codes before calling readtext.

4. Batch processing multiple images

from pathlib import Path

from pyaitk.ITT import ITT

results = {}
for img in Path("./images").glob("*.png"):
    results[img.name] = ITT(str(img))

for name, text in results.items():
    print(f"{name}: {text[:80]}")

5. Using the result downstream

text = ITT("invoice.png")

# Search for keywords
if "TOTAL" in text.upper():
    print("Invoice contains a total amount.")

# Write extracted text to file
with open("output.txt", "w", encoding="utf-8") as f:
    f.write(text)

API Reference

ITT(image_path, languages) → str

Runs OCR on the given image and returns all detected text joined into a single space-separated string.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| image_path | str | (required) | Path to the image file |
| languages | list[str] | ['en'] | EasyOCR language codes to use for recognition |

Returns: str, all detected text regions joined by spaces, in detection order.


Module-level reader

reader = easyocr.Reader(['en'])

The reader is created once when the module is imported. This means:

  • Model files are downloaded on the first import only
  • All subsequent ITT() calls reuse the same loaded model, so there is no per-call overhead
  • If you need languages beyond English, create your own easyocr.Reader instance with the required codes
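
One way to support several language sets without reloading models on every call is to cache one reader per language combination. The sketch below shows the caching pattern with a stand-in reader class so that it runs without EasyOCR installed; FakeReader, get_reader, and image_to_text are hypothetical names. In real use you would construct easyocr.Reader(list(languages)) in place of FakeReader.

```python
from functools import lru_cache


class FakeReader:
    """Stand-in for an OCR reader so the sketch runs without EasyOCR."""

    instances = 0  # counts how many readers were actually constructed

    def __init__(self, languages):
        FakeReader.instances += 1
        self.languages = list(languages)

    def readtext(self, image_path, detail=0):
        # A real reader would return the recognised text fragments here.
        return [f"text from {image_path}", f"langs={'+'.join(self.languages)}"]


@lru_cache(maxsize=None)
def get_reader(languages=("en",)):
    """Create at most one reader per language tuple (tuples are hashable, lists are not)."""
    return FakeReader(languages)


def image_to_text(image_path, languages=("en",)):
    reader = get_reader(tuple(languages))
    return " ".join(reader.readtext(image_path, detail=0))


print(image_to_text("menu.jpg", ["en", "fr"]))
```

Each distinct language combination pays the load cost once; later calls with the same combination reuse the cached reader.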

Supported Languages (selected)

EasyOCR supports 80+ languages. Common codes:

| Code | Language | Code | Language |
| --- | --- | --- | --- |
| en | English | fr | French |
| hi | Hindi | de | German |
| zh | Chinese | ja | Japanese |
| ko | Korean | ar | Arabic |
| es | Spanish | pt | Portuguese |

Full list: https://www.jaided.ai/easyocr


Notes

  • Detection order: text regions are returned in the order EasyOCR detects them, which follows a rough top-to-bottom, left-to-right reading order but may vary for complex layouts.
  • Low-quality images: blurry, low-contrast, or heavily compressed images will reduce accuracy. Pre-processing with a library like Pillow (resize, sharpen, greyscale) can improve results.
  • GPU acceleration: EasyOCR uses the GPU automatically if a CUDA-capable device is available. On CPU, recognition is slower but fully functional.

Examples Summary

from pathlib import Path

from pyaitk.ITT import ITT

# Basic
text = ITT("image.png")

# Multi-language
text = ITT("document.jpg", languages=["en", "fr"])

# Batch
for img in Path("./scans").glob("*.jpg"):
    print(img.name, ITT(str(img)))

# Save to file
with open("extracted.txt", "w", encoding="utf-8") as f:
    f.write(ITT("receipt.png"))

GrammarCorrector: Grammar Correction Pipeline

A three-tier grammar correction system combining SpaCy rule-based morphology, a scikit-learn statistical token corrector, and a PyTorch LSTM Seq2Seq neural model, all unified behind a single GrammarCorrector facade, with a JSON intents system for training on real-world error pairs.


What Is This?

Grammar provides a complete grammar correction pipeline with three escalating tiers:

| Tier | Engine | Requires fitting? | Description |
| --- | --- | --- | --- |
| 1 | SpaCy + regex rules | No | Morphological rules, subject-verb agreement, capitalisation, confused-word pairs |
| 2 | Scikit-learn (Logistic Regression + DictVectorizer) | Yes | Statistical token-level correction trained on (corrupted, correct) pairs |
| 3 | PyTorch LSTM Seq2Seq | Yes | Neural fallback for out-of-vocabulary and complex patterns |

Tier 1 always runs. Tiers 2 and 3 activate after fit() and each falls back gracefully if unavailable.
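
The "always run Tier 1, then apply whatever else is available" behaviour is a simple chain with graceful fallback. The sketch below shows that control flow with toy tier functions; it is not the package's code, and run_tiers, tier1_rules, and tier3_broken are hypothetical names.

```python
import re


def tier1_rules(text):
    """Toy rule tier: capitalise the pronoun "i" and the first letter."""
    text = re.sub(r"\bi\b", "I", text)
    return text[:1].upper() + text[1:]


def tier3_broken(text):
    """Simulates a neural tier that is unavailable at inference time."""
    raise RuntimeError("model not loaded")


def run_tiers(text, tiers):
    """Apply each tier in order; skip unfitted ones and keep the last good output."""
    for name, tier in tiers:
        if tier is None:       # tier was never fitted
            continue
        try:
            text = tier(text)
        except Exception:      # tier failed: fall back to the previous result
            continue
    return text


tiers = [("rules", tier1_rules), ("sklearn", None), ("seq2seq", tier3_broken)]
print(run_tiers("she said i am here", tiers))
# → "She said I am here"
```

Because each tier receives the previous tier's output, a failing or missing tier degrades quality but never loses the corrections already made.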

Additional features:

  • JSON intents system: load structured {incorrect, correct, tag} pairs from a file; two supported layouts (flat and grouped)
  • Auto-corruption engine: 100+ deterministic corruption rules (homophones, slang, typos, contractions) to generate training data from plain correct sentences
  • BPE tokenizer: trained from scratch on the training corpus via HuggingFace tokenizers
  • Context manager interface: loads SpaCy on enter, releases all GPU/CPU resources on exit
  • quick_correct(): one-liner convenience function for rule-only or full pipeline correction

How to Use

1. Rule-only correction (no fitting needed)

Tier 1 (SpaCy + rules) always runs; no training data is required.

from pyaitk.Grammar import GrammarCorrector

with GrammarCorrector() as gc:
    print(gc.correct("i wants to here more about this"))
    # → "I wants to hear more about this."

    print(gc.correct("she go to the market"))
    # → "She goes to the market."

2. Full three-tier pipeline from a sentence list

from pyaitk.Grammar import GrammarCorrector

sentences = [
    "I am going to the market.",
    "She has a beautiful garden.",
    "They were playing football yesterday.",
]

with GrammarCorrector() as gc:
    gc.fit(sentences)
    print(gc.correct("i wants to here more"))
    print(gc.correct("she go to market"))

3. Train from a JSON intents file (recommended)

from pyaitk.Grammar import GrammarCorrector

with GrammarCorrector() as gc:
    gc.fit_from_intents_file("intents.json")
    print(gc.correct("i wants to here you"))

4. Train from intents with a tag filter

Only use specific error categories for training:

from pyaitk.Grammar import GrammarCorrector, load_intents

examples = load_intents("intents.json", tags=["verb_agreement", "capitalisation"])

with GrammarCorrector() as gc:
    gc.fit_from_intents(examples)
    print(gc.correct("he go home"))

5. Merge intents + plain sentences

from pyaitk.Grammar import GrammarCorrector, load_intents

sentences = ["The cat sat on the mat.", "I enjoy reading books."]
examples = load_intents("intents.json")

with GrammarCorrector() as gc:
    gc.fit(sentences, intents=examples)
    print(gc.correct("u r gr8"))

6. Save and load a trained model

from pyaitk.Grammar import GrammarCorrector

# Train and save
with GrammarCorrector() as gc:
    gc.fit_from_intents_file("intents.json")
    gc.save("models/grammar_v1")

# Load and use
with GrammarCorrector() as gc:
    gc.load("models/grammar_v1")
    print(gc.correct("she go to the market"))

7. One-liner convenience function

from pyaitk.Grammar import quick_correct

# Rule-only (no training data)
print(quick_correct("i go to school"))

# With training
sentences = ["She goes to school.", "I am happy."]
print(quick_correct("i go to school", sentences=sentences))

8. Custom configuration

from pyaitk.Grammar import GrammarCorrector, CorrectorConfig

cfg = CorrectorConfig(
    epochs=20,
    hidden_dim=512,
    num_layers=3,
    dropout=0.4,
    learning_rate=5e-4,
    vocab_size=3000,
    device="cuda",
)

with GrammarCorrector(cfg) as gc:
    gc.fit_from_intents_file("intents.json")
    print(gc.correct("they was at home"))

JSON Intents File Format

Two layouts are supported; they can be mixed freely in the same file.

Layout A: flat pair

{
"intents":[
{
"tag":"subject_verb_agreement",
"incorrect":"she go to the market",
"correct":"she goes to the market"
},
{
"tag":"pronoun_capitalisation",
"incorrect":"i am here",
"correct":"I am here"
}
]
}

Layout B: grouped examples

{
"intents":[
{
"tag":"capitalisation",
"examples":[
{"incorrect":"i am here","correct":"I am here"},
{"incorrect":"hi my name","correct":"Hi my name"}
]
}
]
}

Rules:

  • "tag" is optional; it defaults to "general" if omitted
  • Pairs where incorrect == correct are silently skipped
  • Duplicate pairs (case-insensitive) are deduplicated automatically
  • Both layouts may be mixed freely in the same file
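
The rules above can be pictured as a small normalisation pass over the parsed JSON. The following standard-library sketch flattens both layouts, applies the default tag, skips no-op pairs, and removes case-insensitive duplicates. It is an illustration only; normalise_intents is a hypothetical function, and the package's load_intents returns its own example objects rather than tuples.

```python
def normalise_intents(payload):
    """Flatten Layout A and Layout B entries into (tag, incorrect, correct) tuples."""
    seen = set()
    examples = []
    for intent in payload.get("intents", []):
        tag = intent.get("tag", "general")
        # Layout B carries a list under "examples"; Layout A is itself the pair.
        pairs = intent.get("examples", [intent])
        for pair in pairs:
            incorrect, correct = pair.get("incorrect"), pair.get("correct")
            if not incorrect or not correct or incorrect == correct:
                continue  # nothing to learn from a missing or no-op pair
            key = (incorrect.lower(), correct.lower())
            if key in seen:
                continue  # case-insensitive duplicate
            seen.add(key)
            examples.append((tag, incorrect, correct))
    return examples


payload = {"intents": [
    {"tag": "subject_verb_agreement",
     "incorrect": "she go to the market", "correct": "she goes to the market"},
    {"tag": "capitalisation", "examples": [
        {"incorrect": "i am here", "correct": "I am here"},
        {"incorrect": "I AM HERE", "correct": "I am here"},  # duplicate ignoring case
        {"incorrect": "same", "correct": "same"},            # no-op pair
    ]},
    {"incorrect": "u r gr8", "correct": "you are great"},    # no tag given
]}

for example in normalise_intents(payload):
    print(example)
```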

Loading and inspecting intents

from pyaitk.Grammar import load_intents, intents_summary, intents_to_pairs

# Load all
examples = load_intents("intents.json")

# Load with tag filter
examples = load_intents("intents.json", tags=["verb_agreement"])

# Load from a dict (useful for testing)
examples = load_intents({
    "intents": [{"tag": "test", "incorrect": "i go", "correct": "I go"}]
})

# Summary: {tag: count}
summary = intents_summary(examples)
print(summary) # {"capitalisation": 5, "verb_agreement": 8}

# Convert to raw pairs
pairs = intents_to_pairs(examples) # [("i go", "I go"), ...]

API Reference

GrammarCorrector(config?)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| config | CorrectorConfig or None | None | Uses CorrectorConfig() defaults if None |

Methods

| Method | Returns | Description |
| --- | --- | --- |
| fit(sentences, intents?) | self | Train all three tiers; auto-corrupts sentences; merges intents pairs |
| correct(text) | str | Apply all available tiers in order |
| fit_from_intents(intents, sentences?, tags?) | self | Train primarily from IntentExample list |
| fit_from_intents_file(path, sentences?, tags?) | self | Load JSON file and train in one call |
| save(path) | None | Save Seq2Seq weights + BPE tokenizer to a directory |
| load(path) | self | Restore weights + tokenizer from a directory |

All fit* methods return self for chaining:

gc.fit_from_intents_file("intents.json").correct("she go to market")

CorrectorConfig fields

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| vocab_size | int | 2000 | BPE tokenizer vocabulary size |
| special_tokens | list | ["<s>", "</s>", "<unk>", "<pad>"] | Special tokens |
| emb_dim | int | 128 | Token embedding dimension |
| hidden_dim | int | 256 | LSTM hidden layer size |
| num_layers | int | 2 | Number of LSTM layers |
| dropout | float | 0.3 | Dropout rate |
| epochs | int | 10 | Seq2Seq training epochs |
| learning_rate | float | 1e-3 | Adam optimizer learning rate |
| teacher_forcing_ratio | float | 0.5 | Probability of using teacher forcing per step |
| max_decode_len | int | 120 | Max tokens to generate during inference |
| spacy_model | str | en_core_web_sm | SpaCy model name |
| device | str | "cuda" if available else "cpu" | PyTorch device |

Exception Hierarchy

GrammarCorrectorError
├── NotFittedError: correct() called before fit()
├── TokenizerError: BPE tokenizer missing special tokens
└── IntentsValidationError: JSON intents file fails schema validation

Data Helpers

from pyaitk.Grammar import corrupt_sentence, build_dataset

# Apply deterministic corruption rules to one sentence
noisy = corrupt_sentence("She goes to the market.")
# → "She go 2 the market."

# Build (corrupted, correct) pairs from a list of correct sentences
dataset = build_dataset(["I am happy.", "She goes to school."])
# → [("i'm happy.", "I am happy."), ...]

The corruption engine covers 100+ rules across five categories: homophones (their/there/they're), internet slang (u/r/gr8/lol), informal contractions (wanna/gonna/gotta), common typos (teh/freind/recieve), and punctuation corruption.
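
The idea behind deterministic corruption is an ordered list of rewrite rules applied to correct text. The sketch below uses a handful of invented rules to show the mechanism; it is far smaller than the engine described above, and RULES, corrupt, and build_pairs are hypothetical names that do not mirror the package's rule set.

```python
import re

# Ordered (pattern, replacement) rules; multi-word rules come first so that
# "going to" becomes "gonna" before the single-word "to" rule can fire.
RULES = [
    (r"\bgoing to\b", "gonna"),
    (r"\bgoes\b", "go"),
    (r"\byou are\b", "u r"),
    (r"\bto\b", "2"),
    (r"\bfriend\b", "freind"),
]


def corrupt(sentence):
    """Apply every rule in order; the same input always gives the same output."""
    for pattern, replacement in RULES:
        sentence = re.sub(pattern, replacement, sentence)
    return sentence


def build_pairs(sentences):
    """Return (corrupted, correct) pairs, skipping sentences no rule changed."""
    pairs = []
    for sentence in sentences:
        noisy = corrupt(sentence)
        if noisy != sentence:
            pairs.append((noisy, sentence))
    return pairs


print(corrupt("She goes to the market."))
# → "She go 2 the market."
print(build_pairs(["I am happy.", "She goes to school."]))
```

Determinism matters here because it makes the generated training set reproducible from the same list of clean sentences.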


Low-level API

These are available for advanced use when you want direct access to individual tiers or the training loop:

from pyaitk.Grammar import (
    rule_based_corrector,  # Tier 1 standalone
    train_tokenizer,  # BPE tokenizer training
    Seq2Seq, Encoder, Decoder,  # Tier 3 PyTorch models
    train_model,  # Tier 3 training loop
    correct_sentence_neural,  # Tier 3 greedy inference
)
import spacy

nlp = spacy.load("en_core_web_sm")
corrected = rule_based_corrector("she go to the market", nlp)
print(corrected) # "She goes to the market."

Architecture Overview

GrammarCorrector.correct(text)
                │
                ▼
┌───────────────────────────────┐
│  Tier 1: SpaCy + Regex        │  Always runs
│  • Lowercase "i" → "I"        │
│  • Sentence capitalisation    │
│  • Subject-verb agreement     │
│  • Confused-word pairs        │
└───────────────┬───────────────┘
                │
                ▼  (if fitted)
┌───────────────────────────────┐
│  Tier 2: Sklearn Pipeline     │  Runs after fit()
│  • DictVectorizer features    │
│  • OneVsRest LogisticReg.     │
│  • Token-level sequence fix   │
└───────────────┬───────────────┘
                │
                ▼  (if fitted)
┌───────────────────────────────┐
│  Tier 3: LSTM Seq2Seq         │  Runs after fit()
│  • BPE tokenizer              │
│  • Encoder (embedding + LSTM) │
│  • Decoder (teacher forcing)  │
│  • Greedy inference           │
└───────────────────────────────┘
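
The control flow in the diagram (one tier that always runs, later tiers gated on fit()) can be sketched in a few lines. TieredCorrector and its word-substitution "learned tier" are hypothetical stand-ins for illustration only; they do not reproduce the sklearn or LSTM stages.

```python
import re

class TieredCorrector:
    """Toy model of the gating logic only, not of the real tiers."""

    def __init__(self):
        self._fitted = False
        self._learned = {}

    def fit(self, pairs):
        # Stand-in for Tiers 2 and 3: memorise whole-word substitutions.
        self._learned = dict(pairs)
        self._fitted = True

    def _tier1(self, text):
        text = re.sub(r"\bi\b", "I", text)    # lowercase "i" -> "I"
        return text[:1].upper() + text[1:]    # sentence capitalisation

    def _learned_tier(self, text):
        return " ".join(self._learned.get(word, word) for word in text.split())

    def correct(self, text):
        text = self._tier1(text)              # always runs
        if self._fitted:                      # later tiers only after fit()
            text = self._learned_tier(text)
        return text

corrector = TieredCorrector()
before = corrector.correct("i think she go home")
corrector.fit([("go", "goes")])
after = corrector.correct("i think she go home")
print(before)
print(after)
```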

Examples Summary

from pyaitk.Grammar import GrammarCorrector, load_intents, quick_correct

# Rule-only (no fitting)
with GrammarCorrector() as gc:
 print(gc.correct("i go to school"))

# Full pipeline from sentences
with GrammarCorrector() as gc:
 gc.fit(["She goes to school.", "I am happy."])
 print(gc.correct("she go to school"))

# From intents file
with GrammarCorrector() as gc:
 gc.fit_from_intents_file("intents.json", tags=["verb_agreement"])
 print(gc.correct("he dont know"))

# Save and reload
with GrammarCorrector() as gc:
 gc.fit_from_intents_file("intents.json")
 gc.save("models/v1")

with GrammarCorrector() as gc:
 gc.load("models/v1")
 print(gc.correct("they was playing"))

# One-liner
print(quick_correct("i go home"))

EYE - Real-Time Object Detection

A production-ready webcam object detection application built on YOLOv8 (Ultralytics) and CustomTkinter. It provides a full GUI with live controls, headless CLI detection, a unified EyeSession facade, and context-manager support on every public class, along with threading fixes and features that a naive implementation lacks.


What Is This?

EYE is a fully-featured real-time object detection module that provides:

  • Full context-manager protocol: every public class (CameraManager, ObjectDetector, DetectionApp, EyeSession) supports with blocks with guaranteed resource cleanup
  • EyeSession facade: unified entry point with .gui() and .headless() factory context managers covering the complete lifecycle
  • Live webcam detection: reads frames from any connected camera and runs YOLOv8 inference at a capped FPS
  • Full GUI application: dark-mode CustomTkinter UI with video feed, control panel, and detection log
  • Headless CLI mode: simple_detect() uses context managers internally for safe camera/detector teardown
  • Hot-swappable model variants: switch between yolov8n/s/m/l/x.pt at runtime without restarting
  • Class filter: show bounding boxes only for specific classes (e.g. person, car)
  • Confidence threshold: adjustable slider (0.1–0.95), persisted to ~/.yolov8_app.json
  • Detection heat-map: alpha-blended overlay that accumulates and decays detection positions
  • Video recording: save annotated footage to .mp4 at camera resolution
  • Screenshot: save the current annotated frame as a timestamped JPG
  • Live FPS counter: exponential moving average over the last 30 frames
  • Detection log: scrollable timestamped list of the last 200 detection events
  • Keyboard shortcuts: Space (pause), S (screenshot), R (record), Q (quit)
  • Serialisable config: DetectionConfig saves/loads all settings as JSON
  • Thread-safe design: all widget mutations dispatched via self.after(), frame buffer protected with threading.Lock
  • Legacy aliases: EYE() and OpenEYE() for backwards compatibility with core.py / pyaitk
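
As an illustration of the FPS counter item above, here is a small sketch of an exponential moving average over frame times. FpsMeter is a hypothetical name, and the smoothing factor 2 / (window + 1) is the common "span" convention, assumed here rather than taken from the package.

```python
class FpsMeter:
    """Exponentially smoothed frames-per-second estimate."""

    def __init__(self, window=30):
        # Span convention: a larger window gives a smoother, slower estimate.
        self.alpha = 2.0 / (window + 1)
        self.fps = None

    def update(self, frame_seconds):
        """Feed the duration of the latest frame and return the smoothed FPS."""
        instantaneous = 1.0 / frame_seconds
        if self.fps is None:
            self.fps = instantaneous          # first sample seeds the average
        else:
            self.fps = self.alpha * instantaneous + (1.0 - self.alpha) * self.fps
        return self.fps

meter = FpsMeter(window=30)
first = meter.update(0.05)     # a 20 fps frame
second = meter.update(0.025)   # one faster frame nudges the average upward
print(first, second)
```

Smoothing avoids a flickering readout: a single fast or slow frame moves the displayed value only slightly.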

Note

YOLOv8 model weights are downloaded automatically on first use (~6 MB for yolov8n.pt). An internet connection is required for the initial download; subsequent runs work offline.


Context-Manager Patterns

All four public classes now support the full context-manager protocol. Resources are always released β€” even when exceptions are raised.
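
The two patterns used throughout this section (an object with __enter__/__exit__ and a class-level factory built with @contextmanager) can be sketched without any camera hardware. FakeCamera is a hypothetical stand-in; only the method names open, release, is_opened, and open_device echo the documented API.

```python
from contextlib import contextmanager

class FakeCamera:
    """Stand-in resource that records whether it is open."""

    def __init__(self, index=0):
        self.index = index
        self.is_opened = False

    def open(self):
        self.is_opened = True
        return True

    def release(self):
        self.is_opened = False

    def __enter__(self):
        self.open()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.release()
        return False   # never swallow the caller's exception

    @classmethod
    @contextmanager
    def open_device(cls, index=0):
        """Factory form: build, open, yield, and always release."""
        cam = cls(index)
        cam.open()
        try:
            yield cam
        finally:
            cam.release()

cam = FakeCamera()
try:
    with cam:
        raise RuntimeError("simulated failure inside the block")
except RuntimeError:
    pass
print(cam.is_opened)   # released even though the block raised
```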

EyeSession: recommended top-level API

from pyaitk.eye import EyeSession

# GUI mode (blocks until window closed)
with EyeSession.gui() as session:
    session.run()
    print(session.last_detections)

# Headless mode (blocks until target seen or 'q' pressed)
with EyeSession.headless(target_class="person") as session:
    detected = session.run()
    print(detected)

# GUI with custom config
from pyaitk.eye import DetectionConfig
cfg = DetectionConfig(model_name="yolov8s.pt", show_heatmap=True, confidence_threshold=0.6)
with EyeSession.gui(config=cfg) as session:
    session.run()

CameraManager: camera resource

from pyaitk.eye import CameraManager

# __enter__ / __exit__
with CameraManager(camera_index=0, width=640, height=480) as cam:
    ret, frame = cam.read()

# Factory context manager
with CameraManager.open_device(0, width=1280, height=720) as cam:
    ret, frame = cam.read()
    print(cam.is_opened)  # True

ObjectDetector: inference resource

from pyaitk.eye import ObjectDetector, DetectionConfig

cfg = DetectionConfig()
# frame: a BGR image, for example one returned by CameraManager.read()

# __enter__ / __exit__ (provide your own model)
from pyaitk.eye import ModelLoader
model = ModelLoader.load_model("yolov8n.pt")
with ObjectDetector(model, cfg) as det:
    classes, annotated, count, confs = det.detect(frame)

# Factory context manager (loads model internally)
with ObjectDetector.from_model("yolov8n.pt", cfg) as det:
    classes, annotated, count, confs = det.detect(frame)

DetectionApp: GUI application

from pyaitk.eye import DetectionApp, DetectionConfig
import customtkinter as ctk

# __enter__ / __exit__
ctk.set_appearance_mode("dark")
ctk.set_default_color_theme("blue")
with DetectionApp(DetectionConfig()) as app:
    app.mainloop()

# Factory context manager (sets up CTk theme automatically)
with DetectionApp.run_app(DetectionConfig()) as app:
    app.mainloop()

Composing context managers (low-level pipeline)

from pyaitk.eye import CameraManager, ObjectDetector
import cv2

with CameraManager(0) as cam:
    with ObjectDetector.from_model("yolov8n.pt") as det:
        while True:
            ret, frame = cam.read()
            if not ret:
                break
            classes, annotated, count, confs = det.detect(frame)
            cv2.imshow("Detection", annotated)
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
cv2.destroyAllWindows()
# Camera and detector released automatically on exit

How to Use

1. Launch the GUI (simplest)

from pyaitk.eye import launch_gui

launch_gui()

2. GUI with custom config

from pyaitk.eye import launch_gui, DetectionConfig

config = DetectionConfig(
 model_name="yolov8s.pt",
 confidence_threshold=0.6,
 target_fps=25,
 show_heatmap=True,
 filter_classes=["person", "car"],
)
launch_gui(config)

3. Headless detection (no GUI)

simple_detect() opens an OpenCV window and runs until the target class is seen or q is pressed. It uses the CameraManager and ObjectDetector context managers internally, so the camera is always released.

from pyaitk.eye import simple_detect

detected = simple_detect(target_class="person")
print("Detected:", detected)

macOS note: cv2.imshow must be called from the main thread. Only call simple_detect() from the main thread.

4. Headless with custom config

from pyaitk.eye import simple_detect, DetectionConfig

config = DetectionConfig(
 model_name="yolov8m.pt",
 confidence_threshold=0.45,
 camera_index=1,
 target_fps=20,
)
detected = simple_detect(target_class="car", config=config)
print(detected)

5. Legacy aliases (pyaitk / core.py compatibility)

from pyaitk.eye import EYE, OpenEYE

detected = EYE()  # → simple_detect()
OpenEYE()         # → launch_gui()

6. Low-level components with context managers

from pyaitk.eye import CameraManager, ObjectDetector, DetectionConfig
import cv2

config = DetectionConfig()

with CameraManager(camera_index=0, width=640, height=480) as cam:
    with ObjectDetector.from_model("yolov8n.pt", config) as det:
        ret, frame = cam.read()
        if ret:
            detected_classes, annotated, count, confidences = det.detect(frame)
            print(f"Detected {count} objects: {detected_classes}")
            cv2.imshow("Frame", annotated)
            cv2.waitKey(0)
cv2.destroyAllWindows()
# No explicit release needed: the context managers handle it

GUI Controls Reference

Control | Description
Model selector | Switch between yolov8n/s/m/l/x.pt (hot-swap, no restart needed)
Confidence slider | Detection threshold from 0.10 to 0.95
Class filter field | Type a class name and press Enter or "Add Filter"
Clear Filter button | Remove all filters (show all classes)
Heatmap toggle | Enable/disable the alpha-blended detection heat-map
Pause / Resume button | Freeze/unfreeze the video loop
Screenshot button | Save annotated frame as screenshot_YYYYMMDD_HHMMSS.jpg
Record button | Start/stop saving annotated video as recording_*.mp4
Detection log | Scrollable list of last 200 timestamped events
Clear Log button | Empty the detection log
Status bar | Camera info, resolution, active model name

Keyboard Shortcuts

Key | Action
Space | Pause / Resume
S | Screenshot
R | Start / Stop recording
Q | Quit

API Reference

EyeSession

Unified high-level facade with two factory context managers.

EyeSession.gui(config?) → context manager

Launches the full CustomTkinter GUI. Blocks on session.run() until the window is closed. Tears down the app on exit.

with EyeSession.gui(config=DetectionConfig(model_name="yolov8s.pt")) as session:
    session.run()
    print(session.last_detections)

EyeSession.headless(target_class, config?) → context manager

Headless detection. Blocks on session.run() until the target is detected or q is pressed. Calls cv2.destroyAllWindows() on exit.

with EyeSession.headless("car") as session:
    detected = session.run()

Property | Type | Description
last_detections | list[str] | Class names detected at the time of exit

launch_gui(config) → None

Functional launcher. Uses the DetectionApp.run_app() context manager internally.

Parameter | Type | Default | Description
config | DetectionConfig or None | None | Load from ~/.yolov8_app.json if None

simple_detect(target_class, config) → list[str]

Headless OpenCV detection loop. Uses the CameraManager and ObjectDetector context managers internally.

Parameter | Type | Default | Description
target_class | str | "person" | Class name to trigger exit on
config | DetectionConfig or None | None | Uses defaults if None

Returns: sorted list of all class names detected at the time of exit.


CameraManager(camera_index, width, height)

Parameter | Type | Default | Description
camera_index | int | 0 | OpenCV device index
width | int | 640 | Requested frame width
height | int | 480 | Requested frame height

Method / Property | Returns | Description
open() | bool | Open the device; returns True on success
read() | tuple[bool, ndarray or None] | Read one frame
release() | None | Release the capture device
is_opened | bool | True if device is currently open
__enter__ / __exit__ | - | Opens on enter, releases on exit
CameraManager.open_device(idx, w, h) | context manager | Factory @contextmanager

ObjectDetector(model, config)

Method | Returns | Description
detect(frame) | tuple | Run inference; see return table below
__enter__ / __exit__ | - | Clears heatmap accumulator on exit
ObjectDetector.from_model(name, config?) | context manager | Loads model + yields detector

detect() return values:

Field | Type | Description
detected_classes | set[str] | All class names above confidence threshold
annotated_frame | np.ndarray | Original-resolution BGR frame with bounding boxes
object_count | int | Total number of detections
class_confidences | dict[str, float] | Highest confidence per class name
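
To show how the three non-image fields relate to each other, here is a sketch that derives them from a list of raw detections. The input shape (class name, confidence pairs), the function name summarise, and the way the class filter is applied to the summary are assumptions for illustration; the package works on YOLO result objects.

```python
def summarise(detections, threshold=0.5, filter_classes=()):
    """Reduce (class_name, confidence) pairs to classes, count, and best scores."""
    classes = set()
    best = {}
    count = 0
    for name, confidence in detections:
        if confidence < threshold:
            continue                                   # below the confidence threshold
        if filter_classes and name not in filter_classes:
            continue                                   # an empty filter means "all classes"
        count += 1
        classes.add(name)
        best[name] = max(confidence, best.get(name, 0.0))
    return classes, count, best

raw = [("person", 0.9), ("person", 0.6), ("car", 0.4), ("dog", 0.7)]
classes, count, best = summarise(raw)
print(sorted(classes), count, best)
```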

DetectionApp(config)

Method | Description
mainloop() | Start the Tk event loop (blocks)
__enter__ / __exit__ | Calls _close_app() on exit
DetectionApp.run_app(config?) | Factory context manager; sets CTk theme

DetectionConfig fields

Field | Type | Default | Description
model_name | str | "yolov8n.pt" | YOLOv8 weight file name
target_class | str | "person" | Class to watch for in headless mode
confidence_threshold | float | 0.5 | Minimum detection confidence (0.0–1.0)
frame_width | int | 640 | Requested camera frame width
frame_height | int | 480 | Requested camera frame height
detection_size | tuple[int,int] | (416, 416) | Resize resolution for YOLO inference
camera_index | int | 0 | OpenCV camera index
target_fps | int | 30 | FPS cap for the detection loop
show_heatmap | bool | False | Enable detection heat-map overlay
filter_classes | list[str] | [] | Restrict boxes to these classes; empty = all

Persistence:

config = DetectionConfig(confidence_threshold=0.7)
config.save()  # → ~/.yolov8_app.json

config = DetectionConfig.load() # restore from file
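
A JSON-backed config of this kind can be sketched with a dataclass. SketchConfig is a hypothetical stand-in with a subset of the documented fields; the explicit path argument and the fall-back-to-defaults behaviour when the file is missing are assumptions, not confirmed package behaviour.

```python
import json
import tempfile
from dataclasses import asdict, dataclass, field
from pathlib import Path

@dataclass
class SketchConfig:
    model_name: str = "yolov8n.pt"
    confidence_threshold: float = 0.5
    target_fps: int = 30
    show_heatmap: bool = False
    filter_classes: list = field(default_factory=list)

    def save(self, path):
        """Write every field as JSON."""
        Path(path).write_text(json.dumps(asdict(self), indent=2))

    @classmethod
    def load(cls, path):
        """Read JSON back; ignore unknown keys and fall back to defaults."""
        path = Path(path)
        if not path.exists():
            return cls()
        data = json.loads(path.read_text())
        known = {k: v for k, v in data.items() if k in cls.__dataclass_fields__}
        return cls(**known)

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "app.json"
    SketchConfig(confidence_threshold=0.7).save(target)
    restored = SketchConfig.load(target)
    fallback = SketchConfig.load(Path(tmp) / "missing.json")

print(restored.confidence_threshold, fallback.confidence_threshold)
```

Filtering to known fields means an older or newer settings file with extra keys does not break loading.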

ModelLoader.load_model(model_name) → YOLO | None

Tries four strategies in order:

  1. Package resource (if running inside a package)
  2. Script directory and adjacent eye/ subfolder
  3. Current working directory
  4. Auto-download from Ultralytics
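
The local part of this lookup order amounts to a first-match search over candidate directories. The sketch below assumes a hypothetical helper resolve_weights and leaves out both the package-resource lookup and the download step; returning None is where a real loader would move on to its final strategy.

```python
import tempfile
from pathlib import Path

def resolve_weights(name, search_dirs):
    """Return the first existing file called name, searching in the given order."""
    for directory in search_dirs:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None   # caller would fall back to the last strategy (download)

with tempfile.TemporaryDirectory() as tmp:
    script_dir = Path(tmp) / "script"
    cwd_dir = Path(tmp) / "cwd"
    script_dir.mkdir()
    cwd_dir.mkdir()
    (cwd_dir / "yolov8n.pt").write_bytes(b"stub")   # placeholder, not real weights

    found = resolve_weights("yolov8n.pt", [script_dir, cwd_dir])
    found_in = found.parent.name
    missing = resolve_weights("yolov8x.pt", [script_dir, cwd_dir])

print(found_in, missing)
```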

Architecture Overview

EyeSession
├── EyeSession.gui() → DetectionApp.run_app() → DetectionApp.__enter__/__exit__
└── EyeSession.headless() → simple_detect()
    ├── CameraManager.__enter__/__exit__
    └── ObjectDetector.from_model().__enter__/__exit__

DetectionApp (GUI)
├── Main thread ──→ CTk event loop + all widget mutations (via self.after())
│        ↑
│   polls frame queue at ~60 Hz
│        ↑
└── Video thread ──→ camera.read() → detector.detect() → FramePacket → Queue(maxsize=2)

Resource ownership
──────────────────
CameraManager.__exit__ → cap.release()
ObjectDetector.__exit__ → _heatmap = None
DetectionApp.__exit__ → thread.join(2s) → writer.release() → camera.release() → destroy()
EyeSession.__exit__ → app._close_app() (if GUI) / cv2.destroyAllWindows() (if headless)
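
The video-thread to main-thread hand-off through a two-slot queue can be sketched with the standard library alone. The drop-oldest policy used when the queue is full is an assumption about how such a pipeline typically stays current; the names produce_frames and drain are hypothetical, and plain integers stand in for FramePacket objects.

```python
import queue
import threading

def produce_frames(frame_queue, n_frames):
    """Producer: push frame ids, discarding the stalest entry when full."""
    for frame_id in range(n_frames):
        try:
            frame_queue.put_nowait(frame_id)
        except queue.Full:
            try:
                frame_queue.get_nowait()      # drop the oldest frame
            except queue.Empty:
                pass                          # consumer emptied it in the meantime
            frame_queue.put_nowait(frame_id)  # single producer, so a slot is free

def drain(frame_queue):
    """Consumer: take whatever is waiting without blocking the event loop."""
    items = []
    while True:
        try:
            items.append(frame_queue.get_nowait())
        except queue.Empty:
            return items

frames = queue.Queue(maxsize=2)
worker = threading.Thread(target=produce_frames, args=(frames, 10), daemon=True)
worker.start()
worker.join()
latest = drain(frames)
print(latest)   # only the two newest frame ids remain
```

A small bounded queue keeps latency low: the GUI never renders a backlog of stale frames, at the cost of skipping some.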

Examples Summary

# ── EyeSession (recommended) ──────────────────────────────────

from pyaitk.eye import EyeSession, DetectionConfig

# GUI
with EyeSession.gui() as session:
    session.run()
    print(session.last_detections)

# GUI with custom config
cfg = DetectionConfig(model_name="yolov8s.pt", show_heatmap=True)
with EyeSession.gui(config=cfg) as session:
    session.run()

# Headless
with EyeSession.headless("car") as session:
    detected = session.run()

# ── Functional API ────────────────────────────────────────────

from pyaitk.eye import launch_gui, simple_detect
launch_gui()
detected = simple_detect("person")

# ── Low-level context managers ────────────────────────────────

from pyaitk.eye import CameraManager, ObjectDetector
import cv2

with CameraManager(0) as cam:
    with ObjectDetector.from_model("yolov8n.pt") as det:
        ret, frame = cam.read()
        classes, annotated, count, confs = det.detect(frame)
cv2.destroyAllWindows()

# ── DetectionApp directly ─────────────────────────────────────

from pyaitk.eye import DetectionApp, DetectionConfig
with DetectionApp.run_app(DetectionConfig()) as app:
    app.mainloop()

# ── Legacy aliases ────────────────────────────────────────────

from pyaitk.eye import EYE, OpenEYE
EYE()      # headless detect
OpenEYE()  # launch GUI

CLSE - Compositional Latent Synthesis Engine

A complete, self-contained Compositional Latent Synthesis Engine pipeline built entirely on NumPy, NLTK, scikit-learn, and PyTorch, with no Stable Diffusion or external model weights required. It converts natural-language prompts into images through a multi-stage NLP → neural model → renderer architecture, with support for procedural art, visual effects, animation, streaming, and a CLI.


What Is This?

The CLSE system is a nine-module package covering every layer of the Compositional Latent Synthesis Engine stack:

Module | Class / Entry Point | Responsibility
TTI_config.py | TTIConfig, get_config() | Master config: all tuneable parameters
TTI_core.py | TTIImage, ImageCanvas, ColorUtils, ImageIO | Pixel-level image engine (BMP/PNG/JPEG I/O, drawing)
TTI_art.py | ProceduralArt, VisualEffects, StreamingWriter, AnimationEngine | Procedural art, effects, animation, large-image streaming
TTI_ai.py | TTIGenerator, NLPAnalyser, PromptAnalysis | NLP pipeline + numpy VAE renderer + colour predictor
TTI_model.py | TTIModel, TTITrainer, TTILoss | 4-layer transformer VAE (3.8M params), training loop
TTI_dataset.py | TTIDataset, Vocabulary | Synthetic 50k-sample dataset generator with caching
TTI_pipeline.py | TTIPipeline | Full end-to-end pipeline connecting all modules
TTI_train.py | CLI training script | Production training with checkpointing and early stopping
TTI_main.py | TTIPipeline façade + main() | Unified entry point + full CLI

Note

These *.py files are internal modules of CLSE (Compositional Latent Synthesis Engine).


Quick Start

from pyaitk.CLSE import TTIPipeline

pipe = TTIPipeline()

# Generate from a text prompt
img = pipe.generate("a calm blue ocean at sunset", output="ocean.png")

# Procedural art (no model needed)
img = pipe.art("mandelbrot", output="fractal.png")

# Apply an effect to an existing image
img = pipe.effect("sepia", "photo.png", output="photo_sepia.png")

# Analyse a prompt
analysis = pipe.analyse("dark gothic castle at midnight")
print(analysis.scene_type, analysis.colour_matches)

Module Guide


Configuration

from pyaitk.CLSE import TTIConfig, get_config, update_config, reset_config

# Global singleton (auto-discovers tti_config.json next to the module)
cfg = get_config()

# Read settings
print(cfg.image.default_width) # 512
print(cfg.ai.model_type) # "vae_numpy"
print(cfg.art.fractal_max_iter) # 256
print(cfg.paths.output_dir) # "tti_output"

# Bulk update
update_config(
 image={"default_width": 1024, "default_height": 1024},
 ai={"seed": 42, "num_inference_steps": 100},
)

# Save / load JSON snapshot
cfg.save("tti_config.json")
cfg2 = TTIConfig.load("tti_config.json")

# Create all output directories
cfg.ensure_dirs()

# Reset to factory defaults
reset_config()

Config sections:

Section | Dataclass | Key fields
cfg.image | ImageConfig | default_width, default_height, default_format, background_color, jpeg_quality
cfg.ai | AIConfig | model_type, latent_dim, vocab_size, num_inference_steps, guidance_scale, seed
cfg.art | ArtConfig | fractal_max_iter, blur_default_radius, noise_default_intensity, animation_fps
cfg.paths | PathConfig | output_dir, model_dir, cache_dir, log_dir
cfg.log | LogConfig | level, log_to_file, log_filename, show_progress

Image Engine

from pyaitk.CLSE import TTIImage, ImageCanvas, ColorUtils, ImageIO, ImageValidator

# Create a blank image
img = TTIImage(width=512, height=512, bpp=24, background=(20, 30, 60))

# Pixel operations
img.set_pixel(100, 100, (255, 0, 0))
color = img.get_pixel(100, 100) # (255, 0, 0)

# NumPy array interop
arr = img.to_array() # shape (H, W, 3), dtype=uint8
img.from_array(arr)

# Save and load
img.save("output.png")
img.save("output.jpg", fmt="jpeg")
img.save("output.bmp", fmt="bmp")
img2 = TTIImage.load("output.png")

# Drawing (ImageCanvas)
canvas = ImageCanvas(img)
canvas.line(0, 0, 511, 511, (255, 255, 0))
canvas.rectangle(50, 50, 200, 200, (255, 100, 0), filled=True)
canvas.circle(256, 256, 100, (0, 200, 255), filled=False)
canvas.fill_background((10, 10, 30))

# Colour utilities
rgb = ColorUtils.hsv_to_rgb(0.6, 0.8, 0.9)
blended = ColorUtils.lerp((255, 0, 0), (0, 0, 255), t=0.5)
rgba = ColorUtils.to_rgba((100, 150, 200)) # adds alpha=255
clamped = ColorUtils.clamp(300)  # → 255

# Multi-format I/O (static)
img = ImageIO.load("photo.png")
ImageIO.save(img, "photo.jpeg", quality=85)

# Integrity check
ImageValidator.validate(img) # raises TTIImageError if corrupt
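
For readers who want to see what colour helpers like those in the example above typically compute, here is a standalone sketch using only the standard library. The functions clamp, lerp, and hsv_to_rgb are local illustrations, and the rounding behaviour is an assumption; they are not the ColorUtils implementations.

```python
import colorsys

def clamp(value, low=0, high=255):
    """Force a channel value into the 8-bit range."""
    return max(low, min(high, int(value)))

def lerp(color_a, color_b, t):
    """Linear blend of two RGB tuples; t=0 gives color_a, t=1 gives color_b."""
    return tuple(clamp(round(a + (b - a) * t)) for a, b in zip(color_a, color_b))

def hsv_to_rgb(h, s, v):
    """Convert HSV components in [0, 1] to an 8-bit RGB tuple."""
    r, g, b = colorsys.hsv_to_rgb(h, s, v)
    return tuple(clamp(round(channel * 255)) for channel in (r, g, b))

print(clamp(300))
print(lerp((255, 0, 0), (0, 0, 255), 0.5))
print(hsv_to_rgb(0.0, 1.0, 1.0))
```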

Exception hierarchy:

TTIError
├── TTIImageError: pixel-level or dimension errors
└── TTIIOError: file read/write failures

Procedural Art & Effects

ProceduralArt

All methods are static and return TTIImage objects.

from pyaitk.CLSE import ProceduralArt, VisualEffects, StreamingWriter, AnimationEngine

# Fractals
img = ProceduralArt.mandelbrot_set(width=800, height=600)
img = ProceduralArt.julia_set(800, 600, c_real=-0.7, c_imag=0.27015)
img = ProceduralArt.sierpinski_triangle(512, 512, depth=7)

# Patterns
img = ProceduralArt.plasma(512, 512)
img = ProceduralArt.voronoi(512, 512, n_cells=25, seed=42)
img = ProceduralArt.perlin_noise_image(512, 512, octaves=4)

# Gradients
img = VisualEffects.create_linear_gradient(512, 512, (70, 130, 200), (200, 80, 120))
img = VisualEffects.create_radial_gradient(512, 512, (255, 220, 50), (30, 30, 120))

VisualEffects

Apply filters to existing TTIImage objects (all return the modified image):

img = VisualEffects.blur(img, radius=3)
img = VisualEffects.gaussian_blur(img, sigma=2.0)
img = VisualEffects.sharpen(img, factor=1.5)
img = VisualEffects.edge_detect(img)
img = VisualEffects.emboss(img)
img = VisualEffects.grayscale(img)
img = VisualEffects.sepia(img, strength=0.8)
img = VisualEffects.invert(img)
img = VisualEffects.add_noise(img, intensity=0.15)
img = VisualEffects.pixelate(img, block_size=10)
img = VisualEffects.vignette(img, strength=0.6)
img = VisualEffects.adjust_brightness(img, factor=1.2)
img = VisualEffects.adjust_contrast(img, factor=1.3)
img = VisualEffects.blend(img1, img2, alpha=0.5)

StreamingWriter: large images without full RAM

from pyaitk.CLSE import StreamingWriter

# Write a 4000×4000 image row by row
with StreamingWriter("large.png", width=4000, height=4000, bpp=24) as sw:
    for y in range(4000):
        row = [(y % 255, 100, 200)] * 4000  # list of RGB tuples
        sw.write_row(row)
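
The memory benefit comes from never holding more than one row at a time. The sketch below demonstrates the same idea with a hypothetical RowStreamWriter that emits binary PPM (P6), chosen because it needs no compression library; it is not how the package writes PNG files.

```python
import os
import tempfile

class RowStreamWriter:
    """Write an RGB image one row at a time as binary PPM (P6)."""

    def __init__(self, path, width, height):
        self.path = path
        self.width = width
        self.height = height
        self.rows_written = 0

    def __enter__(self):
        self._fh = open(self.path, "wb")
        header = f"P6\n{self.width} {self.height}\n255\n"
        self._fh.write(header.encode("ascii"))
        return self

    def write_row(self, row):
        if len(row) != self.width:
            raise ValueError("row length must equal image width")
        # Flatten [(r, g, b), ...] into raw bytes; only this row is in memory.
        self._fh.write(bytes(channel for pixel in row for channel in pixel))
        self.rows_written += 1

    def __exit__(self, exc_type, exc, tb):
        self._fh.close()
        return False

with tempfile.TemporaryDirectory() as tmp:
    out = os.path.join(tmp, "gradient.ppm")
    with RowStreamWriter(out, width=4, height=3) as writer:
        for y in range(3):
            writer.write_row([(y * 100, 100, 200)] * 4)
    size = os.path.getsize(out)
    rows = writer.rows_written

print(size, rows)
```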

AnimationEngine: frame sequences

from pyaitk.CLSE import AnimationEngine

engine = AnimationEngine(fps=24)
frames = engine.generate_frames(base_img, n_frames=48, mode="zoom")
engine.save_frames(frames, output_dir="frames/", fmt="png")

CustomBitDepth: non-standard pixel formats

from pyaitk.CLSE import CustomBitDepth

# 16-bit per channel, 3 channels
cbd = CustomBitDepth(width=256, height=256, bits_per_channel=16, n_channels=3)
cbd.set_pixel(0, 0, [65535, 0, 32768])
cbd.save("high_depth.custimg")
cbd.save_preview("preview.png") # downsample to 8-bit PNG for viewing

AI Engine

from pyaitk.CLSE import TTIGenerator, NLPAnalyser

# NLP analysis
nlp = NLPAnalyser()
analysis = nlp.analyse("a mysterious purple galaxy with glowing stars")

print(analysis.scene_type) # "starfield"
print(analysis.nouns) # ["galaxy", "stars"]
print(analysis.adjectives) # ["mysterious", "purple", "glowing"]
print(analysis.colour_matches) # [("purple", (138,43,226)), ("gold", (255,215,0)), ...]
print(analysis.modifiers) # {"mysterious": 0.8, "glowing": 0.6}
print(analysis.filtered_tokens) # ["mysterious", "purple", "galaxy", "glowing", "stars"]
print(analysis.complexity())  # 0.72 (float 0–1)

# Full generation
from pyaitk.CLSE import get_config
gen = TTIGenerator(get_config())
img = gen.generate("stormy sea at dusk", width=512, height=512, seed=99)
img.save("storm.png")

# Variations
imgs = gen.generate_variations("neon city at night", n_variations=4)
for i, img in enumerate(imgs):
 img.save(f"variation_{i}.png")

# Interpolation between two prompts
imgs = gen.interpolate("sunrise", "midnight", steps=6)
for i, img in enumerate(imgs):
 img.save(f"interp_{i:02d}.png")

AI pipeline (internal stages):

NLPAnalyser → 256-d SemanticVector (TF-IDF + PCA, NLTK tokeniser)
ColourPredictor → PaletteSpec (sklearn KNN on 170+ colour keywords)
SceneComposer → 128-d LatentCode (numpy VAE encoder)
ImageDecoder → TTIImage (14 scene-type renderers)

Neural Architecture

from pyaitk.CLSE import TTIModel, TTIModelLarge, ModelConfig, TTITrainer, TTILoss

# Default model (4-layer transformer, 3.8M params)
cfg = ModelConfig(vocab_size=8192, embed_dim=256, n_layers=4, n_heads=8, latent_dim=128)
model = TTIModel(cfg)

# Large variant (6-layer, 512-d, ~14M params)
model_large = TTIModelLarge()

# Forward pass
import torch
tokens = torch.randint(0, 8192, (4, 32))  # batch=4, seq_len=32
out = model(tokens)
# out.scene_logits  : (4, 15)  - 15-class scene prediction
# out.colour_pred   : (4, 18)  - 6 colours × RGB
# out.param_pred    : (4, 64)  - scene renderer parameters
# out.mu, out.logvar: (4, 128) - VAE latent distribution

# Multi-task loss
criterion = TTILoss(scene_weight=1.0, colour_weight=0.5, param_weight=0.3, kl_weight=0.01)
loss = criterion(out, scene_labels, colour_targets, param_targets)

# Training loop
trainer = TTITrainer(model, dataset, val_dataset, output_dir="tti_models/")
history = trainer.train(epochs=20, batch_size=64, lr=3e-4)
print(history["best_val_loss"])
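
The four weights passed to TTILoss suggest a weighted sum of per-task terms. One plausible form is written out below. The choice of cross-entropy for the scene head, mean squared error for the colour and parameter heads, and the standard Gaussian KL term are assumptions; the source confirms only the weight names and the 128-d latent size.

```latex
\mathcal{L}
  = w_{\text{scene}}\,\mathcal{L}_{\text{CE}}
  + w_{\text{colour}}\,\mathcal{L}_{\text{MSE}}^{\text{colour}}
  + w_{\text{param}}\,\mathcal{L}_{\text{MSE}}^{\text{param}}
  + w_{\text{KL}}\,D_{\text{KL}},
\qquad
D_{\text{KL}}
  = -\frac{1}{2}\sum_{j=1}^{128}
    \left(1 + \log\sigma_j^{2} - \mu_j^{2} - \sigma_j^{2}\right)
```

Here \(\log\sigma_j^{2}\) corresponds to out.logvar and \(\mu_j\) to out.mu; with the example weights, the KL term (0.01) acts as a light regulariser next to the scene term (1.0).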

Model architecture:

TokenEmbedding: learned token + sinusoidal positional embeddings
TransformerEncoder: 4× MultiHeadSelfAttention + FFN (BERT-style, pre-LN)
    ├── ColourHead: 3-layer MLP → 18-d colour palette
    ├── SceneClassifier: 2-layer MLP → 15-class scene logit
    └── ParamDecoder: VAE: μ/σ → z → 64-d parameter vector

Dataset Generator

from pyaitk.CLSE import TTIDataset, Vocabulary, build_dataset

# Build a 50,000-sample dataset (cached to disk with SHA-256 integrity check)
dataset = build_dataset(
 n_samples=50_000,
 cache_dir="tti_models/",
 force_rebuild=False, # use cache if available
)

# Train / val / test splits
train_ds, val_ds, test_ds = dataset.splits()
print(len(train_ds), len(val_ds), len(test_ds)) # 40000, 5000, 5000

# DataLoader-compatible access
sample = train_ds[0]
# sample["token_ids"] : torch.LongTensor (32,)
# sample["scene_label"] : int (0–14)
# sample["colour_vec"] : torch.FloatTensor (18,)
# sample["param_vec"] : torch.FloatTensor (64,)
# sample["prompt"] : str

# Vocabulary
vocab = Vocabulary.load("tti_models/vocab.json")
ids = vocab.encode("a stormy sea at dusk") # list of int
text = vocab.decode(ids) # str
print(vocab.size) # 8192

Dataset statistics (default build):

Split | Samples
Train | 40,000
Validation | 5,000
Test | 5,000
Vocab size | 8,192
Scene classes | 15
Colour dims | 18 (6 colours × RGB)
Parameter dims | 64
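
The dataset example mentions a disk cache guarded by a SHA-256 integrity check. A minimal sketch of that pattern follows, assuming a sidecar ".sha256" file next to the cached payload; the helper names and the sidecar layout are hypothetical, not the package's actual cache format.

```python
import hashlib
import tempfile
from pathlib import Path

def save_with_digest(path, payload):
    """Write payload bytes plus a sidecar file holding their SHA-256 digest."""
    path = Path(path)
    path.write_bytes(payload)
    Path(str(path) + ".sha256").write_text(hashlib.sha256(payload).hexdigest())

def load_if_valid(path):
    """Return cached bytes only if the stored digest still matches."""
    path = Path(path)
    digest_file = Path(str(path) + ".sha256")
    if not (path.is_file() and digest_file.is_file()):
        return None
    data = path.read_bytes()
    if hashlib.sha256(data).hexdigest() != digest_file.read_text().strip():
        return None   # corrupted or tampered cache: caller should rebuild
    return data

with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "dataset.bin"
    save_with_digest(cache, b"synthetic samples")
    intact = load_if_valid(cache)
    cache.write_bytes(b"synthetic sampleX")   # simulate on-disk corruption
    corrupted = load_if_valid(cache)

print(intact, corrupted)
```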

Unified Pipeline

from pyaitk.CLSE import TTIPipeline

pipe = TTIPipeline()

# Generate
img = pipe.generate("a bright rainbow over a misty waterfall", output="rainbow.png")

# Variations
imgs = pipe.variations("neon city at night", n=4, output_dir="variations/")

# Interpolation
imgs = pipe.interpolate("sunrise", "midnight", steps=6, output_dir="interp/")

# Procedural art
img = pipe.art("julia", output="julia.png", c_real=-0.4, c_imag=0.6)

# Effects
img = pipe.effect("sepia", input_path="photo.png", output="photo_sepia.png")

# NLP analysis only
info = pipe.analyse("dark gothic castle at midnight")

# Animation
frames = pipe.animate("sunset over the ocean", n_frames=24, output_dir="frames/")

# Stream a very large image (memory-safe)
pipe.stream_large("huge.png", width=4000, height=4000, pattern="gradient")

# Model info
print(pipe.model_info())

# Config management
pipe.show_config()
pipe.save_config("snapshot.json")

Architecture Overview

TTI_main.py / TTI_pipeline.py   ← unified façade
│
├── TTI_config.py     ← all settings (reads from config.pbcfg / tti_config.json)
│
├── TTI_ai.py         ← NLP + VAE renderer (no model weights needed)
│   ├── NLPAnalyser        NLTK tokenise → TF-IDF → PCA → SemanticVector
│   ├── ColourPredictor    KNN on 170+ colour keywords → PaletteSpec
│   ├── SceneComposer      numpy VAE encoder → 128-d LatentCode
│   └── ImageDecoder       14 scene-type renderers → TTIImage
│
├── TTI_model.py      ← PyTorch transformer VAE (optional, boosts quality)
│   ├── TokenEmbedding        learnable + positional
│   ├── TransformerEncoder    4-layer BERT-style
│   ├── ColourHead            → 18-d palette
│   ├── SceneClassifier       → 15-class scene
│   └── ParamDecoder          → 64-d renderer params (VAE)
│
├── TTI_dataset.py    ← 50k-sample synthetic generator + Vocabulary + DataLoader
│
├── TTI_train.py      ← training script (gradient ckpt, cosine LR, early stop)
│
├── TTI_core.py       ← pixel engine (TTIImage, ImageCanvas, ColorUtils, I/O)
│
└── TTI_art.py        ← procedural art, effects, streaming, animation

Examples Summary

from pyaitk.CLSE import TTIPipeline
from pyaitk.CLSE import ProceduralArt, VisualEffects
from pyaitk.CLSE import NLPAnalyser
from pyaitk.CLSE import get_config, update_config

pipe = TTIPipeline()

# Generate
img = pipe.generate("stormy sea at midnight", output="storm.png")

# Batch
pipe.generate_batch(["sunrise", "sunset", "noon"], output_dir="batch/")

# Variations + interpolation
pipe.generate_variations("neon city", n=4, output_dir="vars/")
pipe.interpolate("calm lake", "raging ocean", steps=6, output_dir="interp/")

# Procedural art
pipe.art("mandelbrot", output="mandelbrot.png", width=800, height=600)
pipe.art("julia", output="julia.png", c_real=-0.4, c_imag=0.6)

# Effects
pipe.effect("sepia", "input.png", output="sepia.png")
pipe.effect("vignette", "input.png", strength=0.7, output="vig.png")

# Analysis
info = pipe.analyse("mysterious purple galaxy")
print(info.scene_type, info.complexity())

# Config override
update_config(image={"default_width": 1024}, ai={"seed": 7})

# Full demo
pipe.demo(output_dir="tti_demo/")

Camera - Camera Module

A Tkinter-based camera viewer with live QR/barcode scanning, image capture, video recording, and full context-manager support. Wraps OpenCV and pyzbar behind a clean class-based API with thread-safe state management.


What Is This?

Camera provides a camera module that combines:

  • Live video preview: Tkinter GUI window displaying frames from any OpenCV-compatible device
  • QR / barcode scanning: auto-decodes QR codes and barcodes from every frame via pyzbar; accumulates all unique payloads thread-safely
  • Image capture: saves the current frame as a timestamped .png with one method call
  • Video recording: start/stop writing annotated frames to a timestamped .avi file using XVID codec
  • Context-manager protocol: Camera.open() creates the Tk root, opens the device, and releases everything on exit
  • Functional entry-point: Start() launches the GUI and returns all scanned data in one call
  • Auto-install of pyzbar: silently attempts pip install pyzbar if the library is missing; degrades gracefully if it still can't be imported
  • Thread-safe design: separate RLock guards for scanned data and video writer; frame loop never blocks on recording
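
The "accumulate unique payloads thread-safely" behaviour can be sketched with a lock-guarded set that hands out copies. ScanStore and its method names are hypothetical; only the scanned_data property name and the use of an RLock come from the description above.

```python
import threading

class ScanStore:
    """Collect unique payloads from many threads; readers get a snapshot copy."""

    def __init__(self):
        self._lock = threading.RLock()
        self._items = set()

    def add(self, payload):
        with self._lock:
            self._items.add(payload)

    @property
    def scanned_data(self):
        with self._lock:
            return set(self._items)   # copy, so callers cannot mutate internal state

def worker(store, payloads):
    for payload in payloads:
        store.add(payload)

store = ScanStore()
threads = [threading.Thread(target=worker, args=(store, ["A", "B", "A"])) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

snapshot = store.scanned_data
snapshot.add("local only")            # affects the copy, not the store
print(sorted(store.scanned_data))
```

Returning a copy is what makes it safe to poll the property from a monitoring thread while the frame loop keeps adding payloads.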

Installation

On Linux, pyzbar also needs the native zbar library:

sudo apt install libzbar0

On Windows, the pyzbar wheel bundles the DLL automatically.


How to Use

1. Simplest: functional one-liner

Launches the GUI; blocks until the window is closed; returns all scanned payloads.

from pyaitk.Camera import Start

data = Start()
for item in data:
 print(item)

2. Context manager (recommended)

from pyaitk.Camera import Camera

with Camera.open() as cam:
 cam.run() # blocks until the window is closed

print(cam.scanned_data) # set of all scanned QR/barcode strings

3. Custom device and output directory

from pyaitk.Camera import Camera

with Camera.open(device=1, output_dir="./recordings") as cam:
 cam.run()

4. Programmatic capture and recording

from pyaitk.Camera import Camera
import tkinter as tk

root = tk.Tk()
cam = Camera(root, device=0, output_dir="./output")

# Capture a single frame immediately
path = cam.capture_image()
print(f"Saved: {path}")

# Start and stop recording
record_path = cam.start_recording()
print(f"Recording to: {record_path}")

# … run some frames …
cam.run() # blocks until window closed

cam.stop_recording()
cam.close()

5. Inspect scanned data before closing

from pyaitk.Camera import Camera
import threading
import time

with Camera.open() as cam:
    # Poll scanned_data from another thread
    def monitor():
        while not cam.is_closed:
            print("Scanned so far:", cam.scanned_data)
            time.sleep(2.0)

    t = threading.Thread(target=monitor, daemon=True)
    t.start()
    cam.run()

print("Final scanned data:", cam.scanned_data)

6. State checks

with Camera.open() as cam:
 print(cam.is_recording) # False
 cam.start_recording()
 print(cam.is_recording) # True
 cam.stop_recording()
 print(cam.is_closed) # False
 cam.run()

print(cam.is_closed) # True

GUI Reference

The camera window opens with three buttons:

Button | Action
📸 Capture Image | Saves current frame as captured_YYYYMMDD_HHMMSS.png in output_dir; shows a confirmation dialog
🎥 Start Recording / ⏹️ Stop Recording | Toggles video recording; saves to output_YYYYMMDD_HHMMSS.avi in output_dir
❌ Quit | Stops recording (if active), releases the camera, and destroys the window

Closing the window via the title bar × button also triggers a clean shutdown.


API Reference

Start(device, output_dir) → set[str]

Functional entry-point. Launches the GUI, blocks until the window is closed, and returns all scanned QR/barcode payloads.

Parameter | Type | Default | Description
device | int | 0 | OpenCV camera device index
output_dir | Path or str | "." | Directory for saved images and videos

Camera.open(device, output_dir) → context manager

Class-level @contextmanager factory. Creates a tk.Tk root, instantiates Camera, yields it, and calls close() on exit.

with Camera.open(device=0, output_dir="./out") as cam:
    cam.run()

Camera(window, device, output_dir)

Direct constructor for advanced use when you manage the Tk root yourself.

Parameter | Type | Default | Description
window | tk.Tk | - | Root Tkinter window
device | int | 0 | OpenCV capture device index
output_dir | Path or str | "." | Directory for saved images and videos

Raises: CameraError if the device cannot be opened.


Instance methods

Method | Returns | Description
run() | None | Enter the Tk main-loop (blocks until window closed)
capture_image() | Path or None | Capture current frame to output_dir; returns saved path
start_recording() | Path or None | Begin writing frames to .avi; returns output path
stop_recording() | None | Stop recording and flush the video file
close() | None | Release all resources; safe to call multiple times

Properties

Property | Type | Description
scanned_data | set[str] | Thread-safe snapshot of all decoded QR/barcode payloads
is_recording | bool | True while a video is being written
is_closed | bool | True after close() has been called

Output Files

All files are written to output_dir (default: current directory).

File pattern | Format | Created by
captured_YYYYMMDD_HHMMSS.png | PNG | capture_image() / 📸 button
output_YYYYMMDD_HHMMSS.avi | AVI (XVID, 20 fps) | start_recording() / 🎥 button
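The naming patterns in the table can be reproduced with `datetime.strftime`. The sketch below is only an illustration of the pattern; `timestamped_name` is a hypothetical helper and is not claimed to be how pyaitk builds its paths.

```python
from datetime import datetime
from pathlib import Path


def timestamped_name(prefix: str, suffix: str, output_dir=".", now=None) -> Path:
    """Build <output_dir>/<prefix>_YYYYMMDD_HHMMSS<suffix> (hypothetical helper)."""
    now = now or datetime.now()
    return Path(output_dir) / f"{prefix}_{now.strftime('%Y%m%d_%H%M%S')}{suffix}"


# A fixed timestamp keeps the example reproducible.
fixed = datetime(2024, 6, 10, 14, 30, 5)
image_path = timestamped_name("captured", ".png", "out", now=fixed)
video_path = timestamped_name("output", ".avi", "out", now=fixed)
print(image_path.name, video_path.name)
```

Because the timestamp has one-second resolution, two captures within the same second would map to the same name under this scheme.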

Exception Reference

CameraError(RuntimeError)
├── Raised when the camera device cannot be opened
└── Raised when run() is called after the camera is already closed

Architecture Notes

  • Frame loop: driven by window.after(_FRAME_DELAY_MS, _update_frame) (10 ms ≈ 100 fps cap); never blocks the Tk event loop
  • QR decoding: runs on every frame inside the frame loop; new unique payloads are added to _scanned_data under _data_lock
  • Video writer: guarded by _writer_lock so the recording flag and cv2.VideoWriter are always consistent across threads
  • pyzbar graceful degradation: if not installed, a pip install is attempted once at import time; if still unavailable, QR scanning is silently disabled and the rest of the module works normally
  • Clean shutdown: close() stops recording, releases the OpenCV capture device, and destroys the Tk window; idempotent (safe to call multiple times)
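The "unique payloads under a lock, read through a snapshot" idea from the notes above can be sketched with the standard library alone. `ScanStore` is a hypothetical stand-in written for illustration; it does not reproduce the module's internals.

```python
import threading


class ScanStore:
    """Hypothetical stand-in for a lock-guarded set of scanned payloads."""

    def __init__(self):
        self._data = set()
        self._lock = threading.RLock()

    def add(self, payload: str) -> bool:
        """Record a payload; return True only the first time it is seen."""
        with self._lock:
            if payload in self._data:
                return False
            self._data.add(payload)
            return True

    @property
    def snapshot(self) -> set:
        """Return a copy so callers never iterate over the live set."""
        with self._lock:
            return set(self._data)


store = ScanStore()
workers = [
    threading.Thread(target=lambda i=i: store.add(f"code-{i % 5}"))
    for i in range(20)
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
print(sorted(store.snapshot))
```

Returning a copy means a polling thread can print or iterate the result while the frame loop keeps adding payloads.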

Examples Summary

# Simplest: launch and collect scanned QR data
from pyaitk.Camera import Start
data = Start()

# Context manager
from pyaitk.Camera import Camera
with Camera.open(device=0, output_dir="./out") as cam:
    cam.run()
print(cam.scanned_data)

# Custom device
with Camera.open(device=1) as cam:
    cam.run()

# Programmatic capture before running
import tkinter as tk
from pyaitk.Camera import Camera
root = tk.Tk()
cam = Camera(root, output_dir="./shots")
cam.capture_image()    # save a frame immediately
cam.start_recording()  # start video
cam.run()              # blocks
cam.stop_recording()
cam.close()

# State checks
with Camera.open() as cam:
    print(cam.is_recording)  # False
    cam.start_recording()
    print(cam.is_recording)  # True
    cam.run()

Memory: Memory Module for Pythonaibrain

A two-tier episodic memory system for Brain and AdvanceBrain. Memory provides a thread-safe, JSON-backed key/value store. SmartMemory extends it transparently with a full ML pipeline (TF-IDF → Autoencoder → Clustering → IntentClassifier), enabling semantic search over conversation history without changing any call sites.


What Is This?

The module provides two store classes, a factory function, and a record dataclass:

Class / Function | Description
Memory | Thread-safe, JSON-backed key/value episodic store with LRU eviction
SmartMemory | Drop-in superset of Memory with semantic search, intent prediction, and cluster analytics
build_memory() | Factory that returns SmartMemory when available, Memory otherwise
MemoryEntry | Dataclass holding a single episodic record with timestamps and access metadata

Key properties shared by both classes:

  • Thread-safe: all public methods acquire an RLock; the background ML fit thread takes only a snapshot so reads/writes are never blocked
  • Atomic writes: save_memory() writes to a .tmp file then renames it, preventing corruption on crash
  • LRU eviction: when the store reaches max_entries, the least-recently-used key is evicted
  • Backward-compatible format: transparently migrates legacy flat {key: value} JSON files to the v2 rich format
  • Graceful degradation: if SummarizerAI is absent, SmartMemory silently falls back to plain Memory behaviour
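LRU eviction of the kind listed above is commonly built on `collections.OrderedDict`. The sketch below is a minimal illustration; `LRUStore` is a hypothetical class, and treating a `recall()` as a "use" is an assumption rather than documented behaviour.

```python
from collections import OrderedDict


class LRUStore:
    """Hypothetical key/value store that evicts the least-recently-used key."""

    def __init__(self, max_entries: int):
        self.max_entries = max_entries
        self._entries = OrderedDict()

    def remember(self, key, value):
        if key in self._entries:
            self._entries.move_to_end(key)      # updating counts as a use
        elif len(self._entries) >= self.max_entries:
            self._entries.popitem(last=False)   # drop the oldest entry
        self._entries[key] = value

    def recall(self, key, default=""):
        if key not in self._entries:
            return default
        self._entries.move_to_end(key)          # assumption: a read counts as a use
        return self._entries[key]


store = LRUStore(max_entries=2)
store.remember("a", "alpha")
store.remember("b", "beta")
store.recall("a")               # "a" becomes the most recently used key
store.remember("c", "gamma")    # the store is full, so "b" is evicted
print(list(store._entries))
```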

How to Use

1. Plain Memory (classic key/value)

from pyaitk.Memory import Memory

mem = Memory(path="memory.json")

mem.remember("user_name", "Divyanshu")
mem.remember("last_topic", "Python programming")

print(mem.recall("user_name")) # "Divyanshu"
print(mem.recall("missing_key", default="N/A")) # "N/A"

mem.save_memory()
mem.load_memory() # reload from disk

2. SmartMemory: drop-in with ML extras

from pyaitk.Memory import SmartMemory

mem = SmartMemory(
    path="memory.json",
    auto_fit=True,    # refit every fit_interval calls to remember()
    fit_interval=20,  # default: 20
)

mem.remember("Hello! How are you?", "I'm doing great, thanks!")
mem.remember("Tell me a joke", "Why did the chicken cross the road?")
# ... add more entries ...

# Semantic search
results = mem.semantic_search("a funny story", top_k=3)
for r in results:
    print(r["score"], r["key"], r["value"])

# Intent prediction
intent = mem.predict_intent("What's the weather today?")
print(intent)  # e.g. "weather_query"

# Export cluster/intent analytics report
mem.export_report("memory_report.json")
mem.print_report()

mem.save_memory()

3. build_memory() factory (recommended for Brain)

from pyaitk.Memory import build_memory

# Returns SmartMemory if SummarizerAI is available, Memory otherwise
mem = build_memory("memory.json", smart=True, fit_interval=50)
mem.remember("greeting", "Hello!")
mem.save_memory()

4. Checking and controlling the summarizer

from pyaitk.Memory import SmartMemory

mem = SmartMemory(path="memory.json", auto_fit=False)

# Manually trigger a (blocking) fit
success = mem.fit_summarizer()
print(success) # True / False

# Check fit state
print(mem.is_summarizer_fitted()) # True after fit

# Get the report object
report = mem.get_report() # MemorySummaryReport dataclass or None

5. Extended Memory API

from pyaitk.Memory import Memory

mem = Memory()

mem.remember("a", "alpha")
mem.remember("b", "beta")

# Delete one entry
mem.forget("a")

# Check membership
print("b" in mem) # True
print("a" in mem) # False

# Iterate
for key in mem.keys():
    print(key)

for key, value in mem.items():
    print(key, "→", value)

# Snapshot as plain dict
snapshot = mem.to_dict() # {"b": "beta"}

print(mem.size) # 1
print(len(mem)) # 1

# Clear everything (in-memory only; disk unchanged until save_memory)
mem.wipe()

6. Inspecting MemoryEntry metadata

from pyaitk.Memory import Memory

mem = Memory()
mem.remember("topic", "AI and robotics")

# Access internal entry (advanced)
entry = mem._entries["topic"]
print(entry.key) # "topic"
print(entry.value) # "AI and robotics"
print(entry.timestamp) # Unix timestamp of creation
print(entry.access_count) # Number of recalls
print(entry.last_accessed) # Unix timestamp of last recall
print(entry.to_dict()) # Full dict for serialization

API Reference

Memory(path, max_entries, auto_load)

Parameter Type Default Description
path str memory.json Path to the JSON persistence file
max_entries int 10_000 Max entries before LRU eviction
auto_load bool True Load from disk automatically if file exists on init

Core contract methods (used by Brain / AdvanceBrain)

Method Returns Description
load_memory() None Load (or reload) from disk; silent no-op if file absent
remember(key, value) None Store or update; evicts LRU entry when at max_entries
recall(key, default="") str Retrieve value for key; returns default if not found
save_memory() None Atomically persist current state to disk
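The write-to-temp-then-rename approach behind an atomic save can be sketched as follows. This is an illustration under assumptions: `atomic_write_json` is a hypothetical helper, and the `.tmp` sibling naming is one plausible reading of the description, not the package's confirmed layout.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_write_json(path, payload) -> None:
    """Write JSON to a sibling .tmp file, then rename it over the target."""
    path = Path(path)
    tmp = path.with_suffix(path.suffix + ".tmp")
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump(payload, fh)
        fh.flush()
        os.fsync(fh.fileno())   # push the bytes to disk before renaming
    os.replace(tmp, path)       # atomic when both paths are on one filesystem


target = Path(tempfile.mkdtemp()) / "memory.json"
atomic_write_json(target, {"user_name": "Divyanshu"})
print(json.loads(target.read_text(encoding="utf-8")))
```

If the process dies before `os.replace`, the previous `memory.json` is left untouched and only the temporary file is incomplete.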

Extended methods

Method Returns Description
forget(key) bool Delete one entry; returns True if key existed
wipe() None Clear all in-memory entries (disk unchanged)
keys() Iterator[str] Iterate over all stored keys
items() Iterator[tuple[str,str]] Iterate over (key, value) pairs
to_dict() dict[str, str] Plain {key: value} snapshot without metadata

Properties

Property Type Description
path Path Resolved path to the backing JSON file
size int Number of currently stored entries

SmartMemory(path, max_entries, auto_load, auto_fit, fit_interval, summarizer_config)

Inherits all Memory methods. Additional constructor parameters:

Parameter Type Default Description
auto_fit bool True Auto-refit summarizer every fit_interval calls to remember()
fit_interval int 20 Number of remember() calls between automatic refits
summarizer_config dict or None None Pass custom config to MemorySummarizer; None = from .pbcfg

SmartMemory-only methods

Method Returns Description
fit_summarizer(force=False) bool (Re-)train ML pipeline; blocks caller; returns True on success
semantic_search(text, top_k=3) list[dict] Ranked semantic matches; falls back to substring search if unfit
predict_intent(text) str Predict intent of query from trained classifier
get_report() MemorySummaryReport or None Return cluster/intent analysis report
export_report(path) bool Write report as JSON; returns True on success
print_report() None Pretty-print cluster report to stdout
is_summarizer_fitted() bool True if summarizer is trained and not dirty

semantic_search() result format

Each item in the returned list:

Field Type Description
score float Cosine similarity score 0.0–1.0
key str Original input key stored in memory
value str Stored response value
intent str Predicted intent label
cluster int Cluster assignment index (-1 if unassigned)
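To make the `score` field concrete, the sketch below ranks stored keys by cosine similarity. It deliberately swaps in plain word-count vectors for the TF-IDF and autoencoder pipeline, so the numbers will differ from SmartMemory's; `cosine` and `rank` are hypothetical helpers, and the `intent` and `cluster` fields are omitted.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query: str, entries: dict, top_k: int = 3) -> list:
    """Return the top_k entries as dicts with score, key and value."""
    q = Counter(query.lower().split())
    scored = [
        {"score": cosine(q, Counter(key.lower().split())), "key": key, "value": value}
        for key, value in entries.items()
    ]
    return sorted(scored, key=lambda r: r["score"], reverse=True)[:top_k]


entries = {
    "tell me a joke": "Why did the chicken cross the road?",
    "how are you": "I'm doing great, thanks!",
}
results = rank("tell a joke", entries, top_k=1)
print(results[0]["key"], round(results[0]["score"], 3))
```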

build_memory(path, smart, **kwargs) → Memory

Factory function. Returns SmartMemory when smart=True and SummarizerAI is importable; returns plain Memory otherwise.

Parameter | Type | Default | Description
path | str | memory.json | Path for JSON persistence
smart | bool | True | Attempt to use SmartMemory
**kwargs | - | - | Forwarded to SmartMemory or Memory constructor
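A factory that falls back when an optional dependency is missing can be sketched with `importlib.util.find_spec`. Everything named here (`PlainStore`, `SmartStore`, `build_store`, the `backend` argument) is hypothetical and only mirrors the behaviour described above.

```python
import importlib.util


class PlainStore:
    kind = "plain"


class SmartStore(PlainStore):
    kind = "smart"


def build_store(smart: bool = True, backend: str = "json"):
    """Return SmartStore only if requested and the optional backend is importable."""
    if smart and importlib.util.find_spec(backend) is not None:
        return SmartStore()
    return PlainStore()


print(build_store(smart=True, backend="json").kind)
print(build_store(smart=True, backend="no_such_backend_xyz").kind)
```

Since `SmartStore` subclasses `PlainStore`, callers can use whichever object they receive through the same base interface.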

MemoryEntry fields

Field Type Description
key str Entry key
value str Stored value
timestamp float Unix timestamp of creation
access_count int Number of times recall() accessed this key
last_accessed float Unix timestamp of most recent recall

Architecture

Memory
├── _entries : OrderedDict[str, MemoryEntry]   ← LRU-ordered episodic store
├── _lock    : threading.RLock                 ← protects all mutations
└── _path    : Path                            ← JSON backing file

SmartMemory(Memory)
├── _summarizer   : MemorySummarizer | None    ← ML pipeline
├── _dirty        : bool                       ← needs refit flag
├── _auto_fit     : bool                       ← enable background refits
├── _fit_interval : int                        ← refit every N remembers
├── _fit_lock     : threading.Lock             ← serializes fit calls
└── _fit_thread   : Thread | None              ← background training thread

MemorySummarizer (optional external module)
└── TF-IDF → Autoencoder → KMeans/DBSCAN → IntentClassifier → PatternMatcher

Background fit behaviour:

  • Every fit_interval calls to remember(), _fit_background() spawns a daemon thread
  • The thread takes a snapshot of _entries under the lock, then trains outside it, so memory reads/writes are never blocked by training
  • If a fit is already running, additional triggers are skipped
  • semantic_search() triggers a blocking fit_summarizer() automatically if the summarizer is unfit (_dirty=True)
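The background-fit rules above (trigger every N writes, snapshot under the lock, skip if a fit is running) can be sketched with two locks. `BackgroundFitter` is a hypothetical class and its "training" step is a placeholder, so this shows the threading pattern only.

```python
import threading


class BackgroundFitter:
    """Hypothetical sketch: refit every N writes, skip if one is running."""

    def __init__(self, fit_interval: int):
        self.fit_interval = fit_interval
        self._entries = {}
        self._lock = threading.RLock()      # guards _entries
        self._fit_lock = threading.Lock()   # held while a fit is running
        self._writes = 0
        self.fits_done = 0
        self._thread = None

    def remember(self, key, value):
        with self._lock:
            self._entries[key] = value
            self._writes += 1
            due = self._writes % self.fit_interval == 0
        if due:
            self._fit_background()

    def _fit_background(self):
        if not self._fit_lock.acquire(blocking=False):
            return  # a fit is already in progress, so skip this trigger
        self._thread = threading.Thread(target=self._fit, daemon=True)
        self._thread.start()

    def _fit(self):
        try:
            with self._lock:
                snapshot = dict(self._entries)  # copy while holding the lock
            _ = sorted(snapshot)                # placeholder "training", lock released
            self.fits_done += 1
        finally:
            self._fit_lock.release()

    def wait(self):
        if self._thread is not None:
            self._thread.join()


fitter = BackgroundFitter(fit_interval=3)
for i in range(3):
    fitter.remember(f"k{i}", f"v{i}")
fitter.wait()
print(fitter.fits_done)
```

A plain `threading.Lock` is used for `_fit_lock` because it is acquired in the caller's thread and released by the worker, which a re-entrant lock would not allow.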

File Format

Memory is persisted as a JSON file with a version marker:

{
  "__version__": 2,
  "saved_at": 1718000000.0,
  "entry_count": 3,
  "entries": [
    {
      "key": "user_name",
      "value": "Divyanshu",
      "timestamp": 1718000000.0,
      "access_count": 4,
      "last_accessed": 1718001000.0
    }
  ]
}

Legacy flat {key: value} files from v1 are automatically migrated on load.
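One way such a migration could work is to check for the version marker and wrap flat pairs into entry records. The sketch below assumes the field names shown in the file format above; `migrate` is a hypothetical function and the choice of zero for `access_count` on migrated entries is an assumption.

```python
import time


def migrate(payload: dict, now=None) -> dict:
    """Return a v2-style document; flat {key: value} input is upgraded."""
    if payload.get("__version__") == 2:
        return payload
    now = time.time() if now is None else now
    entries = [
        {"key": k, "value": v, "timestamp": now, "access_count": 0, "last_accessed": now}
        for k, v in payload.items()
    ]
    return {"__version__": 2, "saved_at": now, "entry_count": len(entries), "entries": entries}


legacy = {"user_name": "Divyanshu", "last_topic": "Python"}
upgraded = migrate(legacy, now=1718000000.0)
print(upgraded["entry_count"], upgraded["entries"][0]["key"])
```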


Examples Summary

# Plain Memory
from pyaitk.Memory import Memory
mem = Memory("memory.json")
mem.remember("name", "Divyanshu")
print(mem.recall("name"))
mem.save_memory()

# SmartMemory
from pyaitk.Memory import SmartMemory
mem = SmartMemory("memory.json", auto_fit=True, fit_interval=10)
mem.remember("How are you?", "I'm great!")
results = mem.semantic_search("how are things", top_k=3)
intent = mem.predict_intent("What's the weather?")
mem.export_report("report.json")
mem.save_memory()

# Factory (recommended)
from pyaitk.Memory import build_memory
mem = build_memory("memory.json", smart=True, fit_interval=50)
mem.remember("greeting", "Hello!")
mem.save_memory()

# Extended API
mem.forget("greeting")
print("greeting" in mem) # False
print(mem.size)
print(mem.to_dict())
mem.wipe()

Pythonaibrain Config

The master configuration layer for the entire Pythonaibrain / pyaitk framework. All subsystems (Brain, STT, TTS, NER, TTI, Memory Summarizer, LLM) read their settings from a single .pbcfg file via typed dataclass section objects on one unified AppConfig instance.


What Is This?

Config provides:

  • .pbcfg file format: INI-style config file with ; and # comment support, including inline comments
  • AppConfig: unified config manager that reads/writes all sections and exposes them as typed dataclasses
  • 22 section dataclasses: one per subsystem, each with typed fields and sensible defaults
  • Auto-discovery: searches upward from cwd to find config.pbcfg, falls back to built-in defaults
  • Typed get() / set(): generic read/write with optional type casting and fallback
  • generate_default_config(): writes a fully-populated default .pbcfg to disk
  • JSON export: serialize the entire config to JSON string or file
  • Module-level singleton: get_config() returns the same instance across the entire application
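The upward search mentioned in the list can be sketched with `pathlib`. `find_upwards` is a hypothetical helper; it walks all the way to the filesystem root, which may differ from where the package's own discovery stops.

```python
import tempfile
from pathlib import Path


def find_upwards(filename: str, start=None):
    """Return the first <dir>/<filename> found walking from start to the root, else None."""
    start = Path(start or Path.cwd()).resolve()
    for directory in (start, *start.parents):
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return None


# Build a throwaway project tree: the config sits two levels above the start dir.
root = Path(tempfile.mkdtemp()).resolve()
(root / "config.pbcfg").write_text("[brain]\nsmart_memory = true\n", encoding="utf-8")
nested = root / "src" / "pkg"
nested.mkdir(parents=True)
found = find_upwards("config.pbcfg", start=nested)
print(found == root / "config.pbcfg")
```

A caller would fall back to built-in defaults when the function returns `None`.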

File Format (.pbcfg)

The config file uses standard INI format. Comments start with ; or #. Inline comments are supported.

; PythonAIBrain configuration file

[brain]
intents_path=./intents.json
condition=true
smart_memory=true
memory_path=memory.json
memory_fit_interval=20
username=user_name
download=false

[model]
model_path=model.pth
dimension_path=dimensions.json
batch_size=8
learning_rate=0.001
epochs=100

[llm]
n_ctx=2048
n_threads=0 ; 0 → use os.cpu_count()
max_tokens=512
verbose=false

[tts]
rate=150
volume=1.0
voice=david
default_text=Hello from PyAI
output_path= ; if set, saves audio here instead of playing

[stt]
energy_threshold= ; empty → auto-calibrate
dynamic_energy_threshold=true
pause_threshold=0.8
phrase_time_limit= ; hard cap per utterance (seconds)
timeout=5.0
ambient_noise_duration=0.5
connectivity_host=8.8.8.8
connectivity_port=53
connectivity_timeout=2.0
preferred_engine=None ; None | google | pocketsphinx
sphinx_language=en-US
google_language=en-US
google_api_key= ; empty → free tier
max_retries=3
retry_delay=0.5

[logging]
level=INFO
format=%(asctime)s [%(levelname)s] %(name)s - %(message)s

[weather]
base_url=https://api.openweathermap.org/data/2.5/weather
units=metric

[memory]
auto_load=true
auto_fit=true

[search]
max_results=5

[webassistant]
intents_path=./Webintents.json
model_path=WebAssistantModel.pth
dimension_path=WebAssistantDimensions.json
batch_size=8
learning_rate=0.001
epochs=100

[embedding]
tfidf_max_features=5000
tfidf_ngram_range=1,3
tfidf_sublinear_tf=true
embed_dim=128
vocab_size=10000

[clustering]
n_clusters=8
kmeans_max_iter=300
kmeans_random_state=42
dbscan_eps=0.5
dbscan_min_samples=2
agglo_linkage=ward
agglo_distance_threshold=

[classifier]
lr_max_iter=500
lr_c=1.0
lr_solver=lbfgs
lr_multi_class=auto
similarity_threshold=0.65

[summarizer]
latent_dim=64
hidden_dim=256
ae_epochs=30
ae_lr=0.001
ae_batch_size=16
top_patterns_per_cluster=3
min_cluster_size=2

[tti_image]
default_width=512
default_height=512
default_bpp=24
default_format=png
background_color=255,255,255
jpeg_quality=92

[tti_ai]
nlp_backend=nltk
max_prompt_tokens=128
use_stopword_filter=true
model_type=vae_numpy
latent_dim=128
text_embed_dim=256
vocab_size=4096
hidden_dim=512
palette_clusters=8
num_inference_steps=50
guidance_scale=7.5
seed=

[tti_art]
fractal_max_iter=256
blur_default_radius=2
noise_default_intensity=0.15
animation_fps=24
streaming_chunk_mb=32

[tti_paths]
output_dir=tti_output
model_dir=tti_models
cache_dir=tti_cache
log_dir=tti_logs

[postprocessor]
min_length=1
max_length=
allowed_labels=
blocked_labels=
deduplicate=true
merge_adjacent=false
lowercase_labels=false
strip_punct=true
custom_label_map=

[preprocessor]
lowercase=false
remove_urls=true
remove_emails=false
remove_html_tags=true
normalize_whitespace=true
normalize_unicode=true
max_length=
custom_patterns=

Boolean values accept: true/false, yes/no, on/off, 1/0. Empty values (e.g. timeout =) resolve to None for optional fields. Tuple fields (e.g. tfidf_ngram_range, background_color) are stored as comma-separated integers.
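The value rules above can be illustrated with Python's standard `configparser`. Whether pyaitk uses `configparser` internally is an assumption; `parse_bool`, `parse_optional`, and `parse_int_tuple` are hypothetical helpers that follow the stated rules. Note that `configparser` only recognises an inline comment when whitespace precedes the `;` or `#`.

```python
import configparser

TRUE = {"true", "yes", "on", "1"}
FALSE = {"false", "no", "off", "0"}


def parse_bool(raw: str) -> bool:
    value = raw.strip().lower()
    if value in TRUE:
        return True
    if value in FALSE:
        return False
    raise ValueError(f"not a boolean: {raw!r}")


def parse_optional(raw: str, cast):
    """An empty string resolves to None; anything else goes through cast."""
    raw = raw.strip()
    return None if raw == "" else cast(raw)


def parse_int_tuple(raw: str) -> tuple:
    return tuple(int(part) for part in raw.split(","))


SAMPLE = """
; sample file
[stt]
dynamic_energy_threshold = yes
phrase_time_limit =        ; hard cap per utterance (seconds)
timeout = 5.0
[embedding]
tfidf_ngram_range = 1,3
"""

parser = configparser.ConfigParser(inline_comment_prefixes=(";", "#"))
parser.read_string(SAMPLE)

print(parse_bool(parser.get("stt", "dynamic_energy_threshold")))
print(parse_optional(parser.get("stt", "phrase_time_limit"), float))
print(parse_int_tuple(parser.get("embedding", "tfidf_ngram_range")))
```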


How to Use

1. Module-level singleton (recommended)

from pyaitk.config import get_config

cfg = get_config() # auto-discovers config.pbcfg from cwd upward
print(cfg.brain.smart_memory) # True
print(cfg.model.epochs) # 100
print(cfg.stt.pause_threshold) # 0.8

2. Load a specific file

from pyaitk.config import AppConfig

cfg = AppConfig("myproject.pbcfg")
print(cfg.tts.voice) # "david"
print(cfg.llm.n_ctx) # 2048

3. Auto-discover

cfg = AppConfig.discover() # search cwd → home
cfg = AppConfig.discover("/my/dir") # search from a custom start path

4. Build from a dict

cfg = AppConfig.from_dict({
    "brain": {"smart_memory": True, "username": "divyanshu"},
    "model": {"epochs": 50, "learning_rate": 0.0005},
})

5. Read, mutate, and save

cfg = get_config()

# Read with typed access
print(cfg.brain.intents_path)
print(cfg.embedding.tfidf_max_features)

# Generic get with cast and fallback
val = cfg.get("tti_ai", "latent_dim", fallback=128, cast=int)

# Mutate in memory
cfg.set("tti_ai", "seed", 42)
cfg.set("tts", "rate", 180)

# Persist to disk
cfg.save() # saves to original path
cfg.save("backup.pbcfg") # save to alternate path

6. Reload from disk

cfg.load() # reload from current path
cfg.load("other.pbcfg") # reload from different file

7. Generate a default config file

from pyaitk.config import generate_default_config

path = generate_default_config() # → ./config.pbcfg
path = generate_default_config("myapp.pbcfg")

8. Reset to factory defaults

from pyaitk.config import reset_config

cfg = reset_config() # clears singleton, returns AppConfig with all defaults

9. Export to JSON

json_str = cfg.to_json(indent=2)
cfg.save_json("config.json")

10. Inspect all settings

print(cfg.dump()) # human-readable dump of all sections
d = cfg.as_dict() # nested dict {section: {key: value}}

Section Reference

[brain] → cfg.brain (BrainConfig)

Key Type Default Description
intents_path str ./intents.json Path to intents JSON file
condition bool True Enable dynamic intent learning from web search
smart_memory bool True Use SmartMemory (semantic search + clustering)
memory_path str memory.json Path for memory persistence
memory_fit_interval int 20 Auto-fit SmartMemory every N stored memories
username str user_name Key used for user name storage in memory
download bool False Auto-download NLTK data on Brain init

[model] → cfg.model (ModelConfig)

Key Type Default Description
model_path str model.pth Path to save/load model weights
dimension_path str dimensions.json Path to save/load vocab/intent map
batch_size int 8 Training batch size
learning_rate float 0.001 Adam optimizer learning rate
epochs int 100 Training epochs

[llm] → cfg.llm (LLMConfig)

Key Type Default Description
n_ctx int 2048 LLM context window size
n_threads int 0 CPU threads; 0 = os.cpu_count()
max_tokens int 512 Max tokens to generate per response
verbose bool False Enable llama.cpp verbose logging

[tts] → cfg.tts (TTSConfig)

Key Type Default Description
rate int 150 Words per minute
volume float 1.0 Volume 0.0–1.0 (validated in __post_init__)
voice str david Voice name fragment (fuzzy matched)
default_text str Hello from PyAI Fallback text for empty say() calls
output_path str or None None Save to WAV file instead of playing

[stt] → cfg.stt (STTConfig)

Key Type Default Description
energy_threshold float or None None Mic sensitivity; None = auto-calibrate
dynamic_energy_threshold bool True Continuously adjust threshold
pause_threshold float 0.8 Silence seconds marking end of phrase
phrase_time_limit float or None None Hard cap per utterance in seconds
timeout float or None 5.0 Seconds to wait for speech to start
ambient_noise_duration float 0.5 Seconds to sample noise floor before listening
connectivity_host str 8.8.8.8 Host used for the network connectivity probe
connectivity_port int 53 Port for the connectivity probe
connectivity_timeout float 2.0 Timeout for the connectivity probe
preferred_engine Engine or None None Force GOOGLE or POCKETSPHINX; None = auto
sphinx_language str en-US PocketSphinx language code
google_language str en-US Google Speech API BCP-47 language tag
google_api_key str or None None Google API key; None = free tier
max_retries int 3 Max retry attempts on service errors
retry_delay float 0.5 Seconds between retries

[logging] → cfg.logging (LoggingConfig)

Key Type Default Description
level str INFO Root logging level
format str %(asctime)s [%(levelname)s] %(name)s - %(message)s Log record format

[memory] → cfg.memory (MemoryConfig)

Key Type Default Description
auto_load bool True Load memory from disk on init
auto_fit bool True Auto-fit SmartMemory summarizer on load

[embedding] → cfg.embedding (EmbeddingConfig)

Key Type Default Description
tfidf_max_features int 5000 Max vocabulary size for TF-IDF
tfidf_ngram_range tuple (1, 3) N-gram range (stored as 1,3 in file)
tfidf_sublinear_tf bool True Apply sublinear TF scaling
embed_dim int 128 Learned embedding dimension
vocab_size int 10000 Vocabulary size for learned embeddings

[clustering] → cfg.clustering (ClusteringConfig)

Key Type Default Description
n_clusters int 8 KMeans cluster count
kmeans_max_iter int 300 KMeans max iterations
kmeans_random_state int 42 KMeans random seed
dbscan_eps float 0.5 DBSCAN epsilon radius
dbscan_min_samples int 2 DBSCAN minimum samples per cluster
agglo_linkage str ward Agglomerative linkage criterion
agglo_distance_threshold float or None None Agglomerative distance threshold

[classifier] → cfg.classifier (ClassifierConfig)

Key Type Default Description
lr_max_iter int 500 Logistic Regression max iterations
lr_c float 1.0 LR regularization strength (lower = stronger)
lr_solver str lbfgs LR solver
lr_multi_class str auto LR multi-class strategy
similarity_threshold float 0.65 Min cosine similarity for pattern matching

[summarizer] → cfg.summarizer (SummarizerConfig)

Key Type Default Description
latent_dim int 64 Autoencoder latent space dimension
hidden_dim int 256 Autoencoder hidden layer size
ae_epochs int 30 Autoencoder training epochs
ae_lr float 0.001 Autoencoder learning rate
ae_batch_size int 16 Autoencoder training batch size
top_patterns_per_cluster int 3 Top patterns to extract per cluster
min_cluster_size int 2 Minimum cluster size for summarization

[tti_image] → cfg.tti_image (TTIImageConfig)

Key Type Default Description
default_width int 512 Output image width in pixels
default_height int 512 Output image height in pixels
default_bpp int 24 Bits per pixel
default_format str png Output format: png, bmp, jpeg
background_color tuple (255, 255, 255) RGB background color (stored as 255,255,255)
jpeg_quality int 92 JPEG compression quality (1–95)

[tti_ai] → cfg.tti_ai (TTIAIConfig)

Key Type Default Description
nlp_backend str nltk NLP tokenizer: nltk or spacy
max_prompt_tokens int 128 Max tokens per prompt
use_stopword_filter bool True Filter stopwords from prompts
model_type str vae_numpy AI model type: vae_numpy or torch_vae
latent_dim int 128 VAE latent space dimension
text_embed_dim int 256 Text embedding dimension
vocab_size int 4096 Vocabulary size for text encoding
hidden_dim int 512 Hidden layer size
palette_clusters int 8 KMeans clusters for palette extraction
palette_model_path str tti_palette_model.pkl Path for saved palette model
num_inference_steps int 50 Diffusion inference steps
guidance_scale float 7.5 Classifier-free guidance scale
seed int or None None Random seed; empty in file resolves to None

[tti_art] → cfg.tti_art (TTIArtConfig)

Key Type Default Description
fractal_max_iter int 256 Max iterations for fractal generation
blur_default_radius int 2 Default Gaussian blur radius
noise_default_intensity float 0.15 Default noise overlay intensity
animation_fps int 24 Frames per second for animations
streaming_chunk_mb int 32 Chunk size for streaming output (MB)

[tti_paths] → cfg.tti_paths (TTIPathConfig)

Key Type Default Description
output_dir str tti_output Directory for generated images
model_dir str tti_models Directory for saved models
cache_dir str tti_cache Directory for cached data
log_dir str tti_logs Directory for TTI log files

Call cfg.ensure_tti_dirs() (or cfg.tti_paths.ensure_dirs()) to create all four directories.
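Creating a set of directories idempotently is a one-liner per path with `Path.mkdir(parents=True, exist_ok=True)`. The dataclass below is a hypothetical stand-in for the `[tti_paths]` section, using the defaults from the table; its `base` argument is added purely so the example can run in a temporary directory.

```python
import tempfile
from dataclasses import dataclass, fields
from pathlib import Path


@dataclass
class PathsSketch:
    """Hypothetical stand-in for the [tti_paths] section."""
    output_dir: str = "tti_output"
    model_dir: str = "tti_models"
    cache_dir: str = "tti_cache"
    log_dir: str = "tti_logs"

    def ensure_dirs(self, base=".") -> list:
        created = []
        for f in fields(self):
            target = Path(base) / getattr(self, f.name)
            target.mkdir(parents=True, exist_ok=True)  # no error if it already exists
            created.append(target)
        return created


base = tempfile.mkdtemp()
paths = PathsSketch()
paths.ensure_dirs(base)
paths.ensure_dirs(base)  # calling it again changes nothing
print(sorted(p.name for p in Path(base).iterdir()))
```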

[postprocessor] → cfg.postprocessor (PostprocessorConfig)

Key Type Default Description
min_length int 1 Minimum entity text length
max_length int or None None Maximum entity text length
allowed_labels set[str] or None None Allow only these labels; None = allow all
blocked_labels set[str] {} Always exclude these labels
deduplicate bool True Remove duplicate spans
merge_adjacent bool False Merge consecutive same-label spans
lowercase_labels bool False Lowercase all label names
strip_punct bool True Strip leading/trailing punctuation from text
custom_label_map dict[str, str] {} Rename labels e.g. {"PERSON": "PER"}

[preprocessor] → cfg.preprocessor (PreprocessorConfig)

Key Type Default Description
lowercase bool False Lowercase the text
remove_urls bool True Strip http://, https://, www. URLs
remove_emails bool False Strip email addresses
remove_html_tags bool True Strip HTML tags
normalize_whitespace bool True Collapse whitespace to single spaces
normalize_unicode bool True NFC Unicode normalization
max_length int or None None Truncate text to this many characters
custom_patterns list[str] [] Regex patterns to strip

AppConfig API Reference

Method / Property Returns Description
load(path?) AppConfig Parse .pbcfg file and populate all sections
save(path?) AppConfig Write current config to .pbcfg
get(section, key, fallback, cast) Any Read a raw value with optional type casting
set(section, key, value) None Write a value into the in-memory config (call save() to persist)
to_json(indent?) str Serialize all sections to a JSON string
save_json(filepath?) None Write config to a JSON file
as_dict() dict Return entire config as a nested {section: {key: value}} dict
dump() str Human-readable summary of all settings
ensure_tti_dirs() None Create TTI output/model/cache/log directories if missing
AppConfig.discover(start?) AppConfig Search cwd → home for config.pbcfg; use defaults if not found
AppConfig.from_dict(data, path?) AppConfig Build config from a nested dict
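A generic getter with `cast` and `fallback` arguments could behave roughly as sketched below. `typed_get` is a hypothetical function over a plain nested dict; returning the fallback when the cast fails is an assumption, since the table does not say what `AppConfig.get()` does in that case.

```python
def typed_get(data: dict, section: str, key: str, fallback=None, cast=None):
    """Look up data[section][key]; return fallback if missing, empty or uncastable."""
    raw = data.get(section, {}).get(key)
    if raw is None or raw == "":
        return fallback
    if cast is None:
        return raw
    try:
        return cast(raw)
    except (TypeError, ValueError):
        return fallback  # assumption: a failed cast yields the fallback


raw_config = {"tti_ai": {"latent_dim": "128", "seed": ""}, "tts": {"rate": "fast"}}
print(typed_get(raw_config, "tti_ai", "latent_dim", fallback=64, cast=int))
print(typed_get(raw_config, "tti_ai", "seed", fallback=None, cast=int))
print(typed_get(raw_config, "tts", "rate", fallback=150, cast=int))
```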

Module-Level Helpers

from pyaitk.config import get_config, reset_config, generate_default_config

# Get or create the global singleton
cfg = get_config()

# Force a specific file into the singleton
cfg = get_config("custom.pbcfg")

# Reset singleton to factory defaults
cfg = reset_config()

# Write a default config file to disk
path = generate_default_config() # → ./config.pbcfg
path = generate_default_config("myapp.pbcfg")
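The get-or-create, replace, and reset behaviour of these helpers resembles a lock-guarded module-level singleton. The sketch below uses hypothetical names (`ConfigSketch`, `get_config_sketch`, `reset_config_sketch`) and does not load any file; it only shows the instance handling.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class ConfigSketch:
    """Hypothetical stand-in for AppConfig."""
    path: str = "config.pbcfg"
    values: dict = field(default_factory=dict)


_instance = None
_instance_lock = threading.Lock()


def get_config_sketch(path=None) -> ConfigSketch:
    """Return the shared instance; passing a path replaces it."""
    global _instance
    with _instance_lock:
        if _instance is None or path is not None:
            _instance = ConfigSketch(path or "config.pbcfg")
        return _instance


def reset_config_sketch() -> ConfigSketch:
    """Drop the shared instance and return a fresh one with defaults."""
    global _instance
    with _instance_lock:
        _instance = ConfigSketch()
        return _instance


first = get_config_sketch()
second = get_config_sketch()
print(first is second)
```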

Examples Summary

from pyaitk.config import AppConfig, get_config, generate_default_config

# Generate a default file
generate_default_config("myproject.pbcfg")

# Load and read
cfg = AppConfig("myproject.pbcfg")
print(cfg.brain.smart_memory) # True
print(cfg.model.epochs) # 100
print(cfg.stt.google_language) # "en-US"
print(cfg.tti_image.default_format) # "png"
print(cfg.embedding.tfidf_ngram_range) # (1, 3)

# Mutate and save
cfg.set("model", "epochs", 200)
cfg.set("tts", "voice", "zira")
cfg.set("tti_ai", "seed", 1337)
cfg.save()

# Generic typed get
val = cfg.get("clustering", "n_clusters", fallback=8, cast=int)

# JSON export
cfg.save_json("config_backup.json")

# Factory methods
cfg2 = AppConfig.discover()
cfg3 = AppConfig.from_dict({
    "brain": {"username": "divyanshu", "smart_memory": True},
    "llm": {"max_tokens": 256},
})

# Create TTI directories
cfg.ensure_tti_dirs()

# Inspect
print(cfg.dump())

PyAgent: ZENTRAA CLI Reference

ZENTRAA (Zone for Encrypted Networked Talks & Real-time AI Agent)
Powered by Pythonaibrain v1.1.9 · Author: Divyanshu Sinha
Encryption: RSA-2048-OAEP · AES-256-GCM · RSA-PSS · Curve25519


Installation

pip install "pythonaibrain[zentraa]"

Linux only: install PortAudio before pip if you plan to use the TIGER AI voice features:

sudo apt install portaudio19-dev python3-pyaudio

Quick Start

Every command is available through the pythonaibrain (or pyaitk) dispatcher:

pythonaibrain zentraa <command> [options]

Or as a standalone alias:

zentraa-server / zentraa-client / zentraa-tiger-ai / zentraa-web

Typical startup order:

1. zentraa server ← start first
2. zentraa web ← optional browser bridge
3. zentraa ai ← optional TIGER AI agent
4. zentraa client ← one per human user

Commands

zentraa server: TCP Chat Server

Starts the encrypted ZENTRAA chat server that all clients connect to.

pythonaibrain zentraa server
pythonaibrain zentraa server --host 0.0.0.0 --port 9999
pythonaibrain zentraa server --config /path/to/ZENTRAA.pbcfg

Option | Short | Default | Description
--config | -c | ZENTRAA.pbcfg | Path to config file
--host | -H | from config | Override bind host
--port | -p | from config | Override bind port
--help | | | Show help and exit

zentraa client: Chat Client (TUI)

Interactive terminal client for human users. Supports direct messages, broadcasts, and talking to TIGER AI.

pythonaibrain zentraa client
pythonaibrain zentraa client --host 127.0.0.1 --port 9999 --userid Alice
pythonaibrain zentraa client --config /path/to/ZENTRAA.pbcfg

Option | Short | Default | Description
--config | -c | ZENTRAA.pbcfg | Path to config file
--host | -H | from config | Server host
--port | -p | from config | Server port
--userid | -u | prompted | Your user ID
--help | | | Show help and exit

In-chat commands

Command Description
<message> Broadcast to all users
@<userid> <message> Direct message a user
@uid1 @uid2 ... <message> Multi-user direct message
@ai <message> Ask TIGER AI privately
@ai @<uid> <message> Ask AI, share reply with <uid>
/help Show help
/clear or /cls Clear the screen
/ai Show TIGER AI info
/setting View current settings
/users List online users
/me <action> Send an action / emote message
/whois <userid> Show info about a user
/ping Ping the server manually
/stats Show session statistics
/notify <on|off> Toggle bell notifications
/timestamps <on|off> Toggle message timestamps
/quit or /exit Disconnect
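The addressing rules in the table (no mention means broadcast, leading `@userid` tokens mean a direct message, a leading `@ai` routes to the agent) can be sketched as a small parser. `parse_message` is hypothetical and ignores slash commands; it is not the client's actual implementation.

```python
def parse_message(line: str) -> dict:
    """Split leading @mentions from the message body (hypothetical parser)."""
    tokens = line.strip().split()
    targets = []
    # Consume every leading token of the form @name.
    while tokens and tokens[0].startswith("@") and len(tokens[0]) > 1:
        targets.append(tokens.pop(0)[1:])
    body = " ".join(tokens)
    if not targets:
        kind = "broadcast"
    elif targets[0].lower() == "ai":
        kind = "ai"
    else:
        kind = "direct"
    return {"kind": kind, "targets": targets, "body": body}


print(parse_message("hello everyone"))
print(parse_message("@alice @bob lunch?"))
print(parse_message("@ai @alice summarize this"))
```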

Keyboard shortcuts

Key | Action
↑ / ↓ | Browse command history
Tab | Autocomplete @userid or /command
Ctrl+C / Ctrl+D | Quit

zentraa ai: TIGER AI Agent

Connects an automated AI agent (TIGER AI) to the server. Other users can query it with @ai <message>.

pythonaibrain zentraa ai
pythonaibrain zentraa ai --smart
pythonaibrain zentraa ai --basic --host 127.0.0.1 --port 9999
pythonaibrain zentraa ai --config /path/to/ZENTRAA.pbcfg

Option | Short | Default | Description
--config | -c | ZENTRAA.pbcfg | Path to config file
--host | -H | from config | Server host
--port | -p | from config | Server port
--smart | | from config | Force AdvanceBrain (LLM mode)
--basic | | from config | Force Brain (intent-matching mode)
--help | | | Show help and exit

--smart and --basic are mutually exclusive. If neither is passed, the value from ZENTRAA.pbcfg is used.
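
The "flag wins, otherwise fall back to the config" behaviour can be sketched with the standard `argparse` module. This is an illustration under assumptions, not the package's own CLI code: `build_parser` and `resolve_settings` are hypothetical names, and the config is modelled as a plain dict.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the option table above (illustrative only)."""
    parser = argparse.ArgumentParser(prog="zentraa-ai-sketch")
    parser.add_argument("-c", "--config", default="ZENTRAA.pbcfg")
    parser.add_argument("-H", "--host", default=None)
    parser.add_argument("-p", "--port", type=int, default=None)
    # --smart and --basic write to the same destination; None means "not given".
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--smart", dest="smart", action="store_const",
                      const=True, default=None)
    mode.add_argument("--basic", dest="smart", action="store_const",
                      const=False, default=None)
    return parser


def resolve_settings(args: argparse.Namespace, config: dict) -> dict:
    """Prefer explicit CLI values; otherwise use the values loaded from the config."""
    return {
        "host": args.host if args.host is not None else config["host"],
        "port": args.port if args.port is not None else config["port"],
        "smart_ai": args.smart if args.smart is not None else config["smart_ai"],
    }
```

Passing both `--smart` and `--basic` to this parser makes `argparse` exit with a usage error, which matches the mutual exclusion described above.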


zentraa web - HTTP / WebSocket Bridge

Starts the HTTP and WebSocket bridge so browser clients can connect to the ZENTRAA TCP server. The bridge auto-selects a free port starting from 7080 unless --no-auto-port is set.

pythonaibrain zentraa web
pythonaibrain zentraa web --http-port 7080 --tcp-port 9999
pythonaibrain zentraa web --no-auto-port --max-upload-mb 128 --history 1000

Option            Default       Description
--host            0.0.0.0       HTTP bind host
--http-port       7080 (auto)   HTTP / WebSocket port
--tcp-host        127.0.0.1     ZENTRAA TCP server host
--tcp-port        9999          ZENTRAA TCP server port
--no-auto-port    off           Fail instead of scanning for a free port
--max-upload-mb   64            Maximum file upload size in MB
--history         500           Messages stored per conversation
--ping-interval   20            WebSocket ping interval in seconds
--help                          Show help and exit
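
Port auto-selection of the kind described above is commonly done by trying to bind successive ports until one succeeds. The following is a minimal sketch of that technique, not the bridge's actual code: `find_free_port` is a hypothetical name and the `max_tries` limit is an assumption, since the scan range is not documented here.

```python
import socket


def find_free_port(start: int = 7080, host: str = "0.0.0.0",
                   max_tries: int = 50, auto: bool = True) -> int:
    """Return the first bindable TCP port at or above `start`.

    With auto=False only `start` is tried, mirroring --no-auto-port.
    """
    tries = max_tries if auto else 1
    for port in range(start, min(start + tries, 65536)):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            try:
                probe.bind((host, port))
            except OSError:
                continue  # port is in use or not permitted; try the next one
            return port
    raise OSError(f"no free port found starting at {start}")
```

The probe socket is closed before the port is returned, so another process could still claim the port in between. A real server would keep the bound socket open instead of re-binding.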

Once running, open your browser at:

http://localhost:7080

Bridge features

  • Typing indicators
  • Read receipts
  • Emoji reactions
  • Message delivery confirmations
  • Rich user list (online/offline status)
  • File upload via POST /api/upload (chunked, ≤128 KB per chunk)
  • REST GET /api/users: online users and metadata
  • REST GET /api/history/{conv_id}: last N messages
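
A client uploading through the bridge would need to split a file into pieces that respect the per-chunk limit. The sketch below shows only that splitting step. The request format of POST /api/upload is not described here, so no HTTP call is made. `iter_chunks` is a hypothetical helper, and treating 128 KB as 128 × 1024 bytes is an assumption.

```python
from typing import Iterator

# Per-chunk ceiling from the feature list; 1 KB = 1024 bytes is assumed.
MAX_CHUNK_BYTES = 128 * 1024


def iter_chunks(data: bytes, chunk_size: int = MAX_CHUNK_BYTES) -> Iterator[bytes]:
    """Yield consecutive slices of `data`, each at most `chunk_size` bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for offset in range(0, len(data), chunk_size):
        yield data[offset:offset + chunk_size]
```

A 300 KB payload, for instance, splits into two full 128 KB chunks and one 44 KB remainder.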

Global CLI Flags

pythonaibrain --version   # Print version
pythonaibrain --info      # Package metadata + module availability
pythonaibrain --modules   # Per-module availability table
pythonaibrain --help      # Full help

Configuration

All commands read ZENTRAA.pbcfg from the current directory by default. Pass --config to use a different path.

A minimal config example:

[network]
host=0.0.0.0
port=9999

[client]
default_host=127.0.0.1
default_port=9999

[ai]
smart_ai=true

[ui]
banner_style=full
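
The example above looks like standard INI syntax. Assuming it is INI-compatible (the format of .pbcfg files is not stated here), it can be read with Python's built-in `configparser`. `load_settings` is a hypothetical helper and is not part of the package.

```python
import configparser

# The minimal config from above, inlined so the sketch is self-contained.
SAMPLE_CONFIG = """\
[network]
host=0.0.0.0
port=9999

[client]
default_host=127.0.0.1
default_port=9999

[ai]
smart_ai=true

[ui]
banner_style=full
"""


def load_settings(text: str) -> dict:
    """Parse config text into typed values (assumes INI-compatible syntax)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        "bind": (parser.get("network", "host"), parser.getint("network", "port")),
        "client_target": (parser.get("client", "default_host"),
                          parser.getint("client", "default_port")),
        "smart_ai": parser.getboolean("ai", "smart_ai"),
        "banner_style": parser.get("ui", "banner_style"),
    }
```

`getint` and `getboolean` convert the raw strings, so `smart_ai=true` becomes the Python value `True`.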

.env files and RSA key files (.pem) in .zentraa_keys/ are generated locally and are never included in the package. Keep them out of version control.


Architecture

Browser ──WS/JSON──► HTTP Bridge (zentraa web)
                                 │
                           TCP/Encrypted
                                 │
                  ZENTRAA Server (zentraa server)
                                 │
                    ┌────────────┴────────────┐
              Chat Clients             TIGER AI Agent
            (zentraa client)            (zentraa ai)

License

  • PyAgent / ZENTRAA: LGPL-3.0-or-later
  • pyaitk.CLSE: AGPL-3.0-or-later (see pyaitk/CLSE/LICENSE.txt)

Visit PyPI for installation and more details.

Visit GitHub for more details about the package.

Visit Pythonaibrain Issues to report any problems.


Start building your AI assistant today with Pythonaibrain!


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pythonaibrain-1.1.9.tar.gz (28.7 MB)

Uploaded Source

Built Distribution

pythonaibrain-1.1.9-py3-none-any.whl (28.7 MB)

Uploaded Python 3

File details

Details for the file pythonaibrain-1.1.9.tar.gz.

File metadata

  • Download URL: pythonaibrain-1.1.9.tar.gz
  • Upload date:
  • Size: 28.7 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.8

File hashes

Hashes for pythonaibrain-1.1.9.tar.gz
Algorithm Hash digest
SHA256 3e9dee64045d4182db2fee4f265760a779f5cc621a54ab423d123de2becf1f84
MD5 73572839861d733a051792e0d9b401d5
BLAKE2b-256 79dbda65aac591a8a0da6b80e66c4b41e0d06f35d09e60cf27847e519ce070ea


File details

Details for the file pythonaibrain-1.1.9-py3-none-any.whl.

File metadata

  • Download URL: pythonaibrain-1.1.9-py3-none-any.whl
  • Upload date:
  • Size: 28.7 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.8

File hashes

Hashes for pythonaibrain-1.1.9-py3-none-any.whl
Algorithm Hash digest
SHA256 e39701c87dfe0982144b1e12ee7d458920f81d59bde1adfdee3eb33243dcc433
MD5 8e47d5b5b846bbf0af8dc6ae2e26b7a2
BLAKE2b-256 23debf22001afc78dc19b423d0ef4dee78fe02d41d4546e99f20bf184124c50e

