pythonaibrain 1.1.9
pip install pythonaibrain
Released:
A versatile, plug-and-play Python AI toolkit for building offline intelligent assistants.
Navigation
Verified details
These details have been verified by PyPIMaintainers
π Avatar for DivyanshuSinha from gravatar.comDivyanshuSinha
Unverified details
These details have not been verified by PyPIProject links
Meta
-
License Expression: LGPL-3.0-or-later AND AGPL-3.0-or-later
SPDX License Expression - Author: Divyanshu Sinha
- Tags ai , artificial-intelligence , assistant , tts , stt , speech , nlp , nlp-library , text-to-speech , speech-to-text , object-detection , computer-vision , named-entity-recognition , ner , image-generation , offline-ai , math-ai , summarizer , memory
- Requires: Python >=3.9
-
Provides-Extra:
core,tts,stt,camera,itt,context,ner,memory,math,search,pptx,pdf,eye,summarizer,clse,all,dev,docs,zentraa
Classifiers
- Development Status
- Intended Audience
- Natural Language
- Operating System
- Programming Language
- Topic
Project description
Pythonaibrain
Pythonaibrain is a versatile, plug-and-play Python package designed to help you build offline intelligent AI assistants and applications effortlessly. With modules covering speech recognition, text-to-speech, natural language understanding, and more, Pythonaibrain lets you create powerful AI solutions without deep expertise or complex setup. Whether youβre a beginner or an experienced developer, get ready to bring your AI ideas to life quickly and efficiently.
Requirements
- Python 3.9 or later
- pip 23+
Installation
Minimal install (core package only)
pipinstallpythonaibrain==1.1.9
Install with specific modules
Pick only what you need:
# Text-to-Speech pipinstall"pythonaibrain[tts]==1.1.9" # Speech-to-Text pipinstall"pythonaibrain[stt]==1.1.9" # Camera + QR/barcode pipinstall"pythonaibrain[camera]==1.1.9" # Image-to-Text (OCR) pipinstall"pythonaibrain[itt]==1.1.9" # AI Brain + NLP pipinstall"pythonaibrain[core]==1.1.9" # Named-Entity Recognition pipinstall"pythonaibrain[ner]==1.1.9" # Memory management pipinstall"pythonaibrain[memory]==1.1.9" # Mathematical AI + CAS pipinstall"pythonaibrain[math]==1.1.9" # Internet Search pipinstall"pythonaibrain[search]==1.1.9" # PowerPoint extraction pipinstall"pythonaibrain[pptx]==1.1.9" # PDF extraction pipinstall"pythonaibrain[pdf]==1.1.9" # Real-time object detection (YOLOv8) pipinstall"pythonaibrain[eye]==1.1.9" # Text summarisation pipinstall"pythonaibrain[summarizer]==1.1.9" # ZENTRAA encrypted chat server + clients pipinstall"pythonaibrain[zentraa]==1.1.9" # CLSE β image generation β AGPL-3.0-or-later pipinstall"pythonaibrain[clse]==1.1.9"
Install multiple modules at once
pipinstall"pythonaibrain[core,tts,stt]==1.1.9" pipinstall"pythonaibrain[core,ner,memory,summarizer]==1.1.9"
Install everything
pipinstall"pythonaibrain[all]==1.1.9"
Linux β System Dependencies
Some modules require native system libraries. Install them before running pip:
# STT (Speech-to-Text) β PortAudio sudoaptinstallportaudio19-devpython3-pyaudio # Camera / QR barcode β zbar sudoaptinstalllibzbar0
Post-install Steps
NLTK data (required by Brain, Context, NER, SummarizerAI)
importpyaitk pyaitk.InstallNLTKData()
Or manually:
python-mnltk.downloaderpunktpunkt_tabaveraged_perceptron_taggerstopwordswordnetwordsmaxent_ne_chunker
spaCy model (required by NER, optional for Core)
python-mspacydownloaden_core_web_sm
STT offline engine (optional β pocketsphinx)
pipinstallpocketsphinx
Verify Installation
importpyaitk print(pyaitk.get_version()) # 1.1.9 info = pyaitk.get_info() print(info) availability = pyaitk.check_module_availability() for module, ok in availability.items(): status = "β" if ok else "β" print(f" {status}{module}")
Or from the terminal:
pythonaibrain--version pythonaibrain--modules pythonaibrain--info
Import Usage
Each module is imported directly by its submodule path:
importpyaitk.core importpyaitk.TTS importpyaitk.STT importpyaitk.Camera importpyaitk.NER importpyaitk.Memory importpyaitk.MathAI importpyaitk.Search importpyaitk.SummarizerAI importpyaitk.CLSE importpyaitk.eye importpyaitk.ITT importpyaitk.PPTExtract importpyaitk.Grammar
License Notice
| Component | License |
|---|---|
| Pythonaibrain (all modules) | LGPL-3.0-or-later |
pyaitk.CLSE |
AGPL-3.0-or-later |
If you use pyaitk.CLSE in your application, the AGPL requires you to release your application's source code. See pyaitk/CLSE/LICENSE.txt for full terms.
About Pythonaibrain Package
Pythonaibrain package consists of pyaitk (which means Python AI Toolkit) module and PyAgent modules, pyaitk provides you methods to create your AI models/chatbots where as PyAgent provides best GUI and Web supports for intrection with models/chatbots. It also provide best contorl on your device. In this package you got default pre-trained models.
pyaitk (Python AI ToolKit)
pyaitk provides various type of methods and functions to create an advance AI.
How to import pyaitk
After pythonaibrain installations run this code for importing pyaitk
importpyaitk
Core Component
The central orchestration layer of the Pythonaibrain (pyaitk) framework. Wires together intent classification, neural chatbot training, memory, NER, translation, frame analysis, weather, and an optional quantized LLM β all under two primary entry points: Brain and AdvanceBrain.
What Is This?
pyaitk.core is the heart of the pyaitk package, providing:
Brainβ intent-based chatbot with memory, NER, translation, frame classification, grammar correction, and TTSAdvanceBrainβ routes through a local quantized LLM (pythonaibrain-llm) for open-ended generation, with the same interface asBrainVectorizerModeβ switchable feature extraction: Binary BoW (NumPy), TF-IDF (scikit-learn), or Gensim TF-IDFIntentsManagerβ load, save, and extendintents.jsonfiles dynamically- Frame classification β lightweight neural sentence-type classifier (Statement / Question / Command / Name / β¦)
- NER integration β entity extraction via the NER subsystem, with optional in-call training
- Translation β GRU seq-to-seq translator (multilingual β English), trained once per process
- Language detection β keyword-based classifier for English / Hindi / French / Spanish
- Weather API β city-level weather lookup via OpenWeatherMap
configure()β load a.pbcfgfile to override all settings before constructing any Brain- Structured logging β all subsystems use Python's
loggingmodule, config-driven level and format
Installation
# Optional: AdvanceBrain LLM support
pipinstallpythonaibrain-llm
Set your OpenWeatherMap API key in a .env file:
weather_api_key=YOUR_KEY_HERE
Quick Start
frompyaitk.coreimport Brain with Brain() as brain: brain.load() print(brain.process_messages("Hello"))
Configuration
Load a .pbcfg file before constructing any Brain to override all settings:
importpyaitk.coreascore core.configure("project.pbcfg") brain = core.Brain()
configure() updates the global config singleton, applies logging settings, and returns the loaded AppConfig.
VectorizerMode β Feature Extraction
Controls how text is converted to feature vectors for the neural intent classifier.
| Mode | Backend | Notes |
|---|---|---|
BOW (default) |
Pure NumPy | Binary Bag-of-Words; zero extra deps |
TFIDF |
scikit-learn | TF-IDF weighting via TfidfVectorizer |
GENSIM |
Gensim | Requires pip install gensim |
frompyaitk.coreimport Brain, VectorizerMode brain = Brain(vectorizer_mode=VectorizerMode.TFIDF) brain = Brain(vectorizer_mode=VectorizerMode.GENSIM) brain = Brain() # default BOW
Brain β Intent-Based Chatbot
Basic usage
frompyaitk.coreimport Brain # Train from scratch with Brain() as brain: brain.train() brain.save() print(brain.ask("What is the weather?")) # Load a saved model with Brain() as brain: brain.load() response = brain.process_messages("Tell me a joke", grammar=True) print(response)
Constructor parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
intents_path |
str or None |
config | Path to intents.json |
condition |
bool or None |
config | Enable dynamic intent learning from search results |
download |
bool or None |
config | Auto-download NLTK data on init |
memory_path |
str or None |
config | Path for memory persistence file |
smart_memory |
bool or None |
config | Use SmartMemory (semantic search + clustering) |
memory_fit_interval |
int or None |
config | Auto-fit SmartMemory every N memories |
config |
AppConfig or None |
global | Override global config for this instance |
vectorizer_mode |
VectorizerMode |
BOW |
Feature extraction strategy |
**function_mapping |
Any |
β | Map intent tags to Python callables |
Methods
| Method | Returns | Description |
|---|---|---|
train() |
None |
Parse intents, build features, train neural model |
load() |
None |
Load saved model from paths in config |
save() |
None |
Save model weights and dimension file |
process_messages(message, grammar, TTS) |
str |
Classify intent, respond, remember, optionally speak |
talk(message, grammar, TTS) |
str |
Attempt web search first, fall back to process_messages |
ask(query, TTS) |
str |
talk() + typewriter-style write() output |
write(message, set_timer, TTS) |
None |
Print message character by character with optional TTS |
translator(message) |
str |
Translate input to English via GRU seq-to-seq model |
classify_language(message) |
str |
Detect language: english, hindi, french, spanish |
predict_message_type(message) |
str |
Classify sentence type: Statement / Question / Command / β¦ |
predict_entitie(message, train) |
list |
Extract named entities via NER pipeline |
memorize_user_name(message) |
None |
Detect and store user name from message |
recall_user_name() |
str |
Retrieve stored user name from memory |
search_memory(query, top_k) |
list |
Semantic or substring search over conversation history |
memory_intent(query) |
str |
Predict intent of query from stored patterns |
memory_report() |
Any |
Return SmartMemory cluster/intent analysis report |
export_memory_report(path) |
bool |
Export memory report as JSON |
fit_memory() |
bool |
Manually retrain SmartMemory summarizer |
is_loaded() / is_saved() / is_trained() |
bool |
State flags |
count_query() |
int |
Total number of queries processed |
pyai_say(*msg) |
None |
Print with PYAI : prefix |
Function mapping
Map intent tags to Python callables β called automatically when that intent is predicted:
defopen_calculator(): importsubprocess; subprocess.Popen("calc.exe") with Brain(calculator=open_calculator) as brain: brain.load() brain.process_messages("open calculator")
Memory
Brain uses SmartMemory by default (falls back to plain Memory if the summarizer module is absent). Every process_messages() call automatically stores the exchange.
# Semantic search over history results = brain.search_memory("weather", top_k=3) # Export memory analytics brain.export_memory_report("memory_report.json") # Manually trigger SmartMemory refit brain.fit_memory()
AdvanceBrain β LLM-powered Brain
Routes responses through a local quantized LLM from pythonaibrain-llm. Falls back to the intent classifier when advance=False.
frompyaitk.coreimport AdvanceBrain with AdvanceBrain() as brain: brain.load() print(brain.process_messages("What is Python?")) # Skip LLM, use intent classifier only print(brain.process_messages("Hello", advance=False))
AdvanceBrain has the same train(), load(), save(), translator(), classify_language(), predict_message_type(), and predict_entitie() methods as Brain.
Requires:
pip install pythonaibrain-llmThe LLM is lazy-loaded on firstprocess_messages(advance=True)call.
IntentsManager β Dynamic Intent Management
frompyaitk.coreimport IntentsManager im = IntentsManager("intents.json") # Add or extend an intent im.add_intent( tag="greeting", patterns=["Hi", "Hey there"], responses=["Hello!", "Hey! How can I help?"] ) # Add a search-derived intent im.add_search_intent("best Python books", ["Python Crash Course", "Fluent Python"])
Utility Functions
Weather
frompyaitk.coreimport get_weather weather = get_weather("London") # e.g. "Clouds"
Requires weather_api_key in .env. Also available: longitude(city), latitude(city), humidity(city), temperature(city).
Frame classification
frompyaitk.coreimport predict_frame, VectorizerMode frame = predict_frame("What time is it?") # β "Question" frame = predict_frame("Open the door") # β "Command" frame = predict_frame("The sky is blue") # β "Statement" frame = predict_frame("My name is Divyanshu") # β "Name"
Supported frame types: Statement, Question, Command, Answer, Name, Know, Shutdown, Make Dir, Start.
Translation
frompyaitk.coreimport translate_to_en text = translate_to_en("tum kaise ho") # β "how are you" text = translate_to_en("hola como estas")
Model trains once per process on a built-in multilingual corpus (Hindi, French, Spanish, English).
Language detection
frompyaitk.coreimport language_classifier lang = language_classifier("tum kaise ho") # β "hindi" lang = language_classifier("je m'appelle") # β "french" lang = language_classifier("hola como estas") # β "spanish" lang = language_classifier("Hello there") # β "english"
NER
frompyaitk.coreimport Brain with Brain() as brain: brain.load() entities = brain.predict_entitie("Apple was founded by Steve Jobs.") print(entities) # list of Entity objects
Or directly:
frompyaitk.coreimport predictNER entities = predictNER("NASA launched Voyager 1.") entities = predictNER("LeBron James plays for the Lakers.", train=True) # retrain first
Full Example
frompyaitk.coreimport Brain, VectorizerMode, configure # Optional: load a custom config before anything else configure("myproject.pbcfg") # Train and save with Brain(vectorizer_mode=VectorizerMode.TFIDF) as brain: brain.train() brain.save() # Load and interact with Brain() as brain: brain.load() # Chat print(brain.ask("Tell me a joke")) print(brain.process_messages("What is the weather in Delhi?")) # Language tools print(brain.classifier_language("tum kaise ho")) # hindi print(brain.translator("mera naam ravi hai")) # my name is ravi print(brain.predict_message_type("Shutdown /s")) # Shutdown # NER entities = brain.predict_entitie("Elon Musk founded SpaceX.") print(entities) # Memory brain.memorize_user_name("My name is Divyanshu") print(brain.recall_user_name()) results = brain.search_memory("joke", top_k=3) brain.export_memory_report("report.json") # Stats print(brain.count_query()) print(brain.is_loaded(), brain.is_saved())
Architecture Overview
Brain / AdvanceBrain
βββ ChatbotAssistant β intent parse β feature matrix β PyTorch neural classifier
β βββ VectorizerMode β BOW (NumPy) | TF-IDF (sklearn) | Gensim TF-IDF
β βββ IntentsManager β load/save/extend intents.json
βββ Memory / SmartMemory β conversation persistence + semantic search
βββ NERPipeline β named entity recognition (spaCy-based)
βββ GrammarCorrector β optional post-processing on responses
βββ TTS (speak) β optional audio output
βββ Search β fallback web search for unknown intents
βββ FrameClassifier β sentence-type classifier (singleton, trains once)
βββ GRU Translator β multilingual β English (singleton, trains once)
βββ _PyAILLM β lazy-loaded quantized LLM (AdvanceBrain only)
βββ WeatherAPI β OpenWeatherMap lookup
TTS - Text To Speech
A fully offline, cross-platform Text-to-Speech module built on the pyttsx3 backend. Works on Windows (SAPI5), macOS (NSSpeechSynthesizer), and Linux (eSpeak) without any network calls.
How to Use
1. One-shot (simplest)
frompyaitk.TTSimport TTS TTS().say("Hello, world!")
2. Context manager (recommended)
The engine is initialised once and shut down cleanly on exit β even if an exception is raised.
frompyaitk.TTSimport TTS, TTSConfig with TTS(TTSConfig(voice="zira", rate=160)) as tts: tts.say("Hello!") tts.say("How are you?")
3. Module-level convenience function
frompyaitk.TTSimport speak speak("Hello world!") speak("Bonjour", voice="fiona", rate=140)
4. Save speech to a WAV file
frompyaitk.TTSimport TTS, TTSConfig cfg = TTSConfig(output_path="greeting.wav") with TTS(cfg) as tts: saved_path = tts.save("Hello, this will be saved to a file.") print(f"Saved to: {saved_path}")
You can also override the path at call time:
with TTS() as tts: tts.save("Custom path example", path="custom_output.wav")
5. List available voices
frompyaitk.TTSimport TTS with TTS() as tts: for name in tts.available_voices(): print(name)
6. Get platform-appropriate voice hints
frompyaitk.TTSimport TTS with TTS() as tts: print(tts.platform_voice_hints()) # e.g. ['david', 'zira', 'mark'] on Windows
Configuration (TTSConfig)
All parameters are optional and fall back to values from your project config (config.tts.*).
| Parameter | Type | Description |
|---|---|---|
rate |
int |
Speech rate in words per minute |
volume |
float |
Volume level from 0.0 (silent) to 1.0 (full) |
voice |
str |
Voice name fragment (fuzzy matched) |
default_text |
str |
Fallback text when say() is called with no argument |
output_path |
str or None |
Default WAV output path for save() |
cfg = TTSConfig(rate=180, volume=0.8, voice="samantha")
Voice Matching
Voices are resolved from a short fragment string using a ranked strategy:
- Direct substring match β
"zira"matches"Microsoft Zira Desktop" - Platform-hint-assisted match β uses known fragments for the current OS
- First installed voice β fallback with a warning logged
Platform voice hints:
| Voice | Gender | Style | OS |
|---|---|---|---|
| David | Male | en-US | Windows |
| Mark | Male | en-US | Windows |
| Zira | Female | en-US | Windows |
| Alex | Male | en-US | macOS |
| Samantha | Female | en-US | macOS |
| Victoria | Female | en-US | macOS |
| Fred | Male | Robotic | macOS |
| Daniel | Male | en-GB | macOS |
| Fiona | Female | en-GB | macOS |
| English | β | Default eSpeak | Linux |
| English-US | β | American English | Linux |
| English-UK | β | British English | Linux |
| MB-EN1 | β | MBROLA English 1 | Linux |
| MB-FR1 | β | MBROLA French 1 | Linux |
Examples Summary
# Quickest usage frompyaitk.TTSimport speak speak("Hello!") # Full control with context manager frompyaitk.TTSimport TTS, TTSConfig with TTS(TTSConfig(voice="david", rate=175, volume=0.9)) as tts: tts.say("Line one.") tts.say("Line two.") # Save to file with TTS(TTSConfig(voice="samantha")) as tts: tts.save("Saved audio example.", path="demo.wav") # List voices with TTS() as tts: voices = tts.available_voices() print(voices)
STT β Speech-to-Text Module
A fully cross-platform Speech-to-Text library supporting both online (Google Speech Recognition) and offline (PocketSphinx) backends, with automatic engine selection based on network availability.
What Is This?
- Dual engine support β Google Speech Recognition (online) and PocketSphinx (offline), selected automatically
- Auto engine detection β checks network connectivity at runtime and picks the best available engine
- Context manager interface β opens the microphone once and releases it cleanly on exit, even on exception
- Ambient noise calibration β samples the noise floor before each listen to improve accuracy
- Configurable capture parameters β energy threshold, pause detection, phrase time limits, and timeouts
- Retry logic β automatically retries on transient service errors with configurable delay
- Structured logging β all internal events use Python's
loggingmodule; no bareprintstatements in library code - Custom exception hierarchy β fine-grained errors (
STTAudioError,STTRecognitionError,STTServiceError,STTEngineError) for clean error handling
Installation
On Linux, you may also need:
sudoaptinstallportaudio19-devpython3-pyaudio
How to Use
1. One-shot (simplest)
frompyaitk.sttimport STT stt = STT() text = stt.listen() print(text)
2. Context manager (recommended)
The microphone is opened once and released cleanly on exit β even if an exception is raised.
frompyaitk.sttimport STT with STT() as stt: text = stt.listen() print(text)
3. Force a specific engine
frompyaitk.sttimport STT, STTConfig, Engine # Force offline mode cfg = STTConfig(preferred_engine=Engine.POCKETSPHINX) with STT(config=cfg) as stt: text = stt.listen() print(text) # Force online Google mode cfg = STTConfig(preferred_engine=Engine.GOOGLE) with STT(config=cfg) as stt: text = stt.listen() print(text)
4. Check which engine is active
frompyaitk.sttimport STT with STT() as stt: print(stt.active_engine) # Engine.GOOGLE or Engine.POCKETSPHINX text = stt.listen()
5. Custom configuration
frompyaitk.sttimport STT, STTConfig cfg = STTConfig( timeout=8.0, # wait up to 8 seconds for speech to start phrase_time_limit=15.0, # cap each utterance at 15 seconds pause_threshold=1.0, # 1 second of silence = end of phrase ambient_noise_duration=2.0, # spend 2 seconds calibrating noise floor google_language="en-GB", # use British English max_retries=5, # retry up to 5 times on service errors ) with STT(config=cfg) as stt: text = stt.listen() print(text)
6. Multi-turn listening loop
frompyaitk.sttimport STT, STTAudioError, STTRecognitionError with STT() as stt: while True: try: text = stt.listen() print(f"You said: {text}") if "stop" in text.lower(): break except STTAudioError: print("No speech detected, trying againβ¦") except STTRecognitionError: print("Couldn't understand that, please repeat.")
Configuration (STTConfig)
All parameters are optional and fall back to values from your project config (config.stt.*).
| Parameter | Type | Description |
|---|---|---|
energy_threshold |
float or None |
Mic sensitivity; None = auto-calibrate |
dynamic_energy_threshold |
bool |
Continuously adjust threshold for ambient noise |
pause_threshold |
float |
Seconds of silence that mark end of phrase |
phrase_time_limit |
float or None |
Hard cap per utterance in seconds |
timeout |
float or None |
Seconds to wait for speech to begin |
ambient_noise_duration |
float |
Seconds spent sampling the noise floor before listening |
preferred_engine |
Engine or None |
Force a specific engine; None = auto-detect |
google_language |
str |
BCP-47 language tag for Google (e.g. "en-US", "fr-FR") |
google_api_key |
str or None |
Google API key; None = free tier |
sphinx_language |
str |
Language/model name for PocketSphinx |
max_retries |
int |
Max retry attempts on transient service errors |
retry_delay |
float |
Seconds to wait between retries |
connectivity_host |
str |
Host used for the network connectivity check |
connectivity_port |
int |
Port used for the connectivity check |
connectivity_timeout |
float |
Timeout for the connectivity probe |
Engine Selection
The engine is resolved in this order:
config.preferred_engineβ if explicitly set, always used- Network probe β a TCP connection to
connectivity_host:connectivity_portis attempted- Success β
Engine.GOOGLE(online) - Failure β
Engine.POCKETSPHINX(offline)
- Success β
| Engine | Mode | Requires | Best for |
|---|---|---|---|
GOOGLE |
Online | Internet access | High accuracy, broad language support |
POCKETSPHINX |
Offline | pocketsphinx |
Air-gapped / low-latency use |
Exception Hierarchy
STTError
βββ STTAudioError β mic could not be opened, or no speech detected in time
βββ STTRecognitionError β audio captured but speech was unintelligible
βββ STTServiceError β online service was unreachable or returned an error
βββ STTEngineError β PocketSphinx failed to init or missing language data
Example error handling:
frompyaitk.STTimport STT, STTAudioError, STTRecognitionError, STTServiceError, STTEngineError try: with STT() as stt: text = stt.listen() print(text) except STTAudioError as e: print(f"Mic issue: {e}") except STTRecognitionError as e: print(f"Could not understand: {e}") except STTServiceError as e: print(f"Service unreachable: {e}") except STTEngineError as e: print(f"Offline engine error: {e}")
Examples Summary
# Quickest usage frompyaitk.STTimport STT text = STT().listen() # Context manager with config frompyaitk.STTimport STT, STTConfig, Engine cfg = STTConfig(preferred_engine=Engine.GOOGLE, google_language="fr-FR", timeout=6.0) with STT(config=cfg) as stt: text = stt.listen() print(text) # Resilient loop with error recovery frompyaitk.STTimport STT, STTAudioError, STTRecognitionError with STT() as stt: while True: try: print(stt.listen()) except STTAudioError: pass # timed out, try again except STTRecognitionError: print("Please repeat.")
PTT (PDF To Text) Function
What Is This?
- Full document extraction β reads all pages and joins them into a single string
- Per-page fault tolerance β if a single page fails, extraction continues for the rest
- Configurable page separator β control how pages are joined in the output string
- Structured logging β all warnings and errors go through Python's
loggingmodule - Custom exception β
PDFExtractionErrorwraps PyMuPDF errors for clean upstream handling - Input validation β checks for empty paths, missing files, and non-file paths before opening
How to Use
1. Basic extraction
frompyaitk.PTTimport extract_text_from_pdf text = extract_text_from_pdf("document.pdf") print(text)
2. Custom page separator
text = extract_text_from_pdf("report.pdf", page_separator="\n---\n") print(text)
3. Custom encoding
text = extract_text_from_pdf("document.pdf", encoding="latin-1")
4. With error handling
frompyaitk.PTTimport extract_text_from_pdf, PDFExtractionError try: text = extract_text_from_pdf("document.pdf") print(f"Extracted {len(text)} characters.") except FileNotFoundError: print("File not found.") except ValueError as e: print(f"Invalid input: {e}") except PDFExtractionError as e: print(f"Could not read PDF: {e}")
5. Processing multiple files
frompathlibimport Path frompyaitk.PTTimport extract_text_from_pdf, PDFExtractionError results = {} for pdf_path in Path("./docs").glob("*.pdf"): try: results[pdf_path.name] = extract_text_from_pdf(str(pdf_path)) except PDFExtractionError as e: print(f"Skipping {pdf_path.name}: {e}") for name, text in results.items(): print(f"{name}: {len(text)} characters")
API Reference
extract_text_from_pdf(pdf_path, encoding, page_separator)
Extracts all text from a PDF and returns it as a single string.
| Parameter | Type | Default | Description |
|---|---|---|---|
pdf_path |
str |
(required) | Path to the PDF file |
encoding |
str |
"utf-8" |
Text encoding for extraction |
page_separator |
str |
"\n\n" |
String inserted between pages in the output |
Returns: str β full extracted text from all pages.
Raises:
| Exception | When |
|---|---|
ValueError |
pdf_path is None, empty, or not a file path |
FileNotFoundError |
The specified file does not exist on disk |
PDFExtractionError |
PDF is corrupted, invalid, or unreadable |
Exception Reference
PDFExtractionError
βββ Wraps fitz.FileDataError and other PyMuPDF runtime errors
PDFExtractionError is the single catch-all for PDF-level failures. FileNotFoundError and ValueError are raised directly for path/input issues and do not need to be caught as PDFExtractionError.
Behaviour Notes
- Empty PDF β if the document has zero pages, an empty string
""is returned and a warning is logged. - Page-level errors β if one page fails to extract, that page contributes an empty string and extraction continues for remaining pages. The error is logged.
- No partial results lost β all successfully extracted pages are still returned even if some pages failed.
Examples Summary
# Minimal usage frompyaitk.PTTimport extract_text_from_pdf text = extract_text_from_pdf("file.pdf") # Custom separator between pages text = extract_text_from_pdf("file.pdf", page_separator="\n--- PAGE BREAK ---\n") # Full error handling frompyaitk.PTTimport extract_text_from_pdf, PDFExtractionError try: text = extract_text_from_pdf("file.pdf") except FileNotFoundError: print("File not found.") except PDFExtractionError as e: print(f"Extraction failed: {e}") # Batch processing a folder frompathlibimport Path frompyaitk.PTTimport extract_text_from_pdf, PDFExtractionError for f in Path("./docs").glob("*.pdf"): try: print(f"{f.name}: {len(extract_text_from_pdf(str(f)))} chars") except PDFExtractionError: print(f"{f.name}: failed")
PPTXExtractor β PowerPoint Content Extraction Utility
A straightforward utility for extracting text, images, and tables from .pptx PowerPoint files using python-pptx. Processes all slides and organises extracted content by slide number.
What Is This?
PPTXExtractor is a class-based PPTX extraction tool that provides:
- Text extraction β pulls all non-empty text from every shape across all slides
- Image extraction β saves embedded images to disk, preserving their original format (PNG, JPEG, etc.)
- Table extraction β reads all table shapes and returns row/cell data as nested lists
- All-in-one extraction β a single
extract_all()call returns text, images, and tables together - Auto output directory β creates the image output folder automatically if it doesn't exist
- Slide-indexed results β all output is keyed by slide number (1-based) for easy lookup
How to Use
1. Extract everything at once
frompyaitk.PPTExtractimport PPTXExtractor extractor = PPTXExtractor("presentation.pptx") data = extractor.extract_all() # data["texts"] β {slide_num: [str, ...]} # data["images"] β {slide_num: [image_path, ...]} # data["tables"] β {slide_num: [[[cell, ...], ...], ...]}
2. Extract text only
extractor = PPTXExtractor("presentation.pptx") texts = extractor.extract_text() for slide_num, lines in texts.items(): print(f"Slide {slide_num}:") for line in lines: print(f" {line}")
3. Extract and save images
extractor = PPTXExtractor("presentation.pptx", image_output_dir="my_images") images = extractor.extract_images() for slide_num, paths in images.items(): for path in paths: print(f"Slide {slide_num}: saved β {path}")
Images are saved as slide{N}_image{M}.{ext} inside the output directory.
4. Extract tables
extractor = PPTXExtractor("presentation.pptx") tables = extractor.extract_tables() for slide_num, slide_tables in tables.items(): for table in slide_tables: for row in table: print("\t".join(row))
5. Iterate all content by slide
extractor = PPTXExtractor("presentation.pptx") data = extractor.extract_all() for slide_num in data["texts"]: print(f"\n--- Slide {slide_num} ---") for text in data["texts"][slide_num]: print(f" Text: {text}") for img_path in data["images"].get(slide_num, []): print(f" Image: {img_path}") for table in data["tables"].get(slide_num, []): for row in table: print(" Row:", "\t".join(row))
API Reference
PPTXExtractor(pptx_path, image_output_dir)
| Parameter | Type | Default | Description |
|---|---|---|---|
pptx_path |
str |
(required) | Path to the .pptx file |
image_output_dir |
str |
"extracted_images" |
Directory where extracted images will be saved |
The image output directory is created automatically if it does not exist.
extract_text() β dict[int, list[str]]
Returns all non-empty text from every shape across all slides.
{ 1: ["Title of Slide One", "Bullet point A", "Bullet point B"], 2: ["Slide Two heading", "Some body text"], }
extract_images() β dict[int, list[str]]
Saves all embedded images to image_output_dir and returns their file paths.
{ 1: ["extracted_images/slide1_image1.png"], 3: ["extracted_images/slide3_image1.jpeg", "extracted_images/slide3_image2.png"], }
Filenames follow the pattern: slide{slide_num}_image{shape_num}.{ext}
extract_tables() β dict[int, list[list[list[str]]]]
Returns table data as nested lists: slide β list of tables β list of rows β list of cell strings.
{ 2: [ [["Header A", "Header B"], ["Row 1A", "Row 1B"], ["Row 2A", "Row 2B"]], ] }
extract_all() β dict
Runs all three extractors and returns a combined dictionary:
{ "texts": { 1: [...], 2: [...] }, "images": { 1: [...], 3: [...] }, "tables": { 2: [...] }, }
Examples Summary
frompyaitk.PPTExtractimport PPTXExtractor # Extract everything data = PPTXExtractor("deck.pptx").extract_all() # Text only texts = PPTXExtractor("deck.pptx").extract_text() # Images saved to a custom folder images = PPTXExtractor("deck.pptx", image_output_dir="assets").extract_images() # Tables only tables = PPTXExtractor("deck.pptx").extract_tables() # Iterate slide by slide extractor = PPTXExtractor("deck.pptx") data = extractor.extract_all() for slide_num in data["texts"]: print(f"Slide {slide_num}:", data["texts"][slide_num])
NER System β Named Entity Recognition
A complete, modular Named Entity Recognition system built on spaCy. Covers the full ML lifecycle: text preprocessing, inference, postprocessing, training with early stopping, evaluation with precision/recall/F1, and persistent entity storage.
What Is This?
This package is a NER system with six cooperating modules:
| Module | Class / Function | Responsibility |
|---|---|---|
pipeline.py |
NERPipeline |
Core inference β single and batch prediction |
trainer.py |
NERTrainer |
Training loop with early stopping and model saving |
evaluator.py |
NEREvaluator |
Precision / Recall / F1 metrics (exact + partial match) |
preprocessor.py |
TextPreprocessor |
Cleans and normalises raw text before inference |
postprocessor.py |
EntityPostprocessor |
Filters, deduplicates, and enriches entity outputs |
entity_store.py |
EntityStore |
In-memory store with JSONL persistence and analytics |
logging_config.py |
setup_logging() |
Configures root logger (console + optional file) |
Quick Start
frompyaitk.NERimport NERPipeline, EntityStore # Load a trained model pipeline = NERPipeline.from_model_path("models/my_ner") # Predict result = pipeline.predict("Apple was founded by Steve Jobs in California.") for entity in result.entities: print(entity.label, entity.text) # ORG Apple # PERSON Steve Jobs # GPE California # Store results store = EntityStore(persist_path="entities.jsonl") store.add(result) print(store.top_entities("ORG", n=5))
Module Guide
NERPipeline β Inference
frompyaitk.NERimport NERPipeline # From a saved model on disk pipeline = NERPipeline.from_model_path("models/my_ner") # From a blank model (for training or rule-based use) pipeline = NERPipeline.from_blank(lang="en", labels=["ORG", "PERSON", "GPE"]) # Single prediction result = pipeline.predict("Google acquired DeepMind in 2014.") print(result.entities) # list of Entity objects print(result.entity_types) # ['DATE', 'ORG'] print(result.filter_by_label("ORG")) # [Entity(text='Google', ...)] print(result.processing_time_ms) # e.g. 12.4 # Batch prediction (lazy generator, memory-efficient) texts = ["Text one.", "Text two.", "Text three."] for result in pipeline.predict_batch(texts): print(result.entities) # Save model to disk pipeline.save("models/my_ner_v2") # Inspect labels print(pipeline.labels) # ['DATE', 'GPE', 'ORG', 'PERSON']
Constructor options:
| Parameter | Type | Default | Description |
|---|---|---|---|
nlp |
spacy.Language |
(required) | Loaded spaCy model |
preprocessor |
TextPreprocessor |
None |
Custom preprocessor; defaults to standard config |
postprocessor |
EntityPostprocessor |
None |
Custom postprocessor; defaults to standard config |
return_doc |
bool |
False |
Include raw spaCy Doc in results |
batch_size |
int |
64 |
Chunk size for nlp.pipe in batch mode |
NERTrainer β Training
frompyaitk.NERimport NERTrainer frompyaitk.NER.trainerimport TrainerConfig TRAIN_DATA = [ ("Apple was founded by Steve Jobs.", {"entities": [(0, 5, "ORG"), (21, 31, "PERSON")]}), ("Google is based in Mountain View.", {"entities": [(0, 6, "ORG"), (19, 32, "GPE")]}), ] DEV_DATA = [...] # Train from scratch trainer = NERTrainer(config=TrainerConfig(n_iter=30, dropout=0.35)) trainer.prepare(TRAIN_DATA) results = trainer.train(TRAIN_DATA, dev_data=DEV_DATA, output_dir="models/v1") print(results["best_dev_f1"]) # e.g. 0.8712 print(results["history"]) # list of per-epoch loss + dev scores # Fine-tune from an existing spaCy model trainer = NERTrainer(base_model="en_core_web_sm") trainer.train(TRAIN_DATA, output_dir="models/v1_finetuned") # Export training data to spaCy v3 DocBin format importspacy nlp = spacy.blank("en") NERTrainer.data_to_docbin(TRAIN_DATA, nlp, "train.spacy")
TrainerConfig fields:
| Field | Default | Description |
|---|---|---|
lang |
"en" |
spaCy language code for blank model |
n_iter |
30 |
Maximum training epochs |
dropout |
0.35 |
Dropout rate during training |
batch_start |
4 |
Starting batch size for compounding scheduler |
batch_compound |
1.001 |
Compounding factor for batch size growth |
eval_every |
5 |
Evaluate on dev set every N epochs |
patience |
5 |
Early stopping patience (epochs without improvement) |
min_delta |
0.001 |
Minimum F1 improvement to reset patience counter |
seed |
42 |
Random seed for reproducibility |
NEREvaluator β Metrics
frompyaitk.NERimport NEREvaluator, NERPipeline pipeline = NERPipeline.from_model_path("models/my_ner") evaluator = NEREvaluator() # exact match (default) evaluator_partial = NEREvaluator(partial=True) # any span overlap counts gold_data = [ ("Apple was founded by Steve Jobs.", {"entities": [(0, 5, "ORG"), (21, 31, "PERSON")]}), ] report = evaluator.evaluate(gold_data, pipeline) print(report) # ============================================================ # Micro P=0.9200 R=0.8750 F1=0.8970 # Macro F1 = 0.8800 # ------------------------------------------------------------ # Label P R F1 TP FP FN # ------------------------------------------------------------ # ORG 0.9500 0.9000 0.9245 ... # PERSON 0.8100 0.8500 0.8295 ... # ============================================================ print(report.to_dict()) # structured dict for logging / serialisation # Evaluate pre-computed predictions (no pipeline needed) gold_spans_list = [[(0, 5, "ORG"), (21, 31, "PERSON")]] pred_spans_list = [[(0, 5, "ORG"), (21, 31, "PERSON")]] report = evaluator.evaluate_from_predictions(gold_spans_list, pred_spans_list)
EvaluationReport properties:
| Property | Description |
|---|---|
micro_precision |
TP / (TP + FP) across all labels |
micro_recall |
TP / (TP + FN) across all labels |
micro_f1 |
Harmonic mean of micro precision and recall |
macro_f1 |
Unweighted average F1 across all labels |
per_label |
Dict of LabelScore objects keyed by label |
TextPreprocessor β Input Cleaning
frompyaitk.NERimport TextPreprocessor frompyaitk.NER.preprocessorimport PreprocessorConfig # Default config preprocessor = TextPreprocessor() clean = preprocessor.process("<b>Visit https://example.com for info.</b>") # β "Visit for info." # Custom config config = PreprocessorConfig( lowercase=True, remove_urls=True, remove_emails=True, remove_html_tags=True, normalize_whitespace=True, normalize_unicode=True, max_length=512, custom_patterns=[r"\d{4}-\d{2}-\d{2}"], # strip ISO dates ) preprocessor = TextPreprocessor(config) # Batch cleaned_texts = preprocessor.process_batch(["Text one.", "Text two."])
PreprocessorConfig fields:
| Field | Default | Description |
|---|---|---|
lowercase |
False |
Lowercase the entire text |
remove_urls |
True |
Remove http://, https://, and www. URLs |
remove_emails |
False |
Remove email addresses (kept by default as NER signal) |
remove_html_tags |
True |
Strip HTML tags |
normalize_whitespace |
True |
Collapse all whitespace to single spaces |
normalize_unicode |
True |
NFC-normalise Unicode |
max_length |
None |
Truncate text to this many characters |
custom_patterns |
[] |
List of regex strings to remove |
EntityPostprocessor β Output Cleaning
frompyaitk.NERimport EntityPostprocessor frompyaitk.NER.postprocessorimport PostprocessorConfig config = PostprocessorConfig( min_length=2, max_length=100, allowed_labels={"ORG", "PERSON", "GPE"}, blocked_labels={"MISC"}, deduplicate=True, merge_adjacent=True, strip_punct=True, custom_label_map={"PERSON": "PER"}, # rename labels ) postprocessor = EntityPostprocessor(config) entities = postprocessor.process(entities)
Processing chain (in order):
- Filter by text length (
min_length/max_length) - Filter by label (
allowed_labels/blocked_labels) - Strip leading/trailing punctuation from entity text (
strip_punct) - Remap label names (
custom_label_map) - Lowercase labels (
lowercase_labels) - Deduplicate identical spans (
deduplicate) - Merge back-to-back same-label spans (
merge_adjacent, gap β€ 1 char)
EntityStore β Persistence & Analytics
frompyaitk.NERimport EntityStore # In-memory only store = EntityStore() # With JSONL persistence (appends on each add; loads on init if file exists) store = EntityStore(persist_path="entities.jsonl") store.add(result) # add a single NERResult store.add_batch([result1, result2]) # add multiple results # Query store.query(label="ORG") # all ORG records store.query(text_contains="apple", min_score=0.8) # filtered store.unique_entities(label="PERSON") # sorted unique names store.top_entities("ORG", n=10) # [(entity, count), ...] store.label_distribution() # {"ORG": 42, "PERSON": 31} # Export store.to_json("all_entities.json") # Iterate for record in store.iter_records(): print(record["text"], record["label"]) print(len(store)) # total entity count print(store) # EntityStore(total=132, labels={'ORG': 42, 'PERSON': 31})
Each stored record contains:
| Field | Description |
|---|---|
text |
Entity surface form |
label |
Entity type (e.g. ORG, PERSON) |
start_char |
Start character offset in source text |
end_char |
End character offset in source text |
score |
Detection confidence (0.0β1.0) |
source_text |
The original text the entity came from |
Logging Setup
frompyaitk.NER.logging_configimport setup_logging setup_logging(level="DEBUG", log_file="ner.log")
| Parameter | Default | Description |
|---|---|---|
level |
"INFO" |
Logging level |
log_file |
None |
Optional path for file output |
fmt |
"%(asctime)s [%(levelname)s] %(name)s β β¦" |
Log record format string |
Full End-to-End Example
frompyaitk.NERimport NERPipeline, NERTrainer, NEREvaluator, EntityStore frompyaitk.NER.trainerimport TrainerConfig frompyaitk.NER.logging_configimport setup_logging setup_logging(level="INFO", log_file="ner.log") # 1. Prepare data TRAIN_DATA = [ ("Apple was founded by Steve Jobs in California.", { "entities": [(0, 5, "ORG"), (21, 31, "PERSON"), (35, 45, "GPE")] }), ] DEV_DATA = TRAIN_DATA # use your actual dev split # 2. Train trainer = NERTrainer(config=TrainerConfig(n_iter=20, eval_every=5)) results = trainer.train(TRAIN_DATA, dev_data=DEV_DATA, output_dir="models/v1") print("Best dev F1:", results["best_dev_f1"]) # 3. Load and predict pipeline = NERPipeline.from_model_path("models/v1") result = pipeline.predict("Microsoft acquired GitHub in 2018.") print(result.entities) # 4. Evaluate evaluator = NEREvaluator() report = evaluator.evaluate(DEV_DATA, pipeline) print(report) # 5. Store store = EntityStore(persist_path="entities.jsonl") store.add(result) print(store.top_entities("ORG")) print(store.label_distribution())
Data Format
Training and evaluation data uses spaCy's standard annotation format:
[ ("Text to annotate.", {"entities": [(start, end, "LABEL"), ...]}), ("Google is in Mountain View.", {"entities": [(0, 6, "ORG"), (13, 26, "GPE")]}), ]
start/endare character offsets (not token indices)- Spans must not overlap
- Labels are arbitrary strings registered via
NERTrainer.prepare()orNERPipeline.from_blank()
MathAI β Production-grade Mathematical Expression Solver
A robust symbolic mathematics solver built on SymPy, supporting simplification, equation solving, differentiation, integration, matrix analysis, and Taylor series expansion β with automatic operation detection, input validation, and structured result objects.
What Is This?
MathAI is a symbolic math engine that provides:
- Auto-detection β
process()andMathAI()infer the operation type from the query automatically - Symbolic computation β exact symbolic results powered by SymPy (no floating-point approximation unless requested)
- Numeric evaluation β automatically computes a decimal approximation when the result is a pure number
- Equation solving β single equations and systems of equations (via
x + y = 5 and x - y = 1syntax) - Calculus β differentiation (any order) and definite/indefinite integration
- Matrix analysis β determinant, trace, rank, inverse, and eigenvalues in one call
- Taylor/Laurent series β configurable expansion point and order
- Input validation β rejects empty input, oversized expressions, and dangerous code patterns
- Structured results β every operation returns a
MathResultdataclass withsuccess,result,simplified,numeric, andmetadatafields - Structured logging β all internal events go through Python's
loggingmodule - Convenience API β single
MathAI(query, operation)function for quick one-liner use
How to Use
1. One-liner convenience function (simplest)
frompyaitk.MathAIimport MathAI print(MathAI("x^2 + 2*x + 1")) print(MathAI("x^2 - 4 = 0", operation="solve")) print(MathAI("sin(x)", operation="differentiate"))
2. Auto-detect operation
The MathAI() function (and MathSolver.process()) detect the operation from the query:
- Contains
=β solve - Starts with
Matrixβ matrix operations - Starts with
difforderivative(...)β differentiate - Starts with
intorintegrate(...)β integrate - Anything else β simplify
frompyaitk.MathAIimport MathAI print(MathAI("x^2 - 9 = 0")) # β solve print(MathAI("Matrix([[2, 1], [5, 3]])")) # β matrix print(MathAI("sin(x)^2 + cos(x)^2")) # β simplify
3. Using MathSolver directly
frompyaitk.mathaiimport MathSolver solver = MathSolver() result = solver.simplify("sin(x)^2 + cos(x)^2") print(result)
4. Simplification
result = solver.simplify("x^3 + 3*x^2 + 3*x + 1") print(result.simplified) # (x + 1)**3 print(result.numeric) # set if result is a pure number
5. Solving equations
# Single equation result = solver.solve_equation("x^2 - 4 = 0") print(result.result) # [{x: -2}, {x: 2}] # System of equations (separate with "and") result = solver.solve_equation("x + y = 10 and x - y = 2") print(result.result) # [{x: 6, y: 4}]
6. Differentiation
# First derivative (default) result = solver.differentiate("x^3 + sin(x)") print(result.result) # 3*x**2 + cos(x) print(result.simplified) # simplified form # Higher-order derivative result = solver.differentiate("x^5", var="x", order=3) print(result.result) # 60*x**2
7. Integration
# Indefinite integral result = solver.integrate("x^2 + 3*x") print(result.result) # x**3/3 + 3*x**2/2 # Definite integral result = solver.integrate("x^2", var="x", limits=(0, 1)) print(result.result) # 1/3 print(result.numeric) # 0.333333333333333
8. Matrix operations
result = solver.matrix_operations("Matrix([[1, 2], [3, 4]])") print(result.metadata["determinant"]) # -2 print(result.metadata["inverse"]) # Matrix([[-2, 1], [3/2, -1/2]]) print(result.metadata["eigenvalues"]) # {-sqrt(33)/2 + 5/2: 1, sqrt(33)/2 + 5/2: 1} print(result.metadata["rank"]) # 2 print(result.metadata["trace"]) # 5
9. Taylor series expansion
# Default: expand around x=0, 6 terms result = solver.series_expansion("sin(x)") print(result.result) # x - x**3/6 + x**5/120 + O(x**6) # Custom point and order result = solver.series_expansion("exp(x)", var="x", point=0, order=4) print(result.result) # 1 + x + x**2/2 + x**3/6 + O(x**4)
API Reference
MathAI(query, operation) β str
Top-level convenience function. Returns a formatted string of the result.
| Parameter | Type | Default | Description |
|---|---|---|---|
query |
str |
'1*x + 2*x - 3*x' |
Mathematical expression or equation |
operation |
str |
'auto' |
One of: auto, simplify, solve, differentiate, integrate, matrix, series |
MathSolver methods
| Method | Description |
|---|---|
process(query) |
Auto-detect and dispatch to the right operation |
simplify(expr) |
Simplify a symbolic expression |
solve_equation(expr) |
Solve one or more equations |
differentiate(expr, var, order) |
Differentiate to any order |
integrate(expr, var, limits) |
Indefinite or definite integral |
matrix_operations(expr) |
Det, trace, rank, inverse, eigenvalues |
series_expansion(expr, var, point, order) |
Taylor/Laurent series around a point |
MathResult fields
Every method returns a MathResult dataclass:
| Field | Type | Description |
|---|---|---|
success |
bool |
True if the operation completed without error |
operation |
str |
Name of the operation performed |
input_expr |
str |
The original input string |
result |
str or None |
Primary result of the operation |
simplified |
str or None |
Simplified form (where applicable) |
numeric |
str or None |
Decimal evaluation (when result is a pure number) |
error |
str or None |
Error message if success is False |
metadata |
dict or None |
Extra details (variable, limits, matrix properties, etc.) |
Calling str(result) produces a human-readable summary of all populated fields.
Supported Symbols and Functions
The solver pre-loads a wide set of SymPy functions accessible directly in expressions:
| Category | Available |
|---|---|
| Trigonometric | sin, cos, tan, cot, sec, csc, asin, acos, atan, sinh, cosh, tanh |
| Exponential / Log | exp, log, ln |
| Constants | e, pi, I (imaginary unit), oo (infinity) |
| Algebra | sqrt, abs, floor, ceil, factorial, binomial |
| Calculus | diff, integrate, limit, Sum, Product |
| Linear Algebra | Matrix, det, transpose |
| Simplification | factor, expand, simplify, cancel, apart, together |
| Special Functions | gamma, erf |
| Variables | x y z a b c β¦ theta phi alpha beta gamma delta epsilon |
Input Validation
Before any operation, expressions are checked for:
- Empty or whitespace-only input
- Expressions exceeding 10,000 characters
- Forbidden patterns:
__,import,eval,exec,compile,open,file
Rejected inputs return a MathResult with success=False and a descriptive error message β no exceptions are raised to the caller.
Examples Summary
frompyaitk.mathaiimport MathAI, MathSolver # One-liner API print(MathAI("x^2 + 2*x + 1")) print(MathAI("x^2 - 4 = 0", operation="solve")) print(MathAI("sin(x)", operation="differentiate")) print(MathAI("x^2", operation="integrate")) print(MathAI("Matrix([[1, 2], [3, 4]])", operation="matrix")) print(MathAI("cos(x)", operation="series")) # Using MathSolver directly solver = MathSolver() r = solver.solve_equation("x + y = 10 and x - y = 2") print(r.result) r = solver.differentiate("x^5", order=3) print(r.simplified) r = solver.integrate("x^2", limits=(0, 1)) print(r.numeric) r = solver.matrix_operations("Matrix([[1, 2], [3, 4]])") print(r.metadata)
Prompts for MathAI
Solve normal numeric problems.
1+2+3+4-5(1-55)+10
Solve symbolic methamatic.
X+2Y-X+10Z*(10-100)X
Matrix
Matrix([[1,0],[0,1]]) Matrix([[1,2,3],[2,3,10]])
Trigonometric Functions
sin
sin(0) sin(30) sin(45) sin(60) sin(90) ... sin(180) ...
Syntax
sin(<valueoftheata>)
cos
cos(0) cos(30) cos(45) cos(60) cos(90) ... cos(180) ...
Syntax
cos(<valueoftheata>)
tan
tan(0) tan(30) tan(45) tan(60) tan(90) ... tan(180) ...
Syntax
tan(<valueoftheata>)
cosec
csc(0) csc(30) csc(45) csc(60) csc(90) ... csc(180) ...
Syntax
csc(<valueoftheata>)
sec
sec(0) sec(30) sec(45) sec(60) sec(90) ... sec(180) ...
Syntax
sec(<valueoftheata>)
cot
cot(0) cot(30) cot(45) cot(60) cot(90) ... cot(180) ...
Syntax
cot(<valueoftheata>)
Limit
limit('2X') limit('2X+3Y')
Syntax
limit(<symbolicandtrigonometricvaluesinstringformate>)
Determinants
det(Matrix([[10]])) det(Matrix([[10],[20]])) det(Matrix([[10],[30],[40]])) det(Matrix([[10],[100],[0],[50]])) det(Matrix([[10],[97],[95],[1],[99]])) det(Matrix([[10],[11],[100],[3],[2],[150]])) det(Matrix([[10,20]])) det(Matrix([[10,20],[20,30]])) det(Matrix([[10,20],[60,70],[80,100]])) det(Matrix([[10,20,30]])) det(Matrix([[10,20,30],[40,50,60]])) det(Matrix([[10,20,30],[40,50,60],[90,100,102]])) det(Matrix([[10,20,30],[40,50,60],[90,100,102],[95,101,1000]])) ...
Log
log(10) log(20) log(0) log(100) ...
Ln
ln(0) ln(1) ln(10) ln(100) ln(1000) ln(102) ln(3) ln(90) ln(893) ln(9) ...
E
e()
β Get the value of e.
e(10) e(3) e(21) e(38) e(0) ...
βΌ (pi)
pi()
β Get the value of βΌ
Square
Get all square of the value.
sqrt(10) sqrt(2) sqrt(30) sqrt(28039) sqrt(19) sqrt(289843190) ...
Differential
diff('2x') diff('x') diff('y') ...
Give all the values in string formate.
Integration (β«)
Give all the values in string formate.
integrate('dx') integrate('xdx') integrate('2xdx') integrate('2xdy') integrate('xdy') integrate('2x+3y-3ydx') ...
Factor
Get all the factors of the numbers
factor(10) factor(213) factor(389) factor(1983) factor(0) factor(12) ...
ITT β Image-to-Text (OCR) Utility
A minimal, dependency-light OCR utility built on EasyOCR. Extracts text from images in a single function call, with the recognition model loaded once at startup for efficient repeated use.
What Is This?
ITT is a lightweight image-to-text module that provides:
- Single function API β
ITT(image_path)returns all recognised text as a plain string - EasyOCR backend β deep learning-based OCR supporting 80+ languages out of the box
- Model loaded once β the
readeris initialised at module level so repeated calls pay no reload cost - Multi-language support β pass any EasyOCR-supported language codes at call time
- Zero boilerplate β no configuration classes, no context managers; just import and call
Note
EasyOCR downloads model weights on first use (~100 MB). An internet connection is required for the initial download; subsequent runs work fully offline.
How to Use
1. Basic usage
frompyaitk.ITTimport ITT text = ITT("screenshot.png") print(text)
2. Extract text from any image format
EasyOCR supports JPEG, PNG, BMP, TIFF, and more.
text = ITT("photo.jpg") text = ITT("scan.bmp") text = ITT("document.tiff")
3. Multi-language recognition
Pass a list of BCP-47 / EasyOCR language codes as the second argument.
# English + French text = ITT("menu.jpg", languages=["en", "fr"]) # English + Hindi text = ITT("sign.png", languages=["en", "hi"])
Note: The module-level
readeris initialised with['en']only. To use other languages reliably, re-initialiseeasyocr.Readerwith the desired codes before callingreadtext.
4. Batch processing multiple images
frompyaitk.ITTimport ITT frompathlibimport Path results = {} for img in Path("./images").glob("*.png"): results[img.name] = ITT(str(img)) for name, text in results.items(): print(f"{name}: {text[:80]}")
5. Using the result downstream
text = ITT("invoice.png") # Search for keywords if "TOTAL" in text.upper(): print("Invoice contains a total amount.") # Write extracted text to file with open("output.txt", "w", encoding="utf-8") as f: f.write(text)
API Reference
ITT(image_path, languages) β str
Runs OCR on the given image and returns all detected text joined into a single space-separated string.
| Parameter | Type | Default | Description |
|---|---|---|---|
image_path |
str |
(required) | Path to the image file |
languages |
list[str] |
['en'] |
EasyOCR language codes to use for recognition |
Returns: str β all detected text regions joined by spaces, in detection order.
Module-level reader
reader = easyocr.Reader(['en'])
The reader is created once when the module is imported. This means:
- Model files are downloaded on the first import only
- All subsequent
ITT()calls reuse the same loaded model β no per-call overhead - If you need languages beyond English, create your own
easyocr.Readerinstance with the required codes
Supported Languages (selected)
EasyOCR supports 80+ languages. Common codes:
| Code | Language | Code | Language |
|---|---|---|---|
en |
English | fr |
French |
hi |
Hindi | de |
German |
zh |
Chinese | ja |
Japanese |
ko |
Korean | ar |
Arabic |
es |
Spanish | pt |
Portuguese |
Full list: https://www.jaided.ai/easyocr
Notes
- Detection order β text regions are returned in the order EasyOCR detects them, which follows a rough top-to-bottom, left-to-right reading order but may vary for complex layouts.
- Low-quality images β blurry, low-contrast, or heavily compressed images will reduce accuracy. Pre-processing with a library like Pillow (resize, sharpen, greyscale) can improve results.
- GPU acceleration β EasyOCR uses the GPU automatically if a CUDA-capable device is available. On CPU, recognition is slower but fully functional.
Examples Summary
frompyaitk.ITTimport ITT # Basic text = ITT("image.png") # Multi-language text = ITT("document.jpg", languages=["en", "fr"]) # Batch frompathlibimport Path for img in Path("./scans").glob("*.jpg"): print(img.name, ITT(str(img))) # Save to file with open("extracted.txt", "w") as f: f.write(ITT("receipt.png"))
GrammarCorrector β Grammar Correction Pipeline
A three-tier grammar correction system combining SpaCy rule-based morphology, a scikit-learn statistical token corrector, and a PyTorch LSTM Seq2Seq neural model β all unified behind a single GrammarCorrector facade with a JSON intents system for training on real-world error pairs.
What Is This?
Grammar provides a complete grammar correction pipeline with three escalating tiers:
| Tier | Engine | Requires fitting? | Description |
|---|---|---|---|
| 1 | SpaCy + regex rules | No | Morphological rules, subject-verb agreement, capitalisation, confused-word pairs |
| 2 | Scikit-learn (Logistic Regression + DictVectorizer) | Yes | Statistical token-level correction trained on (corrupted, correct) pairs |
| 3 | PyTorch LSTM Seq2Seq | Yes | Neural fallback for out-of-vocabulary and complex patterns |
Tier 1 always runs. Tiers 2 and 3 activate after fit() and each falls back gracefully if unavailable.
Additional features:
- JSON intents system β load structured
{incorrect, correct, tag}pairs from a file; two supported layouts (flat and grouped) - Auto-corruption engine β 100+ deterministic corruption rules (homophones, slang, typos, contractions) to generate training data from plain correct sentences
- BPE tokenizer β trained from scratch on the training corpus via HuggingFace
tokenizers - Context manager interface β loads SpaCy on enter, releases all GPU/CPU resources on exit
quick_correct()β one-liner convenience function for rule-only or full pipeline correction
How to Use
1. Rule-only correction (no fitting needed)
Tier 1 (SpaCy + rules) always runs β no training data required.
frompyaitk.Grammarimport GrammarCorrector with GrammarCorrector() as gc: print(gc.correct("i wants to here more about this")) # β "I wants to hear more about this." print(gc.correct("she go to the market")) # β "She goes to the market."
2. Full three-tier pipeline from a sentence list
frompyaitk.Grammarimport GrammarCorrector sentences = [ "I am going to the market.", "She has a beautiful garden.", "They were playing football yesterday.", ] with GrammarCorrector() as gc: gc.fit(sentences) print(gc.correct("i wants to here more")) print(gc.correct("she go to market"))
3. Train from a JSON intents file (recommended)
frompyaitk.Grammarimport GrammarCorrector with GrammarCorrector() as gc: gc.fit_from_intents_file("intents.json") print(gc.correct("i wants to here you"))
4. Train from intents with a tag filter
Only use specific error categories for training:
frompyaitk.Grammarimport GrammarCorrector, load_intents examples = load_intents("intents.json", tags=["verb_agreement", "capitalisation"]) with GrammarCorrector() as gc: gc.fit_from_intents(examples) print(gc.correct("he go home"))
5. Merge intents + plain sentences
frompyaitk.Grammarimport GrammarCorrector, load_intents sentences = ["The cat sat on the mat.", "I enjoy reading books."] examples = load_intents("intents.json") with GrammarCorrector() as gc: gc.fit(sentences, intents=examples) print(gc.correct("u r gr8"))
6. Save and load a trained model
frompyaitk.Grammarimport GrammarCorrector # Train and save with GrammarCorrector() as gc: gc.fit_from_intents_file("intents.json") gc.save("models/grammar_v1") # Load and use with GrammarCorrector() as gc: gc.load("models/grammar_v1") print(gc.correct("she go to the market"))
7. One-liner convenience function
frompyaitk.Grammarimport quick_correct # Rule-only (no training data) print(quick_correct("i go to school")) # With training sentences = ["She goes to school.", "I am happy."] print(quick_correct("i go to school", sentences=sentences))
8. Custom configuration
frompyaitk.Grammarimport GrammarCorrector, CorrectorConfig cfg = CorrectorConfig( epochs=20, hidden_dim=512, num_layers=3, dropout=0.4, learning_rate=5e-4, vocab_size=3000, device="cuda", ) with GrammarCorrector(cfg) as gc: gc.fit_from_intents_file("intents.json") print(gc.correct("they was at home"))
JSON Intents File Format
Two supported layouts β freely mixable in the same file.
Layout A β flat pair
{ "intents":[ { "tag":"subject_verb_agreement", "incorrect":"she go to the market", "correct":"she goes to the market" }, { "tag":"pronoun_capitalisation", "incorrect":"i am here", "correct":"I am here" } ] }
Layout B β grouped examples
{ "intents":[ { "tag":"capitalisation", "examples":[ {"incorrect":"i am here","correct":"I am here"}, {"incorrect":"hi my name","correct":"Hi my name"} ] } ] }
Rules:
"tag"is required; defaults to"general"if omitted- Pairs where
incorrect == correctare silently skipped - Duplicate pairs (case-insensitive) are deduplicated automatically
- Both layouts may be mixed freely in the same file
Loading and inspecting intents
frompyaitk.Grammarimport load_intents, intents_summary, intents_to_pairs # Load all examples = load_intents("intents.json") # Load with tag filter examples = load_intents("intents.json", tags=["verb_agreement"]) # Load from a dict (useful for testing) examples = load_intents({ "intents": [{"tag": "test", "incorrect": "i go", "correct": "I go"}] }) # Summary: {tag: count} summary = intents_summary(examples) print(summary) # {"capitalisation": 5, "verb_agreement": 8} # Convert to raw pairs pairs = intents_to_pairs(examples) # [("i go", "I go"), ...]
API Reference
GrammarCorrector(config?)
| Parameter | Type | Default | Description |
|---|---|---|---|
config |
CorrectorConfig or None |
None |
Uses CorrectorConfig() defaults if None |
Methods
| Method | Returns | Description |
|---|---|---|
fit(sentences, intents?) |
self |
Train all three tiers; auto-corrupts sentences; merges intents pairs |
correct(text) |
str |
Apply all available tiers in order |
fit_from_intents(intents, sentences?, tags?) |
self |
Train primarily from IntentExample list |
fit_from_intents_file(path, sentences?, tags?) |
self |
Load JSON file and train in one call |
save(path) |
None |
Save Seq2Seq weights + BPE tokenizer to a directory |
load(path) |
self |
Restore weights + tokenizer from a directory |
All fit* methods return self for chaining:
gc.fit_from_intents_file("intents.json").correct("she go to market")
CorrectorConfig fields
| Field | Type | Default | Description |
|---|---|---|---|
vocab_size |
int |
2000 |
BPE tokenizer vocabulary size |
special_tokens |
list |
["<s>", "</s>", "<unk>", "<pad>"] |
Special tokens |
emb_dim |
int |
128 |
Token embedding dimension |
hidden_dim |
int |
256 |
LSTM hidden layer size |
num_layers |
int |
2 |
Number of LSTM layers |
dropout |
float |
0.3 |
Dropout rate |
epochs |
int |
10 |
Seq2Seq training epochs |
learning_rate |
float |
1e-3 |
Adam optimizer learning rate |
teacher_forcing_ratio |
float |
0.5 |
Probability of using teacher forcing per step |
max_decode_len |
int |
120 |
Max tokens to generate during inference |
spacy_model |
str |
en_core_web_sm |
SpaCy model name |
device |
str |
"cuda" if available else "cpu" |
PyTorch device |
Exception Hierarchy
GrammarCorrectorError
βββ NotFittedError β correct() called before fit()
βββ TokenizerError β BPE tokenizer missing special tokens
βββ IntentsValidationError β JSON intents file fails schema validation
Data Helpers
frompyaitk.Grammarimport corrupt_sentence, build_dataset # Apply deterministic corruption rules to one sentence noisy = corrupt_sentence("She goes to the market.") # β "She go 2 the market." # Build (corrupted, correct) pairs from a list of correct sentences dataset = build_dataset(["I am happy.", "She goes to school."]) # β [("i'm happy.", "I am happy."), ...]
The corruption engine covers 100+ rules across five categories: homophones (their/there/they're), internet slang (u/r/gr8/lol), informal contractions (wanna/gonna/gotta), common typos (teh/freind/recieve), and punctuation corruption.
Low-level API
These are available for advanced use when you want direct access to individual tiers or the training loop:
frompyaitk.Grammarimport ( rule_based_corrector, # Tier 1 standalone train_tokenizer, # BPE tokenizer training Seq2Seq, Encoder, Decoder, # Tier 3 PyTorch models train_model, # Tier 3 training loop correct_sentence_neural, # Tier 3 greedy inference ) importspacy nlp = spacy.load("en_core_web_sm") corrected = rule_based_corrector("she go to the market", nlp) print(corrected) # "She goes to the market."
Architecture Overview
GrammarCorrector.correct(text)
β
βΌ
βββββββββββββββββββββββββββββββββ
β Tier 1 β SpaCy + Regex β Always runs
β β’ Lowercase "i" β "I" β
β β’ Sentence capitalisation β
β β’ Subject-verb agreement β
β β’ Confused-word pairs β
βββββββββββββββββ¬ββββββββββββββββ
β
βΌ (if fitted)
βββββββββββββββββββββββββββββββββ
β Tier 2 β Sklearn Pipeline β Runs after fit()
β β’ DictVectorizer features β
β β’ OneVsRest LogisticReg. β
β β’ Token-level sequence fix β
βββββββββββββββββ¬ββββββββββββββββ
β
βΌ (if fitted)
βββββββββββββββββββββββββββββββββ
β Tier 3 β LSTM Seq2Seq β Runs after fit()
β β’ BPE tokenizer β
β β’ Encoder (embedding + LSTM) β
β β’ Decoder (teacher forcing) β
β β’ Greedy inference β
βββββββββββββββββββββββββββββββββ
Examples Summary
frompyaitk.Grammarimport GrammarCorrector, load_intents, quick_correct # Rule-only (no fitting) with GrammarCorrector() as gc: print(gc.correct("i go to school")) # Full pipeline from sentences with GrammarCorrector() as gc: gc.fit(["She goes to school.", "I am happy."]) print(gc.correct("she go to school")) # From intents file with GrammarCorrector() as gc: gc.fit_from_intents_file("intents.json", tags=["verb_agreement"]) print(gc.correct("he dont know")) # Save and reload with GrammarCorrector() as gc: gc.fit_from_intents_file("intents.json") gc.save("models/v1") with GrammarCorrector() as gc: gc.load("models/v1") print(gc.correct("they was playing")) # One-liner print(quick_correct("i go home"))
EYE β Real-Time Object Detection
A production-ready webcam object detection application built on YOLOv8 (Ultralytics) and CustomTkinter. Supports a full GUI with live controls, headless CLI detection, a unified EyeSession facade, and full context-manager support on every public class β with extensive threading fixes and feature additions over a naive implementation.
What Is This?
EYE is a fully-featured real-time object detection module that provides:
- Full context-manager protocol β every public class (
CameraManager,ObjectDetector,DetectionApp,EyeSession) supportswithblocks with guaranteed resource cleanup EyeSessionfacade β unified entry point with.gui()and.headless()factory context managers covering the complete lifecycle- Live webcam detection β reads frames from any connected camera and runs YOLOv8 inference at a capped FPS
- Full GUI application β dark-mode CustomTkinter UI with video feed, control panel, and detection log
- Headless CLI mode β
simple_detect()uses context managers internally for safe camera/detector teardown - Hot-swappable model variants β switch between
yolov8n/s/m/l/x.ptat runtime without restarting - Class filter β show bounding boxes only for specific classes (e.g.
person,car) - Confidence threshold β adjustable slider (0.1β0.95), persisted to
~/.yolov8_app.json - Detection heat-map β alpha-blended overlay that accumulates and decays detection positions
- Video recording β save annotated footage to
.mp4at camera resolution - Screenshot β save the current annotated frame as a timestamped JPG
- Live FPS counter β exponential moving average over the last 30 frames
- Detection log β scrollable timestamped list of the last 200 detection events
- Keyboard shortcuts β Space (pause), S (screenshot), R (record), Q (quit)
- Serialisable config β
DetectionConfigsaves/loads all settings as JSON - Thread-safe design β all widget mutations dispatched via
self.after(), frame buffer protected withthreading.Lock - Legacy aliases β
EYE()andOpenEYE()for backwards compatibility withcore.py/ pyaitk
Note
YOLOv8 model weights are downloaded automatically on first use (~6 MB for
yolov8n.pt). An internet connection is required for the initial download; subsequent runs work offline.
Context-Manager Patterns
All four public classes now support the full context-manager protocol. Resources are always released β even when exceptions are raised.
EyeSession β recommended top-level API
frompyaitk.eyeimport EyeSession # GUI mode (blocks until window closed) with EyeSession.gui() as session: session.run() print(session.last_detections) # Headless mode (blocks until target seen or 'q' pressed) with EyeSession.headless(target_class="person") as session: detected = session.run() print(detected) # GUI with custom config frompyaitk.eyeimport DetectionConfig cfg = DetectionConfig(model_name="yolov8s.pt", show_heatmap=True, confidence_threshold=0.6) with EyeSession.gui(config=cfg) as session: session.run()
CameraManager β camera resource
frompyaitk.eyeimport CameraManager # __enter__ / __exit__ with CameraManager(camera_index=0, width=640, height=480) as cam: ret, frame = cam.read() # Factory context manager with CameraManager.open_device(0, width=1280, height=720) as cam: ret, frame = cam.read() print(cam.is_opened) # True
ObjectDetector β inference resource
frompyaitk.eyeimport ObjectDetector, DetectionConfig cfg = DetectionConfig() # __enter__ / __exit__ (provide your own model) frompyaitk.eyeimport ModelLoader model = ModelLoader.load_model("yolov8n.pt") with ObjectDetector(model, cfg) as det: classes, annotated, count, confs = det.detect(frame) # Factory context manager (loads model internally) with ObjectDetector.from_model("yolov8n.pt", cfg) as det: classes, annotated, count, confs = det.detect(frame)
DetectionApp β GUI application
frompyaitk.eyeimport DetectionApp, DetectionConfig importcustomtkinterasctk # __enter__ / __exit__ ctk.set_appearance_mode("dark") ctk.set_default_color_theme("blue") with DetectionApp(DetectionConfig()) as app: app.mainloop() # Factory context manager (sets up CTk theme automatically) with DetectionApp.run_app(DetectionConfig()) as app: app.mainloop()
Composing context managers (low-level pipeline)
frompyaitk.eyeimport CameraManager, ObjectDetector importcv2 with CameraManager(0) as cam: with ObjectDetector.from_model("yolov8n.pt") as det: while True: ret, frame = cam.read() if not ret: break classes, annotated, count, confs = det.detect(frame) cv2.imshow("Detection", annotated) if cv2.waitKey(1) & 0xFF == ord("q"): break cv2.destroyAllWindows() # Camera and detector released automatically on exit
How to Use
1. Launch the GUI (simplest)
frompyaitk.eyeimport launch_gui launch_gui()
2. GUI with custom config
frompyaitk.eyeimport launch_gui, DetectionConfig config = DetectionConfig( model_name="yolov8s.pt", confidence_threshold=0.6, target_fps=25, show_heatmap=True, filter_classes=["person", "car"], ) launch_gui(config)
3. Headless detection (no GUI)
simple_detect() opens an OpenCV window and runs until the target class is seen or q is pressed. Uses CameraManager and ObjectDetector context managers internally β camera is always released.
frompyaitk.eyeimport simple_detect detected = simple_detect(target_class="person") print("Detected:", detected)
macOS note:
cv2.imshowmust be called from the main thread. Only callsimple_detect()from the main thread.
4. Headless with custom config
frompyaitk.eyeimport simple_detect, DetectionConfig config = DetectionConfig( model_name="yolov8m.pt", confidence_threshold=0.45, camera_index=1, target_fps=20, ) detected = simple_detect(target_class="car", config=config) print(detected)
5. Legacy aliases (pyaitk / core.py compatibility)
frompyaitk.eyeimport EYE, OpenEYE detected = EYE() # β simple_detect() OpenEYE() # β launch_gui()
6. Low-level components with context managers
frompyaitk.eyeimport CameraManager, ObjectDetector, DetectionConfig importcv2 config = DetectionConfig() with CameraManager(camera_index=0, width=640, height=480) as cam: with ObjectDetector.from_model("yolov8n.pt", config) as det: ret, frame = cam.read() if ret: detected_classes, annotated, count, confidences = det.detect(frame) print(f"Detected {count} objects: {detected_classes}") cv2.imshow("Frame", annotated) cv2.waitKey(0) cv2.destroyAllWindows() # No explicit release needed β context managers handle it
GUI Controls Reference
| Control | Description |
|---|---|
| Model selector | Switch between yolov8n/s/m/l/x.pt (hot-swap, no restart needed) |
| Confidence slider | Detection threshold from 0.10 to 0.95 |
| Class filter field | Type a class name and press Enter or "Add Filter" |
| Clear Filter button | Remove all filters (show all classes) |
| Heatmap toggle | Enable/disable the alpha-blended detection heat-map |
| Pause / Resume button | Freeze/unfreeze the video loop |
| Screenshot button | Save annotated frame as screenshot_YYYYMMDD_HHMMSS.jpg |
| Record button | Start/stop saving annotated video as recording_*.mp4 |
| Detection log | Scrollable list of last 200 timestamped events |
| Clear Log button | Empty the detection log |
| Status bar | Camera info, resolution, active model name |
Keyboard Shortcuts
| Key | Action |
|---|---|
Space |
Pause / Resume |
S |
Screenshot |
R |
Start / Stop recording |
Q |
Quit |
API Reference
EyeSession
Unified high-level facade with two factory context managers.
EyeSession.gui(config?) β context manager
Launches the full CustomTkinter GUI. Blocks on session.run() until the window is closed. Tears down the app on exit.
with EyeSession.gui(config=DetectionConfig(model_name="yolov8s.pt")) as session: session.run() print(session.last_detections)
EyeSession.headless(target_class, config?) β context manager
Headless detection. Blocks on session.run() until target is detected or q is pressed. Calls cv2.destroyAllWindows() on exit.
with EyeSession.headless("car") as session: detected = session.run()
| Property | Type | Description |
|---|---|---|
last_detections |
list[str] |
Class names detected at the time of exit |
launch_gui(config) β None
Functional launcher. Uses DetectionApp.run_app() context manager internally.
| Parameter | Type | Default | Description |
|---|---|---|---|
config |
DetectionConfig or None |
None |
Load from ~/.yolov8_app.json if None |
simple_detect(target_class, config) β list[str]
Headless OpenCV detection loop. Uses CameraManager and ObjectDetector context managers internally.
| Parameter | Type | Default | Description |
|---|---|---|---|
target_class |
str |
"person" |
Class name to trigger exit on |
config |
DetectionConfig or None |
None |
Uses defaults if None |
Returns: sorted list of all class names detected at the time of exit.
CameraManager(camera_index, width, height)
| Parameter | Type | Default | Description |
|---|---|---|---|
camera_index |
int |
0 |
OpenCV device index |
width |
int |
640 |
Requested frame width |
height |
int |
480 |
Requested frame height |
| Method / Property | Returns | Description |
|---|---|---|
open() |
bool |
Open the device; returns True on success |
read() |
tuple[bool, ndarray|None] |
Read one frame |
release() |
None |
Release the capture device |
is_opened |
bool |
True if device is currently open |
__enter__ / __exit__ |
β | Opens on enter, releases on exit |
CameraManager.open_device(idx, w, h) |
context manager | Factory @contextmanager |
ObjectDetector(model, config)
| Method | Returns | Description |
|---|---|---|
detect(frame) |
tuple |
Run inference; see return table below |
__enter__ / __exit__ |
β | Clears heatmap accumulator on exit |
ObjectDetector.from_model(name, config?) |
context manager | Loads model + yields detector |
detect() return values:
| Field | Type | Description |
|---|---|---|
detected_classes |
set[str] |
All class names above confidence threshold |
annotated_frame |
np.ndarray |
Original-resolution BGR frame with bounding boxes |
object_count |
int |
Total number of detections |
class_confidences |
dict[str, float] |
Highest confidence per class name |
DetectionApp(config)
| Method | Description |
|---|---|
mainloop() |
Start the Tk event loop (blocks) |
__enter__ / __exit__ |
Calls _close_app() on exit |
DetectionApp.run_app(config?) |
Factory context manager; sets CTk theme |
DetectionConfig fields
| Field | Type | Default | Description |
|---|---|---|---|
model_name |
str |
"yolov8n.pt" |
YOLOv8 weight file name |
target_class |
str |
"person" |
Class to watch for in headless mode |
confidence_threshold |
float |
0.5 |
Minimum detection confidence (0.0β1.0) |
frame_width |
int |
640 |
Requested camera frame width |
frame_height |
int |
480 |
Requested camera frame height |
detection_size |
tuple[int,int] |
(416, 416) |
Resize resolution for YOLO inference |
camera_index |
int |
0 |
OpenCV camera index |
target_fps |
int |
30 |
FPS cap for the detection loop |
show_heatmap |
bool |
False |
Enable detection heat-map overlay |
filter_classes |
list[str] |
[] |
Restrict boxes to these classes; empty = all |
Persistence:
config = DetectionConfig(confidence_threshold=0.7) config.save() # β ~/.yolov8_app.json config = DetectionConfig.load() # restore from file
ModelLoader.load_model(model_name) β YOLO | None
Tries four strategies in order:
- Package resource (if running inside a package)
- Script directory and adjacent
eye/subfolder - Current working directory
- Auto-download from Ultralytics
Architecture Overview
EyeSession
βββ EyeSession.gui() β DetectionApp.run_app() β DetectionApp.__enter__/__exit__
βββ EyeSession.headless() β simple_detect()
βββ CameraManager.__enter__/__exit__
βββ ObjectDetector.from_model().__enter__/__exit__
DetectionApp (GUI)
βββ Main thread βββ CTk event loop + all widget mutations (via self.after())
β β
β polls frame queue at ~60 Hz
β β
βββ Video thread βββ camera.read() β detector.detect() β FramePacket β Queue(maxsize=2)
Resource ownership
ββββββββββββββββββ
CameraManager.__exit__ β cap.release()
ObjectDetector.__exit__ β _heatmap = None
DetectionApp.__exit__ β thread.join(2s) β writer.release() β camera.release() β destroy()
EyeSession.__exit__ β app._close_app() (if GUI) / cv2.destroyAllWindows() (if headless)
Examples Summary
# ββ EyeSession (recommended) βββββββββββββββββββββββββββββββββββββββββββββββββ frompyaitk.eyeimport EyeSession, DetectionConfig # GUI with EyeSession.gui() as session: session.run() print(session.last_detections) # GUI with custom config cfg = DetectionConfig(model_name="yolov8s.pt", show_heatmap=True) with EyeSession.gui(config=cfg) as session: session.run() # Headless with EyeSession.headless("car") as session: detected = session.run() # ββ Functional API βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ frompyaitk.eyeimport launch_gui, simple_detect launch_gui() detected = simple_detect("person") # ββ Low-level context managers βββββββββββββββββββββββββββββββββββββββββββββββ frompyaitk.eyeimport CameraManager, ObjectDetector importcv2 with CameraManager(0) as cam: with ObjectDetector.from_model("yolov8n.pt") as det: ret, frame = cam.read() classes, annotated, count, confs = det.detect(frame) cv2.destroyAllWindows() # ββ DetectionApp directly ββββββββββββββββββββββββββββββββββββββββββββββββββββ frompyaitk.eyeimport DetectionApp, DetectionConfig with DetectionApp.run_app(DetectionConfig()) as app: app.mainloop() # ββ Legacy aliases βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ frompyaitk.eyeimport EYE, OpenEYE EYE() # headless detect OpenEYE() # launch GUI
CLSE - Compositional Latent Synthesis Engine
A complete, self-contained Compositional Latent Synthesis Engine pipeline built entirely on NumPy, NLTK, scikit-learn, and PyTorch β no Stable Diffusion or external model weights required. Converts natural-language prompts into images through a multi-stage NLP β neural model β renderer architecture, with full support for procedural art, visual effects, animation, streaming, and a rich CLI.
What Is This?
The CLSE system is a nine-module package covering every layer of the Compositional Latent Synthesis Engine stack:
| Module | Class / Entry Point | Responsibility |
|---|---|---|
TTI_config.py |
TTIConfig, get_config() |
Master config β all tuneable parameters |
TTI_core.py |
TTIImage, ImageCanvas, ColorUtils, ImageIO |
Pixel-level image engine (BMP/PNG/JPEG I/O, drawing) |
TTI_art.py |
ProceduralArt, VisualEffects, StreamingWriter, AnimationEngine |
Procedural art, effects, animation, large-image streaming |
TTI_ai.py |
TTIGenerator, NLPAnalyser, PromptAnalysis |
NLP pipeline + numpy VAE renderer + colour predictor |
TTI_model.py |
TTIModel, TTITrainer, TTILoss |
4-layer transformer VAE (3.8M params), training loop |
TTI_dataset.py |
TTIDataset, Vocabulary |
Synthetic 50k-sample dataset generator with caching |
TTI_pipeline.py |
TTIPipeline |
Full end-to-end pipeline connecting all modules |
TTI_train.py |
CLI training script | Production training with checkpointing and early stopping |
TTI_main.py |
TTIPipeline faΓ§ade + main() |
Unified entry point + full CLI |
Note
These
*.pyfiles are inner file of CLSE (Compositional Latent Synthesis Engine).
Quick Start
frompyaitk.CLSEimport TTIPipeline pipe = TTIPipeline() # Generate from a text prompt img = pipe.generate("a calm blue ocean at sunset", output="ocean.png") # Procedural art (no model needed) img = pipe.art("mandelbrot", output="fractal.png") # Apply an effect to an existing image img = pipe.effect("sepia", "photo.png", output="photo_sepia.png") # Analyse a prompt analysis = pipe.analyse("dark gothic castle at midnight") print(analysis.scene_type, analysis.colour_matches)
Module Guide
Configuration
frompyaitk.CLSEimport TTIConfig, get_config, update_config, reset_config # Global singleton (auto-discovers tti_config.json next to the module) cfg = get_config() # Read settings print(cfg.image.default_width) # 512 print(cfg.ai.model_type) # "vae_numpy" print(cfg.art.fractal_max_iter) # 256 print(cfg.paths.output_dir) # "tti_output" # Bulk update update_config( image={"default_width": 1024, "default_height": 1024}, ai={"seed": 42, "num_inference_steps": 100}, ) # Save / load JSON snapshot cfg.save("tti_config.json") cfg2 = TTIConfig.load("tti_config.json") # Create all output directories cfg.ensure_dirs() # Reset to factory defaults reset_config()
Config sections:
| Section | Dataclass | Key fields |
|---|---|---|
cfg.image |
ImageConfig |
default_width, default_height, default_format, background_color, jpeg_quality |
cfg.ai |
AIConfig |
model_type, latent_dim, vocab_size, num_inference_steps, guidance_scale, seed |
cfg.art |
ArtConfig |
fractal_max_iter, blur_default_radius, noise_default_intensity, animation_fps |
cfg.paths |
PathConfig |
output_dir, model_dir, cache_dir, log_dir |
cfg.log |
LogConfig |
level, log_to_file, log_filename, show_progress |
Image Engine
frompyaitk.CLSEimport TTIImage, ImageCanvas, ColorUtils, ImageIO, ImageValidator # Create a blank image img = TTIImage(width=512, height=512, bpp=24, background=(20, 30, 60)) # Pixel operations img.set_pixel(100, 100, (255, 0, 0)) color = img.get_pixel(100, 100) # (255, 0, 0) # NumPy array interop arr = img.to_array() # shape (H, W, 3), dtype=uint8 img.from_array(arr) # Save and load img.save("output.png") img.save("output.jpg", fmt="jpeg") img.save("output.bmp", fmt="bmp") img2 = TTIImage.load("output.png") # Drawing (ImageCanvas) canvas = ImageCanvas(img) canvas.line(0, 0, 511, 511, (255, 255, 0)) canvas.rectangle(50, 50, 200, 200, (255, 100, 0), filled=True) canvas.circle(256, 256, 100, (0, 200, 255), filled=False) canvas.fill_background((10, 10, 30)) # Colour utilities rgb = ColorUtils.hsv_to_rgb(0.6, 0.8, 0.9) blended = ColorUtils.lerp((255, 0, 0), (0, 0, 255), t=0.5) rgba = ColorUtils.to_rgba((100, 150, 200)) # adds alpha=255 clamped = ColorUtils.clamp(300) # β 255 # Multi-format I/O (static) img = ImageIO.load("photo.png") ImageIO.save(img, "photo.jpeg", quality=85) # Integrity check ImageValidator.validate(img) # raises TTIImageError if corrupt
Exception hierarchy:
TTIError
βββ TTIImageError β pixel-level or dimension errors
βββ TTIIOError β file read/write failures
Procedural Art & Effects
ProceduralArt
All methods are static and return TTIImage objects.
frompyaitk.CLSEimport ProceduralArt, VisualEffects, StreamingWriter, AnimationEngine # Fractals img = ProceduralArt.mandelbrot_set(width=800, height=600) img = ProceduralArt.julia_set(800, 600, c_real=-0.7, c_imag=0.27015) img = ProceduralArt.sierpinski_triangle(512, 512, depth=7) # Patterns img = ProceduralArt.plasma(512, 512) img = ProceduralArt.voronoi(512, 512, n_cells=25, seed=42) img = ProceduralArt.perlin_noise_image(512, 512, octaves=4) # Gradients img = VisualEffects.create_linear_gradient(512, 512, (70, 130, 200), (200, 80, 120)) img = VisualEffects.create_radial_gradient(512, 512, (255, 220, 50), (30, 30, 120))
VisualEffects
Apply filters to existing TTIImage objects (all return the modified image):
img = VisualEffects.blur(img, radius=3) img = VisualEffects.gaussian_blur(img, sigma=2.0) img = VisualEffects.sharpen(img, factor=1.5) img = VisualEffects.edge_detect(img) img = VisualEffects.emboss(img) img = VisualEffects.grayscale(img) img = VisualEffects.sepia(img, strength=0.8) img = VisualEffects.invert(img) img = VisualEffects.add_noise(img, intensity=0.15) img = VisualEffects.pixelate(img, block_size=10) img = VisualEffects.vignette(img, strength=0.6) img = VisualEffects.adjust_brightness(img, factor=1.2) img = VisualEffects.adjust_contrast(img, factor=1.3) img = VisualEffects.blend(img1, img2, alpha=0.5)
StreamingWriter β large images without full RAM
frompyaitk.CLSEimport StreamingWriter # Write a 4000Γ4000 image row by row with StreamingWriter("large.png", width=4000, height=4000, bpp=24) as sw: for y in range(4000): row = [(y % 255, 100, 200)] * 4000 # list of RGB tuples sw.write_row(row)
AnimationEngine β frame sequences
frompyaitk.CLSEimport AnimationEngine engine = AnimationEngine(fps=24) frames = engine.generate_frames(base_img, n_frames=48, mode="zoom") engine.save_frames(frames, output_dir="frames/", fmt="png")
CustomBitDepth β non-standard pixel formats
frompyaitk.CLSEimport CustomBitDepth # 16-bit per channel, 3 channels cbd = CustomBitDepth(width=256, height=256, bits_per_channel=16, n_channels=3) cbd.set_pixel(0, 0, [65535, 0, 32768]) cbd.save("high_depth.custimg") cbd.save_preview("preview.png") # downsample to 8-bit PNG for viewing
AI Engine
frompyaitk.CLSEimport TTIGenerator, NLPAnalyser # NLP analysis nlp = NLPAnalyser() analysis = nlp.analyse("a mysterious purple galaxy with glowing stars") print(analysis.scene_type) # "starfield" print(analysis.nouns) # ["galaxy", "stars"] print(analysis.adjectives) # ["mysterious", "purple", "glowing"] print(analysis.colour_matches) # [("purple", (138,43,226)), ("gold", (255,215,0)), ...] print(analysis.modifiers) # {"mysterious": 0.8, "glowing": 0.6} print(analysis.filtered_tokens) # ["mysterious", "purple", "galaxy", "glowing", "stars"] print(analysis.complexity()) # 0.72 (float 0β1) # Full generation frompyaitk.CLSEimport get_config gen = TTIGenerator(get_config()) img = gen.generate("stormy sea at dusk", width=512, height=512, seed=99) img.save("storm.png") # Variations imgs = gen.generate_variations("neon city at night", n_variations=4) for i, img in enumerate(imgs): img.save(f"variation_{i}.png") # Interpolation between two prompts imgs = gen.interpolate("sunrise", "midnight", steps=6) for i, img in enumerate(imgs): img.save(f"interp_{i:02d}.png")
AI pipeline (internal stages):
NLPAnalyser β 256-d SemanticVector (TF-IDF + PCA, NLTK tokeniser)
ColourPredictor β PaletteSpec (sklearn KNN on 170+ colour keywords)
SceneComposer β 128-d LatentCode (numpy VAE encoder)
ImageDecoder β TTIImage (14 scene-type renderers)
Neural Architecture
frompyaitk.CLSEimport TTIModel, TTIModelLarge, ModelConfig, TTITrainer, TTILoss # Default model (4-layer transformer, 3.8M params) cfg = ModelConfig(vocab_size=8192, embed_dim=256, n_layers=4, n_heads=8, latent_dim=128) model = TTIModel(cfg) # Large variant (6-layer, 512-d, ~14M params) model_large = TTIModelLarge() # Forward pass importtorch tokens = torch.randint(0, 8192, (4, 32)) # batch=4, seq_len=32 out = model(tokens) # out.scene_logits : (4, 15) β 15-class scene prediction # out.colour_pred : (4, 18) β 6 colours Γ RGB # out.param_pred : (4, 64) β scene renderer parameters # out.mu, out.logvar: (4, 128) β VAE latent distribution # Multi-task loss criterion = TTILoss(scene_weight=1.0, colour_weight=0.5, param_weight=0.3, kl_weight=0.01) loss = criterion(out, scene_labels, colour_targets, param_targets) # Training loop trainer = TTITrainer(model, dataset, val_dataset, output_dir="tti_models/") history = trainer.train(epochs=20, batch_size=64, lr=3e-4) print(history["best_val_loss"])
Model architecture:
TokenEmbedding β learned token + sinusoidal positional embeddings
TransformerEncoder β 4Γ MultiHeadSelfAttention + FFN (BERT-style, pre-LN)
βββ ColourHead β 3-layer MLP β 18-d colour palette
βββ SceneClassifier β 2-layer MLP β 15-class scene logit
βββ ParamDecoder β VAE: ΞΌ/Ο β z β 64-d parameter vector
Dataset Generator
frompyaitk.CLSEimport TTIDataset, Vocabulary, build_dataset # Build a 50,000-sample dataset (cached to disk with SHA-256 integrity check) dataset = build_dataset( n_samples=50_000, cache_dir="tti_models/", force_rebuild=False, # use cache if available ) # Train / val / test splits train_ds, val_ds, test_ds = dataset.splits() print(len(train_ds), len(val_ds), len(test_ds)) # 40000, 5000, 5000 # DataLoader-compatible access sample = train_ds[0] # sample["token_ids"] : torch.LongTensor (32,) # sample["scene_label"] : int (0β14) # sample["colour_vec"] : torch.FloatTensor (18,) # sample["param_vec"] : torch.FloatTensor (64,) # sample["prompt"] : str # Vocabulary vocab = Vocabulary.load("tti_models/vocab.json") ids = vocab.encode("a stormy sea at dusk") # list of int text = vocab.decode(ids) # str print(vocab.size) # 8192
Dataset statistics (default build):
| Split | Samples |
|---|---|
| Train | 40,000 |
| Validation | 5,000 |
| Test | 5,000 |
| Vocab size | 8,192 |
| Scene classes | 15 |
| Colour dims | 18 (6 colours Γ RGB) |
| Parameter dims | 64 |
Unified Pipeline
frompyaitk.CLSEimport TTIPipeline pipe = TTIPipeline() # Generate img = pipe.generate("a bright rainbow over a misty waterfall", output="rainbow.png") # Variations imgs = pipe.variations("neon city at night", n=4, output_dir="variations/") # Interpolation imgs = pipe.interpolate("sunrise", "midnight", steps=6, output_dir="interp/") # Procedural art img = pipe.art("julia", output="julia.png", c_real=-0.4, c_imag=0.6) # Effects img = pipe.effect("sepia", input_path="photo.png", output="photo_sepia.png") # NLP analysis only info = pipe.analyse("dark gothic castle at midnight") # Animation frames = pipe.animate("sunset over the ocean", n_frames=24, output_dir="frames/") # Stream a very large image (memory-safe) pipe.stream_large("huge.png", width=4000, height=4000, pattern="gradient") # Model info print(pipe.model_info()) # Config management pipe.show_config() pipe.save_config("snapshot.json")
Architecture Overview
TTI_main.py / TTI_pipeline.py β unified faΓ§ade
β
βββ TTI_config.py β all settings (reads from config.pbcfg / tti_config.json)
β
βββ TTI_ai.py β NLP + VAE renderer (no model weights needed)
β βββ NLPAnalyser NLTK tokenise β TF-IDF β PCA β SemanticVector
β βββ ColourPredictor KNN on 170+ colour keywords β PaletteSpec
β βββ SceneComposer numpy VAE encoder β 128-d LatentCode
β βββ ImageDecoder 14 scene-type renderers β TTIImage
β
βββ TTI_model.py β PyTorch transformer VAE (optional, boosts quality)
β βββ TokenEmbedding learnable + positional
β βββ TransformerEncoder 4-layer BERT-style
β βββ ColourHead β 18-d palette
β βββ SceneClassifier β 15-class scene
β βββ ParamDecoder β 64-d renderer params (VAE)
β
βββ TTI_dataset.py β 50k-sample synthetic generator + Vocabulary + DataLoader
β
βββ TTI_train.py β training script (gradient ckpt, cosine LR, early stop)
β
βββ TTI_core.py β pixel engine (TTIImage, ImageCanvas, ColorUtils, I/O)
β
βββ TTI_art.py β procedural art, effects, streaming, animation
Examples Summary
frompyaitk.CLSEimport TTIPipeline frompyaitk.CLSEimport ProceduralArt, VisualEffects frompyaitk.CLSEimport NLPAnalyser frompyaitk.CLSEimport get_config, update_config pipe = TTIPipeline() # Generate img = pipe.generate("stormy sea at midnight", output="storm.png") # Batch pipe.generate_batch(["sunrise", "sunset", "noon"], output_dir="batch/") # Variations + interpolation pipe.generate_variations("neon city", n=4, output_dir="vars/") pipe.interpolate("calm lake", "raging ocean", steps=6, output_dir="interp/") # Procedural art pipe.art("mandelbrot", output="mandelbrot.png", width=800, height=600) pipe.art("julia", output="julia.png", c_real=-0.4, c_imag=0.6) # Effects pipe.effect("sepia", "input.png", output="sepia.png") pipe.effect("vignette", "input.png", strength=0.7, output="vig.png") # Analysis info = pipe.analyse("mysterious purple galaxy") print(info.scene_type, info.complexity()) # Config override update_config(image={"default_width": 1024}, ai={"seed": 7}) # Full demo pipe.demo(output_dir="tti_demo/")
Camera β Camera Module
A Tkinter-based camera viewer with live QR/barcode scanning, image capture, video recording, and full context-manager support. Wraps OpenCV and pyzbar behind a clean class-based API with thread-safe state management.
What Is This?
Camera provides a camera module that combines:
- Live video preview β Tkinter GUI window displaying frames from any OpenCV-compatible device
- QR / barcode scanning β auto-decodes QR codes and barcodes from every frame via
pyzbar; accumulates all unique payloads thread-safely - Image capture β saves the current frame as a timestamped
.pngwith one method call - Video recording β start/stop writing annotated frames to a timestamped
.avifile using XVID codec - Context-manager protocol β
Camera.open()creates the Tk root, opens the device, and releases everything on exit - Functional entry-point β
Start()launches the GUI and returns all scanned data in one call - Auto-install of
pyzbarβ silently attemptspip install pyzbarif the library is missing; degrades gracefully if it still can't be imported - Thread-safe design β separate
RLockguards for scanned data and video writer; frame loop never blocks on recording
Installation
On Linux,
pyzbaralso needs the nativezbarlibrary:sudoaptinstalllibzbar0On Windows, the
pyzbarwheel bundles the DLL automatically.
How to Use
1. Simplest β functional one-liner
Launches the GUI; blocks until the window is closed; returns all scanned payloads.
frompyaitk.Cameraimport Start data = Start() for item in data: print(item)
2. Context manager (recommended)
frompyaitk.Cameraimport Camera with Camera.open() as cam: cam.run() # blocks until the window is closed print(cam.scanned_data) # set of all scanned QR/barcode strings
3. Custom device and output directory
frompyaitk.Cameraimport Camera with Camera.open(device=1, output_dir="./recordings") as cam: cam.run()
4. Programmatic capture and recording
frompyaitk.Cameraimport Camera importtkinterastk root = tk.Tk() cam = Camera(root, device=0, output_dir="./output") # Capture a single frame immediately path = cam.capture_image() print(f"Saved: {path}") # Start and stop recording record_path = cam.start_recording() print(f"Recording to: {record_path}") # β¦ run some frames β¦ cam.run() # blocks until window closed cam.stop_recording() cam.close()
5. Inspect scanned data before closing
frompyaitk.Cameraimport Camera importthreading with Camera.open() as cam: # Poll scanned_data from another thread defmonitor(): importtime while not cam.is_closed: print("Scanned so far:", cam.scanned_data) time.sleep(2.0) t = threading.Thread(target=monitor, daemon=True) t.start() cam.run() print("Final scanned data:", cam.scanned_data)
6. State checks
with Camera.open() as cam: print(cam.is_recording) # False cam.start_recording() print(cam.is_recording) # True cam.stop_recording() print(cam.is_closed) # False cam.run() print(cam.is_closed) # True
GUI Reference
The camera window opens with three buttons:
| Button | Action |
|---|---|
| πΈ Capture Image | Saves current frame as captured_YYYYMMDD_HHMMSS.png in output_dir; shows a confirmation dialog |
| π₯ Start Recording / βΉοΈ Stop Recording | Toggles video recording; saves to output_YYYYMMDD_HHMMSS.avi in output_dir |
| β Quit | Stops recording (if active), releases the camera, and destroys the window |
Closing the window via the title bar Γ button also triggers a clean shutdown.
API Reference
Start(device, output_dir) β set[str]
Functional entry-point. Launches the GUI, blocks until the window is closed, and returns all scanned QR/barcode payloads.
| Parameter | Type | Default | Description |
|---|---|---|---|
device |
int |
0 |
OpenCV camera device index |
output_dir |
Path or str |
"." |
Directory for saved images and videos |
Camera.open(device, output_dir) β context manager
Class-level @contextmanager factory. Creates a tk.Tk root, instantiates Camera, yields it, and calls close() on exit.
with Camera.open(device=0, output_dir="./out") as cam: cam.run()
Camera(window, device, output_dir)
Direct constructor for advanced use when you manage the Tk root yourself.
| Parameter | Type | Default | Description |
|---|---|---|---|
window |
tk.Tk |
β | Root Tkinter window |
device |
int |
0 |
OpenCV capture device index |
output_dir |
Path or str |
"." |
Directory for saved images and videos |
Raises: CameraError if the device cannot be opened.
Instance methods
| Method | Returns | Description |
|---|---|---|
run() |
None |
Enter the Tk main-loop (blocks until window closed) |
capture_image() |
Path or None |
Capture current frame to output_dir; returns saved path |
start_recording() |
Path or None |
Begin writing frames to .avi; returns output path |
stop_recording() |
None |
Stop recording and flush the video file |
close() |
None |
Release all resources; safe to call multiple times |
Properties
| Property | Type | Description |
|---|---|---|
scanned_data |
set[str] |
Thread-safe snapshot of all decoded QR/barcode payloads |
is_recording |
bool |
True while a video is being written |
is_closed |
bool |
True after close() has been called |
Output Files
All files are written to output_dir (default: current directory).
| File pattern | Format | Created by |
|---|---|---|
captured_YYYYMMDD_HHMMSS.png |
PNG | capture_image() / πΈ button |
output_YYYYMMDD_HHMMSS.avi |
AVI (XVID, 20 fps) | start_recording() / π₯ button |
Exception Reference
CameraError(RuntimeError)
βββ Raised when the camera device cannot be opened
βββ Raised when run() is called after the camera is already closed
Architecture Notes
- Frame loop β driven by
window.after(_FRAME_DELAY_MS, _update_frame)(10 ms β 100 fps cap); never blocks the Tk event loop - QR decoding β runs on every frame inside the frame loop; new unique payloads are added to
_scanned_dataunder_data_lock - Video writer β guarded by
_writer_lockso the recording flag andcv2.VideoWriterare always consistent across threads pyzbargraceful degradation β if not installed, apip installis attempted once at import time; if still unavailable, QR scanning is silently disabled and the rest of the module works normally- Clean shutdown β
close()stops recording, releases the OpenCV capture device, and destroys the Tk window; idempotent (safe to call multiple times)
Examples Summary
# Simplest: launch and collect scanned QR data frompyaitk.Cameraimport Start data = Start() # Context manager frompyaitk.Cameraimport Camera with Camera.open(device=0, output_dir="./out") as cam: cam.run() print(cam.scanned_data) # Custom device with Camera.open(device=1) as cam: cam.run() # Programmatic capture before running importtkinterastk frompyaitk.Cameraimport Camera root = tk.Tk() cam = Camera(root, output_dir="./shots") cam.capture_image() # save a frame immediately cam.start_recording() # start video cam.run() # blocks cam.stop_recording() cam.close() # State checks with Camera.open() as cam: print(cam.is_recording) # False cam.start_recording() print(cam.is_recording) # True cam.run()
Memory β Memory Module for Pythonaibrain
A two-tier episodic memory system for Brain and AdvanceBrain. Memory provides a thread-safe, JSON-backed key/value store. SmartMemory extends it transparently with a full ML pipeline β TF-IDF β Autoencoder β Clustering β IntentClassifier β enabling semantic search over conversation history without changing any call sites.
What Is This?
Memory provides two public classes and a factory function:
| Class / Function | Description |
|---|---|
Memory |
Thread-safe, JSON-backed key/value episodic store with LRU eviction |
SmartMemory |
Drop-in superset of Memory with semantic search, intent prediction, and cluster analytics |
build_memory() |
Factory that returns SmartMemory when available, Memory otherwise |
MemoryEntry |
Dataclass holding a single episodic record with timestamps and access metadata |
Key properties shared by both classes:
- Thread-safe β all public methods acquire an
RLock; the background ML fit thread takes only a snapshot so reads/writes are never blocked - Atomic writes β
save_memory()writes to a.tmpfile then renames it, preventing corruption on crash - LRU eviction β when the store reaches
max_entries, the least-recently-used key is evicted - Backward-compatible format β transparently migrates legacy flat
{key: value}JSON files to the v2 rich format - Graceful degradation β if
SummarizerAIis absent,SmartMemorysilently falls back to plainMemorybehaviour
How to Use
1. Plain Memory (classic key/value)
frompyaitk.Memoryimport Memory mem = Memory(path="memory.json") mem.remember("user_name", "Divyanshu") mem.remember("last_topic", "Python programming") print(mem.recall("user_name")) # "Divyanshu" print(mem.recall("missing_key", default="N/A")) # "N/A" mem.save_memory() mem.load_memory() # reload from disk
2. SmartMemory β drop-in with ML extras
frompyaitk.Memoryimport SmartMemory mem = SmartMemory( path="memory.json", auto_fit=True, # refit every fit_interval calls to remember() fit_interval=20, # default: 20 ) mem.remember("Hello! How are you?", "I'm doing great, thanks!") mem.remember("Tell me a joke", "Why did the chicken cross the road?") # β¦ add more entries β¦ # Semantic search results = mem.semantic_search("a funny story", top_k=3) for r in results: print(r["score"], r["key"], r["value"]) # Intent prediction intent = mem.predict_intent("What's the weather today?") print(intent) # e.g. "weather_query" # Export cluster/intent analytics report mem.export_report("memory_report.json") mem.print_report() mem.save_memory()
3. build_memory() factory (recommended for Brain)
frompyaitk.Memoryimport build_memory # Returns SmartMemory if SummarizerAI is available, Memory otherwise mem = build_memory("memory.json", smart=True, fit_interval=50) mem.remember("greeting", "Hello!") mem.save_memory()
4. Checking and controlling the summarizer
frompyaitk.Memoryimport SmartMemory mem = SmartMemory(path="memory.json", auto_fit=False) # Manually trigger a (blocking) fit success = mem.fit_summarizer() print(success) # True / False # Check fit state print(mem.is_summarizer_fitted()) # True after fit # Get the report object report = mem.get_report() # MemorySummaryReport dataclass or None
5. Extended Memory API
frompyaitk.Memoryimport Memory mem = Memory() mem.remember("a", "alpha") mem.remember("b", "beta") # Delete one entry mem.forget("a") # Check membership print("b" in mem) # True print("a" in mem) # False # Iterate for key in mem.keys(): print(key) for key, value in mem.items(): print(key, "β", value) # Snapshot as plain dict snapshot = mem.to_dict() # {"b": "beta"} print(mem.size) # 1 print(len(mem)) # 1 # Clear everything (in-memory only; disk unchanged until save_memory) mem.wipe()
6. Inspecting MemoryEntry metadata
frompyaitk.Memoryimport Memory mem = Memory() mem.remember("topic", "AI and robotics") # Access internal entry (advanced) entry = mem._entries["topic"] print(entry.key) # "topic" print(entry.value) # "AI and robotics" print(entry.timestamp) # Unix timestamp of creation print(entry.access_count) # Number of recalls print(entry.last_accessed) # Unix timestamp of last recall print(entry.to_dict()) # Full dict for serialization
API Reference
Memory(path, max_entries, auto_load)
| Parameter | Type | Default | Description |
|---|---|---|---|
path |
str |
memory.json |
Path to the JSON persistence file |
max_entries |
int |
10_000 |
Max entries before LRU eviction |
auto_load |
bool |
True |
Load from disk automatically if file exists on init |
Core contract methods (used by Brain / AdvanceBrain)
| Method | Returns | Description |
|---|---|---|
load_memory() |
None |
Load (or reload) from disk; silent no-op if file absent |
remember(key, value) |
None |
Store or update; evicts LRU entry when at max_entries |
recall(key, default="") |
str |
Retrieve value for key; returns default if not found |
save_memory() |
None |
Atomically persist current state to disk |
Extended methods
| Method | Returns | Description |
|---|---|---|
forget(key) |
bool |
Delete one entry; returns True if key existed |
wipe() |
None |
Clear all in-memory entries (disk unchanged) |
keys() |
Iterator[str] |
Iterate over all stored keys |
items() |
Iterator[tuple[str,str]] |
Iterate over (key, value) pairs |
to_dict() |
dict[str, str] |
Plain {key: value} snapshot without metadata |
Properties
| Property | Type | Description |
|---|---|---|
path |
Path |
Resolved path to the backing JSON file |
size |
int |
Number of currently stored entries |
SmartMemory(path, max_entries, auto_load, auto_fit, fit_interval, summarizer_config)
Inherits all Memory methods. Additional constructor parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
auto_fit |
bool |
True |
Auto-refit summarizer every fit_interval calls to remember() |
fit_interval |
int |
20 |
Number of remember() calls between automatic refits |
summarizer_config |
dict or None |
None |
Pass custom config to MemorySummarizer; None = from .pbcfg |
SmartMemory-only methods
| Method | Returns | Description |
|---|---|---|
fit_summarizer(force=False) |
bool |
(Re-)train ML pipeline; blocks caller; returns True on success |
semantic_search(text, top_k=3) |
list[dict] |
Ranked semantic matches; falls back to substring search if unfit |
predict_intent(text) |
str |
Predict intent of query from trained classifier |
get_report() |
MemorySummaryReport or None |
Return cluster/intent analysis report |
export_report(path) |
bool |
Write report as JSON; returns True on success |
print_report() |
None |
Pretty-print cluster report to stdout |
is_summarizer_fitted() |
bool |
True if summarizer is trained and not dirty |
semantic_search() result format
Each item in the returned list:
| Field | Type | Description |
|---|---|---|
score |
float |
Cosine similarity score 0.0β1.0 |
key |
str |
Original input key stored in memory |
value |
str |
Stored response value |
intent |
str |
Predicted intent label |
cluster |
int |
Cluster assignment index (-1 if unassigned) |
build_memory(path, smart, **kwargs) β Memory
Factory function. Returns SmartMemory when smart=True and SummarizerAI is importable; returns plain Memory otherwise.
| Parameter | Type | Default | Description |
|---|---|---|---|
path |
str |
memory.json |
Path for JSON persistence |
smart |
bool |
True |
Attempt to use SmartMemory |
**kwargs |
β | β | Forwarded to SmartMemory or Memory constructor |
MemoryEntry fields
| Field | Type | Description |
|---|---|---|
key |
str |
Entry key |
value |
str |
Stored value |
timestamp |
float |
Unix timestamp of creation |
access_count |
int |
Number of times recall() accessed this key |
last_accessed |
float |
Unix timestamp of most recent recall |
Architecture
Memory
βββ _entries : OrderedDict[str, MemoryEntry] β LRU-ordered episodic store
βββ _lock : threading.RLock β protects all mutations
βββ _path : Path β JSON backing file
SmartMemory(Memory)
βββ _summarizer : MemorySummarizer | None β ML pipeline
βββ _dirty : bool β needs refit flag
βββ _auto_fit : bool β enable background refits
βββ _fit_interval : int β refit every N remembers
βββ _fit_lock : threading.Lock β serializes fit calls
βββ _fit_thread : Thread | None β background training thread
MemorySummarizer (optional external module)
βββ TF-IDF β Autoencoder β KMeans/DBSCAN β IntentClassifier β PatternMatcher
Background fit behaviour:
- Every
fit_intervalcalls toremember(),_fit_background()spawns a daemon thread - The thread takes a snapshot of
_storeunder the lock, then trains outside it β memory reads/writes are never blocked by training - If a fit is already running, additional triggers are skipped
semantic_search()triggers a blockingfit_summarizer()automatically if the summarizer is unfit (_dirty=True)
File Format
Memory is persisted as a JSON file with a version marker:
{ "__version__":2, "saved_at":1718000000.0, "entry_count":3, "entries":[ { "key":"user_name", "value":"Divyanshu", "timestamp":1718000000.0, "access_count":4, "last_accessed":1718001000.0 } ] }
Legacy flat {key: value} files from v1 are automatically migrated on load.
Examples Summary
# Plain Memory frompyaitk.Memoryimport Memory mem = Memory("memory.json") mem.remember("name", "Divyanshu") print(mem.recall("name")) mem.save_memory() # SmartMemory frompyaitk.Memoryimport SmartMemory mem = SmartMemory("memory.json", auto_fit=True, fit_interval=10) mem.remember("How are you?", "I'm great!") results = mem.semantic_search("how are things", top_k=3) intent = mem.predict_intent("What's the weather?") mem.export_report("report.json") mem.save_memory() # Factory (recommended) frompyaitk.Memoryimport build_memory mem = build_memory("memory.json", smart=True, fit_interval=50) mem.remember("greeting", "Hello!") mem.save_memory() # Extended API mem.forget("greeting") print("greeting" in mem) # False print(mem.size) print(mem.to_dict()) mem.wipe()
Pythonaibrain Config
The master configuration layer for the entire Pythonaibrain / pyaitk framework. All subsystems β Brain, STT, TTS, NER, TTI, Memory Summarizer, LLM β read their settings from a single .pbcfg file via typed dataclass section objects on one unified AppConfig instance.
What Is This?
Config provides:
.pbcfgfile format β INI-style config file with; #comment support and inline commentsAppConfigβ unified config manager that reads/writes all sections and exposes them as typed dataclasses- 22 section dataclasses β one per subsystem, each with typed fields and sensible defaults
- Auto-discovery β searches upward from cwd to find
config.pbcfg, falls back to built-in defaults - Typed
get()/set()β generic read/write with optional type casting and fallback generate_default_config()β writes a fully-populated default.pbcfgto disk- JSON export β serialize the entire config to JSON string or file
- Module-level singleton β
get_config()returns the same instance across the entire application
File Format (.pbcfg)
The config file uses standard INI format. Comments start with ; or #. Inline comments are supported.
; PythonAIBrain configuration file [brain] intents_path=./intents.json condition=true smart_memory=true memory_path=memory.json memory_fit_interval=20 username=user_name download=false [model] model_path=model.pth dimension_path=dimensions.json batch_size=8 learning_rate=0.001 epochs=100 [llm] n_ctx=2048 n_threads=0; 0 β use os.cpu_count() max_tokens=512 verbose=false [tts] rate=150 volume=1.0 voice=david default_text=Hello from PyAI output_path=; if set, saves audio here instead of playing [stt] energy_threshold=; empty β auto-calibrate dynamic_energy_threshold=true pause_threshold=0.8 phrase_time_limit=; hard cap per utterance (seconds) timeout=5.0 ambient_noise_duration=0.5 connectivity_host=8.8.8.8 connectivity_port=53 connectivity_timeout=2.0 preferred_engine=None; None | google | pocketsphinx sphinx_language=en-US google_language=en-US google_api_key=; empty β free tier max_retries=3 retry_delay=0.5 [logging] level=INFO format=%(asctime)s [%(levelname)s] %(name)s - %(message)s [weather] base_url=https://api.openweathermap.org/data/2.5/weather units=metric [memory] auto_load=true auto_fit=true [search] max_results=5 [webassistant] intents_path=./Webintents.json model_path=WebAssistantModel.pth dimension_path=WebAssistantDimensions.json batch_size=8 learning_rate=0.001 epochs=100 [embedding] tfidf_max_features=5000 tfidf_ngram_range=1,3 tfidf_sublinear_tf=true embed_dim=128 vocab_size=10000 [clustering] n_clusters=8 kmeans_max_iter=300 kmeans_random_state=42 dbscan_eps=0.5 dbscan_min_samples=2 agglo_linkage=ward agglo_distance_threshold= [classifier] lr_max_iter=500 lr_c=1.0 lr_solver=lbfgs lr_multi_class=auto similarity_threshold=0.65 [summarizer] latent_dim=64 hidden_dim=256 ae_epochs=30 ae_lr=0.001 ae_batch_size=16 top_patterns_per_cluster=3 min_cluster_size=2 [tti_image] default_width=512 default_height=512 default_bpp=24 default_format=png background_color=255,255,255 jpeg_quality=92 [tti_ai] nlp_backend=nltk max_prompt_tokens=128 use_stopword_filter=true model_type=vae_numpy latent_dim=128 text_embed_dim=256 vocab_size=4096 hidden_dim=512 palette_clusters=8 num_inference_steps=50 guidance_scale=7.5 seed= [tti_art] fractal_max_iter=256 blur_default_radius=2 noise_default_intensity=0.15 animation_fps=24 streaming_chunk_mb=32 [tti_paths] output_dir=tti_output model_dir=tti_models cache_dir=tti_cache log_dir=tti_logs [postprocessor] min_length=1 max_length= allowed_labels= blocked_labels= deduplicate=true merge_adjacent=false lowercase_labels=false strip_punct=true custom_label_map= [preprocessor] lowercase=false remove_urls=true remove_emails=false remove_html_tags=true normalize_whitespace=true normalize_unicode=true max_length= custom_patterns=
Boolean values accept: true/false, yes/no, on/off, 1/0.
Empty values (e.g. timeout =) resolve to None for optional fields.
Tuple fields (e.g. tfidf_ngram_range, background_color) are stored as comma-separated integers.
How to Use
1. Module-level singleton (recommended)
frompyaitk.configimport get_config cfg = get_config() # auto-discovers config.pbcfg from cwd upward print(cfg.brain.smart_memory) # True print(cfg.model.epochs) # 100 print(cfg.stt.pause_threshold) # 0.8
2. Load a specific file
frompyaitk.configimport AppConfig cfg = AppConfig("myproject.pbcfg") print(cfg.tts.voice) # "david" print(cfg.llm.n_ctx) # 2048
3. Auto-discover
cfg = AppConfig.discover() # search cwd β home cfg = AppConfig.discover("/my/dir") # search from a custom start path
4. Build from a dict
cfg = AppConfig.from_dict({ "brain": {"smart_memory": True, "username": "divyanshu"}, "model": {"epochs": 50, "learning_rate": 0.0005}, })
5. Read, mutate, and save
cfg = get_config() # Read with typed access print(cfg.brain.intents_path) print(cfg.embedding.tfidf_max_features) # Generic get with cast and fallback val = cfg.get("tti_ai", "latent_dim", fallback=128, cast=int) # Mutate in memory cfg.set("tti_ai", "seed", 42) cfg.set("tts", "rate", 180) # Persist to disk cfg.save() # saves to original path cfg.save("backup.pbcfg") # save to alternate path
6. Reload from disk
cfg.load() # reload from current path cfg.load("other.pbcfg") # reload from different file
7. Generate a default config file
frompyaitk.configimport generate_default_config path = generate_default_config() # β ./config.pbcfg path = generate_default_config("myapp.pbcfg")
8. Reset to factory defaults
frompyaitk.configimport reset_config cfg = reset_config() # clears singleton, returns AppConfig with all defaults
9. Export to JSON
json_str = cfg.to_json(indent=2) cfg.save_json("config.json")
10. Inspect all settings
print(cfg.dump()) # human-readable dump of all sections d = cfg.as_dict() # nested dict {section: {key: value}}
Section Reference
[brain] β cfg.brain (BrainConfig)
| Key | Type | Default | Description |
|---|---|---|---|
intents_path |
str |
./intents.json |
Path to intents JSON file |
condition |
bool |
True |
Enable dynamic intent learning from web search |
smart_memory |
bool |
True |
Use SmartMemory (semantic search + clustering) |
memory_path |
str |
memory.json |
Path for memory persistence |
memory_fit_interval |
int |
20 |
Auto-fit SmartMemory every N stored memories |
username |
str |
user_name |
Key used for user name storage in memory |
download |
bool |
False |
Auto-download NLTK data on Brain init |
[model] β cfg.model (ModelConfig)
| Key | Type | Default | Description |
|---|---|---|---|
model_path |
str |
model.pth |
Path to save/load model weights |
dimension_path |
str |
dimensions.json |
Path to save/load vocab/intent map |
batch_size |
int |
8 |
Training batch size |
learning_rate |
float |
0.001 |
Adam optimizer learning rate |
epochs |
int |
100 |
Training epochs |
[llm] β cfg.llm (LLMConfig)
| Key | Type | Default | Description |
|---|---|---|---|
n_ctx |
int |
2048 |
LLM context window size |
n_threads |
int |
0 |
CPU threads; 0 = os.cpu_count() |
max_tokens |
int |
512 |
Max tokens to generate per response |
verbose |
bool |
False |
Enable llama.cpp verbose logging |
[tts] β cfg.tts (TTSConfig)
| Key | Type | Default | Description |
|---|---|---|---|
rate |
int |
150 |
Words per minute |
volume |
float |
1.0 |
Volume 0.0β1.0 (validated in __post_init__) |
voice |
str |
david |
Voice name fragment (fuzzy matched) |
default_text |
str |
Hello from PyAI |
Fallback text for empty say() calls |
output_path |
str or None |
None |
Save to WAV file instead of playing |
[stt] β cfg.stt (STTConfig)
| Key | Type | Default | Description |
|---|---|---|---|
energy_threshold |
float or None |
None |
Mic sensitivity; None = auto-calibrate |
dynamic_energy_threshold |
bool |
True |
Continuously adjust threshold |
pause_threshold |
float |
0.8 |
Silence seconds marking end of phrase |
phrase_time_limit |
float or None |
None |
Hard cap per utterance in seconds |
timeout |
float or None |
5.0 |
Seconds to wait for speech to start |
ambient_noise_duration |
float |
0.5 |
Seconds to sample noise floor before listening |
connectivity_host |
str |
8.8.8.8 |
Host used for the network connectivity probe |
connectivity_port |
int |
53 |
Port for the connectivity probe |
connectivity_timeout |
float |
2.0 |
Timeout for the connectivity probe |
preferred_engine |
Engine or None |
None |
Force GOOGLE or POCKETSPHINX; None = auto |
sphinx_language |
str |
en-US |
PocketSphinx language code |
google_language |
str |
en-US |
Google Speech API BCP-47 language tag |
google_api_key |
str or None |
None |
Google API key; None = free tier |
max_retries |
int |
3 |
Max retry attempts on service errors |
retry_delay |
float |
0.5 |
Seconds between retries |
[logging] β cfg.logging (LoggingConfig)
| Key | Type | Default | Description |
|---|---|---|---|
level |
str |
INFO |
Root logging level |
format |
str |
%(asctime)s [%(levelname)s] %(name)s - %(message)s |
Log record format |
[memory] β cfg.memory (MemoryConfig)
| Key | Type | Default | Description |
|---|---|---|---|
auto_load |
bool |
True |
Load memory from disk on init |
auto_fit |
bool |
True |
Auto-fit SmartMemory summarizer on load |
[embedding] β cfg.embedding (EmbeddingConfig)
| Key | Type | Default | Description |
|---|---|---|---|
tfidf_max_features |
int |
5000 |
Max vocabulary size for TF-IDF |
tfidf_ngram_range |
tuple |
(1, 3) |
N-gram range (stored as 1,3 in file) |
tfidf_sublinear_tf |
bool |
True |
Apply sublinear TF scaling |
embed_dim |
int |
128 |
Learned embedding dimension |
vocab_size |
int |
10000 |
Vocabulary size for learned embeddings |
[clustering] β cfg.clustering (ClusteringConfig)
| Key | Type | Default | Description |
|---|---|---|---|
n_clusters |
int |
8 |
KMeans cluster count |
kmeans_max_iter |
int |
300 |
KMeans max iterations |
kmeans_random_state |
int |
42 |
KMeans random seed |
dbscan_eps |
float |
0.5 |
DBSCAN epsilon radius |
dbscan_min_samples |
int |
2 |
DBSCAN minimum samples per cluster |
agglo_linkage |
str |
ward |
Agglomerative linkage criterion |
agglo_distance_threshold |
float or None |
None |
Agglomerative distance threshold |
[classifier] β cfg.classifier (ClassifierConfig)
| Key | Type | Default | Description |
|---|---|---|---|
lr_max_iter |
int |
500 |
Logistic Regression max iterations |
lr_c |
float |
1.0 |
LR regularization strength (lower = stronger) |
lr_solver |
str |
lbfgs |
LR solver |
lr_multi_class |
str |
auto |
LR multi-class strategy |
similarity_threshold |
float |
0.65 |
Min cosine similarity for pattern matching |
[summarizer] β cfg.summarizer (SummarizerConfig)
| Key | Type | Default | Description |
|---|---|---|---|
latent_dim |
int |
64 |
Autoencoder latent space dimension |
hidden_dim |
int |
256 |
Autoencoder hidden layer size |
ae_epochs |
int |
30 |
Autoencoder training epochs |
ae_lr |
float |
0.001 |
Autoencoder learning rate |
ae_batch_size |
int |
16 |
Autoencoder training batch size |
top_patterns_per_cluster |
int |
3 |
Top patterns to extract per cluster |
min_cluster_size |
int |
2 |
Minimum cluster size for summarization |
[tti_image] β cfg.tti_image (TTIImageConfig)
| Key | Type | Default | Description |
|---|---|---|---|
default_width |
int |
512 |
Output image width in pixels |
default_height |
int |
512 |
Output image height in pixels |
default_bpp |
int |
24 |
Bits per pixel |
default_format |
str |
png |
Output format: png, bmp, jpeg |
background_color |
tuple |
(255, 255, 255) |
RGB background color (stored as 255,255,255) |
jpeg_quality |
int |
92 |
JPEG compression quality (1β95) |
[tti_ai] β cfg.tti_ai (TTIAIConfig)
| Key | Type | Default | Description |
|---|---|---|---|
nlp_backend |
str |
nltk |
NLP tokenizer: nltk or spacy |
max_prompt_tokens |
int |
128 |
Max tokens per prompt |
use_stopword_filter |
bool |
True |
Filter stopwords from prompts |
model_type |
str |
vae_numpy |
AI model type: vae_numpy or torch_vae |
latent_dim |
int |
128 |
VAE latent space dimension |
text_embed_dim |
int |
256 |
Text embedding dimension |
vocab_size |
int |
4096 |
Vocabulary size for text encoding |
hidden_dim |
int |
512 |
Hidden layer size |
palette_clusters |
int |
8 |
KMeans clusters for palette extraction |
palette_model_path |
str |
tti_palette_model.pkl |
Path for saved palette model |
num_inference_steps |
int |
50 |
Diffusion inference steps |
guidance_scale |
float |
7.5 |
Classifier-free guidance scale |
seed |
int or None |
None |
Random seed; empty in file resolves to None |
[tti_art] β cfg.tti_art (TTIArtConfig)
| Key | Type | Default | Description |
|---|---|---|---|
fractal_max_iter |
int |
256 |
Max iterations for fractal generation |
blur_default_radius |
int |
2 |
Default Gaussian blur radius |
noise_default_intensity |
float |
0.15 |
Default noise overlay intensity |
animation_fps |
int |
24 |
Frames per second for animations |
streaming_chunk_mb |
int |
32 |
Chunk size for streaming output (MB) |
[tti_paths] β cfg.tti_paths (TTIPathConfig)
| Key | Type | Default | Description |
|---|---|---|---|
output_dir |
str |
tti_output |
Directory for generated images |
model_dir |
str |
tti_models |
Directory for saved models |
cache_dir |
str |
tti_cache |
Directory for cached data |
log_dir |
str |
tti_logs |
Directory for TTI log files |
Call cfg.ensure_tti_dirs() (or cfg.tti_paths.ensure_dirs()) to create all four directories.
[postprocessor] β cfg.postprocessor (PostprocessorConfig)
| Key | Type | Default | Description |
|---|---|---|---|
min_length |
int |
1 |
Minimum entity text length |
max_length |
int or None |
None |
Maximum entity text length |
allowed_labels |
set[str] or None |
None |
Allow only these labels; None = allow all |
blocked_labels |
set[str] |
{} |
Always exclude these labels |
deduplicate |
bool |
True |
Remove duplicate spans |
merge_adjacent |
bool |
False |
Merge consecutive same-label spans |
lowercase_labels |
bool |
False |
Lowercase all label names |
strip_punct |
bool |
True |
Strip leading/trailing punctuation from text |
custom_label_map |
dict[str, str] |
{} |
Rename labels e.g. {"PERSON": "PER"} |
[preprocessor] β cfg.preprocessor (PreprocessorConfig)
| Key | Type | Default | Description |
|---|---|---|---|
lowercase |
bool |
False |
Lowercase the text |
remove_urls |
bool |
True |
Strip http://, https://, www. URLs |
remove_emails |
bool |
False |
Strip email addresses |
remove_html_tags |
bool |
True |
Strip HTML tags |
normalize_whitespace |
bool |
True |
Collapse whitespace to single spaces |
normalize_unicode |
bool |
True |
NFC Unicode normalization |
max_length |
int or None |
None |
Truncate text to this many characters |
custom_patterns |
list[str] |
[] |
Regex patterns to strip |
AppConfig API Reference
| Method / Property | Returns | Description |
|---|---|---|
load(path?) |
AppConfig |
Parse .pbcfg file and populate all sections |
save(path?) |
AppConfig |
Write current config to .pbcfg |
get(section, key, fallback, cast) |
Any |
Read a raw value with optional type casting |
set(section, key, value) |
None |
Write a value into the in-memory config (call save() to persist) |
to_json(indent?) |
str |
Serialize all sections to a JSON string |
save_json(filepath?) |
None |
Write config to a JSON file |
as_dict() |
dict |
Return entire config as a nested {section: {key: value}} dict |
dump() |
str |
Human-readable summary of all settings |
ensure_tti_dirs() |
None |
Create TTI output/model/cache/log directories if missing |
AppConfig.discover(start?) |
AppConfig |
Search cwd β home for config.pbcfg; use defaults if not found |
AppConfig.from_dict(data, path?) |
AppConfig |
Build config from a nested dict |
Module-Level Helpers
frompyaitk.configimport get_config, reset_config, generate_default_config # Get or create the global singleton cfg = get_config() # Force a specific file into the singleton cfg = get_config("custom.pbcfg") # Reset singleton to factory defaults cfg = reset_config() # Write a default config file to disk path = generate_default_config() # β ./config.pbcfg path = generate_default_config("myapp.pbcfg")
Examples Summary
frompyaitk.configimport AppConfig, get_config, generate_default_config # Generate a default file generate_default_config("myproject.pbcfg") # Load and read cfg = AppConfig("myproject.pbcfg") print(cfg.brain.smart_memory) # True print(cfg.model.epochs) # 100 print(cfg.stt.google_language) # "en-US" print(cfg.tti_image.default_format) # "png" print(cfg.embedding.tfidf_ngram_range) # (1, 3) # Mutate and save cfg.set("model", "epochs", 200) cfg.set("tts", "voice", "zira") cfg.set("tti_ai", "seed", 1337) cfg.save() # Generic typed get val = cfg.get("clustering", "n_clusters", fallback=8, cast=int) # JSON export cfg.save_json("config_backup.json") # Factory methods cfg2 = AppConfig.discover() cfg3 = AppConfig.from_dict({ "brain": {"username": "divyanshu", "smart_memory": True}, "llm": {"max_tokens": 256}, }) # Create TTI directories cfg.ensure_tti_dirs() # Inspect print(cfg.dump())
PyAgent β ZENTRAA CLI Reference
ZENTRAA (Zone for Encrypted Networked Talks & Real-time AI Agent)
Powered by Pythonaibrain v1.1.9 Β· Author: Divyanshu Sinha
Encryption: RSA-2048-OAEP Β· AES-256-GCM Β· RSA-PSS Β· Curve25519
Installation
pipinstall"pythonaibrain[zentraa]"
Linux only β install PortAudio before pip if you plan to use the TIGER AI voice features:
sudoaptinstallportaudio19-devpython3-pyaudio
Quick Start
Every command is available through the pythonaibrain (or pyaitk) dispatcher:
pythonaibrain zentraa <command> [options]
Or as a standalone alias:
zentraa-server / zentraa-client / zentraa-tiger-ai / zentraa-web
Typical startup order:
1. zentraa server β start first
2. zentraa web β optional browser bridge
3. zentraa ai β optional TIGER AI agent
4. zentraa client β one per human user
Commands
zentraa server β TCP Chat Server
Starts the encrypted ZENTRAA chat server that all clients connect to.
pythonaibrainzentraaserver pythonaibrainzentraaserver--host0.0.0.0--port9999 pythonaibrainzentraaserver--config/path/to/ZENTRAA.pbcfg
| Option | Short | Default | Description |
|---|---|---|---|
--config |
-c |
ZENTRAA.pbcfg |
Path to config file |
--host |
-H |
from config | Override bind host |
--port |
-p |
from config | Override bind port |
--help |
Show help and exit |
zentraa client β Chat Client (TUI)
Interactive terminal client for human users. Supports direct messages, broadcasts, and talking to TIGER AI.
pythonaibrainzentraaclient pythonaibrainzentraaclient--host127.0.0.1--port9999--useridAlice pythonaibrainzentraaclient--config/path/to/ZENTRAA.pbcfg
| Option | Short | Default | Description |
|---|---|---|---|
--config |
-c |
ZENTRAA.pbcfg |
Path to config file |
--host |
-H |
from config | Server host |
--port |
-p |
from config | Server port |
--userid |
-u |
prompted | Your user ID |
--help |
Show help and exit |
In-chat commands
| Command | Description |
|---|---|
<message> |
Broadcast to all users |
@<userid> <message> |
Direct message a user |
@uid1 @uid2 ... <message> |
Multi-user direct message |
@ai <message> |
Ask TIGER AI privately |
@ai @<uid> <message> |
Ask AI, share reply with <uid> |
/help |
Show help |
/clear or /cls |
Clear the screen |
/ai |
Show TIGER AI info |
/setting |
View current settings |
/users |
List online users |
/me <action> |
Send an action / emote message |
/whois <userid> |
Show info about a user |
/ping |
Ping the server manually |
/stats |
Show session statistics |
/notify <on|off> |
Toggle bell notifications |
/timestamps <on|off> |
Toggle message timestamps |
/quit or /exit |
Disconnect |
Keyboard shortcuts
| Key | Action |
|---|---|
β / β |
Browse command history |
Tab |
Autocomplete @userid or /command |
Ctrl+C / Ctrl+D |
Quit |
zentraa ai β TIGER AI Agent
Connects an automated AI agent (TIGER AI) to the server. Other users can query it with @ai <message>.
pythonaibrainzentraaai pythonaibrainzentraaai--smart pythonaibrainzentraaai--basic--host127.0.0.1--port9999 pythonaibrainzentraaai--config/path/to/ZENTRAA.pbcfg
| Option | Short | Default | Description |
|---|---|---|---|
--config |
-c |
ZENTRAA.pbcfg |
Path to config file |
--host |
-H |
from config | Server host |
--port |
-p |
from config | Server port |
--smart |
from config | Force AdvanceBrain (LLM mode) | |
--basic |
from config | Force Brain (intent-matching mode) | |
--help |
Show help and exit |
--smartand--basicare mutually exclusive. If neither is passed, the value fromZENTRAA.pbcfgis used.
zentraa web β HTTP / WebSocket Bridge
Starts the HTTP and WebSocket bridge so browser clients can connect to the ZENTRAA TCP server. The bridge auto-selects a free port starting from 7080 unless --no-auto-port is set.
pythonaibrainzentraaweb pythonaibrainzentraaweb--http-port7080--tcp-port9999 pythonaibrainzentraaweb--no-auto-port--max-upload-mb128--history1000
| Option | Default | Description |
|---|---|---|
--host |
0.0.0.0 |
HTTP bind host |
--http-port |
7080 (auto) |
HTTP / WebSocket port |
--tcp-host |
127.0.0.1 |
ZENTRAA TCP server host |
--tcp-port |
9999 |
ZENTRAA TCP server port |
--no-auto-port |
off | Fail instead of scanning for a free port |
--max-upload-mb |
64 |
Maximum file upload size in MB |
--history |
500 |
Messages stored per conversation |
--ping-interval |
20 |
WebSocket ping interval in seconds |
--help |
Show help and exit |
Once running, open your browser at:
http://localhost:7080
Bridge features
- Typing indicators
- Read receipts
- Emoji reactions
- Message delivery confirmations
- Rich user list (online/offline status)
- File upload via
POST /api/upload(chunked, β€128 KB per chunk) - REST
GET /api/usersβ online users and metadata - REST
GET /api/history/{conv_id}β last N messages
Global CLI Flags
pythonaibrain--version# Print version pythonaibrain--info# Package metadata + module availability pythonaibrain--modules# Per-module availability table pythonaibrain--help# Full help
Configuration
All commands read ZENTRAA.pbcfg from the current directory by default. Pass --config to use a different path.
A minimal config example:
[network] host=0.0.0.0 port=9999 [client] default_host=127.0.0.1 default_port=9999 [ai] smart_ai=true [ui] banner_style=full
.envfiles and RSA key files (.pem) in.zentraa_keys/are generated locally and are never included in the package. Keep them out of version control.
Architecture
Browser ββWS/JSONβββΊ HTTP Bridge (zentraa web)
β
TCP/Encrypted
β
ZENTRAA Server (zentraa server)
β
ββββββββββββββ΄βββββββββββββ
Chat Clients TIGER AI Agent
(zentraa client) (zentraa ai)
License
- PyAgent / ZENTRAA: LGPL-3.0-or-later
- pyaitk.CLSE: AGPL-3.0-or-later (see
pyaitk/CLSE/LICENSE.txt)
Visit PyPI for installation and more details.
Visit GitHub for more detail about package.
Visit Pythonaibrain Issues for any issues.
Start building your AI assistant today with Pythonaibrain!
Project details
Verified details
These details have been verified by PyPIMaintainers
π Avatar for DivyanshuSinha from gravatar.comDivyanshuSinha
Unverified details
These details have not been verified by PyPIProject links
Meta
-
License Expression: LGPL-3.0-or-later AND AGPL-3.0-or-later
SPDX License Expression - Author: Divyanshu Sinha
- Tags ai , artificial-intelligence , assistant , tts , stt , speech , nlp , nlp-library , text-to-speech , speech-to-text , object-detection , computer-vision , named-entity-recognition , ner , image-generation , offline-ai , math-ai , summarizer , memory
- Requires: Python >=3.9
-
Provides-Extra:
core,tts,stt,camera,itt,context,ner,memory,math,search,pptx,pdf,eye,summarizer,clse,all,dev,docs,zentraa
Classifiers
- Development Status
- Intended Audience
- Natural Language
- Operating System
- Programming Language
- Topic
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file pythonaibrain-1.1.9.tar.gz.
File metadata
- Download URL: pythonaibrain-1.1.9.tar.gz
- Upload date:
- Size: 28.7 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
3e9dee64045d4182db2fee4f265760a779f5cc621a54ab423d123de2becf1f84
|
|
| MD5 |
73572839861d733a051792e0d9b401d5
|
|
| BLAKE2b-256 |
79dbda65aac591a8a0da6b80e66c4b41e0d06f35d09e60cf27847e519ce070ea
|
File details
Details for the file pythonaibrain-1.1.9-py3-none-any.whl.
File metadata
- Download URL: pythonaibrain-1.1.9-py3-none-any.whl
- Upload date:
- Size: 28.7 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
e39701c87dfe0982144b1e12ee7d458920f81d59bde1adfdee3eb33243dcc433
|
|
| MD5 |
8e47d5b5b846bbf0af8dc6ae2e26b7a2
|
|
| BLAKE2b-256 |
23debf22001afc78dc19b423d0ef4dee78fe02d41d4546e99f20bf184124c50e
|
