Dataset Viewer

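Preview of the 10 rows (image column omitted; each image is about 1.72k px wide):

| frame_id | timestamp (s) | frequency (Hz) | time_scale | capture_date |
|----------|---------------|----------------|------------|--------------------------|
| 0 | 5.365 | 0 | 1 | 2026-01-13T19:57:36.427Z |
| 1 | 5.813 | 0 | 1 | 2026-01-13T19:57:36.946Z |
| 2 | 6.219 | 0 | 1 | 2026-01-13T19:57:37.459Z |
| 3 | 6.688 | 0 | 1 | 2026-01-13T19:57:37.954Z |
| 4 | 7.147 | 0 | 1 | 2026-01-13T19:57:38.619Z |
| 5 | 7.552 | 0 | 1 | 2026-01-13T19:57:38.936Z |
| 6 | 8.032 | 0 | 1 | 2026-01-13T19:57:39.445Z |
| 7 | 8.501 | 0 | 1 | 2026-01-13T19:57:39.939Z |
| 8 | 9.003 | 0 | 1 | 2026-01-13T19:57:40.519Z |
| 9 | 9.504 | 0 | 1 | 2026-01-13T19:57:40.947Z |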



Audioform_Dataset_v1

This dataset is the first output from AUDIOFORM, a Three.js-powered 3D audio visualization tool that turns audio files into timestamped visual frames with per-frame metadata. AUDIOFORM by webXOS is available for download in the /audioform/ folder of this repo, so developers can create similar datasets of their own. AUDIOFORM is a synthetic harmonic oscillator that runs in HTML; think of it as a "Hello World" / MNIST-style dataset application for audio-to-visual multimodal machine learning.

This dataset contains 10 captured frames from a short uploaded WAV file (played at 1× speed), together with per-frame metadata including dominant frequency, timestamp, and capture info.

Dataset Structure

audioform_dataset/
├── images/
│   ├── frame_0001.png
│   ├── frame_0002.png
│   └── ... (10 PNG frames total)
├── metadata.csv # Main metadata file (Hugging Face viewer uses this)
└── README.md
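
A quick consistency check for this layout could look like the following. It is a minimal sketch, assuming `metadata.csv` sits at the dataset root and that its `file_name` column holds paths relative to that root, as the tree above suggests. The function name `missing_images` is hypothetical and not part of AUDIOFORM.

```python
import csv
from pathlib import Path


def missing_images(root):
    """Return file_name entries from metadata.csv that have no file on disk.

    Assumes `root` contains metadata.csv and that file_name paths are
    relative to `root` (an assumption based on the layout above).
    """
    root = Path(root)
    with open(root / "metadata.csv", newline="", encoding="utf-8") as fh:
        rows = list(csv.DictReader(fh))
    # Keep only the rows whose image path does not resolve to a real file.
    return [
        row["file_name"]
        for row in rows
        if not (root / row["file_name"]).is_file()
    ]
```

Calling `missing_images("audioform_dataset")` should return an empty list when every metadata row points to an existing PNG.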

| Column | Type | Description | Example Value |
|---------------|---------|-----------------------------------------------------------------------------|-----------------------------------|
| `file_name` | string | Relative path to the visualization PNG (required by Hugging Face) | `images/frame_0001.png` |
| `frame_id` | int | Sequential frame number (0-based) | 0, 1, 2, …, 9 |
| `timestamp` | float | Time in seconds when the frame was captured from the audio | 5.365, 6.219, 9.504 |
| `frequency` | int | Dominant / main detected audio frequency at capture time (Hz) | 0 (in this tiny sample) |
| `time_scale` | int | Playback speed multiplier used during visualization | 1 |
| `capture_date`| string | UTC ISO timestamp when the frame was rendered | 2026-01-13T19:57:36.427Z |
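
The column types above can be applied when reading the CSV. The sketch below uses only the Python standard library and reads columns by name, so it does not depend on column order. The `Frame` class, the `parse_metadata` function, and the two sample rows are illustrative assumptions: the values come from this card, but pairing `frame_id` 0 with `frame_0001.png` is a guess.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Frame:
    file_name: str
    frame_id: int
    timestamp: float   # seconds into the audio
    frequency: int     # dominant frequency in Hz
    time_scale: int    # playback speed multiplier
    capture_date: str  # UTC ISO timestamp, kept as text here


def parse_metadata(text):
    """Parse metadata.csv text into typed Frame records, keyed by column name."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        Frame(
            file_name=row["file_name"],
            frame_id=int(row["frame_id"]),
            timestamp=float(row["timestamp"]),
            frequency=int(row["frequency"]),
            time_scale=int(row["time_scale"]),
            capture_date=row["capture_date"],
        )
        for row in reader
    ]


# Two sample rows assembled from values shown in this card. The header order
# and the file_name/frame_id pairing are assumptions for illustration only.
SAMPLE = (
    "file_name,frame_id,timestamp,frequency,time_scale,capture_date\n"
    "images/frame_0001.png,0,5.365,0,1,2026-01-13T19:57:36.427Z\n"
    "images/frame_0002.png,1,5.813,0,1,2026-01-13T19:57:36.946Z\n"
)

frames = parse_metadata(SAMPLE)
```

For the real file, the same function can be fed the contents of `metadata.csv` read as text.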

With only 10 frames, the dataset suits quick memorization tests: see how fast a small diffusion model, GAN, or LoRA can memorize and regenerate these 10 frames. The frames can also serve as style references for ControlNet, IP-Adapter, or Stable Diffusion fine-tuning to reproduce this neon 3D audio-visualization aesthetic.

 This dataset shows the **format** AUDIOFORM produces. 
 → Feed it real music, voices, field recordings, synths 
 → Generate 1k–100k+ frames 
 → Add labels (genre, instrument, mood, multiple freq peaks…) 
 → Unlock serious applications:

 - Music video auto-generation 
 - Visual audio classifiers 
 - Audio-conditioned image/video generation 
 - Interactive music → 3D art installations 
 - Novel multimodal music understanding models

Dataset Description

This dataset was generated using AUDIOFORM, a 3D audio visualization system.

Total Frames: 10
Generation Date: 2026-01-13
Audio Type: Uploaded WAV File
Time Scaling: 1x

Dataset Structure

- `images/`: contains all captured frames in PNG format
- `metadata.csv`: contains the per-frame metadata, one row per image

Metadata Columns

- `file_name`: relative path to the image file (e.g., `images/frame_0001.png`); required by Hugging Face
- `frame_id`: sequential, 0-based identifier for each frame
- `timestamp`: time in seconds into the audio when the frame was captured
- `frequency`: dominant audio frequency detected at capture time (Hz)
- `time_scale`: playback speed multiplier used during visualization
- `capture_date`: UTC ISO timestamp of when the frame was rendered
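
Since `timestamp` is measured in audio time and `capture_date` in wall-clock time, the two can be compared frame to frame. The following is a minimal sketch, assuming `capture_date` always has the `YYYY-MM-DDTHH:MM:SS.fffZ` shape seen in this dataset. The helper names `parse_capture_date` and `frame_gaps` are hypothetical.

```python
from datetime import datetime, timezone


def parse_capture_date(value):
    """Parse a capture_date string such as 2026-01-13T19:57:36.427Z as UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def frame_gaps(timestamps, capture_dates):
    """Return (audio_gap, wall_clock_gap) pairs in seconds for consecutive frames."""
    walls = [parse_capture_date(value) for value in capture_dates]
    gaps = []
    for i in range(1, len(timestamps)):
        audio_gap = round(timestamps[i] - timestamps[i - 1], 3)
        wall_gap = round((walls[i] - walls[i - 1]).total_seconds(), 3)
        gaps.append((audio_gap, wall_gap))
    return gaps
```

For the first two frames in this dataset, the audio gap is 0.448 s (5.813 minus 5.365) and the wall-clock gap is 0.519 s, so the two clocks do not advance in lockstep even at a `time_scale` of 1.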

Intended Use

This dataset is intended for training machine learning models on audio visualization patterns, waveform classification, or generative AI tasks.


URL: https://huggingface.co/datasets/webxos/audioform_dataset
