| frame_id | timestamp | frequency | time_scale | capture_date |
|---|---|---|---|---|
| 0 | 5.365 | 0 | 1 | 2026-01-13T19:57:36.427Z |
| 1 | 5.813 | 0 | 1 | 2026-01-13T19:57:36.946Z |
| 2 | 6.219 | 0 | 1 | 2026-01-13T19:57:37.459Z |
| 3 | 6.688 | 0 | 1 | 2026-01-13T19:57:37.954Z |
| 4 | 7.147 | 0 | 1 | 2026-01-13T19:57:38.619Z |
| 5 | 7.552 | 0 | 1 | 2026-01-13T19:57:38.936Z |
| 6 | 8.032 | 0 | 1 | 2026-01-13T19:57:39.445Z |
| 7 | 8.501 | 0 | 1 | 2026-01-13T19:57:39.939Z |
| 8 | 9.003 | 0 | 1 | 2026-01-13T19:57:40.519Z |
| 9 | 9.504 | 0 | 1 | 2026-01-13T19:57:40.947Z |
AUDIOFORM
Audioform_Dataset_v1
This dataset is the first output from AUDIOFORM, a Three.js-powered 3D audio visualization tool that turns audio files into timestamped visual frames with rich metadata. AUDIOFORM by webXOS is available for download in the /audioform/ folder of this repo so developers can create similar datasets of their own. AUDIOFORM is a synthetic harmonic oscillator that runs in HTML; think of it as the "Hello World" / MNIST-style dataset application for audio-to-visual multimodal machine learning.
This dataset contains 10 captured frames from a short uploaded WAV file (played at 1× speed), together with per-frame metadata: dominant frequency, timestamp, and capture info.
Dataset Structure
audioform_dataset/
├── images/
│   ├── frame_0001.png
│   ├── frame_0002.png
│   └── ... (10 PNG frames total)
├── metadata.csv    # Main metadata file (Hugging Face viewer uses this)
└── README.md
| Column | Type | Description | Example Value |
|---------------|---------|-----------------------------------------------------------------------------|-----------------------------------|
| `file_name` | string | Relative path to the visualization PNG (required by Hugging Face) | `images/frame_0001.png` |
| `frame_id`    | int     | Sequential frame number (0-based)                                           | 0, 1, 2, …, 9                     |
| `timestamp` | float | Time in seconds when the frame was captured from the audio | 5.365, 6.219, 9.504 |
| `frequency` | int | Dominant / main detected audio frequency at capture time (Hz) | 0 (in this tiny sample) |
| `time_scale` | int | Playback speed multiplier used during visualization | 1 |
| `capture_date`| string | UTC ISO timestamp when the frame was rendered | 2026-01-13T19:57:36.427Z |
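The column layout above can be read with the Python standard library alone. The following is a minimal sketch, not part of AUDIOFORM itself: the `Frame` class and `read_frames` helper are hypothetical names, and the sample rows assume that `frame_id` 0 corresponds to `images/frame_0001.png`.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Frame:
    file_name: str
    frame_id: int
    timestamp: float   # seconds into the audio
    frequency: int     # Hz
    time_scale: int    # playback speed multiplier
    capture_date: str  # UTC ISO timestamp, kept as text


def read_frames(handle):
    """Parse metadata rows into typed Frame records."""
    return [
        Frame(
            file_name=row["file_name"],
            frame_id=int(row["frame_id"]),
            timestamp=float(row["timestamp"]),
            frequency=int(row["frequency"]),
            time_scale=int(row["time_scale"]),
            capture_date=row["capture_date"],
        )
        for row in csv.DictReader(handle)
    ]


# Two rows in the documented column layout. Pairing frame_id 0 with
# frame_0001.png is an assumption made for this illustration.
SAMPLE_CSV = """file_name,frame_id,timestamp,frequency,time_scale,capture_date
images/frame_0001.png,0,5.365,0,1,2026-01-13T19:57:36.427Z
images/frame_0002.png,1,5.813,0,1,2026-01-13T19:57:36.946Z
"""

frames = read_frames(io.StringIO(SAMPLE_CSV))
print(frames[0])
```

To read the real file instead of the inline sample, pass an open file handle, for example `open("metadata.csv", newline="")`, to `read_frames`.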
See how fast a tiny diffusion model / GAN / LoRA can memorize & regenerate these exact 10 styles. Use the frames as style references for ControlNet, IP-Adapter, or fine-tuning SD to adopt this neon 3D audio-viz aesthetic.
This dataset shows the **format** AUDIOFORM produces.
→ Feed it real music, voices, field recordings, synths
→ Generate 1k–100k+ frames
→ Add labels (genre, instrument, mood, multiple freq peaks…)
→ Unlock serious applications:
- Music video auto-generation
- Visual audio classifiers
- Audio-conditioned image/video generation
- Interactive music → 3D art installations
- Novel multimodal music understanding models
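As a rough illustration of the "add labels" step, the sketch below appends extra label columns to rows in the existing metadata layout. The `add_labels` helper, the `genre` and `mood` column names, and their values are all hypothetical; nothing here is produced by AUDIOFORM or present in this dataset.

```python
import csv
import io

BASE_COLUMNS = ["file_name", "frame_id", "timestamp", "frequency",
                "time_scale", "capture_date"]


def add_labels(rows, labels, extra_columns):
    """Return CSV text with extra label columns appended to each row.

    `labels` maps frame_id -> dict of label values; frames without an
    entry get empty strings in the new columns.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=BASE_COLUMNS + extra_columns,
                            lineterminator="\n")
    writer.writeheader()
    for row in rows:
        extra = labels.get(int(row["frame_id"]), {})
        merged = dict(row)
        for column in extra_columns:
            merged[column] = extra.get(column, "")
        writer.writerow(merged)
    return buffer.getvalue()


rows = [{
    "file_name": "images/frame_0001.png", "frame_id": "0",
    "timestamp": "5.365", "frequency": "0", "time_scale": "1",
    "capture_date": "2026-01-13T19:57:36.427Z",
}]
# "genre" and "mood" and their values are hypothetical example labels.
labelled = add_labels(rows, {0: {"genre": "ambient", "mood": "calm"}},
                      ["genre", "mood"])
print(labelled)
```

Keeping `file_name` as the first column and only appending new ones leaves the original six columns untouched.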
Dataset Description
This dataset was generated using AUDIOFORM, a 3D audio visualization system.
- Total Frames: 10
- Generation Date: 2026-01-13
- Audio Type: Uploaded WAV File
- Time Scaling: 1x
Dataset Structure
- images/: Contains all captured frames in PNG format
- metadata.csv: Contains classification data for each frame
Metadata Columns
- file_name: Relative path to the image file (e.g., images/frame_0001.png) - REQUIRED for Hugging Face
- frame_id: Unique identifier for each frame
- timestamp: Time in seconds when frame was captured
- frequency: Audio frequency at capture time (Hz)
- time_scale: Playback speed multiplier
- capture_date: ISO date string of capture
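Because each row carries both an audio-time `timestamp` and a wall-clock `capture_date`, the two can be compared frame to frame. The sketch below does this for the first three frames of this dataset; `parse_utc` and `intervals` are hypothetical helper names, and the comparison is an illustration rather than something AUDIOFORM computes.

```python
from datetime import datetime

# (timestamp, capture_date) pairs for the first three frames of this dataset.
ROWS = [
    (5.365, "2026-01-13T19:57:36.427Z"),
    (5.813, "2026-01-13T19:57:36.946Z"),
    (6.219, "2026-01-13T19:57:37.459Z"),
]


def parse_utc(stamp):
    """Parse an ISO 8601 string with a trailing 'Z' into an aware datetime."""
    # fromisoformat() accepts 'Z' only on Python 3.11+, so normalise it first.
    return datetime.fromisoformat(stamp.replace("Z", "+00:00"))


def intervals(rows):
    """Return (audio_delta, wall_delta) in seconds for consecutive frames."""
    out = []
    for (t0, d0), (t1, d1) in zip(rows, rows[1:]):
        audio = round(t1 - t0, 3)
        wall = (parse_utc(d1) - parse_utc(d0)).total_seconds()
        out.append((audio, wall))
    return out


deltas = intervals(ROWS)
for audio, wall in deltas:
    print(f"audio +{audio:.3f}s, wall clock +{wall:.3f}s")
```

For these rows the first interval is 0.448 s of audio against 0.519 s of wall-clock time, so the two clocks should not be assumed to advance in lockstep even at a `time_scale` of 1.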
Intended Use
This dataset is intended for training machine learning models on audio visualization patterns, waveform classification, or generative AI tasks.
