URL: https://huggingface.co/datasets/its5Q/habr_qna

Dataset Card for Habr QnA

Dataset Summary

This is a dataset of questions and answers scraped from Habr QnA. It contains 723430 questions, each with its answers, comments and other metadata.

Languages

The text is mostly in Russian, with source code snippets in various programming languages.

Dataset Structure

Data Fields

Data fields can be previewed on the dataset card page.
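
The fields can also be inspected programmatically. The following is a minimal sketch, assuming the Hugging Face `datasets` library is installed and that the default configuration of `its5Q/habr_qna` loads without extra arguments. The helper names `describe_fields` and `peek_first_record` are hypothetical and not part of the dataset or the library. The card does not list the field names, so the sketch reports whatever fields the first record actually has.

```python
from typing import Any, Dict

# Repository ID taken from the dataset page URL.
DATASET_ID = "its5Q/habr_qna"


def describe_fields(record: Dict[str, Any]) -> Dict[str, str]:
    """Map each top-level field name to the Python type name of its value."""
    return {name: type(value).__name__ for name, value in record.items()}


def peek_first_record(dataset_id: str = DATASET_ID) -> Dict[str, Any]:
    """Stream the train split and return its first record.

    Streaming avoids downloading all examples just to look at one.
    Assumption: the default configuration needs no extra arguments.
    """
    # Lazy import so the helpers above work without the library installed.
    from datasets import load_dataset

    stream = load_dataset(dataset_id, split="train", streaming=True)
    return next(iter(stream))
```

Calling `describe_fields(peek_first_record())` would then print a name-to-type overview of one example; nested fields such as answers or comments show up only by their container type.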

Data Splits

All 723430 examples are in the train split; there is no validation split.
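
Since only a train split is provided, a validation set has to be carved out by the user. The sketch below shows one possible way to do that with `Dataset.train_test_split` from the `datasets` library. The function name `load_with_validation`, the 1% fraction and the seed are illustrative assumptions, not something the dataset prescribes.

```python
# Repository ID taken from the dataset page URL.
DATASET_ID = "its5Q/habr_qna"


def load_with_validation(validation_fraction: float = 0.01, seed: int = 42):
    """Load the train split and hold out a random fraction for validation.

    Returns a (train, validation) pair of `datasets.Dataset` objects.
    The default fraction and seed are arbitrary illustrative choices.
    """
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be strictly between 0 and 1")

    # Lazy import so the argument check above works without the library.
    from datasets import load_dataset

    # Assumption: the default configuration needs no extra arguments.
    full_train = load_dataset(DATASET_ID, split="train")

    # train_test_split shuffles and returns a DatasetDict with
    # "train" and "test" keys; the "test" part serves as validation here.
    parts = full_train.train_test_split(test_size=validation_fraction, seed=seed)
    return parts["train"], parts["test"]
```

Fixing the seed keeps the held-out examples the same across runs, which matters when comparing models evaluated on that set.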

Dataset Creation

The data was scraped with a script located in my GitHub repository.

Additional Information

Dataset Curators

https://github.com/its5Q