Qwen3.5-0.8B Text2SQL
Supervised Fine-Tuning (SFT) for Natural Language to SQL Generation
Fine-tuning Qwen3.5-0.8B using Spider, BIRD23, and SynSQL-2.5M datasets with QLoRA + Unsloth.
Project repository: https://github.com/MuhammadNafishZaldinanda/finetuning-text2sql
Dataset
Dialect: SQLite
| Dataset | Source Paper | Samples Used | Notes | Links |
|---|---|---|---|---|
| Spider | Spider: A Large-Scale Human-Labeled Dataset... | 7,000 | Full training split. | Google Drive download link |
| BIRD23-Train-Filtered | A BIg Bench for Large-Scale Database Grounded Text-to-SQLs | 6,626 | Uses the bird23-train-filtered subset. | HuggingFace Dataset |
| SynSQL-2.5M (Filtered) | OmniSQL: Synthesizing High-quality Text-to-SQL Data at Scale | 7,000 | Filtered by question style and SQL complexity. | HuggingFace Dataset, OmniSQL Official Repo |
| Total | | 20,626 | | NafishZaldinanda/text2sql-omnisql-style |
SynSQL-2.5M Filtering Configuration
| Criteria | Value |
|---|---|
| Question Style | Formal, Colloquial, Imperative, Interrogative, Descriptive, Concise |
| SQL Complexity: Simple | 700 samples |
| SQL Complexity: Moderate | 2,800 samples |
| SQL Complexity: Complex | 2,800 samples |
| SQL Complexity: Highly Complex | 700 samples |
| Total Samples | 7,000 |
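The quotas above amount to stratified sampling over SQL complexity after a question-style filter. The repository's actual filtering script is not shown here, so the following is only a minimal sketch: the field names `question_style` and `sql_complexity`, the capitalized label values, and the helper `sample_by_complexity` are assumptions made for illustration.

```python
import random
from collections import defaultdict

# Per-complexity quotas taken from the table above.
QUOTAS = {"Simple": 700, "Moderate": 2800, "Complex": 2800, "Highly Complex": 700}
# Question styles kept by the filter, also from the table above.
STYLES = {"Formal", "Colloquial", "Imperative", "Interrogative", "Descriptive", "Concise"}


def sample_by_complexity(rows, quotas=QUOTAS, styles=STYLES, seed=0):
    """Keep rows with an allowed style, then draw a fixed quota per complexity level."""
    buckets = defaultdict(list)
    for row in rows:
        # "question_style" and "sql_complexity" are assumed field names.
        if row["question_style"] in styles and row["sql_complexity"] in quotas:
            buckets[row["sql_complexity"]].append(row)

    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    sampled = []
    for level, quota in quotas.items():
        pool = buckets[level]
        if len(pool) < quota:
            raise ValueError(f"only {len(pool)} rows for {level!r}, need {quota}")
        sampled.extend(rng.sample(pool, quota))
    return sampled
```

Sampling without replacement from each bucket keeps the 700 / 2,800 / 2,800 / 700 split exact, and the fixed seed makes the 7,000-row subset repeatable.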
Instruction Prompt
Task Overview:
You are a data science expert. Below, you are provided with a database schema and a natural language question. Your task is to understand the schema and generate a valid SQL query to answer the question.
Database Engine:
SQLite
Database Schema:
{db_details}
This schema describes the database's structure, including tables, columns, primary keys, foreign keys, and any relevant relationships or constraints.
Question:
{evidence}{question}
Instructions:
- Make sure you only output the information that is asked in the question. If the question asks for a specific column, make sure to only include that column in the SELECT clause, nothing more.
- The generated query should return all of the information asked in the question without any missing or extra information.
- Before generating the final SQL query, please think through the steps of how to write the query.
Output Format:
In your answer, please enclose the generated SQL query in a code block:
```sql
-- Your SQL query
```
Take a deep breath and think step by step to find the correct SQL query.
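Two practical steps follow from this prompt: filling the `{evidence}{question}` slot and pulling the final query out of the fenced block in the model's answer. The sketch below shows one way to do both. It assumes that non-empty evidence is placed on its own line before the question and that the last sql block in a response is the final answer; neither detail is stated in this card, and the helper names are hypothetical.

```python
import re

FENCE = "`" * 3  # built programmatically so this example nests cleanly in Markdown

# Only the question part of the template is reproduced here.
QUESTION_BLOCK = "Question:\n{evidence}{question}"


def render_question(question, evidence=""):
    """Fill the {evidence}{question} slot; assumes evidence gets its own line."""
    prefix = evidence.strip() + "\n" if evidence.strip() else ""
    return QUESTION_BLOCK.format(evidence=prefix, question=question.strip())


def extract_sql(response):
    """Return the last sql-fenced block in a model response, or None if absent."""
    pattern = re.compile(FENCE + r"sql\s*\n(.*?)" + FENCE, re.DOTALL | re.IGNORECASE)
    blocks = pattern.findall(response)
    return blocks[-1].strip() if blocks else None
```

Taking the last block rather than the first is a design choice: the prompt asks the model to reason before answering, so earlier blocks may hold intermediate drafts.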
LoRA Configuration
| Parameter | Value |
|---|---|
| Quantization | 4-bit |
| LoRA Rank (r) | 32 |
| LoRA Alpha | 64 |
| LoRA Dropout | 0.0 |
| Bias | none |
| Trainable Parameters | 12.78M |
| Percentage of Trainable Parameters | 2.22% |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
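As a quick reading aid for the rank and alpha values, the sketch below computes the standard LoRA scaling factor (alpha divided by rank) and the number of extra weights one adapted linear layer contributes. The layer shape in the second call is hypothetical and is not the real Qwen3.5-0.8B dimension, so it does not reproduce the 12.78M figure.

```python
def lora_scaling(alpha, rank):
    """Standard LoRA scaling factor applied to the low-rank update B @ A."""
    return alpha / rank


def lora_params(d_in, d_out, rank):
    """Extra trainable weights for one adapted linear layer.

    A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * (d_in + d_out)


RANK, ALPHA = 32, 64  # values from the table above
print(lora_scaling(ALPHA, RANK))  # 2.0

# Hypothetical layer shape, NOT the real Qwen3.5-0.8B dimensions.
print(lora_params(d_in=1024, d_out=1024, rank=RANK))  # 65536
```

Summing `lora_params` over every targeted projection in every layer would give the total trainable count, given the model's actual layer shapes.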
Training Configuration
| Parameter | Value |
|---|---|
| Base Model | Qwen3.5-0.8B |
| Total Dataset | 20,626 |
| Epochs | 1 |
| Max Sequence Length | 8704 |
| Learning Rate | 1e-5 |
| Scheduler | Cosine |
| Warmup Ratio | 10% |
| Optimizer | adamw_torch_fused |
| Max Gradient Norm | 0.5 |
| Batch Size | 1 |
| Gradient Accumulation Steps | 8 |
| Hardware | NVIDIA RTX 4000 SFF Ada |
| Available VRAM | 20 GB |
| Peak VRAM Usage | ~19 GB |
| Training Time | 7 Hours 36 Minutes |
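The batch size, accumulation steps, warmup ratio, and cosine scheduler together determine the step count and learning-rate curve. The sketch below works through that arithmetic under stated assumptions: a single GPU, one epoch, and all 20,626 samples used for training. Since the card also reports a validation loss, some samples may have been held out, in which case the real step count would be lower. The function `lr_at` is a hypothetical re-implementation of linear warmup plus cosine decay, not the trainer's own scheduler.

```python
import math

NUM_SAMPLES = 20_626      # total dataset size from the table
BATCH_SIZE = 1
GRAD_ACCUM = 8
WARMUP_RATIO = 0.10
PEAK_LR = 1e-5

effective_batch = BATCH_SIZE * GRAD_ACCUM
total_steps = math.ceil(NUM_SAMPLES / effective_batch)
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def lr_at(step):
    """Linear warmup to PEAK_LR, then cosine decay to zero at total_steps."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch, total_steps, warmup_steps)  # 8 2579 258
```

Under these assumptions the model sees an effective batch of 8 sequences per optimizer step, and the learning rate peaks at 1e-5 after roughly the first 258 steps.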
Training Results
| Metric | Value |
|---|---|
| Final Train Loss | 0.262 |
| Final Validation Loss | 0.218 |
Model Performance Evaluation: Base vs. Fine-Tuned (Qwen3.5-0.8B)
1. Base Model (Qwen3.5-0.8B)
Overall Performance
| Metric | Value |
|---|---|
| Accuracy | 21.3% |
| Correct | 106 |
| Wrong | 152 |
| Execution Error | 240 |
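The card does not describe the evaluation harness, but the three outcome categories (correct, wrong, execution error) are typical of execution-based scoring: run the predicted and reference queries against the same SQLite database and compare the results. The following is a minimal sketch of that idea. The `classify` helper, the tiny `singer` table, and the order-insensitive multiset comparison are all assumptions for illustration; some harnesses compare sets instead.

```python
import sqlite3
from collections import Counter


def classify(conn, predicted_sql, gold_sql):
    """Return 'correct', 'wrong', or 'execution_error' for one prediction."""
    gold_rows = conn.execute(gold_sql).fetchall()
    try:
        pred_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        return "execution_error"
    # Compare as multisets so row order does not matter (an assumed convention).
    return "correct" if Counter(pred_rows) == Counter(gold_rows) else "wrong"


# Tiny hypothetical database used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO singer VALUES (?, ?)", [("Ana", 31), ("Budi", 25)])

gold = "SELECT name FROM singer WHERE age > 30"
print(classify(conn, "SELECT name FROM singer WHERE age >= 31", gold))  # correct
print(classify(conn, "SELECT name FROM singer", gold))                  # wrong
print(classify(conn, "SELECT nama FROM singer", gold))                  # execution_error
```

Scoring on results rather than on query text means a prediction written differently from the reference still counts as correct when it returns the same rows.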
Performance by Difficulty
| Difficulty | Correct / Total | Accuracy |
|---|---|---|
| Simple | 51 / 148 | 34.5% |
| Moderate | 47 / 250 | 18.8% |
| Challenging | 8 / 102 | 7.8% |
2. Fine-Tuned Model (QLoRA)
Overall Performance
| Metric | Value |
|---|---|
| Accuracy | 18.3% |
| Correct | 91 |
| Wrong | 171 |
| Execution Error | 236 |
Performance by Difficulty
| Difficulty | Correct / Total | Accuracy |
|---|---|---|
| Simple | 57 / 148 | 38.5% |
| Moderate | 26 / 250 | 10.4% |
| Challenging | 8 / 102 | 7.8% |
3. Head-to-Head Comparison
| Metric | Base Model | Fine-Tuned (QLoRA) | Difference |
|---|---|---|---|
| Overall Accuracy | 21.3% | 18.3% | -3.0% |
| Simple | 34.5% | 38.5% | +4.0% |
| Moderate | 18.8% | 10.4% | -8.4% |
| Challenging | 7.8% | 7.8% | 0.0% |
| Execution Error | 240 | 236 | -4 |
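The percentages in these tables can be recomputed from the raw counts, as in the short check below. One detail stands out: the three outcome counts sum to 498 for each model, while the per-difficulty totals sum to 500. The reported overall accuracies (21.3% and 18.3%) match a denominator of 498, so that is what the sketch uses. The differences are in percentage points.

```python
# Counts copied from the tables above.
RESULTS = {
    "base": {"correct": 106, "wrong": 152, "execution_error": 240},
    "fine_tuned": {"correct": 91, "wrong": 171, "execution_error": 236},
}
BY_DIFFICULTY = {
    "Simple": {"total": 148, "base": 51, "fine_tuned": 57},
    "Moderate": {"total": 250, "base": 47, "fine_tuned": 26},
    "Challenging": {"total": 102, "base": 8, "fine_tuned": 8},
}


def pct(correct, total):
    """Accuracy as a percentage rounded to one decimal place."""
    return round(100.0 * correct / total, 1)


overall = {name: pct(c["correct"], sum(c.values())) for name, c in RESULTS.items()}
print(overall)  # {'base': 21.3, 'fine_tuned': 18.3}

for level, row in BY_DIFFICULTY.items():
    base, tuned = pct(row["base"], row["total"]), pct(row["fine_tuned"], row["total"])
    print(level, base, tuned, round(tuned - base, 1))
```

The per-difficulty lines reproduce the table: the fine-tuned model gains 4.0 points on Simple questions, loses 8.4 on Moderate, and is unchanged on Challenging.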
