My LoRA Fine-Tuned AI-generated Detector

This is a e5-small model fine-tuned with LoRA for sequence classification tasks. It is optimized to classify text into AI-generated or human-written with high accuracy.

Label_0: Represents human-written content.
Label_1: Represents AI-generated content.

Model Details

Base Model: intfloat/e5-small
Fine-Tuning Technique: LoRA (Low-Rank Adaptation)
Task: Sequence Classification
Use Cases: Text classification for AI-generated detection.
Hyperparameters:
- Learning rate: 5e-5
- Epochs: 3
- LoRA rank: 8
- LoRA alpha: 16

Training Details

Dataset:
- 10,000 twitters and 10,000 rewritten twitters with GPT-4o-mini.
- 80,000 human-written text from RAID-train.
- 128,000 AI-generated text from RAID-train.
Hardware: Fine-tuned on a single NVIDIA A100 GPU.
Training Time: Approximately 2 hours.
Evaluation Metrics:

Metric (Raw) E5-small Fine-tuned

Accuracy 65.2% 89.0%

F1 Score 0.653 0.887

AUC 0.697 0.976

Metric	(Raw) E5-small	Fine-tuned
Accuracy	65.2%	89.0%
F1 Score	0.653	0.887
AUC	0.697	0.976

Collaborators

Menglin Zhou
Jiaping Liu
Xiaotian Zhan

Citation

If you use this model, please cite the RAID dataset as follows:

@inproceedings{dugan-etal-2024-raid,
 title = "{RAID}: A Shared Benchmark for Robust Evaluation of Machine-Generated Text Detectors",
 author = "Dugan, Liam and
 Hwang, Alyssa and
 Trhl{\'\i}k, Filip and
 Zhu, Andrew and
 Ludan, Josh Magnus and
 Xu, Hainiu and
 Ippolito, Daphne and
 Callison-Burch, Chris",
 booktitle = "Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
 month = aug,
 year = "2024",
 address = "Bangkok, Thailand",
 publisher = "Association for Computational Linguistics",
 url = "https://aclanthology.org/2024.acl-long.674",
 pages = "12463--12492",
}

Downloads last month: 409

Safetensors

Model size

33.4M params

Tensor type

F32

Model tree for MayZhou/e5-small-lora-ai-generated-detector

Base model

intfloat/e5-small

Finetuned

(5)

this model

Quantizations

3 models

Dataset used to train MayZhou/e5-small-lora-ai-generated-detector

Space using MayZhou/e5-small-lora-ai-generated-detector 1

Evaluation results

accuracy on RAID-test
RAID Benchmark Leaderboard
0.939

URL: https://huggingface.co/MayZhou/e5-small-lora-ai-generated-detector

⇱ MayZhou/e5-small-lora-ai-generated-detector · Hugging Face