URL: https://huggingface.co/MayZhou/e5-small-lora-ai-generated-detector

My LoRA Fine-Tuned AI-generated Detector

This is an e5-small model fine-tuned with LoRA for sequence classification. It classifies text as either AI-generated or human-written. In the evaluation reported under Training Details, it reaches 89.0% accuracy, compared with 65.2% for the raw e5-small model.

  • Label_0: Represents human-written content.
  • Label_1: Represents AI-generated content.
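
The label mapping above can be applied to model outputs roughly as follows. This is a minimal sketch, not the authors' inference code: it assumes the repository loads directly with `AutoModelForSequenceClassification` (that is, the LoRA weights are merged into the checkpoint), and the helper names `softmax`, `describe`, and `classify` are hypothetical. The card does not say whether inputs need an e5-style prefix such as "query: ", so none is added here.

```python
import math

MODEL_ID = "MayZhou/e5-small-lora-ai-generated-detector"

# Label meanings as stated on this card.
LABELS = {0: "human-written", 1: "AI-generated"}


def softmax(logits):
    """Convert a list of raw scores into probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def describe(logits):
    """Map one row of two logits to (label name, probability)."""
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[idx], probs[idx]


def classify(texts, model_id=MODEL_ID, max_length=512):
    """Score a list of texts; downloads the checkpoint on first call."""
    # Imported lazily so the helpers above work without torch/transformers.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    # Assumption: the repo holds a full classification checkpoint.
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    batch = tokenizer(
        texts,
        padding=True,
        truncation=True,
        max_length=max_length,
        return_tensors="pt",
    )
    with torch.no_grad():
        logits = model(**batch).logits
    return [describe(row) for row in logits.tolist()]
```

Calling `classify(["some text to check"])` would return a list of `(label, probability)` pairs. If the repository turns out to host only a LoRA adapter, the adapter would need to be loaded onto intfloat/e5-small with the PEFT library instead.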

Model Details

  • Base Model: intfloat/e5-small
  • Fine-Tuning Technique: LoRA (Low-Rank Adaptation)
  • Task: Sequence Classification
  • Use Cases: Detecting AI-generated text.
  • Hyperparameters:
    • Learning rate: 5e-5
    • Epochs: 3
    • LoRA rank: 8
    • LoRA alpha: 16
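
To make the LoRA rank and alpha values concrete: LoRA keeps a weight matrix W frozen and learns a low-rank update scaled by alpha / rank. The sketch below works out that scaling factor and the number of trainable parameters added for one square projection matrix. The hidden size of 384 is an assumption about e5-small, the function name `lora_stats` is hypothetical, and the card does not state which layers were adapted.

```python
def lora_stats(d_in, d_out, rank=8, alpha=16):
    """Summarize the LoRA update for a single d_out x d_in weight matrix.

    LoRA learns B (d_out x rank) and A (rank x d_in) and applies
    W + (alpha / rank) * B @ A, leaving W itself frozen.
    """
    full_params = d_in * d_out            # parameters in the frozen matrix
    lora_params = rank * (d_in + d_out)   # parameters in A and B together
    return {
        "scaling": alpha / rank,
        "full_params": full_params,
        "lora_params": lora_params,
        "fraction": lora_params / full_params,
    }


# Assumed hidden size for e5-small; rank and alpha are taken from this card.
stats = lora_stats(d_in=384, d_out=384, rank=8, alpha=16)
print(stats)
```

With rank 8 and alpha 16 the update is scaled by 2, and under the assumed dimensions each adapted matrix trains about 4% as many parameters as the full matrix holds.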

Training Details

  • Dataset:
    • 10,000 tweets and 10,000 tweets rewritten with GPT-4o-mini.
    • 80,000 human-written texts from RAID-train.
    • 128,000 AI-generated texts from RAID-train.
  • Hardware: Fine-tuned on a single NVIDIA A100 GPU.
  • Training Time: Approximately 2 hours.
  • Evaluation Metrics:
    Metric      (Raw) E5-small   Fine-tuned
    Accuracy    65.2%            89.0%
    F1 Score    0.653            0.887
    AUC         0.697            0.976
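
For readers who want to compute the same three metrics on their own labeled data, here is a small dependency-free sketch. It is not the authors' evaluation script: the card does not specify the test set, the decision threshold, or the F1 averaging mode, so the 0.5 threshold and binary F1 on the AI-generated class below are assumptions, and the function names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_binary(y_true, y_pred, positive=1):
    """Binary F1 for the positive class (here: AI-generated)."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores, positive=1):
    """AUC as the probability that a positive outranks a negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: 0 = human-written, 1 = AI-generated; scores are P(AI-generated).
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(accuracy(y_true, y_pred), f1_binary(y_true, y_pred), roc_auc(y_true, scores))
```

The pairwise AUC is quadratic in the number of examples, which is fine for a quick check; a library routine would be a better fit for large evaluation sets.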

Collaborators

  • Menglin Zhou
  • Jiaping Liu
  • Xiaotian Zhan

Citation

If you use this model, please cite the RAID dataset as follows:

@inproceedings{dugan-etal-2024-raid,
 title = "{RAID}: A Shared Benchmark for Robust Evaluation of Machine-Generated Text Detectors",
 author = "Dugan, Liam and
 Hwang, Alyssa and
 Trhl{\'\i}k, Filip and
 Zhu, Andrew and
 Ludan, Josh Magnus and
 Xu, Hainiu and
 Ippolito, Daphne and
 Callison-Burch, Chris",
 booktitle = "Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
 month = aug,
 year = "2024",
 address = "Bangkok, Thailand",
 publisher = "Association for Computational Linguistics",
 url = "https://aclanthology.org/2024.acl-long.674",
 pages = "12463--12492",
}

Model size: 33.4M params (Safetensors, tensor type F32)