doc dict | metric dict | model_response dict |
|---|---|---|
{
"choices": [
" Jared types at 52 WPM and increases it by 5 WPM, 52 + 5 = <<52+5=57>>57 WPM.\nHis average over all his measured words-per-minute is 47 + 52 + 57 = <<47+52+57=156>>156.\nHis total is 156 / 3 typing speeds = <<156/3=52>>52 WPM as Jared's average typing speed.\n#### 52"
],
"fewshot_samples": [
... | {
"extractive_match": 0
} | {
"argmax_logits_eq_gold": [],
"input": "Question: Mr. Bodhi is transporting some animals using a yacht across a river. He has 20 cows, 15 foxes and three times as many zebras as foxes. To balance the yacht to ensure a smooth sail across the river, the total number of animals in the yacht needs to be 100. If he dec... |
{
"choices": [
" Jordan has 2 children, each requiring 5 diaper changes per day, for a total of 5*2=<<2*5=10>>10 diaper changes per day.\nJordan's wife changes half of the diapers, leaving Jordan to change the other half, for a total of 10/2=<<10/2=5>>5 diaper changes per day that Jordan is responsible for.\n####... | {
"extractive_match": 1
} | {
"argmax_logits_eq_gold": [],
"input": "Question: The owner of a company needs to install a transformer to power his electrical equipment. He operates three units of machinery that require a running current of 40A each. These appliances require at least twice their running current for starting, and the transformer... |
{
"choices": [
" The boxes can weigh up to 5000 pounds - 3755 pounds = <<5000-3755=1245>>1245 pounds in total.\nThere can be 1245 / 15 = <<1245/15=83>>83 boxes loaded onto the truck without exceeding the bridge's weight limit.\n#### 83"
],
"fewshot_samples": [
{
"choices": [
" After 56 minut... | {
"extractive_match": 1
} | {
"argmax_logits_eq_gold": [],
"input": "Question: Two runners are competing in a 10-mile race. The first runs at an average pace of 8 minutes per mile, while the second runs at an average pace of 7 minutes per mile. After 56 minutes, the second runner stops for a drink of water. For how many minutes could the s... |
{
"choices": [
" If Tim uses 3 blue shoeboxes, he remains with 7-3=<<7-3=4>>4 blue shoe boxes in the box.\nTim also uses 1/3 of the red shoeboxes, a total of 1/3*9=<<1/3*9=3>>3 red shoe boxes.\nTim remains with 9-3=<<9-3=6>>6 red shoeboxes.\nThe total number of red and blue shoe boxes left in Tim's box is 6+4=<<6... | {
"extractive_match": 0
} | {
"argmax_logits_eq_gold": [],
"input": "Question: In Rodrigo's classroom in the morning there are red chairs, yellow chairs, and blue chairs. There are 4 red chairs. There are 2 times as many yellow chairs as red chairs, and there are 2 fewer blue chairs than yellow chairs. In the afternoon, Lisa borrows 3 chairs.... |
{
"choices": [
" If the number of helmets was 20, there were 2 footballs/helmet * 20 helmets = <<20*2=40>>40 footballs\nThere were half as many robots as helmets, so in total, there were 1 robot/2 helmets * 20 helmets = <<1/2*20=10>>10 robots.\nThe total number of items in the changing room is 10 robots + 40 foot... | {
"extractive_match": 0
} | {
"argmax_logits_eq_gold": [],
"input": "Question: Kris has been suspended for bullying many times. For every instance of bullying, she was suspended for 3 days. If she has been suspended for three times as many days as a typical person has fingers and toes, how many instances of bullying is she responsible for?\... |
{
"choices": [
" There are 4 x 18 = <<4*18=72>>72 seats in the school auditorium.\n72 x1/4 = <<72*1/4=18>>18 seats were occupied by the administrators.\nThere were 72 - 18 = <<72-18=54>>54 seats not occupied by the administrators.\n54 x 1/3 = <<54*1/3=18>>18 seats were occupied by the parents.\nSo, 54 - 18 = <<54... | {
"extractive_match": 1
} | {
"argmax_logits_eq_gold": [],
"input": "Question: Daniel practices basketball for 15 minutes each day during the school week. He practices twice as long each day on the weekend. How many minutes does he practice during a whole week?\nAnswer: Daniel practices for a total of 15 x 5 = <<15*5=75>>75 minutes during the... |
{
"choices": [
" The dogs have 5 dogs * 4 legs/dog = <<5*4=20>>20 legs.\nThe cats have 2 cats * 4 legs/cat = <<2*4=8>>8 legs.\nThe birds have 10 birds * 2 legs/bird = <<10*2=20>>20 legs.\nThe pets have 20 legs + 8 legs + 20 legs = <<20+8+20=48>>48 legs.\n#### 48"
],
"fewshot_samples": [
{
"choices":... | {
"extractive_match": 1
} | {
"argmax_logits_eq_gold": [],
"input": "Question: Jeff wanted to rent an apartment to live in for the next 5 years until he finishes his studies. He found a nice apartment next to his school, the owner asks Jeff to pay $300 each month, Jeff agreed, and for the first 3 years everything went fine, but then the owner... |
{
"choices": [
" At half the speed, the car in the slow lane travels at a speed of 1/2*60 = <<1/2*60=30>>30 miles per hour.\nTo travel 480 miles, the car in the slow lane traveling at 30 miles per hour will take 480/30 = <<480/30=16>>16 hours\n#### 16"
],
"fewshot_samples": [
{
"choices": [
... | {
"extractive_match": 0
} | {
"argmax_logits_eq_gold": [],
"input": "Question: The country of Belize had a 5 lane bridge that had 2000 vehicles passing through it every month. They decided to build a new bridge that had twice the capacity of the old one, and the number of vehicles passing through the new one increased by 60% more than the old... |
{
"choices": [
" The price of the two clothing items totals 35 + 25 = <<35+25=60>>60 dollars before discount.\nThe total sale discount comes to 60 * 0.3 = <<60*0.3=18>>18 dollars.\nThe final sale price comes to 60 - 18 = <<60-18=42>>42 dollars.\nJoe ends up with 50 - 42 = <<50-42=8>>8 dollars change.\n#### 8"
]... | {
"extractive_match": 0
} | {
"argmax_logits_eq_gold": [],
"input": "Question: Michael has a lot of matchsticks at home and decides to use them to make miniature models. As he is about to start crafting, he has second thoughts and decides to only use half of his pile of matchsticks. He creates 30 matchsticks houses. If he has now used all of ... |
{
"choices": [
" He drank 8*2=<<8*2=16>>16 ounces of milk\nSo he drank 16*3=<<16*3=48>>48 calories of milk\n#### 48"
],
"fewshot_samples": [
{
"choices": [
" Lizard eats 12 / 2 = <<12/2=6>>6 bugs.\nFrog eats 6 x 3 = <<6*3=18>>18 bugs.\nToad eats 18 x 1.50 = <<18*1.50=27>>27 bugs.\nThe lizard... | {
"extractive_match": 1
} | {
"argmax_logits_eq_gold": [],
"input": "Question: In the garden, the gecko eats 12 bugs. The lizard eats half as many bugs as the gecko. The frog eats 3 times as many bugs as the lizard. The toad eats 50% more bugs than the frog. How many bugs are eaten in total by all of them?\nAnswer: Lizard eats 12 / 2 = <<1... |
{
"choices": [
" Johnny invested twenty and ten dollars for a total of 20+10=<<20+10=30>>30 dollars invested\nJohnny tripled this amount of money over the year, for a total of 30*=90 dollars\n#### 90"
],
"fewshot_samples": [
{
"choices": [
" Elaine was in the pool for 2*3=<<2*3=6>>6 minutes.... | {
"extractive_match": 0
} | {
"argmax_logits_eq_gold": [],
"input": "Question: At the polar bear club, Jerry and his three friends, Elaine, George, and Kramer, took turns jumping into a swimming pool filled with freezing-cold water. Jerry was in the pool for 3 minutes before becoming too cold to remain in the cold water. Elaine stayed in th... |
{
"choices": [
" It was 3 inches tall after the first week.\nAfter the second week, it was 2 * 3 inches = <<2*3=6>>6 inches tall.\nAfter the third week, it was 6 inches + 4 inches = <<6+4=10>>10 inches tall.\n#### 10"
],
"fewshot_samples": [
{
"choices": [
" The second drink was 2/12=1/6 the... | {
"extractive_match": 1
} | {
"argmax_logits_eq_gold": [],
"input": "Question: John drinks 2 energy drinks. 1 of them is a 12-ounce drink with 250 grams of caffeine. The second drink is 3 times more caffeinated per ounce but only 2 ounces. He then takes a caffeine pill that has as much caffeine as his 2 drinks combined. How much caffeine d... |
{
"choices": [
" Christina had 30 - 12 - 8 = <<30-12-8=10>>10 neutral days in the month.\nAfter the first three 8-day periods, she had 30 - 8 * 3 = 30 - 24 = <<30-8*3=6>>6 days of moods left.\nOf those days, 12 - 8 = <<12-8=4>>4 days were good.\nAfter the next 3 days with 2 good days, Christina had 4 - 2 = <<4-2=... | {
"extractive_match": 0
} | {
"argmax_logits_eq_gold": [],
"input": "Question: A family bought a 24 pack of bottled water. They drank 1/3 of them on the first day and 1/2 of what was left after the first day on the second day. How many bottles of water remain after 2 days?\nAnswer: After 1 day, the family drank 24 bottles * 1/3 = <<24*1/3=8>>... |
{
"choices": [
" Valerie's brother earns 2 * $5000 = $<<2*5000=10000>>10000\nTheir combined salary is $5000+$10000 = $<<5000+10000=15000>>15000\nTheir mother earn twice their combined earnings, which is 2*$15000 = $<<2*15000=30000>>30000\nThe total amount of money they all have is $15000+$30000 = $<<15000+30000=4... | {
"extractive_match": 0
} | {
"argmax_logits_eq_gold": [],
"input": "Question: Maryann spends seven times as long doing accounting as calling clients. If she worked 560 minutes today, how many minutes did she spend calling clients?\nAnswer: Let a be the amount of time Maryann spends doing accounting and c be the amount of time she spends call... |
Dataset Card for Evaluation run of HuggingFaceTB/SmolLM2-1.7B
Dataset automatically created during the evaluation run of model HuggingFaceTB/SmolLM2-1.7B.
The dataset is composed of 1 configuration, which corresponds to the evaluated task.
The dataset has been created from 1 run(s). Each run can be found as a specific split in each configuration, the split being named using the timestamp of the run. The "train" split always points to the latest results.
An additional configuration, "results", stores all the aggregated results of the run.
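Because runs are identified by their timestamps, picking out the most recent one is a matter of comparing those timestamps. The following is a minimal sketch, not part of the evaluation tooling: the helper name `latest_run` and the second timestamp are hypothetical, and it assumes the timestamps are ISO-8601 strings like the one reported in this card.

```python
from datetime import datetime


def latest_run(timestamps):
    """Return the most recent run timestamp from an iterable of ISO-8601 strings.

    Hypothetical helper for illustration; it only compares timestamp strings
    and does not query the dataset repository.
    """
    return max(timestamps, key=datetime.fromisoformat)


runs = [
    "2025-06-25T09:38:36.715958",  # the run reported in this card
    "2025-06-24T18:02:11.000001",  # hypothetical earlier run, for illustration only
]

print(latest_run(runs))
```

Parsing with `datetime.fromisoformat` rather than comparing raw strings keeps the comparison correct even if timestamps differ in fractional-second precision.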
To load the details from a run, you can for instance do the following:
from datasets import load_dataset

data = load_dataset(
    "SaylorTwift/details_HuggingFaceTB__SmolLM2-1.7B",
    "results",
    split="train",
)
Latest results
These are the latest results from run 2025-06-25T09:38:36.715958 (note that there might be results for other tasks in the repo if successive evals didn't cover the same tasks; you can find each in the "results" configuration and in the "latest" split for each eval):
{
"all": {
"extractive_match": 0.34,
"extractive_match_stderr": 0.04760952285695235
},
"lighteval|gsm8k|5": {
"extractive_match": 0.34,
"extractive_match_stderr": 0.04760952285695235
}
}
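The reported standard error can be related back to the per-sample `extractive_match` values (each 0 or 1). The sketch below is an illustration, not the evaluation code itself: it assumes the run scored 100 samples, of which 34 matched, and that the standard error is the sample standard deviation (with the n - 1 correction) divided by the square root of n. Neither the sample count nor the formula is stated in this card; they are assumptions chosen because they reproduce the reported figures.

```python
import math
import statistics


def mean_and_stderr(scores):
    """Return the mean and standard error of the mean for a list of 0/1 scores.

    Assumption: stderr = sample standard deviation (n - 1 denominator) / sqrt(n).
    """
    mean = statistics.fmean(scores)
    stderr = statistics.stdev(scores) / math.sqrt(len(scores))
    return mean, stderr


# Assumed per-sample results: 34 matches out of 100 samples (not stated in the card).
scores = [1] * 34 + [0] * 66

mean, stderr = mean_and_stderr(scores)
print(mean, stderr)
```

Under these assumptions the mean is 0.34 and the standard error is about 0.0476, in line with the values above. With a sample this small, the accuracy estimate carries an uncertainty of several percentage points.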
Dataset Details
Dataset Description
- Curated by: [More Information Needed]
- Funded by [optional]: [More Information Needed]
- Shared by [optional]: [More Information Needed]
- Language(s) (NLP): [More Information Needed]
- License: [More Information Needed]
Dataset Sources [optional]
- Repository: [More Information Needed]
- Paper [optional]: [More Information Needed]
- Demo [optional]: [More Information Needed]
Uses
Direct Use
[More Information Needed]
Out-of-Scope Use
[More Information Needed]
Dataset Structure
[More Information Needed]
Dataset Creation
Curation Rationale
[More Information Needed]
Source Data
Data Collection and Processing
[More Information Needed]
Who are the source data producers?
[More Information Needed]
Annotations [optional]
Annotation process
[More Information Needed]
Who are the annotators?
[More Information Needed]
Personal and Sensitive Information
[More Information Needed]
Bias, Risks, and Limitations
[More Information Needed]
Recommendations
Users should be made aware of the risks, biases and limitations of the dataset. More information needed for further recommendations.
Citation [optional]
BibTeX:
[More Information Needed]
APA:
[More Information Needed]
Glossary [optional]
[More Information Needed]
More Information [optional]
[More Information Needed]
Dataset Card Authors [optional]
[More Information Needed]
Dataset Card Contact
[More Information Needed]
