URL: https://huggingface.co/datasets/InstaDeepAI/nucleotide_transformer_downstream_tasks

InstaDeepAI/nucleotide_transformer_downstream_tasks · Datasets at Hugging Face


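Columns:
sequence: string, length 199 to 600
name: string, length 5 to 30
label: int32, values 0 to 2
task: string, 18 classes

Each row below lists these four fields in order, one per line. Sequences ending in "..." are truncated by the preview.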
TCACTTCGATTATTGAGGCAGTCTTCATTAAAGTTTATTACAATGGATATGGTATCACCAGTCTTGAACCTACAATCATCTATTTTAGGTGAGCTCGTAGGCATTATTGGAAAAGTGTTCTTTCTCTTAATAGAAGAGATTAAATACCCGATAATCACACCCAAAATTATTGTGGATGCCCAGATATCTTCTTGGTCATTGTTTTTTTTCGCTTCAATCTGTAATCTCTCTGCAAAATTTCGGGAGCCAATAGTGACAACATCGTCAATAATAAGTTTGATGGAATCGGAAAAAGATCTTAAAAATGTAAATGAGTATTT...
YBR063C_YBR063C_367930|0
0
H3
CAGTAGTGGCATAAACCCAAGGAACAGAGCCAGTGGTACTCCATCCAATGAGCGGGCTCGTCCGGCGTCGGGTATCAGTTCGTTTTTGAATACCTTCGGAATTAGGCAAAATAGCCAGACAGCTTCTTCTTCTGCGGCTCCTGATCAGCGTCTATTCGGCACAACCCCATCTAACTCACATATGAGTGTGGCCATGGAAAGTATCGATACCGCTCCGCAACAGCAGGAACCACGTCTGCATCATCCTATACAAATGCCTCTGTCGGCCCAGTTCCACGTTCATCGCAACTATCAACTCCCCATCTCCATATCTCTCACTG...
YNL116W_YNL116W_408685|1
1
H3
TTTCCGATAAGCTTCAGCCCCGGCAACGCTAAAAATAGTATCATTCGCACCCCATGCAACTAGCACAGGAATCTTTGAATCTCTCAAGAACTTTTGAAAGGCTGGGTACAACTTTATATTATTCTGATAATCGAAAAATAATCTCAATTGAATATCGGTCTGGCCGGTACGCTGAATTAGCGCAATATCAAGCGTATAAGCGGCGGGATCAACAGATTCGATAGCTGGTACTCCATCATGGTACTGGCATATGACATTAGCTGGATCTTCAAGGTACGGGATAAGGGATTTAACAAATACCGGATCACTTTGATATGACT...
YNR064C_YNR064C_749488|1
1
H3
CCGTTTGGAGTAATGAGCGGTTAAACTTGTTCTTGGTGAAGTGTGCAGAGTTTCTAAACTTTAAGTAAATGACACTAAGCCTATATTTTTCGCATTGCTAAAGAAACACTGATACATAATGTGTCTACATTTTATATAGCTGATAGCACGTTCTTTTTAAGTTGAATTATCTTTTTTTTCTCTTGTTAAATAAGATGGACAACGCGCTTTCGCTGGCGAGATGAAGCGCCCCTGATTGACAAGCAACTCGTGGAGGCAAGTAATTAGCGACGCTTTACTTTGTCTGATAGCAGTAAATATGTGACTGCATTTAAGAAGTG...
iYDR005C_458375|0
0
H3
TTTTTATTTAGTCGACTATAAAGGTGGAAGTCCATACTTAAGAGATATTAAGGGTATTTTGATCAACAAGTAAGTAACAATCGTTATAAAAATACAATAGCAAAAGTATGAGCGGAGAAAATCGTGCTGTGGTGCCGATTGAATCAAACCCTGAAGTTTTTACAAATTTTGCACATAAATTAGGTTTAAAAAATGAATGGGCGTATTTCGATATCTATAGCTTAACAGAGCCAGAGTTACTAGCATTCTTACCAAGGCCAGTGAAGGCCATTGTGCTGCTATTTCCGATAAACGAGGATAGAAAATCGAGTACCAGTCAA...
iYJR098C_615412|1
1
H3
CCAAAGTCTTTATACTCGCTCGTTGATGTACTGGGCGTTATCAAAACTGTTTGCCACTGGGTAGTTGCCTTATGGTCACTATCTGCAAGTTAAATTCAATCGCAAGAGGAAGAGAAAATAAACCGTCGCATCACTTTTCTACACTTACCCGGTTATTAATTTTATAACGTTTATATGATATATCTTTTCTTATTTATTATATTATGAATCGTGAAAACGGATTAAGCTATGCTTACAATTTGGTTTCCTCTAGTTTCGTGGCAATCTCGTTGGTCTTAGTAGCGCCTGGGTACTTAGTACCCTTATTCTTTGGTTTCTTA...
YGR185C_TYS1_866355|1
1
H3
CCAAGTTTTGGACCACCAAAGAGCCATCCCATTCAGAAGATTTAACTCCTCTATTGGTAGAACTGCCCAAGGTAAGGAATTTGGTGTCACCAAGGCTAGATGGCCAGCTAAATCTGTTAAGTTCGTTCAAGGTTTGTTGCAAAACGCCGCTGCCAATGCTGAAGTATGTTTAAAATGAAAATATGAAAAATAAGATTGAAAAATAATAATGATGTGTTGACGGGAGATGATAAAGTATTAGAATATGTTCATATGTGTTGCATCCTATTTTCTGCATGAATGCACGATTCGAAAGAGCTAAAATTAACAGTTTTCCAAAT...
YJL177W_RPL17B_91178|0
0
H3
CTAATTTCCGGGACTCTTTGTTGAGCGATGATTGCGACTGTTGCACAAATGCATTTAATTGTTTAGATTGTTTTGCGTAATCATTGGCTATCACTTCAGACAGCTTGTCTATTTGAGATAGTGTCGATAATTTGTCTAATAACAAGGTGTACCTTGCTAGATTTTTAGACAATTGGCCCTCGATGATCGAGCTCTCCCATGATGCATTCGATTTTTGCCCTTCCATCAAAACTTCACCCTTTTAGAGCCTGACAATACAGTATCGTTAGTTTGCGCTACTTCGCTGTCCCAACGAATTTGGTTTGTTTTCTATGTTCTAT...
iYPR105C_739589|1
1
H3
ACTATTGTAGCAAATAACTTGTTTCAGAGGGGAGAAAAAGTAGCCGTGGGGGCCTCTGGTGGTAAAGACTCCACGGTGTTAGCGCATATGCTGAAGCTTCTGAATGACCGTTACGATTACGGCATTGAGATTGTACTGTTGAGTATTGACGAAGGAATCATCGGATACAGAGATGATTCCTTGGCCACGGTGAAAAGGAATCAGCAGCAGTATGGCCTTCCTTTGGAGATTTTTTCCTTTAAGGACCTCTATGATTGGACAATGGATGAGATTGTATCCGTTGCTGGTATAAGGAACAGTTGCACGTATTGTGGTGTTTT...
YGL211W_NCS6_92932|1
1
H3
TCCTTTCCACTAGCATTTGGTCTCAAGACCTCATTTGGGTTTATGCACTATGCCAAGGCCCCTGCCATTAATTTACGCCCCAAGGAATCCTTGCTGCCGGAAATGAGTGATGGTGTGCTGGCCTTGGTTGCGCCGGTTGTTGCCTACTGGGCGTTGTCTGGTATATTCCATGTAATAGACACTTTCCATCTGGCTGAGAAGTACAGAATTCATCCGAGCGAAGAGGTTGCCAAGAGGAACAAGGCGTCGAGAATGCATGTTTTCCTTGAAGTGATTCTACAACATATCATACAGACCATTGTTGGCCTTATCTTTATGCA...
YDR297W_SUR2_1056824|1
1
H3
ATTTGTTTTCAATAGGATCTGTTGATTCATATAACTGTTCTAGAAGGTTCAGTGGGAGAGTCATATTATCAATACCGGCTAACGCCTTCAATTCATCTAAATTCCTGAAAGAAGCCGCCATTACCTCAGTTGCATAACCATGCCTCTTATAGTAACTGTATATCTTCTTAACAGAAAGAACACCGGGATCAGTTTCTGCAGTATAGTCTTTGCCTGAAAGGGCCTTGTAAAAGTCCATAATCCTTCCAACAAATGGGGAGATCAATGTGACATTTGCCTCCGCACAGGCTACTGCTTGCGTAAAGGAAAACAGTAATGTC...
YGR043C_YGR043C_580864|1
1
H3
TTCTGAACCTGCTGCAACAACCACAGATGGATCATCTTCTATGTCAGAGGAGCGTGTAGGTACTGAGCAAACTGCAGCTTCCGTTCAGGACAACGGAACTGCAAATAACATTCAAAATGGTCTTGGTGAAGAAGGAAACGCAACACGATCAAAAACTTCAAACGAACACAATGAAAACCAACAACCATCTCAACCATCAGAACGTGTTATACTTCCTGAAAGAAGTGATGAGAAAAAAAAATACACTCTACTTGCAAAAGTAACAGGATTGGAACGATTTGGATCTGCAACCGGTAAGAAAGAGAATCCGACTATTATAT...
YOR132W_VPS17_573498|1
1
H3
AGCAGTTGTTTACTCGAATTTTGCAATCAACCCTAATTTTTGAAGCCTGGTTTTAGATTTATTCTCTTCTTTTTCTTCTTGTGAACTTCAATTACTAATGTAACTTAATTTTTAATATAACTTTTACAGTTTAATAATATTGATTTTTTTCGGTCTGGACCAATCGCGCCGCATTTCTCACTAATATTACTAACATACCCTCTTCTCATTGGCTCGGTACCCCTTTCGTGACCCGCATTTTTTGTTTTCTTTGTTAGCCCGAATGTCTCACAATGAAGATGTAAAATTAAGATTATATATGAAAAATTGATACAAAACAA...
iYLR219W_576585|0
0
H3
TTCATTTTTACAACAGTATTCCAAACGAGCCGTGTATGCAAAAAGAATGAGGTCAAATCAAAAGATCAAGTACCAAGCCAGCTGATTCTCTTCTTACTAAGTTTGACTATCGTTTACATTTCTTGCTTGTTGTTTCATCAAACAATGTACTCTTTTCTTGTTTTAAATGATTTTTTAGCGGCGAAGGTAACAGCAGCAAAAAAAAAAAAGAAAAAACATTGCAAAATAGCTAATGGAAATGGATTTATTACTGAAATATGTAGAAATATTTAACTCGATAGATTTGATCATACACATATAATGCAAAATATATATATATA...
iYBR057C_353497|0
0
H3
AAATGCGTTGATATTGTGAAACAATCCTTACCAAGACAGCGCCAAAAGCTATCCTCTTTAAAGAATTAAACTTGATTTGTTCCAGGAACTGGCTCATTAAAATCAATGCTGGAGTTATTACTAAATGGTACTGGTCCGAGGTGGAGAAAAGAATACCAATAATGGAAAAAAATACCAGATCACCATTTGATAGAGCATCAAAATGGTTCTTTTTATAGCGTGCTTGCATTTCATTGATATAATCTCTACACTCCTCGGAAAGCTCTCTATTATATTTCTCTGAAAGGGATTTTAGAATTGAGATCAATGCATTTTGTGTA...
YDL148C_NOP14_189168|0
0
H3
CGCTATCAAGCCCAACTTGATGCAAACTTTAGAAGGTACTCCTGTCTTGGTCCATGCCGGCCCATTTGCCAACATCTCTATCGGTGCCTCTTCTGTTATTGCTGATCGCGTGGCTTTGAAATTGGTTGGTACCGAGCCAGAGGCAAAAACAGAAGCTGGTTATGTGGTTACTGAAGCAGGGTTCGATTTCACTATGGGTGGTGAAAGATTCTTCAACATCAAGTGCCGTTCCTCTGGATTGACACCTAATGCTGTGGTCTTGGTTGCTACTGTTAGGGCATTGAAGTCACACGGTGGTGCTCCAGATGTCAAACCTGGCC...
YGR204W_ADE3_908050|1
1
H3
AAAGGACACTAAAATAAAACAGAACCGGAATGTGTTGAATAAAACTGAAGTGAGCTACTATATCTTGAATCACAATATCTTAGCTCCGTGATAGAGCATAGGGCACCTCTATTGATCTCTTCGTGATGCCCGGTTATATACACACGTAACCCCGTAAAAGCCGTTACCCGGTTTATGTCTTTCTGTGAGCACAGTAAGCTAGGAAGGTCGATGGAGAAGAAGACTCCTGTCTCATATGGAACAACCTTGCAAATTGAGTGCAATTATCCCCTTTGGAGTTTGAATTTTGTGTTAGTACATTCTCCTTTTTTAACTTGTCT...
iYIR002C_360653|1
1
H3
ACCCAGAAGAAGTGAAAAGAAAAAGGAAATTAATAAATGATGTCGATGCAGCCCAAAAAAAACTAAGTGAAAGAAAAAAGCATAACAGTTGGGTGCCGAAGTGGTTGAAACCGAAGAAATCCAAATGGAAGGTCATGGTCGAAGAAGCTGTCGAAGAAGGAAGAGATATGCAAGACCTACCAGAAAATGACGTCAATAACAATGAAAACGAAAATCCAGATGAACATGAAGGGATAGCAAGGCAAAAACGCAGAGATGCTGCTCTTGTCGATCATGGGGCATTAATGCATGAGTTACAACTTATAAAACAAGCGATGCAC...
YFL034W_YFL034W_68389|1
1
H3
GGGGTTACGACAATGCTCCAGGTATCTGGTCCGAAGAACAAATTAAAGAATGGACCAAGATTTTCAAGGCTATTCATGAGAATAAATCGTTCGCATGGGTCCAATTATGGGTTCTAGGTTGGGCTGCTTTCCCAGACACCCTTGCTAGGGATGGTTTGCGTTACGACTCCGCTTCTGACAACGTGTATATGAATGCAGAACAAGAAGAAAAGGCTAAGAAGGCTAACAACCCACAACACAGTATAACAAAGGATGAAATTAAGCAATACGTCAAAGAATACGTCCAAGCTGCCAAAAACTCCATTGCTGCTGGTGCCGAT...
YHR179W_OYE2_462988|1
1
H3
TAGCAAATTAGGAAAACAACTTTAGCACGCCCAACTGCGTACTGAAACGCAGAAAACTATAGAGTTTCCCGAAACGGTGACAGCTGAGTTCGCTCAAATAAGAACAACACCATGTCTTGAAGCTTCTTCTATAGCGTAGAGTAAGTAGTGCCTTGTAGCCCTACCGTTTTACAATGCATGAGGTTACCCGCACTTATTATTTTTTTCTCTTTTTTTTTCTTAGCTATAAAAGGCAGATCAATGCAGCATTCATTGCTTTATTTGACTTTCCTTTACTATTTATTTACTTTCCATTTCTTATTTTAGTGTTATTTTATAAC...
iYGL089C_345913|0
0
H3
TCATGACTCAAGCTTGCGATATGTGTTGGTGTCATGCACAAGACGCCACAAATGATAGAACAGAAAAGAAAGTGAACTAATCTTCCAAGACGAAGAAAACCAAAATCCGGGATGAGTTGAAAGTCAAAAAGACTGTATATATAAATTTCAACTTTTGTAGAAGATGCAGAAAAAGAAAATGATATGGTATGCAGAAAAAGAAATAAACCGCTATTATCCTCGCGGTTTGTCATTATAACAGGCAATTACACTAGAGAAAGCCGCACACCTCCCTCCGTTTCTTTTGCCCTGCGAGTTTTTCCGGAAAAGAAAAAAAAAAC...
iYBR105C_452171|0
0
H3
ATTAAACGCACCACACTGTTGCCGTTTACTTTAAAGAGAAGCGCAAAGAGCGGCAATGGTGATAGCGAGTGCGCATACGCTCGCTTTCCAGGTCAAACTCATATTTTTGATTGGTAATAGCTGCGAAGTTGTTAGTTTATTAGAAATTGCAGCATTCTTACTCTTCCTGGAAGTCGTAGTGGAGGACAAAACCGATGACGTTGTTCTTGGAAATGCCTGTGCTGATAAAGAGGATGAGAATTTTTCTTCGATAGATCCATCTGCTTTGAAGTACGTGTAGTTGTCGAAAGGGGAAGGGAAAAGGTTAGCTTCTTTTAGCT...
YDR261C_EXG2_977731|1
1
H3
ACGGTGCTGTTCCCTGGTTTTGTCTTGCTTGTGCCTGGTACTCATCGTCAAATTTTGGCATGTCCGCACTACCTGCAGTTTGTCTTTTAAGTTGTGTCATTATAGGTTTCAGGTCTGGCTCCACAGGTAAGCTTTCTCTTTCTAGTGGCCATTTCAATTCTGTGGGCAATTTAAAATCGCTGTAACCCACCACACATGCCATGGTGATCGACGTACCATGACGAATAACAACAGTATACTTGTCATTGTCATCCAGAGTCTGAATCACCACTAGCTCAGAGAGAAACGATGTTGAACCCTTCTCACTGGGAGAATCTTTA...
YCR076C_YCR076C_249852|1
1
H3
CTAATAAAGGAAATATCAGAAAAGATGAAGCTACTGGTTTCTAATTCCAATGGAAGTAGTAGTAGTGGTAATTCAAGTTCAATTTACAACTCTCATTTAATGAATGACAAGAAGAAAAACAACAATGCCGGCTTAAACAAAAATATGTTGAAGAAAATTATAAAATAATTGATAGAGGGAAGAACTCCCTTTAACATTCTTATGACCATGATACATCGCACTTTTTTTTACTTGGATTGAGGAAATATGTATATATAGGCACGTATGTATATATACAAATGTTCTTTCTGAAACCATGTAAACAGAAAAGGAAAAAAAAT...
iYPR084W_708420|0
0
H3
CTGGAAAAACAATGTATGCTGCCGACGGTGACTATTTAGAAACTTACAAGCAATTGGAAAAAATTTACCTTGATCCTAACGATCATCGTGTGAGAGCCATTGGTGTCTCAAATTTTTCCATTGAGTATTTGGAACGTCTCATTAAGGAATGCAGAGTTAAGCCAACGGTGAACCAAGTGGAAACTCACCCTCACTTACCACAAATGGAACTAAGAAAGTTCTGCTTTATGCACGACATTCTGTTAACAGCATACTCACCATTAGGTTCCCATGGCGCACCAAACTTGAAAATCCCACTAGTGAAAAAGCTTGCCGAAAAG...
YBR149W_ARA1_540657|1
1
H3
GCGCAAGCCTAGGAATTGCTCTCGATGGCTTGGACAAGCCTCCGACTTCTTATCTGCTTGTTCACAATGAAGATTTTGAAAAAAAGTGGGACGTGTTAATGACTTCTACTTTTAGAAATAGAACTGTGCCATTAAATATAATACAGTATTTGATCTCTCATACTGATAGTAATACTGAGTTTAATCGAATGTTACGCTCCAATTTTGATGATTCATTACTACTTATTGAGAAATGTAAAAAGTTTATTAAGACCTTCGTGGATGTTTCCTGTTCTGTTAAAGATGTAGATTTCGGAAACGGTTTCAATTTACGTCATCTA...
YMR176W_ECM5_614461|0
0
H3
TATCAGAGCTCACTCATGTAACAACAGCTCATACGTTGACAGTAACAAGATTACTCAAGGTTCCGGTACCAGATGTAGACAAGCTCAAGCCGCTGTTGCATTCCTCTACTTCTCTTGTGCCATCTTTTTGGCTAAGACCCTGATGTCTGTTTTCAACATGATCTCCAATGGTGCCTTTGGTTCTGGTTCTTTCTCCAAGAGAAGAAGAACTGGCCAAGTCGGTGTTCCAACCATTTCCCAAGTCTAATTGAAGCGCACCAACTTAAATTTTACGCCACTTTCAATTAAGAATATAATAAATGGACACCGTGAATAAATTA...
iYPR149W_830436|0
0
H3
TGACTGTAGGTGGTACTTTTACATTTTTATGTGCGCTGGTATTTTTCTTCAACTTCTTAATGTTTATTCCGATGAAATATGGTATGAAGTGGAGGGAGGATAGATTATTGAAACAACAAAGACAGTCTTGGTTAAACACCTTGGCAGTCAAAGCCAAAAAGGGAACAAAAAGAGACCAGAACGATAATCATAATTAATTGGCATTCTTCAATTTGATAGACACTTATCCTGCATATTTTTTTTATAAACAGCTTATAGACTTTCATGTAAATTTTTCCTAATTAATGTATTATTTACTTCGTTAATTTTCCGTTGAATTA...
iYNL065W_505534|0
0
H3
ATACAATTCGGCTTCAAATATGCGGGTGCACAATCTATCTTCAAATTTTCTGTTATGGTCTCAGAAGTTTGCAACTCAATCCCGTACAATTCTTGAATAGAATGCACGAACTTGATAACCTGAGTTAAAGTGAGACCTTCCTTAATACAGGTCAATTTAACACGTGGTTGTTCTTCACCAGCGTATAAGTAAACGGTTTCTAACTTGGACATGGTTTCCTTTCTTTGTACTTGGTATTTCTTCTTCTTTTTCTAGCTGTCTGTATGAGTACTTAACTGTAACAGAATAGTAATGGGTTCAGGCTAAATCTCTTTAATTAT...
iYKR011C_462372|0
0
H3
TCAGCGGCAACAGATGTTTCGACAGTAGCAACAGTTGAGGCAGGAGCACGTTCACCCTCTCCAGTCCATACGAAAGTAGCATCCTGGGTGACGGTCTTTGTGTAGACATGTCCATTTTTGGTGGCTGTAATAGTGGTGGTTATACTCGAACCAGACCCTTCATCAGCAGAGGTGCCCTGAGCAGAAGTGGAAGGCTCAACAATAGATGTTTCAGCGGCAACAGATGTTTCAGCGGCAGCAGACGTTGCAGCGACAGTAGACGCTTCAGTTACTGCGGAGGTTACCGCAGCGCCTTCTCCTGCCCAGACAAAAGTAGCATC...
YDR534C_FIT1_1504697|1
1
H3
TACCACCATTATGTCTTGTTGGAGAGTTAGCATCCCCGACTCTTGCCCTACCAGAACCCTTTTGGGGCATCAATTTACGTCTGGAAAACCCGTTTTCACTTCTCCCGGGAGGATTTGAGGCGCCTACTCTTCTATTATCGTTTTCATAAACTACGGCTCTCCATAGTATATCACGCCTTAATGGAGCAGCTACAGTTGATGTGGGGACCGGCACAAATGTCAATGGTTCTAATGATGGAAAGGAGCGAACGGTAACTAAGGCGTATTTGGGAGGAATGGCTGCGTTTGGCAGAGGGTTTAACGTGGATTCTGCATGTGCG...
YML025C_YML6_225131|1
1
H3
ACAGTAAATCCACCTCACGGAGAACGATAAAAGTCTTCATTTCTCCATGAGGTGGAAACGTTGTTTGAATATTAAACATAATACTTGAACTTCTTGTGTAATTTTTATTTTTTAATCGAAGTCTACTCCGAAGTCTCCTATAAAGCGTAAGGCGCGTCTCCAAAGTGCTATTTTTCAAGATCAATTCAGCTTCAAAGAAGCTTTGATACCTACGGGGCAATGCAAAGGCTGCCAGTCGAAACAAACAGCGAAAGCTAGCCCATAATTTCTTGAGCTGTTGATTTTGATTGTGATATTTTTCATACTTTAGAAGAAGATAT...
iYDR135C_728183|1
1
H3
GGTAAATCATTGAAGGAGTCAAGAGCTTTATTACTTGGAATGATGTAATAAGAATGATCGTAAATGCCGCTCAAGTAAGACGTCTCAATTTTACCCCCATTTTGAACAACAATTTTGGTTAGTATTGAACGGTGATTGCCAGGAAAAATATGATGAATGATAAACGCACAATTTTTGAAAATCAAAGTACTATTATCACTTAAATCTTCTTGTTTGTTATTTATCGATAATGGTGACTGCCCTAGTACTGAAAAATTTGTCTTACTATGCTGCTGTAGAGACATTGCCTTGTCCCATATTTTTTCACCCTTTGGTTTAAA...
YJL090C_DPB11_263840|0
0
H3
GTTCTGCCCAGCAGGTGTCTACGAATTCGTCAAAGATGAAAAATCGCCTGTGGGTACGAGACTGCAAATCAATTCACAAAATTGTATCCATTGCAAGACCTGTGATATTAAGGCGCCCAGACAAGATATTACTTGGAAAGTACCCGAAGGTGGAGATGGACCAAAGTACACGCTAACTTAATCACAGGATTCATTATTTATTTAATTATTTTATTCATTATTTATTTAAATTTTTCTATTCATTCTTTTATATAATCTATATTATTTATTCACGTAAAAGAGTTCTTTTCAGCCGACAAACTTTTCAGCTTCAATGAACC...
iYOR356W_1009179|1
1
H3
TCAATACTTAGACTTGTTCACCTTGAGTGATCCTGAAAAAAAAAAATGATGGCCGCGTTTCGAAGAAAATAAAGAAAAAATAAAGCAAACCTCAAACAAGAAAACAAAGCGGAAAATGGTAAACAAGAGAGATAGAACAGAGTAGTAGGCTATTAGCAAAAGCGCGAGAATTACTACATTATAAAGGATCTGTCAAGAATGACATTGAATAGGAAGTGCGTAGTGATACATAACGGATCGCACAGAACGGTGGCTGGATTTAGTAACGTTGAATTGCCACAATGCATTATACCCTCTAGCTATATTAAAAGAACAGATGA...
YPR034W_ARP7_639571|1
1
H3
CACTAGCGTATCGGTATTCAAAAAAAAGGAATGGAGAAGAAAGTGACCAGGATGATGAAGAGTGTGATGAAAAGTCTAGGGGCGAAGGCCATTCAGATCAACCACAAAACCCAAAACCAGAGAGCTTTACAGCCACAAAGGAAGAAAAGGCTCTGGTAAATCAATTAGTTGCGTTGCATAATTCACTACATGTGAAAGGCGTATCTTTGTATGGGATTGGCTACGGTAAAGTGCATCCTGAAAATGCTAACGGAAGCCACGGCGAACCGGGACTCTCAAACTGGGCCAATACATGGTGTGGATTATTAGACTACATATTT...
YML118W_NGL3_33556|1
1
H3
TGGAAACGGGGATGGCCATCATGGCTCTTACAATATGAGTTCGTCAGATAGAAAGAGACTAATGGAGGAAACTGGAAACAATGGAAACTTCTCCAATAAGAAATTCAAAAGAGACTCAGAGCTTCCAACAGAGGTTCTTGATTTATTGAGCGTTATACCAAAACGTCAATATTTTAATACAAATTTACTCGATGCGCAGAAATTGGTGAATTTTTTAAATGATCAAGTAGAGATTCCAACAGTTGAGAGCACCAAGTCAGGTTAACATTACGTTAATAAATAGGTATATATGAATATTTATACCAACACATCTATTATAA...
iYMR061W_394772|0
0
H3
TCCGTTTGTCAGGATCTTTTTGTAACAGCTTCTTCAATAAATCTTTGGCATCAGTGTATTCTTCCATTGTGATTCCCGAAGTGGCACCATTCAACATTTCCTCATAAGATGGAAATTCTAAGGGTTTATTGATAATACTGTCGAAAAGTTCTAATCCTGAATTGGCGTTGAAAGGCAACTTTCCAAACAACAAGCAGTATATAGTGACCCCCAATGACCAGATATCGATGGCAGATGAGCAGGAATACTCTTTTTCAGTGGAGCAAAGCTCAGGTGCAAAAAAAGCAGGGGTCCCTAAGGCTCTAGATTTTAAAAGTTGT...
YGL179C_TOS3_164348|1
1
H3
AGGTCCAAGTACCGAAAGTTCTTGCACTCAAATGGGTTGTTTCAGTGGGTTTTCTTTCGTAGACTTTACGTGTCAATTCTAAACCAGAAACGTAAGTCTGGATAGAATTGAAGACTGATACAATGGAAATGAAAAGTAACCATTTTGGTAAGTAACCTTTTGGCATTGCTGCCAAGGTGGTCTTGGTTGTAGTTATTACGTCTTGTAGGCTGAACATTATCTGATACTTTAGTGTATTTGAATCCTTCGATACTATGGTCTAGCTAGTGTCAAGAATATAAAAACCTGAAATTAGTCAATGCGTTTACGTCAAAAATAGT...
iYER044C_238048|0
0
H3
GAGAATTGGCGGATGATGGGTAACTGTGAGAGGAAGTGTCTGGATTTGATCGTATTTGTTAAATTCATCAACACCTTTTATGCAGAGGTGTAGTAGTGTAGGTAGCCCTTAGTAATTGGTGTTTATTATTGTCTATATAGAAATTTCTTTTTAGTGTAGCTTGTACTCTTTCCCTGCATTATTTTTGTTTTATTGACTTTTGTCTTGGTGAATCAAAAAATTAACGAAACGAACAAATTTAAAATGGCAAGATACGGTGCTACTTCAACCAACCCAGCTAAATCTGCGTCAGCTCGTGGTTCTTATTTGCGTGTTTCTTT...
YJL177W_RPL17B_90789|0
0
H3
TACTTTAAACAGCTTTCTTGCTTTTAATTTTCTTTCTCTATTAATAAGTTTAGAAGCCATATCTACAGGGTGCAACTCCCCTTTTGCTATCATTTCATGAACTTTGTGATCAATATTTTTAGTATTAATGACATCTTTTCGTACCCCTGATCCCCTTTCAACAACGGAACCAATACATTTCATAATTTTAATAATTTCAGCGTGGGAATTTGTGACTGCTTTAGGAATATTATTTAAAGTTGTAATCTGATTTGATAGAGGACAAAACACTCTTTGGTATTGAAACGCATAATTGGCGAACTCAACCTGCTGCTTGAACG...
YDR263C_DIN7_994653|1
1
H3
AATGCATTTGGAGGAAGCGCCACCACCGGAGGGGGCCTTTTCGGTAACAAACCTAACAATACGGCGAACACTGGGGGCGGGTTATTTGGCGCTAATTCGAACAGTAATTCTGGCAGTTTGTTTGGTTCCAACAATGCACAGACGAGTCGTGGTTTGTTTGGTAATAATAACACTAATAATATCAATAATAGTAGTAGTGGCATGAATAATGCAAGCGCTGGACTATTTGGCTCTAAACCTGCAGGAGGCACTTCTTTGTTCGGTAATACAAGCACCTCTTCGGCCCCTGCGCAGAACCAGGGCATGTTTGGTGCAAAACC...
YGL172W_NUP49_181175|1
1
H3
AATCAGAGGAAAAGGCCTTCTTGTTTGGAGTAGCAATGGAAATACCATTTTCGACAAACTTAGTGTAAAAACCAGCAATGTAAGCGCTGGAAGTGTTATCAACCAAAATGACTGGCTTAGGTGAAGTCTTCAAATGAGCAATTAAATCATCCAAAGGCAACGTTTTAGTAGTGGAGGCTGCTAAAGCAGCCTTCCAATCAGAACCAACATTTAATGGAGAAAAGTCCTTGGAGATTAAAGAACGCTCAGCTTCAGCCAAAAGAACTAGATTGTAAGTAATGGTAGACTTCATGGCTAACAATTGATCCAAGAAAGCTGAA...
YJR139C_HOM6_690099|1
1
H3
AAAAAGTGGTCGCTTTTTGCAACCTCAGAATCACACACTTCACTCATTATTCGTTTGTAATCATCTATTCTTTGATTTCTATTGTTGACACCTTCATTTGCATTTAAATGAGCACAAATATAGCTAAATCTTTCCCAGTTCTCCTCGCCATTGCGTGTCATCTGAAAACTTATCAAAGTTCCACCTTTCAAGTGTGTACCAAACCAACCGCATTTCCCATTCCGTTTCAAAATATCATCTTTTACCTTTAGCGCATTATTATTGTATAATACTATTATGGTAATCGCCCCAAGACTATTCACCCCTAAACATGAATACTG...
YOL065C_INP54_205535|1
1
H3
TACATTTGGCCTTATAGAGTGTGGTCGTGGCGGAGGTTGTTTATCTTTCGAGTACTGAATGTTGTCAGTATAGCTATCCTATTTGAAACTCCCCATCGTCTTGCTCTTGTTCCCAATGTTTGTTTATACACTCATATGGCTATACCCTTATCTACTTGCCTCTTTTGTTTATGTCTATGTATTTGTATAAAATATGATATTACTCAGACTCAAGCAAACAATCAAAGAAATCTTTCACTGCTCTTTTCTGTGTTCCATTTAGTTTTTAGTACGATTGCATTGTCTATATACTGTATTTACCAAATCTTAATTTTAGTCAA...
iYCL066W_14042|0
0
H3
CATTTTTACTTAAAAATTGATGGTCCATCGTAAACAGCGTTAATAATTTCTTCGCCCGATGCATAGACAATGAGAAGCCAAAATGCAGAAACTTGATACTTTCAAAGAAAGAGGTTCAACTATGAAAAGAAAGAAGATTTAAGTGCCGATAGAGTATACTAGCATGGCACTTGTCAAATACAGCACAGTTTTTTTTCCACTCCGATCATTGCGACTGTTTGTGTCCATCAAGAAAGCATATTACCACAGCGAGCCGCATAGCATCGATCTATTTCATGATAAGGATTGGATTGTGAAAAGACCTAAATTCCTAAATTTAC...
YPL029W_SUV3_495588|1
1
H3
AGCGTTTTCTTCCAATTTACTATCCAAGTCAATTACCTTAAGCCCTACAGTGAATTACAATGATAAGACCATCAGCGGAAACTAAGTAGTTCATATAAATGCATTTTTTCATAAATGTGATAAAGGGAAAATTTCTTTTTCCACTTGAAAGATATGTACGTATTTAAGAACTGATCTTCTTGACCACTTTTTTTATGCGTGCAACTTTAAATGCTTTCATACATTTCCTTGGTCTTTCATCAGATGATATGCCAGATATCAACAATTATAGACCTAATATAACATTGAGTTATTTTTTACACGTATAATAATAACTCCTT...
iYPL120W_323908|0
0
H3
TGACTCCGGTTTCCAAATTTTGAAGCCGATGGACGGATGCATCAATCCGGGGGAACTACTTGTTGTGCTTGGACGACCCGGTGCAGGATGTACTACGCTGCTGAAATCTATATCTGTAAATACACACGGATTCAAGATTTCTCCGGACACAATCATCACGTACAATGGATTCTCCAACAAAGAGATCAAAAACCATTACCGTGGTGAAGTGGTCTACAATGCAGAATCAGACATTCACATCCCGCACTTGACAGTATTCCAAACTTTATACACAGTGGCAAGACTGAAGACACCAAGGAACCGAATCAAGGGTGTCGATA...
YOR328W_PDR10_932611|1
1
H3
TTTTCCCCCTGCTTGTTCCATTAAGAAAGCCATTGGGAAGGCCTCATAAAGCAACCTCAGTTTTCCGTTGGGGCTCTTCTTGTCGCAAGGGTATGCGAAAAGGCCACCGTAAAGAAACGTCCTGTGAACATCAGCAACCATGGATCCAACATACCTAGCCGAGAAAGGCTTGTTGTTGTTGTCTGCTTGGGGTTGTTTGACTTTCTCAATAAATGTTCTTATAGTCTCGTTCCAGTAGAGGGTGTTACCTTCATTAATTGAGTAGATGGCCTTTTGAGGCGGAATTCTTAAGTTAGGATGAGTCAAGATGAATTCGCCCA...
YLR377C_FBP1_874123|1
1
H3
GTTATCGCAGATAGTTGTTGCAAAGATAGCGGCGTAGGTGGCCGCGAAATGGGGAATTCCAAAACAAACGGTTTTTTTACTCCTGAGAAATACTTGTACGGGATAATCCAGGGCCTACCACCCACGCTTCGAGGATTGGCTTTTATTTTTTTTTTTTTGGTGGCGTTTTATTTCTTTCCCGCTTTCTGGGACTTGTGCGGAGTTTTGAGAGGGGCGCGCGGCAAAGGATTCCCAAAACGGAAATCAGACGCCAATAGCCAGCACTCAAAGCAGTTCTGGACCCATTCCGATTTTCCCATTTGGTTCTTGCGCGTGCTGAT...
iYLR110C_370593|0
0
H3
TCCTAAAAAAGAGTTAAAACTACTTTCTCTCATCGTGCCCAGCGCACTGATCTCCTTATTACATCGGCGTAGCTCTTCCGCTCTTGCATCTGAAGTCATAGGGCTGCCTGTCTCGGGTAATAAAGCGCTACCATCTTGTTTTTTTGAGTTTACCAGAAATTCCTTGCGTTGTACATTTATAGTCTGTGGGCTTGCCATAGACTTGCTCGTTATAGCTGAGGATGCCATAGAATTCCCTTTCTGAACCAAATAATCCGCCAATAAGCCATTATGTGGCAATATTCCCTCTTCTCCCAAAGATGAAACTGAACTGTACTTTT...
YLR014C_PPR1_174562|1
1
H3
TATCGATGCGTCACAGTAGCAGATCATCTCTGACACTTGTTTCCCCATTTTTTTTTTTCATTTTTTAAAGGGTTTCTCTACAGCCTACAGGCCTCCCCTAATAAGTCAGCCCCTCCCTTTGGAGTGCGCTGTTGACCTGCGTATATAAGAGGTATATCAGTGCCAGTAGGTAAACCCATCTTGCGGGGATTGTACCAGGAACATAGTAGAAAGACAAAAACAACCACCGTACTTGCCATTCGTATAGATGCTGCCCAGACTTGGTTTTGCGAGGACTGCTAGGTCCATACACCGTTTCAAGATGACCCAGATCTCTAAAC...
YDL085W_NDE2_303212|1
1
H3
TTTTCAGCGTTTTGCTCCATCAATAATGTTTAGAAGTTGCCTTTTTTTCCTATTCTGAGTTCATTAATGTTTCATTACTGCTCATCTCATCACTACACATTCGAACTAAATTCTTCATAACTTTAACTAGATCTTATGAAACTTTTCAAAAATGGTGCACGGGCTTTCTGTATATTGAACTTTCCGTCGCTTTTCGGAGCTACAATTCCCCTACCCGGCTTGTCGGCATTAAGTCTCTTCATCCCGTGTACTATTACCCGACAGTTTCTGCGGCATGTTCTGTGGAACGACGGTACGGGCGGTGACTGGGTGCCAAACAG...
iYLR401C_924676|0
0
H3
GGTCATGAGGAAAGAAAAATATGCAGAGGGGTGTAAAAGTAGGATGTAATCCAACTATAGTTTGCTTTCAATGTTTTTGACCAATTCCTTGTATTTCTCAGTAGAATAGGACTTTGGCCTTTCAATGGAAGCACCGATGGCCCTATCAGTGATCAATTGAGCAAGAATACCAAATGCCCTTGAAACGCCAAATAAAACGGTATAGAAAGAAGATTCTTTTAGTCCATAATATTGTAATAAGACACCAGAGTGAGCATCTACATTTGGCCATGGATTTTTGGTTTTACCATGTTCAGTCAATACGCCAGGTGCTACCTCGT...
YCR005C_CIT2_121135|0
0
H3
ATTCATTCCCTTAAAATCACGAGTTCTAATATCAAACATTTAATTGTCTCGCAAAATGGTGAAAGATTAGCTATTAACTGCTCCGATAGAACAATAAGACAATACGAAATAAGTATTGATGATGAAAACTCTGCGGTTGAGTTGACCTTAGAGCATAAGTACCAGGATGTGATTAATAAATTACAGTGGAACTGTATCCTCTTTAGTAATAATACTGCCGAATACTTAGTCGCTTCTACACATGGTTCTTCTGCACATGAACTATACATCTGGGAAACGACTAGTGGAACGTTGGTGAGAGTCCTGGAAGGGGCTGAAGA...
YAR003W_SWD1_155873|0
0
H3
AAAAGGGCTCCAGGGATGCCCTTCTACAAAGCATTTCACATGCTAATAAAAGCCCCGTTACATAATAACATCTCCCGCCCACAAAAGCGTTTGATCCTGTTTTTCTATTTGCTTCCATAAAAAATCGGATCACATTTCCTTTTCTTCCGCCTTAGATCTGCACCGAGATCTTTTGCGCTCTATTTTACCCCTAGAAACTGGACCCGGCCTAGAATCTTTCACTGCCACCAGAGAGTTTTAGCCTCGCTCTCTTTTAATTTGCTGAAAAAGGGTTTCCTGATAGAAGCCGCGGTTAATAGAAAAACGGCAAAAAAGTTATG...
iYBR085C-A_419811|0
0
H3
TTCCTCTTACGGCACTGCACGCATGCAGGCGGTTTTTTCATTTTTCTACCACGAATATCCATCGTTGTTTTTTGGTTCTGAAATGCTATTTTTGTAGATAAGAGCTTAAGAAAATATCACAGAAACCTCGATTTTAAATATCTGCGTAGTATGGGGCAATTGCAGCAGAGCACCAATGACTTTGGTTTGAAAGCCGTGAGTTTTGATGAAATAATCTAATGTAAGCAGCTAATATATCTTCTCGGGCGAAAACGGGAAAGTGGTCTTACAAGCGTCTAACGGGCCGAATGAATATTGGTGAACTATGGCTATTTGCGAGA...
iYDR303C_1071565|1
1
H3
CCTATCCACGCCAATTTACTGTGGCAGAAGTTACTGCAGTGGAAGAACTACCTTTTCCAGCATAGCCTCCGCCACCAAAAAATGCAGGTGCTGCGTTTTCTGGATACGTGGCTGGCCTTTAAAACGCATGGGAGGGACAGTATCTCACCGATTTGGTAATGCCGTGATGCAACCTCGAGGATCCGGTTTTCAAGAGCGGAAGTAAAAAACCGGGAAGATCTAATATTTCCTTATGCAGGTATTCTGTGACTAAAAAGTATAGAAGACTATTTTGCACCGCTCCCTGATTAGACTTCGTGCGATATTATTGATGGATATCG...
iYKL045W_355360|0
0
H3
AGACGATAGTGGATTTTTATTCCAACACATATAAAGTTTAGAAAGATGACATAAAGATTAACTTAATACATGACTGGGTATACTTGAATTCTAAAATTTTCAACAAATAAGTGGTTGTTTGGCCGAGCGGTCTAAGGCGCCTGATTCAAGAAATATCTTGACCGCAGTTAACTGTGGGAATACTCAGGTATCGTAAGATGCAAGAGTTCGAATCTCTTAGCAACCATATTTTATATTTTTTTATATTGTGTTCAACAATGAACAATAATGACGAATAATCCAGAATGATATAAACGTCCATGAATAAAAGTCCGTTATAA...
itL(CAA)K_458332|0
0
H3
TTTCACCCATGATACCCAATTTCAAGAGAAGCAATTGCTACATATAATTATTTAGGCTTTACTATCTACTACTCATTGACTGTGCCCTTTTACACAATTATAACAAATATGTCAAAGCAGATGCCATGAACTTTGTATCTGAATTTTTGATTTCCTTTTAATTCTAATTGCAGACGACGTAAATATAGTTCTGAATTTCAAAGTCACTGTTAATTAATTGTTCTAATTGTTTGGTTTTTTTAATATAAATCACTAGTGCTTAAGTTCTGTTGACGCACACAGTACCTATCTTTGATTCCTTCGTGCAAACAGTATTCCGG...
iYCL055W_30197|0
0
H3
TTACACCCAGAGAAACTCTGAAGGCTCTAGCCCATGTTATAATGACGACCAAATACGACAACAACAGCAACCTTTACAAATGCAACCTTTATCAAGAACTTCAAGTTCAAGTGTTAATGTCACAGCGATGAGAAGTACATCTGCAGGTAATTCAATTACAGCGAACGCTCCCGTTGTACCTAAGGTGATGGTCAATAACCAAAATGTTAAAACTGTTGCTGCCGATCAGTCTGCTACAGCACCTTCTTCACCTACCATGAATTCGTCCGTCACAACGATCAACCGCGAATCACCATACCAAACCTTAAAGAAAACGAACA...
YGR097W_ASK10_681758|1
1
H3
AAACTAGTATTGTATGGTAGATCTCTAGGTGGTGCTAATGCTCTTTACATTGCTTCAAAATTTCGGGATCTATGCGATGGCGTTATACTGGAGAATACATTTTTGAGTATTCGAAAAGTTATCCCATATATTTTCCCCCTTTTGAAACGCTTTACGCTTTTATGTCACGAAATCTGGAATTCAGAAGGCCTTATGGGAAGCTGTAGTTCAGAGACGCCATTCTTGTTTTTAAGTGGATTAAAAGATGAAATCGTTCCACCCTTCCACATGAGGAAACTGTACGAAACATGTCCTAGTTCGAACAAAAAGATTTTCGAATT...
YNL320W_YNL320W_38395|1
1
H3
ATATCTTGGTAAACTACAATTAGATCTTAAAACGGTTCCTGAAAAAGTGCTTTCATCTACCATTGAATTTAATTCACTTCCCTTTATGGAATTTGCACTGGTTTTGTGCTGTGGGATGTGGGTGTTTTGAAATATTTTCAGCTTGCAAGTCGTGCATATTGCACTTGTTTCTTATTTTTGGCGAGTGTAAACGTAATAGGTAATCTTAGTATTGGGCCATCGTCGCCCAGATAGCCATTGATAAAGCAGCTAGGTTCGTCGTCCGGTTGTTGGTATACACCTGTGCGTGCGTAAGTGTAGAAAAAAATAGAAATGACTAA...
iYOR268C_826319|1
1
H3
AAACCAGCAGTATCCGCCGAGAAGGTTTCGCAGCAAGGACAAGTGCCCACTAGGAGGACTCGTTCTCACTCAGTATCGTACGGCCTACTCCAGAAGAAGAATAATAACGATGACACCACAGATTCTCCTAAAATATCTCGAATTAGAACGGCACAGGATCAACCTGTTAAGGAAACAAAAAGCAGCACCCTTGCGGAACCCATCGTTTCAAAGAAGGGGAGAAGTCGTTCTTCTTCTATATCAACTTCTTTGAATGAACGATCTAAGAAATCTCTGTTCGGTTCTTTGTTCGGGAGAAGGCCTTCCACCACTCCCTCGCA...
YOR227W_YOR227W_763308|1
1
H3
ATTCACTTGAAGCCTTCTTAAAACATCATATATATCATTTGTGGGAGAATTGGTGCTTCCAGGGTAAAGCTCTGTTCTTCCATCGTTCAAATTTTGCAAGTTCATTAGGTATAAAAGGTAATCCGCTTGAAATATTGGATCTACAAGGTTATTTTTAACACTACCACACAACGCCAGTCCTAGCATTCTATTATGATCTGAAACGCGACCGGGAACACCGATTCCTGGAGTATCTATCAAGTATATTTCATTTCTTGATTCCGTATTCCGTGAAGTTACTCTAATAACCTCGCTTGTGGCTCTAGTAACACCAGCTTCTG...
YMR097C_MTG1_459882|1
1
H3
CCTGCCGCTGATGTGGCACAATATCGTTCGAGGAGTGTTCGGGAACCCAACCTTTTCGCTCAAAACACGCTCTTTCCGCTTTTGAGACTTAGTCAAAGGCGTAAAGCCGTCTTCTATGGGAGGATTGAATTTTATTGTGGGAGTAGCGCCCTCAGACGACTTTTTTTTGTTTGGAGGAAACAGTATTGTTGGGTAATTGAAGAAATCGCGCTTTTGTTCTAATCGATGATCTATCTGATTTTTCTTCTTGATCCATAGTTGTCTCCAATTGGCATACGCCCTACTAATCACAAGAGCGTCATTGATATCGGTGGGTGTAG...
YDR475C_JIP4_1409489|1
1
H3
TACGCCTCTTGTTATTATTTTTACTCTTTTAGCTTATGGCACAAAAGTACACAAATAGATACTAAAAGGCGAAATTTATTTCACTTGATTTACCCGTAACTGAATACTTGAGTTTGTCTATATATATATATGTATACCTTTTATTGTTAAAAAAAGAATATGCCAACAAAATCTATAATAAATAGTCCATGTACTCCACTATAAATTAATTTGCTATTGTCAGTAATGGCGCGATTAAAAATAAATAGGAAAAGAAATCTCTCCTCAAAAAATCTTATGTCCCCTAATTTCCGTATGTTGTGATTTTGGTATTCAGAGAT...
iYIL107C_166055|0
0
H3
CGTGATTCCCCTAAACTCAGGCCCTTCTCCAAGGAGAATGCAACGGTAGGCAGTTCGTTTGAAATTGAAACTTTTCTCAACGACAATGATGTCCCGCAGCATAAGAAAACATGAGATCAGAAAAGCGTTGCAAAACGCAGAAGCCACTTATCAAGAGCTACGGCTGCGCGTTTTTCTGGGACGTGCTGACAAACGAGAAAAAAGATGGCTTTGTATTCAGTTCAAGGCCCTGTGTATTTTTTTCCTTCAAGAAATTTTTATTTTGGCCTGGGGTCTTTAAACGGACGGGTGCCCTGAGAGTACTCTCGCATAATCCAGGG...
iYBR066C_371133|0
0
H3
CGCTCAAGATTACTTTTTTAATTTTCATACTTTTTCTCCTAATATCTTGCAAATCATGACTGAATTGACATTAACTGATGCAAATGGGTGGGAAAAATGCTTATCAACCCCCTAAAAAGTCAACTTCTTCGTTAGTGATTGCTATATGTGGCGTCAAGAATTACAAGAAGCTGGCGAGCTTTCCAAGCCCCTTCAGGATTAACCGTTTTTTACCCTTTTTTTACAAATTTACGGCGGGGAATAATTCTGATACTTTCTCTTTATGAGCATTCCTCAACAGGGATCGGCACTCAATTGCTTTAGTGAATAGTTTTTTTTTA...
itG(UCC)G_779933|0
0
H3
ATGATACTTGCCTCCCATAATTTGACTTAAGCCAACCTGGAAAACCAGATTCCAGAATGAAAATTTTGGTAAAGTCATCGGATATTGGTTTTTCAAAGGAATGATTCACCAGAATGTCCAACAGGACAGACTGCTGTTTAACATTGTATTCGTTTGCGTCTGTATAGAGAATGATAAACTTGAAAAGATTTCTACTTTGAAACATTTTAATCTCACTATTAGGAGAAGTAATTAATGATTTTTTCTCCAAATCATGATCTGAATATGACATTTTAAAAGAAATAGGCTCCAGGCATATAATATTTTTTGTATCAATATGT...
YDR069C_DOA4_586976|0
0
H3
TAAAGAATTCTAGATGGAGCTCTGAAATGGAATGGACCACGGGTCTTGTTGAAAGCAGTAGCCTTTCTCAAAAAGTCGTGGTACTTCAACTTGTTTCTGAAGAATTCACCAGAAATGTTCAAAGCTTCAGCTCTGACAACGACAATCTTTTGACCGTTCAATACTTGCTTGGCAATAGTGGAGGCCAAACGACCCAACAAATGATCCTTAGCTATAATAAAAGGAGAAAAATAAAGTTAGTAAATGCTAATCTTGGTATTTTTTACTGAGTGGATTGAATATAGTTGGAGTCAAAATATTGATTAGAAGTTACCATTACT...
YNL069C_RPL16B_494560|0
0
H3
GATAATGACTTTATTATCAATGATTTCAAGAAAATTGGAAGGGAAAACGAAAACGAAAACGAGGACGATTAGAATATATTAATATATAGATGTACACGTATATGCAGTAGTTTTATTTTTTTATCTATAATACAACTCAAGCACAAGAATGCTTTGTTTTCCTAGTGCTCATCCTGGGCCTAGGCGCCATAGTTATCCGATTTATCATCGGATTCAGCTTTAGTAAACTGAATGGGGCCGTGAGAACCACTGGCACCTTCACTCTTAACATTGACCGCTTCGTCCAGCTTTTCGTAGTTGGTCTTGTATATGCTTTCAAT...
YAL032C_PRP45_83407|1
1
H3
TTTACAGCATTTGTTTGTATCTTCATTTCTCAGAATTCTAAGATCTTCTTCTTGAATGGTGCGCAATGTCCTTCTTCCTTTTCCAAGCTTCTAATGAGGTTATTTTCTTAGACTTTTGTGCAGTATCTAAATATTATTGGTTTTATTGCTTGTAACTTGCACTTTTCTGTAATTGCCGAAGTATCAAAGAAATTTCAACTCTCAAAGAAGATGCGCAAGAATGAAGACCAAATGAAAAATGTCATAAAAAAGCAGTATATTAGCTGTCTTGGGTTAGGGTCCTCGTTCAACAGATTCTTTTTGCTTAATGGTGCAAATGA...
iYIL093C_189005|1
1
H3
TCGTACAGGCTTTACCGTCGATGTACTCACGGTATTCACCGAGGAAAATGTTTCTGTTATCAAAGAACGTATGGAATCCTTGATTAATGAAAAAATGTCACAACTGAATAAAATATCCAACATATTTAATGTCCATTTTATTGATGTTAACGAATTTTTCAACAATGCATCGGAAGTATCGACTTTTATCATTGACAATGAAAATTTTGAGATTTTCAGCAAGTCAAAGTCAGTAGATGATAGCAATATATTAACATTAAAAGAAATATTAGGCAAATACTGTCTCAATAATTCATCCAGGTCTGATTTAATATCGATTA...
YNL119W_NCS2_401573|0
0
H3
CTTTTATCTATTCAAATTGTTTCCCTTAGGTATATATATATATATATATATATATATATATTTTCCCTGTATATATCTATGTAAATGACGAAAACGCATGACATTTTAAAACCTACCCCGGGTTTGACCACAACCCACCGTTCATCTAATATTAACCCGCACCCACCAAGGAAAGAGGCAATAATTCGGGAATACTGCTCTAAGACCTTGTTTTTTCTTCTGTCAGGTGAAAGTATATGACCAAAATCTACATCCCTGCCCTAACAAACGTCATAGAAGTAGGAAATAAATGATGAGCCCCGCAAAATGACGCCGCACGG...
iYDR040C_538947|0
0
H3
AACTACAGGCAAGTACAATAGCCTCCTCCAGCTTTTGACTCTGAAAAAAAGGTAACGAAAGAAACTTTTCACACAGATAGGAACTATTCTAATATTACAGTAAAGCCAGCTATGGAATAACAGGTGGCATGTGCCCACTGTATTCATGTATGAGCGGCCTTTTTTTTCCACGAAAACAGATATGATGATGTTTTAGGCCAAAATGGTGAGATCCTTTTACAGGAAGATTTTACGCCAACTCTTTCCATTTAGTGACACAGGTGAGGGTCTTTTATCGGCGCAAGGTGTTTCAGATGAGCCATTTTGCCCGTAGTCCGGTT...
iYBR082C_408580|0
0
H3
TGGACGTGTCAATGCTCTTTCTAGGTCGTGTTCATCTTTCAAAGTCGCTTCAAAGTACAATTGAGTTCTTAGGATCAAATTATAGATTGATTGTTCATCCCTTAAACGGATCAAATAATCACTGGAATGAGGATCGATGTTTAACAGGGATTTCATGAATTCGTCATCTAATCTTTCAACAAATGAGAAAATGGAACCCAGAATCCTCTTGACACCATCAGAATCTTCTTTAGGTTCATCTTCAATGAAATCGATTGGATCAGCAAATTCATTAACTTGGTAGGTGTCAATTGTCTGGTCTAAAATAGACAATAATTTAC...
YMR309C_NIP1_894380|0
0
H3
TTAGTTCCTATTCTAAACAGATCTTTAAACCTATCGTCATCATCTCAAGAGCAATTAAGACAAACCGTTATTGTCGTGGAGAATTTGACAAGATTGGTGAATAATCGTAATGAAATTGAAAGCTTCATCCCTCTACTACTACCCGGTATCCAAAAGGTTGTTGATACTGCGTCCTTACCTGAAGTTCGTGAATTGGCTGAAAAGGCCCTTAACGTTCTAAAAGAAGATGACGAAGCTGATAAGGAAAACAAATTCTCAGGCAGACTGACTTTAGAAGAAGGTAGGGATTTCTTACTTGATCACCTCAAGGACATTAAAGC...
YPL226W_NEW1_123171|0
0
H3
AACACACACTATTTAAAAATCCTTGTCCTGTATAAACTACCGTACCGAGCCATTGGTCATCTTTAGACCACAAGCGAACTGTTCCATCCCTCGAAACACTAGCAACCTTTGAATCATCCACAGCTACCACATCCCTGACGTCCTGATCGTGCCCTTTAAGTGTTGCACTCAATTGATATCCCATACTCCAAATCTGCTGCTCTACACCTTACTATCACACATGAATATATATATATAAAATAAGCCAAGACAGTGGCCTTCCCTTATTATCAGCGTACTAAAATCTCATATGATTTATTTTTCGTGGTCCTGAACGAGTG...
iYKL213C_34174|1
1
H3
CGTATAGTCTCAAGAGGAATAGGTGTCATTTTTGGTGCTTCTCCATACTTTATGTCAAGGATAAAGACATACTTGGGTTGTGCCTCAGCCTCACAAAGTGAAGTAGCTACAGATGAACCTGGTTGTAATACATCAAAATTTTTAATTGGATTGTGTACGAGATTCGGAATACACTCATGTTCATGACCCCATATCACCATATCCAGGAAATCTGGCAAGAACTGTTCAGGTAAAAATGCAGTATTCGTGTGACCTGTATGATTTTGATGGACGCACATTAAATTAAACCATTCACCTTCTCGCATAGTCGGTACTTCAAA...
YMR224C_MRE11_719997|1
1
H3
TAAAAACACACGTGTATGTTTTTTTCCGCACGCATGAGCAGAATTAGACGGTTTTATAAACGTTCGCTCTGTGTATTTCACGTACCTTCCCGGATACAAATTTCGTCATGGTTTATATTGGAAGTTTTTCCGAAATTGCGGGTTTTGAAGGTTTATCCGAGACGACGTGAGGGGTAGGCAGGAGAAGGAAGGCTGTTTACGTACGCAAGGATGGGGTACGACAATGATTCAATTTCTGTTGCTAAGTTTTTTGTTATTTTTTCATATTGGCCTACTCATCTTTGATATCATTATTTTTTCGAATCTACATTAGAGCGTGT...
iYGL210W_94976|0
0
H3
CAGAAAGAGAGCTGTCGACAACATTCCAGTTGGTCCAAACTTCGACGACGAAGAAGACGAAGGTGAAGACTTATGTAACTGTGAATTCTCTTTGGCTTATGGTGCTAAAATCTTGTTGAACAAGACCCAATTAAGATTGAAGAGAGCCAGAAGATATGGTATCTGTGGTCCAAACGGTTGTGGTAAGTCCACTTTAATGAGAGCTATTGCCAACGGTCAAGTTGATGGTTTCCCAACCCAAGAAGAATGTAGAACCGTCTACGTCGAACACGACATTGATGGTACTCACTCTGACACTTCCGTCTTGGATTTCGTTTTCG...
YLR249W_YEF3_638249|0
0
H3
ATCTTTTGCAGTAAATAAAGTGTCCAATGGCAAGCCTTCAACAGGTTGGCAACGATCTAGAAAATCTGGTCTTAGTCTTCCAATCCAATTCTTGATGAAGTTTGTAAAGAAACTCGTACTGAACCAAGCGAGTGATAAACCAAGGAGAGATGTGTACAAAATAAAAATCAAATGTCTTCTATCGGCCAAAATGGAACCAATTATCAATATGGTTAAAGATGGCACGACAAAACTATAAACAAACAACATGTTGTTATTTACACGTTCAGTTGTCGCATAAGGATGCGATATAGTGAGATCGTTAATGTAAAACTGACGTT...
YDR284C_DPP1_1031216|1
1
H3
TAAGTACTGAAAAGATAGCGTCGTTTTATACCCAGTATTTCAGGAACTCTAATTCGGTAGTCGTGAATCTTTGCTCACCAACTACAGCAGCAGTAGCAACAAAGAAGGCCGCAATTGATTTGTATATACGAAACAATACAATACTACTACAGAAATTCGTTGGACAGTACTTGCAGATGGGCAAAAAGATAAAAACATCTTTAACACAGGCACAAACCGATACAATCCAATCACTGCCCCAGTTTTGTAATTCGAATGTCCTCAGTGGTGAGCCCTTGGTACAGTACCAGGCATTCAACGATCTGTTGGCACTCTTTAAG...
YAL037W_YAL037W_74419|1
1
H3
GTTCGATATGGTTACAAATATGGTGAAATTGAGGAATTTAAGGAGATTGTATTGTTCATCCCGTCTCTTAAGAACTATACAGAACGGGAGAATTTCTTCTGTTTCTTCAATATCGTTGAGTAAGAAATATACTACAAAATCTGCCAAAGAAGGTGAAGAAAACGTAGAAAGGAAGCATGAGGAGGAAAAAAAGGATACATTAAAAAGTTCCAGTGTACCAACTTCACGAATATCGAGATTGTTCCATTATGGCTCACTGGCAGCAGGGGTGGGCATGAATGCCGCAGCAAAAGGCATATCGGAAGTTGCAAAGGGCAATT...
YGL119W_ABC1_284688|1
1
H3
GCTAAGCCTTCTAGAAGGCCGGAAAAGGCTCAGAAATGCCTGCAGTGCAAAGCTTTTCAGGAACGTTACTAGCGCAAGTCTTCTGGCTGCTGTTTAGCAACCTCTTTGCAGGTATTTCGCAGCGGGATTTCGCAGGTGTCTTGCCTAGCGTATCAACACTTAGTAGCAGTTAGGTTCTAAAGTTAAACTCCACTTCCGTTCTTCTTCGAGTTCCTTTTTGAGCGGTTTGGCCTATCTTGTCTCATCATCTTGTTAGTGTGCGACGTTATCTGTGGGTGAGAGAGGCTATTTTTGGTCCGGTGAAAAAAAAATTTTTTCTC...
iYBL055C_117203|0
0
H3
CGATGGCTACAACCAGGGCTACTCAGTATCCCCCACAAATAAACAGCAATAATTTTAATACTAATCAAGCATCTGTACCTCCACAAATGAGATCTAATCCACAACAGCCGCCTCAAGATAAACCAGCTGGCCAGTCAATTTGGTTGTAAGCAACATATATTGCTCAAAACGCACAAAAATAAACATATGTATATATAGACATACACACACACATATATATATATATATTATTATTATTATTTACATATACGTACACACAATTCCATATCGAGTTAATATATACAATTCTGGCCTTCTTACCTAAAAAGATGATAGCTAAA...
iYPL204W_165860|1
1
H3
TGGGAATCCGCAATTTGGATAAACTCCTGCACACAAATTTAGAAGCTTTATGTACTTATAGTTTGTCGAAGAATTATTTTTAGTAAAACTGCTTGCCAAACTCATGGACTTATAGATTGCGGACACAATATCAAAGTAATATGTGCCTACTATATCCCTTTGTTGCGGTTCTAGATGTAAAAGGGTATATAGCAGATCTAATTTCGCATTTTTTAACTCTACACCGTCACTTTGCTCAATGGACAACAAACATTGAAAAATCTGGCACTTTGGAGCCAAAAAGTAATCTACCTTAGCCAACTGGTCTCCCACCAGGGAAT...
YIL017C_VID28_320850|1
1
H3
CATATTTATAGAATATATAGGATAATTATCATCCCCTCAATCAAGTTGATATTCCGTTTTGACAACAGGTCACTTCTGCAAGGTTGCTTATATTAAAATTGTGAATAAGCATGATATTAGACGGACTCATAATTGAATGGTTATCAGTTAATTGACTCTCGGTAGCCAAGTTGGTTTAAGGCGCAAGACTGTAATTTACCACTACGAAATCTTGAGATCGGGCGTTCGACTCGCCCCCGGGAGATATTTTTTTACTTTTGATTACCTTAATTTTTGAAGATTCTCAATTTCATGTATTAGGGATTGCTTTAAGGTAGCTA...
itY(GUA)M2_838022|0
0
H3
TGAATTTATTCCGGGAATATTCAAGTTATGTATATCTCTTTTCATATTCTTAAATACACATACTCATAATATCTTGTCGAAAATACGCGGTGTAGGGAGTTATGGTGGATAACTTTTTCACGATTAGAAGAAAAGGAAAATTTCATTATTCGTAGCTTAACATGGCAAAAACGAGAAAGACATATAATCAAAACGTGAGTTTCCTGTGGAAAAAAAAAAAAGGGAACCTCTGGTTACGATGATATACCTGCGTGAAAAAGGACAGTTATTACCAATACATACAAAGGCTTAATAAGTGTAAAATATATATCTGCCGAGAC...
iYAL003W_143618|0
0
H3
TAGTGAGTCTTGGAGGCAATAAGGGGAATGCTCGGTTCTGGAATCCTAAAAATGTCCCTTTTCCTTTTGATGGAGATGATGACAAAGCCATCGTGGAGCATTACATTAGAGACAAGTATATTTTGGGTAAATTCAGGTATGATGAAATAAAGCCTGAAGACTTTGGATCCAGAATGGATGATTTTGATGGGGAATCGGACAGGTTTGATGAAAGAAATAGAAGTAGGAGCAGGAGCAGATCTCATTCTTTCTATAAAGGGGGCCATAATAGGTCTGACTACGGCGGTTCCAGGGACTCATTCCAAAGCAGTGGAAGCAGA...
YGL181W_GTS1_158421|1
1
H3
CTCTAACAGAAAAATCAGCCTCATTAAGCCATAGTGATTTGGGCGGCGAAATTTTAAATGGTACAGGAAAGAACCGCACCCCCAATGATGGCCAAGAATCAAATGAAAGTGATGGGAGTCCCGAAAGTGATGAGAGTCCCGAAAGTGAAGAAAGTAGCGACAACAGTGATTCGAGCGATAGTGACGATATGAGACCTTTACCGAGGCCATTATTTATGAAGAAAAAGGCCAATAATTTGCAGAAAGCTACCAAGATAGATCAACCCTGGAATGCCCAAGATGACGCACGAGTTCTGCAAACAAAGAAGGAAAATATGATA...
YBR152W_SPP381_546671|1
1
H3
TGATAGCGGATGAGAACGGAACAAACAGCGCTATAGCTAATGAGCAAGAGGAAAAATCCGAAGAAGTAAAAGCTGAAGATGATACTGGTGAAGAAGAAGAGGATGACCCAGTGATCGAAGAGTTTCCATTGAAGATCTCCGGAGAAGAGGAGTCACTGCACGTGTTTCAGTATGCTAATAGACCAAGGCTAGTAGGACGCAAACCTGCTGAGCATCCGTTTATCTCTGCAGCAAGATATAAACCCAAGTCGCACCTATGGGAAATAGATATTCCTTTGGATGAGCAGGCCTTCTATAACAAGGATAAGGCTGAGAGCGAA...
YKR025W_RPC37_487754|1
1
H3
TCCTATTGATTATGGGTTCGAATAGTACCAGATGTTTTGCCAATCCTAAATCGGTAGGAAAGTGGCTTGTCGTCGTCAGGCTTATTATCAACTCTTATGCACAAGAAAGGTACTCATCTTCTATAAACTACATAAGACCTGAATCTAATCAAAGGGAGAAAGCGCAGAACATCAGATTTAAAGCGGTTTTGCTTGATACACTCAGCCTTGTCTCTTTGTAAGGATTTTGGGGTACCTATGAATAATACATCTAGTAGTGTTAGTAAACCAACGTATGGGATTTTGGGATACATAGTTTTCCAGTGTTTCTTATCCGTGAT...
iYLR142W_427070|0
0
H3
GAAAGGTCTTTTGAGCCTTTGTCGGCAAATTCCTCGGAAGAGAAGTCCAACCTTATTGAATTTGAAGCATGAAATCGTGCTTATCAATTTTATGTCACCCTAAAACATCTGTACGTGTTTATATAGATATTTAAAGCAATATTTGCCAGGATTTGGTGAAGATCCCTCATATAACTCTCATAAATGCGGATTTTCGGAGCGAAAAAAGCCTAAATTCTTGTCTGGAAGTATAATTGGCGGTGAAATAGAAAAGGTGGCAATCACGACTGAAAAGGGTACAGCTTTCGCAACTGACATATACAGACAGTGAAAAGTAATAA...
iYGR100W_693276|1
1
H3
TTCATCCCTACCACTATATAATGGGAAAGAATTTCTTGCGAACATTCTCCTGATATGATTATTTCCTGGTACAGCCTTGTTAGCACTTCGTTCCAGCATTTCTATCTGGCTGTAAATATTCTCATGCGGCCCCATTGAAGGCATTGTAGTTCTAGGGACACCACTATGGCTCCCTCCGGGAGAGGCTGCAGGCGCACCTCCGCTATCTATGGTAATATGCTGAGCTGTAGTTGCATTTCCTGGTTCATAAGCAGTAGATGGATGGCTACCGCTCATTCTCCTTCCAGAAAATGTACCCGTGGTATTGACAGTACTCCCAT...
YBR280C_YBR280C_763833|1
1
H3
TGTCGTTTCATTATCTGTATGACTGTCGTAACTTTGAATCGATCTAATGTGTTGACCCTGTCTCAGGCTCACCCATGGCGGCGCCTGCACCTGTGGGTGAAGGAAGAAAGACGATGTTTGTGAGGGAACTGAATTGGGTTGAAGTTCATATCCTAAACAAACACTTCACCAGCCATGGATGCATGCCTTGTCTTTTCGCAGTTGGTGGCATGAAAATATATATCACCCACCAAACCCTCTTACTCTTTTCTTACCAAGTAACTCCAGTAAGTGCTCGTTTTTTTCTTCTTCCATTCAAACCTGCTTAAAAACCTCGACAA...
iYAL034C_82459|1
1
H3
TTTGTTGAAAATTTAGGCTAACATTTTCTAATAGGTGACATTACAAAATCCAGTCTTACAGTTATGCTTAATATTGACCAGTAGTCATATTACTGGCATATTATCTGTTAGAATAGTGACATACTATTCTTTATTACTGATGAATTTGTAATTGTGAAGCAATTAGACGCAAACGTCATTGATGTGTCAGGATGGAGAACGATATAGACTATGTGATAATGTAAAGTTAAATATCTGTTACTTGAACAATTAACTTGACGTTTGTATATGTAAGGATATGGGTCATTACATAGAAACTATAGTAAATAGTCTCAGTATTG...
itW(CCA)J_415984|0
0
H3
GTGTCCTCTTGGAGAAGCAAGATTGATATTCATGCAGGCTAGATTCAAGTATGGTATGACGAATGGCGTCACCATACCCTCTCAATTAAGATACTTAAGGTACCATGAGTTTTTTATTACCCATGAAAAAGCAGCCCAAGAAGGAATTTCTAATGAGGCGGTAAAATTCAAATTTAAATTCAGGCTTGCTAAAATGACGTTCCTCCGCCCATCAAGCTTAATCACTTCCGAGTCTGCTATTGTAACTACTAAGATTCAACACTACAACGATGATAGAAATGCCCTCCTAACCCGAAAAGTAGTATATTCAGATATCATGG...
YNL128W_TEP1_383243|1
1
H3
AAGTGCAGGAATGAGTCACTGTGCTCCATATATAGTAGGCTGTTCAAACTGGGTCTGTTTTTCGCCCAACTGTGTGTGAAAAGTGTGGTTTCTAGCGCAGAATTGCAAGACTGTATCAGCACCTCTCATTATGCGACTAAGCTGACCCGTTATTTCAATGATAATGGTAGTACACACGATGGTGCAGATGCAGGTGCTACCGTGCTGCCCACTGGCGACGATTTCCAGTACCTGTTTGAAAGGGACTATGTTACTTTTCTTCCGACGGGCGTGTTGACCATCTTCCCCTGTGCCAAGGCTATAAGGTACAAGCCATCGAC...
YJR046W_TAH11_522303|1
1
H3

Dataset Card for nucleotide_transformer_downstream_tasks

The nucleotide_transformer_downstream_tasks dataset features the 18 downstream tasks presented in the Nucleotide Transformer paper. They are a mix of binary and multi-class classification tasks intended to provide a consistent genomics benchmark.

⚠️ Note that we revised and improved our benchmark during the peer-review process. The datasets featured in this repository are those used up to this release. We strongly encourage moving to the new version available here, which we believe to be much more robust. ⚠️

Dataset Summary

The datasets are collected from 5 different genomics papers:

  • DeePromoter: Robust Promoter Predictor Using Deep Learning: The dataset features 3,065 TATA promoters and 26,532 non-TATA promoters. Each promoter yields one negative sequence, built by randomly sampling parts of the promoter sequence. The promoter_all dataset features all the promoters and their negative counterparts, while promoter_tata and promoter_no_tata provide the TATA and non-TATA parts of the dataset, respectively.
  • A deep learning framework for enhancer prediction using word embedding and sequence generation: To build the training dataset, the authors collect 742 strong enhancers, 742 weak enhancers and 1,484 non-enhancers, and augment the dataset with 6,000 synthetic enhancers and 6,000 synthetic non-enhancers produced with a generative model. The test dataset comprises 100 strong enhancers, 100 weak enhancers and 200 non-enhancers. The original paper uses this dataset for both binary classification (i.e. a sample is classified as non-enhancer or enhancer) and 3-class classification (i.e. a sample is classified as non-enhancer, weak enhancer or strong enhancer). These two tasks are covered by the enhancers and enhancers_types datasets, respectively.
  • SpliceFinder: ab initio prediction of splice sites using convolutional neural network: The authors introduce a dataset containing 10,000 samples each of donor sites, acceptor sites, and non-splice sites, for 30,000 samples in total. These are featured in the splice_sites_all dataset.
  • Spliceator: multi-species splice site prediction using convolutional neural networks: This paper introduces two datasets, each containing splice sites and their corresponding negative examples. The splice_sites_acceptor dataset features acceptor splice sites, and splice_sites_donor features donor splice sites.
  • Qualitatively predicting acetylation and methylation areas in DNA sequences: The paper introduces a set of datasets featuring epigenetic marks identified in the yeast genome, namely acetylation and methylation nucleosome occupancies. Nucleosome occupancy values in these ten datasets were obtained with ChIP-chip experiments and further processed into positive and negative observations to provide the datasets corresponding to the following histone marks: H3, H4, H3K9ac, H3K14ac, H4ac, H3K4me1, H3K4me2, H3K4me3, H3K36me3 and H3K79me3.
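
As a quick sanity check, the sample counts quoted in the list above can be reconciled with the train and test sizes given in the Dataset Structure table. The short Python sketch below does that arithmetic for the enhancer and SpliceFinder datasets. The variable names are illustrative only and are not part of the dataset or of any library.

```python
# Counts quoted in the Dataset Summary list (enhancer paper).
enhancer_train_parts = {
    "strong enhancers": 742,
    "weak enhancers": 742,
    "non-enhancers": 1484,
    "synthetic enhancers": 6000,
    "synthetic non-enhancers": 6000,
}
enhancer_test_parts = {
    "strong enhancers": 100,
    "weak enhancers": 100,
    "non-enhancers": 200,
}

# SpliceFinder: 10,000 samples for each of three classes.
splice_samples_per_class = 10_000
splice_classes = ("donor", "acceptor", "non-splice-site")

# (train, test) sizes copied from the Dataset Structure table.
table_sizes = {
    "enhancers": (14_968, 400),
    "enhancers_types": (14_968, 400),
    "splice_sites_all": (27_000, 3_000),
}

enhancer_train_total = sum(enhancer_train_parts.values())
enhancer_test_total = sum(enhancer_test_parts.values())
splice_total = splice_samples_per_class * len(splice_classes)

# Both enhancer tasks share the same sequences, so both must match.
for task in ("enhancers", "enhancers_types"):
    train, test = table_sizes[task]
    assert train == enhancer_train_total
    assert test == enhancer_test_total

# The 30,000 SpliceFinder samples are divided between train and test.
assert sum(table_sizes["splice_sites_all"]) == splice_total

print(enhancer_train_total, enhancer_test_total, splice_total)
```

The enhancer training size is the sum of the real and synthetic sequences, and the 30,000 SpliceFinder samples correspond to the 27,000 train plus 3,000 test sequences of splice_sites_all.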

Dataset Structure

| Task | Number of train sequences | Number of test sequences | Number of labels | Sequence length |
| --------------------- | ------------------------- | ------------------------ | ---------------- | --------------- |
| promoter_all | 53,276 | 5,920 | 2 | 300 |
| promoter_tata | 5,509 | 621 | 2 | 300 |
| promoter_no_tata | 47,767 | 5,299 | 2 | 300 |
| enhancers | 14,968 | 400 | 2 | 200 |
| enhancers_types | 14,968 | 400 | 3 | 200 |
| splice_sites_all | 27,000 | 3,000 | 3 | 400 |
| splice_sites_acceptor | 19,961 | 2,218 | 2 | 600 |
| splice_sites_donor | 19,775 | 2,198 | 2 | 600 |
| H3 | 13,468 | 1,497 | 2 | 500 |
| H4 | 13,140 | 1,461 | 2 | 500 |
| H3K9ac | 25,003 | 2,779 | 2 | 500 |
| H3K14ac | 29,743 | 3,305 | 2 | 500 |
| H4ac | 30,685 | 3,410 | 2 | 500 |
| H3K4me1 | 28,509 | 3,168 | 2 | 500 |
| H3K4me2 | 27,614 | 3,069 | 2 | 500 |
| H3K4me3 | 33,119 | 3,680 | 2 | 500 |
| H3K36me3 | 31,392 | 3,488 | 2 | 500 |
| H3K79me3 | 25,953 | 2,884 | 2 | 500 |
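
The table can also be used as a lightweight schema when inspecting rows. The following is a minimal Python sketch, not an official utility: it assumes each row is a dict with the `sequence`, `label` and `task` columns shown in the dataset viewer, that labels are integers from 0 to (number of labels - 1), and that the "Sequence length" column can be treated as an upper bound. `TASK_SPECS`, `VALID_BASES` and `validate_row` are hypothetical names introduced for illustration.

```python
# task -> (number of labels, sequence length), copied from the table above.
TASK_SPECS = {
    "promoter_all": (2, 300),
    "promoter_tata": (2, 300),
    "promoter_no_tata": (2, 300),
    "enhancers": (2, 200),
    "enhancers_types": (3, 200),
    "splice_sites_all": (3, 400),
    "splice_sites_acceptor": (2, 600),
    "splice_sites_donor": (2, 600),
    "H3": (2, 500),
    "H4": (2, 500),
    "H3K9ac": (2, 500),
    "H3K14ac": (2, 500),
    "H4ac": (2, 500),
    "H3K4me1": (2, 500),
    "H3K4me2": (2, 500),
    "H3K4me3": (2, 500),
    "H3K36me3": (2, 500),
    "H3K79me3": (2, 500),
}

# Assumption: sequences use A/C/G/T, with N tolerated for unknown bases.
VALID_BASES = set("ACGTN")


def validate_row(row):
    """Return a list of problems found in one row (empty if none)."""
    spec = TASK_SPECS.get(row["task"])
    if spec is None:
        return [f"unknown task {row['task']!r}"]
    num_labels, max_length = spec
    problems = []
    if not 0 <= row["label"] < num_labels:
        problems.append(f"label {row['label']} outside 0..{num_labels - 1}")
    if len(row["sequence"]) > max_length:
        problems.append(f"sequence longer than {max_length}")
    if set(row["sequence"].upper()) - VALID_BASES:
        problems.append("sequence contains unexpected characters")
    return problems


# Synthetic rows for illustration; these are not real dataset entries.
good_row = {"sequence": "ACGT" * 100, "label": 1, "task": "H3"}
bad_row = {"sequence": "ACGT" * 100, "label": 2, "task": "H3"}
print(validate_row(good_row))
print(validate_row(bad_row))
```

The multi-class tasks, enhancers_types and splice_sites_all, are the only ones for which a label of 2 would pass this check.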