Paper • 2506.14963 • Published
_schema_version (int64) | checks_pass (bool) | date (string) | info (string) | name (string) | statistics (dict) | tests (dict) | train (list) | valid (list) |
|---|---|---|---|---|---|---|---|---|
1 | true | 20241104T180759.080Z | Method: cc, train group a; | cc_train_a | {"e0s":[4.796298934616344e-15,-16.370982192772885,-2.2737367544323206e-13,0.0,-8.526512829121202e-14(...TRUNCATED) | {"test":[10018566,10018791,10018906,10018937,10018978,10019015,10019038,10019064,10019110,10019121,1(...TRUNCATED) | [7692147,4307087,1751120,1653968,8939535,988562,4578530,2987868,1728538,6206514,2259596,471787,90679(...TRUNCATED) | [9431859,4012024,7864607,9080832,774747,9693758,1541172,1775669,3204223,9532044,9238425,9779083,3013(...TRUNCATED) |
1 | true | 20241104T180842.323Z | Method: cc, train group b; | cc_train_b | {"e0s":[-2.489516883226371e-15,-16.322400217850827,-4.547473508864641e-13,4.547473508864641e-13,2.27(...TRUNCATED) | {"test":[10018566,10018791,10018906,10018937,10018978,10019015,10019038,10019064,10019110,10019121,1(...TRUNCATED) | [3386514,8323364,3436202,5005708,5818730,7496519,6541024,44717,3430419,6232478,7059307,8352013,54268(...TRUNCATED) | [2583107,4732278,2093296,2094847,7352750,8904505,10287713,3962183,2528987,4740886,654169,197683,5453(...TRUNCATED) |
1 | true | 20241104T180925.080Z | Method: cc, train group c; | cc_train_c | {"e0s":[-2.6202336020446953e-17,-16.327483854122875,-3.410605131648481e-13,-4.547473508864641e-13,-5(...TRUNCATED) | {"test":[10018566,10018791,10018906,10018937,10018978,10019015,10019038,10019064,10019110,10019121,1(...TRUNCATED) | [5585650,3257833,4450586,8457743,4891517,8857520,4430570,1367683,8521576,5598348,10059849,9365038,75(...TRUNCATED) | [8876870,1267503,1487695,3142634,3976550,6151003,10223798,3486803,8499425,4913675,2225527,3563199,55(...TRUNCATED) |
1 | false | 20241104T181006.672Z | Method: cc, train group d; | cc_train_d | {"e0s":[-7.65218838620791e-15,-16.32756945197798,4.547473508864641e-13,-5.684341886080802e-14,0.0,0.(...TRUNCATED) | {"test":[10018566,10018791,10018906,10018937,10018978,10019015,10019038,10019064,10019110,10019121,1(...TRUNCATED) | [4509553,2306821,7547802,4678486,8104194,6733913,687286,677569,5257433,7124472,2771130,6972823,16012(...TRUNCATED) | [3351192,5094370,1982836,5224900,5700744,5416703,679592,9316820,9317175,4554214,8735067,3183297,3334(...TRUNCATED) |
1 | true | 20241104T180440.627Z | Method: dft, train group a; | dft_train_a | {"e0s":[4.798304311368801e-15,-16.28322630297248,-2.2737367544323206e-13,0.0,-8.526512829121202e-14,(...TRUNCATED) | {"test":[10018567,10018792,10018907,10018938,10018979,10019016,10019039,10019065,10019111,10019122,1(...TRUNCATED) | [7692148,4307088,1751121,1653969,8939536,988563,4578531,2987869,1728539,6206515,2259597,471788,90679(...TRUNCATED) | [9431860,4012025,7864608,9080833,774748,9693759,1541173,1775670,3204224,9532045,9238426,9779084,3013(...TRUNCATED) |
1 | true | 20241104T180530.523Z | Method: dft, train group b; | dft_train_b | {"e0s":[-2.4909113201813763e-15,-16.236183482785236,-4.547473508864641e-13,4.547473508864641e-13,2.2(...TRUNCATED) | {"test":[10018567,10018792,10018907,10018938,10018979,10019016,10019039,10019065,10019111,10019122,1(...TRUNCATED) | [3386515,8323365,3436203,5005709,5818731,7496520,6541025,44718,3430420,6232479,7059308,8352014,54268(...TRUNCATED) | [2583108,4732279,2093297,2094848,7352751,8904506,10287714,3962184,2528988,4740887,654170,197684,5453(...TRUNCATED) |
1 | true | 20241104T180620.338Z | Method: dft, train group c; | dft_train_c | {"e0s":[-2.6503842287823075e-17,-16.233412491940953,-3.979039320256561e-13,-4.547473508864641e-13,-5(...TRUNCATED) | {"test":[10018567,10018792,10018907,10018938,10018979,10019016,10019039,10019065,10019111,10019122,1(...TRUNCATED) | [5585651,3257834,4450587,8457744,4891518,8857521,4430571,1367684,8521577,5598349,10059850,9365039,75(...TRUNCATED) | [8876871,1267504,1487696,3142635,3976551,6151004,10223799,3486804,8499426,4913676,2225528,3563200,55(...TRUNCATED) |
1 | false | 20241104T180709.593Z | Method: dft, train group d; | dft_train_d | {"e0s":[-7.657009917488205e-15,-16.26207329868862,4.547473508864641e-13,-5.684341886080802e-14,0.0,0(...TRUNCATED) | {"test":[10018567,10018792,10018907,10018938,10018979,10019016,10019039,10019065,10019111,10019122,1(...TRUNCATED) | [4509554,2306822,7547803,4678487,8104195,6733914,687287,677570,5257434,7124473,2771131,6972824,16012(...TRUNCATED) | [3351193,5094371,1982837,5224901,5700745,5416704,679593,9316821,9317176,4554215,8735068,3183298,3334(...TRUNCATED) |
1 | true | 20241104T181048.401Z | Method: xtb, train group a; | xtb_train_a | {"e0s":[2.452596662187856e-16,-13.800419630929092,-1.4210854715202004e-14,0.0,-3.552713678800501e-15(...TRUNCATED) | {"test":[10020608,10020839,10020959,10020995,10021035,10021072,10021091,10021120,10021161,10021175,1(...TRUNCATED) | [7694377,4309246,1753261,1656110,8941697,990751,4580682,2989963,1730687,6208642,2261748,473810,90700(...TRUNCATED) | [9433991,4014170,7866733,9082976,776913,9695809,1543287,1777775,3206374,9534126,9240481,9781220,3015(...TRUNCATED) |
1 | true | 20241104T181155.044Z | Method: xtb, train group b; | xtb_train_b | {"e0s":[-1.3383430805357442e-16,-13.767883750908922,-2.4868995751603507e-14,4.263256414560601e-14,1.(...TRUNCATED) | {"test":[10020608,10020839,10020959,10020995,10021035,10021072,10021091,10021120,10021161,10021175,1(...TRUNCATED) | [3388645,8325493,3438305,5007846,5820866,7498623,6543183,46771,3432548,6234596,7061450,8354144,54290(...TRUNCATED) | [2585239,4734369,2095488,2097028,7354853,8906651,10289844,3964319,2531145,4742989,656342,199816,5456(...TRUNCATED) |
Multi-Fidelity Training of Machine-Learned Force Fields — Dataset
Dataset accompanying the paper Understanding Multi-Fidelity Training of Machine-Learned Force-Fields.
File Structure
data/
├── data.lmdb # LMDB database with atomic structures and labels
├── metadata.parquet # Lightweight metadata index
├── schema.json # Schema for the metadata
└── {method}_train_{a,b,c,d}.json # Train/validation/test split definitions (12 files)
- data.lmdb: The main database, containing atomic positions, atomic numbers, energies, and forces for each structure.
- metadata.parquet: A metadata index with the columns formula, conformation_idx, method, n_atoms, energy, forces_present, energy_unit, forces_unit, and idx.
- {method}_train_{a,b,c,d}.json: Split files defining train, validation, and test indices for each method (dft, xtb, cc) and training group (a–d). Indices reference entries in the LMDB database.
- schema.json: Schema definition for the metadata fields.
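As a rough illustration of how the split files described above might be consumed, the sketch below enumerates the 12 expected file names and pulls the index lists out of one split record. It is not taken from the accompanying repository. It assumes each JSON file holds a single object with the key layout visible in the dataset preview (top-level train and valid lists, and test indices nested under tests["test"]); the helper names split_filenames, parse_split, load_split, and overlapping_indices are hypothetical.

```python
import json
from itertools import product
from pathlib import Path

METHODS = ("dft", "xtb", "cc")   # methods named in this card
GROUPS = ("a", "b", "c", "d")    # training groups named in this card


def split_filenames():
    """Return the 12 names implied by the {method}_train_{group}.json pattern."""
    return [f"{m}_train_{g}.json" for m, g in product(METHODS, GROUPS)]


def parse_split(record):
    """Extract the index lists from one decoded split record.

    Assumption: the record follows the layout shown in the dataset preview,
    with "train" and "valid" at the top level and test indices stored
    under record["tests"]["test"].
    """
    return {
        "train": list(record["train"]),
        "valid": list(record["valid"]),
        "test": list(record["tests"]["test"]),
    }


def load_split(path):
    """Read one split file (assumed to be a single JSON object) from disk."""
    with Path(path).open() as fh:
        return parse_split(json.load(fh))


def overlapping_indices(split):
    """Return indices that occur in more than one of train/valid/test."""
    train, valid, test = (set(split[k]) for k in ("train", "valid", "test"))
    return (train & valid) | (train & test) | (valid & test)
```

A call such as load_split("data/cc_train_a.json") would then return a dictionary of index lists, and overlapping_indices can serve as a quick sanity check before the indices are used to look up entries in the LMDB database. How those indices map to LMDB keys is not specified in this card, so that step is left out here.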
Code
The code to reproduce the experiments in the paper is available at github.com/microsoft/multi-fidelity-training-mlff.
Citation
@online{Gardner2025Understanding,
title = {Understanding Multi-Fidelity Training of Machine-Learned Force-Fields},
author = {Gardner, John L. A. and Schulz, Hannes and Helie, Jean and Sun, Lixin and Simm, Gregor N. C.},
date = {2025-06-17},
eprint = {2506.14963},
eprinttype = {arXiv},
eprintclass = {physics},
doi = {10.48550/arXiv.2506.14963},
url = {http://arxiv.org/abs/2506.14963}
}
License
This dataset is released under the MIT License.
Contact
- John Gardner — johngardner@microsoft.com
- Gregor Simm — gregorsimm@microsoft.com
