paper_id (string, 221 values) | title (string, 221 values) | model (string, 1 value) | strategy (string, 6 values) | pair_id (string, 37 values) | variant_id (int64, 1 to 3) | style (string, 5 values) | query (string, 217 to 1.23k chars) | input_context (string, 77 to 187k chars) | oracle_response (string, 38 to 201k chars) | response (string, 679 to 34.8k chars) | finished (bool, 2 classes) | response_chars (int64, 679 to 34.8k) | score (float64, 0 to 1) | n_criteria (int64, 0 to 10) | n_yes (int64, 0 to 10) | rubric_criteria (string, 91 to 4.74k chars) | judged_criteria (string, 2 to 6.97k chars) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0FJYicpOj0 | Gaussian certified unlearning in high dimensions: A hypothesis testing approach | cmu-pope-4b-v0300-iter-099 | direction_to_idea | p001 | 1 | concise | I am working on approximate machine unlearning in high-dimensional regimes where the number of parameters is comparable to the sample size ($p \sim n$). Current approaches like Zou et al. (2025) rely on standard $\epsilon$-certifiability and Laplace noise, which requires at least two Newton steps even for a single dele... | Modern ML models (from healthcare tools to systems like ChatGPT) are trained on data that often includes sensitive personal information, creating risks of memorization and leakage. Laws such as Union (2016); of California Department of Justice (2018); of Canada (2022) enforce a "right to be forgotten," requiring remova... | **Comparing with conclusions of Zou et al. (2025).** In high dimensions, the privacy-accuracy interplay is subtler than in low dimensions. Zou et al. (2025) is the only theoretical work we know of that has studied machine unlearning when $p \sim n$ by relaxing some of the standard optimization assumptions on the loss, ... | To address the challenge of high-dimensional machine unlearning ($p \sim n$) where existing two-step Newton-based methods (like Zou et al. 2025) impose unnecessary computational overhead, we must fundamentally decouple the **certifiability condition** from the **optimization step count** and introduce a mechanism that ... | true | 17,876 | 0.1 | 10 | 1 | [{"criterion_id": 1, "aspect": "key_idea", "criterion": "Does the response propose using a single noisy Gaussian-Newton step to achieve unlearning?", "reference_excerpt": "develop an approach where one noisy Gaussian-Newton step suffices to unlearn a regularized empirical risk minimizer."}, {"criterion_id": 2, "aspect"... 
| [{"criterion_id": 1, "aspect": "key_idea", "criterion": "Does the response propose using a single noisy Gaussian-Newton step to achieve unlearning?", "reference_excerpt": "develop an approach where one noisy Gaussian-Newton step suffices to unlearn a regularized empirical risk minimizer.", "judgment": "no", "evidence":... |
0FJYicpOj0 | Gaussian certified unlearning in high dimensions: A hypothesis testing approach | cmu-pope-4b-v0300-iter-099 | problem_gap_to_method | p003 | 2 | detailed | Most theoretical work on machine unlearning assumes the number of model parameters is much smaller than the sample size. However, in high-dimensional settings, these standard optimization assumptions fail. We want to develop an approximate unlearning method that avoids full retraining by updating the original estimator... | Modern ML models (from healthcare tools to systems like ChatGPT) are trained on data that often includes sensitive personal information, creating risks of memorization and leakage. Laws such as Union (2016); of California Department of Justice (2018); of Canada (2022) enforce a "right to be forgotten," requiring remova... | To process deletions efficiently, without full retraining, and obscure residual information, we update $\hat{\beta}$ via the randomized procedure below.[^2]
$$\tilde{\beta}_{\setminus \mathcal{M}} := \bar{A}(\hat{\beta}, \mathcal{D}_{\mathcal{M}}, T(\mathcal{D}_n), \mathbf{b}), \text{ where } \mathbf{b} \text{ is a ran... | To address the challenge of machine unlearning in high-dimensional settings ($d \gg n$, where $d$ is the parameter dimension and $n$ is the original sample size, with $n$ potentially smaller than or comparable to $d$), we must abandon the standard convexity and global gradient descent convergence assumptions that under... | true | 14,530 | 0.9 | 10 | 9 | [{"criterion_id": 1, "aspect": "key_idea", "criterion": "Does the response propose updating the original estimator using a randomized procedure to avoid full retraining?", "reference_excerpt": "To process deletions efficiently without full retraining... you could update the original estimator using a randomized procedu... | [{"criterion_id": 1, "aspect": "key_idea", "criterion": "Does the response propose updating the original estimator using a randomized procedure to avoid full retraining?", "reference_excerpt": "To process deletions efficiently without full retraining... you could update the original estimator using a randomized procedu... |
0FJYicpOj0 | Gaussian certified unlearning in high dimensions: A hypothesis testing approach | cmu-pope-4b-v0300-iter-099 | method_to_experiments | p004 | 2 | detailed | I am working on a machine unlearning algorithm that avoids full retraining by updating the original estimator $\hat{\beta}$ using a randomized procedure with Gaussian noise $\mathbf{b}$. The goal is to satisfy $(\phi, \epsilon)$-Gaussian certifiability for privacy while maintaining accuracy, measured by Generalization ... | To process deletions efficiently, without full retraining, and obscure residual information, we update $\hat{\beta}$ via the randomized procedure below.[^2]
$$\tilde{\beta}_{\setminus \mathcal{M}} := \bar{A}(\hat{\beta}, \mathcal{D}_{\mathcal{M}}, T(\mathcal{D}_n), \mathbf{b}), \text{ where } \mathbf{b} \text{ is a ran... | **Logistic regression model.** For simplicity, we use ridge penalized $r(\beta) = \|\beta\|_2^2$ logistic loss $\ell(y|x^\top \beta)$, with random features $x_i \sim N(0, \frac{1}{n} I_p)$ and true parameter $\beta_* \sim N(0, I_p)$. To depict high dimensionality, we take $n = p$, penalty parameter $\lambda = 0.5$, and... | To demonstrate the superior performance of Gaussian perturbation over Laplace noise in your machine unlearning algorithm regarding Generalization Error Divergence (GED) under $(\phi, \epsilon)$-Gaussian certifiability, you must construct an experimental framework that rigorously disentangles the effects of dimensionali... | true | 16,188 | 0.5 | 10 | 5 | [{"criterion_id": 1, "aspect": "key_idea", "criterion": "Does the response propose evaluating the Generalization Error Divergence (GED) across both synthetic and real-world datasets?", "reference_excerpt": "design a comprehensive suite of experiments evaluating the Generalization Error Divergence (GED) across synthetic... | [{"criterion_id": 1, "aspect": "key_idea", "criterion": "Does the response propose evaluating the Generalization Error Divergence (GED) across both synthetic and real-world datasets?", "reference_excerpt": "design a comprehensive suite of experiments evaluating the Generalization Error Divergence (GED) across synthetic... |
0FJYicpOj0 | Gaussian certified unlearning in high dimensions: A hypothesis testing approach | cmu-pope-4b-v0300-iter-099 | results_to_analysis | p006 | 1 | concise | In our unlearning experiments with large dimensions ($p=1000$ to $10000$), Gaussian noise consistently achieves lower Generalization Error Difference (GED) and Unlearning Error Difference (UED) than eps-delta and Laplace noise. To formally prove our theoretical results in this high-dimensional setting, what standard as... | ### 5.1 LARGE DIMENSIONAL AND REAL DATA EXPERIMENTS
In this section, we report additional experiments with larger dimension $p$. The problem setting is exactly the same as in the previous section, but $p$ now varies from 1000 to 10000. Figure 4 shows that the same pattern as in Figure 1 earlier persists here. In the abov... | Assumption (B1) on mean-zero sub-Gaussian features is mild and frequently made in papers on high-dimensional statistics. See for example Miolane & Montanari (2021); Rahnama Rad & Maleki (2020); Auddy et al. (2024); Zheng et al. (2017); Donoho & Montanari (2016); Bellec et al. (2025). Assumption (B2) is not a str... | To formally prove theoretical results regarding the performance of Gaussian noise versus $\epsilon$-$\delta$ (Exponential Mechanism) and Laplace noise in the context of *unlearning* large-scale models ($p \in [1000, 10000]$), you must move beyond generic privacy-unlearning trade-offs and construct a framework that spec... | true | 16,199 | 0.5 | 10 | 5 | [{"criterion_id": 1, "aspect": "key_idea", "criterion": "Does the response state that the theoretical proofs rely on two main assumptions?", "reference_excerpt": "rely on two standard assumptions"}, {"criterion_id": 2, "aspect": "key_idea", "criterion": "Does the response identify that one of the assumptions must be ma... | [{"criterion_id": 1, "aspect": "key_idea", "criterion": "Does the response state that the theoretical proofs rely on two main assumptions?", "reference_excerpt": "rely on two standard assumptions", "judgment": "no", "evidence": "The response categorizes assumptions into three areas (data geometry, model architecture, a... |
0FJYicpOj0 | Gaussian certified unlearning in high dimensions: A hypothesis testing approach | cmu-pope-4b-v0300-iter-099 | results_to_analysis | p007 | 3 | conversational | I'm trying to write up the theoretical analysis for our unlearning results. We saw great performance in high dimensions ($p \sim n$), but I need to explain why prior theoretical frameworks don't apply here. Specifically, I want to show that assuming $\mu I_p \preceq \nabla^2 f(\beta, z_i) \preceq L I_p$ with constant $... | 
<!-- Figure 1 description: Three log-log plots showing Generalization Error Divergence (GED) vs.... | **Failure of the assumptions in Allouah et al. (2025b), Sekhari et al. (2021).** In their theory, Allouah et al. (2025b) assume[^1] that $f$ satisfies the standard optimization assumptions. More precisely, they assume $\beta \to f(\beta, z_i)$ is simultaneously $\mu$-strongly convex and $L$-smooth, i.e.,
$$\mu I_p \pre... | To address the theoretical gap your unlearning framework encounters in the high-dimensional regime ($p \sim n$), we must first deconstruct the assumptions of standard convex optimization literature (e.g., those relying on uniform Hessian bounds $\mu I \preceq \nabla^2 f \preceq L I$). These bounds imply that the loss l... | true | 18,872 | 0.3 | 10 | 3 | [{"criterion_id": 1, "aspect": "key_idea", "criterion": "Does the response state that simultaneous strong convexity and smoothness conditions are incompatible in the high-dimensional proportional regime?", "reference_excerpt": "demonstrate that simultaneous strong convexity and smoothness conditions are incompatible."}... | [{"criterion_id": 1, "aspect": "key_idea", "criterion": "Does the response state that simultaneous strong convexity and smoothness conditions are incompatible in the high-dimensional proportional regime?", "reference_excerpt": "demonstrate that simultaneous strong convexity and smoothness conditions are incompatible.",... |
0FJYicpOj0 | Gaussian certified unlearning in high dimensions: A hypothesis testing approach | cmu-pope-4b-v0300-iter-099 | mixed_context | p008 | 1 | concise | "We have demonstrated that a Gaussian-perturbed one-step Newton estimator achieves vanishing General(...TRUNCATED) | "