Posts by Collection

portfolio

publications

Fair Data Representation for Machine Learning at the Pareto Frontier

Published in Journal of Machine Learning Research (JMLR), 24:1–63, 2023

We develop a principled framework for constructing fair data representations that achieve explicit, controllable trade-offs between utility and fairness—characterizing and computing solutions along a Pareto frontier.

Recommended citation: Shizhou Xu, Thomas Strohmer. (2023). “Fair Data Representation for Machine Learning at the Pareto Frontier.” Journal of Machine Learning Research, 24, 1–63.
Download Paper | Download Slides

On the (In)Compatibility between Individual and Group Fairness

Under review at SIAM Journal on Mathematics of Data Science (SIMODS), 2024

We analyze fundamental tensions between individual and group fairness notions, clarifying when they can or cannot be simultaneously satisfied and what trade-offs are unavoidable.

Recommended citation: Shizhou Xu, Thomas Strohmer. (2024). “On the (In)Compatibility between Individual and Group Fairness.” Under review at SIAM Journal on Mathematics of Data Science.
Download Paper | Download Slides

WHOMP: Improving Upon Randomized Controlled Trials via Wasserstein Homogeneity

Under review at Journal of the American Statistical Association, 2024

We introduce WHOMP, a Wasserstein-homogeneity optimality principle for subgroup splitting in comparative experiments (clinical trials, social experiments, and A/B tests). The method yields interpretable criteria, efficient estimators, and strong empirical gains over random partitioning, covariate-adaptive randomization, rerandomization, and anti-clustering baselines.

Recommended citation: Shizhou Xu, Thomas Strohmer. (2024). “WHOMP: Improving Upon Randomized Controlled Trials via Wasserstein Homogeneity.” Under review at Journal of the American Statistical Association.
Download Paper | Download Slides

Machine Unlearning via Information-Theoretic Regularization

Under review at Mathematical Foundations of Machine Learning, 2025

We develop information-theoretic regularization principles for machine unlearning, aiming to remove targeted information while maintaining general utility and enabling principled evaluation.

Recommended citation: Shizhou Xu, Thomas Strohmer. (2025). “Machine Unlearning via Information-Theoretic Regularization.” Under review at Mathematical Foundations of Machine Learning.
Download Paper | Download Slides

Multi-resolution Enhancement for Full Spectrum Neural Representations

Under review at Nature Machine Intelligence, 2025

We develop multi-resolution enhancement strategies for full-spectrum neural representations, improving fidelity across scales with an emphasis on robust learning and generalization.

Recommended citation: Yuan Ni, Z. Chen, Shizhou Xu, C. Peng, R. Plumley, C. H. Yoon, J. Thayer, J. Turner. (2025). “Multi-resolution Enhancement for Full Spectrum Neural Representations.” Under review at Nature Machine Intelligence.
Download Paper | Download Slides

talks

Fairness in Machine Learning

Invited talk introducing core fairness notions, practical pitfalls, and research directions in trustworthy ML.

teaching

MATH 127C — Real Analysis (Summer 2025)

2025

This page mirrors announcements, policies, and a living schedule for the Summer 2025 offering. Lecture notes and problem sets reflect the topics covered this term: metric spaces, compactness and connectedness, multivariable differentiability (the Jacobian and the chain, implicit, and inverse function theorems), $k$-volume and Gram determinants, change of variables, the Fubini and Tonelli theorems, and the Green, Stokes, and Divergence theorems.
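
As a quick reference for the $k$-volume topic listed above, here is the standard Gram-determinant formula (a reminder, not part of the official course materials):

```latex
% k-volume of the parallelepiped spanned by v_1, ..., v_k in R^n:
% collect the vectors as the columns of an n-by-k matrix A; the
% k-by-k Gram matrix A^T A has entries <v_i, v_j>, and its
% determinant gives the squared k-volume.
\[
  \operatorname{vol}_k(v_1, \dots, v_k)
  \;=\; \sqrt{\det\!\left(A^{\top} A\right)},
  \qquad
  \left(A^{\top} A\right)_{ij} = \langle v_i, v_j \rangle .
\]
```

For $k = n$ this reduces to $\lvert \det A \rvert$, which is how the formula connects to the Jacobian factor in the change-of-variables theorem.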