Sitemap
A list of all the posts and pages found on the site. An XML version is also available for robots to digest.
Pages
Posts
portfolio
publications
Fair Data Representation for Machine Learning at the Pareto Frontier
Published in Journal of Machine Learning Research (JMLR), 24:1–63, 2023
We develop a principled framework for constructing fair data representations that achieve explicit, controllable trade-offs between utility and fairness—characterizing and computing solutions along a Pareto frontier.
Recommended citation: Shizhou Xu, Thomas Strohmer. (2023). “Fair Data Representation for Machine Learning at the Pareto Frontier.” Journal of Machine Learning Research, 24, 1–63.
Download Paper | Download Slides
On the (In)Compatibility between Individual and Group Fairness
Under review at SIAM Journal on Mathematics of Data Science (SIMODS), 2024
We analyze fundamental tensions between individual and group fairness notions, clarifying when they can or cannot be simultaneously satisfied and what trade-offs are unavoidable.
Recommended citation: Shizhou Xu, Thomas Strohmer. (2024). “On the (In)Compatibility between Individual and Group Fairness.” Under review at SIAM Journal on Mathematics of Data Science.
Download Paper | Download Slides
WHOMP: Improving Upon Randomized Controlled Trials via Wasserstein Homogeneity
Under review at the Journal of the American Statistical Association, 2024
We introduce WHOMP, a Wasserstein-homogeneity optimality principle for subgroup splitting in comparative experiments (clinical trials, social experiments, and A/B tests). The method yields interpretable criteria, efficient estimators, and strong empirical gains over random partitioning, covariate-adaptive randomization, rerandomization, and anti-clustering baselines.
Recommended citation: Shizhou Xu, Thomas Strohmer. (2024). “WHOMP: Improving Upon Randomized Controlled Trials via Wasserstein Homogeneity.” Under review at Journal of the American Statistical Association.
Download Paper | Download Slides
Forgetting-MarI: LLM Unlearning via Marginal Information Regularization
Under review at ICLR 2026, 2025
We propose marginal-information regularization for LLM unlearning, achieving targeted forgetting with strong utility retention and a practical, evaluation-driven design.
Recommended citation: Shizhou Xu, Yuan Ni, Stefan Broecker, Thomas Strohmer. (2025). “Forgetting-MarI: LLM Unlearning via Marginal Information Regularization.” Under review at ICLR 2026.
Download Paper | Download Slides
Machine Unlearning via Information-Theoretic Regularization
Manuscript (available on request), 2025
We develop information-theoretic regularization principles for machine unlearning, aiming to remove targeted information while maintaining general utility and enabling principled evaluation.
Recommended citation: Shizhou Xu, Thomas Strohmer. (2025). “Machine Unlearning via Information-Theoretic Regularization.” Manuscript.
Download Paper | Download Slides
Multi-resolution Enhancement for Full Spectrum Neural Representations
Under review at Nature Machine Intelligence, 2025
We develop multi-resolution enhancement strategies for full-spectrum neural representations, improving fidelity across scales with an emphasis on robust learning and generalization.
Recommended citation: Yuan Ni, Z. Chen, Shizhou Xu, C. Peng, R. Plumley, C. H. Yoon, J. Thayer, J. Turner. (2025). “Multi-resolution Enhancement for Full Spectrum Neural Representations.” Under review at Nature Machine Intelligence.
Download Paper | Download Slides
Utility–Separation Pareto Frontier: An Information-Theoretic Characterization
Under review at the Journal of Machine Learning Research (JMLR), 2025
We provide an information-theoretic characterization of the utility–separation trade-off, yielding a principled Pareto frontier perspective for designing and evaluating separation-based objectives.
Recommended citation: Shizhou Xu. (2025). “Utility–Separation Pareto Frontier: An Information-Theoretic Characterization.” Under review at Journal of Machine Learning Research.
Download Paper | Download Slides
talks
Fair Data Representation for Machine Learning
Talk on fairness objectives and data pre-processing approaches for controlling fairness–accuracy trade-offs.
Fairness in Machine Learning
Invited talk introducing core fairness notions, practical pitfalls, and research directions in trustworthy ML.
Fair Data Representation for Machine Learning at the Pareto Frontier
Workshop talk on provable fairness–utility trade-offs and algorithmic construction along the Pareto frontier.
Fair Data Representation for Machine Learning at the Pareto Frontier
Conference presentation of the JMLR work on fair data representation via Pareto-frontier trade-offs.
Fair Data Representation for Machine Learning at the Pareto Frontier
Invited talk on Pareto-frontier methods for fair data representation with provable trade-offs.
Machine Unlearning for Scientific Discovery
Invited talk on why unlearning matters for scientific workflows (data governance, model updates, and reliability), and how to evaluate it.
Machine Unlearning via Information-Theoretic Regularization
Seminar talk on information-theoretic unlearning: objectives, algorithms, and empirical evaluation considerations.
WHOMP: Improving Upon Randomized Controlled Trials via Wasserstein Homogeneity
Seminar talk introducing WHOMP, theory/algorithms, and empirical comparisons for experimental design.
WHOMP: Improving Upon Randomized Controlled Trials via Wasserstein Homogeneity
Spotlight talk presenting WHOMP and its advantages over classical randomization and rerandomization baselines.
WHOMP: Improving Upon Randomized Controlled Trials via Wasserstein Homogeneity
Workshop talk on WHOMP: optimality criteria and algorithms for subgroup splitting in comparative experiments.
Machine Unlearning via Information-Theoretic Regularization
Conference talk connecting unlearning goals to principled regularization and measurable evaluation pipelines.
Machine Unlearning via Information-Theoretic Regularization
Talk (to appear). Overview of information-theoretic regularization for machine unlearning, with emphasis on auditability and evaluation.
Machine Unlearning via Information-Theoretic Regularization
Talk (to appear). Mathematical framing of unlearning objectives and practical verification-oriented evaluation.
teaching
MATH 127C — Real Analysis (Summer 2025)
2025
This page mirrors announcements, policies, and a living schedule for the Summer 2025 offering. Lecture notes and problem sets reflect the topics we covered this term: metric spaces, compactness/connectedness, multivariable differentiability (Jacobian, chain/implicit/inverse theorems), $k$–volume and Gram determinants, change of variables, Fubini/Tonelli, and Green/Stokes/Divergence.
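Among the listed topics, the connection between $k$-volume and Gram determinants can be stated compactly; the standard formula is included here for reference. For vectors $v_1,\dots,v_k \in \mathbb{R}^n$ arranged as the columns of $A \in \mathbb{R}^{n\times k}$, the $k$-dimensional volume of the parallelepiped they span is

$$
\mathrm{vol}_k(v_1,\dots,v_k) \;=\; \sqrt{\det\!\left(A^{\mathsf T}A\right)},
\qquad
A = \begin{pmatrix} v_1 & \cdots & v_k \end{pmatrix}.
$$

When $k = n$ this reduces to $\lvert \det A \rvert$, which is the Jacobian factor appearing in the change-of-variables theorem covered later in the course.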
