# 2022 Young Researchers Workshop Speaker Abstracts

There are 19 speakers scheduled for this year's Young Researchers Workshop. Abstracts for all talks are below in alphabetical order by speakers last name.

## Titles and Abstracts

### Daniel Alabi

Hypothesis Testing for Differentially Private Linear Regression

In this work, we design differentially private hypothesis tests for the following problems in the general linear model: testing a linear relationship and testing for the presence of mixtures. The majority of our hypothesis tests are based on differentially private versions of the F-statistic for the general linear model framework, which are uniformly most powerful unbiased in the non-private setting. We also present other tests---not based on the F-statistic---for these problems. We show that the differentially private F-statistic converges to the asymptotic distribution of its non-private counterpart. As a corollary, the statistical power of the differentially private F-statistic converges to the statistical power of the non-private F-statistic. Through a suite of Monte Carlo based experiments, we show that our tests achieve desired significance levels and have a high power that approaches the power of the non-private tests as we increase sample sizes or the privacy-loss parameter. We also show when our tests outperform existing methods in the literature.

### Yeganeh Alimohammadi

Using Network Structure to Estimate Epidemics

People's interaction networks play a critical role in epidemics. However, precise mapping of the network structure is often expensive or even impossible. I will show that it is unnecessary to map the entire network. Instead, contact tracing a few samples from the population is enough to estimate an outbreak's likelihood and size. I will present a nonparametric estimator based on the contact tracing results and give theoretical guarantees on the estimator's accuracy for a large class of networks. Finally, I will present two examples of real-world applications of using the network structures for epidemic control: school reopening and placing vaccine sites.

### Léonard Boussioux

Multimodality, models, algorithms, and applications to sustainability

This talk proposes two complementary methodologies that illustrate how combining optimization, machine learning, and deep learning can help solve sustainability challenges. The first methodology investigates how to transform high-dimensional, messy, and unstructured data into impactful predictive and prescriptive analytics. We propose a novel data-driven framework - Gather, Extract, Predict - that generalizes state-of-the-art machine learning and deep learning tools into efficient and practical feature extractors. We develop data pre-processing, mining, and fusion pipelines to process large-scale multimodal data sources such as text, images, time series, and tables. We demonstrate the capacity of such multimodal systems to improve upon single-modality approaches with two real-world applications: healthcare operations and hurricane forecasting. Next, we complement our first methodology by presenting a novel application of adaptive robust optimization for ensemble modeling of time series forecasting models. We show how the robust combination of various forecasts can improve performance and assist decision-making under uncertainty. Overall, our two methodologies foster interdisciplinary approaches to unlock new impactful analytics use cases.

### Alireza Fallah

Privacy Mechanisms for Data Markets

The data of billions of people worldwide are used every day for improving search algorithms, recommendations on online platforms, personalized advertising, and the design of new drugs, services, and products. A common concern with many of these data-intensive applications centers on privacy — as a user's data is harnessed, more and more information about her behavior and preferences is uncovered and potentially utilized by platforms and advertisers. Consequently, we need to adjust the design of data markets to include privacy-preserving mechanisms. In this regard, a key practical question remains: how do we decide how much privacy heterogeneous, strategic users will obtain?

This talk attempts to answer this question and study the impact of data market architecture on the design of privacy mechanisms for purchasing data from privacy-sensitive strategic users. In particular, we consider a platform's problem of collecting data from privacy-aware users to estimate a parameter of interest. We formulate this question as a Bayesian-optimal mechanism design problem, in which an individual can share her data in exchange for a monetary reward but at the same time has a private heterogeneous privacy cost which we quantify using differential privacy. We consider two popular differential privacy settings: central and local. In both settings, we establish minimax lower bounds for the estimation error and derive (near) optimal estimators for given heterogeneous privacy loss levels for users. Next, we pose the mechanism design problem as the optimal selection of an estimator and payments that elicit truthful reporting of users' privacy sensitivities. We further develop efficient algorithmic mechanisms to solve this problem in both privacy settings.

### Yiding Feng

Batching and Optimal Multi-stage Bipartite Allocations

In several applications of real-time matching of demand to supply in online marketplaces, the platform allows for some latency to batch the demand and improve the efficiency of the resulting matching. Motivated by these applications, we study the optimal trade-off between batching and inefficiency in the context of designing robust online allocations. In particular, we consider K-stage variants of the classic vertex weighted bipartite b-matching and AdWords problems in the adversarial setting, where online vertices arrive stage- wise and in K batches—in contrast to online arrival. Our main result for both problems is an optimal (1-(1-1/K)^K)-competitive (fractional) matching algorithm, improving the classic (1-1/e) competitive ratio bound known for the online variants of these problems (Mehta et al., 2007; Aggarwal et al., 2011).

Our main technique at high-level is developing algorithmic tools to vary the trade-off between “greedyness” and “hedging” of the matching algorithm across stages. We rely on a particular family of convex- programming based matchings that distribute the demand in a specifically balanced way among supply in different stages, while carefully modifying the balancedness of the resulting matching across stages. More precisely, we identify a sequence of polynomials with decreasing degrees to be used as strictly concave regularizers of the maximum weight matching linear program to form these convex programs. At each stage, our fractional multi-stage algorithm returns the corresponding regularized optimal solution as the matching of this stage (by solving the convex program). By providing structural decomposition of the underlying graph using the optimal solutions of these convex programs and recursively connecting the regularizers together, we develop a new multi-stage primal-dual framework to analyze the competitive ratio of this algorithm. We further show this algorithm is optimal competitive, even in the unweighted case, by providing an upper- bound instance in which no online algorithm obtains a competitive ratio better than (1-(1-1/K)^K). We extend our results to integral allocations in the vertex weighted b-matching problem, AdWords problem, and configuration allocation problem with large budgets/small bid over budget ratios assumptions.

This talk is based on joint work with Rad Niazadeh.

### Bailey Flanigan

The impact of altruism on elections

Absent restrictive assumptions on voters' utilities, all deterministic voting rules have unbounded distortion — i.e., they may select candidates that give arbitrarily poor social welfare compared to optimal. We circumvent this impossibility with no assumptions on voters' utilities. Instead, we tweak the standard model of voters to make them altruistic: when deciding how to rank alternatives, a voter weighs not only their own utilities, but also how much utility each alternative will give to others. We show that a tiny amount of altruism across voters can substantially limit distortion, guaranteeing constant distortion for multiple voting rules and bounded distortion for many others. Most importantly, this result is actionable in that it suggests an implementable way to help elections produce higher-welfare outcomes: democratic deliberation, which has been shown to increase voters' tendency to account for the common good in how they vote.

### Rohan Ghuge

The Power of Adaptivity for Stochastic Submodular Cover

In the stochastic submodular cover problem, the goal is to select a subset of stochastic items of minimum expected cost to cover a submodular function. Solutions in this setting correspond to sequential decision processes that select items one by one adaptively (depending on prior observations). While such adaptive solutions achieve the best objective, the inherently sequential nature makes them undesirable in many applications. We show how to obtain solutions that approximate fully-adaptive solutions using only a few "rounds" of adaptivity. We study both independent and correlated settings, proving smooth tradeoffs between the number of adaptive rounds and the solution quality. Experiments on synthetic and real datasets show qualitative improvements in the solutions as we allow more rounds of adaptivity; in practice, solutions with a few rounds of adaptivity are nearly as good as fully-adaptive solutions.

### Yichun Hu

Fast Rates for Contextual Linear Optimization

Incorporating side observations in decision making can reduce uncertainty and boost performance, but it also requires we tackle a potentially complex predictive relationship. While one may use off-the-shelf machine learning methods to separately learn a predictive model and plug it in, a variety of recent methods instead integrate estimation and optimization by fitting the model to directly optimize downstream decision performance. Surprisingly, in the case of contextual linear optimization, we show that the na¨ıve plug-in approach actually achieves regret convergence rates that are significantly faster than methods that directly optimize downstream decision performance. We show this by leveraging the fact that specific problem instances do not have arbitrarily bad near-dual-degeneracy. While there are other pros and cons to consider as we discuss and illustrate numerically, our results highlight a nuanced landscape for the enterprise to integrate estimation and optimization. Our results are overall positive for practice: predictive models are easy and fast to train using existing tools, simple to interpret, and, as we show, lead to decisions that perform very well.

### Michael Huang

Policy Evaluation And Learning In Small-data, Weakly-coupled Settings

For optimization problems in small-data, large-scale settings, leveraging problem structure is crucial to learning effective, data-driven policies. We propose a novel debiasing method for policy evaluation and learning tailored to weakly-coupled problems and policies in small-data, large-scale settings. Unlike cross-validation, our method does not sacrifice training data for policy evaluation. We prove our method is asymptotically optimal as the problem size grows, even if data are limited. Our results illuminate how the degree of coupling in the problem affects our method’s performance. We illustrate our results for a general class of angular, linear optimization problems.

### Yujia Jin

The Complexity of Optimizing Single and Multi-player Games

In this talk, I will discuss recent advances in establishing the complexity of solving infinite-horizon single-player games, i.e. Markov decision processes, and multi-player games, i.e. turn-based stochastic games and simultaneous stochastic games. These games are canonical models for decision making with one or more agents and we solve them by finding approximate Nash equilibrium. For Markov decision processes, which are known to be polynomial-time solvable, we design optimization methods with improved sample efficiency and show a matching lower bound in certain regimes. For solving simultaneous stochastic games, which are known to be PPAD-hard, we establish its membership in PPAD. Further, for turn-based stochastic games, whose complexity was previously unknown, we show that it is PPAD-complete. All our results utilize foundational linear algebraic structure of the game, and apply to both discounted and average-reward games.

### Shuangning Li

Random Graph Asymptotics for Treatment Effect Estimation under Network Interference

The network interference model for causal inference places all experimental units at the vertices of an undirected exposure graph, such that treatment assigned to one unit may affect the outcome of another unit if and only if these two units are connected by an edge. This model has recently gained popularity as means of incorporating interference effects into the Neyman--Rubin potential outcomes framework; and several authors have considered estimation of various causal targets, including the direct and indirect effects of treatment. In this talk, we consider large-sample asymptotics for treatment effect estimation under network interference in a setting where the exposure graph is a random draw from a graphon. When targeting the direct effect, we show that---in our setting---popular estimators are considerably more accurate than existing results suggest, and provide a central limit theorem in terms of moments of the graphon. Meanwhile, when targeting the indirect effect, we leverage our generative assumptions to propose a consistent estimator in a setting where no other consistent estimators are currently available. We also show how our results can be used to conduct a practical assessment of the sensitivity of randomized study inference to potential interference effects. Overall, our results highlight the promise of random graph asymptotics in understanding the practicality and limits of causal inference under network interference.

This is joint work with Stefan Wager.

### Wenlong Mu

Statistical theory for reinforcement learning: Oracle inequalities, Markov chains, and stochastic approximation

Dynamic programming provides a formalism for making near-optimal decisions in sequential settings. A central task in approximate dynamic programming and reinforcement learning is to estimate the solution to the Bellman fixed point equation. Given the large state spaces that arise in practice, function approximation plays an essential role, and the resulting projected Bellman equations are often solved using stochastic approximation. Despite the practical popularity of these methods, their statistical behavior is not well understood. In this talk, we discuss some recent progress in analyzing Markovian stochastic approximation procedures for projected fixed point equations. Among other results, we discuss a novel class of instance-dependent oracle inequalities, and exhibit their optimality via local minimax lower bounds.

### Bento Natura

Interior point methods are not worse than Simplex

Whereas interior point methods provide polynomial-time linear programming algorithms, the running time bounds depend on bit-complexity or condition measures that can be unbounded in the problem dimension. This is in contrast with the simplex method that always admits an exponential bound. We introduce a new polynomial-time path-following interior point method where the number of iterations also admits a combinatorial upper bound O(2^n n^{1.5} log n) for an n-variable linear program in standard form. This complements previous work by Allamigeon, Benchimol, Gaubert, and Joswig (SIAGA 2018) that exhibited a family of instances where any path-following method must take exponentially many iterations. The number of iterations of our algorithm is at most O(n^{1.5} log n) times the number of segments of any piecewise linear curve in the wide neighborhood of the central path. In particular, it matches the number of iterations of any path following interior point method up to this polynomial factor. The overall exponential upper bound derives from studying the `max central path', a piecewise-linear curve with the number of pieces bounded by the total length of 2n shadow vertex simplex paths. Our algorithm falls into the family of layered least squares interior point methods introduced by Vavasis and Ye (Math. Prog. 1996). In contrast to previous layered least squares methods that partition the kernel of the constraint matrix into coordinate subspaces, our method creates layers based on a general subspace providing more flexibility. Our result also implies the same bound on the number of iterations of the trust region interior point method by Lan, Monteiro, and Tsuchiya (SIOPT 2009). This is joint work with Xavier Allamigeon, Daniel Dadush, Georg Loho, and László Végh.

Ruoqi Shen

Proximal Framework in Sampling: Structured Logconcave Sampling with a Restricted Gaussian Oracle

We develop a reduction framework, inspired by proximal point methods in convex optimization, which bootstraps samplers for regularized densities to generically improve dependences on problem condition number from polynomial to linear. A key ingredient in our framework is the notion of a “restricted Gaussian oracle' (RGO) for function g, which is a sampler for distributions whose negative log-likelihood sums a quadratic (in a multiple of the identity) and g. We give algorithms for sampling several structured logconcave families to high accuracy. By combining our reduction framework with our new samplers, we obtain the state-of-the-art bounds for sampling composite densities and finite-sum densities to high accuracy in total variation distance.

### Sean Sinclair

Sequential Fair Allocation: Achieving the Optimal Envy-Efficiency Tradeoff Curve

Managing and optimizing the operations of complex systems often involves making tradeoffs between different objectives such as fairness, risk, accuracy, and efficiency. These problems are exacerbated in online settings where uncertainty propagates through objectives differently. How these criteria interact is often not well understood or characterized, and there typically does not exist a single objective function that determines which tradeoffs are better than others.

We tackle a problem established from a partnership with the Food Bank of the Southern Tier in optimizing their mobile food-pantry operations. We model it as an online resource allocation problem under demand uncertainty, where individuals arrive over T rounds, with each round having a random number of individuals characterized by their type. We provide an exact characterization of the achievable (envy, efficiency) pairs in this model, showing that any algorithm achieving low envy must suffer from poor efficiency, and vice-versa. Moreover, we complement this exact characterization with a simple algorithm capable of achieving any desired point along the trade-off curve.

### Yunzong Xu

Bypassing the Monster: A Faster and Simpler Optimal Algorithm for Contextual Bandits Under Realizability

We consider the general (stochastic) contextual bandit problem under the realizability assumption, i.e., the expected reward, as a function of contexts and actions, belongs to a general function class F. We design a fast and simple algorithm that achieves the statistically optimal regret with only O(log T) calls to an offline regression oracle across all T rounds. The number of oracle calls can be further reduced to O(loglog T) if T is known in advance. Our results provide the first universal and optimal reduction from contextual bandits to offline regression, solving an important open problem in the contextual bandit literature. A direct consequence of our results is that any advances in offline regression immediately translate to contextual bandits, statistically and computationally. This leads to faster algorithms and improved regret guarantees for broader classes of contextual bandit problems.

### Yuling Yan

Model-based Reinforcement Learning for Offline Zero-Sum Markov Games

This paper makes progress towards learning Nash equilibria in two-player zero-sum Markov games from offline data. Specifically, consider a $\gamma$-discounted infinite-horizon Markov game with $S$ states, where the max-player has $A$ actions and the min-player has $B$ actions. We propose a pessimistic model-based algorithm with Bernstein-style lower confidence bounds --- called VI-LCB-Game --- that provably finds an $\varepsilon$-approximate Nash equilibrium with a sample complexity no larger than $\frac{C_{\mathsf{clipped}}^{\star}S(A+B)}{(1-\gamma)^{3}\varepsilon^{2}}$ (up to some log factor). Here, $C_{\mathsf{clipped}}^{\star}$ is some unilateral clipped concentrability coefficient that reflects the coverage and distribution shift of the available data (vis-à-vis the target data), and the target accuracy $\varepsilon$ can be any value within $\big(0,\frac{1}{1-\gamma}\big]$. Our sample complexity bound strengthens prior art by a factor of $\min\{A,B\}$, achieving minimax optimality for the entire $\varepsilon$-range. An appealing feature of our result lies in algorithmic simplicity, which reveals the unnecessity of variance reduction and sample splitting in achieving sample optimality.

### Mengxin (Maxine) Yu

Strategic Decision-Making in the Presence of Information Asymmetry: Provably Efficient RL with Algorithmic Instruments

We study offline reinforcement learning under a novel model called strategic MDP, which characterizes the strategic interactions between a principal and a sequence of myopic agents with private types. Due to the bilevel structure and private classes, strategic MDP involves information asymmetry between the principal and the agents. We focus on the offline RL problem, where the goal is to learn the optimal policy of the principal concerning a target population of agents based on a pre-collected dataset that consists of historical interactions. The unobserved private types confound such a dataset as they affect both the rewards and observations received by the principal. We propose a novel algorithm, Pessimistic policy Learning with Algorithmic iNstruments (PLAN), which leverages the ideas of instrumental variable regression and the pessimism principle to learn a near-optimal principal's policy in the context of general function approximation. Our algorithm is based on the critical observation that the principal's actions serve as valid instrumental variables. In particular, under a partial coverage assumption on the offline dataset, we prove that PLAN outputs a $1/\sqrt{K}$-optimal policy with K being the number of collected trajectories. We further apply our framework to some special cases of strategic MDP, including strategic regression, strategic bandit, and noncompliance in recommendation systems.

### Sophie Yu

Random graph matching at Otter's tree-counting constant in polynomial time

Given a pair of graphs, the problem of graph matching or network alignment refers to finding the underlying vertex correspondence that maximally aligns the edges. This is a ubiquitous problem arising in a variety of applications across diverse fields such as network privacy, computational biology, computer vision, and natural language processing. Graph matching is an instance of the notoriously difficult quadratic assignment problem (QAP), which is NP-hard to solve or approximate.

Despite the worst-case computational hardness of QAP, I will present a computationally efficient graph matching algorithm based on counting a special family of trees. When the two networks are Erdős–Rényi random graphs with correlated edges through the hidden vertex correspondence, we show that our algorithm correctly matches all but a vanishing fraction of vertices with high probability as soon as the edge correlation exceeds the square root of Otter's constant. Moreover, we further upgrade the almost exact recovery to exact recovery whenever it is information-theoretically possible. This is the first polynomial-time algorithm that achieves exact and almost exact matching with an explicit constant correlation for both dense and sparse graphs.

Based on joint work with Cheng Mao (GaTech), Yihong Wu (Yale), and Jiaming Xu (Duke).