Solving linear systems of equations is an essential component in science and technology, including in many machine learning algorithms. Existing quantum algorithms have demonstrated large speedups in solving linear systems, but the required quantum resources are not available on near-term quantum devices. In this work, we study potential near-term quantum algorithms for linear systems of equations. We investigate the use of variational algorithms for solving $Ax = b$ and analyze their optimization landscapes. We find that a wide range of variational algorithms designed to avoid barren plateaus, such as properly initialized imaginary time evolution and adiabatic-inspired optimization, still suffer from a fundamentally different plateau problem. To circumvent this issue, we design a potentially near-term adaptive alternating algorithm based on a core idea: the classical combination of variational quantum states. We conduct numerical experiments solving linear systems as large as $2^{300} \times 2^{300}$ by considering special systems that can be simulated efficiently on a classical computer. These experiments demonstrate the algorithm's ability to scale to system sizes within reach of near-term quantum devices of about $100$-$300$ qubits.
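The core idea of classically combining variational states can be sketched without any quantum machinery. In the toy Python sketch below, a few orthonormalized Krylov vectors stand in for the variational quantum states (an assumption for illustration only; the actual ansatz states would be prepared on a quantum device), and the combination coefficients are found by a small classical least-squares solve:

```python
import numpy as np

rng = np.random.default_rng(1)

n, k = 8, 8                          # system size, number of basis states
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)          # well-conditioned symmetric system
b = rng.standard_normal(n)
b /= np.linalg.norm(b)

# Classical stand-ins for variational states: orthonormalized Krylov vectors.
psi = [b]
for _ in range(k - 1):
    v = A @ psi[-1]
    for u in psi:                    # Gram-Schmidt keeps the basis well-conditioned
        v = v - (u @ v) * u
    v /= np.linalg.norm(v)
    psi.append(v)
Psi = np.column_stack(psi)           # n x k basis matrix

# Classical combination: coefficients c minimizing ||A Psi c - b||.
c, *_ = np.linalg.lstsq(A @ Psi, b, rcond=None)
x = Psi @ c
residual = np.linalg.norm(A @ x - b)
```

The quantum hardware would only be needed to prepare the basis states and estimate the small inner-product matrices; the coefficient solve stays classical and its size is the number of basis states, not the system dimension.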

We consider the one-variable fragment of first-order logic extended with Presburger constraints. The logic is designed in such a way that it subsumes the previously known fragments extended with counting, modulo counting, or cardinality comparison, and combines their expressive powers. We prove NP-completeness of the logic by presenting an optimal algorithm for solving its finite satisfiability problem.

Configuration integer programs (IPs) have been key in the design of algorithms for NP-hard high-multiplicity problems since the pioneering work of Gilmore and Gomory [Oper. Res., 1961]. A configuration IP has a variable for each possible configuration, which describes a placement of items into a location, and whose value corresponds to the number of locations with that placement. In high-multiplicity problems items come in types and are represented succinctly by a vector of multiplicities; solving the configuration IP then amounts to deciding whether the input vector of multiplicities of items of each type can be decomposed into a given number of configurations. We make this implicit notion explicit by observing that the set of all input vectors decomposable into configurations forms a monoid, and solving the configuration IP is the Monoid Decomposition problem. Motivated by applications, we enrich this problem in two ways. First, sometimes each configuration additionally has an objective value, yielding an optimization problem of finding a "best" decomposition under the given objective. Second, there are often different types of configurations for different types of locations. The resulting problem is to optimize over decompositions of the input multiplicity vector into configurations of several types, and we call it Multitype Integer Monoid Optimization, or MIMO. We develop fast exact algorithms for various MIMO problems with few or many location types and with various objectives. Our algorithms build on a novel proximity theorem connecting the solutions of a certain configuration IP to those of its continuous relaxation. We then cast several fundamental scheduling and bin packing problems as MIMOs, and thereby obtain new or substantially faster algorithms for them. We complement our positive algorithmic results with hardness results.
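The monoid-decomposition view can be made concrete with a deliberately tiny bin-packing example. The sketch below enumerates the configurations of a single location and decides, by brute-force recursion, whether an input multiplicity vector decomposes into a given number of configurations; the item sizes, capacity, and exponential-time check are illustrative assumptions, not the paper's (much faster) algorithms.

```python
from functools import lru_cache
from itertools import product

def configurations(sizes, capacity):
    """All placements (multiplicity vectors) of item types into one location."""
    bounds = [capacity // s + 1 for s in sizes]
    return [c for c in product(*(range(b) for b in bounds))
            if sum(a * s for a, s in zip(c, sizes)) <= capacity]

def decomposable(demand, sizes, capacity, locations):
    """Can `demand` be written as a sum of `locations` configurations?"""
    configs = configurations(sizes, capacity)

    @lru_cache(maxsize=None)
    def go(rest, k):
        if k == 0:
            return all(r == 0 for r in rest)
        return any(go(tuple(r - c for r, c in zip(rest, cfg)), k - 1)
                   for cfg in configs
                   if all(r >= c for r, c in zip(rest, cfg)))

    return go(tuple(demand), locations)
```

For item sizes (2, 3) and capacity 4, the feasible configurations are (0,0), (1,0), (2,0), and (0,1), so the multiplicity vector (3, 1) decomposes into three locations, e.g. (2,0) + (1,0) + (0,1), but not into two.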

We propose a generic system model for a special category of interdependent networks, demand-supply networks, in which the demand and the supply nodes are associated with heterogeneous loads and resources, respectively. Our model sheds light on a unique cascading failure mechanism induced by resource/load fluctuations, which in turn opens the door to conducting stress analysis on interdependent networks. Compared to the existing literature, which is mainly concerned with node connectivity, we focus on developing effective resource allocation methods to prevent these cascading failures from happening and to mitigate/confine them upon occurrence in the network. To prevent cascading failures, we identify some dangerous stress mechanisms, based on which we quantify the robustness of the network in terms of the resource configuration scheme. Afterward, we identify the optimal resource configuration under two resource/load fluctuation scenarios: uniform and proportional fluctuations. We further investigate the optimal resource configuration problem considering heterogeneous resource sharing costs among the nodes. To mitigate/confine ongoing cascading failures, we propose two network adaptation mechanisms: intentional failure and resource re-adjustment, based on which we propose an algorithm to mitigate an ongoing cascading failure while reinforcing the surviving network with high robustness to avoid further failures.

This paper investigates a family of adaptive importance sampling algorithms for probability density function exploration. The proposed approach consists in modeling the sampling policy, the sequence of distributions used to generate the particles, as a mixture distribution between a flexible kernel density estimate (based on the previous particles), and a naive heavy tail density. When the share of samples generated according to the naive density goes to zero but not too quickly, two types of results are established: (i) uniform convergence rates are derived for the sampling policy estimate; (ii) a central limit theorem is obtained for the sampling policy estimate as well as for the resulting integral estimates. The fact that the asymptotic variance is the same as the variance of an oracle procedure, in which the sampling policy is chosen as the optimal one, illustrates the benefits of the approach. The practical behavior of the resulting algorithms is illustrated in a simulation study.

We present a reinforcement learning (RL) framework to synthesize a control policy from a given linear temporal logic (LTL) specification in an unknown stochastic environment that can be modeled as a Markov Decision Process (MDP). Specifically, we learn a policy that maximizes the probability of satisfying the LTL formula without learning the transition probabilities. We introduce a novel rewarding and path-dependent discounting mechanism based on the LTL formula such that (i) an optimal policy maximizing the total discounted reward effectively maximizes the probabilities of satisfying LTL objectives, and (ii) a model-free RL algorithm using these rewards and discount factors is guaranteed to converge to such a policy. Finally, we illustrate the applicability of our RL-based synthesis approach on two motion planning case studies.

Complex networks are often too large for full exploration, only partially accessible, or only partially observed. Downstream learning tasks on incomplete networks can produce low-quality results. In addition, reducing the incompleteness of the network can be costly and nontrivial. As a result, network discovery algorithms optimized for specific downstream learning tasks and given resource collection constraints are of great interest. In this paper we formulate the task-specific network discovery problem in an incomplete network setting as a sequential decision-making problem. Our downstream task is vertex classification. We propose a framework, called Network Actor Critic (NAC), which learns concepts of policy and reward in an offline setting via a deep reinforcement learning algorithm. A quantitative study is presented on several synthetic and real benchmarks. We show that offline models of reward and network discovery policies lead to significantly improved performance when compared to competitive online discovery algorithms.

Ensuring quality is challenging for most modern software systems when constraints are imposed on the combinations of configurations. Combinatorial interaction strategies can systematically reduce the number of test cases, constructing a minimal test suite without affecting the effectiveness of the tests. This paper presents a new efficient search-based strategy to generate constrained interaction test suites that cover all possible combinations. The paper also shows a new application of constrained interaction testing in software fault searches. The proposed strategy first generates the set of all possible t-tuple combinations; then, it filters out the forbidden t-tuples using the base forbidden tuple (BFT) approach. The strategy also utilizes a mixed neighborhood tabu search (TS) to construct optimal or near-optimal constrained test suites. The efficiency of the proposed method is evaluated through a comparison against two well-known state-of-the-art tools. The evaluation consists of three sets of experiments for 35 standard benchmarks. Additionally, the effectiveness and quality of the results are assessed using a real-world case study. Experimental results show that the proposed strategy outperforms one of the competitive strategies, ACTS, for approximately 83% of the benchmarks and achieves results similar to CASA for 65% of the benchmarks when the interaction strength is 2. For an interaction strength of 3, the proposed method outperforms the other competitive strategies for approximately 60% and 42% of the benchmarks, respectively. The proposed strategy can also generate constrained interaction test suites for an interaction strength of 4, which is not possible for many strategies. The real-world case study shows that the generated test suites can effectively detect injected faults using mutation testing.

Intelligent reflecting surfaces (IRSs) serve as an emerging paradigm to enhance wireless transmission with low hardware cost and reduced power consumption. In this letter, we investigate an IRS-assisted downlink multi-input multi-output (MIMO) system, where an Alice-Bob pair wishes to communicate with the assistance of an IRS. Our goal is to maximize the spectral efficiency by designing the active beamformer at Alice and the passive reflecting phase shifters (PSs) at the IRS, which turns out to be an intractable mixed-integer non-convex optimization problem. To tackle this problem, we propose an efficient algorithm to obtain an effective solution. Specifically, the PSs at the IRS are optimized to maximize the sum of the gains of the different paths, a criterion we refer to as the maximizing-the-sum-of-gains (MSG) principle. Then, an alternating direction method of multipliers (ADMM) algorithm is developed to solve the MSG-based optimization problem. With the obtained PSs, the beamformer at Alice is obtained by the classic singular value decomposition (SVD) and water-filling (WF) solutions.
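With the phase shifters fixed, the final beamforming step is textbook MIMO transmit design: an SVD of the effective channel followed by water-filling over the eigenmodes. A sketch (the channel matrix, noise level, and power budget below are placeholders, not values from the letter):

```python
import numpy as np

def water_filling(gains, total_power):
    """Allocate power over eigenmodes with effective gains sigma_i^2 / N0,
    given in descending order."""
    gains = np.sort(np.asarray(gains, dtype=float))[::-1]
    powers = np.zeros(gains.size)
    for k in range(gains.size, 0, -1):               # try the k strongest modes
        mu = (total_power + np.sum(1.0 / gains[:k])) / k   # water level
        p = mu - 1.0 / gains[:k]
        if p[-1] > 0:                    # weakest active mode still positive?
            powers[:k] = p
            break
    return powers

def svd_beamformer(H, total_power, noise=1.0):
    """Transmit directions and per-stream powers for y = Hx + n."""
    U, s, Vh = np.linalg.svd(H, full_matrices=False)
    powers = water_filling(s ** 2 / noise, total_power)
    F = Vh.conj().T * np.sqrt(powers)    # precoder: right singular vectors, scaled
    rate = np.sum(np.log2(1.0 + (s ** 2 / noise) * powers))
    return F, powers, rate
```

For eigenmode gains (4, 1) and unit power budget, the water level is 1.125 and the allocation is (0.875, 0.125); a sufficiently weak mode receives no power at all, which is the characteristic water-filling behavior.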

In this effort, we propose a convex optimization approach based on weighted $\ell_1$-regularization for reconstructing objects of interest, such as signals or images, that are sparse or compressible in a wavelet basis. We recover the wavelet coefficients associated with the functional representation of the object of interest by solving our proposed optimization problem. We give a specific choice of weights and show numerically that the chosen weights admit efficient recovery of objects of interest from either a set of sub-samples or a noisy version. Our method not only exploits sparsity but also promotes a particular kind of structured sparsity often exhibited by many signals and images. Furthermore, we illustrate the effectiveness of the proposed convex optimization problem by providing numerical examples using both orthonormal wavelets and a frame of wavelets. We also provide an adaptive choice of weights, which is a modification of the iteratively reweighted $\ell_1$-minimization method.
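The weighted $\ell_1$ machinery can be sketched with ISTA, whose proximal step becomes an entrywise weighted soft-threshold, combined with the standard $1/(|x_i| + \epsilon)$ reweighting rule. Here the measurement operator is a generic random matrix rather than a wavelet-domain sampling operator, and $\lambda$, $\epsilon$, and the iteration counts are illustrative choices, not the paper's:

```python
import numpy as np

def weighted_soft_threshold(z, t):
    # proximal operator of the weighted l1 norm; t may be a vector of thresholds
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def weighted_l1_ista(A, b, weights, lam, n_iter=500):
    # ISTA for min_x 0.5*||Ax - b||^2 + lam * sum_i weights_i * |x_i|
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)
        x = weighted_soft_threshold(x - grad / L, lam * weights / L)
    return x

def reweighted_l1(A, b, lam=0.05, rounds=4, eps=1e-2):
    # iteratively reweighted l1: small entries get large weights next round
    w = np.ones(A.shape[1])
    for _ in range(rounds):
        x = weighted_l1_ista(A, b, w, lam)
        w = 1.0 / (np.abs(x) + eps)
    return x

# Demo: noiseless recovery of a 3-sparse vector from 40 random measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 80)) / np.sqrt(40.0)
x_true = np.zeros(80)
x_true[[5, 20, 60]] = [2.0, -1.5, 1.0]
x_hat = reweighted_l1(A, A @ x_true)
```

Reweighting drives the thresholds on off-support coordinates up and on large coordinates down, which both sharpens support recovery and reduces the shrinkage bias of plain $\ell_1$.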

We present an algorithm based on continuation techniques that can be applied to numerically solve minimization problems with equality constraints. We focus on problems with a large number of local minima that are hard to find using local minimization algorithms with random starting guesses. We are particularly interested in the computation of minimal-norm solutions of underdetermined systems of polynomial equations. Such systems arise, for instance, in the construction of high-order optimized differential equation solvers. By applying our algorithm, we are able to obtain 10th-order time-symmetric composition integrators with smaller 1-norm than any other integrator found in the literature up to now.

This paper proposes a novel approach for extending monocular visual odometry to a stereo camera system. The proposed method uses an additional camera to accurately estimate and optimize the scale of the monocular visual odometry, rather than triangulating 3D points from stereo matching. Specifically, the 3D points generated by the monocular visual odometry are projected onto the other camera of the stereo pair, and the scale is recovered and optimized by directly minimizing the photometric error. It is computationally efficient, adding minimal overhead to the stereo vision system compared to straightforward stereo matching, and is robust to repetitive texture. Additionally, direct scale optimization enables stereo visual odometry to be purely based on the direct method. Extensive evaluation on public datasets (e.g., KITTI), and outdoor environments (both terrestrial and underwater) demonstrates the accuracy and efficiency of a stereo visual odometry approach extended by scale optimization, and its robustness in environments with challenging textures.
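The scale-recovery idea can be illustrated with a geometric toy, in which squared reprojection error stands in for the paper's photometric error and the focal length, baseline, and scene points are made up: the up-to-scale monocular points are projected into the second camera, and a single scale parameter is optimized.

```python
import numpy as np

rng = np.random.default_rng(3)

f, baseline = 500.0, 0.5        # focal length (px), stereo baseline (m): toy values
true_scale = 2.0

# Scene points; the monocular pipeline reports them only up to scale.
pts = np.column_stack([rng.uniform(-2, 2, 50),
                       rng.uniform(-1, 1, 50),
                       rng.uniform(4, 10, 50)])
mono_pts = pts / true_scale     # what monocular VO would report

def project_right(P, s):
    # project scaled points into the right camera (translated by the baseline)
    X = s * P[:, 0] - baseline
    return f * X / (s * P[:, 2])

observed = project_right(mono_pts, true_scale)   # ground-truth right-image pixels

def cost(s):
    # stand-in for the photometric error: squared reprojection residual
    r = project_right(mono_pts, s) - observed
    return np.sum(r * r)

# One-dimensional search over the single scale parameter.
grid = np.linspace(0.5, 4.0, 3501)
s_hat = grid[np.argmin([cost(s) for s in grid])]
```

Because only one scalar is estimated, the search is cheap, which mirrors the paper's point that scale optimization adds minimal overhead compared to dense stereo matching.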

In order to autonomously learn to control unknown systems optimally w.r.t. an objective function, Adaptive Dynamic Programming (ADP) is well suited to adapting controllers based on experience from interaction with the system. In recent years, many researchers have focused on the tracking case, where the aim is to follow a desired trajectory. So far, ADP tracking controllers assume that the reference trajectory follows time-invariant exo-system dynamics, an assumption that does not hold for many applications. In order to overcome this limitation, we propose a new Q-function which explicitly incorporates a parametrized approximation of the reference trajectory. This allows a general class of trajectories to be tracked by means of ADP. Once our Q-function has been learned, the associated controller copes with time-varying reference trajectories without the need for further training, independently of the exo-system dynamics. After proposing our general model-free off-policy tracking method, we provide an analysis of the important special case of linear quadratic tracking. We conclude our paper with an example which demonstrates that our new method successfully learns the optimal tracking controller and outperforms existing approaches in terms of tracking error and cost.

Cascading bandit (CB) is a variant of both the multi-armed bandit (MAB) and the cascade model (CM), where a learning agent aims to maximize the total reward by recommending $K$ out of $L$ items to a user. We focus on a common real-world scenario where the user's preference can change in a piecewise-stationary manner. Two efficient algorithms, \texttt{GLRT-CascadeUCB} and \texttt{GLRT-CascadeKL-UCB}, are developed. The key idea behind the proposed algorithms is incorporating an almost parameter-free change-point detector, the Generalized Likelihood Ratio Test (GLRT), within classical upper confidence bound (UCB) based algorithms. Gap-dependent regret upper bounds of the proposed algorithms are derived and both match the lower bound $\Omega(\sqrt{T})$ up to a poly-logarithmic factor $\sqrt{\log{T}}$ in the number of time steps $T$. We also present numerical experiments on both synthetic and real-world datasets to show that \texttt{GLRT-CascadeUCB} and \texttt{GLRT-CascadeKL-UCB} outperform state-of-the-art algorithms in the literature.
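A sketch of the Bernoulli GLRT change-point statistic that these algorithms plug into UCB-style exploration: the statistic scans every possible split of the observation window and compares the two-segment likelihood against the single-segment one. The detection threshold is left as a free parameter here; the paper ties it to the horizon and confidence level.

```python
import numpy as np

def bern_kl(p, q, eps=1e-12):
    # KL divergence between Bernoulli(p) and Bernoulli(q), with clipping
    p = min(max(p, eps), 1.0 - eps)
    q = min(max(q, eps), 1.0 - eps)
    return p * np.log(p / q) + (1.0 - p) * np.log((1.0 - p) / (1.0 - q))

def glr_statistic(x):
    """Generalized likelihood ratio statistic for a single change point."""
    x = np.asarray(x, dtype=float)
    n = x.size
    prefix = np.concatenate([[0.0], np.cumsum(x)])
    mu_all = prefix[-1] / n
    best = 0.0
    for s in range(1, n):                 # candidate change point after step s
        mu1 = prefix[s] / s
        mu2 = (prefix[-1] - prefix[s]) / (n - s)
        stat = s * bern_kl(mu1, mu_all) + (n - s) * bern_kl(mu2, mu_all)
        best = max(best, stat)
    return best

def change_detected(x, threshold):
    return glr_statistic(x) > threshold
```

On a stream of 50 zeros followed by 50 ones, the statistic peaks at the true split with value $100 \log 2 \approx 69.3$, far above any reasonable threshold, while a constant stream yields a statistic of zero.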
