# Numerical Analysis (math.NA)

• The main computational cost in the training of and prediction with Convolution Neural Networks (CNNs) typically stems from the convolution. In this paper, we present three novel ways to parameterize the convolution more efficiently, significantly decreasing the computational complexity. Commonly used CNNs filter the input data using a series of spatial convolutions with compact stencils that couple features from all channels and point-wise nonlinearities. In this paper, we propose three architectures that are cheaper to couple the channel dimension and thereby reduce both the number of trainable weights and the computational cost of the CNN. The first architecture is inspired by tensor-products and imposes a circulant coupling of the channels. The second and third architectures arise as discretizations of a new type of residual neural network (ResNet) that is inspired by Partial Differential Equations (PDEs) or reaction-diffusion type. The coupling patterns of the first two architectures are applicable to a large class of CNNs. We outline in our numerical experiments that the proposed architectures, although considerably reducing the number of trainable weights, yield comparable accuracy to existing CNNs that are fully coupled in the channel dimension.
• In this work, we present a new derivative-free optimization method and investigate its use for training neural networks. Our method is motivated by the Ensemble Kalman Filter (EnKF), which has been used successfully for solving optimization problems that involve large-scale, highly nonlinear dynamical systems. A key benefit of the EnKF method is that it requires only the evaluation of the forward propagation but not its derivatives. Hence, in the context of neural networks it alleviates the need for back propagation and reduces the memory consumption dramatically. However, the method is not a pure "black-box" global optimization heuristic as it efficiently utilizes the structure of typical learning problems. We propose an important modification of the EnKF that enables us to prove convergence of our method to the minimizer of a strongly convex function. Our method also bears similarity with implicit filtering and we demonstrate its potential for minimizing highly oscillatory functions using a simple example. Further, we provide numerical examples that demonstrate the potential of our method for training deep neural networks.
• Robust principal component analysis (RPCA) is a well-studied problem with the goal of decomposing a matrix into the sum of low-rank and sparse components. In this paper, we propose a nonconvex feasibility reformulation of RPCA problem and apply an alternating projection method to solve it. To the best of our knowledge, we are the first to propose a method that solves RPCA problem without considering any objective function, convex relaxation, or surrogate convex constraints. We demonstrate through extensive numerical experiments on a variety of applications, including shadow removal, background estimation, face detection, and galaxy evolution, that our approach matches and often significantly outperforms current state-of-the-art in various ways.
• Efficient high-order time integration schemes are necessary to capture the complex processes involved in atmospheric flows over long periods of time. In this work, we propose a high-order, implicit-explicit numerical scheme that combines Multi-Level Spectral Deferred Corrections (MLSDC) and the Spherical Harmonics (SH) transform to solve the wave-propagation problems arising from the shallow-water equations on the rotating sphere. The iterative temporal integration is based on a sequence of corrections distributed on coupled space-time levels to perform a significant portion of the calculations on a coarse representation of the problem and hence to reduce the time-to-solution while preserving accuracy. In our scheme, referred to as MLSDC-SH, the spatial discretization plays a key role in the efficiency of MLSDC, since the SH basis allows for consistent transfer functions between space-time levels that preserve important physical properties of the solution. We study the performance of the MLSDC-SH scheme with shallow-water test cases commonly used in numerical atmospheric modeling. We use this suite of test cases, which gradually adds more complexity to the nonlinear system of governing partial differential equations, to perform a detailed analysis of the convergence rate of MLSDC-SH upon refinement in time. We illustrate the good stability properties of MLSDC-SH and show that the proposed scheme achieves up to eighth-order accuracy in time. Finally, we study the conditions in which MLSDC-SH achieves its theoretical speedup, and we show that it can significantly reduce the computational cost compared to single-level Spectral Deferred Corrections (SDC).
• Convolution has been playing a prominent role in various applications in science and engineering for many years. It is the most important operation in convolutional neural networks. There has been a recent growth of interests of research in generalizing convolutions on curved domains such as manifolds and graphs. However, existing approaches cannot preserve all the desirable properties of Euclidean convolutions, namely compactly supported filters, directionality, transferability across different manifolds. In this paper we develop a new generalization of the convolution operation, referred to as parallel transport convolution (PTC), on Riemannian manifolds and their discrete counterparts. PTC is designed based on the parallel transportation which is able to translate information along a manifold and to intrinsically preserve directionality. PTC allows for the construction of compactly supported filters and is also robust to manifold deformations. This enables us to preform wavelet-like operations and to define deep convolutional neural networks on curved domains.
• This paper studies the unconditional strong convergence rate for a fully discrete scheme of semilinear stochastic evolution equations, under a generalized Lipschitz-type condition on both drift and diffusion operators. Applied to the one-dimensional stochastic advection-diffusion-reaction equation with multiplicative white noise, the main theorem shows that the spatial and temporal strong convergence orders are $1/2$ and $1/4$, respectively. This is the first optimal strong approximation result for semilinear SPDEs with gradient term driven by non-trace class noises. Numerical tests are performed to verify theoretical analysis.
• Deep networks, especially Convolutional Neural Networks (CNNs), have been successfully applied in various areas of machine learning as well as to challenging problems in other scientific and engineering fields. This paper introduces Butterfly-Net, a low-complexity CNN with structured hard-coded weights and sparse across-channel connections, which aims at an optimal hierarchical function representation of the input signal. Theoretical analysis of the approximation power of Butterfly-Net to the Fourier representation of input data shows that the error decays exponentially as the depth increases. Due to the ability of Butterfly-Net to approximate Fourier and local Fourier transforms, the result can be used for approximation upper bound for CNNs in a large class of problems. The analysis results are validated in numerical experiments on the approximation of a 1D Fourier kernel and of solving a 2D Poisson's equation.
• We present an offline/online computational procedure for computing the dual norm of parameterized linear functionals. The key elements of the approach are (i) an empirical test space for the manifold of Riesz elements associated with the parameterized functional, and (ii) an empirical quadrature procedure to efficiently deal with parametrically non-affine terms. We present a number of theoretical results to identify the different sources of error and to motivate the technique. Finally, we show the effectiveness of our approach to reduce both offline and online costs associated with the computation of the time-averaged residual indicator proposed in [Fick, Maday, Patera, Taddei, Journal of Computational Physics, 2018 (accepted)].
• We introduce a family of implicit probabilistic integrators for initial value problems (IVPs) taking as a starting point the multistep Adams-Moulton method. The implicit construction allows for dynamic feedback from the forthcoming time-step, by contrast with previous probabilistic integrators, all of which are based on explicit methods. We begin with a concise survey of the rapidly-expanding field of probabilistic ODE solvers. We then introduce our method, which builds on and adapts the work of Conrad et al. (2016) and Teymur et al. (2016), and provide a rigorous proof of its well-definedness and convergence. We discuss the problem of the calibration of such integrators and suggest one approach. We give an illustrative example highlighting the effect of the use of probabilistic integrators - including our new method - in the setting of parameter inference within an inverse problem.
• This paper proposes some novel efficient and accurate adaptive two-grid (ATG) finite element algorithms for linear and nonlinear partial differential equations (PDEs). In these algorithms, they use the information of the solutions on k-th level adaptive meshes, instead of on the uniform meshes, to find the solutions on (k+1)-th level adaptive meshes. They transform the non-symmetric positive definite (non-SPD) PDEs into symmetric positive definite (SPD) PDEs, and transform the nonlinear PDEs into the linear PDEs. These algorithms have the following advantages: 1. Comparing with adaptive methods, they do not need to solve the nonlinear systems; 2. Comparing with two-grid methods, the degrees of freedom are largely reduced; 3. Comparing with the cases when uniform meshes are used for coarse level approximation, they are easily implemented; they are more efficient and accurate since only the interpolation of the solution on newly refined meshes needs to be computed, and the interpolation error is also reduced; they are especially efficient when many steps of mesh refinements are used since the computational cost of computing solutions on uniform meshes is large then. Next, this paper constructs a residue-type a posteriori error estimator for general non-SPD linear problems. We prove the upper bound of the oscillation term, and this gets rid of the assumption that the oscillation term is a high order term (h.o.t.), which may not be true generally due to the low regularity of the numerical solution. Based on this result, the reliability and efficiency of the error estimator are established. Finally, the convergence of the error on the adaptive meshes is proved when bisection is used for the mesh refinement.
• We develop and analyze an ultraweak variational formulation for a variant of the Kirchhoff-Love plate bending model. Based on this formulation, we introduce a discretization of the discontinuous Petrov-Galerkin type with optimal test functions (DPG). We prove well-posedness of the ultraweak formulation and quasi-optimal convergence of the DPG scheme. The variational formulation and its analysis require tools that control traces and jumps in $H^2$ (standard Sobolev space of scalar functions) and $H(\mathrm{div\,Div})$ (symmetric tensor functions with $L_2$-components whose twice iterated divergence is in $L_2$), and their dualities. These tools are developed in two and three spatial dimensions. One specific result concerns localized traces in a dense subspace of $H(\mathrm{div\,Div})$. They are essential to construct basis functions for an approximation of $H(\mathrm{div\,Div})$. To illustrate the theory we construct basis functions of the lowest order and perform numerical experiments for a smooth and a singular model solution. They confirm the expected convergence behavior of the DPG method both for uniform and adaptively refined meshes.
• The main of this work is to use the unit lower triangular matrices for solving inverse eigenvalue problem of nonnegative matrices and present the easier method to solve this problem.
• In this paper I detail a connection between cubic splines and a popular compact finite difference formula that the traditional (monomial basis) method of computing the cubic splines had obscured. On a uniform mesh the simplest Padé scheme for generating fourth-order compact finite differences gives exactly the derivatives at the interior nodes needed to guarantee twice-continuous differentiability; that is, a spline. For nonuniform meshes it is (slightly) different. I argue that for nonuniform meshes the compact approach offers some potential advantages, and even for uniform meshes it offers a new way to treat the edge effects.
• In this paper we deal with the numerical solution of a Hele--Shaw-like system via a cell model with active motion. Convergence of approximations is established for well-posed initial data. These data are chosen in such a way the time derivate is positive at the initial time. The numerical method is constructed by means of a finite element procedure together with the use of a closed-nodal integration. This gives rise to an algorithm which preserves positivity whenever a right-angled triangulation is considered. As a result, uniform-in-time a priori estimates are proven which allows us to pass to limit towards a solution to the Hele--Shaw problem.
• Generalized eigenvalue problems involving a singular pencil are very challenging to solve, both with respect to accuracy and efficiency. The existing package Guptri is very elegant but may sometimes be time-demanding, even for small and medium-sized matrices. We propose a simple method to compute the eigenvalues of singular pencils, based on one perturbation of the original problem of a certain specific rank. For many problems, the method is both fast and robust. This approach may be seen as a welcome alternative to staircase methods.
• We present in this paper a novel numerical reconstruction method for solving a 3D coefficient inverse problem with scattering data generated by a single direction of the incident plane wave. This inverse problem is well-known to be a highly nonlinear and ill-posed problem. Therefore, optimization-based reconstruction methods for solving this problem would typically suffer from the local-minima trapping and require strong a priori information of the solution. To avoid these problems, in our numerical method, we aim to construct a cost functional with a globally strictly convex property, whose minimizer can provide a good approximation for the exact solution of the inverse problem. The key ingredients for the construction of such functional are an integro-differential formulation of the inverse problem and a Carleman weight function. Under a (partial) finite difference approximation, the global strict convexity is proven using the tool of Carleman estimates. The global convergence of the gradient projection method to the exact solution is proven as well. We demonstrate the efficiency of our reconstruction method via a numerical study of experimental backscatter data for buried objects.
• The energy preserving discrete gradient methods are generalized to finite-dimensional Riemannian manifolds by definition of a discrete approximation to the Riemannian gradient, a retraction, and a coordinate center function. The resulting schemes are intrinsic and do not depend on a particular choice of coordinates, nor on embedding of the manifold in a Euclidean space. Generalizations of well-known discrete gradient methods, such as the average vector field method and the Itoh--Abe method are obtained. It is shown how methods of higher order can be constructed via a collocation-like approach. Local and global error bounds are derived in terms of the Riemannian distance function and the Levi-Civita connection. Some numerical results on spin system problems are presented.
• Central to the quest for a deeper understanding of the cancer growth and spread process, the naturally multiscale character of cancer invasion demands appropriate multiscale modelling and analysis approach. The cross-talk between the tissue scale (macro-scale) cancer cell population dynamics and the cell-scale (micro-scale) proteolytic molecular processes along the tumour boundary plays a particularly important role within the invasion processes, leading to dramatic changes in tumour morphology and influencing the overall pattern of cancer spread. Building on the multiscale moving boundary framework proposed in Trucu et al. (Multiscale Model. Simul 11(1): 309-335), in this work we propose a new formulation of this process involving a novel derivation of the macro-scale boundary movement law based on micro-dynamics, involving a transport equation combined with the level-set method. This is explored numerically in a novel finite element macro-micro framework based on cut-cells.
• In the phase-field modeling of brittle fracture, anisotropic constitutive assumptions for the degradation of stored elastic energy due to fracture are crucial to preventing cracking in compression and obtaining physically sound numerical solutions. Three energy decomposition models, the spectral decomposition, the volumetric-deviatoric split, and an improved volumetric-deviatoric split, and their effects on the performance of the phase-field modeling are studied. Meanwhile, anisotropic degradation of stiffness may lead to a small energy remaining on crack surfaces, which violates crack boundary conditions and can cause unphysical crack openings and propagation. A simple yet effective treatment for this is proposed: define a critically damaged zone with a threshold parameter and then degrade both the active and passive energies in the zone. A dynamic mesh adaptation finite element method is employed for the numerical solution of the corresponding elasticity system. Four examples, including two benchmark ones, one with complex crack systems, and one based on an experimental setting, are considered. Numerical results show that the spectral decomposition and improved volumetric-deviatoric split models, together with the improvement treatment of crack boundary conditions, can lead to crack propagation results that are comparable with the existing computational and experimental results. It is also shown that the numerical results are not very sensitive to the parameter defining the critically damaged zone.