
- Oct 16 2017 cs.MS arXiv:1710.04985v1
  The acceleration of sparse matrix computations on modern many-core processors, such as graphics processing units (GPUs), has been studied for over a decade. Significant performance improvements have been achieved for many sparse matrix kernels, such as sparse matrix-vector products and sparse matrix-matrix products. Solving linear systems with sparse triangular matrices is another important sparse kernel, required by a variety of scientific and engineering applications such as sparse linear solvers. However, developing efficient parallel algorithms in CUDA for solving sparse triangular linear systems remains challenging because the computation is inherently sequential. In this paper, we revisit this problem by reviewing the existing level-scheduling methods and proposing algorithms with self-scheduling techniques. Numerical results indicate that the CUDA implementations of the proposed algorithms can outperform the state-of-the-art solvers in cuSPARSE by a factor of up to $2.6$ for structured model problems and general sparse matrices.
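  To make the level-scheduling idea mentioned in the abstract concrete, here is a minimal sequential sketch in Python. It is not the paper's CUDA implementation and not the proposed self-scheduling algorithm. It assumes a lower-triangular matrix in CSR form with a nonzero diagonal, and the names `build_levels` and `solve_lower` are hypothetical. Rows are grouped into levels so that every row depends only on rows in earlier levels. Rows within one level are independent, which is the parallelism a GPU solver could exploit.

  ```python
  def build_levels(indptr, indices):
      """Group rows of a lower-triangular CSR matrix into dependency levels."""
      n = len(indptr) - 1
      level = [0] * n
      for i in range(n):
          for k in range(indptr[i], indptr[i + 1]):
              j = indices[k]
              if j < i:  # strictly lower entry: row i needs x[j] first
                  level[i] = max(level[i], level[j] + 1)
      levels = [[] for _ in range(max(level, default=-1) + 1)]
      for i, lv in enumerate(level):
          levels[lv].append(i)
      return levels


  def solve_lower(indptr, indices, data, b):
      """Solve L x = b, visiting rows level by level (assumes a nonzero diagonal)."""
      x = [0.0] * len(b)
      for rows in build_levels(indptr, indices):
          # Rows in the same level do not depend on each other, so this
          # inner loop is the part that could run in parallel.
          for i in rows:
              acc = b[i]
              diag = None
              for k in range(indptr[i], indptr[i + 1]):
                  j = indices[k]
                  if j == i:
                      diag = data[k]
                  else:
                      acc -= data[k] * x[j]
              x[i] = acc / diag
      return x


  # Illustrative 4x4 lower-triangular system (made-up values).
  indptr = [0, 1, 2, 4, 7]
  indices = [0, 1, 0, 2, 1, 2, 3]
  data = [2.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0]
  b = [2.0, 3.0, 4.0, 10.0]

  levels = build_levels(indptr, indices)
  x = solve_lower(indptr, indices, data, b)
  ```

  In this example rows 0 and 1 have no dependencies and share the first level, while rows 2 and 3 each need a level of their own. The number of levels bounds the number of sequential steps, which is why matrices with long dependency chains expose little parallelism under this scheme.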
