Loss-function learning for digital tissue deconvolution


The gene expression profile of a tissue averages the expression profiles of all cells in this tissue. Digital tissue deconvolution (DTD) addresses the following inverse problem: given the expression profile $y$ of a tissue, what is the cellular composition $c$ of that tissue? If $X$ is a matrix whose columns are reference profiles of individual cell types, the composition $c$ can be computed by minimizing $\mathcal L(y-Xc)$ over $c$ for a given loss function $\mathcal L$. Current methods use predefined all-purpose loss functions. They successfully quantify the dominant cell populations of a tissue but often fall short in detecting small cell populations. Here we learn the loss function $\mathcal L$ along with the composition $c$. This allows us to adapt to application-specific requirements such as focusing on small cell populations or distinguishing phenotypically similar cell populations. Our method quantifies large cell fractions as accurately as existing methods and significantly improves the detection of small cell populations and the distinction of similar cell types.
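
To make the minimization of $\mathcal L(y-Xc)$ concrete, the following is a minimal sketch rather than the method of the paper. The abstract does not state the form of $\mathcal L$, so the sketch assumes a gene-weighted squared loss $\sum_i g_i\,(y_i-(Xc)_i)^2$ with a fixed weight vector `g`; the function name `deconvolve`, the weights `g`, and the synthetic data are all hypothetical, and the weights are supplied by hand here instead of being learned.

```python
import numpy as np


def deconvolve(X, y, g=None):
    """Estimate the composition c by minimizing sum_i g_i * (y_i - (X c)_i)^2.

    X : (genes, cell_types) reference profiles, one column per cell type
    y : (genes,) expression profile of the tissue
    g : (genes,) non-negative gene weights (assumed form of the loss);
        g=None gives the ordinary unweighted least-squares loss.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if g is None:
        g = np.ones(X.shape[0])
    # Scaling each row by sqrt(g_i) turns the weighted problem into an
    # ordinary least-squares problem.
    w = np.sqrt(np.asarray(g, dtype=float))
    c, *_ = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)
    return c


# Synthetic example: 50 genes, 3 cell types, one of them a small population.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 10.0, size=(50, 3))
c_true = np.array([0.70, 0.25, 0.05])
y = X @ c_true  # noise-free mixture of the reference profiles

c_hat = deconvolve(X, y)                  # all-purpose (unweighted) loss
g = rng.uniform(0.5, 2.0, size=50)        # hypothetical gene weights
c_hat_weighted = deconvolve(X, y, g)      # gene-weighted loss
```

In this noise-free setting both losses recover the same composition. The choice of weights only starts to matter once `y` contains noise or cell types missing from `X`, which is the situation where an application-specific loss can help.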
Submitted 25 Jan 2018 to Quantitative Methods [q-bio.QM]
Published 26 Jan 2018
Author comments: 13 pages, 7 figures