## Abstract

Arising from cryogenic electron microscopy image analysis, “Einstein from noise” refers to spurious patterns that can emerge as a result of averaging a large number of white-noise images aligned to a reference image through rotation and translation. Although this phenomenon is often attributed to model bias, quantitative studies of such bias are lacking. Here, we introduce a simple framework under which an image of p pixels is treated as a vector of dimension p, and a white-noise image is a random vector uniformly sampled from the (p − 1)-dimensional unit sphere. Moreover, we adopt the cross-correlation of two images, which is a similarity measure based on the dot product of image pixels. This framework explains geometrically how the bias results from averaging a properly chosen set of white-noise images that are most highly cross-correlated with the reference image. We quantify the bias in terms of three parameters: the number of white-noise images (n), the image dimension (p), and the size of the selection set (m). Under the conditions that n, p, and m are all large and (ln n)²/p and m/n are both small, we show that the bias is approximately [Formula presented], where γ = (m/p) ln(n/m).
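The selection-and-averaging mechanism described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' code: it omits the rotation and translation alignment step, uses a random unit vector as a stand-in for the reference image, and measures the bias (as an assumption) by the cross-correlation between the normalized average and the reference. The function names and the values of p, n, and m are hypothetical choices for illustration.

```python
import math
import random


def unit_noise_image(p, rng):
    """Draw a white-noise image: a point uniform on the (p - 1)-dimensional unit sphere."""
    # A normalized standard Gaussian vector is uniformly distributed on the sphere.
    v = [rng.gauss(0.0, 1.0) for _ in range(p)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(u, v):
    """Dot product; for unit vectors this is the cross-correlation."""
    return sum(a * b for a, b in zip(u, v))


def averaged_cross_correlation(reference, n, m, rng):
    """Average the m of n noise images most cross-correlated with the reference.

    Returns the cross-correlation between the normalized average and the reference.
    """
    p = len(reference)
    images = [unit_noise_image(p, rng) for _ in range(n)]
    images.sort(key=lambda img: dot(img, reference), reverse=True)
    selected = images[:m]
    mean = [sum(column) / m for column in zip(*selected)]
    return dot(mean, reference) / math.sqrt(dot(mean, mean))


rng = random.Random(0)
p, n, m = 50, 2000, 20                 # hypothetical sizes, chosen to run quickly
reference = unit_noise_image(p, rng)   # stand-in for a real reference image

# Selecting the top m of n noise images pulls the average toward the reference.
selected_cc = averaged_cross_correlation(reference, n, m, rng)
# Baseline: average m noise images with no selection (n == m).
unselected_cc = averaged_cross_correlation(reference, m, m, rng)

print(f"with selection:    {selected_cc:.3f}")
print(f"without selection: {unselected_cc:.3f}")
```

In this sketch every input is pure noise, so any similarity between the average and the reference comes from the selection step alone. Comparing the two printed values shows how strongly selection biases the average toward the reference.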

| Original language | English |
|---|---|
| Pages (from-to) | 2355-2379 |
| Number of pages | 25 |
| Journal | Statistica Sinica |
| Volume | 31 |
| State | Published - 2021 |

## Keywords

- cross-correlation
- cryogenic electron microscopy
- extreme value distribution
- high dimensional data analysis
- model bias
- white-noise image