published

Concentration in the Generalized Chinese Restaurant Process

R. I. Oliveira, A. Pereira, R. Ribeiro

Sankhya A 84 (2) : 628-670 (2020).

Abstract

The Generalized Chinese Restaurant Process (GCRP) describes a sequence of exchangeable random partitions of the numbers $\{1, \dots, n\}$. This process is related to the Ewens sampling model in genetics and to Bayesian nonparametric methods such as topic models. In this paper, we study the GCRP in a regime where the number of parts grows like $n^\alpha$ with $\alpha > 0$. We prove a non-asymptotic concentration result for the number of parts of size $k$. In particular, we show that these random variables concentrate around $c_k V_* n^\alpha$, where $V_* n^\alpha$ is the asymptotic number of parts and $c_k \approx k^{-(1+\alpha)}$ is a positive value depending on $k$. We also obtain finite-$n$ bounds for the total number of parts. Our theorems complement asymptotic statements by Pitman and more recent results on large and moderate deviations by Favaro, Feng and Gao.
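
The quantities in the abstract can be explored numerically. The following is a minimal simulation sketch, not the authors' code. It assumes the GCRP is the standard two-parameter seating scheme: with m customers seated at K tables, the next customer joins a table of size s with probability (s - alpha)/(m + theta) and opens a new table with probability (theta + alpha*K)/(m + theta). The parameter `theta` and the function name `simulate_gcrp` are illustrative assumptions and do not appear in the abstract.

```python
import random
from collections import Counter


def simulate_gcrp(n, alpha, theta, seed=None):
    """Seat n customers and return the list of table (part) sizes.

    Assumed seating rule (standard two-parameter scheme, not taken
    from the abstract): requires 0 <= alpha < 1 and theta > -alpha.
    """
    if not (0 <= alpha < 1) or theta <= -alpha:
        raise ValueError("need 0 <= alpha < 1 and theta > -alpha")
    rng = random.Random(seed)
    sizes = []
    for m in range(n):  # m customers are already seated
        if not sizes:
            sizes.append(1)  # the first customer always opens a table
            continue
        # Total weight is (theta + alpha*K) + sum(s - alpha) = m + theta.
        u = rng.random() * (m + theta)
        new_weight = theta + alpha * len(sizes)
        if u < new_weight:
            sizes.append(1)
            continue
        u -= new_weight
        for j, s in enumerate(sizes):
            u -= s - alpha
            if u < 0:
                sizes[j] += 1
                break
        else:
            sizes[-1] += 1  # guard against floating-point rounding
    return sizes


if __name__ == "__main__":
    n, alpha = 5000, 0.5
    sizes = simulate_gcrp(n, alpha, theta=1.0, seed=0)
    parts_by_size = Counter(sizes)  # maps k to the number of parts of size k
    print("total parts:", len(sizes), " n^alpha:", round(n ** alpha, 1))
    for k in range(1, 6):
        print("parts of size", k, ":", parts_by_size.get(k, 0))
```

Running the script for several seeds and values of n gives a rough empirical picture of how the number of parts of size k compares with the total number of parts. The inner loop is linear in the number of tables, so this sketch is only suitable for moderate n.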