Limit Distributions and Random Trees Derived from the Birthday Problem with Unequal Probabilities

Michael Camarri (University of California, Berkeley)
Jim Pitman (University of California, Berkeley)


Given an arbitrary distribution on a countable set, consider the number of independent samples required until the first repeated value is seen. Exact and asymptotic formulae are derived for the distribution of this time and of the times until subsequent repeats. Asymptotic properties of the repeat times are derived by embedding in a Poisson process. In particular, necessary and sufficient conditions for convergence are given and the possible limits explicitly described. Under the same conditions the finite dimensional distributions of the repeat times converge to the arrival times of suitably modified Poisson processes, and random trees derived from the sequence of independent trials converge in distribution to an inhomogeneous continuum random tree.

Full Text: Download PDF | View PDF online (requires PDF plugin)

Pages: 1-18

Publication Date: November 16, 1999

DOI: 10.1214/EJP.v5-58


  1. D. Aldous. Probability Approximations via the Poisson Clumping Heuristic, Volume 77 of Applied Mathematical Sciences. Springer-Verlag, New York, 1989. Math. Review MR90k:60004
  2. D. Aldous. The continuum random tree I. Ann. Probab., 19:1-28, 1991. Math. Review MR91i:60024
  3. D. Aldous. The continuum random tree III. Ann. Probab., 21:248-289, 1993. Math. Review MR94c:60015
  4. D. Aldous and J. Pitman. A family of random trees with random edge lengths. Random Structures and Algorithms, 15:176-195, 1999. Math. Review MR1 704 343
  5. D. Aldous and J. Pitman. Inhomogeneous continuum random trees and the entrance boundary of the additive coalescent. Technical Report 525, Dept. Statistics, U.C. Berkeley, 1998. To appear in Prob. Th. and Rel. Fields. Available via Math. Review number not available.
  6. A. D. Barbour, L. Holst, and S. Janson. Poisson Approximation. Clarendon Press, 1992. Math. Review MR93g:60043
  7. P. Billingsley. Probability and Measure. Wiley, New York, 1995. 3rd ed. Math. Review MR95k:60001
  8. A. Broder. Generating random spanning trees. In Proc. 30'th IEEE Symp. Found. Comp. Sci., pages 442-447, 1989. Math. Review number not available.
  9. M. Camarri. Asymptotics for $k$-fold repeats in the birthday problem with unequal probabilities. Technical Report 524, Dept. Statistics, U.C. Berkeley, 1998. Available via Math. Review number not available.
  10. T. Carleman. Zur Theorie der Linearen Integralgleichungen. Math. Zeit, 9:196-217, 1921. Math. Review number not available.
  11. D. J. Daley and D. Vere-Jones. An Introduction to the Theory of Point Processes. Springer-Verlag, Berlin, 1988. Math. Review MR90e:60060
  12. R. Durrett. Probability: Theory and Examples. Wadsworth-Brooks/Cole, 1995. 2nd ed. Math. Review MR98m:60001
  13. S. Evans and J. Pitman. Construction of Markovian coalescents. Ann. Inst. Henri Poincaré, 34:339-383, 1998. Math. Review MR99k:60184
  14. M. H. Gail, G. H. Weiss, N. Mantel, and S. J. O'Brien. A solution to the generalized birthday problem with application to allozyme screening for cell culture contamination. J. Appl. Probab., 16(2):242-251, 1979. Math. Review MR80d:92011
  15. L. Holst. On birthday, collectors' WHERE article_id=occupancy and other classical urn problems. International Statistical Review, 54:15 - 27, 1986. Math. Review MR89g:60028
  16. L. Holst. The general birthday problem. Random Structures Algorithms, 6(2-3):201-208, 1995. Proceedings of the Sixth International Seminar on Random Graphs and Probabilistic Methods in Combinatorics and Computer Science, ``Random Graphs '93'' (Poznan, 1993). Math. Review MR97f:60023
  17. J. Jaworski. On a random mapping $(T,P_j)$. J. Appl. Probab., 21:186 - 191, 1984. Math. Review MR85k:05091
  18. K. Joag-Dev and F. Proschan. Birthday problem with unlike probabilities. Amer. Math. Monthly, 99:10 - 12, 1992. Math. Review number not available.
  19. A. Joyal. Une théorie combinatoire des séries formelles. Adv. in Math., 42:1-82, 1981. Math. Review MR84d:05025
  20. R. Lyons and Y. Peres. Probability on Trees and Networks. Book in preparation, available at, 1996. Math. Review not available.
  21. S. Mase. Approximations to the birthday problem with unequal occurrence probabilities and their application to the surname problem in Japan. Ann. Inst. Statist. Math., 44(3):479-499, 1992. Math. Review MR93i:60018
  22. A. Meir and J. Moon. The distance between points in random trees. J. Comb. Theory, 8:99-103, 1970. Math. Review MR41 #8286
  23. J. Pitman. Abel-Cayley-Hurwitz multinomial expansions associated with random mappings, forests and subsets. Technical Report 498, Dept. Statistics, U.C. Berkeley, 1997. Available via Math. Review number not available.
  24. J. Pitman. The multinomial distribution on rooted labeled forests. Technical Report 499, Dept. Statistics, U.C. Berkeley, 1997. Available via Math. Review number not available.
  25. J. Pitman. Coalescent random forests. J. Comb. Theory A., 85:165-193, 1999. Math. Review MR2000a:05064
  26. A. Rényi. On the enumeration of trees. In R. Guy, H. Hanani, N. Sauer, and J. Schonheim, editors, Combinatorial Structures and their Applications, pages 355-360. Gordon and Breach, New York, 1970. Math. Review MR41 #8309
  27. B. Simon. Functional Integration and Quantum Physics, volume 86 of Pure and applied mathematics. Academic Press, New York, 1979. Math. Review MR84m:81066
  28. C. Stein. Application of Newton's identities to a generalized birthday problem and to the Poisson Binomial distribution. Technical Report 354, Dept. Statistics, Stanford University, 1990. Math. Review number not available.

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.