### Limit Distributions and Random Trees Derived from the Birthday Problem with Unequal Probabilities

**Michael Camarri**

*(University of California, Berkeley)*

**Jim Pitman**

*(University of California, Berkeley)*

#### Abstract

Given an arbitrary distribution on a countable set, consider the number of independent samples required until the first repeated value is seen. Exact and asymptotic formulae are derived for the distribution of this time and of the times until subsequent repeats. Asymptotic properties of the repeat times are derived by embedding in a Poisson process. In particular, necessary and sufficient conditions for convergence are given and the possible limits explicitly described. Under the same conditions the finite dimensional distributions of the repeat times converge to the arrival times of suitably modified Poisson processes, and random trees derived from the sequence of independent trials converge in distribution to an inhomogeneous continuum random tree.

Full Text: Download PDF | View PDF online (requires PDF plugin)

Pages: 1-18

Publication Date: November 16, 1999

DOI: 10.1214/EJP.v5-58

#### References

- D. Aldous.
*Probability Approximations via the Poisson Clumping Heuristic*, Volume 77 of*Applied Mathematical Sciences*. Springer-Verlag, New York, 1989. Math. Review MR90k:60004 - D. Aldous. The continuum random tree I.
*Ann. Probab.*, 19:1-28, 1991. Math. Review MR91i:60024 - D. Aldous. The continuum random tree III.
*Ann. Probab.*, 21:248-289, 1993. Math. Review MR94c:60015 - D. Aldous and J. Pitman. A family of random trees with random edge lengths.
*Random Structures and Algorithms*, 15:176-195, 1999. Math. Review MR1 704 343 - D. Aldous and J. Pitman. Inhomogeneous continuum random trees and the
entrance boundary of the additive coalescent.
Technical Report 525, Dept. Statistics, U.C. Berkeley, 1998.
To appear in
*Prob. Th. and Rel. Fields*. Available via http://www.stat.berkeley.edu/users/pitman. Math. Review number not available. - A. D. Barbour, L. Holst, and S. Janson.
*Poisson Approximation*. Clarendon Press, 1992. Math. Review MR93g:60043 - P. Billingsley.
*Probability and Measure*. Wiley, New York, 1995. 3rd ed. Math. Review MR95k:60001 - A. Broder. Generating random spanning trees.
In
*Proc. 30'th IEEE Symp. Found. Comp. Sci.*, pages 442-447, 1989. Math. Review number not available. - M. Camarri. Asymptotics for $k$-fold repeats in the birthday problem with unequal probabilities. Technical Report 524, Dept. Statistics, U.C. Berkeley, 1998. Available via http://www.stat.berkeley.edu. Math. Review number not available.
- T. Carleman. Zur Theorie der Linearen Integralgleichungen.
*Math. Zeit*, 9:196-217, 1921. Math. Review number not available. - D. J. Daley and D. Vere-Jones.
*An Introduction to the Theory of Point Processes*. Springer-Verlag, Berlin, 1988. Math. Review MR90e:60060 - R. Durrett.
*Probability: Theory and Examples*. Wadsworth-Brooks/Cole, 1995. 2nd ed. Math. Review MR98m:60001 - S. Evans and J. Pitman. Construction of Markovian coalescents.
*Ann. Inst. Henri Poincaré*, 34:339-383, 1998. Math. Review MR99k:60184 - M. H. Gail, G. H. Weiss, N. Mantel, and S. J. O'Brien.
A solution to the generalized birthday problem with application to
allozyme screening for cell culture contamination.
*J. Appl. Probab.*, 16(2):242-251, 1979. Math. Review MR80d:92011 - L. Holst. On birthday, collectors' WHERE article_id=occupancy and other classical urn
problems.
*International Statistical Review*, 54:15 - 27, 1986. Math. Review MR89g:60028 - L. Holst. The general birthday problem.
*Random Structures Algorithms*, 6(2-3):201-208, 1995. Proceedings of the Sixth International Seminar on Random Graphs and Probabilistic Methods in Combinatorics and Computer Science, ``Random Graphs '93'' (Poznan, 1993). Math. Review MR97f:60023 - J. Jaworski. On a random mapping $(T,P_j)$.
*J. Appl. Probab.*, 21:186 - 191, 1984. Math. Review MR85k:05091 - K. Joag-Dev and F. Proschan. Birthday problem with unlike probabilities.
*Amer. Math. Monthly*, 99:10 - 12, 1992. Math. Review number not available. - A. Joyal. Une théorie combinatoire des séries formelles.
*Adv. in Math.*, 42:1-82, 1981. Math. Review MR84d:05025 - R. Lyons and Y. Peres.
*Probability on Trees and Networks*. Book in preparation, available at http://www.ma.huji.ac.il/~lyons/prbtree.html, 1996. Math. Review not available. - S. Mase. Approximations to the birthday problem with unequal occurrence
probabilities and their application to the surname problem in Japan.
*Ann. Inst. Statist. Math.*, 44(3):479-499, 1992. Math. Review MR93i:60018 - A. Meir and J. Moon. The distance between points in random trees.
*J. Comb. Theory*, 8:99-103, 1970. Math. Review MR41 #8286 - J. Pitman. Abel-Cayley-Hurwitz multinomial expansions associated with random mappings, forests and subsets. Technical Report 498, Dept. Statistics, U.C. Berkeley, 1997. Available via http://www.stat.berkeley.edu/users/pitman. Math. Review number not available.
- J. Pitman. The multinomial distribution on rooted labeled forests. Technical Report 499, Dept. Statistics, U.C. Berkeley, 1997. Available via http://www.stat.berkeley.edu/users/pitman. Math. Review number not available.
- J. Pitman. Coalescent random forests.
*J. Comb. Theory A.*, 85:165-193, 1999. Math. Review MR2000a:05064 - A. Rényi. On the enumeration of trees.
In R. Guy, H. Hanani, N. Sauer, and J. Schonheim, editors,
*Combinatorial Structures and their Applications*, pages 355-360. Gordon and Breach, New York, 1970. Math. Review MR41 #8309 - B. Simon.
*Functional Integration and Quantum Physics*, volume 86 of*Pure and applied mathematics*. Academic Press, New York, 1979. Math. Review MR84m:81066 - C. Stein. Application of Newton's identities to a generalized birthday problem and to the Poisson Binomial distribution. Technical Report 354, Dept. Statistics, Stanford University, 1990. Math. Review number not available.

This work is licensed under a Creative Commons Attribution 3.0 License.