Exam
Markov Chains
Bayesian Statistics
Conditional Expectation
Balls & Bins
- Really useful approximation
- $1 - x \approx e^{-x}$ (for small $x$)
- example usage: birthday paradox
- $\left(1 - \frac{1}{365}\right)\left(1 - \frac{2}{365}\right)\cdots\left(1 - \frac{k-1}{365}\right) \approx e^{-\frac{k(k-1)}{2 \cdot 365}}$
- because $1 - \frac{1}{365} \approx e^{-1/365}$ etc.
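A minimal sketch (standard library only) comparing the exact birthday probability with the $e^{-k(k-1)/(2\cdot 365)}$ approximation above; the function names and the chosen values of $k$ are illustrative, not from the notes.

```python
import math

def p_all_distinct_exact(k, days=365):
    """Exact probability that k people all have distinct birthdays."""
    p = 1.0
    for i in range(1, k):
        p *= 1 - i / days
    return p

def p_all_distinct_approx(k, days=365):
    """Approximation via 1 - x ≈ e^{-x}: the product becomes e^{-k(k-1)/(2*days)}."""
    return math.exp(-k * (k - 1) / (2 * days))

for k in (10, 23, 50):
    print(k, round(p_all_distinct_exact(k), 4), round(p_all_distinct_approx(k), 4))
```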
- Union bound
- for events $A_1, A_2, \ldots$ it holds that $P\left(\bigcup_i A_i\right) \le \sum_i P(A_i)$
- therefore $P(\text{none of the } A_i) \ge 1 - \sum_i P(A_i)$
- Balls and bins model
- we throw $m$ balls independently and uniformly at random into $n$ bins
- $X_i$ … number of balls in bin $i$
- How many bins are empty?
- $P(X_i = 0) = \left(1 - \frac{1}{n}\right)^m \approx e^{-m/n}$
- $I_i = 1$ if $X_i = 0$ (otherwise $I_i = 0$)
- $E\left(\sum_i I_i\right) = \sum_i E[I_i] \approx n e^{-m/n}$
- How many balls are in bin i?
- $X_i \sim \mathrm{Bin}(m, \frac{1}{n}) \approx \mathrm{Pois}(\frac{m}{n})$
- $E[X_i] = \frac{m}{n}$
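A small simulation sketch of the two questions above; the values of $m$, $n$ and the number of trials are arbitrary choices for illustration, not from the notes.

```python
import math
import random

def throw_balls(m, n):
    """Throw m balls uniformly at random into n bins; return the bin loads X_1..X_n."""
    loads = [0] * n
    for _ in range(m):
        loads[random.randrange(n)] += 1
    return loads

m, n, trials = 1000, 500, 200
empty = [sum(1 for x in throw_balls(m, n) if x == 0) for _ in range(trials)]
print("empirical mean number of empty bins:", sum(empty) / trials)
print("prediction n * e^{-m/n}            :", n * math.exp(-m / n))
```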
- Max-load likely upper bound
- definition: max-load … $\max\{X_1, \ldots, X_n\}$
- theorem: for $m = n$ and large enough $n$ we have $P\left(\text{max-load} \ge \frac{3\log n}{\log\log n}\right) \le \frac{1}{n}$
- proof
- Max-load likely lower bound
- theorem: for $m = n$ and large enough $n$ we have $P\left(\text{max-load} \le \frac{\log n}{\log\log n}\right) \le \frac{1}{n}$
- proof
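A quick empirical check of the upper-bound theorem above (a sketch: the value of $n$, the number of trials and the use of natural logarithms are my choices, not fixed by the notes).

```python
import math
import random

def max_load(n):
    """Throw n balls into n bins and return the maximum load."""
    loads = [0] * n
    for _ in range(n):
        loads[random.randrange(n)] += 1
    return max(loads)

n, trials = 10_000, 200
bound = 3 * math.log(n) / math.log(math.log(n))
exceed = sum(1 for _ in range(trials) if max_load(n) >= bound)
print("3 log n / log log n :", bound)
print("fraction >= bound   :", exceed / trials, " (theorem: at most 1/n =", 1 / n, ")")
```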
- Exact case vs. Poisson case
- theorem: any event that happens with probability $\le p$ in the Poisson case happens with probability $\le p \cdot e\sqrt{m}$ in the exact case
- proof
Stochastic Processes
- Bernoulli process
- infinite sequence of independent identically distributed Bernoulli trials
- sequence of RVs $X_1, X_2, \ldots$ that are independent and each follows the $\mathrm{Ber}(p)$ distribution
- $X_k = 1$ … success/arrival at time $k$
- Quantities of a Bernoulli process
- $N_t$ … number of successes up to time $t$
- $N_t \sim \mathrm{Bin}(t, p)$
- $E[N_t] = t \cdot p$
- $T_k$ … time of the $k$-th success
- $L_k = T_k - T_{k-1}$
- waiting time
- memoryless
- $L_k \sim \mathrm{Geom}(p)$
- $E[L_k] = \frac{1}{p}$
- $P(L_k = l) = (1-p)^{l-1} \cdot p$
- $T_k$ properties
- $T_k = L_1 + \cdots + L_k$
- by linearity, $E[T_k] = \frac{k}{p}$
- $T_k$ has the Pascal distribution of order $k$
- $P(T_k = t) = \begin{cases} \binom{t-1}{k-1}\, p^k (1-p)^{t-k} & \text{for } t \ge k \\ 0 & \text{otherwise} \end{cases}$
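A sketch that simulates one path of a Bernoulli process and checks $E[N_t] = tp$ and $E[L_k] = 1/p$ empirically; the values of $p$ and $t$ are arbitrary.

```python
import random

def bernoulli_process(p, t):
    """One sample path X_1, ..., X_t of a Bernoulli(p) process."""
    return [1 if random.random() < p else 0 for _ in range(t)]

p, t = 0.3, 100_000
xs = bernoulli_process(p, t)

n_t = sum(xs)                                             # N_t ~ Bin(t, p)
arrivals = [k + 1 for k, x in enumerate(xs) if x == 1]    # T_1, T_2, ...
gaps = [b - a for a, b in zip([0] + arrivals, arrivals)]  # L_k = T_k - T_{k-1} ~ Geom(p)

print("N_t =", n_t, " vs  t*p =", t * p)
print("mean interarrival =", sum(gaps) / len(gaps), " vs  1/p =", 1 / p)
```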
- Alternative description of a Bernoulli process
- we can describe the situation by the interarrival times
- i.i.d. random variables $L_1, L_2, \ldots \sim \mathrm{Geom}(p)$
- $T_k = L_1 + \cdots + L_k$
- $X_k = 1$ whenever $T_i = k$ for some $i$
- clearly, the sequence $X_1, X_2, \ldots$ is a Bernoulli process
- Merging of Bernoulli processes
- two independent Bernoulli processes; $Z_i = X_i \vee Y_i$
- $BP(p)$ merged with $BP(q)$ gives $BP(1 - (1-p)(1-q))$
- $= BP(p + q - pq)$
- Splitting of Bernoulli processes
- $X_i, Y_i, Z_i$
- $Z_i \sim BP(p)$
- if $Z_i = 0$, then $X_i = Y_i = 0$
- if $Z_i = 1$, then with probability $q$ we set $(X_i, Y_i) = (1, 0)$, otherwise $(X_i, Y_i) = (0, 1)$
- then $X_i \sim BP(pq)$ and $Y_i \sim BP(p(1-q))$
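A sketch of merging and splitting on simulated paths; the rates $p$, $q$ and the path length are made-up values.

```python
import random

def bp(p, t):
    """One sample path X_1..X_t of a Bernoulli(p) process."""
    return [1 if random.random() < p else 0 for _ in range(t)]

p, q, t = 0.2, 0.4, 100_000

# merging: Z_i = X_i OR Y_i should behave like BP(p + q - pq)
xs, ys = bp(p, t), bp(q, t)
zs = [x | y for x, y in zip(xs, ys)]
print("merged rate :", sum(zs) / t, " vs  p + q - pq =", p + q - p * q)

# splitting: each success of Z ~ BP(p) goes to X with probability q, otherwise to Y
zs = bp(p, t)
xs = [1 if z == 1 and random.random() < q else 0 for z in zs]
ys = [z - x for z, x in zip(zs, xs)]
print("split rates :", sum(xs) / t, sum(ys) / t, " vs ", p * q, p * (1 - q))
```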
- Poisson process
- continuous time
- $T_1, T_2, T_3, \ldots$ are the times of the individual arrivals
- $N_t$ … number of arrivals up to time $t$
- $\lambda$ … intensity (expected number of arrivals per unit of time)
- $N_t \sim \mathrm{Pois}(\lambda \cdot t)$
- $L_k \sim \mathrm{Exp}(\lambda)$
- $T_k$ has the Erlang distribution, $E[T_k] = \frac{k}{\lambda}$
- Poisson process interval independence
- for any sequence $0 \le t_0 < t_1 < \cdots < t_k$
- the RVs $N_{t_{i+1}} - N_{t_i}$ (for $i = 0, \ldots, k-1$)
- are independent
- each one follows $\mathrm{Pois}(\lambda(t_{i+1} - t_i))$
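A sketch that builds a Poisson process from $\mathrm{Exp}(\lambda)$ interarrivals and checks that $N_t$ has mean and variance close to $\lambda t$; the intensity, horizon and trial count are arbitrary choices.

```python
import random

def poisson_arrivals(lam, horizon):
    """Arrival times T_1, T_2, ... of PP(lam) on [0, horizon], built from Exp(lam) interarrivals."""
    times, t = [], 0.0
    while True:
        t += random.expovariate(lam)   # L_k ~ Exp(lambda)
        if t > horizon:
            return times
        times.append(t)

lam, horizon, trials = 2.0, 10.0, 2000
counts = [len(poisson_arrivals(lam, horizon)) for _ in range(trials)]
mean = sum(counts) / trials
var = sum((c - mean) ** 2 for c in counts) / trials
print("mean:", mean, " var:", var, " both should be close to lam*t =", lam * horizon)
```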
- Merging of Poisson process
- $PP(\lambda)$ merged with $PP(\lambda')$ gives $PP(\lambda + \lambda')$
- Splitting of Poisson process
- for each arrival of $PP(\lambda)$, we toss a coin: with probability $p$ the arrival is kept (counted as valid)
- the resulting process is $PP(\lambda p)$
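A sketch of merging (superposition) and splitting (thinning) on simulated Poisson processes; the rates, keep-probability and horizon are illustrative values.

```python
import random

def poisson_arrivals(lam, horizon):
    """Arrival times of PP(lam) on [0, horizon]."""
    times, t = [], 0.0
    while True:
        t += random.expovariate(lam)
        if t > horizon:
            return times
        times.append(t)

lam1, lam2, p, horizon = 3.0, 1.5, 0.4, 10_000.0

# merging: the union of two independent processes has intensity lam1 + lam2
merged = sorted(poisson_arrivals(lam1, horizon) + poisson_arrivals(lam2, horizon))
print("merged rate :", len(merged) / horizon, " vs ", lam1 + lam2)

# splitting (thinning): keep each arrival of PP(lam1) independently with probability p
kept = [t for t in poisson_arrivals(lam1, horizon) if random.random() < p]
print("thinned rate:", len(kept) / horizon, " vs ", lam1 * p)
```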
Non-parametric Tests
- Permutation test
- two-sample test
- A/B testing
- A … control group (no treatment)
- B … group with treatment applied
- is the A vs. B difference only random?
- we observe … $T(A, B)$
- $T$ … test statistic
- example: $T = \bar{A} - \bar{B}$
- $F = \{T(\pi(A \cup B)) \mid \pi \in S_{m+n}\}$ … the values of the statistic over all permutations of the pooled sample
- p-value: proportion of values in $F$ at least as extreme as the observed $T(A, B)$
- speed-up … we use only $k$ random permutations from $S_{m+n}$
- for a paired test, we only permute $X_i$ with $Y_i$ (swap within pairs)
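A sketch of the randomized permutation test with $T = \bar{A} - \bar{B}$; the function name, the number of resamples and the sample data are made up for illustration.

```python
import random

def permutation_test(a, b, resamples=10_000):
    """Two-sided permutation test for the difference in means T = mean(A) - mean(B)."""
    observed = sum(a) / len(a) - sum(b) / len(b)
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(resamples):
        random.shuffle(pooled)
        t = sum(pooled[:len(a)]) / len(a) - sum(pooled[len(a):]) / len(b)
        if abs(t) >= abs(observed):
            extreme += 1
    return extreme / resamples

control = [12.1, 11.4, 13.0, 12.7, 11.9]    # group A (made-up data)
treated = [13.2, 13.9, 12.8, 14.1, 13.5]    # group B (made-up data)
print("p-value ≈", permutation_test(control, treated))
```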
- Sign test
- paired test
- pairs $X_i, Y_i$
- $Z_i = Y_i - X_i$
- $H_0$ … $P(X_i > Y_i) = \frac{1}{2}$
- count the number of pairs where $Z_i > 0$ (i.e., there was a measurable improvement after the treatment)
- use $\mathrm{Bin}(n, \frac{1}{2})$ to get the p-value
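A sketch of the sign test with the p-value computed directly from $\mathrm{Bin}(n, \frac{1}{2})$; the data and helper names are illustrative, and ties ($Z_i = 0$) are simply dropped.

```python
import math

def sign_test_p_value(xs, ys):
    """Two-sided sign test: under H0 the number of pairs with Y_i > X_i is Bin(n, 1/2)."""
    diffs = [y - x for x, y in zip(xs, ys) if y != x]   # drop ties
    n = len(diffs)
    k = sum(1 for d in diffs if d > 0)
    # probability of a result at least as far from n/2 as the observed k, under Bin(n, 1/2)
    return sum(math.comb(n, j) for j in range(n + 1)
               if abs(j - n / 2) >= abs(k - n / 2)) / 2 ** n

before = [100, 102, 98, 97, 103, 101, 99, 100]   # X_i (made-up data)
after  = [103, 105, 99, 99, 104, 100, 102, 104]  # Y_i (made-up data)
print("p-value ≈", sign_test_p_value(before, after))
```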
- Wilcoxon signed rank test
- paired test
- procedure
- let $X_i$ be the difference $B_i - A_i$
- sort by absolute value, assign ranks ($R_i$)
- calculate $T$ (or $T^+$ or $T^-$; it does not matter which)
- $T^+ = \sum_{X_i > 0} R_i$
- $T^- = \sum_{X_i < 0} R_i$
- $T = T^+ - T^- = \sum_i R_i \cdot \mathrm{sign}(X_i)$
- compare with the scores obtained for the same measurements with random $+/-$ signs
- we use a stronger null hypothesis
- $H_0$: median $= 0$ and the distribution of $X$ is symmetric
- note: $X$ is symmetric $\iff$ $A$, $B$ have the same continuous distribution (source)
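A sketch of the randomized version of this test: compute $T$ for the observed differences and compare it with $T$ recomputed under random $\pm$ signs. Tied absolute values are not given mean ranks here, and the sample data are made up.

```python
import random

def signed_rank_T(diffs):
    """T = sum of ranks of |d| taken with the sign of d (zero differences are dropped)."""
    nonzero = [d for d in diffs if d != 0]
    order = sorted(range(len(nonzero)), key=lambda i: abs(nonzero[i]))
    ranks = [0] * len(nonzero)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return sum(r if d > 0 else -r for d, r in zip(nonzero, ranks))

def wilcoxon_p_value(before, after, resamples=20_000):
    """Two-sided p-value by comparing T with its distribution under random +/- signs."""
    diffs = [b - a for a, b in zip(before, after)]
    observed = signed_rank_T(diffs)
    extreme = 0
    for _ in range(resamples):
        flipped = [abs(d) * random.choice((-1, 1)) for d in diffs]
        if abs(signed_rank_T(flipped)) >= abs(observed):
            extreme += 1
    return extreme / resamples

before = [100, 102, 98, 97, 103, 101, 99, 100]   # A_i (made-up data)
after  = [103, 105, 99, 99, 104, 100, 102, 104]  # B_i (made-up data)
print("p-value ≈", wilcoxon_p_value(before, after))
```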
- Mann-Whitney U-test
- two-sample test
- we have two populations, X and Y
- we get samples $X_1, \ldots, X_m$ and $Y_1, \ldots, Y_n$
- usually, $m = n$
- $S(X_i, Y_j) = \begin{cases} 1 & X_i > Y_j \\ \frac{1}{2} & X_i = Y_j \\ 0 & X_i < Y_j \end{cases}$
- $U = \sum_{i=1}^{m} \sum_{j=1}^{n} S(X_i, Y_j)$
- sign test is a paired variant of this
- we use the same approach as in the permutation test, with $U$ instead of $T = \bar{X}_m - \bar{Y}_n$
- alternative view: we sort $X \cup Y$ and count the pairs “$X_i$ before $Y_j$”
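A sketch of the $U$ statistic and a permutation p-value built on it; all names, parameters and sample scores are illustrative.

```python
import random

def u_statistic(xs, ys):
    """U = number of pairs with X_i > Y_j, counting ties as 1/2."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in xs for y in ys)

def u_test_p_value(xs, ys, resamples=10_000):
    """Two-sided permutation p-value: reshuffle the pooled sample and recompute U."""
    observed = u_statistic(xs, ys)
    pooled = list(xs) + list(ys)
    m = len(xs)
    center = len(xs) * len(ys) / 2          # E[U] under H0
    extreme = 0
    for _ in range(resamples):
        random.shuffle(pooled)
        u = u_statistic(pooled[:m], pooled[m:])
        if abs(u - center) >= abs(observed - center):
            extreme += 1
    return extreme / resamples

xs = [3, 1, 4, 2, 3, 2]      # ordinal scores of group X (made-up data)
ys = [4, 5, 3, 5, 4, 5, 2]   # ordinal scores of group Y (made-up data)
print("U =", u_statistic(xs, ys), " p-value ≈", u_test_p_value(xs, ys))
```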
- when to use it?
- numerical data … real numbers (weight, time, price) → permutation test
- ordinal data … ordered categories (light/medium/heavy, quick/slow) → U-test
Moment Generating Functions and their applications
- Markov inequality
- for $X$ nonnegative with mean $\mu$
- $\forall a \ge 1: P(X \ge a \cdot \mu) \le \frac{1}{a}$
- Chebyshev inequality
- $\forall a > 0: P(|X - \mu| \ge a \cdot \sigma) \le \frac{1}{a^2}$
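A quick numerical sanity check of both inequalities for $X \sim \mathrm{Exp}(1)$ (so $\mu = \sigma = 1$); the choice of distribution, of $a$ and of the trial count is arbitrary.

```python
import random

trials, a = 100_000, 3.0
mu = sigma = 1.0                      # for Exp(1): mean 1, standard deviation 1
xs = [random.expovariate(1.0) for _ in range(trials)]

p_markov = sum(1 for x in xs if x >= a * mu) / trials
p_cheby  = sum(1 for x in xs if abs(x - mu) >= a * sigma) / trials
print("P(X >= a*mu)           =", p_markov, " <= 1/a   =", 1 / a)
print("P(|X - mu| >= a*sigma) =", p_cheby,  " <= 1/a^2 =", 1 / a ** 2)
```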
- Chernoff bounds
- for $X = \sum_{i=1}^{n} X_i$ where $X_i \sim \mathrm{Ber}(p_i)$ are independent
- $\mu = \sum_{i=1}^{n} p_i$
- $\forall \delta \in (0, 1]: P(X \ge (1+\delta) \cdot \mu) \le e^{-\mu\delta^2/3}$
- $\forall \delta \in (0, 1): P(X \le (1-\delta) \cdot \mu) \le e^{-\mu\delta^2/2}$
- note: there are many more alternative bounds
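A sketch comparing the empirical upper tail of a sum of independent Bernoulli variables with the first Chernoff bound above; the values of $n$, $p$, $\delta$ and the trial count are arbitrary.

```python
import math
import random

n, p, delta, trials = 1000, 0.05, 0.5, 10_000
mu = n * p

exceed = 0
for _ in range(trials):
    x = sum(1 for _ in range(n) if random.random() < p)   # X = sum of independent Ber(p)
    if x >= (1 + delta) * mu:
        exceed += 1

print("empirical P(X >= (1+d)*mu)   :", exceed / trials)
print("Chernoff bound e^{-mu*d^2/3} :", math.exp(-mu * delta ** 2 / 3))
```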