Introduction to Algorithms/Skip List
- Pugh 1989.
- dynamic search data structure.
- $O(\lg n)$ in expectation with high probability ($\approx 1-\frac{1}{n^{\alpha}}$).
Starting from scratch
(sorted) linked list, $\Theta (n)$ worst-case search.
Two (sorted) linked list?
Search(x)
: (Ex. 2 levels)
- walk right on top list until go right would go too far.
- walk down and go right unti find $x$ or ($>x$)
What keys go in L1?
- best is spread them out uniformly.
$\implies$ cost of search $\approx | L_{1} | +\frac{ | L_2 | }{ | L_{1} | }$. |
To minimize $x+\frac{n-x}{x}$ we chose $x = [\sqrt{ n }]$ and cost = $2\sqrt{ n }$.
More! For $k$ list, cost is $k n^{1/k}$. Minimize: $k=\lg n \implies$ cost is $\lg n \cdot n^{1/\lg n}=2\lg n$. And the ratio between two lists is 2.
Skip List
roughly subject to insert and delete.
Insert(x)
Search(x)
to find where $x$ fits in the bottom list.- Insert $x$ into the bottom list. Invarient bottom list stores all the elems.
- Which other lists should store $x$?
- Flip a fair coin.
- If heads: promote $x$ to the next level and flip again.
- Change: store $-\infty$ in every lists.
delete(x)
: just throw it away.
Intuitively: pretty good on average. Claim: really really good almost always.
Analysis
Theorem With high prob, every search in $n$-element skip list costs $O(\lg n)$.
With high prob: (w.h.p) event $E$ occurs w.h.p for any $\alpha \geqslant 1$ there is a suitable choice of const for which $E$ occurs with prob $\geqslant 1-O(\frac{1}{n^{\alpha}})$.
Booles’s inequality/union bounds:
\[P(E_{1} \cup E_{2} \cup \dots \cup E_{k}) \leqslant P(E_{1}) + P(E_{2})+\dots+P(E_{k})\]Suppose $\overline{E_{i}}$ occurs w.h.p, $k=n^c$, then $\overline{E_{1}} \cap \overline{E_{2}} \cap \dots \cap \overline{E_{k}}$ occurs w.h.p. $\implies$ All the searches in $O(\lg n)$ is w.h.p.
Lemma: w.h.p. # levels = $O(\lg n)$.
Proof: error prob for $c\lg n$ levels = $P(>c\lg n \ \text{levels})$.
\[P \leqslant nP(\text{x gets promoted}\geqslant c\lg n \ \text{times})=n \cdot \frac{1}{2^{c\lg n}}= \frac{1}{n^{c-1}}\]Choose $c-1=\alpha$.
Core Idea: backwards analyze research.
- search starts (ends ) at node in bottom list.
- at each node visited:
- if wasn’t promoted higher then go left.
- promoted then go up.
- stop at the root (starts).
Proof: # up moves $<$ # levels $\leqslant c\lg n$ w.h.p (lemma). $\implies$ w.h.p. # moves $\leqslant$ coin flips till $c\lg n$ heads. Claim that’s $O(\lg n)$ w.h.p.
Proof of the claim: Let’s say flip $10c\lg n$ coins, error prob:
\[P(\leqslant c\lg n \text{ Heads}) \leqslant \binom{10c\lg n}{c\lg n}\left( \frac{1}{2} \right)^{9c\lg n}\] \[\because \binom{y}{x} \leqslant \left( \frac{ey}{x} \right)^x\] \[\therefore P\leqslant \frac{(10e)^{c\lg n}}{2^{9c\lg n}}=\frac{1}{2^{[9-\lg(10e)]c\lg n}}\]Let $\alpha = [9-\lg(10e)]c$. For $ac \lg n$ coins, that’s $[a-1-\lg(ea)]c$.$\square$
Comments