# Python: Binary Search / Bisection / Binary Search Tree (BST)

Yao Yao on October 4, 2017

• bisection 是 “二等分” (指 action)，bisector 是 “二等分线”，这俩其实是几何的概念，不管你怎么引申，回溯到几何是最好理解的
• bisection method 是 “二分法”，是一系列使用 bisection 思想的方法的统称
• binary search 是 “二分查找”，这么翻译也行吧，我觉得英文的问题更大一点，叫 bisection search 不好么？
• binary search tree (BST) 是 “二叉搜索树”，从 tree 的角度而言，叫 binary 是 OK 的
• 但是你 binary search 不能翻译成 “二叉搜索”，我们在搜索时没有 “叉” 这个动作

## 1. Bisection in Geometry

• bisect.bisect_left(X, 1) 就等价于 bisect X at the left boundary of (the neighborhood of) 1
• 那么 $x=1$ 这个点就属于右半边
• bisect.bisect_right(X, 1) 就等价于 bisect X at the right boundary of (the neighborhood of) 1
• 那么 $x=1$ 这个点就属于左半边

## 2. Bisection in Python

bisect.bisect_leftbisect.bisect_right 主要用来切 sequence，当然前提是假设 sequence 是 sorted (like an $x$-axis)。

def bisect_left(a, x, lo=0, hi=None):
if lo < 0:
raise ValueError('lo must be non-negative')
if hi is None:
hi = len(a)
while lo < hi:
mid = (lo+hi)//2
if a[mid] < x:  # ANCHOR
lo = mid+1
else:
hi = mid
return lo

def bisect_right(a, x, lo=0, hi=None):
if lo < 0:
raise ValueError('lo must be non-negative')
if hi is None:
hi = len(a)
while lo < hi:
mid = (lo+hi)//2
if a[mid] <= x:  # ANCHOR
lo = mid+1
else:
hi = mid
return lo


Return the index where to insert item x in list a, assuming a is sorted.

### 2.1 When x in a

• bisect_lefta[lo] 会停靠在 first occurrence of == x，然后一直走 else，让 hi 左移，往 lo 靠近
• bisect_right 如果一直有 a[mid] <= x，那么 lo 会一直右移，直到 first occurrence of > x，然后再是 hi 左移，往 lo 靠近

from bisect import bisect_left, bisect_right

a = [0, 1, 1, 1, 2]

i, j = bisect_left(a, 1), bisect_right(a, 1)
print(i, j)

# Output: 1 4


a = [0, 1, 1, 1, 2]
^        ^
i        j


• $\forall e \in$ a[:i], $e < x$
• $\forall e \in$ a[i:], $e \geq x$
• a[i] == x
• $\forall e \in$ a[:j], $e \leq x$
• $\forall e \in$ a[j:], $e > x$
• a[j] > x
• $\forall e \in$ a[i:j], $e = x$

• a.insert(i, ?) inserts just before the leftmost x
• a.insert(j, ?) inserts just after the rightmost x
• 无论是 a.insert(i, 1) 还是 a.insert(j, 1) 可以把 1 放置到正确的地方并保持 a sorted
# after a.insert(i, ?)
a = [0, ?, 1, 1, 1, 2]
^
i

# after a.insert(j, ?)
a = [0, 1, 1, 1, ?, 2]
^
j


1. a[0] == x 然后做 bisect_left(a, x) $\Rightarrow$ 会得到 i == 0
2. a[-1] == x 然后做 bisect_right(a, x) $\Rightarrow$ 会得到 j == len(a)
from bisect import bisect_left, bisect_right

a = [0, 1, 1, 1, 2]

i = bisect_left(a, 0)   # i == 0
j = bisect_right(a, 2)  # j == 5

a = [0, 1, 1, 1, 2] _
^              ^
i              j


### 2.2 When x not in a

x not in a 的时候， # ANCHOR 的两句，if a[mid] < xif a[mid] <= x 都会跨越到 first occurrence of > x，所以此时这两个函数是相同的逻辑，偏 bisect_right 的风格：

from bisect import bisect_left, bisect_right

a = [0, 0, 0, 0, 22]

i, j = bisect_left(a, 1), bisect_right(a, 1)
print(i, j)

# Output: 4 4


a = [0, 0, 0, 0, 22]
^^
ij


• $\forall e \in$ a[:i], $e < x$
• $\forall e \in$ a[i:], $e > x$

• a.insert(i, 1) 可以把 1 放置到正确的地方并保持 a sorted
• insert 之后一般都会有 a[i-1] < a[i] == 1 < a[i+1] (边界的特殊情况不考虑)

### 2.3 总结

• bisect_left(a, x) returns the index of the last occurrence of < x plus 1
• bisect_right(a, x) returns the index of the first occurrence of > x

def binary_search(a, x, lo=0, hi=None):
if lo < 0:
raise ValueError('lo must be non-negative')
if hi is None:
hi = len(a)
while lo < hi:
mid = (lo+hi)//2
if a[mid] < x:
lo = mid+1
elif a[mid] == x:  # ANCHOR
return mid
else:
hi = mid
return -1  # -1 is the error code if x not found in a


1. bisect module 并没有 提供 binary search 的接口
2. 在强假设 x in a 存在时，bisect_left(a, x) 碰巧 返回的是 the index of the first occurence of x，但它的逻辑 并不是 严格的 binary search

from bisect import bisect_left

def binary_search(a, x):
i = bisect_left(a, x)

# bisect_left 不会 return -1
if a[i] != x:
return -1

return i


## 4. A Subtle Bug in Above Code

mid = lo + (hi - lo) // 2


## 5. Binary Search Tree (BST)

TBD

blog comments powered by Disqus