Disjoint Set (Improved)

개선된 버전인 Disjoint Set에 대해 알아보려고 한다. 보다 자세한 내용은 아래 페이지를 확인하면 된다.

(https://www.geeksforgeeks.org/union-find-algorithm-set-2-union-by-rank/)

이전에 살펴본 Disjoint Set은 Union-Find 알고리즘 사용하여 undirected graph 에서 순환을 찾는 방식을 취했었다.

def find_parent(self, parent, i):
    if parent[i] == -1:
        return i
    if parent[i] != -1:
        return self.find_parent(parent, parent[i])

        # A utility function to do union of two subsets 

def union(self, parent, x, y):
    x_set = self.find_parent(parent, x)
    y_set = self.find_parent(parent, y)

parent[x_set] = y_set

다시금 코드를 살펴보면 위와 같은데 Worst Case 상황에서는 O(n) 시간 복잡도가 발생하게 된다.

즉, Parent를 찾기 위해 내부적으로 재귀적으로 접근하는데 마치 Linked List 처럼 Linear하게 연결되기 때문이다. (아래 참조)

(Union Subset이 크면 클수록 더 오래 걸린다.)

Let there be 4 elements 0, 1, 2, 3

Initially, all elements are single element subsets.
0 1 2 3 

Do Union(0, 1)
   1   2   3  
  /
 0

Do Union(1, 2)
     2   3   
    /
   1
 /
0

Do Union(2, 3)
         3    
        /
      2
     /
   1
 /
0

이와 같은 문제를 개선된 버전에서는 O(Logn)으로 줄일 수 있다. 개선된 버전에서 사용하는 알고리즘은 Deeper Tree에 Smaller Tree를 붙이는 방식을 사용한다.

이 방식은 union by rank로 여기서 rank는 Tree의 높이 대신에 사용한다. (내부적으로 경로 압축 기법을 사용했을 때, 항상 rank가 Tree의 높이와 같이 않으므로) 또한 Tree의 크기는 rank로 사용한다.

그럼 글로만 설명해서는 이해가 쉽게 되지 않으므로 아래 그림을 살펴보도록 하자.

Let us see the above example with union by rank
Initially, all elements are single element subsets.
0 1 2 3 

Do Union(0, 1)
   1   2   3  
  /
 0

Do Union(1, 2)
   1    3
 /  \
0    2

Do Union(2, 3)
    1    
 /  |  \
0   2   3

union by rank에서는 경로 압축 기법을 사용하는데 find()를 호출했을 때 해당 노드를 평평하게(flatten) 만든다.

여기서 find()는 해당 노드 x의 root 노드를 반환해준다. 즉 find()는 Tree의 특정 노드에서 위로 올라가며 root 노드를 찾는다. 이후 해당 root 노드는 노드 x의 부모 노드가 된다.

(따라서 다시 찾을 때 Tree를 훑어보지 않아도 된다.)

Let the subset {0, 1, .. 9} be represented as below and find() is called
for element 3.
              9
         /    |    \  
        4     5      6
     /     \        /  \
    0        3     7    8
            /  \
           1    2  

When find() is called for 3, we traverse up and find 9 as representative
of this subset. With path compression, we also make 3 as the child of 9 so 
that when find() is called next time for 1, 2 or 3, the path to root is reduced.

               9
         /    /  \    \
        4    5    6     3 
     /           /  \   /  \
    0           7    8  1   2

이를 바탕으로 다시 그림을 살펴보면, 3의 값을 갖고 있는 노드에서 find()가 호출될 때 root 노드 9를 찾게 되고, 해당 노드가 3의 루트 노드가 되면서 두 번째 그림 처럼 바뀌게 된다.

그럼 코드로 살펴보면 다음과 같다.

# A union by rank and path compression based
# program to detect cycle in a graph
from collections import defaultdict


# a structure to represent a graph
class Graph:

    def __init__(self, num_of_v):
        self.num_of_v = num_of_v
        self.edges = defaultdict(list)

    # graph is represented as an
    # array of edges
    def add_edge(self, u, v):
        self.edges[u].append(v)


class Subset:
    def __init__(self, parent, rank):
        self.parent = parent
        self.rank = rank

    # A utility function to find set of an element


# node(uses path compression technique)
def find(subsets, node):
    if subsets[node].parent != node:
        subsets[node].parent = find(subsets, subsets[node].parent)
    return subsets[node].parent


# A function that does union of two sets
# of u and v(uses union by rank)
def union(subsets, u, v):
    # Attach smaller rank tree under root
    # of high rank tree(Union by Rank)
    if subsets[u].rank > subsets[v].rank:
        subsets[v].parent = u
    elif subsets[v].rank > subsets[u].rank:
        subsets[u].parent = v

    # If ranks are same, then make one as
    # root and increment its rank by one
    else:
        subsets[v].parent = u
        subsets[u].rank += 1


# The main function to check whether a given
# graph contains cycle or not
def isCycle(graph):
    # Allocate memory for creating sets
    subsets = []

    for u in range(graph.num_of_v):
        subsets.append(Subset(u, 0))

    # Iterate through all edges of graph,
    # find sets of both vertices of every
    # edge, if sets are same, then there
    # is cycle in graph.
    for u in graph.edges:
        u_rep = find(subsets, u)

        for v in graph.edges[u]:
            v_rep = find(subsets, v)

            if u_rep == v_rep:
                return True
            else:
                union(subsets, u_rep, v_rep)

많이 길어보이지만 클래스 선언하는 부분을 제외하고는 메인 코드를 몇 줄 되지 않는다.

하나씩 살펴보면 먼저 Subset 클래스에 rank가 포함되어 있다. rank는 위의 이론에서 잠시 언급했듯이 해당 루트 노드가 가지고 있는 하위 트리의 개수를 나타낸다.

(각각의 노드들이 root 노드에 의해 flatten 되기 때문에)

또한 union 메소드는 rank를 사용하여 서로 합친다. 이전 방식은 단순히 Linked List 방식으로 묶는 방식이었다면 union by rank는 rank가 높은 곳을 우선시하고 만약 rank가 같다면 root 노드 밑에 flatten 하게 된다.

이렇듯 개선된 Disjoint Set은 Union-Find를 사용하는 기법에 비해 좀 더 시간적으로 빠르다.

저작자표시

'IT > Algorithm' 카테고리의 다른 글

Bellman Ford (0)	2019.08.25
Disjoint Set (0)	2019.07.22
0-1 Knapsack (0)	2019.06.28
Window Sliding Technique (0)	2019.06.22
Trie (0)	2019.06.11

Small talks with something

Disjoint Set (Improved)

'IT > Algorithm' 카테고리의 다른 글

댓글

티스토리툴바

Disjoint Set (Improved)

'IT > Algorithm' 카테고리의 다른 글

관련글

댓글

티스토리툴바