Pipelined External Memory Triangle counting/listing algorithm
by
Roman Dementiev

counts/lists the number of triangles in a graph

Input: list of edges E of graph G

Algorithm:

  1. E := Sort E (assume if exists edge <u,v> then exists <v,u>, otherwise call Duplicate(E) )

  2. scan E, compute degrees list D := <node_id,degree>

  3. D := sort D by degree

  4. scan D and number node_ids producing list R := <node_id,new_id> (new_id is the value of counter, used for numbering)

  5. Sort R by node_id

  6. Renaming (heuristic: the nodes with the largest degrees are processed last)

    6.1 E' := scan E and R adding new_id for the source node: <u,v> => <u',u,v>

    6.2 E' := sort E' by v (the third component)

    6.3 E' := scan E' and R adding new_id for the destination node: <u',u,v,> => <u',v',u,v>

  7. Orient:

    7.1 scan E' and output only <u',v',u,v> if u'<v'

    7.2 sort E' by u',v' lexicographically

  8. Generate edge queries: (assume maximum degree < c M )

    oldu := -1;

    adj := {}

    foreach <u',v',u,v> in E'

      if (u' != oldu) // adjacency list of a new nodes

        oldu := u';

        adj := {v'};

                    else

        foreach w in adj

          insert <w,v',u> into Q (question list, each entry asks whether edge <w,v'> exists => exists a triangle via node u)

          if list Q is longer than k*length(E') than call AnswerQueries(Q, E')

                              add v' to the set adj


AnswerQueries(Q, E')
1. Q := sort Q by w,v' lexicographically

2. scan Q and E' if entries match then count/output the found triangle

3. make list Q empty


Duplicate(E)

1. E := scan E, for each edge <v,w> insert edge <w,v> at the beginning of E

2. E := sort E lexicographically

(3.) E := remove duplicates from E (optional)


Source code and Makefile (Stxxl required): [download .tgz]

Figure 1. Pipelined data flow graph for trangle listing/counting algorithm

Pipelined data flow graph for triangle listing/counting algorithm