In the shower this morning I revisited the sorting subject. The prior post presented an inefficient, though interesting (to me), approach using a topological-ordering graph algorithm. Is it possible to make it efficient?
Linear graph creation
So I went back to my initial idea of creating the graph with three types of edges: less than (LT), greater than (GT), and equal (EQ). Thus, it would be a linear process to iterate the list of objects and create the graph based on the relationship between neighbors. That is, compare the first with the second, the second with the third, and so forth.
With the original list [9, 2, 12, 40, 12, 3, 1], the resulting graph is:
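The neighbor-by-neighbor pass described above can be sketched in a few lines of Python. This is only a minimal sketch: the function name `neighbor_edges` and the convention that the label describes how the left element relates to its right neighbor are my assumptions, and they may not match the edge direction used in the lists further down. Indices are used as vertices so the two 12s stay distinct.

```python
def neighbor_edges(items):
    """Compare each element with its right-hand neighbor and label the edge.

    Returns (left_index, right_index, label) tuples, where label is
    'LT', 'GT', or 'EQ' (assumed convention: left relative to right).
    """
    edges = []
    for i in range(len(items) - 1):
        a, b = items[i], items[i + 1]
        if a < b:
            label = "LT"
        elif a > b:
            label = "GT"
        else:
            label = "EQ"
        edges.append((i, i + 1, label))
    return edges


data = [9, 2, 12, 40, 12, 3, 1]
edges = neighbor_edges(data)
for i, j, label in edges:
    print(f"{data[i]} -{label}-> {data[j]}")
```

A list of n items yields n - 1 edges from a single pass, which is where the linear cost of building the graph comes from.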
Embedded sub graphs
Is there a way to break that graph apart and merge the pieces back together so that the overall algorithm is efficient? By labeling the edges we have, in effect, created a weighted digraph.
What we have is potentially three digraphs, listed here with each edge inverted, as (sink, source):
The less than graph LT: (2,9),(12,40),(3,12),(1,3)
The greater than graph GT: (2,12),(12x,40)
The equals graph EQ: in this example empty, but the 12 -> 12x equality is in the original data.
Sub digraph sorting and merge
One approach to sorting the complete list is taking each subgraph (LT, GT, EQ), sorting each and then merging the results.
The LT graph sorted is:
The GT graph sorted is:
Then we just apply a merge algorithm. The gain with this approach is that we broke the graph into two or three subgraphs. We can reapply this recursively, but isn't this just a rediscovery of existing sorts like Merge Sort?
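For reference, the merge step itself is the standard two-way merge used by Merge Sort. The sketch below is a generic version, not code from the previous post. The two input lists are made-up sorted sequences chosen for illustration, not the actual sorted LT and GT subgraphs.

```python
def merge_sorted(left, right):
    """Merge two already-sorted lists into one sorted list in linear time."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Take from the left on ties so equal items keep their relative order.
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of these still has items left over.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


# Hypothetical inputs for illustration only.
print(merge_sorted([1, 3, 12], [2, 9, 12, 40]))  # [1, 2, 3, 9, 12, 12, 40]
```

The merge only works if both inputs are already sorted, so the cost of sorting each subgraph still has to be paid somewhere.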
Continued graph augmentation?
Combined into one graph we get:
Note that if we take the sink end of each edge we have:
less direction: 2, 12, 3, 1
greater direction: 12x, 2
Compare this to the original graph from the previous blog post, where we compared every node to every other node:
Just noticed that in the presorted digraph above there is a link from vertex 1 to 2 then to 3, but there is also a link from 1 to 3. If we remove the shortest link and keep the longest multiple links (the reverse of some graph algorithm goals) we are left with the correct sequence 1, 2, 3. Hmmm.
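Dropping a direct link whenever a longer path connects the same two vertices is what graph theory calls a transitive reduction. Here is a minimal sketch on just the three vertices mentioned above. The adjacency dictionary and the helper names `reachable` and `drop_shortcuts` are mine, for illustration. It assumes the graph is acyclic.

```python
def reachable(graph, start, target, skip_edge):
    """Depth-first search from start to target, ignoring one direct edge."""
    stack = [start]
    seen = set()
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if (node, nxt) == skip_edge:
                continue
            if nxt == target:
                return True
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def drop_shortcuts(graph):
    """Remove every edge (u, v) for which a longer path from u to v exists."""
    return {
        u: [v for v in targets if not reachable(graph, u, v, (u, v))]
        for u, targets in graph.items()
    }


# 1 -> 2 -> 3 plus the shortcut 1 -> 3
graph = {1: [2, 3], 2: [3], 3: []}
reduced = drop_shortcuts(graph)
print(reduced)  # {1: [2], 2: [3], 3: []}
```

What remains is a single chain, 1, 2, 3. Each edge triggers its own search, so this naive version is not cheap on the fully compared graph, which already has a quadratic number of edges.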
No sort, just links between subgraphs?
Could it be possible to merge the separated graphs later? Or can the original graph be augmented to capture extra information? Let's see: if I walk GT and compare each source with LT, then at the first vertex greater than or equal to the source, create a link to it and to the sink of that node? If the vertices already appear, skip them? Something like:
But this one shows an ambiguity: '2' points to both '9' and '12'.
Updates
- 2013-06-10: Just saw this interesting page on using Topological Ordering for shortest paths in a DAG: Shortest Path in Directed Acyclic Graph
Further reading
- Weighted graphs and networks
- M. Jessup’s sample code on StackOverflow
- Topological.java
- topological_sort
- 4.2 Directed Graphs. Text book, part of course content at Princeton
- Directed graph
- “JUnit Tests as Inner Classes”
- An informal introduction to O(N) notation
- Bogosort
- Algorithm of the Week: Dijkstra Shortest Path in a Graph
- Design Distributed Digraph Algorithms using MapReduce; Journal of Computational Information Systems 7: 7 (2011) 2267-2276. http://www.jofcis.com/publishedpapers/2011_7_7_2267_2276.pdf.