Dependency Sorting in Ruby with TSort

If you write Ruby with any regularity, you've probably experienced the dependency-managing wonders of Bundler. What you didn't know, however, was that you can use the same dependency-sorting goodness within your own application in other contexts.

Say Hello to TSort

TSort is a module included in the Ruby standard library for executing topological sorts. Under the hood, Bundler uses TSort to detangle your gems' web of dependencies. Managing gem dependencies, however, is just the tip of the iceberg when it comes to processes that can benefit from topological sorting. With just a trivial amount of work, you can put its awesome power to work in your own projects.

Use Case: Adding Sample Data to a Database

Imagine we have a single task that must populate a database with several records. But life isn't easy; our records have associations with each other as demonstrated in the pseudocode below.

user_1 = User.create address: address_1

school_1 = School.create address: address_2, faculty: [user_1]
school_2 = School.create address: address_3

address_1 = Address.create zip_code: zip_code_1
address_2 = Address.create zip_code: zip_code_2
address_3 = Address.create zip_code: zip_code_2

zip_code_1 = ZipCode.create
zip_code_2 = ZipCode.create

The Problem

If we ran the pseudocode above, it would obviously blow up with NameErrors because multiple records include associational references to records that are yet to be created.

In this simple, contrived example, it would be straightforward to manually sort the statements to ensure that we insert the records that others depend on first. But when our dependency relationships are more complex, or when we simply have a much larger number of records, manual sorting is out of the question. (Who ever liked doing anything manually anyway?)

The Solution

Enter TSort. We can use TSort to programmatically determine an order that these records can be inserted successfully.

The most straightforward fashion of using TSort is through the creation and subsequent sorting of a dependency hash where keys represent an object and values are arrays of references to the objects on which the key object depends.

So if skiing depends on snow, and snow depends on clouds and cold, our dependency hash might look like this:

{
 'clouds' => [],
 'cold' => [],
 'skiing' => ['snow'],
 'snow' => ['clouds', 'cold']
}

We only concern ourselves with first-level dependencies; we'll leave TSort to worry about the dependencies of our objects' dependencies.

In order to topologically sort our dependency hash, we need to mix in some TSort functionality. The easiest way to do this is simply to create a subclass of Hash like so:

require 'tsort'
class TsortableHash < Hash
 include TSort

 alias tsort_each_node each_key
 def tsort_each_child(node, &block)
 fetch(node).each(&block)
 end
end

Next we can use our new class to build a dependency hash. The dependency hash for our sample data insert task might look like the psuedocode below.

dependency_hash = \
 TsortableHash[
 user_1 => [address_1],
 school_1 => [address_2, user_1],
 school_2 => [address_3],
 address_1 => [zip_code_1],
 address_2 => [zip_code_2],
 address_3 => [zip_code_2],
 zip_code_1 => [],
 zip_code_2 => []
 ]

Once we've created our tsortable dependency hash, the hard work is over. TSort the hash for great justice and we're left with an array of insert processes that can be executed in order without catastrophe.

dependency_hash.tsort
#=> [zip_code_1, address_1, user_1, zip_code_2, address_2, school_1, address_3, school_2]

# If you have circular dependencies, #tsort will raise a TSort::Cyclic exception.

TSort is an incredibly powerful and easy-to-use tool for organizing dependency relationships. So next time you find yourself struggling with dependency ordering in Ruby, rest assured. TSort is only a click away.

Lawson Kurtz

,
Posted in Article Category: #Code
on