Process to generate hierarchical structure based on relational data?

I have a csv of employee ids, names, and a reference column with the id of their direct manager, say something like this

 emp_id, emp_name, mgr_id
1,The Boss,,
2,Manager Joe,1
3,Manager Sally,1
4,Peon Frank,2
5,Peon Jill,2
6,Peon Rodger,3
7,Peon Ralph,3

I would like to be able to generate a (json) object representing this structure, something along the lines of

DATA = {
  "parent": "The Boss",
  "children: [
    {
      "parent": "Manager Joe",
      "children": [ {"parent": "Peon Frank"}, {"parent": "Peon Jill"} ]
    },
    {
      "parent": "Manager Sally",
      "children": [ {"parent": "Peon Rodger"}, {"parent": "Peon Ralph" } ]
    }]
}

So from the data, an entry with no mgr_id represents something like the CEO or Leader.

So just collecting some thoughts.. I know this could be represented by some tree data structure with the tree traversal generating the correct output. The parser would have to be responsible for inserting into the tree. Maybe the number of children would give you a weight? Just gathering thoughts. Not sure how I would pivot around the fact that multiple objects can be constructed with children.

Is there an algorithm defined that can parse this structure I am not thinking of? Descending down into the children seems relatively straightforward. I am not seeing this in my head, and could use some help. Thank you

1

There are several ways to create structure
from flatland. Recursion is one of the best.
This
program uses it, with a preliminary step of figuring
out which items are parents.

While the strategy is language-agnostic, every language
has different details of data structures and
building blocks. I’ve rendered the approach in Python.

NB, about half this code is for display and demonstration purposes, so you can follow along and see how the algorithm is working. For production, fee free to remove those parts.

Oh…and I changed the labels from ‘parent’ to ‘name’ and ‘children’ to ‘reports’ because I found that more agreeable. You can, of course, choose whatever you like.

from pprint import pprint
from random import shuffle
from collections import defaultdict
import json

def show_val(title, val):
    """
    Debugging print helpler.
    """
    sep = '-' * len(title)
    print "n{0}n{1}n{2}n".format(sep, title, sep)
    pprint(val)


# ORIGINAL NAMES:   emp_id, emp_name, mgr_id
# SIMPLIFIED NAMES: eid,    name,     mid
text = """
323,The Boss,
4444,Manager Joe,323
3,Manager Sally,323
4,Peon Frank,4444
33,Peon Dave,3
5,Peon Jill,4444
6,Peon Rodger,3
7,Peon Ralph,3
233,Clerk Jane,99
99,Supervisor Henri,3
"""

# parse text into lines
lines = [ l.strip() for l in text.strip().splitlines() ]

# construct list of people tuples
people = [ tuple(l.split(',')) for l in lines ]

# for demonstration and testing only, shuffle the results
shuffle(people)
show_val("randomized people", people)

# contstruct list of parents
parents = defaultdict(list)
for p in people:
    parents[p[2]].append(p)
show_val("parents", parents)

def buildtree(t=None, parent_eid=''):
    """
    Given a parents lookup structure, construct
    a data hierarchy.
    """
    parent = parents.get(parent_eid, None)
    if parent is None:
        return t
    for eid, name, mid in parent:
        report = { 'name': name }
        if t is None:
            t = report
        else:
            reports = t.setdefault('reports', [])
            reports.append(report)
        buildtree(report, eid)
    return t

data = buildtree()
show_val("data", data)

show_val("JSON", json.dumps(data))

Running this shows the following output:

-----------------
randomized people
-----------------

[('233', 'Clerk Jane', '99'),
 ('4444', 'Manager Joe', '323'),
 ('33', 'Peon Dave', '3'),
 ('6', 'Peon Rodger', '3'),
 ('99', 'Supervisor Henri', '3'),
 ('3', 'Manager Sally', '323'),
 ('5', 'Peon Jill', '4444'),
 ('323', 'The Boss', ''),
 ('4', 'Peon Frank', '4444'),
 ('7', 'Peon Ralph', '3')]

-------
parents
-------

defaultdict(<type 'list'>, {'99': [('233', 'Clerk Jane', '99')], '323': [('4444', 'Manager Joe', '323'), ('3', 'Manager Sally', '323')], '3': [('33', 'Peon Dave', '3'), ('6', 'Peon Rodger', '3'), ('99', 'Supervisor Henri', '3'), ('7', 'Peon Ralph', '3')], '4444': [('5', 'Peon Jill', '4444'), ('4', 'Peon Frank', '4444')], '': [('323', 'The Boss', '')]})

----
data
----

{'name': 'The Boss',
 'reports': [{'name': 'Manager Joe',
              'reports': [{'name': 'Peon Jill'}, {'name': 'Peon Frank'}]},
             {'name': 'Manager Sally',
              'reports': [{'name': 'Peon Dave'},
                          {'name': 'Peon Rodger'},
                          {'name': 'Supervisor Henri',
                           'reports': [{'name': 'Clerk Jane'}]},
                          {'name': 'Peon Ralph'}]}]}

----
JSON
----

'{"name": "The Boss", "reports": [{"name": "Manager Joe", "reports": [{"name": "Peon Jill"}, {"name": "Peon Frank"}]}, {"name": "Manager Sally", "reports": [{"name": "Peon Dave"}, {"name": "Peon Rodger"}, {"name": "Supervisor Henri", "reports": [{"name": "Clerk Jane"}]}, {"name": "Peon Ralph"}]}]}'

Some preliminaries: It also uses print to help show nested data
structures. We’d normally get data through a database
connection, here we just parse it out of static text.
Finally, while the data you presented is beautifully ordered, with the
bosses at the top and with the lowest employeed id numbers (simplifying the)
problem, I’d like to confirm that the code works in any order. So I’ve modified
some of the id numbers to reflect a non-sequential allocation, and brought in
random.shuffle to randomize the order of data. You wouldn’t do this in
production, but as a part of testing, it increases confidence the logic is
working by design, not accident.

1

You’ve got a CSV file as your form of persistence. There are other approaches such as SQLite that you might wish to investigate giving you a bit more ability to query the structure, but you will still have the problem that to build the full tree structure to serialize out, you will need to have the full tree structure.

Working with CSV, you will need to read every line in, build the corresponding tree structure, and then write it back out, possibly with a library for json serialization. Noting you have a few ruby bits over on SO, that could look something like the answers in Ruby objects and JSON serialization (without Rails) and for reading, the CSV gem. This is just an example though, such CSV parsers and JSON serialization targets exist for every language of remotely practical value.

Thats what it really boils down to. Its possible that you could have something that looks like:

1,The Boss,,
2,Founder Dev,7
3,Manager Joe,1
4,Peon Frank,3
5,Peon Jill,3
6,Peon Rodger,7
7,Manager Sally,1

And so, you have to read it all before you can go about creating the json to serialize it to. There is no way around that.


Now, there is the possibility of a different structure in the csv that allows you do some more useful queries and quickly without needing to build the full three structure to then work with.

The nested set which can also be seen as a modified preorder tree can be useful.

The tree looks like this:

                1 Boss 14                   

2 Manager Sally 7       8 Manager Joe 13     

3 Dev 4 | 5 Rodger 6    9 Frank 10 | 11 Jill 12

And the corresponding file would be written out as:

# name  L   R
Boss,   1,  14
Sally,  2,  7
Dev,    3,  4
Rodger, 5,  6
Joe,    8,  13
Frank,  9,  10
Jill,   11, 12

With this tree, one can do things like ‘find all the managers’ by finding all the lines where R-L > 1. Or, ‘find everyone reporting to Sally’ by finding all the lines where L > 2 and R < 7. The number of people reporting to a given person is (R-L-1)/2. This allows specific questions against the tree to be able to be read much faster (without needing to build the tree itself).

The modified preorder tree does come with the expense that updates to it will touch a large number of lines compared to the adjacency list approach to an org tree. An insert will shift numbers around all over the place. For example, if Sally had another person reporting to her, that person would have the value 7,8 and you would need to modify everyone’s if(L > 7) L = L + 2 and if (R > 7) R = R + 2.

2

The solution by Jonathan Eunice is pretty clean and easy to follow. When constructing the list of parents, I’d slice off the parent information from the value side of the parent dictionary. Instead of this:

# contstruct list of parents
parents = defaultdict(list)
for p in people:
    parents[p[2]].append(p)
show_val("parents", parents)

I’d have this:

# contstruct list of parents
parents = defaultdict(list)
for p in people:
    parents[p[2]].append(p[:-1])
show_val("parents", parents)

Since I’m dropping that manager id, I’d adjust the buidtree function accordingly:

(All I’m doing is dropping the mid variable inside the for loop; I find it more readable. And we’d also get an error if we didn’t)

def buildtree(t=None, parent_eid=''):
    """
    Given a parents lookup structure, construct
    a data hierarchy.
    """
    parent = parents.get(parent_eid, None)
    if parent is None:
        return t
    for eid, name in parent:
        report = {'name': name}
        if t is None:
            t = report
        else:
            reports = t.setdefault('reports', [])
            reports.append(report)
        buildtree(report, eid)
    return t

So that when I re-run the script again, I get this output:

-----------------
randomized people
-----------------

[('6', 'Peon Rodger', '3'),
 ('323', 'The Boss', ''),
 ('33', 'Peon Dave', '3'),
 ('4444', 'Manager Joe', '323'),
 ('7', 'Peon Ralph', '3'),
 ('4', 'Peon Frank', '4444'),
 ('233', 'Clerk Jane', '99'),
 ('99', 'Supervisor Henri', '3'),
 ('5', 'Peon Jill', '4444'),
 ('3', 'Manager Sally', '323')]

-------
parents
-------

defaultdict(<class 'list'>,
            {'': [('323', 'The Boss')],
             '3': [('6', 'Peon Rodger'),
                   ('33', 'Peon Dave'),
                   ('7', 'Peon Ralph'),
                   ('99', 'Supervisor Henri')],
             '323': [('4444', 'Manager Joe'), ('3', 'Manager Sally')],
             '4444': [('4', 'Peon Frank'), ('5', 'Peon Jill')],
             '99': [('233', 'Clerk Jane')]})

----
data
----

{'name': 'The Boss',
 'reports': [{'name': 'Manager Joe',
              'reports': [{'name': 'Peon Frank'}, {'name': 'Peon Jill'}]},
             {'name': 'Manager Sally',
              'reports': [{'name': 'Peon Rodger'},
                          {'name': 'Peon Dave'},
                          {'name': 'Peon Ralph'},
                          {'name': 'Supervisor Henri',
                           'reports': [{'name': 'Clerk Jane'}]}]}]}

----
JSON
----

('{"name": "The Boss", "reports": [{"name": "Manager Joe", "reports": '
 '[{"name": "Peon Frank"}, {"name": "Peon Jill"}]}, {"name": "Manager Sally", '
 '"reports": [{"name": "Peon Rodger"}, {"name": "Peon Dave"}, {"name": "Peon '
 'Ralph"}, {"name": "Supervisor Henri", "reports": [{"name": "Clerk '
 'Jane"}]}]}]}')

Trang chủ Giới thiệu Sinh nhật bé trai Sinh nhật bé gái Tổ chức sự kiện Biểu diễn giải trí Dịch vụ khác Trang trí tiệc cưới Tổ chức khai trương Tư vấn dịch vụ Thư viện ảnh Tin tức - sự kiện Liên hệ Chú hề sinh nhật Trang trí YEAR END PARTY công ty Trang trí tất niên cuối năm Trang trí tất niên xu hướng mới nhất Trang trí sinh nhật bé trai Hải Đăng Trang trí sinh nhật bé Khánh Vân Trang trí sinh nhật Bích Ngân Trang trí sinh nhật bé Thanh Trang Thuê ông già Noel phát quà Biểu diễn xiếc khỉ Xiếc quay đĩa Dịch vụ tổ chức sự kiện 5 sao Thông tin về chúng tôi Dịch vụ sinh nhật bé trai Dịch vụ sinh nhật bé gái Sự kiện trọn gói Các tiết mục giải trí Dịch vụ bổ trợ Tiệc cưới sang trọng Dịch vụ khai trương Tư vấn tổ chức sự kiện Hình ảnh sự kiện Cập nhật tin tức Liên hệ ngay Thuê chú hề chuyên nghiệp Tiệc tất niên cho công ty Trang trí tiệc cuối năm Tiệc tất niên độc đáo Sinh nhật bé Hải Đăng Sinh nhật đáng yêu bé Khánh Vân Sinh nhật sang trọng Bích Ngân Tiệc sinh nhật bé Thanh Trang Dịch vụ ông già Noel Xiếc thú vui nhộn Biểu diễn xiếc quay đĩa Dịch vụ tổ chức tiệc uy tín Khám phá dịch vụ của chúng tôi Tiệc sinh nhật cho bé trai Trang trí tiệc cho bé gái Gói sự kiện chuyên nghiệp Chương trình giải trí hấp dẫn Dịch vụ hỗ trợ sự kiện Trang trí tiệc cưới đẹp Khởi đầu thành công với khai trương Chuyên gia tư vấn sự kiện Xem ảnh các sự kiện đẹp Tin mới về sự kiện Kết nối với đội ngũ chuyên gia Chú hề vui nhộn cho tiệc sinh nhật Ý tưởng tiệc cuối năm Tất niên độc đáo Trang trí tiệc hiện đại Tổ chức sinh nhật cho Hải Đăng Sinh nhật độc quyền Khánh Vân Phong cách tiệc Bích Ngân Trang trí tiệc bé Thanh Trang Thuê dịch vụ ông già Noel chuyên nghiệp Xem xiếc khỉ đặc sắc Xiếc quay đĩa thú vị
Trang chủ Giới thiệu Sinh nhật bé trai Sinh nhật bé gái Tổ chức sự kiện Biểu diễn giải trí Dịch vụ khác Trang trí tiệc cưới Tổ chức khai trương Tư vấn dịch vụ Thư viện ảnh Tin tức - sự kiện Liên hệ Chú hề sinh nhật Trang trí YEAR END PARTY công ty Trang trí tất niên cuối năm Trang trí tất niên xu hướng mới nhất Trang trí sinh nhật bé trai Hải Đăng Trang trí sinh nhật bé Khánh Vân Trang trí sinh nhật Bích Ngân Trang trí sinh nhật bé Thanh Trang Thuê ông già Noel phát quà Biểu diễn xiếc khỉ Xiếc quay đĩa
Thiết kế website Thiết kế website Thiết kế website Cách kháng tài khoản quảng cáo Mua bán Fanpage Facebook Dịch vụ SEO Tổ chức sinh nhật