Class instance clustering in object reference graph for multi-entries serialization

My question is about the best way to cluster a graph of class instances around specifically marked objects. The objects are the graph nodes, and the references between them are the directed edges. To make the question clearer, let me first explain my motivation:

I currently use a moderately complex system to serialize the data used in my projects:

“marked” objects have a specific attribute that stores a “saving entry”: the path to an associated file on disk (it could be any storage type that provides a suitable interface)
Those objects can then be serialized automatically (e.g. obj.save())
The serialization of a marked object 'a' implicitly contains every object 'b' that 'a' references, either directly (a.b = b) or indirectly (a.c.b = b for some object 'c')
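
The behavior described above can be sketched as follows. This is only an illustration under assumptions: the class names `Marked` and `Plain`, the `entry` attribute, and the `demo` helper are hypothetical and are not taken from the system in the question. The only thing relied on is that pickle follows object references, so everything reachable from the marked object ends up in its entry.

```python
import pickle
import tempfile
from pathlib import Path


class Marked:
    """Hypothetical marked object that owns a storage entry (a file path)."""

    def __init__(self, entry):
        self.entry = Path(entry)  # the "saving entry"

    def save(self):
        # pickle follows references, so every object reachable from self
        # is written into this single entry.
        with open(self.entry, "wb") as f:
            pickle.dump(self, f)


class Plain:
    """Hypothetical unmarked object."""

    def __init__(self, value):
        self.value = value


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        a = Marked(Path(tmp) / "a.pkl")
        a.c = Plain("c")      # direct reference: a.c
        a.c.b = Plain("b")    # indirect reference: a.c.b
        a.save()
        with open(a.entry, "rb") as f:
            loaded = pickle.load(f)
    return loaded.c.value, loaded.c.b.value
```

Calling `demo()` reloads the entry and reads back both the directly and the indirectly referenced object.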

This basically defines specific storage entries for specific objects. I then have “container” type objects that:

can be serialized similarly (in fact they are, or can be, “marked”)
do not serialize directly referenced “marked” objects in their own storage entry: if a and b = a.b are both marked, a.save() calls b.save() and stores a.b = storage_entry(b)
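
Since the question mentions pickle, one way to get this container behavior is pickle's documented `persistent_id` / `persistent_load` hooks, which let a pickler write an external identifier in place of an object. The sketch below is an assumption about how that could look, not the system described here: `Marked`, `EntryPickler`, `EntryUnpickler`, `save`, `load` and the in-memory `store` dict (standing in for files) are all hypothetical names.

```python
import io
import pickle


class Marked:
    """Hypothetical marked object; `entry` stands in for a file path."""

    def __init__(self, entry):
        self.entry = entry


class EntryPickler(pickle.Pickler):
    """Pickles `root` in full, but stores other marked objects by entry only."""

    def __init__(self, file, root):
        super().__init__(file)
        self.root = root
        self.children = []  # marked objects met while pickling root

    def persistent_id(self, obj):
        if isinstance(obj, Marked) and obj is not self.root:
            self.children.append(obj)
            return obj.entry  # written instead of the object itself
        return None  # everything else is pickled as usual


class EntryUnpickler(pickle.Unpickler):
    def __init__(self, file, store):
        super().__init__(file)
        self.store = store

    def persistent_load(self, pid):
        return load(self.store, pid)  # pid is the entry of a marked object


def save(store, root, seen=None):
    """Write root to its entry, then save the marked objects it references."""
    seen = set() if seen is None else seen
    if id(root) in seen:
        return
    seen.add(id(root))
    buf = io.BytesIO()
    pickler = EntryPickler(buf, root)
    pickler.dump(root)
    store[root.entry] = buf.getvalue()
    for child in pickler.children:
        save(store, child, seen)


def load(store, entry):
    # Limitations of this sketch: cycles between marked objects are not
    # handled, and a marked object referenced from two entries is loaded twice.
    return EntryUnpickler(io.BytesIO(store[entry]), store).load()
```

With `a.b = b` and both marked, `save(store, a)` produces two entries, and the bytes stored for `a` hold only the entry name of `b`, not its content.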

So, if I serialize 'a', every object reachable from 'a' through the object reference graph is serialized automatically, possibly across multiple entries. That is what I want, and it usually provides the functionality I need. However, this approach has some structural limitations:

multi-entry saving only works through direct references held by “container” objects, and
some situations have undefined behavior, for example when two “marked” objects 'a' and 'b' both reference an unmarked object 'c'. In this case my system stores 'c' in both 'a' and 'b', making an implicit copy that not only doubles the storage size but also changes the object reference graph after reloading.
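
The implicit copy can be reproduced with plain pickle. In this small sketch the class `Node` is a hypothetical stand-in for any unmarked or marked object; the point is only that two separate dumps each embed their own copy of 'c', whereas a single dump preserves the sharing.

```python
import pickle


class Node:
    """Hypothetical stand-in for an arbitrary object."""


c = Node()
a = Node()
a.c = c
b = Node()
b.c = c

# Two separate entries: each dump embeds its own copy of c.
a2 = pickle.loads(pickle.dumps(a))
b2 = pickle.loads(pickle.dumps(b))
shared_after_separate_dumps = a2.c is b2.c

# One entry holding both: pickle's memo keeps c shared.
a3, b3 = pickle.loads(pickle.dumps((a, b)))
shared_after_joint_dump = a3.c is b3.c
```

Here `shared_after_separate_dumps` is `False` and `shared_after_joint_dump` is `True`, which is exactly the change in the reference graph described above.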

I am thinking of generalizing the process. Apart from the practical implementation questions (I am coding in Python and use pickle to serialize my objects), there is a general question of how to attach (cluster) unmarked objects to marked ones.

So, my questions are:

What are the important issues to consider? Basically, why not just use any graph traversal algorithm with an “attach to last marked node” behavior?
Is there any work done on this problem, practical or theoretical, that I should be aware of?

Example of a simple object graph to serialize:

Circles are objects and arrows are references. Orange circles are marked objects.

When serializing A, objects B to E should be serialized, but not the blue objects, because they are not reachable from A. Now, B is serialized in its own entry (i.e. file), and the question is where to serialize objects C to E: in A's entry or in B's?

Some thinking on this example:

object C is only reachable from A, so it should be attached to A
to reach E, it is necessary to pass through B, so E should be attached to B
what about D? If B is loaded after serialization (and A is not), D should also be loaded, but A should not. Thus D should be serialized in B's entry.
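
The reasoning on this example can be turned into a rule and sketched in code. Everything below is an assumption for illustration. The edge list is a guess, since the figure is not reproduced here (A → B, A → C, C → D, B → D, B → E, plus an unreachable F → A). The rule itself is one possible formalization, not an established algorithm: an unmarked node goes to the marked node that reaches it without crossing another marked node, and if several marked nodes do, it goes to the one that all the others can reach. The names `reachable` and `cluster` are hypothetical, and `root` is assumed to be marked.

```python
from collections import deque


def reachable(graph, start, stop_at=frozenset()):
    """Nodes reachable from start; nodes in stop_at are reached but not expanded."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                if nxt not in stop_at:
                    queue.append(nxt)
    return seen


def cluster(graph, marked, root):
    """Map each unmarked node reachable from root to one marked entry."""
    live = reachable(graph, root) | {root}
    live_marked = live & marked
    # direct[m]: unmarked nodes m reaches without crossing another marked node
    direct = {m: reachable(graph, m, stop_at=marked) - marked for m in live_marked}
    # reach[m]: marked nodes whose entries get loaded when m is loaded
    reach = {m: (reachable(graph, m) & marked) | {m} for m in live_marked}

    assignment, ambiguous = {}, set()
    for node in live - marked:
        owners = {m for m in live_marked if node in direct[m]}
        # Keep the owners that every owner can reach: loading any owner
        # then also loads the entry that holds this node.
        candidates = [m for m in owners if all(m in reach[o] for o in owners)]
        if len(candidates) == 1:
            assignment[node] = candidates[0]
        else:
            ambiguous.add(node)  # no single safe entry: needs a policy decision
    return assignment, ambiguous


# Assumed edges for the example; "F" plays the role of an unreachable object.
example = {"A": ["B", "C"], "C": ["D"], "B": ["D", "E"], "F": ["A"]}
result, unresolved = cluster(example, marked={"A", "B"}, root="A")
```

On the assumed edges, `result` attaches C to A and both D and E to B, matching the reasoning above. Two marked siblings that share an unmarked object, with neither reaching the other, end up in `unresolved`; that is the undefined case from the limitations, and it still needs an explicit choice such as giving the shared object its own entry.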

Note:
I added the database tag because I think the answer might come from that field, even if the question itself is not really about databases.

It looks like there’s a missing conceptual step going on here. An object reference is not the object itself. So, perhaps consider serializing object references, and don’t serialize object c inside a or b, for example. Serialize object c outside of them, and you save space and processing time.

Edit:
Basically, it would work like this:

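foreach (object reference in serializable)
    serialize(object in global)    // store the referenced object once, in a global store
    serialize(object reference)    // store only the reference inside the referring object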

Then, when you deserialize, you just make the object reference point back to the correct object.
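
One possible reading of this suggestion, as a minimal Python sketch: every reachable object is written once into a global table, and references are written as indices into that table. The `Node` class with its `refs` list, the JSON layout, and the function names `dump_graph` and `load_graph` are all assumptions made for illustration.

```python
import json


class Node:
    """Hypothetical object with a name and references to other nodes."""

    def __init__(self, name):
        self.name = name
        self.refs = []


def dump_graph(roots):
    """Serialize each reachable object once; references become table indices."""
    ids = {}      # id(obj) -> index in the global table
    order = []    # objects in discovery order (also keeps them alive)
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if id(obj) in ids:
            continue  # already in the table, so no implicit copy
        ids[id(obj)] = len(order)
        order.append(obj)
        stack.extend(obj.refs)
    table = [{"name": o.name, "refs": [ids[id(r)] for r in o.refs]} for o in order]
    return json.dumps({"objects": table, "roots": [ids[id(r)] for r in roots]})


def load_graph(text):
    """Rebuild the objects first, then point the references back at them."""
    data = json.loads(text)
    nodes = [Node(rec["name"]) for rec in data["objects"]]
    for node, rec in zip(nodes, data["objects"]):
        node.refs = [nodes[i] for i in rec["refs"]]
    return [nodes[i] for i in data["roots"]]
```

If 'a' and 'b' both reference 'c', then after `load_graph(dump_graph([a, b]))` the two reloaded objects reference the same 'c', so the graph shape survives the round trip. The trade-off is that everything lives in one table rather than in per-object entries.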


Filed under: softwareengineering - @ 19:05

Tags: database, graph, object, python, serialization
