Proper tree NoSQL structure with focus on full-text searching

I am developing an app with a tree (folder/file) structure, on which I need to run full-text searches with MongoDB. I did some research on best practices for tree structures and found this great article, but I still cannot decide which DB structure fits my needs.

I have the following requirements in mind:

I should be able to run a full-text search on an individual folder, as well as on everything belonging to a specific user.
Folders and files should be shareable, so I need to be able to run a full-text search on all items accessible to a specific user.

I’ve been thinking about the following structures.

Structure 1

Fields of Users collection

 1. _id - objectid
 2. name - string

Fields of Folders collection

 1. _id - objectid
 2. name - string
 3. owner - objectid
 4. sharedWith - array of objectIds
 5. location - objectid of parent folder, null if in root
 6. createDate - datetime

Fields of Files collection

 1. _id - objectid
 2. name - string
 3. owner - objectid
 4. sharedWith - array of objectIds
 5. data - string
 6. location - objectId of folder
 7. createDate - datetime

So here are my questions:

Should I model the tree structure with Parent References or Child References?
Should I use one collection for both files and folders (with a type field), or should I separate them?
Is it worth having only a folder collection and nesting the file documents in it?

These are my most important questions, though I would greatly appreciate any advice on how I can improve the structure.

Some of the answers depend on how you foresee the system being used. Without knowing more about your specific requirements, my answer is aimed at a generally flexible system that could work OK with a wide variety of use cases, without assuming any “shortcuts” (such as an absolute limit on the number of folders). More specifically:

Should I model the tree structure with Parent References or Child References?

If you use parent references, then no matter how many documents a folder contains, the size of the object representing that folder stays constant. If you use child references, you’ll need to update the folder document every time a file is created. This can introduce synchronization issues (two files being added to the same folder at the same time) or document size issues (imagine a folder with a million files in it). However, a “normalized” structure like parent references makes it more expensive to do things like “find all folders/files nested under this root folder” without additional optimizations.
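To make the tradeoff concrete, here is a minimal in-memory sketch in Python rather than real database calls. The node documents and the helper names `descendants` and `with_ancestors` are hypothetical; only the `location` field comes from the question. The `ancestors` array is one possible "additional optimization", along the lines of the array-of-ancestors pattern, and is an assumption rather than part of the original schema.

```python
from collections import defaultdict

# Hypothetical node documents using parent references ("location" as in the
# question). A root node has location None.
nodes = [
    {"_id": 1, "name": "root", "location": None},
    {"_id": 2, "name": "docs", "location": 1},
    {"_id": 3, "name": "notes.txt", "location": 2},
    {"_id": 4, "name": "photos", "location": 1},
]


def descendants(nodes, root_id):
    """Collect ids of everything nested under root_id, level by level.

    Against a real database each level would typically be its own query,
    which is the extra cost of the normalized layout.
    """
    children = defaultdict(list)
    for node in nodes:
        children[node["location"]].append(node["_id"])
    found, frontier = [], [root_id]
    while frontier:
        frontier = [c for parent in frontier for c in children[parent]]
        found.extend(frontier)
    return found


def with_ancestors(nodes):
    """Return copies of the nodes with an added 'ancestors' list (root first)."""
    by_id = {n["_id"]: n for n in nodes}
    result = []
    for node in nodes:
        chain, parent = [], node["location"]
        while parent is not None:
            chain.append(parent)
            parent = by_id[parent]["location"]
        result.append({**node, "ancestors": chain[::-1]})
    return result


print(descendants(nodes, 1))  # ids nested under the root folder
indexed = with_ancestors(nodes)
# With ancestors stored, a subtree lookup becomes a single membership filter.
print([n["_id"] for n in indexed if 2 in n["ancestors"]])
```

The price of the `ancestors` array is that moving a folder means rewriting the array on every node beneath it, so it suits trees that are read far more often than they are rearranged.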

Should I use one collection for both files and folders (with a type field), or should I separate them?

File systems typically represent both files and folders as “nodes” that then carry additional data/type information. Splitting them into separate collections only makes sense if you have some very specialized operations to run on one of those data sets (I can’t think of any off the top of my head) that separate collections would actually help with.
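As a rough illustration of how a single collection supports the searches in the question, the sketch below builds a MongoDB-style filter document as a plain Python dict. The function name `accessible_text_filter`, the `type` and `ancestors` fields, and the idea of one `nodes` collection with a text index covering `name` and `data` are all assumptions for illustration. `owner` and `sharedWith` are taken from the question.

```python
def accessible_text_filter(user_id, search_terms, folder_id=None):
    """Build a MongoDB-style filter for a full-text search over nodes.

    Assumes a single hypothetical 'nodes' collection holding both files and
    folders, a text index covering 'name' and 'data', and (for folder-scoped
    searches) an 'ancestors' array stored on every node.
    """
    query = {
        "$text": {"$search": search_terms},
        # Items the user owns, or items whose sharedWith array contains them.
        "$or": [{"owner": user_id}, {"sharedWith": user_id}],
    }
    if folder_id is not None:
        # Restrict the search to nodes nested anywhere under this folder.
        query["ancestors"] = folder_id
    return query


# Hypothetical example of a file node in the combined collection.
file_node = {
    "_id": 3,
    "type": "file",
    "name": "notes.txt",
    "owner": "u1",
    "sharedWith": ["u2"],
    "data": "quarterly report draft",
    "location": 2,
    "ancestors": [1, 2],
}

print(accessible_text_filter("u2", "report"))
print(accessible_text_filter("u2", "report", folder_id=2))
```

Note that this filter only checks `owner` and `sharedWith` on the node itself. Whether sharing a folder should also grant access to everything inside it is a separate design decision that the sketch does not address.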

Is it worth having only a folder collection and nesting the file documents in it?

You will lose the ability to access individual files without loading everything else that’s in that folder. It will also become a problem if the number of files per folder grows and your folder objects start getting very big. Separate documents that represent the separate “nodes” of your file system are probably the way to go.

Tradeoff: if you know that you’ll have a rigid folder structure with a handful of folders and not too many documents, a nested structure could be convenient.

When answering questions like these, it’s very helpful to either know all of your requirements upfront, OR if the requirements are vague then develop a generally flexible system that’s easy(er) to change/maintain once requirements are understood better.

I generally find it useful to ask extreme questions, such as “what happens if I have a billion folders with billion files each?”, or “what if I have a structure that’s nested a billion folders deep?”. Questions like that tend to illuminate the problem in helpful ways.
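As one worked instance of such an extreme question, the sketch below estimates how large a child-references array would get for a folder with a million files. It relies on the BSON layout, where an array is stored as a document keyed by decimal index strings and an ObjectId takes 12 bytes, and on MongoDB's 16 MB document limit. The function name is hypothetical, and the figure covers only the array, not the rest of the folder document.

```python
OBJECT_ID_BYTES = 12
MAX_BSON_DOCUMENT_BYTES = 16 * 1024 * 1024  # MongoDB's per-document limit


def bson_objectid_array_bytes(count):
    """Estimate the BSON size of an array holding `count` ObjectIds.

    Each element costs 1 type byte, its index as a null-terminated decimal
    string key ("0", "1", ...), and the 12-byte ObjectId value. The array
    itself adds a 4-byte length prefix and a 1-byte terminator.
    """
    key_bytes = sum(len(str(i)) + 1 for i in range(count))
    element_bytes = count * (1 + OBJECT_ID_BYTES) + key_bytes
    return 4 + element_bytes + 1


size = bson_objectid_array_bytes(1_000_000)
print(size, size > MAX_BSON_DOCUMENT_BYTES)
```

Under these assumptions the array alone comes to roughly 19.9 million bytes, which is already over the limit. So a folder document that uses child references cannot hold a million files at all, whereas with parent references the folder document never grows.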

Filed under: softwareengineering - @ 21:34

Tags: database, database-design, mongodb, nosql
