Workflow: Using binary document formats in Git without locks (moving from Subversion)

We’re a software consultancy with a multitude of projects for different customers. We traditionally use Subversion, but are currently considering moving to Git.

A significant portion of the documents we produce are shared with our customers (requirements, global designs, test specs, etc.), and we use MS Office to produce these. In Subversion, we could use its “Lock” feature to ensure that no two people were editing the same document at the same time. In Git, you can’t do that: because of its distributed nature, git doesn’t have locks.

Locks are really little more than a communication mechanism, but they’re a very effective one.

Currently, our code and customer-facing documents are typically in different subfolders of a different svn repository. When moving to git, what would you recommend we do? I see a set of options:

  1. We move the svn repositories to git one-to-one. Instead of using locks on the Office files, we do what the git people suggest and change our workflow so that we don’t need them. This could mean working in a branch for every document edit and merging after review. This approach breaks down for things like Excel sheets that contain project management info; team members edit them freely (and we encourage that), but they are not subject to any formal review process.

  2. We use git for code and svn for docs and project management. This has the disadvantage that certain more design-ish documents won’t be “nearby” the code they spec, increasing the chance that people forget to update them. Additionally, everybody has to use and understand two sets of tools. That said, maybe this is a great opportunity to move to text-based doc tools (LaTeX, Markdown, HTML, whatever) for non-customer-facing design docs.

  3. Like 1, but we hack up a git lock command that does what svn lock does for us (toggle the read-only flag appropriately and sync with a server through some means).
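To make option 3 a bit more concrete, here is a minimal sketch of the local half of such a lock command: toggling the read-only flag on a file. The names `set_read_only`, `is_read_only` and `LockRegistry` are hypothetical, and the in-memory `LockRegistry` is only a stand-in for whatever server would actually hold the locks; syncing with that server is left out entirely.

```python
import os
import stat


def set_read_only(path: str, read_only: bool) -> None:
    """Clear or restore the write permission on a file."""
    mode = os.stat(path).st_mode
    if read_only:
        # Remove write permission for user, group and others.
        os.chmod(path, mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))
    else:
        # Give the owner write permission back.
        os.chmod(path, mode | stat.S_IWUSR)


def is_read_only(path: str) -> bool:
    """True when the owner has no write permission on the file."""
    return not (os.stat(path).st_mode & stat.S_IWUSR)


class LockRegistry:
    """Hypothetical stand-in for a shared lock server (kept in memory here)."""

    def __init__(self) -> None:
        self._owners: dict[str, str] = {}

    def acquire(self, path: str, user: str) -> bool:
        """Record the lock unless someone else already holds it."""
        holder = self._owners.setdefault(path, user)
        return holder == user

    def release(self, path: str, user: str) -> None:
        """Drop the lock, but only if this user holds it."""
        if self._owners.get(path) == user:
            del self._owners[path]
```

The idea would be that every Office file is read-only by default, and a wrapper command only calls `set_read_only(path, False)` after `acquire` succeeds. As with svn, nothing stops a determined user from clearing the flag by hand; the flag is a reminder, not an enforcement mechanism.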

I don’t buy the argument that locks don’t work in a DVCS because the system should even work when you’re entirely offline. Svn locks can be overridden as well; they’re a communication mechanism. Without some sort of network connection, you won’t get your computer to communicate a lot.

We can’t be the only shop that is very happy with how svn lock fits into its workflow, right?

Any ideas or tips?

I found https://stackoverflow.com/questions/119444/locking-binary-files-using-git-version-control-system but the discussion is rather technical; I’m looking for ways to solve or avoid the practical problem of two team members editing the same binary file at the same time.

10

I would advise you to stay with SVN for the MS Office documents for two reasons:

  1. It is already there, and it is (in my opinion) better for keeping Office documents. It has many more third-party tools for doing this.
  2. Locking, though it can be achieved in Git, is not “the Git way of doing things”. If you need these features, stick with the tool that gives you the best solution.

There is a saying I like that goes something like this: “When you’re holding a hammer, everything looks like a nail”. Just because you are moving to Git to hold your code doesn’t mean that you should use it to hold your documents.

1

Code version control is not the best tool for working on Office files, because they are binary and these tools can only track their modifications at file level.

Use a collaboration tool, like MediaWiki (free) or Atlassian Confluence (paid), from which you can easily export a Word document. Or use LaTeX to generate the Office files.

Let me expand…

If you need to collaborate, you must adopt a model that highlights the modifications (e.g. a changed word, a rephrased sentence, or just a changed font) within a unit, e.g. a file.

SVN and Git, although designed for code, are low-level tools that compare files by their textual content. The problem is that they only work well on text files, because they don’t look at the nature of a file’s contents to extract a high-level model of the modifications.

A clear example is an image file. A tool such as TortoiseMerge helps SVN users by comparing images for their real modifications, but the normal VCSes work with content patches over the files. Let me explain. A tool such as TortoiseMerge can tell you that a new version of an image file differs only by a few pixels, or in luminance if it implements a more complex HSV analysis of the two files. You can add a watermark or change colour levels, and a tool that compares image files will highlight the differences if it implements a good comparison algorithm. But in order to check the new file in, your client must produce a delta. A delta is a set of lines that are removed and lines that are added to the file. Binary files have no line breaks unless they happen to have \r\n, or similar, in their payload, and in a delta, if you change a single character, you are replacing an entire line.
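The line-oriented behaviour described above can be illustrated with a small sketch using Python’s standard `difflib`. The helper `changed_line_fraction` is a made-up name for this example, and the snippet only shows what a line-based diff sees; it says nothing about how a particular VCS stores revisions internally.

```python
import difflib


def changed_line_fraction(old: str, new: str) -> float:
    """Fraction of the old file's lines that a line-based diff must replace."""
    old_lines = old.splitlines(keepends=True)
    new_lines = new.splitlines(keepends=True)
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    unchanged = sum(block.size for block in matcher.get_matching_blocks())
    return 1 - unchanged / max(len(old_lines), 1)


# A text file with line breaks: one edited character touches one line of four.
text = "alpha\nbeta\ngamma\ndelta\n"
text_fraction = changed_line_fraction(text, text.replace("beta", "bet4"))

# The same payload without line breaks: the edit replaces the only "line".
blob = "alphabetagammadelta"
blob_fraction = changed_line_fraction(blob, blob.replace("beta", "bet4"))

print(text_fraction, blob_fraction)  # 0.25 1.0
```

The same one-character edit costs a quarter of the text file but the whole of the payload that has no line breaks, which is the situation most binary formats are in.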

So here is the problem. Binary files are not good for version control because you could be replacing almost the entire file with every revision. Consider what happens when you write Office files using MS Office and your collaborator edits them with OpenOffice. If the two implement even slightly different versions of the compression algorithm for OpenXML files, you will end up with completely different files even if you changed a single comma in the document.

Collaboration software represents documents internally in a text-based format, because text is what is really meaningful to your company, and so it can compute the differences or handle conflicts. LaTeX, or Markdown if you like, is a way to store a document as a text file with advanced markup, unlike the classic TXT file that has no font or formatting control.
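A rough illustration of the compression point, using Python’s standard `zlib` (an implementation of DEFLATE, the algorithm commonly used inside the ZIP containers that OpenXML files are). The two compression levels below merely stand in for “two slightly different implementations”; the sample text and the name `compress_variants` are made up for the example.

```python
import zlib


def compress_variants(data: bytes) -> tuple[bytes, bytes]:
    """Compress the same data with two different DEFLATE settings."""
    return zlib.compress(data, 1), zlib.compress(data, 9)


document = ("Requirements, global design, test spec. " * 200).encode("utf-8")
fast, small = compress_variants(document)

# Both variants decompress to exactly the same document...
same_content = zlib.decompress(fast) == zlib.decompress(small) == document
# ...but the stored bytes are not the same.
same_bytes = fast == small

print(same_content, same_bytes)  # True False
```

A byte-level comparison of the two stored files therefore reports a change even though the document itself is identical.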

But obviously your customers won’t like to open Markdown files, will they? Ok, you can simply, and I really mean simply, use any software I’m currently too lazy to google for in order to convert a source document to PDF, Word or whatever.
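One converter that fits this description is pandoc, which is my suggestion rather than something named above. A minimal sketch of an export step, assuming pandoc is installed and on the `PATH`; the function names and file names are hypothetical:

```python
import subprocess
from pathlib import Path


def pandoc_command(source: Path, target: Path) -> list[str]:
    """Build the pandoc call; pandoc picks the output format from the extension."""
    return ["pandoc", str(source), "-o", str(target)]


def export(source: Path, target: Path) -> None:
    """Run pandoc and raise if the conversion fails."""
    subprocess.run(pandoc_command(source, target), check=True)


print(pandoc_command(Path("design.md"), Path("design.docx")))
```

Calling `export(Path("design.md"), Path("design.docx"))` would produce the Word file. PDF output works the same way with a `.pdf` target, but pandoc then needs a PDF engine (such as a LaTeX installation) to be available.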

Summarizing

If you start checking text files into your source control you have greater control over file history and can easily manage conflicts, especially without using VCS locks.

Before sharing a document officially, you need a routine to export the source text document to an Office file.

Separating the two steps makes people happy at the cost of a learning curve.

2

You can use git for those documents without adding locking. Choose a git workflow that rejects pushes to the master branch from anyone whose copy is not up to date with master. (There are several workflows to choose from.) This will prevent people from overwriting each other’s modifications to binary document files.
Assume two people modify the same binary document. The first one to push to master gets their changes in. The second one will be blocked because their copy is behind the master branch. They have to sync first. So the second person syncs. Git will show a merge conflict for the binary document. That person saves their own version somewhere and resolves the conflict by taking the version from master (the one pushed by the first person). At this point the second person’s files are up to date with the master branch.
They then merge their changes into the latest binary document (by hand), which will then contain both the first person’s and the second person’s changes. Then the new version is pushed to master and becomes the new head of the master branch.
The merging is a pain, but it only happens when there is a conflict. Also, changes don’t get lost or overwritten. The conflicts are detected and users are able to resolve them cleanly.
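The second person’s steps could look roughly like the sketch below. It assumes the remote is called `origin`, that their own edit is already committed locally, that the merge really does stop with a conflict on the document, and that a `.mine` suffix is an acceptable place to park their copy; all of these names are assumptions for illustration.

```python
import shutil
import subprocess


def resolution_plan(doc: str, remote: str = "origin", branch: str = "master") -> list[list[str]]:
    """Git commands to run after a push was rejected for being behind."""
    return [
        ["git", "fetch", remote],
        ["git", "merge", f"{remote}/{branch}"],       # expected to stop on a conflict
        ["git", "checkout", "--theirs", "--", doc],  # take the version from master
        ["git", "add", "--", doc],
        ["git", "commit", "--no-edit"],              # conclude the merge
    ]


def sync_after_rejected_push(doc: str) -> str:
    """Park our version next to the document, then take master's version."""
    backup = doc + ".mine"
    shutil.copy2(doc, backup)  # keep our edits so they can be re-applied by hand
    for command in resolution_plan(doc):
        # The merge step exits non-zero on a conflict, so it must not abort the run.
        subprocess.run(command, check=command[1] != "merge")
    return backup
```

After this, the second person opens the document and the `.mine` copy side by side, re-applies their edits by hand, commits, and pushes again.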

3

Put your first 2 solutions together and you don’t need a third.

If you save your spreadsheets on disk as CSVs, Excel will still edit them and then git will be happy to merge them for you.
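Why this works: once each spreadsheet row is a line of text, edits to different rows can be combined line by line. The following is a deliberately simplified sketch of that idea, not git’s actual merge algorithm; it assumes rows are only edited in place (never inserted or deleted), and `merge_rows` and the sample data are made up.

```python
def merge_rows(base: list[str], mine: list[str], theirs: list[str]) -> list[str]:
    """Three-way merge of CSV rows, assuming rows are only edited in place."""
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch does not handle added or removed rows")
    merged = []
    for number, (b, m, t) in enumerate(zip(base, mine, theirs), start=1):
        if m == t or t == b:
            merged.append(m)  # both agree, or only my side changed the row
        elif m == b:
            merged.append(t)  # only their side changed the row
        else:
            raise ValueError(f"conflict on row {number}")
    return merged


base = ["task,owner,status", "spec,ann,open", "build,bob,open"]
mine = ["task,owner,status", "spec,ann,done", "build,bob,open"]
theirs = ["task,owner,status", "spec,ann,open", "build,carl,open"]

print(merge_rows(base, mine, theirs))
# ['task,owner,status', 'spec,ann,done', 'build,carl,open']
```

Two people editing the same row still conflict, but that is a far smaller unit to sort out than a whole binary workbook.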

Similarly, you can open, edit, and save your files in Word if they are HTML or (god help us) RTF. Word of course will add more bloat than useful text, but it’s still just text that git is happy to merge for you.

Granted, these solutions assume that you don’t make use of, or could move away from, MS-specific features, which is really only likely to be an issue on the Excel side.

Unless of course you also require Word to be installed on a system to be able to read your documentation, which is in itself a terrifying prospect to me…

2
