In light of recent revelations about widespread government monitoring of data stored by online service providers, zero-knowledge services are all the rage now.
A zero-knowledge service is one where all data is stored encrypted with a key that is not stored on the server. Encryption and decryption happens entirely on the client side, and the server never sees either plaintext data or the key. As a result, the service provider is unable to decrypt and provide the data to a third party, even if it wanted to.
To give an example: SpiderOak can be viewed as a zero-knowledge version of Dropbox.
As programmers, we rely heavily on, and entrust some of our most sensitive data – our code – to, a particular class of online service providers: code hosting providers (like Bitbucket, Assembla, and so on). I am of course talking about private repositories here – the concept of zero-knowledge does not make sense for public repositories.
My questions are:
-
Are there any technological barriers to creating a zero-knowledge code hosting service? For example, is there something about the network protocols used by popular version control systems like SVN, Mercurial, or Git that would make it difficult (or impossible) to implement a scheme where the data being communicated between the client and the server is encrypted with a key the server does not know?
-
Are there any zero-knowledge code hosting services in existence today?
18
You can encrypt each line separately. If you can afford to leak your file names, approximate line lengths, and the line numbers on which changes occur, you can use something like this:
https://github.com/ysangkok/line-encryptor
As each line is encrypted separately (but with the same key), the uploaded changes will, as usual, involve only the relevant lines.
If that is not convenient enough, you could keep two Git repositories: one with plaintext and one with ciphertext. When you commit in the (local) plaintext repository, a commit hook could take the diff, run it through the line encryptor referenced above, and apply the result to the ciphertext repository. The ciphertext repository's changes would then be committed and uploaded.
The line encryptor above is SCM-agnostic, but it can read unified diff files (of plaintext), encrypt the changes, and apply them to the ciphertext. This makes it usable with any SCM that can generate a unified diff (like Git).
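To illustrate the general idea (this is my own sketch, not the linked tool's actual code): encrypt each line with a keystream derived from the key and a deterministic per-line salt, so identical plaintext lines always produce identical ciphertext lines and a diff of the ciphertext stays aligned with a diff of the plaintext. The cipher here is a toy SHA-256 construction for illustration only – real code should use a vetted cryptographic library.

```python
import base64
import hashlib
import hmac

def _keystream(key: bytes, salt: bytes, length: int) -> bytes:
    # Expand key+salt into a keystream using SHA-256 in counter mode.
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + salt + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def encrypt_line(key: bytes, line: str) -> str:
    data = line.encode("utf-8")
    # Deterministic salt derived from the line itself: identical plaintext
    # lines map to identical ciphertext lines, so unchanged lines stay
    # unchanged in the uploaded diff (this is also why equality of lines leaks).
    salt = hmac.new(key, data, hashlib.sha256).digest()[:8]
    ct = bytes(a ^ b for a, b in zip(data, _keystream(key, salt, len(data))))
    return base64.b64encode(salt + ct).decode("ascii")

def decrypt_line(key: bytes, token: str) -> str:
    raw = base64.b64decode(token)
    salt, ct = raw[:8], raw[8:]
    pt = bytes(a ^ b for a, b in zip(ct, _keystream(key, salt, len(ct))))
    return pt.decode("utf-8")
```

Because the plaintext-to-ciphertext mapping per line is deterministic, unchanged lines produce byte-identical ciphertext, which is exactly what keeps the uploaded diffs small.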
8
I don’t think there are any barriers. Consider SVN: what gets sent to the server for storage is the delta between the previous and current versions of your code – so if you change one line, just that line gets sent to the server. The server then ‘blindly’ stores the delta without inspecting the data itself. If you encrypted the delta and sent that instead, there would be no impact on the server; in fact, you wouldn’t need to modify the server at all.
There are other bits that might matter, such as metadata properties that are not easily encryptable – MIME type, for example – but others could be encrypted, e.g. comments in the history log, as long as you know you have to decrypt them on the client to view them. I’m not sure whether the directory structure would be visible; I think it would not be, due to the way SVN stores directories, but it’s possible I’m wrong. That might not matter to you if the contents themselves are secure, however.
This would mean you couldn’t have a website with the various code-viewing features: no server-side repository browser or log viewer, no code diffs, no online code-review tools.
Something like this already exists, up to a point. Mozy stores your data encrypted with a private key (you can use your own, and they make noises about “if you lose your own key, too bad, we can’t restore your data for you”, but that’s more targeted at the common user). Mozy also stores a history of your files, so you can retrieve previous versions. Where it falls down is that uploads happen on a regular schedule rather than at check-in when you want, and I believe it discards old versions when you run out of storage space. But the concept is there; they could modify their existing system to provide secure source control.
2
I hate to do one of those “this isn’t quite going to answer your question” answers, but I can think of two ready-made solutions that should address these worries.
-
Host a private Git server yourself, and put that server on a VPN to which you give your team members access. All communication to and from the server would be encrypted, and you could of course also encrypt the server’s storage at the OS level.
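As a sketch of the setup (the user, VPN address, and repository path here are placeholders):

```shell
# On the server, reachable only over the VPN:
ssh git@10.8.0.1 'git init --bare /srv/git/project.git'

# On each developer's machine:
git clone git@10.8.0.1:/srv/git/project.git
```

With SSH as the transport, all traffic is encrypted in transit even before the VPN layer; note, though, that unlike a true zero-knowledge service, the server operator can still read the repository contents.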
-
BitSync should do the trick as well. Everything would be encrypted, in a huge network available from anywhere. It might actually be a really good application of all this BitCoin/BitMessage/BitSync technology.
Lastly, the folks over at https://security.stackexchange.com/ might have some more insight.
2
As I understand it, the way `git pull` works is that the server sends you a pack file containing all the objects that you want but do not currently have, and vice versa for `git push`.
I think you couldn’t do it like this directly (because the server has to understand the objects). What you could do instead is have the server work only with a series of encrypted pack files.
To `pull`, you download all the pack files that were added since your last pull, decrypt them, and apply them to your Git repo. To `push`, you first have to `pull`, so that you know the state of the server. If there are no conflicts, you create a pack file with your changes, encrypt it, and upload it.
With this approach, you would end up with a large number of tiny pack files, which would be quite inefficient. To fix that, you could download a series of pack files, decrypt them, combine them into one pack file, encrypt it, and upload it to the server, marked as a replacement for that series.
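The pull/push cycle above can be sketched as follows, with an in-memory list standing in for the server's append-only pack store and a toy XOR keystream standing in for real encryption (both are assumptions of this example; real pack files would come from `git pack-objects`, and a real client would use an authenticated cipher):

```python
import hashlib

server_packs: list[bytes] = []   # append-only encrypted packs on the "server"
last_seen = 0                    # index of the last pack this client applied

def _toy_cipher(key: bytes, nonce: int, data: bytes) -> bytes:
    # XOR with a SHA-256 keystream; symmetric, so it also decrypts.
    # Illustration only -- use a vetted authenticated cipher in practice.
    stream, ctr = b"", 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + nonce.to_bytes(8, "big")
                                 + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def push(key: bytes, pack: bytes) -> None:
    # Encrypt client-side; the server just appends the ciphertext blindly.
    nonce = len(server_packs)
    server_packs.append(_toy_cipher(key, nonce, pack))

def pull(key: bytes) -> list[bytes]:
    # Fetch and decrypt every pack added since our last pull.
    global last_seen
    new = [_toy_cipher(key, i, p)
           for i, p in enumerate(server_packs[last_seen:], start=last_seen)]
    last_seen = len(server_packs)
    return new
```

After a `pull`, a real client would feed each decrypted pack to `git unpack-objects` (or `git index-pack`) to apply it to the local repository.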