How do SaaS companies provide data liberation services? [closed]

Overview

I considered posting this to DBA but I think the question goes beyond the database. Of all the software-related Stack Exchange sites, the Programmers FAQ seemed most in tune with what I’m asking here.

My company provides a SaaS offering. Our customers collect a lot of data using our service, and we wish to make that data more easily available to our users than our web application currently provides. Our largest customer databases can run into the tens of GB including indexes. Stripping away indexes, a single database can still be several GB and will grow with each additional year of service.

The primary use cases our prospects and customers cite are:

  • As a prospect/customer, I want guaranteed access to my data in the event your company should fail.
  • As a prospect/customer, I want access to query my data so that I can build my own reports, dashboards and applications using that data.

I like the phrase “data liberation” to classify these kinds of use cases. Google seems to be at the vanguard of the data liberation movement. They’ve done admirable work to enable their customers to get access to their data and to raise awareness of the state of SaaS data accessibility. As a SaaS provider, I’d like to give our customers better access to their data.

The Evolution

In the past, we developed many custom report-like web pages to give customers a degree of access. This led to a lot of complex, custom pages that we must maintain.

We later added a separate reporting solution and began charging for custom reports. Although this gives us more flexibility and we can deliver faster, this is also costly to maintain and not scalable. Customers also don’t like being charged for every custom report, as they see it as us charging them for access to data they own.

While we are meeting the needs of some customers, others want more. Some want to combine their SaaS data with in-house data, and they can’t do that without the ability to query their SaaS data.

There are many approaches to solving these problems. I’ll talk about the two use cases and approaches I’ve considered, then ask for your input.

I’m going to willfully avoid the question, “How useful is the data without the application?”

Side Note: Our database is Microsoft SQL Server, but that’s largely irrelevant.

Use Case 1

As a prospect/customer, I want guaranteed access to my data in the event your company should fail.

Host Our Own Download Service

A relatively easy solution would be to set up our own download service on a separate server hosted in the same data center. We’d put nightly database backups here for every customer to download at their leisure.

Pros:

  • We use free internal bandwidth to move files around.
  • Customer only incurs bandwidth costs when they want to download.

Cons:

  • Some customers may download nightly “just because we can”.
  • Provisions must be made to keep the download service operational in the event the company fails; otherwise we’ve solved nothing. The customer still has to trust us to keep this service running, which is the core issue in the first place.

An extreme variation of this model is to create a separate company to own the download service and pre-fund the hosting for a period of time sufficient to allow customers to download their last backup.

Push Database Backups to the Customer

Here we copy full nightly backups to a storage account or service (FTP over SSL, SFTP/SSH, Dropbox, Box.net, SkyDrive, Amazon S3/Glacier, etc.) of the customers’ choosing.

Pros:

  • Customer always has access to their data.

Cons:

  • This method is bandwidth intensive and wasteful. We’re pushing mostly the same data, night after night. The customer only needs the data in the event of a catastrophe.
  • This method runs the risk of losing up to 24 hours of the customers’ data if we only copy nightly backups. Copying periodic differential backups would reduce the recovery point objective at the expense of more bandwidth.
  • Managing all the possible destination types, auth credentials and integrating new targets is complex.

A less complex variation is to require our customers to provide credentials to a storage service of a single type, like Amazon Glacier. Everyone uses Glacier, or you don’t benefit from the service.
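To illustrate how the destination-management complexity might be contained, here is a minimal Python sketch in which every destination type sits behind one small interface. All names here (`BackupTarget`, `LocalDirectoryTarget`, `push_backup`) are hypothetical, and the local-directory target is only a stand-in; a real target would wrap an SFTP or cloud storage client.

```python
import shutil
from abc import ABC, abstractmethod
from pathlib import Path


class BackupTarget(ABC):
    """One customer-chosen destination for a backup file (hypothetical interface)."""

    @abstractmethod
    def upload(self, backup_file: Path) -> str:
        """Copy the backup to the destination and return where it landed."""


class LocalDirectoryTarget(BackupTarget):
    """Stand-in target that copies into a directory instead of a remote service."""

    def __init__(self, directory: Path) -> None:
        self.directory = directory

    def upload(self, backup_file: Path) -> str:
        self.directory.mkdir(parents=True, exist_ok=True)
        destination = self.directory / backup_file.name
        shutil.copy2(backup_file, destination)
        return str(destination)


def push_backup(backup_file: Path, targets: list[BackupTarget]) -> list[tuple[str, str]]:
    """Push one backup to every target; record failures instead of aborting the run."""
    results = []
    for target in targets:
        name = type(target).__name__
        try:
            results.append((name, target.upload(backup_file)))
        except OSError as exc:
            # One unreachable destination should not block the other customers.
            results.append((name, f"FAILED: {exc}"))
    return results
```

Requiring a single storage type then amounts to shipping exactly one `BackupTarget` implementation.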

Third-Party Backup Tie-In

There are services out there like Backupify that hook into Google Apps and back up data on a regular basis. I’m seeking out more services like this which could provide an independent, turn-key solution that our customers could sign up and pay for, and which we could fully automate.

Unless the third party service ties into SQL Server in an intelligent way to only pull data differences from night to night, this is still going to be bandwidth intensive.

Native Database Replication

There are several built-in SQL Server features of interest, such as log shipping, mirroring, replication and “always-on availability groups”. These all require a SQL Server on the remote side, as well as requiring the remote server be joined to a single domain and/or be part of a Windows Server Failover Cluster controlled by us. These methods therefore don’t seem viable for this use case.

Use Case 2

As a prospect/customer, I want access to download and/or query my data so that I can build my own reports, dashboards and applications using that data.

Web API

Providing a comprehensive REST or SOAP/XML (or both) API would help some, but not most of our customers. Many customers really want some kind of SQL access, through ODBC/JDBC/ADO.NET or the like.

Open Data Protocol (OData)

We could expose denormalized data sets and present them using OData (see also SO questions on OData). However, because OData is newish, (1) I’m not sure it will become mainstream enough to warrant a long-term investment and (2) it may be challenging for our customers to both understand and access. I’m not married to this opinion. I do know that the PowerPivot plugin for Excel 2010 can connect to OData sources, and Tableau (a tool at least one customer uses) can as well. That’s promising.
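To give a feel for what customers would have to understand, here is a rough Python sketch that builds an OData query URL from the standard `$select`, `$filter` and `$top` query options. The service root, the `Orders` entity set and its `Id` and `Total` properties are made up for illustration; they are not part of our service.

```python
from urllib.parse import quote, urlencode


def odata_url(service_root, entity_set, select=None, filter_expr=None, top=None):
    """Build an OData query URL from a few standard system query options."""
    params = {}
    if select:
        params["$select"] = ",".join(select)
    if filter_expr:
        params["$filter"] = filter_expr
    if top is not None:
        params["$top"] = str(top)
    url = f"{service_root.rstrip('/')}/{entity_set}"
    if params:
        # Leave "$" and "," unescaped so the options stay readable; spaces become %20.
        url += "?" + urlencode(params, quote_via=quote, safe="$,")
    return url


# Hypothetical service root and entity set, for illustration only.
print(odata_url(
    "https://example.invalid/odata",
    "Orders",
    select=["Id", "Total"],
    filter_expr="Total gt 100",
    top=10,
))
```

A customer comfortable with SQL would still need to learn this filter syntax, which is part of the concern in point (2).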

Data Replication

We’ve experimented with using Microsoft SQL Server transactional (pub/sub) replication of de-normalized subsets of data with a couple of customers. This requires VPNs and DBAs, making it complicated to set up and maintain. The dependence on SQL Server isn’t ideal.

Third Party Connectors

I’ve researched and found a couple of companies who provide offerings around opening SaaS data to end customers.

  • Connection Cloud
  • Progress DataDirect

Both companies have products that would seem to enable exactly what I’m looking to do. These are the only two providers I’ve found so far. I’m talking to representatives from both companies to evaluate their offerings.

The Question

How have you solved your own customers’ “data liberation” problems?

3

We do this process with data from/for school districts. Our approach includes:

  • Focusing on Excel and CSV for both import and export of data, as this is what most non-technical customers are familiar with. CSV is extremely ‘universal’ as a data transport format.

  • Not doing database backups for customers. The db is not much use without the app that currently accesses it.

To be honest though, I’ve worked for half a dozen companies that provide SaaS in one way or another and this has never been a hot topic for any of them. The main focus is the apps that currently accompany the data and the value they provide, and that’s about where it stops.

This doesn’t address some of your questions but I hope it helps. It’s also not a good idea to ask a ton of minor questions within a big one. Just like software, it’s best to take one small question at a time, even when tackling a big problem.

1

We’ve performed some of this with our application.

The best approach is a simple one. Namely,

  1. Provide a simple web API with basic querying capability;
  2. Provide a simple report generator / downloads into CSV; and,
  3. Provide a way to schedule auto exports of data.

Those can work hand in hand. You don’t want to build anything overly complicated as few would bother trying to use it. However, if you can allow your customer to effectively do a “select * from …” call then you are good.
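As a rough illustration of the “select * from …” style export, here is a minimal Python sketch that dumps one allow-listed table to CSV. It uses an in-memory SQLite database as a stand-in for whatever database you run, and the `orders` table, its rows and the function names are made up for the example.

```python
import csv
import io
import sqlite3

# Hypothetical allow-list: only these tables (or views) are exposed to customers.
EXPORTABLE_TABLES = {"orders"}


def export_table_csv(conn: sqlite3.Connection, table: str) -> str:
    """Return every row of an allow-listed table as CSV text with a header row."""
    if table not in EXPORTABLE_TABLES:
        raise ValueError(f"table not exportable: {table}")
    # The table name has been checked against the allow-list above,
    # so it is safe to place it in the statement.
    cursor = conn.execute(f"SELECT * FROM {table}")
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(column[0] for column in cursor.description)
    writer.writerows(cursor)
    return buffer.getvalue()


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory database with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "Acme", 120.5), (2, "Globex", 80.0)],
    )
    return conn


if __name__ == "__main__":
    print(export_table_csv(demo_connection(), "orders"))
```

The same function could back both the download button and a scheduled export, so the three items above share one code path.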

Of course some of them want direct SQL access. This is almost always a bad idea, though how bad depends upon your data structures. Never mind that you most likely have a level of business logic embedded in the application itself that would be difficult to teach others about.

It’s far better to give them a simple API with a few parameters that lets them pull some JSON data or a CSV/Excel export, with CSV being preferred.
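For a sense of scale, here is a minimal sketch of such an endpoint as a plain WSGI app in Python, using only the standard library. The parameter names (`format`, `since`), the field names and the sample rows are all assumptions for the example, not anything from a real product.

```python
import csv
import io
import json
from urllib.parse import parse_qs

# Made-up sample rows standing in for one customer's data.
ROWS = [
    {"id": 1, "created": "2013-01-05", "total": 120.5},
    {"id": 2, "created": "2013-02-10", "total": 80.0},
]


def render(rows, fmt):
    """Serialize rows as CSV (the default) or JSON; return (content_type, body)."""
    if fmt == "json":
        return "application/json", json.dumps(rows)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["id", "created", "total"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return "text/csv", buffer.getvalue()


def export_app(environ, start_response):
    """WSGI app handling requests like /?format=csv&since=2013-02-01."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    fmt = params.get("format", ["csv"])[0]
    since = params.get("since", [""])[0]
    if fmt not in ("csv", "json"):
        start_response("400 Bad Request", [("Content-Type", "text/plain")])
        return [b"format must be csv or json"]
    # ISO dates compare correctly as strings, so no date parsing is needed here.
    rows = [row for row in ROWS if row["created"] >= since]
    content_type, body = render(rows, fmt)
    start_response("200 OK", [("Content-Type", content_type)])
    return [body.encode("utf-8")]
```

To try it locally, something like `wsgiref.simple_server.make_server("", 8000, export_app).serve_forever()` would serve it; a real service would of course add authentication and scope the rows per customer.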

Most devs can work with CSV. Amazingly few can work with JSON/XML; this actually boggles my mind.

Don’t give them backups. Those have limited usability that is usually heavily dependent upon the business logic in the application to work well.
