We currently have 3 similar but slightly different applications that use the same data. We load the (same) data into each application that uses it. The applications use the same technology (Ruby on Rails) but were developed by different programmers.
In consolidating this data, our management sees the benefit of using the same tools (gems, modules, etc.) but does not seem aware of the many other reasons why we should consolidate things centrally.
I’ve worked in several different organizations, including one position in Data Warehousing, so the centralization of data seems like a no-brainer to me.
The life cycle of the data is the following year for the current paying customer, but it also has longer-term value: we get the data from the various organizations that have it, over the years, and we’ll want to compare one org and one year to others.
However, I want to put together a list of all the reasons to centralize (or not) this set of data. So far I have:
- Reduce duplication of effort. When we implement a function, we currently may have to do it in two places, which increases both the implementation effort and the cost.
- Quality of data. Creating two data sets that have similar but different contents will lead to discrepancies between the two that will grow over time, reducing the value of the data.
I am looking for a list of other reasons, particularly those that relate to the overall cost.
OK, you said you are looking for pros and cons of centralizing the data, and you gave some pros yourself. Here are some cases where not centralizing may be the better option:
- The data changes so seldom that it simply does not pay to change your existing applications. Say changing those three applications needs 1 month of development effort altogether, but the data changes just once a year, is provided from outside, and the effort to provide it or enter it in all 3 applications is one day. Then your return-on-investment point is ~30 years in the future.
- You can create an easy way to maintain the data in just one place and transfer it automatically from there to wherever the 3 applications access it.
- The three applications must be kept strictly decoupled, so that each of them can run without the others and without using data that belongs to one of the others (or to a fourth one). This may have technical reasons (for example, fail-safety) or legal reasons (for example, licensing issues).
- The three applications need the data in different versions or in different states of up-to-dateness.
If none of those things applies, centralizing the data is probably the better option, especially when the data is changed often, can be changed directly through your 3 applications, and the 3 applications need the data in the same state.
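The break-even estimate from the first point of the list above can be checked with a few lines of Ruby. This is only a rough sketch: the helper name `break_even_years` is made up for illustration, one month of development is assumed to be about 30 days, and the whole day of yearly data entry is counted as saved (which is what the ~30 years estimate implies).

```ruby
# Hypothetical helper: years until a one-time consolidation effort is
# repaid by the recurring manual effort it removes.
def break_even_years(one_time_effort_days:, recurring_effort_days_per_year:)
  unless recurring_effort_days_per_year.positive?
    raise ArgumentError, "recurring effort must be positive"
  end

  one_time_effort_days.to_f / recurring_effort_days_per_year
end

# Assumption: 1 month of development is roughly 30 days,
# and loading the data into all 3 applications costs 1 day per year.
years = break_even_years(one_time_effort_days: 30,
                         recurring_effort_days_per_year: 1)
puts "Break-even after about #{years.round} years"
```

If the data changed monthly instead of yearly (12 days of effort per year), the same calculation would give a break-even of about 2.5 years, which is why the frequency of change matters so much for this decision.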