Sorry if the title is hard to understand; it's easier to get the question across if I describe the scenario.
I have a service that collects data, performs business logic, and does all the normal stuff. I now have to build an integration that provides a set of data (for Sage accounts) via an endpoint (http://example.net/import-feed).
This endpoint can be polled by the external service at any time to pull the latest changes.
When a piece of data is updated, say a payment or a date, that change needs to be added to the feed.
This payment or date could be changed multiple times before the service gets polled.
The question is: what is the best method to keep track of all the changes (or just the last change, which is probably enough) made to a record/row and pull those changes into the feed? The external service will then send notifications back to my service letting me know whether each update was added successfully.
Currently the only method I can think of is to track the last modified time, the last time the record was imported, and a status saying whether it needs to be imported/updated. These imports/updates would then be pulled from the feed.
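Something like this, as a minimal sketch (Python with SQLite; the payments table and its columns are placeholder names I've made up):

```python
import sqlite3

# Sketch of the "last modified / last imported / status" idea.
# The payments table and all column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE payments (
        id               INTEGER PRIMARY KEY,
        amount           REAL,
        modified_at      TEXT,               -- bumped on every change
        last_imported_at TEXT,               -- set when the feed is polled
        needs_import     INTEGER DEFAULT 1   -- status flag
    )
""")

def rows_for_feed(conn):
    # Anything flagged, or modified since it was last imported,
    # belongs in the next feed response.
    return conn.execute("""
        SELECT id, amount, modified_at
        FROM payments
        WHERE needs_import = 1
           OR modified_at > COALESCE(last_imported_at, '')
    """).fetchall()
```

(The string comparison relies on ISO-8601 timestamps, which sort correctly as text.)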
I hope you understand the problem; I have had a hard time finding information on this type of integration. It may just be me not using the correct terms.
So, what are the methods for this?
The method you've suggested, using a last-modified timestamp, should work fine. If you need to store data specific to the external service, such as keeping track of whether the service has downloaded a given record, then it might make sense to store this meta information in a separate table, e.g.:
sage_queue
id | table | table_id | modified_at
This way, when your Sage endpoint is called, it should be a fairly straightforward procedure to run through this table and collect all the associated data from your core tables.
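For illustration, here is a minimal sketch of that read path (Python with SQLite; payments is a hypothetical core table, and I've used table_name in place of table, since TABLE is a reserved word in SQL):

```python
import sqlite3

# Sketch of the import-feed endpoint's read path: walk sage_queue
# and pull the current state of each referenced core row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sage_queue (
        id          INTEGER PRIMARY KEY,
        table_name  TEXT,     -- "table" itself is a reserved word
        table_id    INTEGER,
        modified_at TEXT
    );
    CREATE TABLE payments (id INTEGER PRIMARY KEY, amount REAL);
""")

def build_feed(conn):
    feed = []
    rows = conn.execute(
        "SELECT id, table_name, table_id, modified_at FROM sage_queue")
    for queue_id, table_name, table_id, modified_at in rows:
        # Whitelist the tables you expose; never interpolate table_name
        # directly into SQL.
        if table_name == "payments":
            row = conn.execute(
                "SELECT id, amount FROM payments WHERE id = ?",
                (table_id,)).fetchone()
            if row:
                feed.append({
                    "queue_id": queue_id,
                    "type": table_name,
                    "modified_at": modified_at,
                    "data": {"id": row[0], "amount": row[1]},
                })
    return feed
```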
When your Sage endpoint receives confirmation that a certain entry has been processed, that entry can simply be deleted from the table.
When a record gets updated in one of your core tables, a reference to it gets entered into your sage_queue table (triggered either by an event-listener system or by a periodic cron job). If a reference to that record already exists, you could just update the modified_at field.
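A rough sketch of that write path, with the same placeholder names (the ON CONFLICT upsert assumes SQLite 3.24+; other databases have their own upsert syntax):

```python
import sqlite3

# Sketch of the queue's write path: upsert a reference when a core
# record changes, delete it when the external service confirms.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sage_queue (
        id          INTEGER PRIMARY KEY,
        table_name  TEXT,
        table_id    INTEGER,
        modified_at TEXT,
        UNIQUE (table_name, table_id)  -- at most one entry per record
    )
""")

def record_changed(conn, table_name, table_id, modified_at):
    # Insert a reference, or just bump modified_at if one already exists.
    conn.execute("""
        INSERT INTO sage_queue (table_name, table_id, modified_at)
        VALUES (?, ?, ?)
        ON CONFLICT (table_name, table_id)
        DO UPDATE SET modified_at = excluded.modified_at
    """, (table_name, table_id, modified_at))

def confirm_processed(conn, table_name, table_id):
    # Called when the external service reports success for this entry.
    conn.execute(
        "DELETE FROM sage_queue WHERE table_name = ? AND table_id = ?",
        (table_name, table_id))
```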
It strikes me that you have a problem that might suit the Command Query Responsibility Segregation (CQRS) pattern.
Essentially, rather than storing a single entry and constantly updating it, you would use a message queue to store all the add/edit statements, so you could roll the data forward or backward to any point in time.
Periodically (at your polling/scheduling interval) you would simply serve all the data from the first-in, first-out (FIFO) queue, then pop the oldest records off the queue once they have been processed.
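A minimal sketch of that flow, with a made-up event shape (in practice the queue would live in a message broker or a database table rather than in memory):

```python
from collections import deque

# Sketch of the event-queue idea: every add/edit is appended as an
# immutable event, the feed serves events oldest-first, and events
# are popped only once the external service acknowledges them.
events = deque()

def record_event(entity_id, field, value, occurred_at):
    events.append({"entity_id": entity_id, "field": field,
                   "value": value, "occurred_at": occurred_at})

def serve_feed():
    # Hand everything over in FIFO order, without removing anything yet.
    return list(events)

def acknowledge(count):
    # Pop the oldest events once they are confirmed as processed.
    for _ in range(count):
        events.popleft()

def current_state(entity_id):
    # Rolling forward: fold the events into the entity's latest state.
    state = {}
    for e in events:
        if e["entity_id"] == entity_id:
            state[e["field"]] = e["value"]
    return state
```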
I suspect you may need to manage how you map each entry in your system to the Sage Accounting APIs with this approach, but I think it is much safer, as it provides an audit trail.