TL;DR – I’m looking for guidance with my database design. I am concerned that my existing design is inefficient and won’t be able to handle large numbers of employees.
This is going to be long, so please bear with me.
I am building a JEE web application for tracking staffing demand, supply and usage.
So far, I have done the following (the easy part):
- Core DB models such as Employee, Department, Location etc.
Key points about demand and supply structure:
- Allocation for each employee can range from 0.1 to 1.0.
- Each new employee is automatically allocated to a pool project at 1.0.
- An employee’s allocation needs to be tracked at week-level granularity. Daily allocation tracking may be required in future.
- An employee can be allocated partially (between 0.1 and 1.0) or completely (1.0) to a demand.
- A demand is further split into weeks, and the actual allocation needs to be tracked for each employee for each of these weeks. An employee allocated to a demand may be allocated fully or partially for some or all of the weeks of this demand.
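A minimal sketch of the allocation rules above, assuming Java 16+ and plain value types rather than JPA entities. The names `WeeklyAllocation` and `poolAllocation`, the `"POOL"` project id, the 0.1 step size (stored as tenths) and the Monday week start are all assumptions made for illustration, not part of the design described here.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;

final class WeeklyAllocationSketch {

    /**
     * Hypothetical value type: one employee's allocation to one demand for one week.
     * The fraction is stored as tenths (1..10) so that 0.1 steps stay exact (assumed step size).
     */
    record WeeklyAllocation(String employeeId, String demandId, LocalDate weekStart, int tenths) {
        WeeklyAllocation {
            if (tenths < 1 || tenths > 10) {
                throw new IllegalArgumentException("allocation must be between 0.1 and 1.0");
            }
            // Assumption: a tracking week is identified by its Monday.
            if (weekStart.getDayOfWeek() != DayOfWeek.MONDAY) {
                throw new IllegalArgumentException("weekStart must be a Monday");
            }
        }

        double fraction() {
            return tenths / 10.0;
        }
    }

    /** Each new employee starts fully allocated (1.0) to the pool project. */
    static WeeklyAllocation poolAllocation(String employeeId, LocalDate weekStart) {
        return new WeeklyAllocation(employeeId, "POOL", weekStart, 10);
    }

    public static void main(String[] args) {
        LocalDate monday = LocalDate.of(2014, 2, 10);
        WeeklyAllocation pool = poolAllocation("emp1", monday);
        System.out.println(pool.employeeId() + " -> " + pool.demandId() + " at " + pool.fraction());
    }
}
```

Moving to daily tracking later would mean replacing `weekStart` with a day-level date, which is why the period is kept as an explicit field in this sketch.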
I am planning to represent demand in the DB as 3 tables.
DemandMaster (DM): contains an entry for each new demand created. Basically the number of employees required and the overall duration of the demand.
DemandLineItem (DLI): level 2 of the demand data. Stores the employee requirements for the demand.
DemandLineItemDetails (DLID): level 3 of the demand data. Keeps the details of each allocation for an employee at week level.
The relationship I envision for these 3 tables is like this
DemandMaster(DM) <-- 1 to many --> DemandLineItem (DLI) <-- 1 to many --> DemandLineItemDetails (DLID)
So here is a dummy data set:

Table DM:

| DemandId | StartDate | EndDate |
|---|---|---|
| Demand1 | 10 Feb 2014 | 24 Feb 2014 |

Table DLI:

| DemandId | DliId | EmployeeRoleRequirement |
|---|---|---|
| Demand1 | Dli1 | Foreman |
| Demand1 | Dli2 | LineManager |
| Demand1 | Dli3 | Manager |

Table DLID:

| DLIID | DLID_ID | allocationstart | allocationend |
|---|---|---|---|
| Dli1 | DLIID1 | 10 Feb 2014 | 17 Feb 2014 |
| Dli1 | DLIID2 | 18 Feb 2014 | 24 Feb 2014 |
| Dli2 | DLIID3 | 18 Feb 2014 | 24 Feb 2014 |
| Dli3 | DLIID4 | 10 Feb 2014 | 17 Feb 2014 |
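For reference, the three-level structure and the dummy rows can be sketched as plain Java types (assuming Java 16+). This is only an illustration of the DM -> DLI -> DLID hierarchy: the record and method names are hypothetical, and real JPA entities would carry mapping annotations that are left out here.

```java
import java.time.LocalDate;
import java.util.List;

final class DemandModelSketch {

    // Level 3: one allocation period for a line item.
    record DemandLineItemDetail(String dlidId, LocalDate allocationStart, LocalDate allocationEnd) {}

    // Level 2: one role requirement of a demand, owning its detail rows.
    record DemandLineItem(String dliId, String roleRequirement, List<DemandLineItemDetail> details) {}

    // Level 1: the demand itself, owning its line items.
    record DemandMaster(String demandId, LocalDate startDate, LocalDate endDate,
                        List<DemandLineItem> lineItems) {}

    /** Builds the dummy data set from the tables above. */
    static DemandMaster sampleDemand() {
        LocalDate feb10 = LocalDate.of(2014, 2, 10);
        LocalDate feb17 = LocalDate.of(2014, 2, 17);
        LocalDate feb18 = LocalDate.of(2014, 2, 18);
        LocalDate feb24 = LocalDate.of(2014, 2, 24);
        return new DemandMaster("Demand1", feb10, feb24, List.of(
            new DemandLineItem("Dli1", "Foreman", List.of(
                new DemandLineItemDetail("DLIID1", feb10, feb17),
                new DemandLineItemDetail("DLIID2", feb18, feb24))),
            new DemandLineItem("Dli2", "LineManager", List.of(
                new DemandLineItemDetail("DLIID3", feb18, feb24))),
            new DemandLineItem("Dli3", "Manager", List.of(
                new DemandLineItemDetail("DLIID4", feb10, feb17)))));
    }

    /** Counts the level-3 rows that hang off one demand. */
    static long detailCount(DemandMaster demand) {
        return demand.lineItems().stream()
            .mapToLong(lineItem -> lineItem.details().size())
            .sum();
    }

    public static void main(String[] args) {
        DemandMaster demand = sampleDemand();
        System.out.println(demand.demandId() + " has " + detailCount(demand) + " detail rows");
    }
}
```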
The problem statement:
- This is an inefficient way of creating and tracking a demand, because of the number of joins involved.
- As the number of demands increases, the DLI and DLID data will grow and fast CRUD operations will become difficult.
- If I apply a similar structure to supply, the number of weeks of data for each current employee will be huge, especially because the supply structure needs to hold at least 2 years of future data as “un-used” OR “available” allocations.
- The fact that allocations committed to a demand also need to be subtracted from the above “un-used” or “available” pool makes it more complex.
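To make the subtraction in the last point concrete, here is a small sketch (assuming Java 16+) that derives the available allocation for a week from the committed rows instead of storing “un-used” rows in advance. Whether deriving on read is acceptable for this application is an assumption; the `Commitment` record, the tenths representation and the method name are hypothetical.

```java
import java.time.LocalDate;
import java.util.List;

final class AvailabilitySketch {

    /** Hypothetical committed allocation: tenths of one employee's week given to some demand. */
    record Commitment(String employeeId, LocalDate weekStart, int tenths) {}

    /** Full capacity of one employee for one week, in tenths (1.0 == 10). */
    static final int FULL_WEEK_TENTHS = 10;

    /**
     * Available allocation = full capacity minus everything already committed
     * for that employee in that week. Weeks with no commitments are fully available.
     */
    static int availableTenths(List<Commitment> commitments, String employeeId, LocalDate weekStart) {
        int committed = commitments.stream()
            .filter(c -> c.employeeId().equals(employeeId) && c.weekStart().equals(weekStart))
            .mapToInt(Commitment::tenths)
            .sum();
        if (committed > FULL_WEEK_TENTHS) {
            throw new IllegalStateException("employee " + employeeId + " is over-allocated");
        }
        return FULL_WEEK_TENTHS - committed;
    }

    public static void main(String[] args) {
        LocalDate week = LocalDate.of(2014, 2, 10);
        List<Commitment> commitments = List.of(
            new Commitment("emp1", week, 3),
            new Commitment("emp1", week, 5));
        System.out.println("emp1 available tenths: " + availableTenths(commitments, "emp1", week));
    }
}
```

With this approach only committed weeks need rows, so the 2 years of future availability would not have to be materialised per employee; the trade-off is that availability queries have to aggregate.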
Is there a better way of maintaining data like this (week/day level data for each employee)?
Maybe a better DB model OR different approach?
I am using MySQL Workbench and a MySQL database for the design.
I am planning to use JPA (with Hibernate as the provider) as the ORM.
I think it’s important that you keep the structure as is. But you’ll also want to make a flattened structure for viewing the information. (This is probably what you’re seeing as inefficient).
You can approach this in two ways. You can do a deferred “agent-based” ETL into a more flattened schema, or you can use materialized views (or their equivalent on your database platform) to precache the searches for you without needing the agent.
In CQRS, this distinction is called the Read Model vs. the Write Model.
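As a rough illustration of the flattening step, the sketch below (assuming Java 16+) turns the nested write model into one flat row per detail entry, which is the shape a read-side table or view would have. All type and method names are hypothetical, and a real ETL agent would read from and write to the database rather than in-memory lists.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

final class ReadModelSketch {

    // Write model: the normalized three-level structure.
    record Detail(String dlidId, LocalDate start, LocalDate end) {}
    record LineItem(String dliId, String role, List<Detail> details) {}
    record Demand(String demandId, List<LineItem> lineItems) {}

    // Read model: one denormalized row per detail, no joins needed to display it.
    record FlatRow(String demandId, String dliId, String role,
                   String dlidId, LocalDate start, LocalDate end) {}

    /** Flattens the nested write model into read-model rows. */
    static List<FlatRow> flatten(List<Demand> demands) {
        List<FlatRow> rows = new ArrayList<>();
        for (Demand demand : demands) {
            for (LineItem lineItem : demand.lineItems()) {
                for (Detail detail : lineItem.details()) {
                    rows.add(new FlatRow(demand.demandId(), lineItem.dliId(), lineItem.role(),
                                         detail.dlidId(), detail.start(), detail.end()));
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        Demand demand = new Demand("Demand1", List.of(
            new LineItem("Dli1", "Foreman", List.of(
                new Detail("DLIID1", LocalDate.of(2014, 2, 10), LocalDate.of(2014, 2, 17)),
                new Detail("DLIID2", LocalDate.of(2014, 2, 18), LocalDate.of(2014, 2, 24)))),
            new LineItem("Dli2", "LineManager", List.of(
                new Detail("DLIID3", LocalDate.of(2014, 2, 18), LocalDate.of(2014, 2, 24))))));
        flatten(List.of(demand)).forEach(System.out::println);
    }
}
```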