Managing unique data correctly in SQL

Here in Brazil we have a personal identifier called CPF. This identifier has a sequence of 11 digits and is unique for each Brazilian citizen. Possibly in other countries there is something like that too.

How is a unique identifier, I chose to set it in the database like a UNIQUE KEY. And now that things get complicated. A user could be registered by an administrator and be removed by some kind of problem. However, it is not actually removed, but hidden by a deleted flag. And in another moment, he could be recreated again (there is a need to recreate it, and not re-enable it).

Because it is an individual identifier, the CPF would be reused on recreated user and would conflict in UNIQUE KEY.

In this situation, I thought of some practical solutions to solve the problem as well as its disadvantages. But I like what is the most appropriate way to solve this kind of problem.

Do not use a UNIQUE KEY. The disadvantage is that it could increase the query time, which could perhaps be solved by a KEY-only column.
Set the column to NULL on pseudo-DELETE. The disadvantage is that I lose the information, however, could solve just creating a second column to store its value before being annulled.
The UNIQUE KEY could group the column with a date removal column (NULL by default). The disadvantage is having to use two columns for each group UNIQUE KEY.
Create a second table to store data removed without UNIQUE KEY. The disadvantages here are almost incalculable!
Remove the row, rather than hide it. The disadvantage here is that I could lose the references in a user FK.

Currently I prefer the first option, and if very necessary, I would opt for the second or third option. But I wonder if there is a more correct way, following the concept more accepted by the community.

what you’ve described is a unique key – only you’re thinking of it as a unique column.

You can create a unique constraint on any number of columns in a table, so here you’d combine the CPF with the deleted (or some other column if you could have multiple deleted records) column as well.

For example:

ALTER TABLE Persons
ADD CONSTRAINT uc_PersonID UNIQUE (P_Id,LastName)

Do not, do not, DO NOT abuse your database. A unique or primary key should be unique, without exception. Even if you’re using an archaic database system like xBase which natively supports the idea of “marking a record as deleted”, it’s just a bad idea.

There are two good and simple options, however.

1: Don’t “delete”, “mark inactive” instead.

As a SQL verb, “delete” has special meaning in any RDMS or system. It means you’re telling the system to physically remove and destroy records. Any data that happens to be recoverable is such only as a shortcut by the RDMS, and your data model should not depend on that behavior.

With a simple “Active” or “Inactive” boolean value, it should be easy to mark your person record, and relatively simple for your DBA to recover someone they had to manually adjust for some reason. (Not that such is good practice, but that’s another topic.) Don’t muck-about with duplicate CPF’s; just declare the thing unique, and train your admin staff on how to recover an inactive person.

Of course, this method while simple also prevents the ability to easily give a person a new surrogate key while maintaining past history. If that’s important, go with option #2.

2: Use a seperate table for CPF, and have “personID” be a property of the CPF instead of the inverse.

The obvious model for designing a system with people and their unique government number is to have said number be a field on the person table.

CREATE TABLE persons (
  personID int IDENTITY PRIMARY KEY,
  name varchar(250),
  CPF varchar(10)
)

However, in an instance where your design demands that two person records may exist for the same CPF, the simple design is to have CPF’s be unique, and have them point to their current person record.

CREATE TABLE persons (
  personID int IDENTITY PRIMARY KEY,
  name varchar(250)
)

CREATE TABLE cpfs (
  CPF varchar(10) NOT NULL PRIMARY KEY,
  personID int REFERENCES person(personID)
)

Give the tables above, it is a simple matter to create a view listing only the “real” person for each CPF, along with their personID.

SELECT C.CPF, P.* 
FROM cfps C 
INNER JOIN persons P
  ON C.personID = P.personID

And it likewise becomes simple to query for only active persons.

SELECT *
FROM someTable
WHERE personID IN (SELECT personID FROM cfps)

Now, this system does introduce a second table, and includes a modest bit of complexity. If you don’t expect multiple “person” records to stick around as a regular thing and only anticipate some few exceptions, go with option #1 instead.

Trang chủ Giới thiệu Sinh nhật bé trai Sinh nhật bé gái Tổ chức sự kiện Biểu diễn giải trí Dịch vụ khác Trang trí tiệc cưới Tổ chức khai trương Tư vấn dịch vụ Thư viện ảnh Tin tức - sự kiện Liên hệ Chú hề sinh nhật Trang trí YEAR END PARTY công ty Trang trí tất niên cuối năm Trang trí tất niên xu hướng mới nhất Trang trí sinh nhật bé trai Hải Đăng Trang trí sinh nhật bé Khánh Vân Trang trí sinh nhật Bích Ngân Trang trí sinh nhật bé Thanh Trang Thuê ông già Noel phát quà Biểu diễn xiếc khỉ Xiếc quay đĩa Dịch vụ tổ chức sự kiện 5 sao Thông tin về chúng tôi Dịch vụ sinh nhật bé trai Dịch vụ sinh nhật bé gái Sự kiện trọn gói Các tiết mục giải trí Dịch vụ bổ trợ Tiệc cưới sang trọng Dịch vụ khai trương Tư vấn tổ chức sự kiện Hình ảnh sự kiện Cập nhật tin tức Liên hệ ngay Thuê chú hề chuyên nghiệp Tiệc tất niên cho công ty Trang trí tiệc cuối năm Tiệc tất niên độc đáo Sinh nhật bé Hải Đăng Sinh nhật đáng yêu bé Khánh Vân Sinh nhật sang trọng Bích Ngân Tiệc sinh nhật bé Thanh Trang Dịch vụ ông già Noel Xiếc thú vui nhộn Biểu diễn xiếc quay đĩa Dịch vụ tổ chức tiệc uy tín Khám phá dịch vụ của chúng tôi Tiệc sinh nhật cho bé trai Trang trí tiệc cho bé gái Gói sự kiện chuyên nghiệp Chương trình giải trí hấp dẫn Dịch vụ hỗ trợ sự kiện Trang trí tiệc cưới đẹp Khởi đầu thành công với khai trương Chuyên gia tư vấn sự kiện Xem ảnh các sự kiện đẹp Tin mới về sự kiện Kết nối với đội ngũ chuyên gia Chú hề vui nhộn cho tiệc sinh nhật Ý tưởng tiệc cuối năm Tất niên độc đáo Trang trí tiệc hiện đại Tổ chức sinh nhật cho Hải Đăng Sinh nhật độc quyền Khánh Vân Phong cách tiệc Bích Ngân Trang trí tiệc bé Thanh Trang Thuê dịch vụ ông già Noel chuyên nghiệp Xem xiếc khỉ đặc sắc Xiếc quay đĩa thú vị

Filed under: softwareengineering - @ 01:33

Thẻ: schema, sql

Managing unique data correctly in SQL

Because it is an individual identifier, the CPF would be reused on recreated user and would conflict in UNIQUE KEY.

In this situation, I thought of some practical solutions to solve the problem as well as its disadvantages. But I like what is the most appropriate way to solve this kind of problem.

Do not use a UNIQUE KEY. The disadvantage is that it could increase the query time, which could perhaps be solved by a KEY-only column.
Set the column to NULL on pseudo-DELETE. The disadvantage is that I lose the information, however, could solve just creating a second column to store its value before being annulled.
The UNIQUE KEY could group the column with a date removal column (NULL by default). The disadvantage is having to use two columns for each group UNIQUE KEY.
Create a second table to store data removed without UNIQUE KEY. The disadvantages here are almost incalculable!
Remove the row, rather than hide it. The disadvantage here is that I could lose the references in a user FK.

what you’ve described is a unique key – only you’re thinking of it as a unique column.

For example:

ALTER TABLE Persons
ADD CONSTRAINT uc_PersonID UNIQUE (P_Id,LastName)

There are two good and simple options, however.

1: Don’t “delete”, “mark inactive” instead.

Of course, this method while simple also prevents the ability to easily give a person a new surrogate key while maintaining past history. If that’s important, go with option #2.

2: Use a seperate table for CPF, and have “personID” be a property of the CPF instead of the inverse.

The obvious model for designing a system with people and their unique government number is to have said number be a field on the person table.

CREATE TABLE persons (
  personID int IDENTITY PRIMARY KEY,
  name varchar(250),
  CPF varchar(10)
)

CREATE TABLE persons (
  personID int IDENTITY PRIMARY KEY,
  name varchar(250)
)

CREATE TABLE cpfs (
  CPF varchar(10) NOT NULL PRIMARY KEY,
  personID int REFERENCES person(personID)
)

Give the tables above, it is a simple matter to create a view listing only the “real” person for each CPF, along with their personID.

SELECT C.CPF, P.* 
FROM cfps C 
INNER JOIN persons P
  ON C.personID = P.personID

And it likewise becomes simple to query for only active persons.

SELECT *
FROM someTable
WHERE personID IN (SELECT personID FROM cfps)

Filed under: softwareengineering - @ 01:33

Thẻ: schema, sql

Thiết kế website giá rẻ

Danh mục

Managing unique data correctly in SQL

1: Don’t “delete”, “mark inactive” instead.

2: Use a seperate table for CPF, and have “personID” be a property of the CPF instead of the inverse.

Managing unique data correctly in SQL

1: Don’t “delete”, “mark inactive” instead.

2: Use a seperate table for CPF, and have “personID” be a property of the CPF instead of the inverse.