I’m working with a Microsoft Access table that stores its data in the following format:
ID | Document | Revision |
---|---|---|
8000C001 | 103A4000X001, 103A4000X002, 103A4000X003 | A, B, A |
8000C002 | 103A5000G002/1/01, /02, /03 | C, D, B |
8000C003 | 103A6000B001 | A |
And I’d like to use an SQL Select query to reformat it like the table below:
ID | Document | Revision |
---|---|---|
8000C001 | 103A4000X001 | A |
8000C001 | 103A4000X002 | B |
8000C001 | 103A4000X003 | A |
8000C002 | 103A5000G002/1/01 | C |
8000C002 | 103A5000G002/1/02 | D |
8000C002 | 103A5000G002/1/03 | B |
8000C003 | 103A6000B001 | A |
Documents are divided by a comma, but when documents have the same prefix, subsequent files are abbreviated.
I set up a second table that contains the latest revision of EVERY document, and am currently using an outer join statement to reformat the data.
SELECT
one.ID
two.Document,
two.Revision,
FROM [All Documents] AS one
RIGHT JOIN [Active Documents] AS two
ON (one.Document LIKE IIf(InStr(two.Document,"/")<>0, "*" & Left(two.Document,InStr(two.Document,"/")-1) & "*" & Right(two.Document,Len(two.Document)-InStr(two.Document,"/")) & "*", "*" & two.Document & "*"))
AND (one.Revision LIKE "*" & two.Revision & "*");
This DOES work, but it is very slow. Running the query can take over 20 seconds to finish, so I started writing a second script that uses Unions to format the data instead:
SELECT
one.ID
one.Document,
one.Revision
FROM 8110 AS one WHERE one.Document NOT like "*,*"
UNION
SELECT
one.ID,
Trim(Left(one.Document, InStr(one.Document, ",") - 1)),
Left(one.Rev, 1)
FROM 8110 AS one WHERE one.Document LIKE "*,*"
UNION
SELECT
one.ID,
IIF(Left(Trim(Mid(one.Document, InStr(one.Document, ",") + 1)), 1) = "/", Left(one.Document, InStrRev(Trim(Left(one.Document,InStr(one.Document,",")-1)), "/")) & Trim(Mid(one.Document, InStr(one.Document, ",") + 3)), Trim(Mid(one.Document, InStr(one.Document, ",") + 1))),
Trim(Mid(one.Revision, InStr(one.Revision, ",") + 1))
FROM 8110 AS one WHERE one.Document LIKE "*,*";
This is significantly faster than the current query, but it only works on the first two documents. Here’s what the output looks like:
ID | Document | Revision |
---|---|---|
8000C001 | 103A4000X001 | A |
8000C001 | 103A4000X002, 103A4000X003 | B, A |
8000C002 | 103A5000G002/1/01 | C |
8000C002 | 103A5000G002/1/02, /03 | D, B |
8000C003 | 103A6000B001 | A |
I could solve this by just adding additional unions, but I don’t know how many documents a record could contain. Does SQL have some form of recursion? In any other programming language I would just convert the documents into an array using the commas as a delimiter, then add a record for each element.
Also, I understand that the abbreviated files significantly complicate this query. You don’t have to account for that in your response; it’s my problem to deal with. Here’s variants of my queries that don’t account for the abbreviated files, and just split them by their commas:
SELECT
one.ID
two.Document,
two.Revision,
FROM [All Documents] AS one
RIGHT JOIN [Active Documents] AS two
ON (one.Document LIKE "*" & two.Document & "*"))
AND (one.Revision LIKE "*" & two.Revision & "*");
SELECT
one.ID
one.Document,
one.Revision
FROM 8110 AS one WHERE one.Document NOT like "*,*"
UNION
SELECT
one.ID,
Trim(Left(one.Document, InStr(one.Document, ",") - 1)),
Left(one.Rev, 1)
FROM 8110 AS one WHERE one.Document LIKE "*,*"
UNION
SELECT
one.ID,
Trim(Mid(one.Document, InStr(one.Document, ",") + 1)),
Trim(Mid(one.Revision, InStr(one.Revision, ",") + 1))
FROM 8110 AS one WHERE one.Document LIKE "*,*";
3
Maybe you can try with a helper table and a VBA function:
- Create a new table with sequence numbers 0,1,2,3 as rows (integer column ‘Seq’
): tblSeq, with max being maximum no of parts you expect, - Create a VBA function to extract the subpart of the string, given its position:
Public Function GetSubString(Str As String, Seq As Integer) As String
Dim strRetVal As String, CommaPos As Integer, i As Integer, CommaPosNext As Integer
strRetVal = ""
CommaPos = 0
For i = 1 To Seq
CommaPos = InStr(CommaPos + 1, Str, ",")
Next i
CommaPosNext = InStr(CommaPos + 1, Str, ",")
GetSubString = Mid(Str, CommaPos + 1, IIf(CommaPosNext = 0, 255, CommaPosNext - CommaPos - 1))
End Function
then your query might be like:
SELECT TBL.ID, TBL.Document, TBL.Revision, Seq.Seq, GetSubString(TBL.Revision,Seq.Seq) as Rev
from YourTbl as TBL
inner join
tblSeq Seq
on len(TBL.Revision)-len(replace(TBL.Revision,',',''))>=Seq.Seq
Not claiming that it would perform better than any other option, or error free.
You can use VBA and recordset to populate a new table:
Option Explicit
Private rsIN As Recordset
Private rsOUT As Recordset
Sub Fill_New()
Dim dbs As Database
Set dbs = CurrentDb
Set rsIN = dbs.OpenRecordset("8110", dbOpenForwardOnly, dbReadOnly)
Set rsOUT = dbs.OpenRecordset("NEW8110", , dbAppendOnly)
Dim i As Integer
Do Until rsIN.EOF
Dim documents As Variant
documents = Split(rsIN("Document"), ",")
If UBound(documents) = 0 Then ' single value in document field '
AddNew ID:=rsIN("ID"), Document:=rsIN("Document"), revision:=rsIN("Revision")
Else ' comma seperated values in document field '
Dim revisions As Variant
revisions = Split(rsIN("Revision"), ",")
If InStr(rsIN("Document"), "/") > 0 Then ' / used for revisions '
AddNew ID:=rsIN("ID"), Document:=Trim(documents(0)), revision:=Trim(revisions(0))
Dim last_slash_pos As Integer
last_slash_pos = InStrRev(documents(0), "/")
Dim base_text As String
base_text = Left(documents(0), last_slash_pos - 1)
For i = LBound(documents) + 1 To UBound(documents)
AddNew ID:=rsIN("ID"), Document:=base_text & Trim(documents(i)), revision:=Trim(revisions(i))
Next
Else ' comma seperated '
For i = LBound(documents) To UBound(documents)
AddNew ID:=rsIN("ID"), Document:=Trim(documents(i)), revision:=Trim(revisions(i))
Next
End If
End If
rsIN.MoveNext
Loop
rsOUT.Close
Set rsOUT = Nothing
rsIN.Close
Set rsIN = Nothing
Set dbs = Nothing
End Sub
Private Sub AddNew(ID As String, Document As String, revision As String)
rsOUT.AddNew
rsOUT("ID") = ID
rsOUT("Document") = Document
rsOUT("Revision") = revision
rsOUT.Update
End Sub