I am trying to convert a batch of .doc documents to .docx. I am using the VBA script below, which works fine on its own.
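Sub TranslateDocIntoDocx()
    Dim objWordApplication As New Word.Application
    Dim objWordDocument As Word.Document
    Dim strFile As String
    Dim strFolder As String
    Dim destFolder As String

    strFolder = ""
    destFolder = ""
    strFile = Dir(strFolder & "*.doc", vbNormal)

    While strFile <> ""
        ' The "*.doc" pattern may also match .docx files, so only process plain .doc names
        If LCase(Right(strFile, 4)) = ".doc" Then
            With objWordApplication
                Set objWordDocument = .Documents.Open(FileName:=strFolder & strFile, AddToRecentFiles:=False, ReadOnly:=True, Visible:=False)
                With objWordDocument
                    ' Swap only the extension; Replace(strFile, "doc", "docx") would also change "doc" inside the base name
                    ' FileFormat 16 = wdFormatDocumentDefault (.docx)
                    .SaveAs FileName:=destFolder & Left(strFile, Len(strFile) - 4) & ".docx", FileFormat:=16
                    .Close
                End With
            End With
        End If
        strFile = Dir()
    Wend

    ' Quit the hidden Word instance so it does not keep running in the background
    objWordApplication.Quit
    Set objWordDocument = Nothing
    Set objWordApplication = Nothing
End Sub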
However, when I parse the XML of the resulting .docx files, I find many extraneous rsid tags breaking up words, which makes the parsing more complex. I am trying to understand how to remove them, but so far have not had any success. Below is a modified version of the above code that I tried; it did not remove the tags.
Sub TranslateDocIntoDocx()
    Dim objWordApplication As New Word.Application
    Dim objWordDocument As Word.Document
    Dim strFile As String
    Dim strFolder As String
    Dim destFolder As String

    strFolder = ""
    destFolder = ""
    strFile = Dir(strFolder & "*.doc", vbNormal)

    While strFile <> ""
        ' The "*.doc" pattern may also match .docx files, so only process plain .doc names
        If LCase(Right(strFile, 4)) = ".doc" Then
            With objWordApplication
                Set objWordDocument = .Documents.Open(FileName:=strFolder & strFile, AddToRecentFiles:=False, ReadOnly:=True, Visible:=False)
                With objWordDocument
                    'Remove revisions
                    .TrackRevisions = False
                    '.AcceptAllRevisionsShown
                    .RemoveDocumentInformation wdRDIDocumentProperties
                    .RemoveDocumentInformation wdRDIRevisions
                    .RemoveDocumentInformation wdRDIComments
                    .RemoveDocumentInformation wdRDIRemovePersonalInformation
                    'Turn off grammar and spell check options
                    .ShowGrammaticalErrors = False
                    .ShowSpellingErrors = False
                    .GrammarChecked = False
                    .SpellingChecked = False
                    ' Swap only the extension; Replace(strFile, "doc", "docx") would also change "doc" inside the base name
                    ' FileFormat 16 = wdFormatDocumentDefault (.docx)
                    .SaveAs FileName:=destFolder & Left(strFile, Len(strFile) - 4) & ".docx", FileFormat:=16
                    .Close
                End With
            End With
        End If
        strFile = Dir()
    Wend

    ' Quit the hidden Word instance so it does not keep running in the background
    objWordApplication.Quit
    Set objWordDocument = Nothing
    Set objWordApplication = Nothing
End Sub
In addition, I tried adding:
.StoreRSIDOnSave = False
But depending on where I placed it, it either raised an error or did not remove the rsid tags. I also tried accepting all revisions with either .AcceptAllRevisions or .AcceptAllRevisionsShown, but I got an error when I included them. Does anyone know of a VBA solution for this? My environment is fairly limited, so I am looking for a solution in either VBA or base Python.
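To make the parsing problem concrete, here is a minimal base-Python sketch of the kind of workaround I could fall back on. It does not remove the rsid markup; it only sidesteps it by joining the text of every w:t element inside each w:p paragraph, so a word split across several runs comes back whole. The function names read_document_xml and paragraph_texts are hypothetical, and the sketch assumes that only plain run text matters (tabs, line breaks, and paragraphs nested inside text boxes are not handled).

```python
import zipfile
import xml.etree.ElementTree as ET

# Main WordprocessingML namespace used by word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_document_xml(docx_path):
    """Return the raw bytes of word/document.xml from a .docx (a zip archive)."""
    with zipfile.ZipFile(docx_path) as archive:
        return archive.read("word/document.xml")


def paragraph_texts(document_xml):
    """Return one string per w:p, joining the text of every w:t inside it.

    Runs (w:r) that differ only in their rsid attributes are effectively
    merged, because the run boundaries are ignored when joining the text.
    """
    root = ET.fromstring(document_xml)
    texts = []
    for paragraph in root.iter("{%s}p" % W_NS):
        pieces = [t.text or "" for t in paragraph.iter("{%s}t" % W_NS)]
        texts.append("".join(pieces))
    return texts
```

Usage would be along the lines of paragraph_texts(read_document_xml("some_file.docx")), where the file name is a placeholder. I would still prefer to stop Word from writing the rsid markup in the first place.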