I’m trying to fine-tune a regex to quickly check whether Subtitle Edit’s auto-break feature has produced any odd-looking line breaks, syntactically or semantically speaking. I have put together an extensive “do not break after” list over the years, so it mostly behaves, but it never hurts to double-check.
As it stands, I’m using this:
[a-z]+(?:n[a-z]+)*$
As you can see in the screenshot, it gets the job done, and it even skips line breaks that come after any sort of punctuation. Still, how do I go about matching just the word “charge” in this example while ignoring the likes of “call” and “look”? There are technically two line breaks between each caption group, so I suppose that’s the one thing that can be leveraged to tell them apart. I don’t need either “call” or “look” to be flagged in this example because I manually go over that side of things while captioning, splitting lines at semantically logical places. So I basically only need to match the final word of every two-line sub, provided there’s no punctuation at the end of it, and, as I’ve said, what I’m currently using already takes care of the punctuation part. I’ve looked far and wide here but haven’t found a solution to this exact issue. Any advice? Thanks in advance.
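To make the requirement concrete, here is a minimal Python sketch of the behaviour described above. It is only an illustration under stated assumptions: the sample captions are made up (they are not the ones in the screenshot), caption groups are assumed to be separated by exactly one blank line, line breaks are assumed to be `\n` rather than `\r\n`, and `unpunctuated_final_words` is a hypothetical helper name. Subtitle Edit runs on .NET, whose regex flavour differs from Python's, so the pattern may need adapting there.

```python
import re

# Hypothetical sample: caption groups separated by one blank line,
# with "\n" line breaks (an assumption, not taken from the screenshot).
SAMPLE = (
    "I'll give you a call\n"
    "once I'm back.\n"
    "\n"
    "Take a good look\n"
    "at who is in charge\n"
    "\n"
    "Understood.\n"
)

FINAL_WORD = re.compile(
    r"^[^\n]+\n"          # a non-empty line that is followed by another line
    r"[^\n]*?\b([a-z]+)"  # the closing line, capturing its last lowercase word
    r"(?=\n\n|\n?\Z)",    # which must sit right before a blank line or the end of the text
    re.MULTILINE,
)


def unpunctuated_final_words(text):
    """Return the last word of each multi-line group that ends without punctuation."""
    return FINAL_WORD.findall(text)


print(unpunctuated_final_words(SAMPLE))  # ['charge']
```

The lookahead is what relies on the double line break: “call” and “look” are each followed by a single `\n`, so they are skipped, while a closing line ending in punctuation fails because the captured word must touch the blank line directly. One-line subs are skipped as well, since the pattern requires a preceding line in the same group.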
Screenshot