I have to use a ton of regex in my new job (plz save me), and I use ChatGPT for all of it. My job would be 10x harder if it wasn’t for ChatGPT. It provides extremely detailed examples and warns you of situations where the regex may not perform as expected. Seriously, try it out.
Just make sure to test the regex instead of blindly slapping it in assuming it works 🙂
What if I say “it’s probably okay just this one time” before I do it every time?
Ah I’ve tested this method, shit breaks a lot. Still my go-to.
Can we just have another LLM check the work for us? Like an LLM-GAN?
I’m not sure if it’s still the case, but asking it to review what it just wrote for errors has led to significant quality improvements previously.
The new Code Interpreter plugin that went live this week for Plus users can actually execute Python code in a sandboxed environment. This allows you to add “Write and execute tests for the regex” to the end of your prompt.
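For anyone without the plugin, the same idea can be run locally. A minimal sketch, assuming a made-up pattern (a YYYY-MM-DD date shape) and made-up cases, none of which come from this thread:

```python
import re

# Hypothetical pattern under test: a date written as YYYY-MM-DD.
# It only checks the shape, not whether the date exists on a calendar.
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")

# (input, should_match) pairs; fullmatch() requires the whole string to match.
CASES = [
    ("2023-07-14", True),
    ("2023-7-14", False),         # month is not zero-padded
    ("2023-07-14T10:00", False),  # trailing time component
    ("14-07-2023", False),        # wrong field order
]

def run_cases(pattern, cases):
    """Return the inputs whose match result differs from the expectation."""
    return [text for text, expected in cases
            if (pattern.fullmatch(text) is not None) != expected]

failures = run_cases(DATE_RE, CASES)
print("failures:", failures)
```

An empty list means every case behaved as expected; anything listed is an input to paste back into the chat.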
Regex101 is a sandbox env specifically for Regex
It’s not just for writing and testing samples; it will also explain the parts of the regex.
However, it won’t generate examples that will pass the regex - which may be the biggest benefit of ChatGPT.
This is the way. Everything ChatGPT produces for me gets tested and debugged here.
This is where I go to validate the work of ChatGPT. The debugging capabilities in that site are wonderful.
I’ve tried it and found it wanting at regex and excel formulas, but I’m glad to hear it’s working for you! Are you using 4? I haven’t tried that one and I hear it’s better.
I typically try 3.5 first and switch to 4 if the results aren’t great. 3.5 typically handles basic use cases quite well, for example, writing regex that detects jira ticket naming nomenclature. For more complex things, I go to 4.
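To give an idea of what a “basic use case” looks like, here is a rough sketch of a Jira key matcher. The naming convention (uppercase project key, hyphen, number) is an assumption on my part; your instance may allow other key formats.

```python
import re

# Assumed convention: an uppercase project key (a letter first, then letters
# or digits), a hyphen, and a numeric issue id, e.g. "PROJ-123".
JIRA_KEY_RE = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")

def find_ticket_keys(text):
    """Return every substring that looks like a Jira issue key."""
    return JIRA_KEY_RE.findall(text)

message = "Fixes PROJ-123 and OPS2-7, see also proj-9 and PROJ-."
print(find_ticket_keys(message))  # ['PROJ-123', 'OPS2-7']
```

The lowercase `proj-9` and the incomplete `PROJ-.` are skipped on purpose; add `re.IGNORECASE` if your team writes keys in lowercase.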
It sometimes gets things wrong, but I’ve also found that just saying “that didn’t work” gets it to reevaluate for more complex situations
It helps if you hold ChatGPT’s hand and walk it through what you need. For example, if you have a regex with 3 requirements, ask it to write a regex for the first requirement, then ask it to modify the previous output to add another requirement, and so on. That way you can sort of “audit” it as it generates the correct regex.
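The auditing part can be scripted so every intermediate regex gets checked. This is only a sketch: the three username requirements, the patterns, and the `audit` helper are all invented for illustration.

```python
import re

# Hypothetical requirements for a username, added one at a time:
#   1. only ASCII letters, digits, and underscores
#   2. must start with a letter
#   3. total length between 3 and 16 characters
# Each step pairs the regex produced so far with the cases that step adds.
STEPS = [
    (r"[A-Za-z0-9_]+",              {"ab": True, "a-b": False}),
    (r"[A-Za-z][A-Za-z0-9_]*",      {"1ab": False}),
    (r"[A-Za-z][A-Za-z0-9_]{2,15}", {"abc": True, "ab": False, "a" * 17: False}),
]

def audit(steps):
    """Run each step's regex against all cases seen so far; return failures."""
    failures = []
    expectations = {}
    for number, (pattern, cases) in enumerate(steps, start=1):
        expectations.update(cases)  # a later step may overrule an earlier case
        compiled = re.compile(pattern)
        for text, expected in expectations.items():
            if (compiled.fullmatch(text) is not None) != expected:
                failures.append((number, text))
    return failures

print(audit(STEPS))  # [] means every step passed its cases
```

Keeping the earlier cases around is the point: it catches the situation where step 3 quietly breaks something step 1 got right.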
there is some more discussion of this in a similar post from a few days ago.
So I was trying to write a regex for use with my ChatGPT discord bot. I wanted to trim off any final partial sentence at the end. I went around and around with it for a couple of hours because look ahead and look behind are just not something I do often enough.
It kept writing more and more complicated regex that didn’t work. The final solution, while not exactly perfect - it won’t keep a quote at the end of a sentence, and honorifics like Mr. and Dr. throw it - it wasn’t nearly as complicated as ChatGPT was making it. It still never did give me anything working - I just fucked around on regex101 until I got it right. As usual but having wasted 90 minutes or so.
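Not the regex from the comment above (that one wasn’t posted), but a simple sketch of the same idea that needs no lookarounds: greedily match up to the last sentence terminator and keep only that. The function name is made up, and it shows the same two weaknesses mentioned above.

```python
import re

# Greedy match from the start up to the LAST ".", "!" or "?" in the text.
# re.DOTALL lets "." cross newlines so multi-line replies are handled.
COMPLETE_PREFIX_RE = re.compile(r".*[.!?]", re.DOTALL)

def trim_partial_sentence(text):
    """Drop whatever follows the last terminator; leave text without one as-is."""
    match = COMPLETE_PREFIX_RE.match(text)
    return match.group(0) if match else text

print(trim_partial_sentence("It works. Mostly! But then it sta"))  # It works. Mostly!
print(trim_partial_sentence("I asked Dr. Smi"))                    # I asked Dr.
print(trim_partial_sentence('She said "stop." And th'))            # She said "stop.
```

The second call shows the honorific problem and the third shows the lost closing quote.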
I’ve found that you need to be very careful when asking it to modify things it produced directly without making significant changes to the regex it provides. Once I get to the 3rd or 4th iteration of asking it to modify previous responses, I’ve found that the likelihood of it hallucinating increases dramatically. The best solution I’ve found is to put your entire request in a single prompt that walks it through all requirements step-by-step.
You can improve the reliability if you provide it test cases. You can now be the PM you wish you had for the robot that will eventually replace you.
I hated everything about this comment, thanks.
I agree, my regex experience was not great.
Also curious. If I had some AI help with regex that would be awesome. But I felt as you said it wouldn’t work great without 4. Which I don’t have.
If you think regex is the hard part of programming, then you’re in for a bad time.
I often need to deal with half a dozen different programming languages in any day/week and the context switching can be difficult at times. When you’ve spent all day switching between JavaScript, Python, and YAML and suddenly need to draft some Regex, tools like ChatGPT can help immensely at reducing the mental burden of switching gears.
The syntax of regular regexes is the same across languages though. It’s just the regex library which is different, but so is every other library between languages.
If the project is less than a thousand lines of code in a language with a garbage collector, it probably is. Most other problems don’t require learning a DSL to handle them, and most other DSLs aren’t nearly as terse.
“I have this problem. I know what I’ll do! I’ll use regex to fix it!”
Uses regex.
“Yay problem is now fixed it works!”
Now has 2 problems.
totally!!!
Make sure to ask ChatGPT “are you sure?”
Most of the time it will apologize, and admit that it made a mistake. It keeps messing up my config files unless I demand that it double check its work and to ask for details if it lacks information.
Thanks for this post, I use regex often and did not know GPT would be good at this…
That’s the problem. It will confidently give you a correct-sounding answer.
Whether it is actually true is a different topic. So don’t just blindly trust it. Verify, or at least sanity check it.
This this this!!! I know this is a post from the place that shall not be named, but it just showcases the issues with ChatGPT (this is from when GPT4 was just released)
I have yet to see a regex that is so complicated that I would need some help. I expect programmers to know how to use regexes but it seems that it’s not the case. And when it becomes too big, you always can write verbose regexes with comments, it’s even easier. If someone could show me something too difficult for a human being (excluding the regex to validate emails), I’m interested.
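For those who haven’t seen verbose regexes, here is a small sketch using Python’s `re.VERBOSE` flag. The HH:MM[:SS] time format is just an example I picked, not something from this thread.

```python
import re

# With re.VERBOSE, unescaped whitespace in the pattern is ignored and
# "#" starts a comment, so each part can be documented inline.
TIME_RE = re.compile(
    r"""
    (?P<hour>   [01]\d | 2[0-3] )   # hours 00-23
    :
    (?P<minute> [0-5]\d )           # minutes 00-59
    (?: : (?P<second> [0-5]\d ) )?  # optional seconds 00-59
    """,
    re.VERBOSE,
)

match = TIME_RE.fullmatch("23:59:07")
print(match.groupdict())          # {'hour': '23', 'minute': '59', 'second': '07'}
print(TIME_RE.fullmatch("24:00")) # None
```

Named groups plus comments go a long way toward making the pattern readable months later.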
Regex isn’t difficult, just annoying to ensure it is bug-free. If ChatGPT can help, then I don’t know why you wouldn’t be in favor of it
It’s not that I’m incapable of evaluating regex, but rather the mental burden of evaluating complex regex statements and determining their purpose can be time-consuming. Why take 20 minutes to understand some regex when ChatGPT can do it in 20 seconds?
A coworker once defined regex as a write-only language and he definitely had a point. I love regex but it can be time consuming figuring out exactly what a complex regex expression is doing.
It’s often developers who never took a finite automata class who I’ve seen struggle with regular expressions.
It’s kind of like writing code in C while not understanding how memory management works
Huh. That class looked hard as hell, I didn’t take it, and now I’m 2 years out of school still googling regex every time I need it.
Maybe I should do some reading 😅
It was mandatory. I’m glad I took it, but I’m glad it’s over 😂😂😂
Just look up how finite automata work. You don’t need to understand Turing machines or Turing completeness.
You can also ask it to write VBA code for Excel, or Jira queries.
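If it helps, the core idea fits in a few lines. This is a toy sketch, not how any real regex engine is implemented: a hand-written state machine that accepts the same strings as the regex `a+b`, with state names I made up.

```python
import re

# Transition table for a DFA equivalent to the regex a+b.
# A missing (state, character) entry means the input is rejected.
TRANSITIONS = {
    ("start", "a"): "seen_a",
    ("seen_a", "a"): "seen_a",
    ("seen_a", "b"): "done",
}
ACCEPTING = {"done"}

def dfa_accepts(text):
    """Walk the table one character at a time; accept only if we end in 'done'."""
    state = "start"
    for char in text:
        state = TRANSITIONS.get((state, char))
        if state is None:
            return False
    return state in ACCEPTING

# Compare the hand-built machine with the regex on a few samples.
for sample in ["ab", "aaab", "b", "aba", ""]:
    agrees = dfa_accepts(sample) == (re.fullmatch(r"a+b", sample) is not None)
    print(repr(sample), dfa_accepts(sample), "agrees" if agrees else "DIFFERS")
```

Once you see a pattern as “which state am I in after each character”, a lot of regex behavior stops feeling like magic.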
Still a bit new to Jira, what are Jira queries?
Typically called JQL, it’s a simple query language to find info. For example, there’s a simple query to find epics with a particular affects version and/or fix version, or return epics that are missing information in a particular field.
The default or basic Jira can’t do some things though. Like I haven’t been able to get the total number of story points from issues within an epic. I think you need a 3rd party plugin for that.
That’s nice to know, hopefully I can bring that up during our sprint planning sessions when necessary.
My biggest problem with it has been that it doesn’t necessarily understand that some things are impossible - for example, variable-length lookbehinds.
A variable length lookbehind is the same as the opposite of a variable length lookahead.
That depends on the regex flavor. Some of them have full support for variable length lookbehinds, for example JavaScript and the third-party regex module for Python.
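A quick way to see the flavor difference in Python. This is a sketch: the `USD` pattern is an invented example, and the last part only runs if the third-party `regex` package happens to be installed.

```python
import re

def compiles_with_stdlib(pattern):
    """Return True if Python's built-in re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

print(compiles_with_stdlib(r"(?<=USD )\d+"))    # True: fixed-width lookbehind
print(compiles_with_stdlib(r"(?<=USD\s+)\d+"))  # False: variable-width lookbehind

try:
    import regex  # third-party package; may not be installed
except ImportError:
    regex = None

if regex is not None:
    # The same variable-width pattern is accepted by the regex module.
    print(regex.search(r"(?<=USD\s+)\d+", "USD   42").group(0))
```

So when ChatGPT hands you a variable-length lookbehind, whether it is “impossible” depends on which engine you paste it into.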
Wait, you guys don’t use AI to make regex?
I use regex101.com
Up to now that was usually faster than trying to get ChatGPT to generate something worthwhile. However, if you define some test cases first, the combination of both will even get the sales guy there eventually.
Ugh god it’s been a shit day with sales, let’s not bring them up. The turds.
I tried it and naaah it’s not that great. Keeps giving a rule for sample text too, despite really making it clear that I want a more general one.