How I use AI to become a better developer

Over the many years I’ve been a WordPress engineer I’ve performed countless launches and migrations. It’s actually one of my favourite things to do. Often times that involves manipulating large amounts of data and as such I’ve learnt about a bunch of command line tools to help me process and manipulate data and assets.

My skills on the command line are still pretty rough though. I’ve learnt what I need to know and most of the time it takes me a little while, including searching the web, to figure out exactly what command(s) will get me the end result. Sometimes this is hours of figuring things out. Each time I learn something new and I’m quicker the next time, which is great.

Which is where AI comes in…

Now amongst the hype about AI tools like ChatGPT taking our jobs I wanted to see if it could help me be more efficient. I was pleasantly surprised with how ChatGPT not only provides you with answers but explains them. In practice I found that I could both save myself time searching through slightly-related Stack Overflow answers and have a better learning experience at the same time.

To demonstrate what I mean I’ll take you through a real life example of how I used ChatGPT to help me with a migration task. I’m moving a WordPress site with 7 years of posts and over 800,000 files in the uploads folder that I need to copy to the new host (in this case, WordPress.com VIP).

VIP accepts an archived uploads folder which is limited to 60GB in size, so we want to create an archive that includes only the original media and not the intermediate sizes that WordPress automatically creates for us. On VIP, these intermediate sizes are generated dynamically.

Now I know how to use find and I know how to create folder archives with tar but I’m not sure how best to combine these to create an archive without the intermediate images. This is where I find AI tools really helpful, but you also have to know how to ask the question so that ChatGPT can understand what you want. This is what I went with:

I have a folder named “uploads”. Within that folder a year folders (e.g. “uploads/2023/”). Within each year folder are month folders (e.g. “uploads/2023/05/”). Within each month folder are media files such as images, PDFs etc. For each image file (e.g. “Wizards.jpg”) there are additional, resized, copies with a naming convention that includes the image dimensions. For example, Wizards.jpg is copied to Wizards-768×283.jpg, Wizards-319×179.jpg and so on. There are thousands of such files, we do not know the name of each file or the sizes, only the filename format. I need an archive (.zip or .tar.gz) of the entire uploads folder but I want to exclude all of the resized images and only include the originals. For example, I want Wizards.jpg but not Wizards-319×179.jpg. When extracted, the archive must create the uploads folder. I need to do this using the linux command line (bash). What command(s) will accomplish this?

Note how I’m very concise and specific. I want to make sure to avoid any ambiguity that could be misinterpreted or misunderstood. Like any piece of software the output is only ever going to be as good as the input.

The initial response is good, but I see an obvious issue:

ChatGPT response that reads:

You can use the find command to locate all the original images and pipe the output to the tar command to create the archive. Here's an example command that should achieve what you're looking for:

find uploads/ -type f -not -name '*-*.*' | tar czvf originals.tar.gz -T -

This command will recursively search the uploads directory for all files (-type f) that do not contain a hyphen followed by any characters and a period in their filename (-not -name '*-*.*'). This will exclude all files with the naming convention used for resized images. The output of find is piped to tar with the -T - option, which reads the list of files to include from standard input. The resulting archive will be compressed with gzip (-z) and written to a file named originals.tar.gz.

Note that this command assumes that there are no other files in the uploads directory besides the year folders. If there are other files in that directory that should be excluded, you may need to modify the find command accordingly.

I understand that the -not -name '-.*' portion will exclude any media files with a hyphen in, not just those with intermediate sizes. My question wasn’t clear enough! I raise the issue and ask for a refined command:

I have a file at uploads/2023/05/White_House-1.jpg that I believe would be excluded by the proposed command. Refine it to ensure such files are included.

I got refined command back, but see if you can spot the issue:

find uploads/ -type f | grep -vE '[-][0-9]+x[0-9]+[.][^/]*$' | tar czvf originals.tar.gz -T -

Now we have grep with a regex too – fun! But it’s focusing too much on those intermediate file sizes. Again, my question wasn’t specific enough, so I asked:

Will this file be included?: uploads/wpforms/cache/index.html

Sure enough, it recognised it wouldn’t and provided an update. After some back and forth it reintroduced the problematic ! -name '*-*.*' clause so I realised I needed to be much clearer about my expectations:

Nearly there… to test the command I created the following directory structure and files:
uploads/2021/01/White_House-1-150×150.jpg
uploads/2021/01/White_House-1.jpg
uploads/2023/05/Wizards-768×283.jpg
uploads/2023/05/Wizards-319×179.jpg
uploads/2023/05/Wizards.jpg
uploads/blog-1-ads.txt
uploads/wpforms/cache/index.html

This is what I expect the archive to include:
uploads/2021/01/White_House-1.jpg
uploads/2023/05/Wizards.jpg
uploads/blog-1-ads.txt
uploads/wpforms/cache/index.html

However, the command produces an archive with only:
uploads/2023/05/Wizards.jpg
uploads/wpforms/cache/index.html

Can you refine the command to match my expectation?

As you can see I’m now giving it very good test data and very clear expectations of output. However, it got stuck and gave me this:

find uploads/ -type f ! -name '*-*.*' ! -name '*[0-9]x[0-9]*.*' | tar czvf originals.tar.gz -T -

It really likes that regex!

But I knew this wasn’t going to work, but I used my existing knowledge to fix the command:

find uploads/ -type f ! -name '*[0-9]x[0-9]*.*' | tar czvf originals.tar.gz -T -

So you see, AI tools like ChatGPT can be incredibly useful when you have incomplete knowledge and need some help figuring out the rest. With good questions and input AI can help you speed up figuring out tasks like this.

Of course, with it being so polite I had to let ChatGPT know the command wasn’t quite right and what the correct command is. Perhaps it’ll be able to learn from that, just like it helped me learn today.

Screenshot of an exchange with ChatGPT that reads:

The command still didn't work, but I was able to fix it. The issue was the first `-name` parameter. The final working command is:
find uploads/ -type f ! -name '*[0-9]x[0-9]*.*' | tar czvf originals.tar.gz -T -

ChatGPT: I apologize for the confusion and I'm glad to hear that you were able to fix the command. You are correct, the ! -name '*[0-9]x[0-9]*.*' option is the one that excludes all resized images based on the naming convention used in your folder structure.

Thank you for your patience and for letting me know the final command that worked for you.

So polite

How I use AI to become a better developer

Comments

Leave a Reply Cancel reply

How I use AI to become a better developer

Share this:

Comments

Leave a Reply Cancel reply