
Today's article brings us to an exciting (and, to me, really creepy) development in artificial intelligence: GPT-4 with vision, or GPT-4V for short. This new capability lets you upload images to ChatGPT and have it tell you all sorts of things about what it ‘sees’.
Why exciting? There are a lot of possible uses, once we develop the habit of going to it. I’m confident that what’s been written so far is just scratching the surface of what GPT-4V can offer.
Why creepy? Wait until you see how much it can figure out from a variety of images. Turn a hand-scribbled note into a categorized shopping list. Tell you how to fix something that’s broken. Give you the recipe for an appetizing-looking dish, just from a photo of the plate. I almost saved this for Halloween because it’s got me so freaked out!
But for today’s StrefaTECH tome, I’m going to stick to some rather pedestrian ideas for how GPT-4V could be of use in the nonprofit world.
Access and Basics
First, it's worth noting that GPT-4V is available only as part of the paid ChatGPT Plus plan. To use its image analysis capabilities on a computer, click GPT-4 at the top (to access the Plus features), and you'll find an image icon to the left of the prompt area. Clicking it lets you upload an image for analysis and interaction. Alternatively, download the ChatGPT app to your phone, switch to GPT-4, and either tap the image icon there or tap the camera to take a photo (which I think is going to be my go-to use of GPT-4V!).
You can upload multiple images within a chat conversation. It can be handy to tell ChatGPT how to refer to them; for example, upload two images and instruct it to refer to them as #1 and #2. It’ll figure out to refer to the first one you uploaded as #1 and the second as #2 (obviously!).
Then, you can go to town with questions about what’s in the images. ChatGPT seems to do well at figuring out what elements are included in an image: for example, a bike, a sign, a road, a stoplight. If you want to draw attention to something, you can circle it in an image editor before uploading and then simply ask ChatGPT about the circled item. And as with everything else in the GenAI chatbots, it remembers what you’ve already discussed in your chat, can generate new ideas or content, will answer questions and suggest ideas, and more.
I just learned something that I didn’t know before … you can do this in Bard now, too, and for free!
I asked Bard when this capability to upload and analyze images was added, and it replied that it was developed in July 2023 and released a few months later. Of course, Bard is known to be wrong / make things up / hallucinate (like the rest of the GenAI chatbots), so I don’t trust the veracity of that statement. If somebody decides to dig into whether that’s true, fill me in.
Sheesh, keeping up with the advances in AI is crazy hard! If you see something that I missed or that is out of date, please clue me in!
Relevance to Nonprofits
While GPT-4V has broad applicability, I’ll suggest a handful of ways that nonprofits might use this technology. This list is only a starting point; I’m very interested in more ideas that come to your minds!
Financial Documentation: By uploading pictures of receipts, you can ask about specific financial details, such as the amount of tax paid or how much of the total was for alcoholic beverages. This could streamline expense submission and accounting, the bane of everyone’s professional life!
Venue Analysis: Imagine you're considering locations for an upcoming event. You could upload an image (captured from Google Street View, for example), highlight a particular area, and then ask GPT-4V for information about that space. This could help you narrow down the options before scheduling site visits.
Content Summarization: Did someone hand you a long article or report that may (or may not!) be interesting? Upload images of the pages and ask GPT-4V to tell you what’s in it; because it’s in ChatGPT, you can have a conversation to explore whether it’s worth reading.
Language Translation: Organizations working with multilingual communities can upload images containing text and have GPT-4V translate it.
Picture / Graphic Selection: If you're debating between different images for an Impact Report, for example, GPT-4V can provide insights into which image may better align with your intended message. Then ask it why it has a particular perspective. You may or may not agree with its views, but having a partner handy to brainstorm with is priceless.
Audience Reaction: When devising marketing material or social posts, you could upload images and ask GPT-4V to speculate on potential public reaction. Too edgy? Something completely different than you expected? It’ll tell you, and you can explore why.
To give you a taste of what you might expect, here’s my dialogue with GPT-4V about a picture I took of the entrance to a local park.
A Reminder — Think Before You Upload!
What you upload to the internet may become part of a response to someone’s query down the road. The simplest behavior at this complicated, ever-changing time of data privacy and AI is to remember:
If you wouldn’t email it to the world, don’t upload it to AI!
The Bottom Line
Some of the new capabilities being released in the AI world may take a while to get a handle on. They seem cool, but until it’s shown how they might be helpful, the luster can fade quickly. Image analysis with GPT-4V or Bard seems to fall squarely in this category.
But … BUT! … there’s a lot of promise for novel ways that these kinds of capabilities can be time savers, creativity builders, and, dare I speculate, favorite tools.
If this is one that isn’t compelling at the get-go, don’t totally abandon the idea of finding use for it down the road!
As a note, just before publishing this article, I discovered that Google’s Bard has this capability of uploading and analyzing images as well. I submitted the same picture and prompt to it, and the response was very similar to what ChatGPT provided. “Table stakes” for these GenAI chatbots are growing by leaps and bounds!