GPT, the large language model developed by OpenAI, received a lot of media attention in early 2023 for its remarkable capabilities in natural language processing. With its advanced algorithms and massive training capabilities, GPT has the ability to generate text that reads as if it had been created by a human. It can complete tasks and even create new content.
Now the capability of GPT has been vastly extended with long anticipated functionality: the ability to generate text based on an image. It can actually understand the image and generate a description, keywords and landmarks to name just a few. This development unlocks a new realm of possibilities for digital asset management.
Enhanced Metadata Creation
GPT's ability to analyze images alongside text enables more comprehensive metadata generation. It can automatically tag and describe images based on their content, improving searchability and categorization without relying solely on manually entered metadata. It's as though someone else is completing some of your metadeata for you - you just need to scan over it to make sure you're happy.
Facilitation of Rights Management:
GPT's ability to recognize objects, scenes, or even text within images can assist in rights management by identifying copyrighted material or trademarks. This capability can be used to help ensure compliance and avoid unintentional misuse of assets.
Example: Using GPT to process an image
Example: Using GPT to process text
To use the plugin you must have an account and API key from OpenAI. For testing purposes free trial credentials may be used, but be aware that such accounts will have strict usage limits and at the moment don't allow images to be processed. To process images you'll need to add some credit to your account. Once you've logged in to your Open AI account it's Settings>Billing>Add payment details. Pricing can be found here.
- Enable the plugin by navigating to Admin->System->Plugins and clicking on 'Activate' next to 'OpenAI API GPT integration' - under the Asset Processing section
- Click on 'Options', add your OpenAI API key to the plugin setup page and click on 'Save configuration'
- In order for GPT to process an image you have to choose the right API model. At the moment only the 'gpt-4-vision-preview' model is available.
WARNING: It is strongly advised to not change any of the other plugin options here.
Configuring metadata fields
You need to give the plugin instructions in the field where you want it to generate metadata. You also need to select the input so it knows where to get its information from. This can either be the image, or text in another field. ResourceSpace will then pass the prompt along with the data to GPT and populate the output field with the response. This will be updated whenever the value in the source field changes.
- Decide on the metadata field that you want to use for the output. Note that category tree and date fields can't be configured.
- Navigate to the metadata field configuration page and click on the advanced section.
- Scroll down and under the 'GPT Prompt' section, enter the prompt in natural language, describing what you want GPT to do e.g. 'Create a description of this image' or 'List no more than 10 keywords for the purposes of metadata indexing and searching.'
- Under 'GPT Input Field', select either the image ('Image: Preview image') or the field that contains the source data.
- Click the Save button
Using the plugin
Once the fields have been configured that's all there is to it. When there's data in the source field (either an image or text in another metadata field), ResourceSpace will send this data to GPT, along with the prompt instructing it what to do. The response will then be saved into the configured field. Watch the videos above to see this in action.
Possible uses of the plugin
The integration of GPT into ResourceSpace provides lots of possible ways to help you improve and automate metadata processing tasks. A few examples include:
- Automated image tagging and description: ResourceSpace can now be configured to automatically tag and describe images based on their content, improving searchability and categorization.
- Facilitation of Rights Management: ResourceSpace can now be configured to recognize objects, scenes, or even text within images which can assist in rights management by identifying copyrighted material or trademarks. This capability helps ResourceSpace users ensure compliance and avoid unintentional misuse of assets.
- Automated title/summary generation: ResourceSpace can now be configured to generate concise and descriptive titles and summaries from large blocks of text, like a PDF file, subtitles from a video file, or pulled in from a Collection Management System or Product Information Management system.
- Keyword extraction: ResourceSpace can now automatically extract meaningful keywords from a block of text, providing a list of relevant keywords for searching, categorization, and more. The list of keywords can be made more specific based on the prompt provided.
- Automatic categorization: GPT integration can be used to automatically categorize digital assets based on their content, saving time and ensuring consistent and accurate categorization. Automated categorization can also generate custom reports and statistics for valuable insights into the types and distribution of digital assets.
Populating existing resources
If you already have resources with input data when you enable the plugin you can run the process_existing.php script to update the target field. These steps need to be performed on the ResourceSpace server by a system administrator.
- On the server, navigate to the ResourceSpace web root and then into 'plugins/openai_gpt/pages/'
- Run the process_existing.php script as below
php process_existing.php [OPTIONS...] OPTIONS SUMMARY --help Display help text and exit -c --collection Collection ID. Only resources in the specified collections will be updated -f --field ID of metadata field to update -o --overwrite Overwrite existing data in the field? Note that if overwrite is enabled and the input field contains no data the target field will be cleared. False by default EXAMPLES php process_existing.php --field=18 --collection=56 php process_existing.php --field=18 --collection=56 --overwrite php process_existing.php -f 18 -c56 -c77 php process_existing.php -f 18 -c56 -c77 -o
Tips and troubleshooting
- Text fields being populated with JSON: If you have configured a text field to use the GPT response and the input field is of a fixed list type you may see the text output being formatted as JSON e.g. ["warship", "battle", "invasion"]. To fix this simply indicate in your prompt how you want the output to be returned - i.e. instead of 'Provide 3 keywords' you could write 'Provide 3 keywords, returned as a comma separated list'. This will mean your data gets returned in a more readable form i.e. 'warship, battle, invasion'. You can even say 'make sure there's no JSON in the output'.