OpenAI GPT

Introduction

GPT, the large language model developed by OpenAI, received a lot of media attention in early 2023 for its remarkable capabilities in natural language processing. With its advanced algorithms and massive training capabilities, GPT has the ability to generate text that reads as if it had been created by a human. It can complete tasks and even create new content.

Now the capability of GPT has been vastly extended with long anticipated functionality: the ability to generate text based on an image. It can actually understand the image and generate a description, keywords and landmarks to name just a few. This development unlocks a new realm of possibilities for digital asset management.

Enhanced Metadata Creation

GPT's ability to analyze images alongside text enables more comprehensive metadata generation. It can automatically tag and describe images based on their content, improving searchability and categorization without relying solely on manually entered metadata. It's as though someone else is completing some of your metadeata for you - you just need to scan over it to make sure you're happy. 

Facilitation of Rights Management:

GPT's ability to recognize objects, scenes, or even text within images can assist in rights management by identifying copyrighted material or trademarks. This capability can be used to help ensure compliance and avoid unintentional misuse of assets.

Example: Using GPT to process an image

To view this video please enable JavaScript, and consider upgrading to a web browser that supports HTML5 video

Example: Using GPT to process text

Requirements

To use the plugin you must have an account and API key from OpenAI. For testing purposes free trial credentials may be used, but be aware that such accounts will have strict usage limits and at the moment don't allow images to be processed. To process images you'll need to add some credit to your account. Once you've logged in to your Open AI account it's Settings>Billing>Add payment details. Pricing can be found here

Initial configuration

  1. Enable the plugin by navigating to Admin->System->Plugins and clicking on 'Activate' next to 'OpenAI API GPT integration' - under the Asset Processing section
  2. Click on 'Options', add your OpenAI API key to the plugin setup page and click on 'Save configuration'
  3. In order for GPT to process an image you have to choose the right API model. At the moment only the 'gpt-4-vision-preview' model is available.

WARNING: It is strongly advised to not change any of the other plugin options here.

Configuring metadata fields

You need to give the plugin instructions in the field where you want it to generate metadata. You also need to select the input so it knows where to get its information from. This can either be the image, or text in another field. ResourceSpace will then pass the prompt along with the data to GPT and populate the output field with the response. This will be updated whenever the value in the source field changes.

    1. Decide on the metadata field that you want to use for the output. Note that category tree and date fields can't be configured.
    2. Navigate to the metadata field configuration page and click on the advanced section.
    3. Scroll down and under the 'GPT Prompt' section, enter the prompt in natural language, describing what you want GPT to do e.g. 'Create a description of this image' or 'List no more than 10 keywords for the purposes of metadata indexing and searching.'
    4. Under 'GPT Input Field', select either the image ('Image: Preview image') or the field that contains the source data.
    5. Click the Save button

Open AI Preview Image Input

 

OpenAI gpt metadata field 2

Using the plugin

Once the fields have been configured that's all there is to it. When there's data in the source field (either an image or text in another metadata field), ResourceSpace will send this data to GPT, along with the prompt instructing it what to do. The response will then be saved into the configured field. Watch the videos above to see this in action.

Possible uses of the plugin

The integration of GPT into ResourceSpace provides lots of possible ways to help you improve and automate metadata processing tasks. A few examples include:

  • Automated image tagging and description: ResourceSpace can now be configured to automatically tag and describe images based on their content, improving searchability and categorization.
  • Facilitation of Rights Management: ResourceSpace can now be configured to recognize objects, scenes, or even text within images which can assist in rights management by identifying copyrighted material or trademarks. This capability helps ResourceSpace users ensure compliance and avoid unintentional misuse of assets.
  • Automated title/summary generation: ResourceSpace can now be configured to generate concise and descriptive titles and summaries from large blocks of text, like a PDF file, subtitles from a video file, or pulled in from a Collection Management System or Product Information Management system.
  • Keyword extraction: ResourceSpace can now automatically extract meaningful keywords from a block of text, providing a list of relevant keywords for searching, categorization, and more. The list of keywords can be made more specific based on the prompt provided.
  • Automatic categorization: GPT integration can be used to automatically categorize digital assets based on their content, saving time and ensuring consistent and accurate categorization. Automated categorization can also generate custom reports and statistics for valuable insights into the types and distribution of digital assets.

Populating existing resources

If you already have resources with input data when you enable the plugin you can run the process_existing.php script to update the target field. These steps need to be performed on the ResourceSpace server by a system administrator.

  1. On the server, navigate to the ResourceSpace web root and then into 'plugins/openai_gpt/pages/'
  2. Run the process_existing.php script as below
    php process_existing.php [OPTIONS...]
    
    OPTIONS SUMMARY
    
    --help          Display help text and exit
    
    -c
    --collection    Collection ID. Only resources in the specified collections
                    will be updated
    
    -f
    --field         ID of metadata field to update
    
    -o
    --overwrite     Overwrite existing data in the field? Note that if overwrite 
                    is enabled and the input field contains no data the target 
                    field will be cleared. False by default
    
    EXAMPLES
    
    php process_existing.php --field=18 --collection=56
    php process_existing.php --field=18 --collection=56 --overwrite
    php process_existing.php -f 18 -c56 -c77
    php process_existing.php -f 18 -c56 -c77 -o
    

Tips and troubleshooting

  • Text fields being populated with JSON: If you have configured a text field to use the GPT response and the input field is of a fixed list type you may see the text output being formatted as JSON e.g. ["warship", "battle", "invasion"]. To fix this simply indicate in your prompt how you want the output to be returned - i.e. instead of 'Provide 3 keywords' you could write 'Provide 3 keywords, returned as a comma separated list'. This will mean your data gets returned in a more readable form i.e. 'warship, battle, invasion'. You can even say 'make sure there's no JSON in the output'.