Enhancing Multimodal Query Suggestions with Multi-Agent Reinforcement Learning

Explore how multi-agent reinforcement learning from human feedback enhances multimodal query suggestion. The pipeline generates candidate suggestions from query images, assigns labels through GPT models and human annotation, and trains agents for intentionality and diversity. The experimental setup covers both generation-based and retrieval-based search engines, with ground-truth annotations and evaluation metrics.

  • Multimodal Query Suggestions
  • Reinforcement Learning
  • Human Feedback
  • Search Engines
  • Experimental Setup

Presentation Transcript


  1. Multimodal Query Suggestion with Multi-Agent Reinforcement Learning from Human Feedback. Zheng Wang, Bingzheng Gan, Wei Shi. Huawei Singapore Research Center.

  2. Query Suggestion: TQS, VQS, and MMQS. Multimodal Query Suggestion (MMQS): the input is a query image and the output is a set of textual suggestions. Key characteristics: intentionality and diversity.

  3. RL4Sugg Pipeline: an RLHF-based framework. Input: image; output: suggestions.
     Step-1: Data Labeling (GPT-assisted). Each image-suggestion pair receives a 0/1 label for the user's search intent; a pair is labeled by GPT when its confidence exceeds a threshold, and by human annotators otherwise.
     Step-2: RewardNet (learning from human feedback). Produces a 0-1 score for image-suggestion pairs via image-suggestion representation learning.
     Step-3: PolicyNet. Image-to-suggestion generative learning based on an LLM (sugg-adapter).
     Step-4: Diversity Enhancement. Priority-based clustering: cluster centers are selected as the final suggestions, and centers are formed based on ranking priorities.
     Agent-I (RewardNet & PolicyNet) targets intentionality; Agent-D targets diversity.
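
The diversity step can be read as a greedy, priority-based clustering over candidate suggestions. The sketch below is an illustrative implementation under that reading, not the authors' exact algorithm; the embedding inputs, the cosine-similarity threshold, and the function name priority_clustering are assumptions.

```python
# Illustrative sketch of priority-based clustering (not the paper's exact code).
# Assumption: each candidate suggestion has an embedding and a RewardNet score;
# suggestions are visited in descending score order (the "ranking priority"),
# and a suggestion opens a new cluster (becomes a center, i.e. a final
# suggestion) only if it is not too similar to an existing center.
import numpy as np

def priority_clustering(embeddings: np.ndarray,
                        scores: np.ndarray,
                        sim_threshold: float = 0.8,
                        max_suggestions: int = 5) -> list[int]:
    """Return indices of suggestions selected as cluster centers."""
    # Normalize embeddings so dot products are cosine similarities.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)

    centers: list[int] = []
    for idx in np.argsort(-scores):           # highest reward first
        if len(centers) >= max_suggestions:
            break
        if not centers:
            centers.append(int(idx))
            continue
        sims = unit[centers] @ unit[idx]       # similarity to existing centers
        if sims.max() < sim_threshold:         # far enough -> new cluster center
            centers.append(int(idx))
    return centers

# Toy usage: 4 candidates, two near-duplicates; the duplicate is absorbed.
emb = np.array([[1.0, 0.0], [0.99, 0.05], [0.0, 1.0], [0.6, 0.8]])
print(priority_clustering(emb, np.array([0.9, 0.8, 0.7, 0.6])))
```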

  4. Data Collection. Step 1: GPT-4 generates multiple candidate suggestions from a query image. Step 2: The model assigns a label (1 or 0) to each suggestion, indicating user click intent, together with a confidence score between 0 and 1. Step 3: Suggestions whose confidence falls below a threshold (e.g., 0.5) are routed to human annotation, which produces their final labels.
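
As a concrete reading of these three steps, the sketch below routes each GPT-labeled suggestion either to the accepted set or to a human-annotation queue based on the confidence threshold. The gpt_label_suggestion call is a placeholder for an actual GPT-4 labeling prompt, and the 0.5 threshold simply mirrors the example on the slide.

```python
# Sketch of the confidence-gated labeling flow described on this slide.
# `gpt_label_suggestion` is a stand-in for a real GPT-4 call that returns a
# click-intent label (0/1) and a confidence in [0, 1]; here it is stubbed out.
from dataclasses import dataclass

@dataclass
class LabeledPair:
    image_id: str
    suggestion: str
    label: int          # 1 = matches user click intent, 0 = does not
    confidence: float
    source: str         # "gpt" or "human"

def gpt_label_suggestion(image_id: str, suggestion: str) -> tuple[int, float]:
    """Placeholder for the GPT-4 labeling prompt (returns dummy values here)."""
    return 1, 0.9

def collect_labels(image_id: str, suggestions: list[str],
                   threshold: float = 0.5):
    accepted, human_queue = [], []
    for sugg in suggestions:
        label, conf = gpt_label_suggestion(image_id, sugg)
        if conf >= threshold:
            # Confident GPT label: keep it directly.
            accepted.append(LabeledPair(image_id, sugg, label, conf, "gpt"))
        else:
            # Low confidence: defer to human annotation for the final label.
            human_queue.append((image_id, sugg))
    return accepted, human_queue
```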

  5. Training Overview of Agent-I and Agent-D. Multi-Agent Reinforcement Learning from Human Feedback: Agent-I (RewardNet & PolicyNet) handles intentionality; Agent-D handles diversity.
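
The slide does not spell out the training objective, so the block below is only a minimal PyTorch sketch of one plausible setup: a reward head scores an image-suggestion embedding pair and is trained with binary cross-entropy against the 0/1 intent labels from the data-collection step. The network shape, embedding dimension, and variable names are all assumptions.

```python
# Minimal sketch of a RewardNet-style scorer (assumed architecture, not the
# paper's): it maps a concatenated image/suggestion embedding pair to a 0-1
# intentionality score and is trained with binary cross-entropy on the
# 0/1 labels produced in the data-collection step.
import torch
import torch.nn as nn

class RewardNet(nn.Module):
    def __init__(self, dim: int = 512):
        super().__init__()
        self.scorer = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(),
            nn.Linear(dim, 1),
        )

    def forward(self, img_emb: torch.Tensor, sugg_emb: torch.Tensor) -> torch.Tensor:
        # Returns a score in (0, 1) per image-suggestion pair.
        pair = torch.cat([img_emb, sugg_emb], dim=-1)
        return torch.sigmoid(self.scorer(pair)).squeeze(-1)

# Toy training step on random embeddings standing in for real image/text encoders.
model = RewardNet()
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
img, sugg = torch.randn(8, 512), torch.randn(8, 512)
labels = torch.randint(0, 2, (8,)).float()          # 0/1 intent labels
loss = nn.functional.binary_cross_entropy(model(img, sugg), labels)
opt.zero_grad(); loss.backward(); opt.step()
```

In an RLHF stage of this kind, scores from such a reward model would then serve as the reward signal for the suggestions generated by PolicyNet.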

  6. Experimental Setup. Applications: generation-based and retrieval-based search engines.
     Generation-based: suggestions are generated by LLMs.
     Retrieval-based: a suggestion database is constructed for domain-specific scenarios (fashion, sports, animal, shopping); offers high efficiency.
     Ground truth & evaluation: human annotations of search intent.
     Baselines: techniques in vision-language pretraining models.
     Metrics: generation uses DCG/NDCG and GSB; retrieval uses PNR and Recall@K.
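
For reference, the two ranking measures listed here have standard definitions and can be computed as below; the NDCG sketch uses the usual log2 position discount, and Recall@K follows its common definition. GSB and PNR are annotator side-by-side / pairwise-preference judgments, so they are not reproduced here.

```python
# Standard ranking-metric sketches for the evaluation on this slide.
# NDCG uses the usual log2 position discount; Recall@K counts how many of the
# relevant items appear in the top-K results.
import numpy as np

def dcg(relevances, k: int) -> float:
    rel = np.asarray(relevances, dtype=float)[:k]
    discounts = np.log2(np.arange(2, rel.size + 2))
    return float(np.sum(rel / discounts))

def ndcg(relevances, k: int) -> float:
    ideal = dcg(np.sort(np.asarray(relevances))[::-1], k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0

def recall_at_k(ranked_ids: list[str], relevant_ids: set[str], k: int) -> float:
    hits = len(set(ranked_ids[:k]) & relevant_ids)
    return hits / len(relevant_ids) if relevant_ids else 0.0

# Toy example: graded relevance for 5 ranked suggestions.
print(ndcg([1, 0, 1, 1, 0], k=5))
print(recall_at_k(["a", "b", "c"], {"a", "c", "d"}, k=3))
```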

  7. Experimental Results: intentionality and diversity, ground-truth quality verification, and handling the cold-start problem.

  8. Qualitative Results: the generated suggestions cover the various intentions behind the query image.

  9. Q & A
