Revolutionary Method for Action Planning Optimization

sequential discrete action selection via blocking n.w

1 / 21

Embed Share

Explore a groundbreaking method, Sequential Action Planning, challenging traditional task planners. Dive into the proposed BCR approach, revolutionizing action selection via blocking conditions and resolutions. Discover how it dynamically constructs trajectories backward from the goal, optimizing decision-making and planning efficiency.

dollison_i Follow

Uploaded on May 28, 2025 | 1 Views

Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

Download Presentation

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript

Sequential Discrete Action Selection via Blocking Conditions and Resolutions Hoffmeister et al. 2024 Presented by Howard Qin

Sequential Action Planning Break down a high-level task into smaller actions that eventually reach a goal Difficult Exponential planning time Dynamic & hostile environment Overview But useful Broad-domain automation

Typical Task Planner Procedure Problem For all viable actions, model how environment changes, then recurse O(m^n) planning time Often have to learn / estimate non-terminal state values Similar to BFS / A* Constructs entire trajectory to goal Start over if environment changed

Proposed Method BCR Changes Blocking Conditions and Resolutions DFS from goal towards start (cheaper planning (usually)) Working backwards from goal Whole trajectory is dynamically constructed step by step (cheaper replanning) Making decisions one at a time

BCR IN DETAIL

BCR in detailterminologies Instances Predicates Literals Dictionary nouns in the world (mug, apple ) Boolean functions that take in instances as args Predicates given all args that can eval to true or false Think objects in programming Think Boolean functions isUnder(toy, table) Set of all literals is called world state Apple apple; isUnder(Obj o1, Obj o2)

Blocking Condition & Resolution Actions have effects Specified through predicates (e.g. pickup(Obj o1) makes isUnder(Obj o1, Obj o2) false) Actions have blocking conditions Specified through predicates (e.g. pickup(Obj o1) can t proceed if isInView(Obj o1) == false) Blocking conditions have resolution actions A set of actions with effects that will resolve said blocking condition

1. Try executing an action that directly contributes to goal 2. If no blocking conditions, reached goal! Main Loop 3. Otherwise, try executing an action that directly resolves the blocking condition 4. Repeat

isNearCampus(me) == false, blocking! Candidate actions: walk(me), getInCar(me), getOnBus(me) isAwake(me) == true, no problem! isVisible(opusCard) == false, blocking! Execute wearClothes(me) Choose visualSearch(opusCard) Choose getOnBus(me) to resolve packItem(...) wearClothes(me) getOnBus(me) arriveMcGill(me) visualSearch(...) GOAL: atMcgill(me) == true isDressed(me) == false, blocking! hasTicket(me) == false, blocking! Candidates: sneak, pack card wearClothes(me) Choose packItem(me, opusCard)

Example in Paper

But you left out the most important detail

How does the robot know how to choose from available actions?

The Selection Engine Powered by LLM Uses ChatGPT3.5-T Natural language in, natural language out 0-shot, no fine tuning, no few-shot Use regex to extract chosen resolution action Each selection step starts a new conversation!

Prompt Construction 1. Stating role and task 2. Meta-instructions (take time, think carefully etc) 3. Array of previous actions taken 4. Array of sub-goals already completed 5. Array of current candidate actions 6. Array of remaining unsatisfied goal predicates 7. Description of any previous errors 8. Emphasize choosing only from current candidate actions and formatting

Prompt Construction "I am a robot called character acting in a household environment and I need your help selecting my next atomic action from a limited set to move towards my goal." I have completed the following actions: {previous_actions[-5:]} "Right now I can only perform the following actions: {actions}" "This is how these objects relate to each other: {relevant_relations}" "NOTE the following actions involve a goal object: {goal_actions}" "Of these actions which should I take to move towards my goal of {goal}. include an explaination for your action selection. Please refrain from getting stuck in action loops and provide your selected action in the format '$$ <action> <object, room, (including id tag) or character> <optional second object (including id tag) depending on action> $$."

Example output $$walktoobjectFridge|- 02.64|+00.00|+02.13$$.ThisactionwillhelpyoureachtheFridge,whichiswhereth eAppleislocatedbasedontheinformationyouprovided.

Evaluation - - Eval done in AI2Thor env 4 tasks: Make coffee Put apple in fridge Wash mug in sink Make toast

Results - BCR had highest accuracy BCR had good runtime BCR had fewest actions considered per decision - - FF-Replan-lim: removes all relevant room location info ProgPrompt-no-find: no cheat automatic finding items

Discussions Needs no exploration to learn state values (good and bad) Requires manually specifying blocking conditions Higher accuracy and relatively low planning time than previous methods Sometimes hallucinates actions Sometimes gets stuck in action loop (may be helped by maintaining message history) No need to search entire graph Implicit replanning capabilities No optimality guarantees (obviously) (not even close)

Pros & Cons Pros Cons Demonstrates clear algorithmic superiority Limited state & action space size Open source prompts and code Questionable design choices (imo) - Starting new chat for every planning step Allowing natural language output Even asks for explanation explicitly No fine tuning Well-defined scope, no bloating - - -

Summary Start from goal state and work backwards Choose actions using LLM Recursively resolve blocking conditions Demonstrated superiority Future work on automated blocking condition-action mapping, prompt engineering, etc

Revolutionary Method for Action Planning Optimization

Download Presentation

Presentation Transcript

Related

More Related Content