Question on browser-extension #10

Open
anishhiranandani opened this issue Jan 30, 2025 · 3 comments

Comments

@anishhiranandani

After installing the browser extension, and given a Browser/BrowserContext object, is there a way to execute a click action on the open page without going through the agent flow? This is for a task that needs to happen outside the agent loop.

For example:
browser = Browser()
page = browser.goto("https://www.google.com")
Then, using the browser extension:
page.perform_click(selector)

@galatian44to7
Contributor

@anishhiranandani that would be interesting. You could include that information in your prompt, or save the click functionality you want in the advanced settings. Depending on the model you're using, it might do it. As for the extension itself, that is not yet included. Can you describe further what you would want the functionality to look like?

@anishhiranandani
Author

Hi @galatian44to7, thanks for your prompt reply.
This might not have anything to do with the model. Consider it a preprocessing step that is executed just before the agent loop starts.
The sequence of steps would be:

  1. Create a Browser/BrowserContext object with a config (path to Google Chrome, etc.)
  2. Open a page and navigate to a URL using this browser object
  3. Using the browser (or a member object of this class), execute a click action on an element of the page opened in step 2
  4. Continue with the agent loop.

Does this seem feasible?
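
To make the sequence above concrete, here is a minimal sketch of the ordering only: navigate, click, then hand the same page to the agent. Every name in it (`PageLike`, `run_pre_agent_steps`, `preprocess_then_run`, `RecordingPage`, and the `start_agent` callback) is hypothetical and not part of the extension. It assumes only that some page object with `goto(url)` and `click(selector)` methods is available, and it uses a recording stand-in so it runs without a browser.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Protocol, Tuple


class PageLike(Protocol):
    """Hypothetical minimal page interface assumed by this sketch."""

    def goto(self, url: str) -> object: ...
    def click(self, selector: str) -> object: ...


def run_pre_agent_steps(page: PageLike, url: str, selector: str) -> None:
    """Steps 2 and 3: open the URL, then click one element (no model involved)."""
    page.goto(url)
    page.click(selector)


def preprocess_then_run(
    page: PageLike,
    url: str,
    selector: str,
    start_agent: Callable[[PageLike], str],
) -> str:
    """Run the preprocessing first, then pass the same page to the agent (step 4)."""
    run_pre_agent_steps(page, url, selector)
    return start_agent(page)


@dataclass
class RecordingPage:
    """Stand-in page that records calls instead of driving a real browser."""

    calls: List[Tuple[str, str]] = field(default_factory=list)

    def goto(self, url: str) -> None:
        self.calls.append(("goto", url))

    def click(self, selector: str) -> None:
        self.calls.append(("click", selector))


if __name__ == "__main__":
    page = RecordingPage()
    # "#example-button" is a placeholder selector; the lambda stands in for the agent loop.
    result = preprocess_then_run(
        page, "https://www.google.com", "#example-button", lambda p: "agent started"
    )
    print(page.calls)
    print(result)
```

Playwright's synchronous `Page` exposes `goto(url)` and `click(selector)`, so a real page of that kind should fit the `PageLike` shape. Whether the Browser/BrowserContext object discussed here gives access to such a page is not confirmed in this thread.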

@galatian44to7
Contributor

@anishhiranandani what do you think it would look like on the UI side? As in, how would you like the user to specify this behavior?
