{"id":27125599,"url":"https://github.com/aws/nova-act","last_synced_at":"2026-01-07T21:16:09.005Z","repository":{"id":285416661,"uuid":"946281616","full_name":"aws/nova-act","owner":"aws","description":"Amazon Nova Act is a research preview of a new AI model for developers to build agents that take actions in web browsers","archived":false,"fork":false,"pushed_at":"2025-04-24T21:23:37.000Z","size":3262,"stargazers_count":638,"open_issues_count":2,"forks_count":87,"subscribers_count":13,"default_branch":"main","last_synced_at":"2025-05-08T00:08:04.414Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://nova.amazon.com/act","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/aws.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-03-10T22:36:50.000Z","updated_at":"2025-05-07T21:19:06.000Z","dependencies_parsed_at":"2025-04-07T18:33:35.022Z","dependency_job_id":null,"html_url":"https://github.com/aws/nova-act","commit_stats":null,"previous_names":["aws/nova-act"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aws%2Fnova-act","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aws%2Fnova-act/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aws%2Fnova-act/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aws%2Fnova-act/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners
/aws","download_url":"https://codeload.github.com/aws/nova-act/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254364270,"owners_count":22058878,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-04-07T15:20:35.271Z","updated_at":"2026-01-07T21:16:08.998Z","avatar_url":"https://github.com/aws.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Nova Act SDK\n\nA Python SDK for Amazon Nova Act.\n\nAmazon Nova Act is available as a new AWS service to build and manage fleets of reliable AI agents for automating production UI workflows at scale. Nova Act completes repetitive UI workflows in the browser and escalates to a human supervisor when appropriate. You can define workflows by combining the flexibility of natural language with Python code. 
Start by exploring in the web playground at nova.amazon.com/act, develop and debug in your IDE, deploy to AWS, and monitor your workflows in the AWS Console, all in just a few steps.\n\n(Preview) Nova Act also integrates with external tools through API calls, remote MCP, or agentic frameworks, such as Strands Agents.\n\n\n\u003e #### ⚠️ Notice: Support for Nova Act SDK versions prior to 3.0 will end on January 21, 2026.\n\n\u003e Please follow the upgrade instructions below:\n\n \u003e ```bash\n \u003e # Upgrade to the latest version\n \u003e pip install --upgrade nova-act\n \u003e\n \u003e # Check your current version\n \u003e pip show nova-act\n \u003e ```\n\n## Table of contents\n* [Pre-requisites](#pre-requisites)\n* [Nova Act IDE Extension](#quick-set-up-with-ide-extension)\n* [Nova Act Authentication and Installation](#authentication)\n* [Quick Start](#quick-start)\n* [How to prompt Nova Act](#how-to-prompt-act)\n* [Workflows](#workflows)\n* [Extract information from a web page](#extracting-information-from-a-web-page)\n* [Human-in-the-loop (HITL)](#human-in-the-loop-hitl) \n* [Tools](#tool-use-beyond-the-browser-preview)\n* [Run multiple sessions in parallel](#running-multiple-sessions-in-parallel)\n* [Authentication, cookies, and persisting browser state](#authentication-cookies-and-persistent-browser-state)\n* [Handling sensitive data](#entering-sensitive-information)\n* [Captchas](#captchas)\n* [Search on a website](#search-on-a-website)\n* [File upload and download](#file-upload-and-download)\n* [Working with dates](#picking-dates)\n* [Setting the browser user agent](#setting-the-browser-user-agent)\n* [Using a proxy](#using-a-proxy)\n* [Time worked tracking utility](#time-worked-tracking-utility)\n* [Logging and viewing traces](#logging)\n* [Recording a video of a session](#recording-a-session)\n* [Storing Session Data in Amazon S3](#storing-session-data-in-your-amazon-s3-bucket)\n* [Navigating Pages](#navigating-pages)\n* [Viewing headless 
sessions](#viewing-a-session-that-is-running-in-headless-mode)\n* [Use Nova Act SDK with Amazon Bedrock AgentCore Browser Tool](#use-nova-act-sdk-with-amazon-bedrock-agentcore-browser-tool)\n* [Known limitations](#known-limitations)\n* [Disclosures](#disclosures)\n* [Report a Bug](#report-a-bug)\n* [Reference: Nova Act constructor parameters](#initializing-novaact)\n* [Reference: Actuating the browser](#actuating-the-browser)\n* [Reference: Nova Act CLI](#nova-act-cli)\n\n## Pre-requisites\n\n1. Operating System: MacOS Sierra+, Ubuntu 22.04+, WSL2 or Windows 10+\n2. Python: 3.10 or above\n\n\u003e **Note:** Nova Act supports English.\n\n## Set Up\n\n### Quick Set Up with IDE Extension\n\nAccelerate your development process with the [Nova Act extension](https://github.com/aws/nova-act-extension). The extension automates the setup of your Nova Act development environment and brings the entire agent development experience directly into your IDE, enabling chat-to-script generation, browser session debugging, and step-by-step testing capabilities. For installation instructions and detailed documentation, visit the [extension repository](https://github.com/aws/nova-act-extension) or [website](https://nova.amazon.com/act).\n\n### Authentication\n\n#### API Key Authentication\n\nNote: When using the Nova Act Playground and/or choosing Nova Act developer tools with API key authentication, access and use are subject to the nova.amazon.com Terms of Use. \n\n\nNavigate to https://nova.amazon.com/act and generate an API key.\n\nTo save it as an environment variable, execute in the terminal:\n```sh\nexport NOVA_ACT_API_KEY=\"your_api_key\"\n```\n\n#### IAM-based Authentication\n\nNote: When choosing developer tools with AWS IAM authentication and/or deploying workflows to the Nova Act AWS service, your AWS Service Terms and/or Customer Agreement (or other agreement governing your use of the AWS Service) apply.\n\nNova Act also supports authentication using IAM credentials. 
For details, please refer to the Amazon [Nova Act User Guide documentation](https://docs.aws.amazon.com/nova-act/latest/userguide/). To use IAM-based credentials, use the `Workflow` constructs (see [Workflows](#workflows)). Note that the SDK will instantiate a default boto session if AWS credentials are already configured in your environment.\n\n### Installation\n\n```bash\npip install nova-act\n```\n\nAlternatively, you can build `nova-act` from source. Clone this repo, and then:\n```sh\npip install .\n```\n\n#### [Optional] Install Google Chrome\nNova Act works best with Google Chrome but does not have permission to install this browser. You may skip this step if you already have Google Chrome installed or are fine with using Chromium. Otherwise, you can install Google Chrome by running the following command in the same environment where you installed Nova Act. For more information, visit https://playwright.dev/python/docs/browsers#google-chrome--microsoft-edge.\n```bash\nplaywright install chrome\n```\n\n\n## Quick Start\n\n*Note: The first time you run NovaAct, it may take 1 to 2 minutes to start. This is because NovaAct needs to [install Playwright modules](https://playwright.dev/python/docs/browsers#install-browsers). Subsequent runs will only take a few seconds to start. This functionality can be toggled off by setting the `NOVA_ACT_SKIP_PLAYWRIGHT_INSTALL` environment variable.*\n\n### Script mode\n\n```python\nfrom nova_act import NovaAct\n\nwith NovaAct(starting_page=\"https://nova.amazon.com/act/gym/next-dot/search\") as nova:\n    nova.act(\"Find flights from Boston to Wolf on Feb 22nd\")\n```\n\nThe SDK will (1) open Chrome, (2) perform the task as described in the prompt, and then (3) close Chrome. 
Details of the run will be printed as console log messages.\n\nRefer to the section [Initializing NovaAct](#initializing-novaact) to learn about other runtime options that can be passed into NovaAct.\n\n### Interactive mode\n\n_**NOTE**: NovaAct does not yet support `ipython`; for now, use your standard Python shell._\n\nUsing interactive Python is a nice way to experiment:\n\n```sh\n% python\nPython 3.10.16 (main, Dec  3 2024, 17:27:57) [Clang 16.0.0 (clang-1600.0.26.4)] on darwin\nType \"help\", \"copyright\", \"credits\" or \"license\" for more information.\n\u003e\u003e\u003e from nova_act import NovaAct\n\u003e\u003e\u003e nova = NovaAct(starting_page=\"https://nova.amazon.com/act/gym/next-dot/search\")\n\u003e\u003e\u003e nova.start()\n\u003e\u003e\u003e nova.act(\"Find flights from Boston to Wolf on Feb 22nd\")\n```\n\nPlease don't interact with the browser when an `act()` is running because the underlying model will not know what you've changed!\n\u003e Note: When using interactive mode, `ctrl+x` can exit the agent action leaving the browser intact for another `act()` call. `ctrl+c` does not do this -- it will exit the browser and require a `NovaAct` restart.\n\n### Samples\n\nThe [samples](./src/nova_act/samples) folder contains several examples of using Nova Act to complete various tasks, including:\n* search for apartments on a real estate website, find each apartment's distance from a train station using a maps website, and combine these into a single result set. [This sample](./src/nova_act/samples/search_apartments_calculate_commute.py) demonstrates running multiple NovaActs in parallel (more detail below).\n* book a flight using data that is provided by a tool, and return the booking number. 
[This sample](./src/nova_act/samples/booking_with_data_from_tool.py) demonstrates how to implement a Python function as a tool that can be used to provide data for the workflow.\n* let a human log into an email application and approve printing the number of emails. [This sample](./src/nova_act/samples/print_number_of_emails.py) demonstrates providing HITL (human-in-the-loop) callback implementations to incorporate human participation in the workflow.\n\nFor more samples showing how to use the Nova Act SDK, please refer to this [GitHub repository](https://github.com/amazon-agi-labs/nova-act-samples).\n\n## How to prompt act()\n\nThe simplest way to use Nova Act to achieve an end-to-end task is by specifying the entire goal, possibly with hints to guide the agent, in one prompt. However, the agent then must take many steps sequentially to achieve the goal, and any issues or nondeterminism along the way can throw the workflow off track. We have found that Nova Act works most reliably when the task can be accomplished in fewer than 30 steps.\n\nMake sure the prompt is direct and spells out exactly what you want Nova Act to do, including what information you want it to return, if any (read more on data extraction [here](#extracting-information-from-a-web-page)). Aim to completely specify the choices the agent should make and what values it should put in form fields. During your testing, if you see act() going off track, enhance the prompt with hints (e.g. how to use certain UI elements it encounters, how to get to a particular function on the website, or what paths to avoid), just like you would do with a new team member who might be unfamiliar with the task and the website. If the agent is taking a long, winding path or you cannot get reliable results across repeated runs, break the task up into stages and connect these in code.\n\n**1. 
Be direct and succinct in what the agent should do**\n\n❌ DON'T\n```python\nnova.act(\"Let's see what routes vta offers\")\n```\n\n✅ DO\n```python\nnova.act(\"Navigate to the routes tab\")\n```\n\n❌ DON'T\n```python\nnova.act_get(\"I want to go and meet a friend. I should figure out when the Orange Line comes next.\")\n```\n\n✅ DO\n```python\nnova.act_get(f\"Find the next departure time for the Orange Line from Government Center after {time}\")\n```\n\n**2. Provide complete instructions**\n\n❌ DON'T\n```python\nnova.act(\"book me a hotel that costs less than $100 with the highest star rating\")\n```\n\n✅ DO\n```python\nnova.act(f\"book a hotel for two adults in Houston between {startdate} and {enddate} that costs less than $100 per night with the highest star rating. two queen beds preferred but single king also ok. stop when you get to the enter customer details or payment page.\")\n```\n\n**3. Break up large acts into smaller ones**\n\n❌ DON'T\n```python\nnova.act(\"book me a hotel that costs less than $100 with the highest star rating then find the closest car rental and get me car there, finally find a lunch spot nearby and book it at 12:30pm\")\n```\n\n✅ DO\n```python\nhotel_address = nova.act_get(f\"book a hotel for two adults in Houston between {startdate} and {enddate} that costs less than $100 per night with the highest star rating. two queen beds preferred but single king also ok. return the address of the hotel you booked.\").response\nnova.act(f\"book a restaurant near {hotel_address} at 12:30pm for two people\")\nnova.act(f\"rent a small sized car between {startdate} and {enddate} from a car rental place near {hotel_address}\")\n```\n\nAnd if the agent still struggles, break it down:\n\n```python\nnova.act(f\"search for hotels for two adults in Houston between {startdate} and {enddate}\")\nnova.act(\"sort by avg customer review\")\nhotel_address = nova.act_get(\"book the first hotel that is $100 or less. prefer two queen beds if there is an option. 
return the address of the hotel you booked.\").response\nnova.act(f\"book a restaurant near {hotel_address} at 12:30pm on {startdate} for two people\")\nnova.act(f\"search for car rental places near {hotel_address} and navigate to the closest one's website\")\nnova.act(f\"rent a small sized car between {startdate} and {enddate}, pickup time 12pm, drop-off 12pm.\")\n```\n\n## Workflows\n\nA workflow defines your agent's end-to-end task. Workflows are composed of `act()` statements and Python code that orchestrate the automation logic.\n\nThe `nova-act` SDK provides a number of convenience wrappers for managing workflows deployed with the NovaAct AWS service. To get started, call the `CreateWorkflowDefinition` API (or use the AWS Console) to obtain a WorkflowDefinition.\n\n### The Context Manager\n\nThe core type driving workflow coordination with the NovaAct service is `Workflow`. This class provides a [context manager](https://peps.python.org/pep-0343/) that handles calling the necessary workflow API operations from the Amazon Nova Act service. It calls `CreateWorkflowRun` when your run starts and `UpdateWorkflowRun` with the appropriate status when it finishes. It is provided to the `NovaAct` client via a constructor argument, so that all called APIs will be associated with the correct workflow + run (`CreateSession`, `CreateAct`, `InvokeActStep`, `UpdateAct`, etc.). 
See the following example for how to use it:\n\n```python\nfrom nova_act import NovaAct, Workflow\n\ndef main():\n    with Workflow(\n        workflow_definition_name=\"\u003cyour-workflow-name\u003e\",\n        model_id=\"nova-act-latest\"\n    ) as workflow:\n        with NovaAct(\n            starting_page=\"https://nova.amazon.com/act/gym/next-dot/search\",\n            workflow=workflow,\n        ) as nova:\n            nova.act(\"Find flights from Boston to Wolf on Feb 22nd\")\n\nif __name__ == \"__main__\":\n    main()\n```\n\n#### Retry handling\nBy default, when a Nova Act request times out, the Nova Act SDK will retry it once. This can be overridden by passing in a `boto_config` object to the `Workflow` constructor. You can also use this object to override the default 60-second `read_timeout`. For example, to retry a request 4 times (for a total of 5 attempts) with a 90-second timeout:\n\n```python\nfrom botocore.config import Config\nfrom nova_act import Workflow\n\nboto_config = Config(retries={\"total_max_attempts\": 5, \"mode\": \"standard\"}, read_timeout=90)\nwith Workflow(\n    boto_config=boto_config,\n    workflow_definition_name=\"\u003cyour-workflow-name\u003e\",\n    model_id=\"nova-act-latest\"\n) as workflow:\n    ...  # run NovaAct with workflow=workflow here\n```\nNote that retrying the same Nova Act request may result in increased cost if the request ends up executing multiple times. For more information on retries including retry modes, please refer to the [botocore retry documentation](https://botocore.amazonaws.com/v1/documentation/api/latest/reference/config.html).\n\n### The Decorator\n\nFor convenience, the SDK also exposes a [decorator](https://peps.python.org/pep-0318/) which can be used to annotate functions to be run under a given workflow. The decorator leverages [ContextVars](https://peps.python.org/pep-0567/) to inject the correct `Workflow` object into each `NovaAct` instance within the function; no need to provide the `workflow` keyword argument! 
The following syntax provides identical functionality to the previous example:\n\n```python\nfrom nova_act import NovaAct, workflow\n\n@workflow(\n    workflow_definition_name=\"\u003cyour-workflow-name\u003e\",\n    model_id=\"nova-act-latest\",\n)\ndef main():\n    with NovaAct(starting_page=\"https://nova.amazon.com/act/gym/next-dot/search\") as nova:\n        nova.act(\"Find flights from Boston to Wolf on Feb 22nd\")\n\nif __name__ == \"__main__\":\n    main()\n```\n\n#### Configuring AWS Credentials with `boto_session_kwargs`\n\nThe `Workflow` class accepts an optional `boto_session_kwargs` parameter for customizing the boto3 Session configuration. **By default, if not provided, the workflow uses `{\"region_name\": \"us-east-1\"}`** when AWS credentials are available.\n\nIf you need to customize your AWS session (e.g., to use a specific profile or provide explicit credentials), you can pass a custom dictionary to `boto_session_kwargs`. This works with both the **Context Manager** and **Decorator** versions:\n\n**Using the Context Manager:**\n\n```python\nfrom nova_act import NovaAct, Workflow\n\ndef main():\n    with Workflow(\n        workflow_definition_name=\"\u003cyour-workflow-name\u003e\",\n        model_id=\"nova-act-latest\",\n        boto_session_kwargs={\n            \"profile_name\": \"my-aws-profile\",\n            \"region_name\": \"us-east-1\"\n        }\n    ) as workflow:\n        with NovaAct(\n            starting_page=\"https://nova.amazon.com/act/gym/next-dot/search\",\n            workflow=workflow,\n        ) as nova:\n            nova.act(\"Find flights from Boston to Wolf on Feb 22nd\")\n\nif __name__ == \"__main__\":\n    main()\n```\n\n**Using the Decorator:**\n\n```python\nfrom nova_act import NovaAct, workflow\n\n@workflow(\n    workflow_definition_name=\"\u003cyour-workflow-name\u003e\",\n    model_id=\"nova-act-latest\",\n    boto_session_kwargs={\n        \"profile_name\": \"my-aws-profile\",\n        \"region_name\": 
\"us-east-1\"\n    }\n)\ndef main():\n    with NovaAct(starting_page=\"https://nova.amazon.com/act/gym/next-dot/search\") as nova:\n        nova.act(\"Find flights from Boston to Wolf on Feb 22nd\")\n\nif __name__ == \"__main__\":\n    main()\n```\n\n**Note:** If you don't provide `boto_session_kwargs` and don't use an API key, the workflow will automatically load AWS credentials using boto3 (more details [here](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html) on how boto3 loads AWS credentials).\n\n### Best Practices\n\n#### Multi-threading\n\nThe `Workflow` class will work as-is for multi-threaded workflows. See the following example:\n\n```python\nfrom threading import Thread\n\nfrom nova_act import NovaAct, Workflow\n\ndef multi_threaded_helper(workflow: Workflow):\n    with NovaAct(..., workflow=workflow) as nova:\n        # nova will have the appropriate workflow run\n        ...\n \nwith Workflow(\n    workflow_definition_name=\"my-workflow\",\n    model_id=\"nova-act-latest\"\n) as workflow:\n    t = Thread(target=multi_threaded_helper, args=(workflow,))\n    t.start()\n    t.join()\n```\n\nBecause the `@workflow` decorator leverages ContextVars for injecting context, and because ContextVars are intentionally designed to be thread-specific, users will have to provide the context to any functions that will run in different threads from where the wrapping function is defined. See the following example:\n\n```python\nfrom contextvars import copy_context\nfrom threading import Thread\n\nfrom nova_act import NovaAct, workflow\n\ndef multi_threaded_helper():\n    with NovaAct(...) 
as nova:\n        # nova will have the appropriate workflow run\n        ...\n \n@workflow(\n    workflow_definition_name=\"my-workflow\",\n    model_id=\"nova-act-latest\",\n)\ndef multi_threaded_workflow():\n    ctx = copy_context()\n    t = Thread(target=ctx.run, args=(multi_threaded_helper,))\n    t.start()\n    t.join()\n\nmulti_threaded_workflow()\n```\n\nOr, alternatively, use the `workflow` argument directly to manually inject it, as when directly leveraging the `Workflow` class:\n\n```python\nfrom threading import Thread\n\nfrom nova_act import NovaAct, Workflow, get_current_workflow, workflow\n\ndef multi_threaded_helper(workflow: Workflow):\n    with NovaAct(..., workflow=workflow) as nova:\n        # nova will have the appropriate workflow run\n        ...\n \n@workflow(\n    workflow_definition_name=\"my-workflow\",\n    model_id=\"nova-act-latest\",\n)\ndef multi_threaded_workflow():\n    t = Thread(target=multi_threaded_helper, args=(get_current_workflow(),))\n    t.start()\n    t.join()\n\nmulti_threaded_workflow()\n```\n#### Multi-processing\nA `Workflow` object cannot currently be passed between processes because it maintains a boto3 Session and Client as instance variables, and those objects are not [pickle](https://docs.python.org/3/library/pickle.html)-able. Support coming soon!\n\n### Nova Act CLI\n\nThe Nova Act CLI provides a streamlined command-line interface for deploying Python workflows to AWS AgentCore Runtime, handling containerization, ECR management, IAM roles, and multi-region deployments automatically. See the [Nova Act CLI README](./src/nova_act/cli/README.md) for installation and usage instructions.\n\n## Common Building Blocks\n\n### Extracting information from a web page\n\nUse `pydantic` and ask `act_get` to respond to a question about the browser page in a certain schema.\n\n- Make sure you use a schema whenever you are expecting any kind of structured response, even just a bool (yes/no). 
If a schema is not provided, the returned object will not contain a response.\n- Put a prompt to extract information in its own separate `act()` call.\n\nFor convenience, the `act_get()` function works the same as `act()` but provides a default `STRING_SCHEMA`, so that a response will always be available in the return object whether or not a specific schema is provided. We recommend using `act_get()` for all extraction tasks, to ensure type safety.\n\nExample:\n\n```python\nfrom nova_act import NovaAct\nfrom pydantic import BaseModel\n\nclass Measurement(BaseModel):\n    value: float\n    unit: str\n\nclass PlanetData(BaseModel):\n    gravity: Measurement\n    average_temperature: Measurement\n\nwith NovaAct(\n        starting_page=\"https://nova.amazon.com/act/gym/next-dot\"\n    ) as nova:\n        planet = 'Proxima Centauri b'\n        result = nova.act_get(\n            f\"Go to the {planet} page and return the gravity and average temperature.\",\n            schema=PlanetData.model_json_schema(),\n        )\n\n        # Parse the response into the data model\n        planet_data = PlanetData.model_validate(result.parsed_response)\n\n        # Do something with the parsed data\n        print(f\"✓ {planet} data:\\n{planet_data.model_dump_json(indent=2)}\")\n```\n\nIf all you need is a bool response, there's a convenient `BOOL_SCHEMA` constant:\nExample:\n\n```python\nfrom nova_act import NovaAct, ActInvalidModelGenerationError, BOOL_SCHEMA\nwith NovaAct(starting_page=\"https://nova.amazon.com/act\") as nova:\n    try:\n        result = nova.act_get(\"Am I logged in?\", schema=BOOL_SCHEMA)\n    except ActInvalidModelGenerationError as e:\n        # act response did not match the schema ¯\\_(ツ)_/¯\n        print(f\"Invalid result: {e}\")\n    else:\n        # result.parsed_response is now a bool\n        if result.parsed_response:\n            print(\"You are logged in\")\n        else:\n            print(\"You are not logged in\")\n```\n\n### Human-in-the-loop 
(HITL)\n\nNova Act's Human-in-the-Loop (HITL) capability enables seamless human supervision within autonomous web workflows. HITL is available in the Nova Act SDK for you to implement in your workflows (it is not provided as a managed AWS service). When your workflow encounters scenarios requiring human judgment or intervention, HITL can provide tools and user interfaces for supervisors to assist, verify, or take control of the process. \n\n#### HITL patterns\n\n##### Human approval\n\nHuman approval enables asynchronous human decision-making in automated processes. When Nova Act encounters a decision point requiring human judgment, it captures a screenshot of the current state and presents it to a human reviewer via a browser-based interface. Use this when you need binary or multi-choice decisions (Approve/Reject, Yes/No, or selecting from predefined options).\n\n##### UI takeover\n\nUI takeover enables real-time human control of a remote browser session. When Nova Act encounters a task that requires human interaction, it hands control of the browser to a human operator via a live-streaming interface. The operator can interact with the browser using mouse and keyboard in real time.\n\n#### Implementing HITL\n\nPlease refer to the [Amazon Nova Act User Guide documentation on HITL](https://docs.aws.amazon.com/nova-act/latest/userguide/hitl.html#implementing-hitl) for implementing HITL in your production workflows.\n\n##### Implementing HITL using the SDK\n\nTo implement HITL patterns in the Nova Act SDK, define a class that extends `HumanInputCallbacksBase` and implements two of its abstract methods, `approve` and `ui_takeover`. 
Pass an instance of it to the `human_input_callbacks` argument of the `NovaAct` constructor.\n\n- `approve` is a callback that will be triggered for the Human approval pattern (e.g., approving expense reports or purchases)\n- `ui_takeover` is a callback that will be triggered for the UI takeover pattern (e.g., solving CAPTCHA challenges)\n\n```python\nfrom nova_act import NovaAct\nfrom nova_act.tools.human.interface.human_input_callback import (\n    ApprovalResponse, HumanInputCallbacksBase, UiTakeoverResponse,\n)\n\nclass MyHumanInputCallbacks(HumanInputCallbacksBase):\n    def approve(self, message: str) -\u003e ApprovalResponse:\n        ...\n\n    def ui_takeover(self, message: str) -\u003e UiTakeoverResponse:\n        ...\n\nwith NovaAct(\n    starting_page=...,\n    tty=False,\n    human_input_callbacks=MyHumanInputCallbacks(),\n) as nova:\n    result = nova.act_get(...)  # your prompt goes here\n    print(f\"Task completed: {result.response}\")\n```\n\nRefer to [this sample](./src/nova_act/samples/print_number_of_emails.py) for a working example.\n\n\n### Tool Use Beyond the Browser (Preview)\n\n(Preview) Nova Act allows you to integrate external tools beyond the browser, such as an API call or database query, into workflows. The Nova Act SDK allows using a Python function as a tool that can be invoked during a workflow step. To make a Python function available as a tool, annotate it with the `@tool` decorator. You can pass a list of tools to the `tools` argument of the `NovaAct` constructor.\n\n```python\nfrom nova_act import NovaAct, tool\n\n@tool\ndef my_tool(input: str) -\u003e str:\n   ...\n\nwith NovaAct(\n    starting_page=...,\n    tools=[my_tool],\n) as nova:\n    ...\n```\n\nRefer to [this sample](./src/nova_act/samples/booking_with_data_from_tool.py) for a working example.\n\n### Handling ActErrors\n\nOnce the `NovaAct` client is started, it might encounter errors during the `act()` execution. 
All of these error types are included in the [`nova_act.types.act_errors` module](./src/nova_act/types/act_errors.py), and are organized as follows:\n1. `ActAgentError`: Indicates requested prompt failed to complete; users may retry with a different request.\n   * Examples include: `ActAgentFailed` (the agent raised an error because the task was not possible), `ActInvalidModelGenerationError` (model generated output that could not be interpreted), or `ActExceededMaxStepsError` (`act()` failed to complete within the configured maximum number of steps)\n1. `ActExecutionError`: Indicates a local error encountered while executing valid output from the agent\n   * Examples include: `ActActuationError` (client encountered an exception while actuating the Browser), or `ActCanceledError` (the user canceled execution).\n1. `ActClientError`: Indicates a request to the NovaAct Service was invalid; users may retry with a different request.\n   * Examples include: `ActGuardrailsError` (the request was blocked by our RAI guardrails) or `ActRateLimitExceededError` (request was throttled; rate should be reduced).\n1. `ActServerError`: Indicates the NovaAct Service encountered an error processing the request.\n   * Examples include: `ActInternalServerError` (internal error processing request), `ActBadResponseError` (the service returned a response with unrecognized shape), or `ActServiceUnavailableError` (the service could not be reached.)\n\nUsers may catch `ActAgentError`s and `ActClientError`s and retry with the appropriate request; for `ActExecutionError`s and `ActServerError`s, please submit an issue to the team to look into, including (1) your SDK version, (2) your platform + operating system, (3) the full error trace, and (4) steps to reproduce.\n\n### Running multiple sessions in parallel\nOne `NovaAct` instance can only actuate one browser at a time. However, it is possible to actuate multiple browsers concurrently with multiple `NovaAct` instances! 
They are quite lightweight. You can use this to parallelize parts of your task, creating a kind of browser use map-reduce for the internet. [This sample](./src/nova_act/samples/search_apartments_calculate_commute.py) shows running multiple sessions in parallel.\n\n### Authentication, cookies, and persistent browser state\n\nNova Act supports working with authenticated browser sessions by overriding its default settings. By default, when Nova Act runs, it clones the Chromium user data directory and deletes the clone at the end of the run. To use authenticated sessions, you need to specify an existing directory containing the authenticated sessions, and disable the cloning (which in turn disables deletion of the directory).\n\nSpecifically, you need to:\n1. (Optional) Create a new local directory for the user data directory, for example `/tmp/user-data-dir`. You can skip this step to use an existing Chromium profile.\n2. Specify this directory when instantiating `NovaAct` via the `user_data_dir` parameter.\n3. Disable cloning this directory when instantiating `NovaAct` by passing in the parameter `clone_user_data_dir=False`.\n4. Instruct Nova Act to open the site(s) into which you want to authenticate.\n5. Authenticate into the sites. See [Entering sensitive information](#entering-sensitive-information) below for more information on entering sensitive data.\n6. Stop your Nova Act session.\n\nThe next time you run Nova Act with `user_data_dir` set to the directory you created in step 1, you will start from an authenticated session. In subsequent runs, you can decide if you want to enable or disable cloning. 
If you are running multiple `NovaAct` instances in parallel, each instance must create its own copy, so you must enable cloning in that use case (`clone_user_data_dir=True`).\n\nHere's an example script that shows how to pass in these parameters.\n\n```python\nimport os\n\nfrom nova_act import NovaAct\n\n# Example location from step 1; change it to the directory you want to use.\nuser_data_dir = \"/tmp/user-data-dir\"\nos.makedirs(user_data_dir, exist_ok=True)\n\nwith NovaAct(starting_page=\"https://nova.amazon.com/act\", user_data_dir=user_data_dir, clone_user_data_dir=False) as nova:\n    input(\"Log into your websites, then press enter...\")\n    # Add your nova.act() statements here.\n\nprint(f\"User data dir saved to {user_data_dir=}\")\n```\n\nThe script is included in the installation: `python -m nova_act.samples.setup_chrome_user_data_dir`.\n\n#### Run against the local default Chrome browser\n\nIf your local default Chrome browser has extensions or security features that your workflow needs for the sites it accesses, you can configure the SDK to use the Chrome browser installed on your machine rather than the one managed by the SDK using the `NovaAct` parameters below.\n\n\u003e **Important notes:**\n\u003e\n\u003e - This feature currently only works on macOS\n\u003e - This will quit your default running Chrome and restart it with new arguments. 
At the end of the session, it will quit Chrome.\n\u003e - If your Chrome browser has many tabs open, consider closing unnecessary ones before running the automation, as Chrome's performance during the restart can be affected by high numbers of open tabs.\n\nBefore starting NovaAct with this feature, you must copy the files from your system Chrome `user_data_dir` to a location of your choice.\nThis is necessary because Chrome does not allow CDP connections into instances started with the system default `user_data_dir`.\n\nManually, this can be done with:\n```\nrsync -a --exclude=\"Singleton*\" /Users/$USER/Library/Application\\ Support/Google/Chrome/ \u003cyour choice of location\u003e\n```\n\nYou can also use the convenience function `rsync_from_default_user_data(\u003cyour choice of location\u003e)` to create and update that directory as part of your script.\nNote that invoking `rsync_from_default_user_data` will overwrite changes in the destination directory and make it an exact mirror of `/Users/$USER/Library/Application\\ Support/Google/Chrome/` by overwriting existing files with the same name as in the source and deleting files not in it. If you want to persist profile changes that NovaAct made in the working directory back to your system, you must then mirror the changes back into the system default dir with your own implementation after stopping NovaAct.\n\nWhen using this feature, you must specify `clone_user_data_dir=False` and pass the desired working dir as `user_data_dir` with the appropriate files populated. 
This is because `NovaAct` will not be cloning or deleting the `user_data_dir`s for you in this mode.\n\n```python\n\u003e\u003e\u003e from nova_act import NovaAct, rsync_from_default_user_data\n\u003e\u003e\u003e working_user_data_dir = \"/Users/$USER/your_choice_of_path\"\n\u003e\u003e\u003e rsync_from_default_user_data(working_user_data_dir)\n\u003e\u003e\u003e nova = NovaAct(use_default_chrome_browser=True, clone_user_data_dir=False, user_data_dir=working_user_data_dir, starting_page=\"https://nova.amazon.com/act/gym/next-dot/search\")\n\u003e\u003e\u003e nova.start()\n\u003e\u003e\u003e nova.act_get(\"Find flights from Boston to Wolf on Feb 22nd\")\n...\n\u003e\u003e\u003e nova.stop()\n\u003e\u003e\u003e quit()\n```\n\n### Entering sensitive information\n\nTo enter a password or sensitive information (e.g., credit card and social security number), do not prompt the model with the sensitive information. Ask the model to focus on the element you want to fill in. Then use Playwright APIs directly to type the data, using `client.page.keyboard.type(sensitive_string)`. You can get that data in the way you wish: prompting in the command line using [`getpass`](https://docs.python.org/3/library/getpass.html), using an argument, or setting env variable.\n\nNote that any passwords or other sensitive data saved with a Chromium-based browser's password manager on Linux systems without a system-level keyring (ex. Libsecret, KWallet) will be stored in plaintext within a user's profile directory.\n\n\u003e **Caution:** If you instruct Nova Act to take an action on any browser screen displaying sensitive information, including information provided through Playwright APIs, that information will be included in the screenshots collected.\n\n```python\n# Sign in.\nnova.act(\"enter username janedoe and click on the password field\")\n# Collect the password from the command line and enter it via playwright. 
(Does not get sent over the network.)\nnova.page.keyboard.type(getpass())\n# Now that username and password are filled in, ask NovaAct to proceed.\nnova.act(\"sign in\")\n```\n\n### Security Options\n\nNovaAct is initialized with secure default behaviors which you may want to relax depending on your use case.\n\n#### Allow Navigation to Local `file://` URLs\n\nTo enable local file navigation, define one or more filepath patterns in `SecurityOptions.allowed_file_open_paths`.\n```python\nfrom nova_act import NovaAct, SecurityOptions\n\n# security_options must be passed by keyword; a positional argument after a keyword argument is a syntax error.\nNovaAct(starting_page=\"file:///home/nova-act/site/index.html\", security_options=SecurityOptions(allowed_file_open_paths=['/home/nova-act/site/*']))\n```\n\n#### Allow File Uploads\nTo allow the agent to upload files to websites, define one or more filepath patterns in `SecurityOptions.allowed_file_upload_paths`.\n\n```python\nfrom nova_act import NovaAct, SecurityOptions\n\nNovaAct(starting_page=\"https://example.com\", security_options=SecurityOptions(allowed_file_upload_paths=['/home/nova-act/shared/*']))\n```\n\n#### Filepath Structures\nThe filepath parameters support the following formats:\n- `[\"/home/nova-act/shared/*\"]` - Allow from a specific directory\n- `[\"/home/nova-act/shared/file.txt\"]` - Allow a specific filepath\n- `[\"*\"]` - Enable for all paths\n- `[]` - Disable the feature (Default)\n\n### State Guardrails\n\nState guardrails allow you to control which URLs the agent can visit during execution. You can provide a callback function that inspects the browser state after each observation and decides whether to allow or block continued execution. If blocked, `act()` will raise `ActStateGuardrailError`. 
This is useful for preventing the agent from navigating to unauthorized domains or sensitive pages.\n\n```python\nfrom nova_act import NovaAct, GuardrailDecision, GuardrailInputState\nfrom urllib.parse import urlparse\nimport fnmatch\n\ndef url_guardrail(state: GuardrailInputState) -\u003e GuardrailDecision:\n    hostname = urlparse(state.browser_url).hostname\n    if not hostname:\n        return GuardrailDecision.BLOCK\n\n    # Example URL block-list\n    blocked = [\"*.blocked-domain.com\", \"*.another-blocked-domain.com\"]\n    if any(fnmatch.fnmatch(hostname, pattern) for pattern in blocked):\n        return GuardrailDecision.BLOCK\n\n    # Example URL allow-list\n    allowed = [\"allowed-domain.com\", \"*.another-allowed-domain.com\"]\n    if any(fnmatch.fnmatch(hostname, pattern) for pattern in allowed):\n        return GuardrailDecision.PASS\n\n    return GuardrailDecision.BLOCK\n\nwith NovaAct(starting_page=\"https://allowed-domain.com\", state_guardrail=url_guardrail) as nova:\n    # The following will be blocked if agent tries to visit a blocklisted domain or leave one of the allowlisted domains\n    nova.act(\"Navigate to the homepage\")\n```\n\n### Captchas\n\nYou should use the `ui_takeover` callback (see [HITL](#human-in-the-loop-hitl)) if your script encounters captchas in certain places. This will allow redirecting the step of solving Captcha to a human.\n\n### Search on a website\n\n```python\nnova.go_to_url(website_url)\nnova.act(\"search for cats\")\n```\n\nIf the model has trouble finding the search button, you can instruct it to press enter to initiate the search.\n\n```python\nnova.act(\"search for cats. 
type enter to initiate the search.\")\n```\n\n### File upload and download\n\nYou can use playwright to download a file on a web page.\n\nThrough a download action button:\n\n```python\n# Ask playwright to capture any downloads, then actuate the page to initiate it.\nwith nova.page.expect_download() as download_info:\n    nova.act(\"click on the download button\")\n\n# Temp path for the download is available.\nprint(f\"Downloaded file {download_info.value.path()}\")\n\n# Now save the downloaded file permanently to a location of your choice.\ndownload_info.value.save_as(\"my_downloaded_file\")\n```\n\n\u003e **Important notes**:\n\u003e\n\u003e - The browser will show the file being downloaded to the temporary path defined by Playwright ([see docs](https://playwright.dev/docs/downloads#introduction))\n\u003e    - This temporary path is accessible via `download_info.value.path()`\n\u003e  - When using `download_info.value.save_as()`:\n\u003e    - If a full path is provided (e.g., \"/path/to/my_downloaded_file\"), the file will be saved there\n\u003e    - If only a filename is provided (e.g., \"my_downloaded_file\"), it will be saved in the current working directory where the Python script was executed from\n\nTo download the current page:\n\n1. If it's HTML, then accessing `nova.page.content()` will give you the rendered DOM. You can save that to a file.\n2. If it is another content type, like a pdf, you can download it using `nova.page.request`:\n\n```python\n# Download the content using Playwright's request.\nresponse = nova.page.request.get(nova.page.url)\nwith open(\"downloaded.pdf\", \"wb\") as f:\n    f.write(response.body())\n```\n\nNovaAct can natively upload files using the appropriate upload action on the page. To do that, first you must allow NovaAct to access the file for upload. 
Then instruct it to\nupload it by filename:\n\n```python\nupload_filename = \"/upload_path/upload_me.pdf\"\n\nwith NovaAct(..., security_options=SecurityOptions(allowed_file_upload_paths=[\"/upload_path/*\"])) as nova:\n    nova.act(f\"upload {upload_filename} using the upload receipt button\")\n```\n\n\u003e **Important security note**:\n\u003e\n\u003e Pick `allowed_file_upload_paths` narrowly to minimize NovaAct's access to your filesystem to avoid data exfiltration by malicious sites or web content.\n\n### Picking dates\n\nSpecifying the start and end dates in absolute time works best.\n\n```python\nnova.act(\"select dates march 23 to march 28\")\n```\n\n### Setting the browser user agent\n\nNova Act comes with Playwright's Chrome and Chromium browsers. These use the default User Agent set by Playwright. You can override this with the `user_agent` option:\n\n```python\nnova = NovaAct(..., user_agent=\"MyUserAgent/2.7\")\n```\n\n### Using a proxy\n\nNova Act supports proxy configurations for browser sessions. This can be useful when you need to route traffic through a specific proxy server:\n\n```python\n# Basic proxy without authentication\nproxy_config = {\n    \"server\": \"http://proxy.example.com:8080\"\n}\n\n# Proxy with authentication\nproxy_config = {\n    \"server\": \"http://proxy.example.com:8080\",\n    \"username\": \"myusername\",\n    \"password\": \"mypassword\"\n}\n\nnova = NovaAct(\n    starting_page=\"https://example.com\",\n    proxy=proxy_config\n)\n```\n\n\u003e **Note:** Proxy configuration is not supported when connecting to a CDP endpoint or when using the default Chrome browser (`use_default_chrome_browser=True`).\n\n\n### Logging\nBy default, `NovaAct` will emit all logs level `logging.INFO` or above. This can be overridden by specifying an integer value under the `NOVA_ACT_LOG_LEVEL` environment variable. 
Integers should correspond to [Python logging levels](https://docs.python.org/3/library/logging.html#logging-levels).\n \n### Viewing act traces\n \nAfter an `act()` finishes, it will output traces of what it did in a self-contained html file. The location of the file is printed in the console trace.\n \n```sh\n\u003e ** View your act run here: /var/folders/6k/75j3vkvs62z0lrz5bgcwq0gw0000gq/T/tmpk7_23qte_nova_act_logs/15d2a29f-a495-42fb-96c5-0fdd0295d337/act_844b076b-be57-4014-b4d8-6abed1ac7a5e_output.html\n```\n \nYou can change the directory for this by passing in a `logs_directory` argument to `NovaAct`.\n\n### Time worked tracking utility\n\nThe time_worked utility tracks and reports the approximate time spent by the agent working on tasks, excluding time spent waiting for human input. This helps you understand the actual agent execution time.\n\n#### How It Works\nApproximate time worked is calculated using this basic formula:\n```\ntime_worked = (end_time - start_time) - human_wait_time\n```\n\nWhen an `act()` call completes (successfully or with an error), the following is calculated:\n- **Approx. Time Worked**: Total execution time (end time minus start time) minus any time spent waiting for human input\n- **Human Wait Time**: Time spent waiting for `approve()` or `ui_takeover()` callbacks from when the callback is issued to when the agent execution continues\n\n#### Console Output\n\nAt the end of each `act()` call, you'll see a time worked summary in the console, as well as in the JSON and HTML reports:\n\nWithout human input:\n```\n⏱️ Approx. Time Worked: 11.8s\n```\n\nWith human input:\n```\n⏱️  Approx. Time Worked: 28.3s (excluding 4.5s human wait)\n```\n\n#### Important Disclaimer\n\n\u003e **Note:** Time worked calculations are approximate and may have inaccuracies due to system timing variations, network latency, or other factors. 
This metric should be viewed as a utility to help understand agent execution patterns and should not be used for formal time tracking or billing purposes.\n\n### Recording a session\n \nYou can easily record an entire browser session locally by setting the `logs_directory` and specifying `record_video=True` in the constructor for `NovaAct`.\n\n### Storing Session Data in Your Amazon S3 Bucket\n\nNova Act allows you to store session data (HTML traces, screenshots, etc.) in your own [Amazon S3](https://aws.amazon.com/s3/) bucket using the `S3Writer` convenience utility:\n\n```python\nimport boto3\nfrom nova_act import NovaAct\nfrom nova_act.util.s3_writer import S3Writer\n\n# Create a boto3 session with appropriate credentials\nboto_session = boto3.Session()\n\n# Create an S3Writer\ns3_writer = S3Writer(\n    boto_session=boto_session,\n    s3_bucket_name=\"my-bucket\",\n    s3_prefix=\"my-prefix/\",  # Optional\n    metadata={\"Project\": \"MyProject\"}  # Optional\n)\n\n# Use the S3Writer with NovaAct\nwith NovaAct(\n    starting_page=\"https://nova.amazon.com/act/gym/next-dot/search\",\n    boto_session=boto_session,  # You may use API key here instead\n    stop_hooks=[s3_writer]\n) as nova:\n    result = nova.act_get(\"Find flights from Boston to Wolf on Feb 22nd\")\n```\n\nThe S3Writer requires the following AWS permissions:\n- s3:ListObjects on the bucket and prefix\n- s3:PutObject on the bucket and prefix\n\nWhen the NovaAct session ends, all session files will be automatically uploaded to the specified S3 bucket with the provided prefix.\n\n#### S3 Upload Troubleshooting\n\n**No files in S3 bucket?**\n- Check logs for \"Registered stop hooks\" message during initialization\n- Verify your code path actually executes the NovaAct context manager\n\n### Navigating pages\n\n\u003e **Use `nova.go_to_url` instead of `nova.page.goto`**\n\nThe Playwright Page's `goto()` method has a default timeout of 30 seconds, which may cause failures for slow-loading websites. 
If the page does not finish loading within this time, `goto()` will raise a `TimeoutError`, potentially interrupting your workflow. Additionally, `goto()` does not always work well with `act()`, as Playwright may consider the page ready before it has fully loaded.\nTo address these issues, we have implemented a new function, `go_to_url()`, which provides more reliable navigation. You can use it by calling `nova.go_to_url(url)` after `nova.start()`. You can also use the `go_to_url_timeout` parameter on `NovaAct` initialization to modify the default max wait time in seconds for the start page load and subsequent `go_to_url()` calls.\n\n### Viewing a session that is running in headless mode\n\nWhen running the browser in headless mode (`headless=True`), you may need to see how the workflow is progressing as the agent is going through it. To do this:\n1. Set the following environment variable before starting your Nova Act workflow\n```bash\nexport NOVA_ACT_BROWSER_ARGS=\"--remote-debugging-port=9222\"\n```\n2. Start your Nova Act workflow as you normally do, with `headless=True`\n3. Open a local browser to `http://localhost:9222/json`\n4. Look for the item of type `page` and copy and paste its `devtoolsFrontendUrl` into the browser\n\nYou'll now be observing the activity happening within the headless browser. You can also interact with the browser window as you normally would, which can be helpful for handling captchas. For example, in your Python script:\n1. ask Nova Act to check if there is a captcha\n2. if there is, `sleep()` for a period of time. Loop back to step 1. During `sleep()`...\n3. send an email / SMS alert (e.g., with [Amazon Simple Notification Service](https://aws.amazon.com/sns/)) containing the `devtoolsFrontendUrl` signaling human intervention is required\n4. a human opens the `devtoolsFrontendUrl` and solves the captcha\n5. 
the next time step 1 is run, Nova Act will see the captcha has been solved, and the script will continue\n\nNote that if you are running Nova Act on a remote host, you may need to set up port forwarding to enable access from another system.\n\n\n## Use Nova Act SDK with Amazon Bedrock AgentCore Browser Tool\n\nThe Nova Act SDK can be used together with the [Amazon Bedrock AgentCore Browser Tool](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/browser-tool.html) for production-ready browser automation at scale. The AgentCore Browser Tool provides a fully managed cloud-based browser automation solution that addresses limitations around real-time data access, while the Nova Act SDK gives you the flexibility to build sophisticated agent workflows.\nSee [this blog post](https://aws.amazon.com/blogs/machine-learning/introducing-amazon-bedrock-agentcore-browser-tool/) for integration instructions.\n\n\u003e **Note**: When the Nova Act SDK and Bedrock AgentCore Browser run on different operating systems (e.g., SDK on MacOS and AgentCore Browser on Linux), keyboard commands may not translate correctly between systems. This impacts certain SDK functions like `agent_type()`, which uses keyboard shortcuts (such as `ControlOrMeta+A` for \"select all\") that are OS-dependent. This behavior is an expected consequence of the cross-OS integration architecture and should be considered when developing automations that use keyboard input methods.\n\n## Known limitations\nOur vision for Nova Act is to provide key capabilities to build useful agents at scale. If you encounter limitations with Nova Act — please provide feedback to [nova-act@amazon.com](mailto:nova-act@amazon.com?subject=Nova%20Act%20Bug%20Report) to help us make it better.\n\n\nFor example:\n\n* `act()` cannot interact with non-browser applications;\n* `act()` cannot interact with the browser window. 
This means that browser modals such as those requesting access to use your location don't interfere with `act()` but must be manually acknowledged if desired;\n* Screen size constraints;\n  * Nova Act is optimized for resolutions between `864×1296` and `1536×2304`; and\n  * Performance may degrade outside this range\n\nLearn more in the AWS AI Service Card for Amazon Nova Act.\n\n## Reference\n\n\n### Initializing `NovaAct`\n\nThe constructor accepts the following:\n\n* `starting_page (str)`: The URL of the starting page; supports both web URLs (`https://`) and local file URLs (`file://`) (required argument)\n  * Note: file URLs require passing `ignore_https_errors=True` to the constructor\n* `headless (bool)`: Whether to launch the browser in headless mode (defaults to `False`)\n* `user_data_dir (str)`: Path to a [user data directory](https://chromium.googlesource.com/chromium/src/+/master/docs/user_data_dir.md#introduction), which stores browser session data like cookies and local storage (defaults to `None`).\n* `nova_act_api_key (str)`: The API key you generated for authentication; required if the `NOVA_ACT_API_KEY` environment variable is not set. If passed, takes precedence over the environment variable.\n* `logs_directory (str)`: The directory where NovaAct will output its logs, run info, and videos (if `record_video` is set to `True`).\n* `record_video (bool)`: Whether to record video and save it to `logs_directory`. `logs_directory` must be specified for video to be recorded.\n* `proxy (dict)`: Proxy configuration for the browser. Should be a dictionary containing:\n  * `server` (required): The proxy server URL (must start with `http://` or `https://`)\n  * `username` (optional): Username for proxy authentication\n  * `password` (optional): Password for proxy authentication\n  * Note: Proxy is not supported when connecting to a CDP endpoint or using the default Chrome browser\n* `human_input_callbacks` (optional): An implementation of human input callbacks. 
If not provided, a request for human input tool will not be made.\n* `tools` (optional): A list of client-provided tools.\n\nThis creates one browser session. You can create as many browser sessions as you wish and run them in parallel, but a single session must be single-threaded.\n\n### Actuating the browser\n\n#### Use act\n\n`act()` takes a natural language prompt from the user and will actuate on the browser window on behalf of the user to achieve the goal. Arguments:\n\n* `max_steps` (int): Configure the maximum number of steps (browser actuations) `act()` will take before giving up on the task. Use this to make sure the agent doesn't get stuck forever trying different paths. Default is 30.\n* `timeout` (int): Timeout in seconds for the entire `act()` call. Prefer using `max_steps`, as time per step can vary based on model server load and website latency.\n* `observation_delay_ms`: Additional delay in milliseconds before taking an observation of the page. Useful to wait for UI animations to complete.\n\nReturns an `ActResult`.\n\n```python\nclass ActResult:\n    metadata: ActMetadata\n\nclass ActMetadata:\n    session_id: str | None\n    act_id: str | None\n    num_steps_executed: int\n    start_time: float\n    end_time: float\n    prompt: str\n```\n\nIf a schema is passed to `act()` (the `act_get()` function conveniently provides a default `STRING_SCHEMA`), then the returned object will be an `ActGetResult`, a subclass which includes the raw and structured response:\n\n```python\nclass ActGetResult(ActResult):\n    response: str | None\n    parsed_response: JSONType\n    valid_json: bool | None\n    matches_schema: bool | None\n```\n\n#### Do it programmatically\n\n`NovaAct` exposes a Playwright [`Page`](https://playwright.dev/python/docs/api/class-page) object directly under the `page` attribute.\n\nThis can be used to retrieve the current state of the browser, for example a screenshot or the DOM, or to actuate it:\n\n```python\nscreenshot_bytes = 
nova.page.screenshot()\ndom_string = nova.page.content()\nnova.page.keyboard.type(\"hello\")\n```\n\n## Disclosures\n\nNote: When using the Nova Act Playground and/or choosing Nova Act developer tools with API key authentication, access and use are subject to the nova.amazon.com Terms of Use. When choosing Nova Act developer tools with AWS IAM authentication and/or deploying workflows to the Nova Act AWS service, your AWS Service Terms and/or Customer Agreement (or other agreement governing your use of the AWS Service) apply.\n\n1. Nova Act may not always get it right.\n2. ⚠️ Please be aware that Nova Act may encounter commands in the content it observes on third party websites, including user-generated content on trusted websites such as social media posts, search results, forum comments, news articles, and document attachments. These unauthorized commands, known as prompt injections, may cause the model to make mistakes or act in a manner that differs from its instructions, such as ignoring your instructions, performing unauthorized actions, or exfiltrating sensitive data. To reduce the risks associated with prompt injections, it is important to monitor Nova Act and review its actions, especially when processing untrusted user-contributed content.\n3. We recommend you do not provide sensitive information to Nova Act, such as account passwords. Note that if you use sensitive information through Playwright calls, the information could be collected in screenshots if it appears unobstructed on the browser when Nova Act is engaged in completing an action. (See [Entering sensitive information](#entering-sensitive-information) above.)\n4. When choosing developer tools on nova.amazon.com/act with API key authentication, we collect information on interactions with Nova Act, including in-browser screenshots to develop and improve our services. Email us at nova-act@amazon.com to request deletion of your Nova Act data.\n5. Do not share your API key generated on https://nova.amazon.com/act. 
Anyone with access to your API key can use it to operate Nova Act under your Amazon account. If you lose your API key or believe someone else may have access to it, go to https://nova.amazon.com/act to deactivate your key and obtain a new one.\n6. If you are using our browsing environment defaults, look for `NovaAct` in the user agent string to identify our agent. If you operate Nova Act in your own browsing environment or customize the user agent, we recommend that you include that same string.\n\n## Report a Bug\n\nHelp us improve! If you notice any issues, please let us know by submitting a bug report via nova-act@amazon.com. \n\n\nBe sure to include the following in the email:\n- Description of the issue;\n- Session ID, which will have been printed out as a console log message; and\n- Script of the workflow you are using.\n\nYour feedback is valuable in ensuring a better experience for everyone.\n\nThanks for experimenting with Nova Act!\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faws%2Fnova-act","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faws%2Fnova-act","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faws%2Fnova-act/lists"}