{"id":16759455,"url":"https://github.com/pogzyb/port43","last_synced_at":"2026-04-22T21:34:05.188Z","repository":{"id":226523315,"uuid":"744292796","full_name":"pogzyb/port43","owner":"pogzyb","description":"A set of open-source Information Security tools for the 🦜🔗 LangChain framework","archived":false,"fork":false,"pushed_at":"2024-03-05T00:17:59.000Z","size":39,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-09-23T09:40:18.317Z","etag":null,"topics":["cybersecurity","dns","huggingface","information-security","langchain","rdap","whois"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pogzyb.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2024-01-17T01:59:35.000Z","updated_at":"2025-04-02T18:17:58.000Z","dependencies_parsed_at":"2024-03-08T03:22:55.599Z","dependency_job_id":"2d09101f-0c96-44fa-8068-8e23e91e32b4","html_url":"https://github.com/pogzyb/port43","commit_stats":null,"previous_names":["pogzyb/port43"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/pogzyb/port43","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pogzyb%2Fport43","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pogzyb%2Fport43/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pogzyb%2Fport43/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pogzyb%2Fport43/manifests","owner_url":"https://repos.ecosyste.ms/api/
v1/hosts/GitHub/owners/pogzyb","download_url":"https://codeload.github.com/pogzyb/port43/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pogzyb%2Fport43/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278981512,"owners_count":26079640,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-08T02:00:06.501Z","response_time":56,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cybersecurity","dns","huggingface","information-security","langchain","rdap","whois"],"created_at":"2024-10-13T04:08:10.828Z","updated_at":"2025-10-08T16:56:50.207Z","avatar_url":"https://github.com/pogzyb.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![PyPI version](https://badge.fury.io/py/port43.svg)](https://badge.fury.io/py/port43)\n![package workflow](https://github.com/pogzyb/port43/actions/workflows/python-package.yml/badge.svg)\n[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)\n\n# 🤿 port43\n\n⚠️ **[work-in-progress]**\n\nA set of open-source Information Security tools for the 🦜🔗 LangChain framework\n\n### Premise\n\nPort43 can help you build Information Security-based LLM applications.\n\nA few use-cases include ...\n- Enabling Threat and SOC Analysts to query SIEM's using natural language\n- Parsing and 
extracting data from DNS, WHOIS, and RDAP queries\n- Gathering HTML, favicons, certificates, or screenshots from phishing sites on the internet\n- Connecting popular Information Security API's (shodan, virustotal, etc.) with LLM's\n\n... or combining any or all of the steps above into a single workflow!\n\n### Quickstart\n\nCheck out the `examples/` folder for each example's complete code.\n\n#### Basic example: WHOIS\n\nWHOIS is a query and response protocol that is used for querying databases \nthat store an Internet resource's registered users or assignees - [Wikipedia](https://en.wikipedia.org/wiki/WHOIS)\n\nUnlike the modern RDAP standard, which uses a JSON schema, WHOIS responses follow a semi-free text format. \nIn other words, WHOIS is [\"Fragile, unparseable, obsolete... and universally relied upon\"](https://www.netmeister.org/blog/whois.html)\n\nIn order to parse WHOIS text responses from different registrars into a set of standardized key-value pairs that can be \nused by applications, many open-source libraries have implemented a combination of regular expressions and text mining \ntechniques. 
Despite some success, the number of edge cases and registrars with unconventional implementations has made\nWHOIS parsing feel inconsistent to many developers wishing to integrate WHOIS data into their applications.\n\nFor example, here is the authoritative output of `whois umich.edu`, which doesn't necessarily follow \nthe conventional single-line key:value format:\n```\n-------------------------------------------------------------\n\nDomain Name: UMICH.EDU\n\nRegistrant:\n\tUniversity of Michigan -- ITD\n\tITCS, Arbor Lakes\n\t4251 Plymouth Road\n\tAnn Arbor, MI 48105-2785\n\tUSA\n\nAdministrative Contact:\n\tDomain Admin\n\tUniversity of Michigan\n\tITS, Arbor Lakes\n\t4251 Plymouth Road\n\tAnn Arbor, MI 48105-3640\n\tUSA\n\t+1.7347641817\n\tdomainreg@umich.edu\n\nTechnical Contact:\n\t \n\tUniversity of Michigan\n\tITS, Arbor Lakes\n\t4251 Plymouth Road\n\tAnn Arbor, MI 48105-3640\n\tUSA\n\t+1.7347641817\n\tdomainreg@umich.edu\n\nName Servers:\n\tUMICH-EDU.DNS.UMICH.COM\n\tUMICH-EDU.DNS.UMICH.ORG\n\tUMICH-EDU.DNS.UMICH.NET\n\nDomain record activated:    07-Oct-1985\nDomain record last updated: 04-Jan-2024\nDomain expires:             31-Jul-2024\n\n```\n\nFortunately, the ever-growing capabilities of LLM's have made it possible to frame this problem in terms of an \"AI-assistant\"\n(aka ChatModel), leading to impressive results with zero pre- and post-processing.\n\nHere is some example code:\n\n```python\n# get a blob of WHOIS text\ntext, _ = asyncwhois.whois(\"umich.edu\", authoritative_only=True)\n# craft a prompt to extract key/values from the whois text\n# the prompt asks the LLM to take the text and convert it into a standardized JSON format\nprompt = WhoisTextToJson  # port43.prompts.whois_text_to_json.py\n# pull any open-source LLM from HuggingFace\n# or use Ollama: model = ChatOllama(model=\"mistral\")\nllm = HuggingFaceHub(\n    repo_id=\"HuggingFaceH4/zephyr-7b-beta\",\n    task=\"text-generation\",\n    
huggingfacehub_api_token=\u003cHF_API_TOKEN\u003e,\n    model_kwargs={\"max_new_tokens\": 2048},\n)\n# wrapper for HuggingFace LLM's\nmodel = ChatHuggingFace(llm=llm)\n# LCEL\nchain = prompt | model | StrOutputParser()\n# view the result\npprint(chain.invoke(input={\"data\": text}))\n```\n\u003cdetails\u003e\n  \u003csummary\u003eView the Result\u003c/summary\u003e\n  \nNote that there is absolutely no postprocessing of the LLM output. The LLM\nwas able to match all keys/values on its own. Further processing could be added to\nconvert timestamps, fill-in null values, or modify values for a specific use-case.\n\n```json\n{\n  \"admin_address\": \"University of Michigan -- ITD\\\\nITCS, Arbor Lakes\\\\n4251 Plymouth Road\\\\nAnn Arbor, MI 48105-2785\\\\nUSA\",\n  \"admin_city\": \"Ann Arbor\",\n  \"admin_country\": \"USA\",\n  \"admin_email\": \"domainreg@umich.edu\",\n  \"admin_fax\": \"+1.7347641817\",\n  \"admin_id\": \"\",\n  \"admin_name\": \"\",\n  \"admin_organization\": \"University of Michigan -- ITD\",\n  \"admin_phone\": \"+1.7347641817\",\n  \"admin_state\": \"\",\n  \"admin_zipcode\": \"48105-3640\",\n  \"billing_address\": \"University of Michigan -- ITD\\\\nITCS, Arbor Lakes\\\\n4251 Plymouth Road\\\\nAnn Arbor, MI 48105-3640\\\\nUSA\",\n  \"billing_city\": \"Ann Arbor\",\n  \"billing_country\": \"USA\",\n  \"billing_email\": \"\",\n  \"billing_fax\": \"+1.7347641817\",\n  \"billing_id\": \"\",\n  \"billing_name\": \"\",\n  \"billing_organization\": \"University of Michigan -- ITD\",\n  \"billing_phone\": \"+1.7347641817\",\n  \"billing_state\": \"\",\n  \"billing_zipcode\": \"48105-3640\",\n  \"created\": \"07-Oct-1985\",\n  \"dnssec\": \"\",\n  \"domain_name\": \"UMICH.EDU\",\n  \"expires\": \"31-Jul-2024\",\n  \"name_servers\": [\n    \"UMICH-EDU.DNS.UMICH.ORG\",\n    \"UMICH-EDU.DNS.UMICH.NET\",\n    \"UMICH-EDU.DNS.UMICH.COM\"\n  ],\n  \"registrant_address\": \"University of Michigan -- ITD\\\\nITCS, Arbor Lakes\\\\n4251 Plymouth Road\\\\nAnn 
Arbor, MI 48105-2785\\\\nUSA\",\n  \"registrant_city\": \"Ann Arbor\",\n  \"registrant_country\": \"USA\",\n  \"registrant_email\": \"\",\n  \"registrant_fax\": \"+1.7347641817\",\n  \"registrant_id\": \"\",\n  \"registrant_name\": \"\",\n  \"registrant_organization\": \"University of Michigan -- ITD\",\n  \"registrant_phone\": \"+1.7347641817\",\n  \"registrant_state\": \"\",\n  \"registrant_zipcode\": \"48105-2785\",\n  \"registrar\": \"\",\n  \"registrar_abuse_email\": \"\",\n  \"registrar_abuse_phone\": \"\",\n  \"registrar_iana_id\": \"\",\n  \"registrar_url\": \"\",\n  \"status\": [\n    \"active\"\n  ],\n  \"tech_address\": \"University of Michigan\\\\nITS, Arbor Lakes\\\\n4251 Plymouth Road\\\\nAnn Arbor, MI 48105-3640\\\\nUSA\",\n  \"tech_city\": \"Ann Arbor\",\n  \"tech_country\": \"USA\",\n  \"tech_email\": \"\",\n  \"tech_fax\": \"+1.7347641817\",\n  \"tech_id\": \"\",\n  \"tech_name\": \"\",\n  \"tech_organization\": \"University of Michigan\",\n  \"tech_phone\": \"+1.7347641817\",\n  \"tech_state\": \"\",\n  \"tech_zipcode\": \"48105-3640\",\n  \"updated\": \"04-Jan-2024\"\n}\n```\n\u003c/details\u003e\n\nThis whois example is just scratching the surface of what kind of problems LLM's can tackle. \nAgain, the goal of Port43 is to highlight more use-cases and expand AI-first information security workflows. 
\n\n#### Basic Agent: Finding DNS Records\n\n```python\n# add some tools\ntools = [DNSTool(), WHOISTool()]\n# get the ReAct prompt\nprompt = get_react_json_prompt(tools, render_args=True)\n# init any LLM; in this example we're using mistral via Ollama\n# figure out how to use Ollama here: https://ollama.com\nllm = ChatOllama(model=\"mistral\", temperature=0)\n# have the model stop after solving the exercise\nchat_model_with_stop = llm.bind(stop=[\"\\nObservation\"])\n# create the agent\nagent = (\n    {\n        \"input\": lambda x: x[\"input\"],\n        \"chat_history\": lambda x: (\n            _format_chat_history(x[\"chat_history\"]) if x.get(\"chat_history\") else []\n        ),\n        \"agent_scratchpad\": lambda x: format_log_to_messages(\n            x[\"intermediate_steps\"]\n        ),\n    }\n    | prompt\n    | chat_model_with_stop\n    | ReActJsonSingleInputOutputParser()\n)\n# create an executor\nagent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)\npprint(\n    agent_executor.invoke(\n        {\n            \"input\": \"How many DNS records does google.com have? What are the MX records?\"\n        }\n    )\n)\n```\n\n\u003cdetails\u003e\n  \u003csummary\u003eView the Result\u003c/summary\u003e\n\n`examples/scripts/basic_react_agent_01.py`\n\n```python\n\"\"\"\n\u003e Entering new AgentExecutor chain...\n Thought: I need to find out how many DNS records google.com has and what its MX records are. I can use the dns_search tool for this.\nAction:```json\n{\n    \"action\": \"dns_search\",\n    \"action_input\": {\n        \"hostname\": \"google.com\"\n    }\n}\n```{\n  \"A\": \"142.250.191.142\",\n  \"NS\": \"ns4.google.com.\",\n  \"SOA\": \"ns1.google.com. dns-admin.google.com. 
611883130 900 900 1800 60\",\n  \"MX\": \"10 smtp.google.com.\",\n  \"TXT\": \"\\\"apple-domain-verification=30afIBcvSuDV2PLX\\\"\",\n  \"AAAA\": \"2607:f8b0:4009:818::200e\",\n  \"CAA\": \"0 issue \\\"pki.goog\\\"\"\n} Observation: The DNS records for google.com include one A record, two NS records, one SOA record, one MX record, one TXT record, one AAAA record, and one CAA record. The MX record is \"10 smtp.google.com.\"\nThought: I now have the information to answer the original question.\nFinal Answer: Google.com has a total of 7 DNS records, including 1 A record, 2 NS records, 1 SOA record, 1 MX record, 1 TXT record, 1 AAAA record, and 1 CAA record. The MX records are \"10 smtp.google.com.\"\n\n\u003e Finished chain.\n{'input': 'How many DNS records does google.com have? What are the MX records?',\n 'output': 'Google.com has a total of 7 DNS records, including 1 A record, 2 '\n           'NS records, 1 SOA record, 1 MX record, 1 TXT record, 1 AAAA '\n           'record, and 1 CAA record. The MX records are \"10 smtp.google.com.\"'}\n\"\"\"\n```\n\n\u003c/details\u003e\n\n#### Advanced use-case: Threat Hunting using Natural Language\n\n```python\ncoming soon...\n```\n\n#### Advanced use-case: Domain Monitoring \u0026 Phishing Detection\n\n```python\ncoming soon...\n```\n\n### Roadmap\n- Continue to expand the number of Tools\n  - common interface for SIEM query integrations (Splunk, Elasticsearch, SumoLogic, etc.)\n  - popular infosec API's (shodan, virustotal, ..., etc.)\n  - popular open-source cli libraries (dnstwist, ..., etc.)  \n- Add examples for advanced use-cases\n- Abstract some of the LangChain Agent setup","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpogzyb%2Fport43","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpogzyb%2Fport43","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpogzyb%2Fport43/lists"}