{"id":20925293,"url":"https://github.com/apehex/gpm","last_synced_at":"2026-04-25T00:36:24.650Z","repository":{"id":238309008,"uuid":"753050979","full_name":"apehex/gpm","owner":"apehex","description":"A stateless password manager, with an AI twist.","archived":false,"fork":false,"pushed_at":"2026-04-04T15:29:03.000Z","size":4921,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-04-04T16:56:19.846Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/apehex.png","metadata":{"files":{"readme":".github/readme.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":"docs/roadmap.md","authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-02-05T11:23:43.000Z","updated_at":"2026-04-04T15:29:07.000Z","dependencies_parsed_at":"2026-01-12T20:02:19.967Z","dependency_job_id":null,"html_url":"https://github.com/apehex/gpm","commit_stats":null,"previous_names":["apehex/gpm"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/apehex/gpm","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apehex%2Fgpm","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apehex%2Fgpm/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apehex%2Fgpm/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apehex%2Fgpm/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHu
b/owners/apehex","download_url":"https://codeload.github.com/apehex/gpm/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apehex%2Fgpm/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32246393,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-24T13:21:15.438Z","status":"ssl_error","status_checked_at":"2026-04-24T13:21:15.005Z","response_time":64,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-18T20:30:42.070Z","updated_at":"2026-04-25T00:36:24.632Z","avatar_url":"https://github.com/apehex.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# GPM: Generative Password Manager\n\n\u003cimg src=\"header.png\" alt=\"Neural tokenization\" title=\"Source: Image by Author and generated with MidJourney\" width=\"100%\" style=\"margin: auto;\"/\u003e\n\n\u003e Stateless password manager, powered by ML neural networks.\n\nPassword management is up there with cookie popups and ads, a major pain in the ass.\n\nTurns out you don't need to *manage* passwords, they can all be derived from a single master key.\nHere's an elegant implementation using tools from the AI field.\n\n## Features\n\n\u003e Passwords are **never stored**, so they can't be leaked\n\n\u003e Passwords are **never transmitted**, there is no need to sync devices\n\n\u003e 
All the passwords are generated from a **single master key**\n\n## Principle\n\nContrary to traditional password managers, the passwords are not saved on disk:\nthey are (re)generated each time.\n\nThe master key is encoded and used as seed to initiate the random number generator.\nThanks to this generator, the tensor weights of a MLP are filled.\n\nThis MLP model then takes the login information as input and outputs a password.\n\nEven though the process generates high entropy passwords, it is deterministic and will always output the same password for a given login.\n\n## Usage\n\nOnly 3 arguments are required:\n\n```shell\npython  gpm/main.py --key 'never seen before combination of letters' --target 'http://example.com' --id 'user@e.mail'\n# YRLabEDKqWQrN6JF\n```\n\n- the master key\n- the login target\n- the login id\n\nIf they are not specified on the command line, the user will be prompted during the execution:\n\n```shell\npython  gpm/main.py\n# \u003e Master key:\n# never seen before combination of letters\n# \u003e Login target:\n# http://example.com\n# \u003e Login id:\n# user@mail.com\n```\n\nThe full list of parameters is the following:\n\n```shell\nGenerate / retrieve the password matching the input information\n\noptional arguments:\n  -h, --help                                    show this help message and exit\n  --key MASTER_KEY, -k MASTER_KEY               the master key (all ASCII)\n  --target LOGIN_TARGET, -t LOGIN_TARGET        the login target (URL, IP, name, etc)\n  --id LOGIN_ID, -i LOGIN_ID                    the login id (username, email, etc)\n  --length PASSWORD_LENGTH, -l PASSWORD_LENGTH  the length of the password (default 16)\n  --nonce PASSWORD_NONCE, -n PASSWORD_NONCE     the nonce of the password\n  --lower, -a                                   exclude lowercase letters from the password\n  --upper, -A                                   exclude uppercase letters from the password\n  --digits, -d                                  
exclude digits from the password\n  --symbols, -s                                 include symbols in the password\n```\n\n## Process Overview\n\nThe user provides:\n\n- a master key\n- the login information:\n    - target URL\n    - user ID\n- the password properties:\n    - its length\n    - the composition of its vocabulary (upper / lower letters, numbers, symbols)\n    - a nonce, to allow multiple passwords per website\n\nThese inputs are then processed:\n\n0. set up the hyper-parameters:\n    - use the whole ASCII table as input vocabulary and save its shape\n    - compose the output vocabulary and save its shape\n    - cast the master key into an integer seed\n1. preprocess / clean the string inputs:\n    - remove unwanted characters\n    - normalize the strings\n2. encode the inputs as a sequence tensor X for the MLP:\n    - map the input characters to integers\n    - add entropy to avoid repetitions in the output\n    - format as a tensor\n3. create the model corresponding to the hyper-parameters\n4. sample / generate the password as a tensor Y\n5. decode the probability tensor Y into an actual password string\n\n## 0. Setup The Hyper Parameters\n\nThe generative function is an MLP, defined by the following hyper-parameters:\n\n- the seed for the random number generators\n- the tensor shapes\n- the input vocabulary (all the ASCII characters)\n- the output vocabulary (alpha / numbers / symbols)\n- the password length, which is the length of the sampling\n\nSome of these are fixed:\n\n```python\n# size of the input / output vocabularies\nN_INPUT_DIM = len(INPUT_VOCABULARY)\nN_OUTPUT_DIM = N_INPUT_DIM\n# shapes of the inner layers of the MLP \nN_CONTEXT_DIM = 8\nN_EMBEDDING_DIM = 128\n# default properties of the password\nN_PASSWORD_DIM = 16\nN_PASSWORD_NONCE = 1\n```\n\nOnly `N_OUTPUT_DIM`, `N_PASSWORD_DIM` and `N_PASSWORD_NONCE` can be overridden by the user.\n\n### 0.1. 
Defining The Input Vocabulary\n\nThe inputs are projected on the ASCII table: all non-ASCII characters are ignored.\n\nThis vocabulary is fixed, whatever the user typed:\n\n```python\nINPUT_VOCABULARY = ''.join(chr(__i) for __i in range(128))\n```\n\n### 0.2. Composing The Output Vocabulary\n\nThe output vocabulary dictates the composition of the model output, i.e. the password.\n\nThis vocabulary can contain:\n\n- lowercase letters\n- uppercase letters\n- digits\n- ASCII symbols, apart from the quotes `\"` and `'`\n\n```python\nVOCABULARY_ALPHA_UPPER = ''.join(chr(__i) for __i in range(65, 91))                             # A-Z\nVOCABULARY_ALPHA_LOWER = VOCABULARY_ALPHA_UPPER.lower()                                         # a-z\nVOCABULARY_NUMBERS = '0123456789'                                                               # 0-9\nVOCABULARY_SYMBOLS = ''.join(chr(__i) for __i in range(33, 48) if chr(__i) not in [\"'\", '\"'])   # !#$%\u0026()*+,-./\n```\n\nIt is generated from the user preferences with:\n\n```python\ndef compose(lower: bool=True, upper: bool=True, digits: bool=True, symbols: bool=False) -\u003e list:\n    # returns the sorted list of unique characters allowed in the password\n    return sorted(set(lower * VOCABULARY_ALPHA_LOWER + upper * VOCABULARY_ALPHA_UPPER + digits * VOCABULARY_NUMBERS + symbols * VOCABULARY_SYMBOLS))\n```\n\nBy default it is:\n\n```python\n''.join(compose(1, 1, 1, 0))\n# '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'\n```\n\nAnother possibility would be to form the password out of whole words.\n\n### 0.3. 
Casting The Master Key Into The Seed\n\nA naive approach is to interpret the master key as a HEX sequence, then cast it into an integer seed:\n\n```python\ndef seed(key: str) -\u003e int:\n    __key = ''.join(__c for __c in key if ord(__c) \u003c 128) # keep only ASCII characters\n    return int(bytes(__key, 'utf-8').hex(), 16) % (2 ** 32) # dword\n```\n\nThis doesn't work though, since only the last 4 bytes of the key affect the seed:\n\n```python\nseed('never seen before combination of letters')\n# 1952805491\nseed('combination of letters')\n# 1952805491\nb'combination of letters'.hex()\n# '636f6d62696e6174696f6e206f66206c657474657273'\n```\n\nThe encoding of the string `'combination of letters'` requires 22 bytes, so it is greater than `2 ** 168`.\nPrepending a prefix means adding a multiple of `2 ** 176`, which leaves the value unchanged modulo `2 ** 32`.\n\nTo separate the encodings of similar master keys, the key is first hashed using `sha256`:\n\n```python\ndef seed(key: str) -\u003e int:\n    __key = ''.join(__c for __c in key if ord(__c) \u003c 128) # keep only ASCII characters\n    __hash = hashlib.sha256(string=__key.encode('utf-8')).hexdigest()\n    return int(__hash[:8], 16) # take the first 4 bytes: the seed is lower than 2 ** 32\n```\n\nNow:\n\n```python\nseed('never seen before combination of letters')\n# 3588870616\nseed('combination of letters')\n# 3269272188\n```\n\n## 1. Preprocessing The Inputs\n\nThe inputs are the login information for which the user wants a password:\n\n- login target\n- login id\n\nBefore being handed to the model, they need to be preprocessed to guarantee that the output matches the user expectations.\n\n### 1.1. Removing Unwanted Characters\n\nFirst, the inputs should be cleaned to:\n\n- remove spaces: they serve no purpose and are usually typos, like `http://example. 
com` \n- remove unicode characters: many typos produce invisible control characters like `chr(2002)`\n\nSpaces can be removed with:\n\n```python\ndef remove_spaces(text: str) -\u003e str:\n    return text.replace(' ', '').replace('\\t', '')\n```\n\nThe encoding function detailed below automatically replaces characters outside of the input vocabulary (ASCII table) with the default character of index 0.\n\n### 1.2. Normalizing The Strings\n\nSeveral variants can be used to point to the same service:\n\n```\nexample.com\nhttps://example.com\nhttps://example.com/\nExamPLE.COM\n```\n\nSo they need to be normalized with:\n\n```python\ndef remove_prefix(text: str) -\u003e str:\n    __r = r'^((?:ftp|https?):\\/\\/)'\n    return re.sub(pattern=__r, repl='', string=text, flags=re.IGNORECASE)\n\ndef remove_suffix(text: str) -\u003e str:\n    __r = r'(\\/+)$'\n    return re.sub(pattern=__r, repl='', string=text, flags=re.IGNORECASE)\n```\n\nIn the end:\n\n```python\ndef preprocess(target: str, login: str) -\u003e str:\n    __left = remove_suffix(text=remove_prefix(text=remove_spaces(text=target.lower())))\n    __right = remove_spaces(text=login.lower())\n    return __left + '|' + __right\n```\n\n```python\npreprocess(target='example.com', login='user')\n# 'example.com|user'\npreprocess(target='https://example.com', login='user')\n# 'example.com|user'\npreprocess(target='example.com/', login='USER')\n# 'example.com|user'\n```\n\n## 2. Encoding The Inputs\n\n### 2.1. 
Mapping The Characters To Integers\n\nThe mapping between character and integer is a straightforward enumeration:\n\n```python\ndef mappings(vocabulary: list) -\u003e dict:\n    __itos = {__i: __c for __i, __c in enumerate(vocabulary)}\n    __stoi = {__c: __i for __i, __c in enumerate(vocabulary)}\n    # blank placeholder\n    __blank_c = __itos[0] # chr(0)\n    __blank_i = 0\n    # s =\u003e i\n    def __encode(c: str) -\u003e int:\n        return __stoi.get(c, __blank_i)\n    # i =\u003e s\n    def __decode(i: int) -\u003e str:\n        return __itos.get(i, __blank_c)\n    # return both\n    return {'encode': __encode, 'decode': __decode}\n```\n\nIt will remove all the characters outside the input vocabulary, EG unicode characters.\n\n### 2.2. Adding Entropy\n\nWith a character level embedding the input tensor would look like:\n\n```python\narray([101, 120,  97, 109, 112, 108, 101,  46,  99, 111, 109, 124, 117, 115, 101, 114], dtype=int32)\n```\n\nWhich means that *each repetition in the input would also yield a repetition in the output password*.\n\nJust like regular transformer models, using a context as input will make each sample more unique.\nInstead of a single character, a sample is now composed of the N latest characters:\n\n```python\narray([[  0,   0,   0,   0,   0,   0,   0,   0],\n       [  0,   0,   0,   0,   0,   0,   0, 101],\n       [  0,   0,   0,   0,   0,   0, 101, 120],\n       [  0,   0,   0,   0,   0, 101, 120,  97],\n       [  0,   0,   0,   0, 101, 120,  97, 109],\n       [  0,   0,   0, 101, 120,  97, 109, 112],\n       [  0,   0, 101, 120,  97, 109, 112, 108],\n       [  0, 101, 120,  97, 109, 112, 108, 101],\n       [101, 120,  97, 109, 112, 108, 101,  46],\n       [120,  97, 109, 112, 108, 101,  46,  99],\n       [ 97, 109, 112, 108, 101,  46,  99, 111],\n       [109, 112, 108, 101,  46,  99, 111, 109],\n       [112, 108, 101,  46,  99, 111, 109, 124],\n       [108, 101,  46,  99, 111, 109, 124, 117],\n       [101,  46,  99, 111, 109, 
124, 117, 115],\n       [ 46,  99, 111, 109, 124, 117, 115, 101]], dtype=int32)\n```\n\nThis can still be improved.\nAs long as the process is deterministic, the input can be modified in any way.\n\nFor example, the successive ordinal values can be accumulated:\n\n```python\ndef accumulate(x: int, y: int, n: int) -\u003e int:\n    return (x + y) % n\n```\n\nThe modulo guarantees that the encoding stays within the range of the ASCII encoding:\n\n```python\n__func = lambda __x, __y: accumulate(x=__x, y=__y + N_PASSWORD_NONCE, n=N_INPUT_DIM)\nlist(itertools.accumulate(iterable=__source, func=__func))\n# [101, 94, 64, 46, 31, 12, 114, 33, 5, 117, 99, 96, 86, 74, 48, 35]\n```\n\nAlso the context can start from the current index, instead of ending on it.\nFinally the encoded input can be cycled through to create an infinite iterator:\n\n```python\ndef feed(source: list, nonce: int, dimension: int) -\u003e iter:\n    __func = lambda __x, __y: accumulate(x=__x, y=__y + nonce, n=dimension) # add entropy by accumulating the encodings\n    return itertools.accumulate(iterable=itertools.cycle(source), func=__func) # infinite iterable\n```\n\nThis makes it possible to create passwords longer than the input text.\n\n### 2.3. 
Formatting As A Tensor\n\nFinally, the iterator of encoded inputs is used to generate the tensor X:\n\n```python\ndef tensor(feed: 'Iterable[int]', length: int, context: int) -\u003e tf.Tensor:\n    __x = [[next(feed) for _ in range(context)] for _ in range(length)]\n    return tf.constant(tf.convert_to_tensor(value=__x, dtype=tf.dtypes.int32))\n```\n\nThis tensor has shape `(N_PASSWORD_DIM, N_CONTEXT_DIM)`:\n\n```python\ntensor(feed=__feed, length=N_PASSWORD_DIM, context=N_CONTEXT_DIM)\n# \u003ctf.Tensor: shape=(16, 8), dtype=int32, numpy=\n# array([[101,  94,  64,  46,  31,  12, 114,  33],\n#        [  5, 117,  99,  96,  86,  74,  48,  35],\n#        [  9,   2, 100,  82,  67,  48,  22,  69],\n#        [ 41,  25,   7,   4, 122, 110,  84,  71],\n#        [ 45,  38,   8, 118, 103,  84,  58, 105],\n#        [ 77,  61,  43,  40,  30,  18, 120, 107],\n#        [ 81,  74,  44,  26,  11, 120,  94,  13],\n#        [113,  97,  79,  76,  66,  54,  28,  15],\n#        [117, 110,  80,  62,  47,  28,   2,  49],\n#        [ 21,   5, 115, 112, 102,  90,  64,  51],\n#        [ 25,  18, 116,  98,  83,  64,  38,  85],\n#        [ 57,  41,  23,  20,  10, 126, 100,  87],\n#        [ 61,  54,  24,   6, 119, 100,  74, 121],\n#        [ 93,  77,  59,  56,  46,  34,   8, 123],\n#        [ 97,  90,  60,  42,  27,   8, 110,  29],\n#        [  1, 113,  95,  92,  82,  70,  44,  31]], dtype=int32)\u003e\n```\n\nEven though the input string `'example.com|user'` had repetitions (\"e\" and \"m\"), no two lines of the tensor are the same.\n\nThe process detailed here will always produce the same tensor X.\n\n## 3. 
Creating The MLP Model\n\nNow that all the hyper-parameters are set, creating the MLP is just a formality:\n\n```python\ndef create_model(\n    seed: int,\n    n_input_dim: int,\n    n_output_dim: int,\n    n_context_dim: int=N_CONTEXT_DIM,\n    n_embedding_dim: int=N_EMBEDDING_DIM,\n) -\u003e tf.keras.Model:\n    __model = tf.keras.Sequential()\n    # initialize the weights\n    __embedding_init = tf.keras.initializers.GlorotNormal(seed=seed)\n    __dense_init = tf.keras.initializers.GlorotNormal(seed=(seed ** 2) % (2 ** 32)) # different values\n    # embedding\n    __model.add(tf.keras.layers.Embedding(input_dim=n_input_dim, output_dim=n_embedding_dim, embeddings_initializer=__embedding_init, name='embedding'))\n    # head\n    __model.add(tf.keras.layers.Reshape(target_shape=(n_context_dim * n_embedding_dim,), input_shape=(n_context_dim, n_embedding_dim), name='reshape'))\n    __model.add(tf.keras.layers.Dense(units=n_output_dim, activation='tanh', use_bias=False, kernel_initializer=__dense_init, name='head'))\n    __model.add(tf.keras.layers.Softmax(axis=-1, name='softmax'))\n    # compile\n    __model.compile(\n        optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001),\n        loss=tf.keras.losses.CategoricalCrossentropy(from_logits=False, label_smoothing=0., axis=-1, reduction=tf.keras.losses.Reduction.SUM_OVER_BATCH_SIZE, name='loss'))\n    return __model\n```\n\nFor the purpose of this POC we are using Tensorflow and Keras, but it could actually be done with basic matrix multiplications.\nNumpy would be almost as convenient to use and yield the same result.\n\n## 4. 
Sampling = Password Generation\n\nThe forward pass of the tensor X in the above model will result in the probabilities for each character in the output vocabulary.\n\nThis can be directly decoded as a string like this:\n\n```python\ndef password(model: tf.keras.Model, x: tf.Tensor, itos: callable) -\u003e str:\n    __y = tf.squeeze(model(x, training=False))\n    __p = list(tf.argmax(__y, axis=-1).numpy())\n    return _miv.decode(__p, itos=itos)\n```\n\n## Evaluation\n\nAll the operations are pieced together in the `process` function.\n\nWe can fix the internal parameters of the model like so:\n\n```python\n_process = functools.partial(\n    process,\n    password_length=32,\n    password_nonce=1,\n    include_lower=True,\n    include_upper=True,\n    include_digits=True,\n    include_symbols=False,\n    input_vocabulary=INPUT_VOCABULARY,\n    model_context_dim=N_CONTEXT_DIM,\n    model_embedding_dim=N_EMBEDDING_DIM)\n```\n\nWhich makes it easier to test the password generation:\n\n```python\n_process(master_key='test', login_target='example.com', login_id='user')\n# 'AfBOO0MGvFGikU2ZBVleuXDUFQpgR4Zg'\n_process(master_key='test', login_target='http://example.com', login_id='USER')\n# 'AfBOO0MGvFGikU2ZBVleuXDUFQpgR4Zg'\n```\n\nAs expected the whole process is deterministic:\ncalls with equivalent inputs will always yield the same password, there is no need to save it.\n\n```python\n_process(master_key='verysecretpassphrase', login_target='example.com', login_id='u s e r@EMAIL.COM')\n# '4ZUHYALvuXvcSoS1p9j7R64freclXKvf'\n_process(master_key='verysecretpassphrase', login_target='HTTPS://example.com/', login_id='user@email.com')\n# '4ZUHYALvuXvcSoS1p9j7R64freclXKvf'\n```\n\n## Improvements\n\nThis POC could be turned into a full-fledged product with:\n\n- performance improvements:\n    - use the base `numpy` instead of `tensorflow`\n    - replace the model with its base weight tensors and matrix multiplications\n- more output options:\n    - generate the password as a 
bag of words\n    - create whole sentences / quotes\n    - force the use of certain characters / sub-vocabularies (like the symbols)\n- an actual distribution as:\n    - browser extension\n    - binary executable (CLI)\n    - mobile app\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fapehex%2Fgpm","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fapehex%2Fgpm","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fapehex%2Fgpm/lists"}