{"id":16855317,"url":"https://github.com/andyxning/shortme","last_synced_at":"2025-05-16T07:07:33.632Z","repository":{"id":57486999,"uuid":"53938488","full_name":"andyxning/shortme","owner":"andyxning","description":"Yet Another URL Shortening Service in Golang","archived":false,"fork":false,"pushed_at":"2025-03-02T14:04:34.000Z","size":971,"stargazers_count":314,"open_issues_count":1,"forks_count":74,"subscribers_count":17,"default_branch":"master","last_synced_at":"2025-04-07T13:03:37.026Z","etag":null,"topics":["go","hash","sequence-counter","url-shortener"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/andyxning.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2016-03-15T11:19:29.000Z","updated_at":"2025-03-31T15:25:14.000Z","dependencies_parsed_at":"2022-09-01T23:01:31.533Z","dependency_job_id":null,"html_url":"https://github.com/andyxning/shortme","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andyxning%2Fshortme","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andyxning%2Fshortme/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andyxning%2Fshortme/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andyxning%2Fshortme/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/andyxning","download_url":"https://codeload.github.com/andyxning/shortme/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","k
ind":"github","repositories_count":254485065,"owners_count":22078767,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["go","hash","sequence-counter","url-shortener"],"created_at":"2024-10-13T13:59:06.166Z","updated_at":"2025-05-16T07:07:33.614Z","avatar_url":"https://github.com/andyxning.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"![](logo.png)  \n![](https://img.shields.io/badge/version-1.2.0-blue.svg)\n![](https://img.shields.io/badge/LICENSE-AGPL-blue.svg)\n[![Build Status](https://travis-ci.org/andyxning/shortme.svg?branch=master)](https://travis-ci.org/andyxning/shortme)\n### Introduction\n----\nShortMe is a url shortening service written in Golang.  \nIt is high-performance and scalable.  \nShortMe is ready to be used in production. Have fun with it. :)\n\n### Features\n----\n* Convert same long urls to different short urls.\n* Api support\n* Web support\n* Short url black list\n    * To avoid some words, like `f**k` and `stupid`\n    * To make sure that apis such as `/version` and `/health` are only\n    used as apis and never as short urls, so that requesting `http://127.0.0.1:3030/version` returns the version info rather than the long url corresponding to the short url \"version\".\n* Base string config in configuration file\n    * **Once this base string is specified, it must not be changed,\n    otherwise the shortened urls may not be unique and thus may conflict with\n     previous ones.**\n* Avoid short url loop\n    * In case we request the short url for an already shortened url by\n    **shortme**. 
This is meaningless and will consume more resources in\n    **shortme**.\n* Short **http** or **https** urls\n\n### Implementation\n----\nAs far as I know, there are currently three ways to implement a short url service.\n* Hash\n    * This way is straightforward. However, every hash function will eventually\n    produce collisions when the data set is large.\n* Sample\n    * This way may produce collisions, too. See the example below (this example,\n    in Python, is only used to demonstrate the collision situation).\n\n    ```python\n    \u003e\u003e\u003e import random\n    \u003e\u003e\u003e import string\n    \u003e\u003e\u003e random.sample('abc', 2)\n    ['c', 'a']\n    \u003e\u003e\u003e random.sample('abc', 2)\n    ['a', 'b']\n    \u003e\u003e\u003e random.sample('abc', 2)\n    ['c', 'b']\n    \u003e\u003e\u003e random.sample('abc', 2)\n    ['a', 'b']\n    \u003e\u003e\u003e random.sample('abc', 2)\n    ['b', 'c']\n    \u003e\u003e\u003e random.sample('abc', 2)\n    ['b', 'c']\n    \u003e\u003e\u003e random.sample('abc', 2)\n    ['c', 'a']\n    \u003e\u003e\u003e\n    ```\n* Base\n    * Just like converting bytes to base64 ascii, we can convert base10 to base62\n    and then make a map from **0 .. 61** to **a-zA-Z0-9**. In the end, we\n    get a unique string as long as we can make sure that the integer is unique.\n    So, the URL shortening question transforms into making sure we can get a\n    unique integer.\n    ShortMe uses [the method that Flickr uses](http://code.flickr.net/2010/02/08/ticket-servers-distributed-unique-primary-keys-on-the-cheap/)\n    to generate a unique integer (Auto_increment + Replace into + MyISAM).\n    Currently, we only use one backend db to generate the sequence. 
For a multiple\n    sequence counter db configuration, see [Deploy#Sequence Database]\n    (#Sequence Database)\n\n### Api\n----\n* `/version`\n    * `HTTP GET`\n    * Version info\n    * Example\n        * `curl http://127.0.0.1:3030/version`\n* `/health`\n    * `HTTP GET`\n    * Health check\n    * Example\n        * `curl http://127.0.0.1:3030/health`\n* `/short`\n    * `HTTP POST`\n    * Shorten the long url\n    * Example\n        * `curl -X POST -H \"Content-Type:application/json\" -d \"{\\\"longURL\\\": \\\"http://www.google.com\\\"}\" http://127.0.0.1:3030/short`\n* `/{a-zA-Z0-9}{1,11}`\n    * `HTTP GET`\n    * Expand the short url and return a **temporary redirect** HTTP status\n    * Example\n        * `curl -v http://127.0.0.1:3030/3`\n\n        ```bash\n            *   Trying 127.0.0.1...\n            * Connected to 127.0.0.1 (127.0.0.1) port 3030 (#0)\n            \u003e GET /3 HTTP/1.1\n            \u003e Host: 127.0.0.1:3030\n            \u003e User-Agent: curl/7.43.0\n            \u003e Accept: */*\n            \u003e\n            \u003c HTTP/1.1 307 Temporary Redirect\n            \u003c Location: http://www.google.com\n            \u003c Date: Fri, 15 Apr 2016 07:25:24 GMT\n            \u003c Content-Length: 0\n            \u003c Content-Type: text/plain; charset=utf-8\n            \u003c\n            * Connection #0 to host 127.0.0.1 left intact\n        ```\n\n### Web\n----\nThe web interface is mainly used to make the url shortening service more intuitive.\n\nFor the **short** option, the shortened url, its qr code and the \ncorresponding long page are shown.\n\nFor the **expand** option, the expanded url, its qr code and the \ncorresponding expanded page are shown. 
\n\n![](shortme_record.gif)\n\n\n### Superstratum Projects\nProjects that use `short-me`.\n\n* [short-url](https://github.com/sillyhatxu/short-url)\n\n\n### Install\n----\n#### Dependency\n----\n* Golang\n* Mysql\n\n#### Compile\n----\n```bash\nmkdir -p $GOPATH/src/github.com/andyxning\ncd $GOPATH/src/github.com/andyxning\ngit clone https://github.com/andyxning/shortme.git\n\ncd shortme\nmake build\n```\n\n#### Database Schema\n----\nWe use two databases. Import the two schemas.\n* shortme\n    * Store short url info\n    * [shortme schema](schema/shortme.sql)\n* sequence\n    * sequence generator\n    * [sequence schema](schema/sequence.sql)\n\n#### Configuration\n----\n```\n[http]\n# Listen address\nlisten = \"0.0.0.0:3030\"\n\n[sequence_db]\n# Mysql sequence generator DSN\ndsn = \"sequence:sequence@tcp(127.0.0.1:3306)/sequence\"\n\n# Mysql connection pool max idle connection\nmax_idle_conns = 4\n\n# Mysql connection pool max open connection\nmax_open_conns = 4\n\n[short_db]\n# Mysql short service read db DSN (user with select privilege, see Grant)\nread_dsn = \"shortme_r:shortme_r@tcp(127.0.0.1:3306)/shortme\"\n\n# Mysql short service write db DSN (user with insert privilege, see Grant)\nwrite_dsn = \"shortme_w:shortme_w@tcp(127.0.0.1:3306)/shortme\"\n\n# Mysql connection pool max idle connection\nmax_idle_conns = 8\n\n# Mysql connection pool max open connection\nmax_open_conns = 8\n\n[common]\n# short urls that must not be used\nblack_short_urls = [\"version\",\"health\",\"short\",\"expand\",\"css\",\"js\",\"fuck\",\"stupid\"]\n\n# Base string used to generate short url\nbase_string = \"Ds3K9ZNvWmHcakr1oPnxh4qpMEzAye8wX5IdJ2LFujUgtC07lOTb6GYBQViSfR\"\n\n# Short url service domain name. This is used to filter short url loop.\ndomain_name = \"short.me:3030\"\n\n# Short url service schema: http or https.\nschema = \"http\"\n```\n#### Capacity\n----\nWe use a Mysql `unsigned bigint` type to store the sequence counter. 
According\n to the [Mysql doc](http://dev.mysql.com/doc/refman/5.7/en/integer-types.html)\n we can get `18446744073709551616` different integers.\n However, according to the [Golang doc about `LastInsertId`](https://golang.org/pkg/database/sql/driver/#RowsAffected.LastInsertId)\n the returned auto increment integer can only be `int64`, which makes the\n usable sequence range smaller than that of `uint64`. Even so, we can still get\n `9223372036854775808` different integers and this will be large enough\n for most services.  \n\nSupposing that we consume `100,000,000` short urls per day, the\nsequence counter can last for `2 ** 63 / 100000000 / 365 = 252695124` years.\n\n#### Short URL Length\n----\nThe max string length needed for encoding `2 ** 63` integers will be **11**,\nbecause `62 ** 10 \u003c 2 ** 63 \u003c 62 ** 11`.\n\n```python\n\u003e\u003e\u003e 62 ** 10\n839299365868340224\n\u003e\u003e\u003e 2 ** 63\n9223372036854775808L\n\u003e\u003e\u003e 62 ** 11\n52036560683837093888L\n```\n\n#### Grant\n----\nAfter setting up the databases and before running **shortme**, make sure that\nthe corresponding users have been granted the required privileges. After logging in to the mysql console, run the following sql statements:\n* `grant insert, delete on sequence.* to 'sequence'@'%' identified by 'sequence'`\n* `grant insert on shortme.* to 'shortme_w'@'%' identified by 'shortme_w'`\n* `grant select on shortme.* to 'shortme_r'@'%' identified by 'shortme_r'`\n\n#### Run\n----\n* make sure that the `static` directory is in the same directory as the **shortme** binary\n* `./shortme -c config.conf`\n\n### Deploy\n----\n\n#### \u003ca name=\"Sequence Database\"\u003e\u003c/a\u003eSequence Database\n----\nIn the [Flickr blog](http://code.flickr.net/2010/02/08/ticket-servers-distributed-unique-primary-keys-on-the-cheap/),\nFlickr suggests that we can use two databases with one for even sequence and\nthe other one for odd sequence. This makes the sequence generator more\navailable in case one database is down and also spreads the load of\ngenerating sequences. 
After splitting the sequence db from one into several, we can use\n[HaProxy](http://www.haproxy.org/) as a reverse proxy so that multiple sequence\ndatabases can be used as one. As for the load balancing algorithm, I think **round\nrobin** is good enough for this situation.\n\nIn the two-database situation, we should add the following configuration to each\n database configuration file.\n* First database\n\n```\nauto_increment_offset 1\nauto_increment_increment 2\n```\n\n* Second database\n\n```\nauto_increment_offset 2\nauto_increment_increment 2\n```\n\nThen, each time we need to generate a sequence counter, we can execute the sql\nstatement below:  \n`replace into sequence(stub) values(\"sequence\")`\n\nIn case we use three databases as sequence counter generators, we should\nadd the following configuration to each database configuration file.\n* First database\n\n```\nauto_increment_offset 1\nauto_increment_increment 3\n```\n\n* Second database\n\n```\nauto_increment_offset 2\nauto_increment_increment 3\n```\n\n* Third database\n\n```\nauto_increment_offset 3\nauto_increment_increment 3\n```\n\nThen, each time we need to generate a sequence counter, we can execute the sql\nstatement below:  \n`replace into sequence(stub) values(\"sequence\")`\n\nOk, I think you get the point. When using `N` databases to generate the sequence\ncounter, the configuration for each database configuration file will look\nlike the pseudo code below:\n\n```\n// pseudo code: the i-th database (1-based) gets offset i and increment N\nfor i := 1; i \u003c= N; i++ {\n    add \"auto_increment_offset i\" to config file\n    add \"auto_increment_increment N\" to config file\n}\n\n```\nSo, the sequence generator can be scaled horizontally.\n\n#### Shard\n----\nAs the number of short urls increases, many records are stored in one table. This\n is not an optimal mysql practice. In this case we can simply shard the table to\n bypass this problem.\n\nFor example, we can shard according to the **base integer** using a **modulo hash\n algorithm**. This gives a good distribution between tables. We can use **100**\n **short** tables with names like\n **short_00/short_01/short_02/.../short_99**. 
We can use the pseudo code below to determine which\n table stores the short url record.\n\n ```\n baseInteger := sequence.NextSequence()\n // %02d zero-pads the remainder so the name matches short_00 .. short_99\n tableName := fmt.Sprintf(\"short_%02d\", baseInteger % 100)\n ```\n\n There are many table sharding algorithms. We can shard tables according to\n id range, user name and so on. If we use user name as the criterion to shard\n tables, we can easily compute aggregates such as how many records a user has\n created. This also has some drawbacks: if user **Lily**\n and user **Lucy** are sharded to different tables and **Lily** shortens about\n **1k** urls while **Lucy** shortens about **1M** urls, then we may encounter the\n unbalanced hash problem, i.e., some tables contain more records than others.\n\nIn conclusion, there are many factors to consider before deciding\nwhich sharding algorithm to use.\n\n#### Statistics\n----\nSometimes we may want to collect statistics about hit number, UA(User \nAgent), original IP and so on. \n\nA recommended way to deploy **shortme** is to run it behind a reverse proxy \nserver such as **Nginx**. This way, the statistics can be obtained\n by analysing the access log of **Nginx**. This can be accomplished with \n `awk` or a more modern log analysis stack such as [`ELK`](https://www.elastic.co/).\n\n### Problems\n----\n* A long url may make the generated qr code unreadable. I have tested this on my \nown phone. This remains to be tested more meticulously.  \n* Customizing the short url can not be done easily currently\n in **shortme** because of the id generation logic. Let's make it happen. :)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fandyxning%2Fshortme","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fandyxning%2Fshortme","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fandyxning%2Fshortme/lists"}