{"id":18374989,"url":"https://github.com/pali/plist","last_synced_at":"2025-04-11T02:48:09.553Z","repository":{"id":16625029,"uuid":"19380076","full_name":"pali/plist","owner":"pali","description":"PList - software for archiving and formatting emails from mailing lists","archived":false,"fork":false,"pushed_at":"2016-10-23T00:16:05.000Z","size":444,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-02-15T22:13:57.223Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Perl","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pali.png","metadata":{"files":{"readme":"README","changelog":null,"contributing":null,"funding":null,"license":"COPYING","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2014-05-02T15:51:02.000Z","updated_at":"2021-10-31T21:56:46.000Z","dependencies_parsed_at":"2022-08-25T18:11:24.506Z","dependency_job_id":null,"html_url":"https://github.com/pali/plist","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pali%2Fplist","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pali%2Fplist/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pali%2Fplist/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pali%2Fplist/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pali","download_url":"https://codeload.github.com/pali/plist/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248332343,"owners_count":21086064,"icon_url":"https://github.
com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-06T00:16:58.734Z","updated_at":"2025-04-11T02:48:09.535Z","avatar_url":"https://github.com/pali.png","language":"Perl","readme":"=== About ===\n\nPList - software for archiving and formatting emails from mailing lists\n\nLicense: GPLv2+\nAuthor: Pali Rohár\nEmail: pali.rohar@gmail.com\n\nPList is written in Perl and it is developed as replacement for Pipermail, Hypermail or MHonArc. It provides terminal application for manipulating with email archives and also provides web based CGI application for browsing emails via internet browser. For each mailing list archive PList needs directory (called index) where it stores emails and other data. Because widely used MIME or MBox formats are not suitable for fast processing PList stores emails in its own internal binary format. Our terminal application, that converts emails to its internal formats, supports reading archives in various MBox formats and also tries to understand lots of incorrect MIME formatted emails. 
A more detailed description of PList and its internals can be found in my bachelor's thesis: Mailing list archives (in Slovak language) at https://is.cuni.cz/webapps/zzp/detail/132573/\n\nFeatures:\n\n * reading archives in mboxo, mboxrd, mboxcl, mboxcl2 variants of MBox format (plus mix of all these)\n * reading emails in RFC2822 and MIME formats\n * incremental imports of MBox archives\n * auto pregenerating HTML pages for emails\n * support for HTML templates\n * support for email attachments\n * auto detection of charset encoding and mime type of badly formatted MIME parts\n * stable implementation without randomness (importing the same emails in any order at any time always results in the same archive)\n * interpreting broken emails and those which violate standards in the best possible way\n * browse emails by years, months or dates\n * flat and tree based view of email list\n * sophisticated (and stable) algorithm for grouping emails into threads and subsequently building email trees for these threads\n   - support for building threads across multiple months and years\n   - using Message-Id, In-Reply-To, References headers and optionally also matching by similar subjects\n   - deals with incomplete threads when some emails from In-Reply-To or References headers are missing\n   - rationally builds tree from email thread (=transitive closure of directed acyclic graph)\n   - deals with possible cycles, inconsistencies or flaws in email threads\n\n=== Parts ===\n\nInternal Perl modules:\n\nPList::Email\nPList::Email::MIME\nPList::Email::Binary\nPList::Email::View\nPList::List\nPList::List::MBox\nPList::List::Binary\nPList::Index\nPList::Template\n\nTerminal applications:\n\nplist.pl\nplist-import-mboxes.pl\n\nWeb based CGI applications:\n\nplist.cgi\n\n=== Installation ===\n\nThis software depends on required Perl modules which must be installed. 
Here is list of these modules:\n\nCGI::Session\nCGI::Simple\nCwd\nDate::Format\nDate::Parse\nDBD::mysql\nDBD::SQLite\nDBI\nDigest::SHA\nEmail::Address\nEmail::Folder::Mbox (\u003e= 0.860)\nEmail::MIME\nEmail::MIME::ContentType\nEncode\nEncode::Detect\nFile::MimeInfo::Magic\nFile::Path\nFindBin\nList::MoreUtils\nHTML::Entities\nHTML::FromText\nHTML::Strip\nHTML::Template\nMIME::Base64\nTime::Local\n\nOptionally if Perl module Number::Bytes::Human is installed PList will report size of all email attachments in human readable format (instead default bytes).\n\nTo make sure that PList terminal and CGI applications will work correctly all internal PList modules must be installed into the same directory as program applications or into global Perl modules directory. Installation process could be invoked by command:\n\n$ make install\n\nThis command copies everything into /usr/share/plist/ and creates launchers for terminal applications in /usr/bin/. With standard Make variables DESTDIR and PREFIX it is possible to change installation directory and system configuration prefix directory.\n\n=== Configuration ===\n\nPList uses HTML templates for generating HTML pages. Default templates are stored in template directory. Different templates can be used by changing global environmental variable PLIST_TEMPLATE_DIR. This variable must contain absolute path to templates directory. If variable does not exist or is empty then the applications are configured to use default templates directory.\n\nFor using web application HTTP web server with CGI scripting support is needed. PList was tested with Apache 2 server. PList CGI script is using PList (index) archives in current working directory, so web server must be correctly configured to access these archives. 
To configure URL links user can use prepared .htaccess file.\n\n=== Usage ===\n\n*** Terminal application plist.pl ***\n\n$ plist.pl \u003cmode\u003e \u003ccommand\u003e \u003cargs...\u003e\n\nModes: index, list, bin\n\n** Commands for index mode **\n\nIndex mode is used for access to PList (index) archives.\n\n| view \u003cdir\u003e |\nShows information about archive in \u003cdir\u003e\n\n| create \u003cdir\u003e [\u003cdriver\u003e] [\u003cparams\u003e] [\u003cusername\u003e] [\u003cpassword\u003e] [\u003ckey\u003e=\u003cvalue\u003e] [...] |\nCreates new empty archive with directory name \u003cdir\u003e. Uses SQL driver \u003cdriver\u003e, parameters \u003cparams\u003e, username \u003cusername\u003e and password \u003cpassword\u003e. If \u003cdriver\u003e is not specified SQLite will be used and database will be stored in \u003cdir\u003e. Additional list of \u003ckey\u003e=\u003cvalue\u003e options are passed to config command (see below).\n\n| config \u003cdir\u003e \u003ckey\u003e \u003cvalue\u003e |\nChanges configuration value for key of archive \u003cdir\u003e. Possible keys are:\n * driver, params, username, password - Database connection parameters\n * description - Description of archive\n * listsize - Average size of list file (default 104857600 = 100MB)\n * nomatchsubject - Do not group emails with similar subject to one thread (default 1 = is on)\n * templatedir - Absolute path for directory with HTML templates (overwrite PLIST_TEMPLATE_DIR)\n * autopregen - Automatically pregenerates HTML pages for emails (default 0 = is off)\n * auth - Comma separated cgi authorization keys (secure, session, httpbasic, script)\n * authscript - Path to script for cgi authorization (only used when \"script\" is specified in auth)\n\n| add-list \u003cdir\u003e [\u003clist\u003e] [silent] |\nAdds emails from file \u003clist\u003e into archive \u003cdir\u003e. File \u003clist\u003e must be in internal binary format (see list mode). 
If file is not specified stdin will be used. When \u003csilent\u003e argument is used no warnings about duplicate emails will be reported.\n\n| add-mbox \u003cdir\u003e [\u003cmbox\u003e] [silent] [unescape] |\nSame as add-list but input file must be in one of these MBox formats: mboxo, mboxrd, mboxcl, mboxcl2. The mboxrd format requires the \u003cunescape\u003e option.\n\n| add-email \u003cdir\u003e [\u003cemail\u003e] |\nAdds one email from file \u003cemail\u003e into archive \u003cdir\u003e. If input file is not specified stdin will be used. Input file must be a text document with RFC2822 email structure. It can contain an optional mailbox-like \"From \" line.\n\n| get-bin \u003cdir\u003e \u003cid\u003e [\u003cbin\u003e] |\nRetrieves email with id \u003cid\u003e from archive \u003cdir\u003e and stores it into the file \u003cbin\u003e in internal binary format (see bin mode). If output file is not specified stdout will be used.\n\n| get-part \u003cdir\u003e \u003cid\u003e \u003cpart\u003e [\u003cfile\u003e] |\nRetrieves specified part \u003cpart\u003e of email with id \u003cid\u003e from archive \u003cdir\u003e and stores it into file \u003cfile\u003e. To see list of parts in email use command info in bin mode. If output file is not specified stdout will be used.\n\n| get-roots \u003cdir\u003e [desc] [date1] [date2] [limit] [offset] |\nPrints tree roots of email threads from archive \u003cdir\u003e. By default output is sorted by date in ascending order. If \u003cdesc\u003e is specified then descending order will be used. Additional arguments \u003cdate1\u003e (start date), \u003cdate2\u003e (end date), offset, limit (relative to offset) can be used to filter output. Dates must be specified as Unix timestamps.\n\n| get-tree \u003cdir\u003e \u003cid\u003e [\u003cfile\u003e] |\nPrints tree (thread) for email specified by id \u003cid\u003e from archive \u003cdir\u003e. 
Optionally stores tree into file \u003cfile\u003e instead of stdout.\n\n| gen-html \u003cdir\u003e \u003cid\u003e [\u003chtml\u003e] |\nGenerates HTML page for email with id \u003cid\u003e from archive \u003cdir\u003e and stores it into the file \u003chtml\u003e. If archive cache contains pregenerated HTML page this cached version will be retrieved. If output file is not specified stdout will be used.\n\n| del \u003cdir\u003e \u003cid\u003e |\nDeletes email with id \u003cid\u003e from archive \u003cdir\u003e.\n\n| setspam \u003cdir\u003e \u003cid\u003e \u003ctrue|false\u003e |\nMarks email with id \u003cid\u003e as spam (value true) or not spam (value false) in archive \u003cdir\u003e.\n\n| pregen \u003cdir\u003e [\u003cid\u003e] |\nPregenerates HTML page for email with id \u003cid\u003e from archive \u003cdir\u003e. Page will be stored in archive cache and next call of command gen-html will return this cached version.\n\n\n** Commands for list mode **\n\nList mode is used for reading and writing email lists in internal binary format.\n\n| list view \u003clist\u003e |\nShows some information (including id, offset and parts) about each email in list file \u003clist\u003e.\n\n| list add-mbox \u003clist\u003e [\u003cmbox\u003e] [\u003cunescape\u003e] |\nAdds all emails from MBox file \u003cmbox\u003e into the list file \u003clist\u003e. If input file is not specified then stdin will be used. The mboxrd format requires the \u003cunescape\u003e option.\n\n| list add-email \u003clist\u003e [\u003cemail\u003e] |\nAdds one email from file \u003cemail\u003e into the list file \u003clist\u003e. If input file is not specified stdin will be used. Input file must be a text document with RFC2822 email structure. It can contain an optional mailbox-like \"From \" line.\n\n| list add-bin \u003clist\u003e [\u003cbin\u003e] |\nAdds one email from file \u003cbin\u003e in internal binary format (see mode bin) into list file \u003clist\u003e. 
If input file is not specified stdin will be used.\n\n| list get-bin \u003clist\u003e \u003coffset\u003e [\u003cbin\u003e] |\nRetrieves email at offset \u003coffset\u003e from list \u003clist\u003e and stores it into the file \u003cbin\u003e in internal binary format (see bin mode). If output file is not specified stdout will be used.\n\n| list get-part \u003clist\u003e \u003coffset\u003e \u003cpart\u003e [\u003cfile\u003e] |\nRetrieves email part \u003cpart\u003e from email at offset \u003coffset\u003e in list \u003clist\u003e and stores it into the file \u003cfile\u003e. If output file is not specified stdout will be used.\n\n| list gen-html \u003clist\u003e \u003coffset\u003e [\u003chtml\u003e] |\nGenerates HTML page for email at offset \u003coffset\u003e in list \u003clist\u003e and stores it into the file \u003chtml\u003e. If output file is not specified stdout will be used.\n\n\n** Commands for bin mode **\n\nBin mode is used for reading and generating emails in internal binary format.\n\n| bin view [\u003cbin\u003e] |\nShows email (including parts) from file \u003cbin\u003e which is in internal binary format. If input file is not specified stdin will be used.\n\n| bin from-email [\u003cemail\u003e] [\u003cbin\u003e] |\nConverts email from file \u003cemail\u003e into binary file \u003cbin\u003e. Input file must be a text document with RFC2822 structure. It can contain an optional mailbox-like \"From \" line. If output file is not specified stdout will be used. If input file is not specified stdin will be used.\n\n| bin get-part \u003cpart\u003e [\u003cbin\u003e] [\u003cfile\u003e] |\nRetrieves email part \u003cpart\u003e from email \u003cbin\u003e and stores it into the file \u003cfile\u003e. If output file is not specified stdout will be used.\n\n| bin gen-html [\u003cbin\u003e] [\u003chtml\u003e] |\nGenerates HTML page for email \u003cbin\u003e and stores it into the file \u003chtml\u003e. 
If output file is not specified stdout will be used.\n\n\n*** Terminal application plist-import-mboxes.pl ***\n\n$ plist-import-mboxes.pl \u003cdir\u003e \u003cmbox1\u003e [\u003cmbox2\u003e ...] [silent] [unescape]\n\nThis application adds all emails from all specified MBox files \u003cmbox1\u003e, \u003cmbox2\u003e, ... into archive \u003cdir\u003e. It skips all MBox files which were already processed and whose modification dates have not changed since the last run. If last argument is \"silent\" then no warnings about duplicate emails will be reported.\n\n=== Examples ===\n\n** Index mode **\n\nCreate new empty archive with name lkml and use SQLite:\n$ plist.pl index create lkml\n\nCreate new empty archive with name test and use MySQL (db name: testdb, server: localhost, username: user, password: password):\n$ plist.pl index create test mysql testdb:localhost user password\n\nSet description of archive lkml to Linux Kernel Mailing List:\n$ plist.pl index config lkml description \"Linux Kernel Mailing List\"\n\nEnable auto pregenerating of HTML pages for all new emails which will be added to archive test:\n$ plist.pl index config test autopregen 1\n\nAdd one email from stdin to archive lkml:\n$ plist.pl index add-email lkml\n\nAdd one email from file email.rfc822 to archive test:\n$ plist.pl index add-email test email.rfc822\n\nAdd all emails from MBox file archive.mbox to archive lkml (in silent mode - without warnings about duplicate emails):\n$ plist.pl index add-mbox lkml archive.mbox silent\n\nDelete email with id 201406241206@example.org from archive test47:\n$ plist.pl index del test47 201406241206@example.org\n\nRetrieve email with id id4247@test from archive arch and store it into the file file.bin (in internal binary format):\n$ plist.pl index get-bin arch id4247@test file.bin\n\nRetrieve email part 0/0/1 from email with id id4742@test from archive arch and store into the file file.pdf:\n$ plist.pl index get-part arch id4742@test 0/0/1 file.pdf\n\nGenerate HTML 
page from email with id 201406241205@example.org from archive test and store it into the file email.html:\n$ plist.pl index gen-html test 201406241205@example.org email.html\n\nMark email with id spam@example.org in archive arch as spam:\n$ plist.pl index setspam arch spam@example.org true\n\n\n** List mode **\n\nConvert MBox file file.mbox (with all emails) to file file.list (in internal binary list format):\n$ plist.pl list add-mbox file.list file.mbox\n\nRetrieve email which starts at offset 1024 in binary list file file.list and store it into the file file.bin (in internal binary format):\n$ plist.pl list get-bin file.list 1024 file.bin\n\n\n** Bin mode **\n\nGenerate HTML page from MIME email which is on stdin and write it to stdout:\n$ plist.pl bin from-email | plist.pl bin gen-html\n\n\n** Import more MBox files **\n\nAdd all emails from MBox files /201401.mbox and /201402.mbox to archive lkml (in silent mode - without warnings about duplicate emails):\n$ plist-import-mboxes.pl lkml /201401.mbox /201402.mbox silent\n\nAdd all emails from MBox files with extension .mbox which are in directory tree /lkml/ to archive lkml (silent mode):\n$ plist-import-mboxes.pl lkml $(find /lkml/ -name '*.mbox') silent\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpali%2Fplist","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpali%2Fplist","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpali%2Fplist/lists"}