https://github.com/julianmendez/wikihtml
Application in Java that converts wikitext documents into HTML documents.
converter java mediawiki wiki-format wiki-markup wikitext
- Host: GitHub
- URL: https://github.com/julianmendez/wikihtml
- Owner: julianmendez
- Created: 2015-07-10T09:38:14.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2023-12-19T22:19:06.000Z (about 1 year ago)
- Last Synced: 2024-11-16T17:42:39.218Z (2 months ago)
- Topics: converter, java, mediawiki, wiki-format, wiki-markup, wikitext
- Language: Java
- Homepage: https://julianmendez.github.io/wikihtml/
- Size: 173 KB
- Stars: 3
- Watchers: 4
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: docs/README.md
# [WikiHTML](https://julianmendez.github.io/wikihtml/)
[![build](https://github.com/julianmendez/wikihtml/workflows/Java%20CI/badge.svg)](https://github.com/julianmendez/wikihtml/actions)
[![maven central](https://maven-badges.herokuapp.com/maven-central/de.tu-dresden.inf.lat.wikihtml/wikihtml/badge.svg)](https://search.maven.org/#search|ga|1|g%3A%22de.tu-dresden.inf.lat.wikihtml%22)
[![license](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://www.apache.org/licenses/LICENSE-2.0.txt)
[![license](https://img.shields.io/badge/license-LGPL%203.0-blue.svg)](https://www.gnu.org/licenses/lgpl-3.0.txt)

**WikiHTML** is a Java library and executable standalone application that converts a document in wiki text format to an HTML document.
## Download
* [executable JAR file](https://sourceforge.net/projects/latitude/files/wikihtml/0.1.0/wikihtml-0.1.0.jar/download)
* [The Central Repository](https://repo1.maven.org/maven2/de/tu-dresden/inf/lat/wikihtml/)
* as dependency:

```xml
<dependency>
  <groupId>de.tu-dresden.inf.lat.wikihtml</groupId>
  <artifactId>wikihtml</artifactId>
  <version>0.1.0</version>
</dependency>
```
## Use
It can be used as a Java library or from the command line. For example, use:
```
java -jar wikihtml-0.1.0.jar inputfile.text outputfile.html
```

to create a new HTML file from the command line, and use

```
java -jar wikihtml-0.1.0.jar inputoutputfile.html
```

to just update an HTML file with embedded wiki text.
## Description
Wiki markup, also called wikitext or wikicode, is a markup language for wiki-based pages. It is a simplified, human-friendly substitute for HTML. This library reads text written in this markup language and produces an HTML document. There are several "dialects" of wiki markup. This library implements a subset of the language used by the [MediaWiki](https://www.mediawiki.org/wiki/MediaWiki) software.
The application generates the HTML document with the original wiki markup source code inside. Technically, the source code will be between: ``. This makes it possible to update an HTML file from the wiki source stored in that same file.
This could be useful, for example, when maintaining documentation of a project. The files can be easily edited using a text editor, but after processing them with this library, they can be viewed with a browser.
When using only the wiki formatting, the produced document is an [XHTML 1.1](https://www.w3.org/TR/xhtml11/) document.
#### Sections
Sections are marked at the beginning of a line. The heading should be between a sequence of equals signs (=). Using more equals signs makes the heading smaller. For example:
| wiki markup                   | HTML                    |
|:------------------------------|:------------------------|
| `= heading 1 =`               | `<h1>heading 1</h1>`    |
| `== heading 2 ==`             | `<h2>heading 2</h2>`    |
| `=== heading 3 ===`           | `<h3>heading 3</h3>`    |
| `==== heading 4 ====`         | `<h4>heading 4</h4>`    |
| `===== heading 5 =====`       | `<h5>heading 5</h5>`    |
| `====== heading 6 ======`     | `<h6>heading 6</h6>`    |

#### Line breaks
A line break is marked with a blank line, that is, two consecutive newline characters. For example,

```
Two lines
together
are not considered different lines.
```

is rendered

```
Two lines together are not considered different lines.
```

but:

```
One line.

Another line.
```

is rendered

```
One line.
Another line.
```

#### Indented text
Text can be indented using colons (:) at the beginning of the line. For example:
```
: item 1
: item 2
:: item 2.1
:: item 2.2
::: item 2.2.1
: item 3
```

produces:

```
    item 1
    item 2
        item 2.1
        item 2.2
            item 2.2.1
    item 3
```

#### Unordered lists
Items in a list are marked with asterisks (*) at the beginning of the line. A subitem is marked with more asterisks. For example:
```
* item 1
* item 2
** item 2.1
** item 2.2
*** item 2.2.1
* item 3
```

is rendered as

* item 1
* item 2
  * item 2.1
  * item 2.2
    * item 2.2.1
* item 3

#### Ordered lists
Numbered items are marked with hash signs (#) at the beginning of the line. A subitem is marked with more hash signs. For example:
```
# item 1
# item 2
## item 2.1
## item 2.2
### item 2.2.1
# item 3
```

is rendered as

1. item 1
2. item 2
   1. item 2.1
   2. item 2.2
      1. item 2.2.1
3. item 3

#### Text formatting
The text can be formatted using apostrophes (') according to the following table:
| wiki markup | HTML |
|:------------------------------|:------------------------|
| `''italics''` | *italics* |
| `'''bold'''` | **bold** |
| `'''''bold italics'''''`      | ***bold italics***      |

#### Links
Links can be marked with square brackets ([ ]). For example:
`[https://www.wikipedia.org Wikipedia]` renders as [Wikipedia](https://www.wikipedia.org).

If the brackets are omitted, the URI is shown directly. For example: `https://www.wikipedia.org` renders as https://www.wikipedia.org.

Double square brackets ([[ ]]) are rendered as local links.
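
The bracketed link form can be illustrated with a small regular expression. The following is only a minimal sketch in Java, not the code used by WikiHTML: the class `ExternalLinkSketch` and its `render` method are hypothetical names, and the sketch assumes that only `[URI label]` links with an `http` or `https` URI need to be handled, without any HTML escaping.

```java
import java.util.regex.Pattern;

/** Hypothetical helper, not part of WikiHTML: renders "[URI label]" links. */
final class ExternalLinkSketch {

    // "[" + URI (no whitespace, no "]") + whitespace + label (no "]") + "]"
    private static final Pattern BRACKETED_LINK =
            Pattern.compile("\\[(https?://[^\\s\\]]+)\\s+([^\\]]+)\\]");

    /** Replaces every bracketed external link in the line by an HTML anchor. */
    static String render(String line) {
        // $1 is the URI and $2 is the label captured by the pattern above.
        return BRACKETED_LINK.matcher(line).replaceAll("<a href=\"$1\">$2</a>");
    }

    public static void main(String[] args) {
        // Prints: See <a href="https://www.wikipedia.org">Wikipedia</a>.
        System.out.println(render("See [https://www.wikipedia.org Wikipedia]."));
    }
}
```

Text that does not match the pattern, such as a local link in double square brackets without a URI, is returned unchanged by this sketch.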
#### Tables
This wiki text:
```markdown
{| border="1"
| 4 || 9 || 2
|-
| 3 || 5 || 7
|-
| 8 || 1 || 6
|}
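```

produces the following table:

<table border="1">
<tr><td>4</td><td>9</td><td>2</td></tr>
<tr><td>3</td><td>5</td><td>7</td></tr>
<tr><td>8</td><td>1</td><td>6</td></tr>
</table>

(without the white and gray alternation of lines)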
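
As a rough illustration of how this table syntax maps to HTML, here is a minimal sketch in Java. It is not the implementation used by WikiHTML: `WikiTableSketch` and `render` are hypothetical names, and the sketch assumes only the four constructs used above (`{|` with optional attributes, `|-`, `|}`, and cells separated by `||`).

```java
import java.util.List;

/** Hypothetical helper, not part of WikiHTML: converts a simple wiki table. */
final class WikiTableSketch {

    static String render(List<String> lines) {
        StringBuilder html = new StringBuilder();
        boolean rowOpen = false;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.startsWith("{|")) {
                // Table start; anything after "{|" is copied as attributes.
                String attributes = line.substring(2).trim();
                html.append(attributes.isEmpty() ? "<table>" : "<table " + attributes + ">");
                html.append('\n');
            } else if (line.startsWith("|}") || line.startsWith("|-")) {
                // Table end or row separator: close the current row, if any.
                if (rowOpen) {
                    html.append("</tr>\n");
                    rowOpen = false;
                }
                if (line.startsWith("|}")) {
                    html.append("</table>\n");
                }
            } else if (line.startsWith("|")) {
                // Row content: cells are separated by "||".
                if (!rowOpen) {
                    html.append("<tr>");
                    rowOpen = true;
                }
                for (String cell : line.substring(1).split("\\|\\|")) {
                    html.append("<td>").append(cell.trim()).append("</td>");
                }
            }
        }
        return html.toString();
    }

    public static void main(String[] args) {
        System.out.print(render(List.of(
                "{| border=\"1\"", "| 4 || 9 || 2", "|-",
                "| 3 || 5 || 7", "|-", "| 8 || 1 || 6", "|}")));
    }
}
```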
The following wiki text is not implemented in MediaWiki, but it also produces the same table:
* using semicolon:
```markdown
{||; border="1"
4;9;2
3;5;7
8;1;6
||}
```
* using comma:
```markdown
{||, border="1"
4,9,2
3,5,7
8,1,6
||}
```
* using tabs:
```markdown
{|| border="1"
4 9 2
3 5 7
8 1 6
||}
```

#### nowiki
The pair of tags `<nowiki>`...`</nowiki>` is used to mark text that must not be interpreted as wiki formatting. For example:
`<nowiki>'''non-bold'''</nowiki>` is not in bold.

#### Variables
The following MediaWiki variables are implemented:
| name | example | meaning |
|:----------------------------------|:-----------------|:-----------------------------------------------------------------------|
|{{CURRENTDAY}} | `1` | Displays the current day in numeric form. |
|{{CURRENTDAY2}} | `01` | Same as {{CURRENTDAY}}, but with leading zero (01 .. 31). |
|{{CURRENTDAYNAME}} | `Friday` | Name of the day in the language of the project or English. |
|{{CURRENTDOW}} | `5` | Same as {{CURRENTDAYNAME}}, but as a number (0=Sunday, 1=Monday...). |
|{{CURRENTMONTH}} | `01` | The number 01 .. 12 of the month. |
|{{CURRENTMONTHABBREV}}| `Jan` | Same as {{CURRENTMONTH}}, but in abbreviated form as Jan .. Dec. |
|{{CURRENTMONTHNAME}} |`January` | Same as {{CURRENTMONTH}}, but in named form January .. December. |
|{{CURRENTTIME}} | `16:03` | The current time (00:00 .. 23:59). |
|{{CURRENTHOUR}} | `16` | The current hour (00 .. 23). |
|{{CURRENTWEEK}} | `1` | Number of the current week (1-53) according to ISO 8601 with no leading zero.|
|{{CURRENTYEAR}} | `2016` | Returns the current year. |
|{{CURRENTTIMESTAMP}}  | `20160101160345` | ISO 8601 time stamp                                                    |

In addition, the {{LOCAL...}} variables are also implemented: {{LOCALDAY}}, {{LOCALDAY2}}, ... , {{LOCALTIMESTAMP}}. For example, in UTC+1 {{CURRENTTIMESTAMP}} returns `20160101160345`, while {{LOCALTIMESTAMP}} returns `20160101170345`.
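
The difference between the two families of variables can be sketched with `java.time`. This is only an illustration, not the code of the library: `DateVariableSketch` and its methods are hypothetical names, and the sketch assumes that the `CURRENT...` values are computed in UTC and the `LOCAL...` values in a given time zone, which is what the example above suggests.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper, not part of WikiHTML: computes timestamp-like values. */
final class DateVariableSketch {

    private static final DateTimeFormatter TIMESTAMP =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

    /** Value in the style of {{CURRENTTIMESTAMP}}: the instant seen in UTC. */
    static String currentTimestamp(Instant now) {
        return TIMESTAMP.format(now.atZone(ZoneOffset.UTC));
    }

    /** Value in the style of {{LOCALTIMESTAMP}}: the instant seen in the given zone. */
    static String localTimestamp(Instant now, ZoneId zone) {
        return TIMESTAMP.format(now.atZone(zone));
    }

    /** Value in the style of {{CURRENTDOW}}: 0 = Sunday, 1 = Monday, ..., 6 = Saturday. */
    static int currentDayOfWeek(Instant now) {
        // java.time numbers the days 1 (Monday) to 7 (Sunday), so Sunday wraps to 0.
        return now.atZone(ZoneOffset.UTC).getDayOfWeek().getValue() % 7;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2016-01-01T16:03:45Z");
        System.out.println(currentTimestamp(now));                      // 20160101160345
        System.out.println(localTimestamp(now, ZoneOffset.ofHours(1))); // 20160101170345
        System.out.println(currentDayOfWeek(now));                      // 5 (a Friday)
    }
}
```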
#### HTML
HTML code can also be inserted directly. For example:
`<b>bold</b>` is the same as `'''bold'''`, and `&lambda;` is rendered λ.

## Example
The file [mupuzzle.text](https://github.com/julianmendez/wikihtml/blob/master/wikihtml/src/test/resources/mupuzzle.text) has the following wiki text:
```
== MIU system ==
(see [https://en.wikipedia.org/wiki/MU_puzzle MU puzzle])

# ''x''I → ''x''IU
# M''x'' → M''xx''
# ''x''III''y'' → ''x''U''y''
# ''x''UU''y'' → ''xy''
```

and is translated to the following HTML document:

```html
```
The file [example.text](https://github.com/julianmendez/wikihtml/blob/master/wikihtml/src/test/resources/example.text) has more examples.
## Source code
To check out and compile the project, use:

```
$ git clone https://github.com/julianmendez/wikihtml.git
$ cd wikihtml
$ mvn clean install
```

The created executable library, its sources, and its Javadoc will be in `wikihtml/target`.

To compile the project offline, first download the dependencies:

```
$ mvn dependency:go-offline
```

and once offline, use:

```
$ mvn --offline clean install
```

The bundles uploaded to [Sonatype](https://oss.sonatype.org/) are created with:

```
$ mvn clean install -DperformRelease=true
```

and then:

```
$ cd wikihtml/target
$ jar -cf bundle.jar wikihtml-*
```

The version number is updated with:

```
$ mvn versions:set -DnewVersion=NEW_VERSION
```

where *NEW_VERSION* is the new version.
## Architecture
The library reads a wiki text and creates a `WikiDocument`.
It extracts the wiki text from the given input and processes it line by line. Each line is transformed into a `ConversionToken`. Each token is processed by a pipeline of objects, each of which is a `Renderer`. Each renderer (`-Renderer`) processes each conversion token and produces a list of conversion tokens, which are the input for the next renderer, if any. Some renderers are parameterized and grouped (`-GroupRenderer`). Some renderers process whole lines (in package `...line`) and some renderers process pieces of lines (in package `...part`).
For example, all variables are processed by `...part.DateVariableRenderer`, but the headings are processed by a group of renderers (`...line.HeadingGroupRenderer`) composed of 6 renderers (h1, h2, ..., h6), where each one is a `...line.HeadingRenderer`.
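
The pipeline idea can be sketched as follows. This is a simplified illustration and not the actual classes of the library: all type names below are hypothetical stand-ins, tokens are plain strings instead of `ConversionToken` objects, and the heading matching is an assumption about how such a renderer could work. The sketch requires Java 11 or later.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-ins for the renderer pipeline; not the WikiHTML classes. */
final class RendererPipelineSketch {

    /** A renderer maps one token to a list of tokens. */
    interface TokenRenderer {
        List<String> render(String token);
    }

    /** Parameterized renderer: handles headings of exactly one level. */
    static final class HeadingSketch implements TokenRenderer {
        private final int level;

        HeadingSketch(int level) {
            this.level = level;
        }

        @Override
        public List<String> render(String token) {
            String line = token.trim();
            String marker = "=".repeat(level);
            boolean matches = line.length() > 2 * level
                    && line.startsWith(marker) && line.endsWith(marker)
                    // Reject lines that have more equals signs than this level.
                    && line.charAt(level) != '='
                    && line.charAt(line.length() - level - 1) != '=';
            if (!matches) {
                return List.of(token); // pass the token on unchanged
            }
            String text = line.substring(level, line.length() - level).trim();
            return List.of("<h" + level + ">" + text + "</h" + level + ">");
        }
    }

    /** Group renderer: feeds the output of each renderer into the next one. */
    static final class GroupSketch implements TokenRenderer {
        private final List<TokenRenderer> renderers;

        GroupSketch(List<TokenRenderer> renderers) {
            this.renderers = renderers;
        }

        @Override
        public List<String> render(String token) {
            List<String> tokens = List.of(token);
            for (TokenRenderer renderer : renderers) {
                List<String> next = new ArrayList<>();
                for (String current : tokens) {
                    next.addAll(renderer.render(current));
                }
                tokens = next;
            }
            return tokens;
        }
    }

    /** Builds a group of six heading renderers (h1, h2, ..., h6). */
    static TokenRenderer headings() {
        List<TokenRenderer> group = new ArrayList<>();
        for (int level = 1; level <= 6; level++) {
            group.add(new HeadingSketch(level));
        }
        return new GroupSketch(group);
    }

    public static void main(String[] args) {
        System.out.println(headings().render("== heading 2 ==")); // [<h2>heading 2</h2>]
    }
}
```

Because every renderer returns a list, a renderer that splits a line into several pieces fits the same interface as one that rewrites a whole line.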
## Author
[Julian Mendez](https://julianmendez.github.io)
## Licenses
[Apache License Version 2.0](https://www.apache.org/licenses/LICENSE-2.0.txt), [GNU Lesser General Public License version 3](https://www.gnu.org/licenses/lgpl-3.0.txt)
## Release notes
See [release notes](https://julianmendez.github.io/wikihtml/RELEASE-NOTES.html).
## Contact
In case you need more information, please contact @julianmendez.