# [WikiHTML](https://julianmendez.github.io/wikihtml/)

[![build](https://github.com/julianmendez/wikihtml/workflows/Java%20CI/badge.svg)](https://github.com/julianmendez/wikihtml/actions)
[![maven central](https://maven-badges.herokuapp.com/maven-central/de.tu-dresden.inf.lat.wikihtml/wikihtml/badge.svg)](https://search.maven.org/#search|ga|1|g%3A%22de.tu-dresden.inf.lat.wikihtml%22)
[![license](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://www.apache.org/licenses/LICENSE-2.0.txt)
[![license](https://img.shields.io/badge/license-LGPL%203.0-blue.svg)](https://www.gnu.org/licenses/lgpl-3.0.txt)

**WikiHTML** is a Java library and executable standalone application that converts a document in wiki text format to an HTML document.

## Download

* [executable JAR file](https://sourceforge.net/projects/latitude/files/wikihtml/0.1.0/wikihtml-0.1.0.jar/download)
* [The Central Repository](https://repo1.maven.org/maven2/de/tu-dresden/inf/lat/wikihtml/)
* as dependency:

```xml
<dependency>
  <groupId>de.tu-dresden.inf.lat.wikihtml</groupId>
  <artifactId>wikihtml</artifactId>
  <version>0.1.0</version>
</dependency>
```

## Use

It can be used as a Java library or from the command line. For example, use:

```
java -jar wikihtml-0.1.0.jar inputfile.text outputfile.html
```

to create a new HTML file from the command line, and use

```
java -jar wikihtml-0.1.0.jar inputoutputfile.html
```

to just update an HTML file with embedded wiki text.

## Description

Wiki markup, also called wikitext or wikicode, is a markup language for wiki-based pages. It is a simplified, human-friendly substitute for HTML. This library reads text written in this markup language and produces an HTML document. There are several "dialects" of wiki markup. This library implements a subset of the language used by the [MediaWiki](https://www.mediawiki.org/wiki/MediaWiki) software.

The application generates the HTML document with the original wiki markup source code inside. Technically, the source code will be between: ``. This makes it possible to update an HTML file using the source in the same file.

This could be useful, for example, when maintaining documentation of a project. The files can be easily edited using a text editor, but after processing them with this library, they can be viewed with a browser.

When using only the wiki formatting, the produced document is an [XHTML 1.1](https://www.w3.org/TR/xhtml11/) document.

#### Sections

Sections are marked at the beginning of a line. The heading should be between a sequence of equals signs (=). Using more equals signs makes the heading smaller. For example:

| wiki markup | HTML |
|:------------------------------|:------------------------|
| `= heading 1 =` | `<h1>heading 1</h1>` |
| `== heading 2 ==` | `<h2>heading 2</h2>` |
| `=== heading 3 ===` | `<h3>heading 3</h3>` |
| `==== heading 4 ====` | `<h4>heading 4</h4>` |
| `===== heading 5 =====` | `<h5>heading 5</h5>` |
| `====== heading 6 ======` | `<h6>heading 6</h6>` |
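The heading rule can be sketched in a few lines of Java. This is only an illustration and not the library's code: the class `HeadingSketch` and the method `toHtml` are hypothetical names, and the sketch assumes that a heading line starts and ends with the same run of one to six equals signs.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HeadingSketch {
    // The same run of 1..6 '=' at both ends of the line; group 2 is the heading text.
    private static final Pattern HEADING = Pattern.compile("^(={1,6})\\s*(.+?)\\s*\\1$");

    /** Returns an <hN> element for a heading line, or the line unchanged otherwise. */
    static String toHtml(String line) {
        Matcher m = HEADING.matcher(line.trim());
        if (!m.matches()) {
            return line; // not a heading (assumption: leave other lines untouched)
        }
        int level = m.group(1).length(); // more equals signs means a smaller heading
        return "<h" + level + ">" + m.group(2) + "</h" + level + ">";
    }

    public static void main(String[] args) {
        System.out.println(toHtml("== heading 2 ==")); // prints <h2>heading 2</h2>
    }
}
```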

#### Line breaks

A line break is marked with a blank line, that is, two consecutive new lines. A single new line does not start a new line in the output. For example,

```
Two lines
together
are not considered different lines.
```

is rendered

```
Two lines together are not considered different lines.
```

but:

```
One line.

Another line.
```

is rendered

```
One line.
Another line.
```
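The behavior shown in these two examples can be approximated as follows. This is a minimal sketch under the assumption that consecutive non-blank lines are joined with a single space; `ParagraphSketch` and `joinLines` are hypothetical names and do not come from the library.

```java
import java.util.ArrayList;
import java.util.List;

final class ParagraphSketch {
    /** Joins consecutive non-blank lines with a space; a blank line starts a new output line. */
    static List<String> joinLines(String wikiText) {
        List<String> result = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String line : wikiText.split("\n", -1)) {
            if (line.trim().isEmpty()) {
                // A blank line closes the output line collected so far.
                if (current.length() > 0) {
                    result.add(current.toString());
                    current.setLength(0);
                }
            } else {
                if (current.length() > 0) {
                    current.append(' ');
                }
                current.append(line.trim());
            }
        }
        if (current.length() > 0) {
            result.add(current.toString());
        }
        return result;
    }
}
```

With this sketch, `joinLines("One line.\n\nAnother line.")` yields two entries, while the three-line input of the first example yields one.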

#### Indented text

Text can be indented using colons (:) at the beginning of the line. For example:

```
: item 1
: item 2
:: item 2.1
:: item 2.2
::: item 2.2.1
: item 3
```

produces:

```
    item 1
    item 2
        item 2.1
        item 2.2
            item 2.2.1
    item 3
```

#### Unordered lists

Items in a list are marked with asterisks (*) at the beginning of the line. A subitem is marked with more asterisks. For example:

```
* item 1
* item 2
** item 2.1
** item 2.2
*** item 2.2.1
* item 3
```

is rendered as

* item 1
* item 2
  * item 2.1
  * item 2.2
    * item 2.2.1
* item 3
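One way to turn the asterisk prefixes into nested HTML lists is sketched below. It is not taken from the library: `ListSketch` and `toHtml` are hypothetical names, and the sketch assumes that every input line starts with at least one asterisk and that nesting never increases by more than one level at a time.

```java
final class ListSketch {
    /** Converts lines such as "** item" into nested <ul>/<li> elements. */
    static String toHtml(String[] lines) {
        StringBuilder sb = new StringBuilder();
        int depth = 0; // number of currently open <ul> elements
        for (String line : lines) {
            int level = 0;
            while (level < line.length() && line.charAt(level) == '*') {
                level++;
            }
            if (level > depth) {
                // Going deeper: open a nested list inside the still-open <li>.
                while (depth < level) {
                    sb.append("<ul>");
                    depth++;
                }
            } else {
                // Same level or shallower: close nested lists, then the previous item.
                while (depth > level) {
                    sb.append("</li></ul>");
                    depth--;
                }
                sb.append("</li>");
            }
            sb.append("<li>").append(line.substring(level).trim());
        }
        while (depth > 0) {
            sb.append("</li></ul>");
            depth--;
        }
        return sb.toString();
    }
}
```

The same approach would apply to ordered lists by counting hash signs and emitting `<ol>` instead of `<ul>`.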

#### Ordered lists

Numbered items are marked with hash signs (#) at the beginning of the line. A subitem is marked with more hash signs. For example:

```
# item 1
# item 2
## item 2.1
## item 2.2
### item 2.2.1
# item 3
```

is rendered as

1. item 1
2. item 2
   1. item 2.1
   2. item 2.2
      1. item 2.2.1
3. item 3

#### Text formatting

The text can be formatted using apostrophes (') according to the following table:

| wiki markup | HTML |
|:------------------------------|:------------------------|
| `''italics''` | *italics* |
| `'''bold'''` | **bold** |
| `'''''bold italics'''''` | ***bold italics*** |
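The table can be read as three substitution rules applied from the longest marker to the shortest. The following sketch illustrates that ordering. `FormattingSketch` is a hypothetical name, and the choice of `<b>` and `<i>` as output tags is an assumption, since the exact tags emitted by the library are not stated here.

```java
final class FormattingSketch {
    /** Replaces apostrophe markers with HTML tags, longest marker first. */
    static String toHtml(String text) {
        return text
            // Five apostrophes: bold italics. Must run before the shorter markers.
            .replaceAll("'''''(.+?)'''''", "<b><i>$1</i></b>")
            // Three apostrophes: bold.
            .replaceAll("'''(.+?)'''", "<b>$1</b>")
            // Two apostrophes: italics.
            .replaceAll("''(.+?)''", "<i>$1</i>");
    }

    public static void main(String[] args) {
        System.out.println(toHtml("'''bold''' and ''italics''"));
        // prints <b>bold</b> and <i>italics</i>
    }
}
```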

#### Links

Links can be marked with square brackets ([ ]). For example:
`[https://www.wikipedia.org Wikipedia]` renders as [Wikipedia](https://www.wikipedia.org).
If the brackets are omitted, the URI is shown directly. For example: `https://www.wikipedia.org` renders as https://www.wikipedia.org.

The double square brackets ([[ ]]) are rendered as local links.
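The single-bracket form can be illustrated with a regular expression. This sketch covers only that form: bare URIs and double-bracket local links are deliberately left out. `LinkSketch` and `toHtml` are hypothetical names, not part of the library.

```java
final class LinkSketch {
    /** Converts "[URI label]" into an HTML anchor; other text is left unchanged. */
    static String toHtml(String text) {
        // Group 1: the URI (no spaces or closing bracket); group 2: the label.
        return text.replaceAll(
            "\\[(https?://[^\\s\\]]+)\\s+([^\\]]+)\\]",
            "<a href=\"$1\">$2</a>");
    }

    public static void main(String[] args) {
        System.out.println(toHtml("[https://www.wikipedia.org Wikipedia]"));
        // prints <a href="https://www.wikipedia.org">Wikipedia</a>
    }
}
```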

#### Tables

This wiki text:

```markdown
{| border="1"
| 4 || 9 || 2
|-
| 3 || 5 || 7
|-
| 8 || 1 || 6
|}
```

produces the following table:

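<table border="1">
<tr><td>4</td><td>9</td><td>2</td></tr>
<tr><td>3</td><td>5</td><td>7</td></tr>
<tr><td>8</td><td>1</td><td>6</td></tr>
</table>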

(without the white and gray alternation of lines)

The following wiki text is an extension that MediaWiki does not implement, but it produces the same table:

* using semicolon:

```markdown
{||; border="1"
4;9;2
3;5;7
8;1;6
||}
```

* using comma:

```markdown
{||, border="1"
4,9,2
3,5,7
8,1,6
||}
```

* using tabs:

```markdown
{|| border="1"
4	9	2
3	5	7
8	1	6
||}
```
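The three delimiter-based variants differ only in the character that separates the cells. The sketch below shows how rows split on a given delimiter could be turned into the same HTML table. `TableSketch` and `toHtml` are hypothetical names, and the handling of the `{||` and `||}` markers is left out.

```java
import java.util.regex.Pattern;

final class TableSketch {
    /** Builds an HTML table from rows whose cells are separated by the given delimiter. */
    static String toHtml(String[] rows, String delimiter) {
        StringBuilder sb = new StringBuilder("<table border=\"1\">\n");
        for (String row : rows) {
            sb.append("<tr>");
            // Pattern.quote treats the delimiter literally, not as a regular expression.
            for (String cell : row.split(Pattern.quote(delimiter), -1)) {
                sb.append("<td>").append(cell.trim()).append("</td>");
            }
            sb.append("</tr>\n");
        }
        return sb.append("</table>").toString();
    }
}
```

Calling `toHtml` with the rows `4;9;2`, `3;5;7`, `8;1;6` and the delimiter `";"` gives the same result as the comma-separated rows with `","` or the tab-separated rows with `"\t"`.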

#### nowiki

The pair of tags `<nowiki>`...`</nowiki>` is used to mark text that should not be processed as wiki formatting. For example:
`<nowiki>'''non-bold'''</nowiki>` is not in bold.

#### Variables

The following MediaWiki variables are implemented:

| name | example | meaning |
|:----------------------------------|:-----------------|:-----------------------------------------------------------------------|
|{{CURRENTDAY}} | `1` | Displays the current day in numeric form. |
|{{CURRENTDAY2}} | `01` | Same as {{CURRENTDAY}}, but with leading zero (01 .. 31). |
|{{CURRENTDAYNAME}} | `Friday` | Name of the day in the language of the project or English. |
|{{CURRENTDOW}} | `5` | Same as {{CURRENTDAYNAME}}, but as a number (0=Sunday, 1=Monday...). |
|{{CURRENTMONTH}} | `01` | The number 01 .. 12 of the month. |
|{{CURRENTMONTHABBREV}}| `Jan` | Same as {{CURRENTMONTH}}, but in abbreviated form as Jan .. Dec. |
|{{CURRENTMONTHNAME}} |`January` | Same as {{CURRENTMONTH}}, but in named form January .. December. |
|{{CURRENTTIME}} | `16:03` | The current time (00:00 .. 23:59). |
|{{CURRENTHOUR}} | `16` | The current hour (00 .. 23). |
|{{CURRENTWEEK}} | `1` | Number of the current week (1-53) according to ISO 8601 with no leading zero.|
|{{CURRENTYEAR}} | `2016` | Returns the current year. |
|{{CURRENTTIMESTAMP}} | `20160101160345` | ISO 8601 time stamp |

In addition, the {{LOCAL...}} variables are also implemented: {{LOCALDAY}}, {{LOCALDAY2}}, ... , {{LOCALTIMESTAMP}}. For example, in UTC+1, {{CURRENTTIMESTAMP}} returns `20160101160345`, while {{LOCALTIMESTAMP}} returns `20160101170345`.
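The difference between the CURRENT and LOCAL variables can be reproduced with `java.time`. This sketch only illustrates the time zone arithmetic of the example above and makes no claim about how the library computes the values. `TimestampSketch` and `format` are hypothetical names.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

final class TimestampSketch {
    // Same layout as the time stamp example: year, month, day, hour, minute, second.
    private static final DateTimeFormatter STAMP = DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

    /** Formats the instant as seen in the given time zone. */
    static String format(Instant instant, ZoneId zone) {
        return STAMP.format(instant.atZone(zone));
    }

    public static void main(String[] args) {
        Instant instant = Instant.parse("2016-01-01T16:03:45Z");
        System.out.println(format(instant, ZoneOffset.UTC));        // prints 20160101160345
        System.out.println(format(instant, ZoneOffset.ofHours(1))); // prints 20160101170345
    }
}
```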

#### HTML

HTML code can also be inserted directly. For example:
`<b>bold</b>` is the same as `'''bold'''`, and `&lambda;` is rendered λ.

## Example

The file [mupuzzle.text](https://github.com/julianmendez/wikihtml/blob/master/wikihtml/src/test/resources/mupuzzle.text) has the following wiki text:

```
== MIU system ==
(see [https://en.wikipedia.org/wiki/MU_puzzle MU puzzle])

# ''x''I → ''x''IU
# M''x'' → M''xx''
# ''x''III''y'' → ''x''U''y''
# ''x''UU''y'' → ''xy''
```

and is translated to an HTML document that renders as follows (markup omitted):

```
MIU system

(see MU puzzle)

  1. xI → xIU
  2. Mx → Mxx
  3. xIIIy → xUy
  4. xUUy → xy
```

The file [example.text](https://github.com/julianmendez/wikihtml/blob/master/wikihtml/src/test/resources/example.text) has more examples.

## Source code

To check out and compile the project, use:

```
$ git clone https://github.com/julianmendez/wikihtml.git
$ cd wikihtml
$ mvn clean install
```

The created executable library, its sources, and its Javadoc will be in `wikihtml/target`.

To compile the project offline, first download the dependencies:

```
$ mvn dependency:go-offline
```

and once offline, use:

```
$ mvn --offline clean install
```

The bundles uploaded to [Sonatype](https://oss.sonatype.org/) are created with:

```
$ mvn clean install -DperformRelease=true
```

and then:

```
$ cd wikihtml/target
$ jar -cf bundle.jar wikihtml-*
```

The version number is updated with:

```
$ mvn versions:set -DnewVersion=NEW_VERSION
```

where *NEW_VERSION* is the new version.

## Architecture

The library reads a wiki text and creates a `WikiDocument`.
It extracts the wiki text from the given input and processes it line by line.

Each line is transformed into a `ConversionToken`. Each token is then passed through a pipeline of `Renderer` objects. Each renderer (`-Renderer`) processes a conversion token and produces a list of conversion tokens, which are the input for the next renderer, if any. Some renderers are parameterized and grouped (`-GroupRenderer`). Some renderers process whole lines (in package `...line`) and some process pieces of lines (in package `...part`).

For example, all variables are processed by `...part.DateVariableRenderer`, but the headings are processed by a group of renderers (`...line.HeadingGroupRenderer`) composed of six renderers (h1, h2, ..., h6), where each one is a `...line.HeadingRenderer`.
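The pipeline idea can be sketched independently of the library's classes. In the sketch below, plain strings stand in for conversion tokens, and `PipelineSketch`, `TokenRenderer`, and `run` are hypothetical names. The actual signatures of `Renderer` and `ConversionToken` are not shown in this document and may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class PipelineSketch {
    /** Hypothetical stand-in for a renderer: one token in, a list of tokens out. */
    interface TokenRenderer {
        List<String> render(String token);
    }

    /** Feeds the tokens produced by each renderer to the next one in the pipeline. */
    static List<String> run(List<TokenRenderer> pipeline, String line) {
        List<String> tokens = Collections.singletonList(line);
        for (TokenRenderer renderer : pipeline) {
            List<String> next = new ArrayList<>();
            for (String token : tokens) {
                next.addAll(renderer.render(token));
            }
            tokens = next;
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Two toy renderers: the first splits a token, the second transforms each piece.
        TokenRenderer splitOnSpaces = token -> Arrays.asList(token.split(" "));
        TokenRenderer upperCase = token -> Collections.singletonList(token.toUpperCase());
        System.out.println(run(Arrays.asList(splitOnSpaces, upperCase), "miu system"));
        // prints [MIU, SYSTEM]
    }
}
```

Because every renderer returns a list, a renderer can leave a token as is, replace it, or split it into several tokens without the pipeline needing to know which.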

## Author

[Julian Mendez](https://julianmendez.github.io)

## Licenses

[Apache License Version 2.0](https://www.apache.org/licenses/LICENSE-2.0.txt), [GNU Lesser General Public License version 3](https://www.gnu.org/licenses/lgpl-3.0.txt)

## Release notes

See [release notes](https://julianmendez.github.io/wikihtml/RELEASE-NOTES.html).

## Contact

For more information, please contact @julianmendez.