Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/t-kalinowski/yasp

String functions for compact R code
Last synced: 4 months ago
Host: GitHub
URL: https://github.com/t-kalinowski/yasp
Owner: t-kalinowski
License: other
Created: 2017-11-10T14:28:53.000Z (over 7 years ago)
Default Branch: master
Last Pushed: 2019-05-10T15:08:33.000Z (almost 6 years ago)
Last Synced: 2024-10-12T04:47:40.035Z (4 months ago)
Language: R
Homepage: https://t-kalinowski.github.io/yasp/
Size: 85.9 KB
Stars: 8
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# yasp: String Functions for Compact R Code

[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/yasp)](https://cran.r-project.org/package=yasp)

[![CRAN RStudio mirror downloads](https://cranlogs.r-pkg.org/badges/last-month/yasp?color=blue)](https://r-pkg.org/pkg/yasp)

yasp is a small `R` package for working with character vectors. It is written
in pure base `R` and has no dependencies. It includes:

### `paste` wrappers with short names and various defaults

|                 | mnemonic                  | `collapse=`| `sep=` |
| :-------------- | :------------------------ | :--------- | :----- |
| `p()`, `p0()`   | paste, paste0             | `NULL`     | `""`   |
| `ps()`, `pss()` | paste (sep) space         | `NULL`     | `" "`  |
| `psh()`         | paste sep hyphen          | `NULL`     | `"-"`  |
| `psu()`         | paste sep underscore      | `NULL`     | `"_"`  |
| `psnl()`        | paste sep newline         | `NULL`     | `"\n"` |
| `pc()`          | paste collapse            | `""`       | `""`   |
| `pcs()`         | paste collapse space      | `" "`      | `""`   |
| `pcc()`         | paste collapse comma      | `", "`     | `""`   |
| `pcsc()`        | paste collapse semicolon  | `"; "`     | `""`   |
| `pcnl()`        | paste collapse newline    | `"\n"`     | `""`   |
| `pc_and()`      | paste collapse and        | _varies_   | `""`   |
| `pc_or()`       | paste collapse or         | _varies_   | `""`   |
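
To make the two columns of the table concrete, here is a minimal base `R` sketch of what the `sep=` and `collapse=` defaults mean. It uses only `paste()`, and it assumes (this is not confirmed by the source) that the wrappers simply forward these defaults to `paste()`. Under that assumption, the three calls correspond to the defaults listed for `ps()`, `pcc()`, and `pcsc()`.

``` r
x <- c("a", "b", "c")

# `sep` joins the arguments element-wise; the result still has length 3
by_sep <- paste(x, 1:3, sep = " ")                  # "a 1" "b 2" "c 3"

# `collapse` then joins the elements into one string
by_collapse <- paste(x, sep = "", collapse = ", ")  # "a, b, c"

# both can be combined: element-wise join first, then collapse
both <- paste(x, 1:3, sep = "", collapse = "; ")    # "a1; b2; c3"
```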

`pc_and()` and `pc_or()` collapse vectors of length 3 or greater using a serial
comma (also known as the Oxford comma):

``` r
pc_and( letters[1:2] )  # "a and b"
pc_and( letters[1:3] )  # "a, b, and c"
pc_or( letters[1:2] )   # "a or b"
pc_or( letters[1:3] )   # "a, b, or c"
```
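
The serial-comma rule can be sketched in base `R` as follows. `collapse_and_sketch()` is a hypothetical helper written for illustration only. It is not the package's implementation, and its handling of vectors of length 0 or 1 is an assumption.

``` r
# Hypothetical illustration of serial-comma collapsing (not yasp's own code)
collapse_and_sketch <- function(x, conj = "and") {
  n <- length(x)
  if (n <= 1) return(paste0(x, collapse = ""))     # assumption: nothing to join
  if (n == 2) return(paste(x[1], conj, x[2]))      # two items: no comma
  # three or more items: comma-separate, with a comma before the conjunction
  paste0(paste0(x[-n], collapse = ", "), ", ", conj, " ", x[n])
}

collapse_and_sketch(letters[1:2])               # "a and b"
collapse_and_sketch(letters[1:3])               # "a, b, and c"
collapse_and_sketch(letters[1:3], conj = "or")  # "a, b, or c"
```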

### `wrap` and variants

Wrap a string with some characters

``` r
wrap("abc", "__")  #  __abc__
dbl_quote("abc")   #   "abc"
sngl_quote("abc")  #   'abc'
parens("abc")      #   (abc)
bracket("abc")     #   [abc]
brace("abc")       #   {abc}
```
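
The wrapping behaviour shown above amounts to pasting a left and a right delimiter around each element. A minimal base `R` sketch follows. `wrap_sketch()` is a hypothetical name, and the `right = left` default is an assumption inferred from the `wrap("abc", "__")` example.

``` r
# Hypothetical illustration of wrapping (not yasp's own code)
wrap_sketch <- function(x, left, right = left) {
  paste0(left, x, right)   # vectorized over x, like paste0()
}

wrap_sketch("abc", "__")          # "__abc__"
wrap_sketch("abc", "(", ")")      # "(abc)"
wrap_sketch(c("a", "b"), "'")     # "'a'" "'b'"
```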

### `unwrap`, `unparens`

Remove pairs of characters from a string

``` r
# ps() pastes with sep = " ", which yields the space shown in the label
label <- ps("name", parens("attribute"))
label             # "name (attribute)"
unparens(label)   # "name attribute"

# by default, removes all matching pairs of left and right
x <- c("a", "(a)", "((a))", "(a) b", "a (b)", "(a) (b)" )
data.frame( x, unparens(x), check.names = FALSE )
#>         x unparens(x)
#> 1       a           a
#> 2     (a)           a
#> 3   ((a))           a
#> 4   (a) b         a b
#> 5   a (b)         a b
#> 6 (a) (b)         a b
```

Specify `n_pairs` to remove a specific number of pairs:

``` r
x <- c("(a)", "((a))", "(((a)))", "(a) (b)", "(a) (b) (c)", "(a) (b) (c) (d)")
data.frame( x, "n_pairs=1"   = unparens(x, n_pairs = 1),
               "n_pairs=2"   = unparens(x, n_pairs = 2),
               "n_pairs=3"   = unparens(x, n_pairs = 3),
               "n_pairs=Inf" = unparens(x), # the default
               check.names = FALSE)

#>                 x     n_pairs=1   n_pairs=2 n_pairs=3 n_pairs=Inf
#> 1             (a)             a           a         a           a
#> 2           ((a))           (a)           a         a           a
#> 3         (((a)))         ((a))         (a)         a           a
#> 4         (a) (b)         a (b)         a b       a b         a b
#> 5     (a) (b) (c)     a (b) (c)     a b (c)     a b c       a b c
#> 6 (a) (b) (c) (d) a (b) (c) (d) a b (c) (d) a b c (d)     a b c d
```

Use `unwrap()` to specify any pair of strings for left and right:

``` r
x <- "A string with some \\emph{latex tags}."
unwrap(x, "\\emph{", "}")
#> [1] "A string with some latex tags."
```

By default, only complete pairs are removed. Set `left` or `right` to `""` to override this and remove unpaired characters.

``` r
x <- c("a)", "a))", "(a", "((a" )
data.frame(x, unparens(x), 'left=""' = unwrap(x, left = "", right = ")"),
           check.names = FALSE)

#>     x unparens(x) left=""
#> 1  a)          a)       a
#> 2 a))         a))       a
#> 3  (a          (a      (a
#> 4 ((a         ((a     ((a
```
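
One way to reproduce the outputs shown in this section in base `R` is to repeatedly drop the first occurrence of `left` and the first occurrence of `right`, stopping when either is missing or `n_pairs` is reached. The sketch below does that. `unwrap_sketch()` is a hypothetical helper, not the package's implementation, and it may differ from `unwrap()` on inputs not shown here (for example when a right delimiter precedes a left one).

``` r
# Hypothetical illustration of pair removal (not yasp's own code)
unwrap_sketch <- function(x, left, right = left, n_pairs = Inf) {
  if (left == "" && right == "") return(x)          # nothing to remove
  # drop the first literal occurrence of `pat`; "" means "skip this side"
  drop_first <- function(s, pat) {
    if (pat == "") s else sub(pat, "", s, fixed = TRUE)
  }
  one <- function(s) {
    n <- 0
    while (n < n_pairs) {
      without_left <- drop_first(s, left)
      if (left != "" && identical(without_left, s)) break       # no left found
      without_pair <- drop_first(without_left, right)
      if (right != "" && identical(without_pair, without_left)) break  # no right
      s <- without_pair
      n <- n + 1
    }
    s
  }
  vapply(x, one, character(1), USE.NAMES = FALSE)
}

unwrap_sketch(c("(a)", "((a))", "(a) (b)"), "(", ")", n_pairs = 1)
unwrap_sketch("A string with some \\emph{latex tags}.", "\\emph{", "}")
unwrap_sketch(c("a)", "a))", "(a"), left = "", right = ")")
```

Matching is literal (`fixed = TRUE`), so delimiters such as `\emph{` need no regular-expression escaping.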

### `sentence`

`paste` with some additional string cleaning (mostly concerning
whitespace) appropriate for prose sentences. It

  + trims leading and trailing whitespace
  + collapses runs of whitespace into a single space
  + appends a period `.` if there is no terminal punctuation mark (`.`, `?`, or `!`)
  + removes spaces preceding punctuation characters: `.?!,;:`
  + collapses sequences of punctuation characters (`.?!,;:`), possibly
      separated by spaces, into a single punctuation character. The first
      punctuation character of the sequence is used, with priority given to
      terminal punctuation marks `.?!` if present
  + makes sure a space or end-of-string follows every one of
      `.?!,;:`, with an exception for the special case of `.,:`
      followed by a digit, indicating the punctuation is a decimal period,
      number separator, or time delimiter
  + capitalizes the first letter of each sentence (start-of-string or
      following a `.?!`)

Some examples from `?sentence`:

``` r
# print each input next to its cleaned-up version
compare <- function(x) cat(sprintf(' in: "%s"\nout: "%s"\n', x, sentence(x)))

compare("capitalized and period added")
#>  in: "capitalized and period added"
#> out: "Capitalized and period added."
compare("whitespace:added ,or removed ; like this.and this")
#>  in: "whitespace:added ,or removed ; like this.and this"
#> out: "Whitespace: added, or removed; like this. And this."
compare("periods and commas in numbers like 1,234.567 are fine !")
#>  in: "periods and commas in numbers like 1,234.567 are fine !"
#> out: "Periods and commas in numbers like 1,234.567 are fine!"
compare("colons can be punctuation or time : 12:00 !")
#>  in: "colons can be punctuation or time : 12:00 !"
#> out: "Colons can be punctuation or time: 12:00!"
compare("only one punctuation at a time!.?,;")
#>  in: "only one punctuation at a time!.?,;"
#> out: "Only one punctuation at a time!"
compare("The first mark ,; is kept;,,with priority for terminal marks  ;,.")
#>  in: "The first mark ,; is kept;,,with priority for terminal marks  ;,."
#> out: "The first mark, is kept; with priority for terminal marks."

# vectorized like paste()
sentence(
 "The", c("first", "second", "third"), "letter is", letters[1:3],
 parens("uppercase:", sngl_quote(LETTERS[1:3])), ".")
#> [1] "The first letter is a (uppercase: 'A')."
#> [2] "The second letter is b (uppercase: 'B')."
#> [3] "The third letter is c (uppercase: 'C')."
```
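
Most of the cleaning rules listed for `sentence()` can be approximated with a few regular-expression passes in base `R`. The sketch below is a simplified illustration under stated assumptions, not the package's implementation. `sentence_sketch()` is a hypothetical name, it takes a single character vector rather than `paste`-style arguments, and it deliberately omits the rule that collapses sequences of punctuation characters.

``` r
# Hypothetical, simplified illustration (not yasp's own code);
# does NOT collapse runs of punctuation such as "!.?,;"
sentence_sketch <- function(x) {
  x <- trimws(x)                                   # trim both ends
  x <- gsub("[[:space:]]+", " ", x)                # collapse whitespace runs
  x <- gsub(" +([.?!,;:])", "\\1", x)              # no space before punctuation
  # add a space after ?!; unless a space or end-of-string follows
  x <- gsub("([?!;])(?=[^ ])", "\\1 ", x, perl = TRUE)
  # same for .,: but leave them alone when a digit follows (1,234.5 or 12:00)
  x <- gsub("([.,:])(?=[^ 0-9])", "\\1 ", x, perl = TRUE)
  # append a period when there is no terminal punctuation mark
  x <- ifelse(grepl("[.?!]$", x), x, paste0(x, "."))
  # capitalize the first letter at start-of-string and after ". ", "? ", "! "
  gsub("(^|[.?!] )([a-z])", "\\1\\U\\2", x, perl = TRUE)
}

sentence_sketch("whitespace:added ,or removed ; like this.and this")
sentence_sketch("colons can be punctuation or time : 12:00 !")
```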

## Installation

You can install 'yasp' from CRAN with:

``` r
install.packages("yasp")
```

Or install from GitHub with:

``` r
# install.packages("devtools")
devtools::install_github("t-kalinowski/yasp")
```