Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/qxzzxq/faker
Generate fake data for Scala and Spark :tophat:
https://github.com/qxzzxq/faker
fake fake-data faker faker4s scala spark spark-data-generator test-data test-data-generator testing
Last synced: about 2 months ago
JSON representation
Generate fake data for Scala and Spark :tophat:
- Host: GitHub
- URL: https://github.com/qxzzxq/faker
- Owner: qxzzxq
- License: apache-2.0
- Created: 2020-06-15T07:53:35.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2022-03-29T18:13:04.000Z (almost 3 years ago)
- Last Synced: 2023-03-10T04:22:06.313Z (almost 2 years ago)
- Topics: fake, fake-data, faker, faker4s, scala, spark, spark-data-generator, test-data, test-data-generator, testing
- Language: Scala
- Homepage:
- Size: 316 KB
- Stars: 14
- Watchers: 1
- Forks: 4
- Open Issues: 1
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# faker
![build](https://github.com/qxzzxq/faker/workflows/build/badge.svg) [![codecov](https://codecov.io/gh/qxzzxq/faker/branch/master/graph/badge.svg)](https://codecov.io/gh/qxzzxq/faker)
Faker is a Scala library that can generate fake data.
This project is inspired by the other faker projects.
## Get faker
This project does not have a stable release yet (maybe soon). To test faker, use the snapshot version:
```xml
ossrh-snapshots
https://oss.sonatype.org/content/repositories/snapshots/
dev.qinx
faker_2.12
0.0.1-SNAPSHOT
```
## Usage
Just add *faker*'s annotations into a standard Scala case class and *faker* will handle the rest.
```scala
import java.time.LocalTimeimport dev.qinx.faker.Faker
import dev.qinx.faker.annotation.base.{IntType, Text}
import dev.qinx.faker.annotation.datetime.{Date, Time}
import dev.qinx.faker.annotation.geo.{Lat, Lon}
import dev.qinx.faker.annotation.person.Name
import dev.qinx.faker.annotation.transport.Airport
import dev.qinx.faker.enums.Localecase class MyClass1(@Date date: String,
@Time time: LocalTime,
@Airport(country = "FR") airportCode: String,
@Name(locale = Locale.zh_CN) nameCN: String,
@Text(pattern = "??-###") id: String,
int1: Int,
@IntType(seed = "123") int2: String,
@IntType(min = 0, max = 10) int3: Int)new Faker[MyClass1] get 10 foreach println
// MyClass1(2014-01-13,18:17:11,ORY,平磊,bm-250,-1501092464,1018954901,1)
// MyClass1(2012-09-30,19:23:54,TLS,申秀芳,XK-741,-252380286,1295249578,9)
// MyClass1(1994-05-22,03:27:26,ORY,敬桂英,VY-110,1097976093,-1829099982,9)
// MyClass1(2008-02-02,03:09:18,ORY,苑波,YX-343,-43394488,1111887674,5)
// MyClass1(1973-07-31,10:57:45,ORY,迟俊,ah-905,1558683535,-1621910390,7)
// MyClass1(1984-04-09,04:24:45,BOD,芦秀珍,Uo-900,366140728,-1935747844,2)
// MyClass1(2016-05-01,03:42:27,MRS,季兰英,qV-446,-995036697,696711130,7)
// MyClass1(1977-10-08,10:04:50,MRS,荆红,pk-108,-1973051050,-1366603797,2)
// MyClass1(1974-09-16,23:26:43,BOD,廖秀梅,He-851,-462826625,1149563170,5)
// MyClass1(1981-04-19,08:03:33,BSL,韩超,hk-371,1977755351,1041944832,6)
```Faker can also handle array (the support of other collections are still a WIP)
```scala
case class MyClass2(myClass1: Array[MyClass1])val fake = new Faker[MyClass2].get()
fake.myClass1 foreach println
// MyClass1(1990-08-17,23:56:35,NCE,匡健,uE-846,838894708,1018954901,8)
// MyClass1(1974-10-16,01:45:04,BSL,傅雪,ch-192,1582091789,1295249578,8)
// MyClass1(1988-04-09,07:44:15,LYS,查雷,fx-476,1388926418,-1829099982,2)// By default Faker generate an array of totalLength 3, you can adjust it by adding @ArrayType annotation
case class MyClass2Bis(@ArrayType(length = 10) myClass1: Array[MyClass1])
```It's also possible to cross join one field to another to generate all the possible combinations. In such a use case, use the `@Series` annotation
```scala
case class CrossJoinExample(@Series(length = 2) @Date date: String,
@Series(length = 3, crossJoin = "date") @Text(pattern = "??-###") id: String,
@Series(id = "myInput", crossJoin = "id") name: String,
@FloatType(min = 10, max = 20) price: Float)val faker = new Faker[CrossJoinExample]
faker.putSeries("myInput", Array("apple", "banana", "orange"))import spark.implicits._ // to visualize as a spark dataset
faker.getDataSeries.toDS().show()
// +----------+------+------+---------+
// | date| id| name| price|
// +----------+------+------+---------+
// |1984-11-13|ym-025| apple|15.212642|
// |1984-11-13|ym-025|banana|17.103207|
// |1984-11-13|ym-025|orange|17.610806|
// |1984-11-13|eN-286| apple|11.953639|
// |1984-11-13|eN-286|banana|10.136742|
// |1984-11-13|eN-286|orange|12.635193|
// |1984-11-13|wA-500| apple| 12.24368|
// |1984-11-13|wA-500|banana|16.292076|
// |1984-11-13|wA-500|orange|15.748799|
// |1980-09-04|ym-025| apple| 16.94527|
// |1980-09-04|ym-025|banana| 18.91407|
// |1980-09-04|ym-025|orange|11.317879|
// |1980-09-04|eN-286| apple|19.948784|
// |1980-09-04|eN-286|banana|14.381845|
// |1980-09-04|eN-286|orange|16.428938|
// |1980-09-04|wA-500| apple|19.262041|
// |1980-09-04|wA-500|banana|10.533231|
// |1980-09-04|wA-500|orange|19.337175|
// +----------+------+------+---------+
```Faker also provides the classic flavor API like the other faker libraries:
```scala
import dev.qinx.faker.FakerFaker.person().name()
Faker.localDate()
Faker.array[String](5)
Faker.array[MyClass1](5)
```## Available Annotations
- base
- `@IntType`
- `@LongType`
- `@FloatType`
- `@DoubleType`
- `@Text`
- `@UpperCase`
- `@LowerCase`
- `@Digit`
- datetime
- `@Date`
- `@Time`
- `@DateTime`
- geo
- `@Lat`
- `@Lon`
- person
- `@Name`
- transport
- `@Airport`
- company
- `@Company`
- data
- `@Series`
- `@Gaussian`
- collection
- `@ArrayType`
More features will be added.## Localization
Some annotations can handle locales, just configure the annotation with
the corresponding `Locale` value like this:
```Scala
@Name(locale = Locale.zh_CN)
```## Performance
Faker is thread safe, however the concurrent use may lead to poor performance (due to the limit of the Java `Random` class).