https://github.com/arquillian/arquillian-cube-q

Fault injection and chaos testing all in one, sweet DSL.
Host: GitHub
URL: https://github.com/arquillian/arquillian-cube-q
Owner: arquillian
License: apache-2.0
Created: 2015-11-30T22:20:28.000Z (about 9 years ago)
Default Branch: master
Last Pushed: 2021-06-22T11:15:21.000Z (over 3 years ago)
Last Synced: 2024-04-16T19:14:36.940Z (9 months ago)
Topics: chaos-monkey, docker, java, jvm, testing
Language: Java
Homepage:
Size: 263 KB
Stars: 11
Watchers: 11
Forks: 6
Open Issues: 17
Metadata Files:
- Readme: README.adoc
- License: LICENSE
= Introduction
:numbered:
:sectlinks:
:sectanchors:
:sectids:
:source-language: java
:source-highlighter: coderay
:sectnums:
:icons: font
:toc: left
:toclevels: 3

image:https://travis-ci.org/arquillian/arquillian-cube-q.svg?branch=master["Build Status", link="https://travis-ci.org/arquillian/arquillian-cube-q"]

Usually when we talk about writing tests, the first thing that comes to mind is some kind of static test where you send an input and expect an output.

For example, you send a wrong parameter to a REST service and expect it to return an error message or status code.

But applications usually run far longer than the time it takes to execute all the tests: probably days or months until you update them.

During this time things happen: the network slows down, an unknown process eats all the CPU or, if you are using Docker, a container dies.

So when you run your application for a long time, some kind of *chaos* might appear.

The question is, are you sure your application deals correctly with these situations?

You can test it manually, but if you want to follow a CI/CD approach you need an automated way to run these tests.

And this is where Arquillian Cube Q helps you.

Arquillian Cube Q is an extension of Arquillian Cube (https://github.com/arquillian/arquillian-cube) that allows you to write chaos tests.

Since Arquillian Cube Q is an extension of Cube, it relies on Docker to execute these tests.

== Chaos

image::http://www.starshipnivan.com/blog/wp-content/uploads/2010/10/De-Lancie-crop.jpg[]

There are several levels of chaos that you might test, from network chaos (latency, bandwidth limitation, ...) to operating system chaos (CPU burn, IO burn, fill disk, ...).

Like all Arquillian projects, Arquillian Q reuses existing chaos frameworks by integrating them into the Arquillian philosophy.

Let's see what is supported:

=== Network Chaos

To do *network chaos*, Arquillian Q integrates with the Toxiproxy project (https://github.com/Shopify/toxiproxy).

Toxiproxy is a framework for simulating network conditions.

It is a TCP proxy that intercepts communication between two endpoints and adds some chaos before reaching the real endpoint.

Toxiproxy supports the following toxics:

latency:: Add a delay to all data going through the proxy. The delay is equal to latency +/- jitter.

down:: Bring a service down.

bandwidth:: Limit a connection to a maximum number of kilobytes per second.

slow close:: Delay the TCP socket from closing until delay has elapsed.

timeout:: Stops all data from getting through, and closes the connection after timeout. If timeout is 0, the connection won't close, and data will be delayed until the toxic is removed.

slicer:: Slices TCP data up into small bits, optionally adding a delay between each sliced "packet".
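
The `latency +/- jitter` rule can be sketched in a few lines of Java. This is only an illustration and not Toxiproxy's implementation: the class `LatencyJitterSketch`, the method `delayMillis` and the choice of a uniform offset are assumptions made for this example.

```java
import java.util.Random;

// Illustrative sketch only; names and the uniform offset are assumptions.
final class LatencyJitterSketch {

    /** Returns a delay in [latency - jitter, latency + jitter], never below zero. */
    static long delayMillis(long latencyMillis, long jitterMillis, Random random) {
        if (latencyMillis < 0 || jitterMillis < 0) {
            throw new IllegalArgumentException("latency and jitter must be non-negative");
        }
        // Pick a whole-millisecond offset uniformly from [-jitter, +jitter].
        int span = Math.toIntExact(2 * jitterMillis + 1);
        long offset = random.nextInt(span) - jitterMillis;
        return Math.max(0, latencyMillis + offset);
    }

    public static void main(String[] args) {
        Random random = new Random();
        for (int i = 0; i < 5; i++) {
            System.out.println(delayMillis(4000, 500, random) + " ms");
        }
    }
}
```

With a latency of 4000 ms and a jitter of 500 ms, every computed delay falls between 3500 ms and 4500 ms.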

Arquillian Q supports Toxiproxy by registering Toxiproxy as a Docker container, inspecting the cube definitions (in _cube_ format or _docker-compose_ format) and automatically redirecting links to Toxiproxy.

As an example:

ContainerA -link-> ContainerB

is converted to:

ContainerA -link-> ToxiproxyContainer -link-> ContainerB

After this you can program toxics into Toxiproxy and execute the test.
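
The link redirection described above can be pictured as a small transformation over a map of container links. The following is a minimal sketch under stated assumptions, not the code Arquillian Q uses: the class `LinkRewriteSketch`, the method `routeThroughProxy` and the proxy container name `toxiproxy` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; all names here are hypothetical.
final class LinkRewriteSketch {

    static final String PROXY = "toxiproxy"; // assumed name of the proxy container

    /** Takes "container -> linked containers" and routes every link through the proxy. */
    static Map<String, List<String>> routeThroughProxy(Map<String, List<String>> links) {
        Map<String, List<String>> rewritten = new LinkedHashMap<>();
        List<String> proxyTargets = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : links.entrySet()) {
            if (entry.getValue().isEmpty()) {
                // Containers without links are left untouched.
                rewritten.put(entry.getKey(), new ArrayList<>());
                continue;
            }
            for (String target : entry.getValue()) {
                if (!proxyTargets.contains(target)) {
                    proxyTargets.add(target); // the proxy now links to the real target
                }
            }
            // The original container only talks to the proxy.
            rewritten.put(entry.getKey(), new ArrayList<>(Collections.singletonList(PROXY)));
        }
        rewritten.put(PROXY, proxyTargets);
        return rewritten;
    }

    public static void main(String[] args) {
        Map<String, List<String>> links = new LinkedHashMap<>();
        links.put("ContainerA", Collections.singletonList("ContainerB"));
        links.put("ContainerB", Collections.<String>emptyList());
        System.out.println(routeThroughProxy(links));
    }
}
```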

==== Adding Dependency

Arquillian Q Toxiproxy is only a jar deployed to Maven central:

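[source, xml]
.pom.xml
----
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.jboss.arquillian</groupId>
      <artifactId>arquillian-bom</artifactId>
      <version>${version.arquillian_core}</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

<dependencies>
  <dependency>
    <groupId>org.arquillian.cube.q</groupId>
    <artifactId>arquillian-cube-q-toxic</artifactId>
    <scope>test</scope>
    <version>${version.arquillian_q}</version>
  </dependency>
  <dependency>
    <groupId>org.jboss.arquillian.junit</groupId>
    <artifactId>arquillian-junit-standalone</artifactId>
    <scope>test</scope>
  </dependency>
  <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
  </dependency>
</dependencies>
----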

IMPORTANT: Notice that instead of registering `arquillian-junit-container`, as you usually do in an Arquillian test, you are using `arquillian-junit-standalone`. This is because the microdeployment feature (a method annotated with `@Deployment`) makes no sense in this kind of test.

==== Configuration

From Q's point of view you don't need to configure anything beyond the Cube configuration file.

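[source, xml]
.arquillian.xml
----
<arquillian xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xmlns="http://jboss.org/schema/arquillian"
            xsi:schemaLocation="http://jboss.org/schema/arquillian http://jboss.org/schema/arquillian/arquillian_1_0.xsd">

  <extension qualifier="docker">
    <property name="machineName">dev</property>
    <property name="dockerContainers">
        hw:
          image: lordofthejars/helloworld
          env: ["CATALINA_OPTS=-Djava.security.egd=file:/dev/./urandom"]
          portBindings: [8081->8080/tcp]
          links:
            - pingpong:pingpong
        pingpong:
          image: jonmorehouse/ping-pong
          exposedPorts: [8080/tcp]
    </property>
  </extension>

</arquillian>
----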

In this case the `hw` container (image `lordofthejars/helloworld`) connects to the `pingpong` container.

==== Test

Then the test looks like:

[source, java]
----
@RunWith(Arquillian.class)
public class ToxicFuntionalTestCase {

  @ArquillianResource
  private NetworkChaos networkChaos; // <1>

  @HostIp
  private String ip;

  @Test
  public void shouldAddLatency() throws Exception {
    networkChaos.on("pingpong", 8080).latency(latencyInMillis(4000)) // <2>
      .exec(() -> { // <3>
        URL url = new URL("http://" + ip + ":" + 8081 + "/hw/HelloWorld");
        final long l = System.currentTimeMillis();
        String response = IOUtil.asString(url.openStream());
        System.out.println(response);
        System.out.println("Time:" + (System.currentTimeMillis() - l));
        // assertions
      }); // <4>
  }
}
----

<1> Enriches the test with a `NetworkChaos` instance to communicate with _Toxiproxy_.

<2> Adds a latency of 4 seconds to communication with the `pingpong` container through port _8080_.

<3> Executes the test logic. Notice that the execution time will be greater than 4 seconds.

<4> After the callback has executed, the toxics are reset.

TIP: The `exec` method also lets you pass how many times you want to execute the test, `networkChaos.on("pingpong", 8080).latency(latencyInMillis(4000)).exec(times(2), () -> {})`, or how long you want to keep executing it, `Q.on("pingpong", 8080).exec(during(15, TimeUnit.SECONDS), () -> {})`.
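
To make the two run conditions concrete, here is a minimal sketch of a loop that runs a callback either a fixed number of times or for a given duration, and then resets the toxics. It is not the Arquillian Q implementation: the class `ExecLoopSketch`, both method names and the `resetToxics` callback are assumptions made for this example.

```java
import java.util.concurrent.TimeUnit;

// Illustrative sketch only; names are hypothetical.
final class ExecLoopSketch {

    /** Runs the test a fixed number of times, then resets the toxics. */
    static int execTimes(int times, Runnable test, Runnable resetToxics) {
        int runs = 0;
        try {
            for (int i = 0; i < times; i++) {
                test.run();
                runs++;
            }
        } finally {
            resetToxics.run(); // reset even if the test throws
        }
        return runs;
    }

    /** Keeps running the test until the duration has elapsed, then resets the toxics. */
    static int execDuring(long duration, TimeUnit unit, Runnable test, Runnable resetToxics) {
        long deadline = System.nanoTime() + unit.toNanos(duration);
        int runs = 0;
        try {
            do {
                test.run();
                runs++;
            } while (System.nanoTime() - deadline < 0);
        } finally {
            resetToxics.run();
        }
        return runs;
    }

    public static void main(String[] args) {
        int runs = execTimes(2, () -> System.out.println("test run"),
                () -> System.out.println("toxics reset"));
        System.out.println("runs: " + runs);
    }
}
```

Putting the reset in a `finally` block is a design choice of this sketch, so that a failing assertion inside the callback does not leave toxics active for later tests.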

You can see full example at: https://github.com/arquillian/arquillian-cube-q/tree/master/ftest-toxic

==== Adding some randomness

Some of the discrete values set in toxics such as `slowClose`, `bandwidth`, `timeout` or `slice` can be randomized using mathematical distributions.

At this time two distributions are supported:

* Uniform Distribution: Returns values uniformly distributed across a range. You can read about this distribution at https://en.wikipedia.org/wiki/Discrete_uniform_distribution

* LogNormal Distribution: Returns log-normally distributed values. You can use this website https://www.wolframalpha.com/input/?i=lognormaldistribution%28log%2890%29%2C+0.1%29 to play with the values, and you can read more about this distribution at https://en.wikipedia.org/wiki/Log-normal_distribution

For example, this is how you can randomize the latency:

[source, java]
----
networkChaos.on("pingpong", 8080)
            .latency(logNormalLatencyInMillis(2000, 0.3))
            .exec(times(2), () -> {
     URL url = new URL("http://" + ip + ":" + 8081 + "/hw/HelloWorld");
     final long l = System.currentTimeMillis();
     String response = IOUtil.asString(url.openStream());
     System.out.println(response);
     System.out.println("Time:" + (System.currentTimeMillis() - l));
});
----

In the configuration above, latency times follow a log-normal distribution with a median of 2 seconds and a sigma value of 0.3.

For each iteration of the test, a new value is calculated and sent to Toxiproxy.
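
To get a feeling for the values such a distribution produces, the following sketch draws log-normal samples from a median and a sigma using only the JDK. It is an illustration and not the code behind `logNormalLatencyInMillis`: the class `LogNormalLatencySketch` and the method `sampleMillis` are hypothetical names. It relies on the fact that if Z is standard normal, then `median * exp(sigma * Z)` is log-normally distributed with that median.

```java
import java.util.Random;

// Illustrative sketch only; names are hypothetical.
final class LogNormalLatencySketch {

    /** Draws one log-normally distributed latency, in milliseconds. */
    static long sampleMillis(double medianMillis, double sigma, Random random) {
        // exp(mu) is the median of a log-normal distribution, so mu = ln(median).
        double mu = Math.log(medianMillis);
        return Math.round(Math.exp(mu + sigma * random.nextGaussian()));
    }

    public static void main(String[] args) {
        Random random = new Random();
        // One fresh value per test iteration, as in the example above.
        for (int iteration = 1; iteration <= 2; iteration++) {
            System.out.println("iteration " + iteration + ": "
                    + sampleMillis(2000, 0.3, random) + " ms");
        }
    }
}
```

With a sigma of 0.3 most samples stay within roughly a factor of two of the median, so a 2000 ms median typically yields latencies between about 1 and 4 seconds.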

==== Binding Ports Chaos

Sometimes you don't want to add chaos between containers but on the bound ports.

That is, you add chaos to the communication between the host and the containers.

This is really useful when you want to test what happens to your frontend application (JavaScript) when there is some chaos.

Assuming that A has a port binding, something like:

A -> B

is converted to:

Proxy -> A -> B

Here A no longer has a port binding, only exposed ports, and it is the Proxy that holds the port binding.

To use this, just configure the following parameter in the `arquillian.xml` file:

[source, xml]

.arquillian.xml

----

    true

----

IMPORTANT: By default this flag is `false`. If you set it to `true`, chaos can only be applied between the host and the containers, not between containers.

You can see an example at: https://github.com/arquillian/arquillian-cube-q/tree/master/ftest-toxic-frontend

=== Container Chaos

To do *container chaos*, Arquillian Q integrates with the Pumba project (https://github.com/alexei-led/pumba).

Pumba is an application that you run on every Docker host in your cluster. Once in a while it "randomly" stops running containers whose names match the specified names or name patterns.

You can even specify the signal that is sent to "kill" the container.

It supports:

* Stop a container.

* Remove a container.

* Kill a container process with signal.

Arquillian Q registers a Pumba container inside the Docker host you have configured.

==== Adding Dependency

Arquillian Q Pumba is only a jar file deployed in Maven central.

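[source, xml]
.pom.xml
----
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.jboss.arquillian</groupId>
      <artifactId>arquillian-bom</artifactId>
      <version>${version.arquillian_core}</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

<dependencies>
  <dependency>
    <groupId>org.arquillian.cube.q</groupId>
    <artifactId>arquillian-cube-q-pumba</artifactId>
    <scope>test</scope>
    <version>${version.arquillian_q}</version>
  </dependency>
  <dependency>
    <groupId>org.jboss.arquillian.junit</groupId>
    <artifactId>arquillian-junit-standalone</artifactId>
    <scope>test</scope>
  </dependency>
  <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
  </dependency>
</dependencies>
----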

IMPORTANT: Notice that instead of registering `arquillian-junit-container`, as you usually do in an Arquillian test, you are using `arquillian-junit-standalone`. This is because the microdeployment feature (a method annotated with `@Deployment`) makes no sense in this kind of test.

==== Configuration

From Q's point of view you don't need to configure anything beyond the Cube configuration file.

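[source, xml]
.arquillian.xml
----
<arquillian xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xmlns="http://jboss.org/schema/arquillian"
            xsi:schemaLocation="http://jboss.org/schema/arquillian http://jboss.org/schema/arquillian/arquillian_1_0.xsd">

  <extension qualifier="docker">
    <property name="machineName">dev</property>
    <property name="dockerContainers">
      pingpong:
        image: jonmorehouse/ping-pong
        exposedPorts: [8080/tcp]
      pingpong2:
        image: jonmorehouse/ping-pong
        exposedPorts: [8080/tcp]
    </property>
  </extension>

</arquillian>
----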

In this case we are defining two instances of the same image.

==== Test

Then the test looks like:

[source, java]
----
@RunWith(Arquillian.class)
public class PumbaFunctionalTestCase {

  @ArquillianResource // <1>
  ContainerChaos containerChaos;

  @ArquillianResource
  DockerClient dockerClient; // <2>

  @Test
  public void shouldKillContainers() throws Exception {
    containerChaos
        .onCubeDockerHost()
        .killRandomly( // <3>
            ContainerChaos.ContainersType.regularExpression("^pingpong"), // <4>
            ContainerChaos.IntervalType.intervalInSeconds(4), // <5>
            ContainerChaos.KillSignal.SIGTERM
        )
        .exec(); // <6>

    final List<Container> containers = dockerClient.listContainersCmd().exec();
    // The Pumba container does not kill itself, so it is the only one left
    assertThat(containers).hasSize(1);
  }
}
----

<1> Enriches the test with container chaos.

<2> Enriches the test with a `DockerClient` instance to communicate with the Docker host in the test.

<3> Kills containers randomly, one by one.

<4> Kills only containers whose name starts with _pingpong_.

<5> Time to wait before killing another container.

<6> Starts Pumba. In this case no callback is used.

As with *Network Chaos*, you can also pass the test as a callback and specify how many times or for how long to execute it.
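
The selection logic behind this kind of random kill can be sketched without Docker. The example below is not Pumba's code: the class `RandomKillSketch`, its methods and the container names in `main` are assumptions made for illustration. It filters container names with a regular expression and then picks victims one by one in random order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.regex.Pattern;

// Illustrative sketch only; names are hypothetical.
final class RandomKillSketch {

    /** Returns the container names that match the given regular expression. */
    static List<String> candidates(List<String> names, String regex) {
        Pattern pattern = Pattern.compile(regex);
        List<String> matching = new ArrayList<>();
        for (String name : names) {
            if (pattern.matcher(name).find()) {
                matching.add(name);
            }
        }
        return matching;
    }

    /** Returns the order in which matching containers would be killed, one per interval. */
    static List<String> killOrder(List<String> names, String regex, Random random) {
        List<String> remaining = candidates(names, regex);
        List<String> order = new ArrayList<>();
        while (!remaining.isEmpty()) {
            // A real run would wait for the configured interval between kills.
            order.add(remaining.remove(random.nextInt(remaining.size())));
        }
        return order;
    }

    public static void main(String[] args) {
        List<String> running = Arrays.asList("pingpong", "pingpong2", "pumba");
        System.out.println(killOrder(running, "^pingpong", new Random()));
    }
}
```

With the names used in `main`, both `pingpong` containers end up in the kill order while the `pumba` container is never selected, which matches the single remaining container asserted in the test above.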

You can see full example at: https://github.com/arquillian/arquillian-cube-q/tree/master/ftest-pumba

=== Operating System Chaos

To do *operating system chaos*, Arquillian Q uses modified versions of some scripts from the Netflix Simian Army project.

Some scripts have been modified so that they make sense in the Docker world instead of the AWS world.

It supports:

* Block a port using `iptables` command.

* Burn CPU using `dd` command. That is putting CPU to 100%.

* Burn IO using `dd` command.

* Fill disk with `dd` command.

* Kill process using `pkill` command.

* Null Route using `ip` command.

IMPORTANT: Scripts are executed inside the container. This means that the command used in the script must be installed inside the container. Some images might contain them, others not.

TIP: Creating chaos with scripts opens up a whole new range of possibilities, since the only limit is the set of commands available to run them. Please feel free to contribute your own scripts.

==== Adding Dependency

Arquillian Simian Army is only a jar file deployed in Maven central.

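[source, xml]
.pom.xml
----
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.jboss.arquillian</groupId>
      <artifactId>arquillian-bom</artifactId>
      <version>${version.arquillian_core}</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

<dependencies>
  <dependency>
    <groupId>org.arquillian.cube.q</groupId>
    <artifactId>arquillian-cube-q-simianarmy</artifactId>
    <scope>test</scope>
    <version>${version.arquillian_q}</version>
  </dependency>
  <dependency>
    <groupId>org.jboss.arquillian.junit</groupId>
    <artifactId>arquillian-junit-standalone</artifactId>
    <scope>test</scope>
  </dependency>
  <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
  </dependency>
</dependencies>
----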

IMPORTANT: Notice that instead of registering `arquillian-junit-container`, as you usually do in an Arquillian test, you are using `arquillian-junit-standalone`. This is because the microdeployment feature (a method annotated with `@Deployment`) makes no sense in this kind of test.

==== Test

Then the test looks like:

[source, java]
----
@RunWith(Arquillian.class)
public class SimianArmyFunctionalTestCase {

    @ArquillianResource // <1>
    OperativeSystemChaos operativeSystemChaos;

    @HostIp
    String dockerHost;

    @HostPort(containerName = "pingpong", value = 8080)
    int port;

    // Ignored: burning CPU on the same machine that runs the tests disrupts everything
    @Test(expected = Exception.class) @Ignore
    public void shouldExecuteBurnCpuChaos() throws Exception {
        operativeSystemChaos.on("pingpong") // <2>
            .burnCpu(singleCpu()) // <3>
            .exec(); // <4>
        //.....
    }
}
----

<1> Enriches the test with operating system chaos.

<2> Selects the container in which the chaos is applied.

<3> Sets the burn CPU chaos as if the system had only one CPU.

<4> Starts the burn CPU script. In this case no callback is used.

As with *Network Chaos*, you can also pass the test as a callback and specify how many times or for how long to execute it.

You can see full example at: https://github.com/arquillian/arquillian-cube-q/tree/master/ftest-simianarmy