https://github.com/alainpham/large-file-processor

Demo showing different methods for handling large data sets and streams of various kinds, such as XML and CSV.

Host: GitHub
URL: https://github.com/alainpham/large-file-processor
Owner: alainpham
Created: 2017-01-30T08:58:13.000Z (over 8 years ago)
Default Branch: master
Last Pushed: 2017-01-30T08:59:17.000Z (over 8 years ago)
Last Synced: 2025-01-27T07:42:31.666Z (4 months ago)
Language: Java
Size: 221 KB
Stars: 0
Watchers: 2
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: ReadMe.txt

README

Large file processor sample project (OSGi)
=========================================

How to use this project

0) Set the working folder where the data files are stored in the blueprint.xml file

1) Start the project locally:
mvn clean package camel:run

2) Generate big files from the sample data by calling the following REST web services:

To generate a big CSV file
http://yourIP:7123/fileprocessor/gen/{fileSizeInMegaBytes}

e.g. http://yourIP:7123/fileprocessor/gen/100MB.csv

To generate a big XML file
http://yourIP:7123/fileprocessor/genxml/{fileSizeInMegaBytes}
e.g. http://yourIP:7123/fileprocessor/genxml/100MB.xml

3) Run the tests by calling the available REST services:

1 File + built-in tokenizer
http://yourIP:7123/fileprocessor/tokenize/100MB.csv

2 BufferedReader + readLine
http://yourIP:7123/fileprocessor/bufferedread/100MB.csv

3 File + built-in tokenizer + 1 thread, csv->java
http://yourIP:7123/fileprocessor/tokenizeparse/100MB.csv

4 File + built-in tokenizer + 4 threads, csv->java
http://yourIP:7123/fileprocessor/tokenizeparsemulti/100MB.csv

5 BufferedReader + readLine + 1 thread, csv->java
http://yourIP:7123/fileprocessor/bufferedreadparsemono/100MB.csv

6 BufferedReader + readLine + 4 threads, csv->java
http://yourIP:7123/fileprocessor/bufferedreadparse/100MB.csv

7 BufferedReader + readLine + 1 thread, csv->java, java->csv
http://yourIP:7123/fileprocessor/bufferedreadparsewritemono/100MB.csv

8 BufferedReader + readLine + 4 threads, csv->java, java->csv
http://yourIP:7123/fileprocessor/bufferedreadparsewritemulti/100MB.csv

9 BufferedReader + readLine + 1 thread, csv->String, String->csv
http://yourIP:7123/fileprocessor/bufferedreadparsewritemonorawtext/100MB.csv

10 BufferedReader + readLine + 4 threads, csv->String, String->csv
http://yourIP:7123/fileprocessor/bufferedreadparsewritemultirawtext/100MB.csv

11 File + built-in tokenizer + 1 thread, xml->dom, dom->xml
http://yourIP:7123/fileprocessor/tokenizexmlmono/100MB.xml

12 File + built-in tokenizer + 4 threads, xml->dom, dom->xml
http://yourIP:7123/fileprocessor/tokenizexmlmulti/100MB.xml
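
To give an idea of what the "BufferedReader + readLine + N threads, csv->java" variants measure, here is a minimal sketch in plain Java. It is not the code behind the REST services above: the class name, the Row type, the comma delimiter and the batch size are all assumptions made for illustration. It reads one line at a time and parses lines on a fixed thread pool, waiting for each batch to finish so that only one batch of rows is held in memory at once.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BufferedReadParseSketch {

    // Hypothetical record type; the real CSV columns are not described in this README.
    static final class Row {
        final String[] fields;
        Row(String[] fields) { this.fields = fields; }
    }

    // Streams the input line by line and parses each line on a pool of `threads` workers.
    // Returns the number of rows parsed.
    static long process(BufferedReader reader, int threads, int batchSize)
            throws IOException, InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long parsed = 0;
        try {
            List<Future<Row>> batch = new ArrayList<>(batchSize);
            String line;
            while ((line = reader.readLine()) != null) {
                final String current = line;
                // Assumed format: plain comma-separated values without quoting.
                Callable<Row> task = () -> new Row(current.split(",", -1));
                batch.add(pool.submit(task));
                if (batch.size() >= batchSize) {
                    parsed += drain(batch);
                }
            }
            parsed += drain(batch); // remaining lines of the last, partial batch
        } finally {
            pool.shutdown();
        }
        return parsed;
    }

    // Waits for every task in the batch, then empties it so the rows can be garbage collected.
    private static long drain(List<Future<Row>> batch)
            throws InterruptedException, ExecutionException {
        long count = 0;
        for (Future<Row> future : batch) {
            future.get(); // a real pipeline would pass the row on to the next step here
            count++;
        }
        batch.clear();
        return count;
    }

    public static void main(String[] args) throws Exception {
        // With a file path argument, stream that file; otherwise use a tiny in-memory sample.
        try (BufferedReader reader = args.length > 0
                ? Files.newBufferedReader(Paths.get(args[0]))
                : new BufferedReader(new StringReader("1,alice\n2,bob\n3,carol\n"))) {
            System.out.println("rows parsed: " + process(reader, 4, 1000));
        }
    }
}
```

Passing 1 instead of 4 as the thread count gives the single-threaded counterpart, which is the comparison the paired mono/multi tests are set up to make.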

4) While the tests are running, you can start JConsole to watch heap usage
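
If attaching JConsole is not convenient, heap usage can also be read from inside the JVM through the standard java.lang.management API. The sketch below is not part of this project; the class and method names are made up for illustration. It prints the heap currently in use, which could be logged before and after a test run.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;
import java.util.Locale;

public class HeapUsageSketch {

    // Returns the number of heap bytes currently in use in this JVM.
    static long usedHeapBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        return heap.getUsed();
    }

    // Formats a byte count as megabytes with one decimal place.
    static String format(long bytes) {
        return String.format(Locale.ROOT, "%.1f MB", bytes / (1024.0 * 1024.0));
    }

    public static void main(String[] args) {
        System.out.println("heap used: " + format(usedHeapBytes()));
    }
}
```

A single reading is only a snapshot and depends on when the garbage collector last ran, so a graph over time such as the one JConsole draws remains the better way to compare the approaches.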

To build this project, use:

mvn install

To run the project, execute the following Maven goal:

mvn camel:run

To deploy the project in OSGi, for example using Apache ServiceMix
or Apache Karaf, run the following command from its shell:

osgi:install -s mvn:com.redhat.pocs/large-file-processor/1.0-SNAPSHOT

To deploy on fabric8:

fabric8:deploy

For more help, see the Apache Camel documentation:

http://camel.apache.org/
