{"id":31816698,"url":"https://github.com/splunk/splunk-shuttl","last_synced_at":"2025-10-11T09:57:48.501Z","repository":{"id":3322145,"uuid":"4365265","full_name":"splunk/splunk-shuttl","owner":"splunk","description":"Splunk app for archive management, including HDFS support.","archived":false,"fork":false,"pushed_at":"2014-09-03T00:33:27.000Z","size":52870,"stargazers_count":35,"open_issues_count":40,"forks_count":19,"subscribers_count":17,"default_branch":"master","last_synced_at":"2024-04-15T02:58:43.047Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/splunk.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2012-05-18T02:54:15.000Z","updated_at":"2023-08-15T10:16:59.000Z","dependencies_parsed_at":"2022-08-31T16:30:35.134Z","dependency_job_id":null,"html_url":"https://github.com/splunk/splunk-shuttl","commit_stats":null,"previous_names":[],"tags_count":13,"template":false,"template_full_name":null,"purl":"pkg:github/splunk/splunk-shuttl","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/splunk%2Fsplunk-shuttl","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/splunk%2Fsplunk-shuttl/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/splunk%2Fsplunk-shuttl/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/splunk%2Fsplunk-shuttl/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/splunk","download_url":"https://codeload.github.com/splunk/splunk-shuttl/tar.gz/refs/heads/master","sbom_url":"https:
//repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/splunk%2Fsplunk-shuttl/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279006749,"owners_count":26084185,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-11T02:00:06.511Z","response_time":55,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-10-11T09:57:44.043Z","updated_at":"2025-10-11T09:57:48.495Z","avatar_url":"https://github.com/splunk.png","language":"Java","readme":"# Project status:\nShuttl development has ***stalled***. There's no known developer working on this project.\n\nShuttl seems to work for Splunk 6.x when it's built off the develop branch, but it's experimental.\n\nShuttl - Archiving for Splunk\n=======================================\n\nSplunk is the premier technology for gaining Operational Intelligence on Machine Data. Since it\ncan handle large volumes of data at a fast rate, users often want to analyze only\nrecent data, while data beyond a certain age is archived.\n\nSplunk provides hooks that allow the administrator to designate archiving policies and\nactions. 
However, the actions are entirely implemented by the administrator of the system.\n\nShuttl provides a full-lifecycle solution for data in Splunk.\n\nIt can:\n* Manage the transfer of data from Splunk to an archive system\n* Enable an administrator to inventory/search the archive\n* Allow an administrator to selectively restore archived data into \"thawed\"\n* Remove archived data from thawed\n\nShuttl supports the following back-end storage systems:\n* Attached storage\n* HDFS\n* S3 and S3n\n* Amazon Glacier\n\nLicense\n---------\n\nShuttl is licensed under the Apache License 2.0. Details can be found in the LICENSE file.\n\nShuttl is an unsupported community open source project and therefore may be incomplete and contain bugs.\n\nThe Apache License applies only to Shuttl; no other Splunk software is implied.\n\nSplunk, in using the Apache License, does not provide any warranties or indemnification, and does not accept any liabilities arising from the use of Shuttl.\n\nWe are now accepting contributions from individuals and companies to our Splunk open source projects.\n\n\nPrerequisites\n-------------\n\n### Splunk\n\nCurrently the Splunk version used is 5.0.1.\nShuttl has support for Splunk Clustering.\n\nYou can download [Splunk][splunk-download], and see the [Splunk documentation][] for installation instructions and more.\n\n[Splunk documentation]:http://docs.splunk.com/Documentation/Splunk/latest/User\n[splunk-download]:http://www.splunk.com/download\n\n### Java\n\n* Java JDK 6\n\n### Hadoop (optional)\n\nThis is needed only if you are using HDFS. 
Currently the Hadoop version used is 1.1.1.\n\nYou can download it from one of the [mirror sites][hadoop-download].\nSee the [Hadoop documentation][] for installation instructions and more.\n\n[hadoop-download]:http://www.apache.org/dyn/closer.cgi?path=hadoop/core/hadoop-1.1.1\n[Hadoop documentation]:http://hadoop.apache.org/common/docs/r1.1.1\n\n\nDevelopment\n--------------\n\n### Eclipse Users\n\nYou'll need to build once before you can use Eclipse.\nThe .eclipse.templates directory contains templates for generating Eclipse files to configure\nEclipse for Shuttl development.\n\n\n### Coding Conventions\n\nThe Shuttl code base follows the standard Java conventions, except that braces are omitted from for-loops, if-statements, etc. We avoid braces to reduce indentation, and rely on tests to catch mistakes made by forgetting to add braces when more than one line follows an if-statement or for-loop.\n\nThe standard Java conventions can be found here:\nhttp://java.sun.com/docs/codeconv/html/CodeConvTOC.doc.html\n\nGetting Started\n---------------\n\nEnsure that:\n* the `JAVA_HOME` environment variable is defined correctly\n* you can run `ssh localhost` without having to enter a password\n* you have a tgz package of Splunk in the put-splunk-tgz-here directory (not needed if you are using your own Splunk instance, see below)\n\nBuild Shuttl:\n\n\t$ ./buildit.sh\n\nRun the tests:\n\n\t$ ./testit.sh\n\n### How to Setup Passphraseless SSH\n\nHere's how to set up passphraseless SSH: http://hadoop.apache.org/common/docs/current/single_node_setup.html#Setup+passphraseless\n\n### Test configuration\n\nCreate a file called `build.properties`.\n\nCopy the contents of `default.properties` into `build.properties` and edit the values you want to change.\n\nInstalling the app\n------------------\n\nHere's how to install the Shuttl app in your Splunk instance. 
Shuttl comes with some pre-configured values that you might need to modify.\n\n### Install\n1. Build the app by running `ant dist`\n2. Extract build/shuttl.tgz into your $SPLUNK_HOME/etc/apps/ directory\n3. While Splunk is not running, configure Shuttl and Splunk as mentioned below\n4. Start Splunk, and enable the Shuttl App via the Manager\n5. If the index is receiving data and calling the archiver, you should see the data in HDFS\n\n### Shuttl Configuration (new)\nThere are three configuration files that you might care about: one for archiving, one for Splunk, and one for the Shuttl server. They all live in the shuttl/conf directory. All the values are populated with defaults to serve as an example.\n\nIn addition to these configuration files, there are property files for the backends. These live in the shuttl/conf/backend directory. They need to be configured as well, depending on the backendName you choose.\n\n#### archiver.xml:\n- localArchiverDir: A local path (or a URI with the file:/ scheme) where the Shuttl archiver's temporary transfer data, locks, metadata, etc. are stored.\n- backendName: The name of the backend you want to use. Currently supported: local, hdfs, s3, s3n and glacier.\n- archivePath: The absolute path in the archive where your files will be stored. Required for all backends.\n- clusterName: Unique name for your Splunk cluster. Use the default if you don't care to name your cluster for each Shuttl installation. Note, this is only a Shuttl concept for a group of Splunk indexers that should be treated as a cluster. Splunk does not have this notion.\n- serverName: This is the Splunk Server Name. Check Splunk Manager for that server to populate this value. Must be unique per Shuttl installation.\n- archiveFormats: The formats to archive the data as. The currently available formats are SPLUNK_BUCKET, CSV and SPLUNK_BUCKET_TGZ. 
You can configure Shuttl to archive your data in all formats at the same time, to serve different use cases.\n* Warning: The old archiverRootURI is deprecated. It still works for now, but we recommend using the new configuration with property files instead.\n\n#### server.xml:\n- httpHost: The host name of the machine. (usually localhost)\n- httpPort: The port for the Shuttl server. (usually 9090)\n\n#### splunk.xml:\n- host: The host name of the Splunk instance where Shuttl is installed. Should be localhost.\n- port: The management port of the Splunk server. (Splunk defaults to 8089)\n- username: Splunk username\n- password: Splunk password\n\n#### backend/hdfs.properties (required for hdfs):\n- hadoop.host: The host name of the HDFS name node.\n- hadoop.port: The port of the HDFS name node.\n\n#### backend/amazon.properties (required for s3, s3n or glacier):\n- aws.id: Your Amazon Web Services ID\n- aws.secret: Your Amazon Web Services secret\n- s3.bucket: Bucket name for storage in S3\n- glacier.vault: The vault name for storage in Glacier.\n- glacier.endpoint: The server endpoint where the data will be stored. (e.g. https://glacier.us-east-1.amazonaws.com/)\n* Note: The glacier backend currently uses both Glacier and S3, so s3.bucket is still required when using glacier. This is also why archivePath is always required.\n\nNote that the directory the data will be archived to is\n\t[archivePath]/archive_data/[clusterName]/[serverName]/[indexName]\n\n### Splunk Index Configuration\n\nIn addition, you need to configure Splunk to call the archiver script (setting the coldToFrozenScript and/or warmToColdScript) for each index that is being archived. You can do this by creating an indexes.conf file in $SPLUNK_HOME/etc/apps/shuttl/local with the appropriate config stanzas. 
An example is as follows:\n\n\t[mytest]\n\thomePath = $SPLUNK_DB/mytest/db\n\tcoldPath = $SPLUNK_DB/mytest/colddb\n\tthawedPath = $SPLUNK_DB/mytest/thaweddb\n\trotatePeriodInSecs = 10\n\tfrozenTimePeriodInSecs = 120\n\tmaxWarmDBCount = 1\n\twarmToColdScript = $SPLUNK_HOME/etc/apps/shuttl/bin/warmToColdScript.sh\n\tcoldToFrozenScript = $SPLUNK_HOME/etc/apps/shuttl/bin/coldToFrozenScript.sh\n\nWARNING: the settings rotatePeriodInSecs, frozenTimePeriodInSecs and maxWarmDBCount are there only for testing, to verify that data can be successfully transferred by inducing rapid bucket rolling. Don't use them in production. See the [Set a retirement and archiving policy](http://docs.splunk.com/Documentation/Splunk/latest/admin/Setaretirementandarchivingpolicy) and [Indexes.conf](http://docs.splunk.com/Documentation/Splunk/4.3.3/admin/Indexesconf) documentation to suit your test and deployment needs. Expected usage in production is that maxDataSize corresponds to an HDFS block or larger (the Splunk default is 750MB), and maxHotIdleSecs is set to 86400 so that buckets hold approximately 24 hours' worth of data.\n\nOther developer notes\n---------------------\n\n### Specifying which Hadoop version to run tests with\n\nIn your `build.properties`, set the property `hadoop.version` to the version you want to run the tests with.\n\nNow run:\n\n\t$ ant clean-all\n\t$ ant test-all\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsplunk%2Fsplunk-shuttl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsplunk%2Fsplunk-shuttl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsplunk%2Fsplunk-shuttl/lists"}