https://github.com/femto-dev/femto

Sequence Indexing and Search
bioinformatics chapel compressed-suffix-array compression fm-index indexing information-retrieval search


== ABOUT ==
FEMTO is an indexing and search system for queries on sequences of bytes.
FEMTO stands for the FM-index for External Memory with Throughput Optimizations.
This tool supports building large indexes in parallel with MPI and then
searching large indexes with a multithreaded server.

== REQUIREMENTS ==
FEMTO requires a 64-bit machine to build and test. 32-bit machines are
supported for search only. FEMTO is known to build with GCC for Linux/x86-64.

To build FEMTO from a release tarball, you will need a C++ compiler,
libssl-dev, and optionally MPI. When building from a source checkout, you
will also need flex, bison, autotools, and libtool. The build is known to
work with GNU bison 2.4.1 and 2.5.
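
As a quick sanity check before building from a source checkout, the sketch
below reports which build tools are missing from PATH. The command names
(g++, autoconf, automake, libtoolize) are assumptions about how the compiler,
autotools, and libtool are packaged on your system; adjust them as needed.
libssl-dev is a library rather than a command, so it is not checked here.

```shell
#!/bin/sh
# Print each named tool that cannot be found on PATH, one per line.
# Returns non-zero if at least one tool is missing.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return $rc
}

# Assumed command names for the source-checkout dependencies listed above.
if missing_tools g++ flex bison autoconf automake libtoolize; then
    echo "all build tools found"
fi
```

Any names printed before the final message are the tools you still need to
install.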

MPI is required for parallel index construction. Note that MPI runs across
machines of different endianness are not supported.

If you'd like to use MPI-parallel index construction, you'll need to
install an MPI implementation built with thread support. We have used
OpenMPI 1.8.8, configured in the following way:
{{{
./configure --prefix=/opt/openmpi1.8.8 --enable-mpirun-prefix-by-default --enable-mpi-thread-multiple --with-threads
make
make install # on all compute nodes
# To make sure mpirun and mpicc are in the path for use with FEMTO
export PATH=$PATH:/opt/openmpi1.8.8/bin
export LD_LIBRARY_PATH=/opt/openmpi1.8.8/lib
}}}

== COMPILATION ==

Make sure you have met the requirements first!

We recommend starting with a FEMTO release tarball from
https://github.com/femto-dev/femto/releases .

If you prefer to use a source checkout, you will need the additional build
dependencies listed under REQUIREMENTS (flex, bison, autotools, and libtool).

If you are starting from a source checkout, for example with
{{{
git clone https://github.com/femto-dev/femto.git
cd femto
}}}
you will also need to generate the configure script:
{{{
sh autogen.sh
}}}

To build FEMTO, issue the following commands:
{{{
./configure
make
}}}

You'll see lots of warnings about things that are declared or defined but not
used; this is normal and not a problem. If you get errors and compilation
fails, you may not have all the required dev libraries installed. For example,
if g++ fails to find -lssl, you need to install libssl-dev.

To run the included unit tests, use
{{{
make check
}}}

== INSTALLATION ==

To install FEMTO in a particular place, be sure to include --prefix in
your configure line, as in
{{{
./configure --prefix ~/femto_install
}}}

As usual,
{{{
make install
}}}
will install the FEMTO tools to the destination specified by ./configure.
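
The COMPILATION and INSTALLATION steps above can be chained in one script.
The following is a minimal sketch, not part of FEMTO: the function names
build_femto and run, and the DRY_RUN variable, are hypothetical helpers
introduced here. It runs autogen.sh only when no configure script exists,
which is the case for a fresh source checkout.

```shell
#!/bin/sh
# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# build_femto SRCDIR PREFIX
# Runs in a subshell so the caller's working directory is unchanged.
build_femto() (
    srcdir=$1
    prefix=$2
    cd "$srcdir" || exit 1
    # A source checkout has no configure script until autogen.sh is run.
    if [ ! -f configure ]; then
        run sh autogen.sh || exit 1
    fi
    run ./configure --prefix "$prefix" || exit 1
    run make || exit 1
    run make check || exit 1
    run make install || exit 1
)

# Example (prints the commands without running them):
#   DRY_RUN=1 build_femto ./femto "$HOME/femto_install"
```

Stopping at the first failing step keeps a failed `make check` from being
followed by an install.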

You can also run the FEMTO tools directly from the build directory, without
installing them.

See src/mod_femto/README for information about installing the FEMTO
Apache module.

== USAGE ==

To build an index, run
{{{
femto/src/dcx_cc/femto_index --tmp /path/to/tmp_dir \
--outfile index.femto \
files_or_directories_to_index
}}}

Then, to query the index, use femto_search.
To count the number of occurrences (quickly!), use:
{{{
femto/src/main_cc/femto_search /path/to/index_dir --count 'pattern'
}}}
To report the documents that matched (time depends on the number of results reported), use:
{{{
femto/src/main_cc/femto_search /path/to/index_dir 'pattern'
}}}
To report the documents and offsets that matched (time depends on the number of results reported), use:
{{{
femto/src/main_cc/femto_search /path/to/index_dir --offsets 'pattern'
}}}
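
The three query forms above differ only in one flag, so they can be wrapped
in a small helper. This is a sketch under stated assumptions: femto_query and
the FEMTO_SEARCH variable are hypothetical names introduced here, and only
the --count and --offsets flags shown above are used. Quoting the pattern
keeps spaces and shell metacharacters in it from being split or expanded.

```shell
#!/bin/sh
# Path to the femto_search binary; override for an installed copy.
FEMTO_SEARCH=${FEMTO_SEARCH:-femto/src/main_cc/femto_search}

# femto_query MODE INDEX PATTERN
#   MODE is one of: count, docs, offsets
femto_query() {
    mode=$1
    index=$2
    pattern=$3
    case $mode in
        count)   "$FEMTO_SEARCH" "$index" --count "$pattern" ;;
        docs)    "$FEMTO_SEARCH" "$index" "$pattern" ;;
        offsets) "$FEMTO_SEARCH" "$index" --offsets "$pattern" ;;
        *)       echo "unknown mode: $mode" >&2; return 2 ;;
    esac
}

# Example:
#   femto_query count /path/to/index_dir 'pattern'
```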

To learn more about what kinds of patterns you can use, see
femto/src/main/QUERY_FORMAT.txt

== THIRD PARTY ACKNOWLEDGEMENTS ==

FEMTO source includes the Google RE2 package, jQuery, jQuery SlickGrid, and jQuery SVG.