https://github.com/thammegowda/572-hw2
Home Work 2 Indexing + NER
- Host: GitHub
- URL: https://github.com/thammegowda/572-hw2
- Owner: thammegowda
- Created: 2015-10-21T01:59:07.000Z (about 9 years ago)
- Default Branch: master
- Last Pushed: 2015-12-11T18:47:45.000Z (almost 9 years ago)
- Last Synced: 2024-04-18T02:58:22.774Z (7 months ago)
- Language: Java
- Size: 0 Bytes
- Stars: 3
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.txt
README
Search Engine Assignment of USC CSCI 572
=========================================

Project Structure
-----------------
+ dump-poster         : This directory contains a Python program to post documents
                        to Solr Cell
+ nutch-tika-solr     : This is the main project, which includes the Nutch content reader,
                        Tika parser, Solr indexer, graph generator, PageRank computer,
                        and Solr document updater
+ query-runner        : This directory contains a Python script to execute the challenge questions
+ d3                  : This directory contains a Python script for converting the output of
                        the PageRank computer to a JSON file, and the D3 visualization files
+ weapons-ner-dataset : This directory contains the dataset we used to train the NER model
                        for weapons, and also to build the regex
+ lda                 : This directory contains files related to the LDA task

Tutorial/usage instruction files
--------------------------------
nutch-tika-solr/README.md        : this file explains how to set up the environment
                                   and build the package
nutch-tika-solr/step-by-step.txt : this file explains how to run the code
query-runner/README.md           : this file explains how to run the challenge queries
                                   of the assignment
d3/README.md                     : this file explains how to run the D3 visualization

Solr config files:
------------------
nutch-tika-solr/conf/solrconfig.xml : the Solr config file
nutch-tika-solr/conf/schema.xml     : the schema file
nutch-tika-solr/conf/stopwords.txt  : the stopwords
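
Example: posting a document to Solr Cell
----------------------------------------
The dump-poster directory is described as a Python program that posts documents to Solr Cell. The following is only a minimal sketch of what such a request might look like, not the code in that directory. It assumes Solr's standard `/update/extract` handler; the function name `build_extract_request`, the core URL, and the document id are hypothetical placeholders.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_extract_request(solr_core_url, doc_id, payload,
                          content_type="application/octet-stream"):
    """Build (but do not send) a POST request for Solr Cell.

    solr_core_url and doc_id are illustrative; adjust them to your setup.
    """
    # literal.id sets the document's id field; commit=true makes the
    # document searchable as soon as the request completes.
    params = urlencode({"literal.id": doc_id, "commit": "true"})
    url = f"{solr_core_url.rstrip('/')}/update/extract?{params}"
    return Request(url, data=payload,
                   headers={"Content-Type": content_type}, method="POST")


if __name__ == "__main__":
    # Hypothetical local core; nothing is sent over the network here.
    req = build_extract_request(
        "http://localhost:8983/solr/collection1", "doc-1", b"raw file bytes")
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it to a running Solr instance. Building the request separately from sending it keeps the URL construction easy to inspect.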