https://github.com/vecnet/vecnet-dl

VecNet Digital Library

Host: GitHub
URL: https://github.com/vecnet/vecnet-dl
Owner: vecnet
License: other
Created: 2013-04-16T15:02:36.000Z (about 13 years ago)
Default Branch: master
Last Pushed: 2016-03-14T14:54:53.000Z (over 10 years ago)
Last Synced: 2025-04-02T12:38:58.922Z (about 1 year ago)
Language: Ruby
Homepage:
Size: 14.4 MB
Stars: 2
Watchers: 14
Forks: 2
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE

# Vecnet Metadata Catalog

This application provides the [Vecnet Metadata Catalog](http://dl.vecnet.org).
It handles the curation and indexing of the data generated by the Vecnet [cyberinfrastructure](https://vecnet.org/).

## Dependencies

* [Fedora Commons](http://fedora-commons.org/) 3.6
* Solr 4.3
* Redis (version?)
* PostgreSQL or other SQL database
* nginx
* [chruby](https://github.com/postmodern/chruby) Ruby version manager

The [SETUP](SETUP) file has detailed steps on installing the platform on a bare RHEL machine.

## Deployment

First, your public SSH key must be installed on the server. Ask Don to do this.

To deploy to QA:

    cap qa deploy

To deploy to Production:

    cap production deploy

To deploy from a branch:

    cap deploy -S branch=

To deploy a new nginx config (this also reloads nginx):

    cap vecnet:update_nginx_config

## Other server admin tasks

To rebuild the Fedora object store:

    sudo service tomcat6 stop
    cd /opt/fedora/server/bin
    sudo FEDORA_HOME=/opt/fedora CATALINA_HOME=/usr/share/tomcat6 ./fedora-rebuild.sh
    # choose option 1 to rebuild the resource index
    sudo FEDORA_HOME=/opt/fedora CATALINA_HOME=/usr/share/tomcat6 ./fedora-rebuild.sh
    # choose option 2 to rebuild the SQL database
    sudo service tomcat6 start

To resolrize everything (this will take a LONG time to complete):

    chruby 2.0.0-p353
    RAILS_ENV=qa bundle exec rake solrizer:fedora:solrize_objects

To load and build the MeSH trees, run the following. It takes a while (roughly 0.5 to 1 hour):

    chruby 2.0.0-p353
    RAILS_ENV=qa bundle exec rake vecnet:import:mesh_subjects vecnet:import:eval_mesh_trees

To resolrize with MeSH synonyms (this will take a LONG time to complete):

    chruby 2.0.0-p353
    # Build the synonyms.txt file if needed.
    # You can skip this step if the synonyms did not change.
    RAILS_ENV=qa bundle exec rake vecnet:solrize_synonym:get_synonyms FILE=solr_conf/conf/synonyms.txt
    # Copy this file to the Solr core
    sudo cp solr_conf/conf/synonyms.txt /opt/solr-4.3.0/vecnet/conf/synonyms.txt
    # Copy the schema and solrconfig
    sudo cp solr_conf/conf/schema.xml /opt/solr-4.3.0/vecnet/conf/schema.xml
    sudo cp solr_conf/conf/solrconfig.xml /opt/solr-4.3.0/vecnet/conf/solrconfig.xml
    # Change the owner to tomcat
    sudo chown tomcat:tomcat -R /opt/solr-4.3.0
    # Restart Solr
    sudo service tomcat6 restart
    # Resolrize all objects
    RAILS_ENV=qa bundle exec rake solrizer:fedora:solrize_objects
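
For orientation, Solr's synonym filter reads a plain-text file in which each line lists comma-separated equivalent terms (lines starting with `#` are comments, and a comma inside a term is escaped with a backslash). The sketch below shows how such lines could be produced from a term-to-synonyms mapping. The `mesh_synonyms` hash, its sample entries, and the `synonym_line` helper are illustrative assumptions; the actual `vecnet:solrize_synonym:get_synonyms` task may emit a different layout.

```ruby
# Hypothetical input: a preferred term mapped to its synonyms.
# In the real application these would come from the imported MeSH data.
mesh_synonyms = {
  "Malaria"   => ["Paludism", "Marsh Fever"],
  "Culicidae" => ["Mosquitoes"]
}

# Build one equivalence line for a Solr synonyms file.
# Commas inside a term are backslash-escaped so Solr does not split on them.
def synonym_line(term, synonyms)
  ([term] + synonyms).map { |t| t.gsub(",") { "\\," } }.join(", ")
end

lines = mesh_synonyms.map { |term, syns| synonym_line(term, syns) }
puts lines
```

Each printed line can be appended to a `synonyms.txt` file as-is.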

To ingest citations into QA or Production:

    # Copy the EndNote file to /opt/endnote and make sure everyone can read it
    sudo cp /from/path/to/endnote/file /opt/endnote
    sudo chmod -R 755 /opt/endnote
    # Copy the PDFs to /opt/citation_file/ and make sure everyone can read them
    sudo cp -r /from/path/to/endnote/pdf/* /opt/citation_file/
    sudo chmod -R 755 /opt/citation_file/
    # Run the citation task as the app user
    sudo su app
    cd /home/app/vecnet/current
    chruby 2.0.0-p353
    RAILS_ENV=production bundle exec rake vecnet:import:endnote_conversion ENDNOTE_FILE=/opt/endnote/ ENDNOTE_PDF_PATH=/opt/citation_files:/opt/citation_files/

To initialize a new production environment:

1. Do the system setup as described in the `SETUP` file.
2. Get the Capistrano deploy working to the new site.
3. On the production machine:
    * Set up Ruby: `chruby 2.0.0-p353`
    * Set up MeSH terms: `RAILS_ENV=production bundle exec rake vecnet:import:mesh_subjects vecnet:import:eval_mesh_trees`
    * Migrate user table: See below
    * Resolrize: `RAILS_ENV=production bundle exec rake solrizer:fedora:solrize_objects`
    * Migrate Fedora objects: `RAILS_ENV=production bundle exec rake vecnet:migrate:batch_to_collection`
4. Done!

## NCBI Terminology

Work in progress. After running `rake db:migrate`, the following task downloads the NCBI taxonomy from

    ftp://ftp.ncbi.nih.gov/pub/taxonomy/taxdump.tar.gz

and ingests the terms into the database:

    rake vecnet:import:ncbi_taxonomy

There are about 1,091,096 terms (November 2013).
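
As a rough illustration of what the ingest has to deal with: the `names.dmp` file inside `taxdump.tar.gz` separates fields with a tab, a pipe, and a tab, ends each row with a tab and a pipe, and its columns are tax id, name, unique name, and name class. The sketch below parses one such row. The `NameRecord` struct and `parse_names_line` helper are hypothetical names chosen for this example; how `vecnet:import:ncbi_taxonomy` actually reads the dump is not shown here.

```ruby
# Hypothetical record type for one row of names.dmp.
NameRecord = Struct.new(:tax_id, :name, :unique_name, :name_class)

# Parse one names.dmp row: fields are separated by "\t|\t"
# and the row is terminated by "\t|".
def parse_names_line(line)
  fields = line.chomp.sub(/\t\|\z/, "").split("\t|\t", -1)
  NameRecord.new(Integer(fields[0]), fields[1], fields[2], fields[3])
end

# Sample row written in the names.dmp layout (the unique-name field is empty).
sample = "7165\t|\tAnopheles gambiae\t|\t\t|\tscientific name\t|\n"
rec = parse_names_line(sample)
puts "#{rec.tax_id}: #{rec.name} (#{rec.name_class})"
```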

## Gather repository contents for statistics

    OUTFILE=~/repo-stats-20130916.csv RAILS_ENV=production bundle exec rake vecnet:dump_statistics

## Pubtkt Authentication

The site uses the pubtkt authentication scheme, which uses a signed cookie for every request.
For development, a dummy login to create a pubtkt is provided (class `DevelopmentSessions`).
First, though, a public/private key pair must be generated and installed:

    rake pubtkt:generate_keys
    mv pubtkt.pem config/pubtkt-development.pem
    mv pubtkt-private.pem config/pubtkt-private-development.pem

And that should be enough for development.
There are also utility rake tasks for creating and verifying tickets:

1. To create a ticket on the command line:

       $ P_KEY=pubtkt-private.pem P_UID=dbrower P_VALIDUNTIL=3456789012 P_TOKENS='dl_librarian,dl_write' rake pubtkt:create
       uid=dbrower;validuntil=3456789012;tokens=dl_librarian,dl_write;sig=MCwCFHiaErA+7lHoHxbSUIZaSnmTovIPAhRf4RxtrmArBMD8CBnZaUM/yWI+Cw==

   The `validuntil` value above is a Unix timestamp corresponding to July 16, 2079, so the ticket should not expire while you are using it.
2. To validate tickets from the command line:

       $ P_KEY=pubtkt-private.pem P_TICKET='uid=dbrower;validuntil=3456789012;tokens=dl_librarian,dl_write;sig=MCwCFF1/aaSbtrxN9PLrZE1XvLH5SIWQAhRXN8AHevzPMFbMuIIlOwuCLTZDPw==' rake pubtkt:verify
       Ticket text: uid=dbrower;validuntil=3456789012;tokens=dl_librarian,dl_write
       Ticket sig : MCwCFF1/aaSbtrxN9PLrZE1XvLH5SIWQAhRXN8AHevzPMFbMuIIlOwuCLTZDPw==
       Sig Valid? : true
       Expired? : true
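
To make the ticket layout more concrete, here is a minimal Ruby sketch of the general idea: the text before `;sig=` is signed with the private key, the Base64 signature is appended, and a verifier checks the signature against the public key and compares `validuntil` with the current time. This is not the application's code. The helper names (`split_ticket`, `parse_fields`, `expired?`), the throwaway RSA key, and the SHA-256 digest are all assumptions made for illustration; the keys and digest used by the rake tasks above may differ.

```ruby
require "openssl"

# Split "k=v;k=v;...;sig=..." into the signed text and the Base64 signature.
def split_ticket(ticket)
  text, sep, sig = ticket.rpartition(";sig=")
  raise ArgumentError, "ticket has no sig field" if sep.empty?
  [text, sig]
end

# Turn the signed text into a hash such as {"uid" => "dbrower", ...}.
def parse_fields(text)
  text.split(";").map { |pair| pair.split("=", 2) }.to_h
end

# A ticket is expired once its validuntil timestamp lies in the past.
def expired?(fields, now = Time.now)
  Integer(fields.fetch("validuntil")) < now.to_i
end

# Throwaway key pair for this demonstration only (assumption: RSA + SHA-256).
key = OpenSSL::PKey::RSA.new(2048)
text = "uid=dbrower;validuntil=3456789012;tokens=dl_librarian,dl_write"
sig = [key.sign(OpenSSL::Digest.new("SHA256"), text)].pack("m0") # Base64, no newlines
ticket = "#{text};sig=#{sig}"

# Verification side: only the public half of the key is needed.
signed_text, sig64 = split_ticket(ticket)
fields = parse_fields(signed_text)
valid = key.public_key.verify(OpenSSL::Digest.new("SHA256"),
                              sig64.unpack1("m0"), signed_text)

puts "Ticket text: #{signed_text}"
puts "Sig valid? : #{valid}"
puts "Expired?   : #{expired?(fields)}"
```

For reference, `Time.at(3456789012).utc` is 2079-07-17 03:10:12 UTC, which is the evening of July 16 in US time zones.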
