https://github.com/connectedsystems/wosis
Wosis - a python package developed to support bibliometric analysis of Web of Science data from querying to visualization.
- Host: GitHub
- URL: https://github.com/connectedsystems/wosis
- Owner: ConnectedSystems
- Created: 2018-11-01T03:40:22.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2019-09-13T05:04:56.000Z (about 5 years ago)
- Last Synced: 2024-10-12T10:30:18.976Z (26 days ago)
- Topics: bibliometric-analysis, python, science, science-data, visualization, web-of-science
- Language: HTML
- Homepage:
- Size: 1.35 MB
- Stars: 9
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.md
# Wosis
A Python package developed to support analysis of Web of Science data from querying to visualization.
[![DOI](https://zenodo.org/badge/155658135.svg)](https://zenodo.org/badge/latestdoi/155658135)
This package is under development and is currently intended for limited use.
Currently it simplifies the process of:
* Getting publication data from the Web of Science collection
* Creating plots that indicate publication trends
* Identifying topics of interest

See the [included tutorial](https://github.com/ConnectedSystems/wosis/tree/master/tutorial) for a more complete introductory guide.

Examples of its use can be found by looking at the notebooks within the
[sd-prac-bibanalysis](https://github.com/ConnectedSystems/sd-prac-bibanalysis) repository.

Key dependencies include:
* WOS Client, a SOAP-based client for Web of Science, developed by E. Bacis [@enricobacis](https://github.com/enricobacis)
* wos_parser, a parser for Web of Science XML data, developed by T. Achakulvisut [@titipata](https://github.com/titipata)
* Metaknowledge, a Python library for bibliometric research, developed at [Networks Lab](https://github.com/networks-lab/metaknowledge)
* Pandas, the de facto standard library for data analysis in Python.

For the moment it is probably best to install by:
```bash
$ pip install git+https://github.com/titipata/wos_parser.git@master
$ git clone https://github.com/ConnectedSystems/wosis.git
$ cd wosis
$ pip install -e .
```

Alternatively, via `pip`:
```bash
$ pip install git+https://github.com/titipata/wos_parser.git@master
$ pip install git+https://github.com/ConnectedSystems/wosis.git@master
```

# Getting Started
You will need access to the Premium API for Clarivate's Web of Science. This is given as a username and password.
It is advised that this information be placed in a `.yml` file in the following format:
```yaml
wos:
  user: username
  password: password
```

This is to keep your secret information out of the code. Remember not to share this file with others.
The configuration file can then be loaded like so:
```python
import wosis

path_to_your_config_file = "config.yml"
wos_config = wosis.load_config(path_to_your_config_file)
```

This just returns a dictionary of the username and password.
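It can help to check the loaded dictionary before sending any queries. The following is a minimal sketch and not part of Wosis: `check_wos_credentials` is a hypothetical helper, and it assumes the dictionary uses the `user` and `password` keys from the YAML file above, either at the top level or still nested under `wos`.

```python
def check_wos_credentials(config):
    """Return (user, password) from a loaded config, or raise ValueError.

    Hypothetical helper for illustration; not part of the Wosis API.
    """
    # Assumption: credentials sit at the top level or under a "wos" key.
    creds = config.get("wos", config)
    if not isinstance(creds, dict):
        raise ValueError("The 'wos' entry must be a mapping of user and password")

    # Collect any key that is absent or empty.
    missing = [key for key in ("user", "password") if not creds.get(key)]
    if missing:
        raise ValueError("Missing Web of Science credentials: " + ", ".join(missing))

    return creds["user"], creds["password"]
```

Calling `check_wos_credentials(wos_config)` straight after loading would raise a `ValueError` naming the absent entries if the file is incomplete.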
Then build a query using a list of desired and undesired terms and the subject areas to search. The format follows the standard given by Web of Science, as seen [here](http://ipscience-help.thomsonreuters.com/wosWebServicesLite/WebServiceOperationsGroup/WebServiceOperations/g2/user_query.html).
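For orientation, a Web of Science advanced-search string combines field tags such as `TS=` (topic), `SO=` (publication name) and `WC=` (Web of Science category) with the boolean operators `AND`, `OR` and `NOT`. The sketch below assembles such a string by hand. It is an illustration only and is not the implementation of `wosis.build_query`; the function names, the example terms, and the mapping of subject areas to `WC=` are assumptions.

```python
def or_keywords(terms):
    """Join keywords with OR, quoting multi-word phrases."""
    return " OR ".join(f'"{t}"' if " " in t else t for t in terms)


def sketch_query(inclusive_kw, exclusive_kw=(), exclusive_jo=(), subject_area=()):
    """Build an illustrative advanced-search string (hypothetical helper)."""
    query = f"TS=({or_keywords(inclusive_kw)})"
    if exclusive_kw:
        # Exclude records whose topic matches any unwanted keyword.
        query += f" NOT TS=({or_keywords(exclusive_kw)})"
    if exclusive_jo:
        # Exclude journals by publication name; wildcards are passed through as-is.
        query += f" NOT SO=({' OR '.join(exclusive_jo)})"
    if subject_area:
        # Assumption: subject areas correspond to the WC field tag.
        query += f" AND WC=({' OR '.join(subject_area)})"
    return query


# Hypothetical example terms
print(sketch_query(
    ("water", "climate change"),
    exclusive_kw=("mining",),
    subject_area=("ENVIRONMENTAL SCIENCES",),
))
# prints: TS=(water OR "climate change") NOT TS=(mining) AND WC=(ENVIRONMENTAL SCIENCES)
```

With Wosis, the search is instead described as a dictionary and passed to `wosis.build_query`: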
```python
search_terms = {
    "inclusive_kw": ("some", "keywords", "of", "interest"),
    "exclusive_kw": ("I", "do not", "want", "these", "keywords"),
    "exclusive_jo": ('PSYCHOL*', ),  # journals to exclude
    "subject_area": ("ENVIRONMENTAL SCIENCES", ),  # Note the trailing comma for single-item tuples!
}

# Build a list of queries to send
topics = [wosis.build_query(search_terms), ]
```

The queries can then be sent to the Web of Science servers. The results will be dumped to a text file labelled with a `query_id` inside a temporary directory (`tmp`). Be warned that managing this temporary data store is up to you. Please take care to remove the data once your analysis is complete.
```python
overwrite = False  # do not overwrite existing data store if it exists
id_to_query = wosis.query(topics, overwrite, wos_config)
```

`id_to_query` will be a Python dictionary which maps a generated query id to the query that was sent. Because we sent a single query in this example, we are interested in the first `query_id`.
```python
import pandas as pd
import metaknowledge as mk

query_id = list(id_to_query.keys())[0]  # Get the first query_id
RC = mk.RecordCollection(f"tmp/{query_id}.txt")  # Load the results
```

Wosis provides convenient plotting methods.
```python
import wosis.analysis.plotting as wos_plot

wos_plot.plot_kw_trend(RC, title='Plot of the number of keywords over time', save_plot_fn='figs/num_kw_per_pub.png')
```

Specific analysis can be accomplished by using Metaknowledge and Pandas.
See the [included tutorial](https://github.com/ConnectedSystems/wosis/tree/master/tutorial) for a more complete introductory guide.
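As noted earlier, the temporary data store is left for you to manage. The following is a minimal sketch of removing it once the analysis is complete, using only the standard library. `remove_query_cache` is a hypothetical helper, not part of Wosis, and it assumes the results live in a `tmp` directory relative to the working directory.

```python
import shutil
from pathlib import Path


def remove_query_cache(tmp_dir="tmp"):
    """Delete the directory of cached query results.

    Returns True if a directory was removed, False if none existed.
    Hypothetical helper for illustration; not part of the Wosis API.
    """
    path = Path(tmp_dir)
    if not path.is_dir():
        return False
    # Removes the directory and every result file inside it.
    shutil.rmtree(path)
    return True
```

Because this deletes every stored result, any later run would need to query the Web of Science servers again.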
# Related Works
* [revtools](http://revtools.net/), an R package for exploratory analysis of bibliographic data developed by M. Westgate (https://doi.org/10.1101/262881)
* [Science Concierge](https://github.com/titipata/science_concierge), a Python package for content-based recommendation by T. Achakulvisut et al. (http://journals.plos.org/plosone/article?id=10.1371%2Fjournal.pone.0158423)