Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/alejoduarte23/researchaggregator
Fetch academic content from multiple sources (Hindawi, MDPI, SpringerOpen, Arxiv, Elsevier) and generate an HTML report using Jinja2 templates. The script sanitizes query strings, fetches metadata, abstracts, and conclusions.
aiohttp asyncio bs4 pydantic webscrapping
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/alejoduarte23/researchaggregator
- Owner: AlejoDuarte23
- Created: 2024-07-25T15:43:02.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2024-07-29T01:58:58.000Z (5 months ago)
- Last Synced: 2024-07-29T20:07:43.995Z (5 months ago)
- Topics: aiohttp, asyncio, bs4, pydantic, webscrapping
- Language: Python
- Homepage:
- Size: 57.6 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: readme.md
README
# Research Aggregator
A tool for querying and processing research data from sources such as Arxiv, MDPI, SpringerOpen, and Elsevier. The main functionality lives in the `QueryExecutor` class, which fetches and processes the data asynchronously.
## Sources
- Arxiv: Fetch and process functions for Arxiv (the class is spelled `Arvix` in the code).
- Mdpi: Fetch and process functions for MDPI, restricted to a fixed set of journals.
- Springeropen: Fetch and process functions for SpringerOpen.
- Elsevier: Fetch and process functions for Elsevier (requires API access).

Each source is defined with a fetch function and a process function, as follows:
```python
from typing import Callable, Literal, Union
from pydantic import BaseModel

from .springer_open_functions import fetch_springeropen_content, process_data_springeropen
from .mdp_functions import fetch_mdpi_content, process_data_mdpi
from .arvix_functions import fetch_arvix_content, process_data_arvix
from .elsevier_functions_async import fetch_elsevier_content, process_data_elsevier

# Each source pairs a fetch function with a process function.
class Arvix(BaseModel):
    fetch_function: Callable = fetch_arvix_content  # name matches the import above
    process_function: Callable = process_data_arvix

class Mdpi(BaseModel):
    fetch_function: Callable = fetch_mdpi_content
    process_function: Callable = process_data_mdpi
    journal: Literal['buildings', 'sensors', 'acoustics', 'algorithms', 'applmech', 'computation']

class Springeropen(BaseModel):
    fetch_function: Callable = fetch_springeropen_content
    process_function: Callable = process_data_springeropen

class Elsevier(BaseModel):
    fetch_function: Callable = fetch_elsevier_content
    process_function: Callable = process_data_elsevier

ResearchSource = Union[Arvix, Mdpi, Springeropen, Elsevier]
```
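The Mdpi source accepts only a fixed set of journal names. A minimal sketch of checking a name before building a source, using only the standard library; `MdpiJournal` and `is_supported_journal` are hypothetical helpers for illustration and are not part of the package:

```python
from typing import Literal, get_args

# Journal names copied from the Mdpi model above (hypothetical alias).
MdpiJournal = Literal['buildings', 'sensors', 'acoustics', 'algorithms', 'applmech', 'computation']


def is_supported_journal(name: str) -> bool:
    """Return True if `name` is one of the journal names listed in the Literal."""
    return name in get_args(MdpiJournal)
```

A check like this could run on user input before `Mdpi(journal=...)` is constructed, so an unsupported journal is reported early.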
Go to `classes.py` for more details.

## Usage
```python
from research_aggregator import Arvix, Mdpi, Springeropen, Elsevier, QueryExecutor

if __name__ == "__main__":
    query = "Modal Analysis"
    # Sources to search; the same source type can appear more than once
    sources = [
        Mdpi(journal="buildings"),
        Arvix(),
        Mdpi(journal="sensors"),
    ]

    executor = QueryExecutor(query=query, sources=sources)
    result = executor.search()  # Query results
    print(result)
```
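The README does not show how `QueryExecutor` works internally. As a rough sketch of the general pattern (one fetch function and one process function per source, run concurrently), the following uses `asyncio.gather` with stub functions. Every name here (`StubSource`, `fetch_stub`, `process_stub`, `run_source`, `search_all`, `search`) is hypothetical, and the stubs make no network requests:

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable


# Hypothetical stand-ins for a real fetch/process pair; no network access.
async def fetch_stub(query: str) -> str:
    await asyncio.sleep(0)  # yield to the event loop, as a real request would
    return f"raw results for {query}"


def process_stub(raw: str) -> dict:
    return {"summary": raw.upper()}


@dataclass
class StubSource:
    """Illustrative source: one async fetch function, one sync process function."""
    name: str
    fetch_function: Callable[[str], Awaitable[str]] = fetch_stub
    process_function: Callable[[str], dict] = process_stub


async def run_source(source: StubSource, query: str) -> dict:
    # Fetch first, then hand the raw payload to the source's own processor.
    raw = await source.fetch_function(query)
    return {"source": source.name, **source.process_function(raw)}


async def search_all(query: str, sources: list[StubSource]) -> list[dict]:
    # gather() runs the per-source coroutines concurrently and keeps input order.
    return await asyncio.gather(*(run_source(s, query) for s in sources))


def search(query: str, sources: list[StubSource]) -> list[dict]:
    # Synchronous entry point, similar in spirit to executor.search() above.
    return asyncio.run(search_all(query, sources))
```

Calling `search("Modal Analysis", [StubSource("a"), StubSource("b")])` returns one dictionary per source, in the order the sources were passed. The actual package may structure this differently.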