https://github.com/multivacplatform/multivac-pubmed
Update PubMed articles daily on HDFS using a Spark cluster
- Host: GitHub
- URL: https://github.com/multivacplatform/multivac-pubmed
- Owner: multivacplatform
- License: apache-2.0
- Created: 2020-04-08T15:34:02.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2022-11-18T09:00:02.000Z (almost 3 years ago)
- Last Synced: 2025-01-12T16:11:30.805Z (10 months ago)
- Topics: apache-spark, dataframe, hadoop, hdfs, pubmed, pubmed-parser, spark-sql, yarn
- Language: Scala
- Size: 30.3 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# multivac-pubmed [License](https://github.com/multivacplatform/multivac-pubmed/blob/master/LICENSE) [Build Status](https://travis-ci.org/multivacplatform/multivac-pubmed) [Discourse](https://discourse.iscpif.fr/c/multivac) [Chat](https://chat.iscpif.fr/channel/multivac)
## Testing Environment
* Spark 2.4.4 Local / IntelliJ
* Spark 2.4 / Cloudera CDH 6.3 / YARN (cluster and client deploy modes)
## Code of Conduct
This project, like all github.com/multivacplatform projects, is covered by the [Multivac Platform Open Source Code of Conduct](https://github.com/multivacplatform/code-of-conduct/blob/master/code-of-conduct.md). Additionally, see the [Typelevel Code of Conduct](http://typelevel.org/conduct) for specific examples of harassing behavior that is not tolerated.
## Copyright and License
Code and documentation copyright (c) 2020 [ISCPIF - CNRS](http://iscpif.fr). Code released under the [Apache 2.0 license](https://github.com/multivacplatform/multivac-pubmed/blob/master/LICENSE).