https://github.com/infosys/berguig
Berguig is an application that delivers a personalized feed of news, tweets, YouTube video links, and articles published by Gartner, based on the interests specified by the user. It scores these items with machine learning models to highlight their relevance.
- Host: GitHub
- URL: https://github.com/infosys/berguig
- Owner: Infosys
- License: mit
- Created: 2020-03-26T11:12:52.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2021-01-12T11:14:18.000Z (over 5 years ago)
- Last Synced: 2025-01-18T05:46:36.877Z (over 1 year ago)
- Language: Python
- Size: 86.9 KB
- Stars: 0
- Watchers: 3
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: Readme.md
- License: License.txt
README
Berguig Application-Python
==========================
Berguig is a Python application that delivers a personalized feed of news, tweets, YouTube video links, and Gartner articles based on the companies the user is interested in. The list of companies that interest each user is stored in XML files, which the Berguig program reads.
Example of XML file
-------------------
people.xml
----------
Contains the person's name, mail ID, and the groups of companies or individual companies of interest.
companies.xml
-------------
Information about each company, such as its Twitter handle and YouTube channel, is stored in companies.xml as shown below:
c1
company_name
channel_name
@twitter_account
#hashtag
keyword1
keyword2
c1
company_name
channel_name
@twitter_account
#hashtag
keyword1
keyword2
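As an illustration of how a file like this could be read, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module. The element and attribute names (`companies`, `company`, `id`, `name`, `youtube_channel`, `twitter_handle`, `hashtag`, `keywords`, `keyword`) are hypothetical, since the actual tag names are not shown in this README, and `parse_companies` is not a function from the Berguig code base.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: the real tag names used by companies.xml are not
# shown in this README, so every element name below is an assumption.
SAMPLE_XML = """
<companies>
  <company id="c1">
    <name>company_name</name>
    <youtube_channel>channel_name</youtube_channel>
    <twitter_handle>@twitter_account</twitter_handle>
    <hashtag>#hashtag</hashtag>
    <keywords>
      <keyword>keyword1</keyword>
      <keyword>keyword2</keyword>
    </keywords>
  </company>
</companies>
"""


def parse_companies(xml_text):
    """Return a dict mapping each company id to its profile fields."""
    root = ET.fromstring(xml_text.strip())
    companies = {}
    for node in root.findall("company"):
        companies[node.get("id")] = {
            "name": node.findtext("name", default=""),
            "youtube_channel": node.findtext("youtube_channel", default=""),
            "twitter_handle": node.findtext("twitter_handle", default=""),
            "hashtag": node.findtext("hashtag", default=""),
            # Collect the text of every <keyword> child, in document order.
            "keywords": [k.text for k in node.findall("keywords/keyword")],
        }
    return companies


companies = parse_companies(SAMPLE_XML)
print(companies["c1"]["twitter_handle"])
```

Keying the result by company id would make it straightforward to resolve the ids referenced from people.xml, if that is how the two files are linked.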
keywords_list.xml
-----------------
Contains the list of keywords that the application uses to filter news articles returned by the Google News API.
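The filtering step might look roughly like the sketch below. It assumes that each article is a dictionary with a `title` field and that a case-insensitive substring match is sufficient; the README does not specify either, and the function names and sample data are made up for illustration.

```python
def matches_keywords(title, keywords):
    """Return True if any keyword occurs in the title, ignoring case."""
    lowered = title.lower()
    return any(kw.lower() in lowered for kw in keywords)


def filter_articles(articles, keywords):
    """Keep only the articles whose title mentions at least one keyword."""
    return [a for a in articles if matches_keywords(a["title"], keywords)]


# Made-up sample data, not output from the Google News API.
articles = [
    {"title": "Acme announces new cloud platform", "link": "https://example.com/1"},
    {"title": "Local weather update", "link": "https://example.com/2"},
]
keywords = ["Cloud", "automation"]

relevant = filter_articles(articles, keywords)
print([a["link"] for a in relevant])
```

An empty keyword list filters out every article in this sketch; whether the application behaves the same way is not stated here.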
log.xml
-----------------
Contains the last executed date of the program.
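One plausible use of this date is to skip items published before the previous run. The sketch below assumes a `log` root element with a single `last_executed` child holding an ISO date; both tag names, the date format, and the helper functions are hypothetical, as the README does not show the file's contents.

```python
import xml.etree.ElementTree as ET
from datetime import date


def write_last_run(run_date):
    """Serialize the run date into a small XML document (hypothetical tags)."""
    root = ET.Element("log")
    ET.SubElement(root, "last_executed").text = run_date.isoformat()
    return ET.tostring(root, encoding="unicode")


def read_last_run(xml_text):
    """Parse the XML document and return the stored date."""
    root = ET.fromstring(xml_text)
    return date.fromisoformat(root.findtext("last_executed"))


def is_new(published, last_run):
    """Return True for items published after the previous run."""
    return published > last_run


log_xml = write_last_run(date(2021, 1, 12))
last_run = read_last_run(log_xml)
print(last_run, is_new(date(2021, 1, 13), last_run))
```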
Algo_4
------------------
Algo_4 trains the machine learning model. Its input is the scoring_v3.xlsx file.
The user has to score the articles generated by the application so that the model learns to distinguish relevant from non-relevant news.
Scoring_v3.xlsx
------------------
This xlsx file holds the training data for the machine learning model. The user scores the articles generated by the application and stores them in this file with the following columns:
| Category | Company | Source/User | Keywords | Subjects/Views | Title/Tweet | Date | Link | Useful | Relevance | Relevance_fig |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
The possible values of Useful are:
0,1,2 : Not Useful
3,4,5 : Useful
Relevance values based on the Useful values:
0,1,2 : Irrelevant
3,4,5 : Relevant
Relevance_fig based on the Relevance values:
Relevant : 1
Irrelevant : 0
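The derivation of the two label columns from a Useful score can be sketched as follows. This assumes that Useful scores of 3 to 5 map to Relevant (label 1) and scores of 0 to 2 map to Irrelevant (label 0), which is the reading consistent with the Useful / Not Useful labels; the function names are illustrative and not taken from the Berguig code.

```python
def relevance_from_useful(useful):
    """Map a Useful score (0 to 5) to a Relevance label.

    Assumption: scores of 3 and above count as useful, hence Relevant.
    """
    if not 0 <= useful <= 5:
        raise ValueError("Useful must be between 0 and 5")
    return "Relevant" if useful >= 3 else "Irrelevant"


def relevance_fig(relevance):
    """Map a Relevance label to the numeric target used for training."""
    return {"Relevant": 1, "Irrelevant": 0}[relevance]


for score in range(6):
    label = relevance_from_useful(score)
    print(score, label, relevance_fig(label))
```

Deriving both columns from the Useful score, rather than typing them by hand, keeps the three columns consistent across rows of the spreadsheet.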