https://github.com/thinkwee/subjectivebiasabs
code for the arxiv paper "Subjective Bias in Abstractive Summarization"
- Host: GitHub
- URL: https://github.com/thinkwee/subjectivebiasabs
- Owner: thinkwee
- Created: 2021-06-18T13:45:02.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2024-10-07T04:31:44.000Z (8 months ago)
- Last Synced: 2024-12-04T03:51:34.582Z (6 months ago)
- Language: Python
- Size: 36.1 KB
- Stars: 3
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Subjective Bias in Abstractive Summarization
- code for the paper [*Subjective Bias in Abstractive Summarization*](https://arxiv.org/pdf/2106.10084.pdf)
- We examined the influence of subjective style bias in large-scale abstractive summarization datasets and introduced a Graph Convolutional Network method to capture and embed writing styles. Results demonstrate that style-clustered datasets enhance model convergence, abstraction, and generalization.

# introduction
- params.py: hyperparameters
- get_datasets.py: select the top-k Oracle sentences in each article, then parse them
- process_dataset.py: convert the parsed file into the DGL graph triplet format
- model.py: the self-supervised GCN model for extracting subjective style embeddings
- train.py: training
- infer.py: run inference over the whole training set to get subjective style embeddings for clustering

# detail
- negative samples for the Oracle sentences are sampled uniformly based on Jaccard similarity
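
To illustrate the negative sampling note above, here is a minimal sketch of one possible reading, not the repository's actual code. It assumes whitespace tokenization, and it assumes that negatives are drawn uniformly from the candidate sentences whose Jaccard similarity to the Oracle sentence falls at or below a cutoff. The names `jaccard`, `sample_negatives`, and the `max_sim` threshold are hypothetical and do not come from the repo.

```python
import random


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the lowercase token sets of two sentences."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 0.0  # two empty sentences: define similarity as 0
    return len(set_a & set_b) / len(set_a | set_b)


def sample_negatives(oracle, candidates, n, max_sim=0.5, seed=None):
    """Uniformly sample up to n negatives for one Oracle sentence.

    Assumption: a candidate qualifies as a negative when its Jaccard
    similarity to the Oracle sentence is at most max_sim (hypothetical cutoff).
    """
    rng = random.Random(seed)
    pool = [
        c for c in candidates
        if c != oracle and jaccard(oracle, c) <= max_sim
    ]
    # random.Random.sample draws without replacement, uniformly over the pool
    return rng.sample(pool, min(n, len(pool)))


if __name__ == "__main__":
    sentences = [
        "the cat sat",
        "the cat sat down",
        "dogs bark loudly",
        "rain fell all night",
    ]
    print(sample_negatives("the cat sat", sentences, n=2, seed=0))
```

The cutoff keeps near-duplicates of the Oracle sentence out of the negative pool. A different scheme, such as sampling uniformly across similarity buckets, would fit the same description equally well.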