Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-document-similarity
A curated list of resources on document similarity measures (papers, tutorials, code, ...)
https://github.com/malteos/awesome-document-similarity
-
Motivation
-
Similarity concepts
- Nguyen, D., Trieschnigg, D., & Theune, M. (2014). Using crowdsourcing to investigate perception of narrative similarity. CIKM 2014 - Proceedings of the 2014 ACM International Conference on Information and Knowledge Management, 321–330.
- Bär, D., Zesch, T., & Gurevych, I. (2011). A reflective view on text similarity. International Conference Recent Advances in Natural Language Processing, RANLP, (September), 515–520.
- Bär, D., Zesch, T., & Gurevych, I. (2015). Composing Measures for Computing Text Similarity. Technical Report TUD-CS-2015-0017, 1–30.
- Tversky, A. (1977). Features of similarity. Psychological Review, 84(4), 327.
- Medin, D. L., Goldstone, R. L., & Gentner, D. (1993). Respects for Similarity. Psychological Review, 100(2), 254–278.
-
-
Document Representations
-
Sentence-level
-
Similarity concepts
-
Word-level
-
Word Context
-
From word to sentence level
-
BERT and other Transformer Language Models
- Li, B., Zhou, H., He, J., Wang, M., Yang, Y., & Li, L. (2020). On the Sentence Embeddings from Pre-trained Language Models. EMNLP 2020.
- SimCSE: Simple Contrastive Learning of Sentence Embeddings
- WhiteningBERT: An Easy Unsupervised Sentence Embedding Approach
- Blogpost
- BERT-AL: BERT for Arbitrarily Long Document Understanding
- Blockwise Self-Attention for Long Document Understanding
- Sentence Transformers
- Easy-to-use interface to fine-tuned BERT models for computing semantic similarity
- Natural Language Recommendations: A novel research paper search engine developed entirely with embedding and transformer models.
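Several entries in this category describe models that turn a sentence into an embedding vector and then compare embeddings to score semantic similarity. As a minimal sketch of that comparison step only, the snippet below computes cosine similarity in plain Python. The vectors `emb_a`, `emb_b`, and `emb_c` are hypothetical toy values, not the output of any model listed here, and returning 0.0 for a zero vector is a convention chosen for this example.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption for this sketch: treat a zero vector as dissimilar to everything.
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy "embeddings"; a real encoder would produce much longer vectors.
emb_a = [1.0, 0.0, 1.0]
emb_b = [1.0, 0.0, 1.0]
emb_c = [0.0, 1.0, 0.0]

if __name__ == "__main__":
    print(cosine_similarity(emb_a, emb_b))
    print(cosine_similarity(emb_a, emb_c))
```

In practice the vectors would come from one of the encoders listed above, and a library routine would usually replace the hand-written function.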
-
Document-level
-
Citations
- Martyn, J (1964)
- Small, Henry (1973)
- Gipp, Bela; Beel, Joeran (2006)
- Evaluating the CC-IDF citation-weighting scheme: How effectively can ‘Inverse Document Frequency’ (IDF) be applied to references?
- Nickel and Kiela (2017)
- GraphVite - graph embedding at high speed and large scale
- Karate Club is an unsupervised machine learning extension library for NetworkX.
-
Hybrid
-
-
Surveys
-
Text matching
-
-
See also
-
Text matching
- Michael J. Pazzani, Daniel Billsus. Content-Based Recommendation Systems
- Charu C. Aggarwal. Content-Based Recommender Systems
- Text Similarities: Estimate the degree of similarity between two texts
- Sentence Similarity Calculator (ELMo, BERT and Universal Sentence Encoder, and different similarity measures)
- Awesome Sentence Embeddings
- Awesome Neural Models for Semantic Match
- Awesome Network Embedding
-
-
Similarity / Distance Measures
-
Hybrid
-
Siamese Networks
- Bromley, Jane, et al. "Signature verification using a siamese time delay neural network". Advances in Neural Information Processing Systems. 1994.
- Jiang, J. et al. 2019. Semantic Text Matching for Long-Form Documents. The World Wide Web Conference on - WWW ’19 (New York, New York, USA, 2019), 795–806.
- Liu, B. et al. 2018. Matching Article Pairs with Graphical Decomposition and Convolutions. (Feb. 2018).
- Siamese Neural Networks built upon multihead attention mechanism for text semantic similarity task
-
Text matching
- Simple and Effective Text Matching with Richer Alignment Features
- FiLM
- Feature-wise transformations (Distill)
- Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks
-
-
Benchmarks & Datasets
-
Text matching
- CSFCube -- A Test Collection of Computer Science Research Articles for Faceted Query by Example
- STSbenchmark
- A Benchmark Corpus for the Detection of Automatically Generated Text in Academic Publications
- SciDocs - The Dataset Evaluation Suite for SPECTER (for classification, citation prediction, user activity, recommendation)
- A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
-
-
Performance measures
-
Tutorials