awesome-document-similarity
A curated list of resources on document similarity measures (papers, tutorials, code, ...)
https://github.com/malteos/awesome-document-similarity
Motivation
Similarity concepts
- Nguyen, D., Trieschnigg, D., & Theune, M. (2014). Using crowdsourcing to investigate perception of narrative similarity. CIKM 2014 - Proceedings of the 2014 ACM International Conference on Information and Knowledge Management, 321–330.
- Bär, D., Zesch, T., & Gurevych, I. (2011). A reflective view on text similarity. International Conference Recent Advances in Natural Language Processing, RANLP, (September), 515–520.
- Bär, D., Zesch, T., & Gurevych, I. (2015). Composing Measures for Computing Text Similarity. Technical Report TUD-CS-2015-0017, 1–30.
- Tversky, A. (1977). Features of similarity. Psychological Review, 84(4), 327.
- Medin, D. L., Goldstone, R. L., & Gentner, D. (1993). Respects for Similarity. Psychological Review, 100(2), 254–278.
Document Representations
Sentence-level
Similarity concepts
Word-level
Word Context
From word to sentence level
BERT and other Transformer Language Models
- Li, B., Zhou, H., He, J., Wang, M., Yang, Y., & Li, L. (2020). On the Sentence Embeddings from Pre-trained Language Models. EMNLP 2020.
- SimCSE: Simple Contrastive Learning of Sentence Embeddings
- WhiteningBERT: An Easy Unsupervised Sentence Embedding Approach
- BERT-AL: BERT for Arbitrarily Long Document Understanding
- Blockwise Self-Attention for Long Document Understanding
- Sentence Transformers
- Easy-to-use interface to fine-tuned BERT models for computing semantic similarity
- Natural Language Recommendations: A novel research paper search engine developed entirely with embedding and transformer models.
Document-level
Citations
- Martyn, J (1964)
- Small, Henry (1973)
- Gipp, Bela; Beel, Joeran (2006)
- Evaluating the CC-IDF citation-weighting scheme: How effectively can ‘Inverse Document Frequency’ (IDF) be applied to references?
- Nickel and Kiela (2017)
- GraphVite - graph embedding at high speed and large scale
- Karate Club is an unsupervised machine learning extension library for NetworkX.
Hybrid
Surveys
Text matching
See also
Text matching
- Michael J. Pazzani, Daniel Billsus. Content-Based Recommendation Systems
- Charu C. Aggarwal. Content-Based Recommender Systems
- Text Similarities: Estimate the degree of similarity between two texts
- Sentence Similarity Calculator (ELMo, BERT and Universal Sentence Encoder, and different similarity measures)
- Awesome Sentence Embeddings
- Awesome Neural Models for Semantic Match
- Awesome Network Embedding
Similarity / Distance Measures
Hybrid
Siamese Networks
- Bromley, Jane, et al. "Signature verification using a siamese time delay neural network". Advances in Neural Information Processing Systems. 1994.
- Jiang, J. et al. 2019. Semantic Text Matching for Long-Form Documents. The World Wide Web Conference (WWW ’19), New York, New York, USA, 795–806.
- Liu, B. et al. 2018. Matching Article Pairs with Graphical Decomposition and Convolutions. (Feb. 2018).
- Siamese Neural Networks built upon multihead attention mechanism for text semantic similarity task
Text matching
- Simple and Effective Text Matching with Richer Alignment Features
- FiLM
- Feature-wise transformations (Distill)
Benchmarks & Datasets
Text matching
- CSFCube -- A Test Collection of Computer Science Research Articles for Faceted Query by Example
- STSbenchmark
- A Benchmark Corpus for the Detection of Automatically Generated Text in Academic Publications
- SciDocs - The Dataset Evaluation Suite for SPECTER (for classification, citation prediction, user activity, recommendation)
Performance measures
Tutorials