Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/vsm/vsm-dictionary-cacher
VSM-dictionary wrapper that manages a cache for various get-functions
- Host: GitHub
- URL: https://github.com/vsm/vsm-dictionary-cacher
- Owner: vsm
- License: agpl-3.0
- Created: 2018-04-09T19:13:47.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2023-01-03T16:30:50.000Z (about 2 years ago)
- Last Synced: 2024-05-27T20:41:03.394Z (8 months ago)
- Topics: cache, dictionary, ontology-search, vsm
- Language: JavaScript
- Size: 1.09 MB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 15
Metadata Files:
- Readme: README.md
- License: LICENSE.md
README
# vsm-dictionary-cacher
## Summary
`vsm-dictionary-cacher` augments a given VSM-dictionary with a layer of
caching functionality. This speeds up requests for string-matches, refTerms,
and dictInfos, as well as fixedTerms-preloading.
## Use in Node.js
Install it (after also installing a `vsm-dictionary-...` of choice) with:

```
npm install vsm-dictionary-cacher
```

Then use it like:
```js
const Dictionary = require('vsm-dictionary-local');  // ...or any other VsmDictionary implementation.
const cacher = require('vsm-dictionary-cacher');

const CachedDictionary = cacher(Dictionary);  // This makes a cache-enabled subclass.
var dict = new CachedDictionary();            // This makes an instance.

// This will query the Dictionary as normal (a cache miss), and store the result.
dict.getMatchesForString('abc', {filter: {dictID: ['Foo']}}, (err, res) => {
  console.dir(res);

  // This will get the result from the cache, instead of re-running the query.
  dict.getMatchesForString('abc', {filter: {dictID: ['Foo']}}, (err, res) => {
    console.dir(res);
  });

  // These will *not* get their result from the cache.
  dict.getMatchesForString('abc', {filter: {dictID: ['BAR']}}, (err, res) => {});
  dict.getMatchesForString('QQQ', {filter: {dictID: ['Foo']}}, (err, res) => {});

  // And there is similar behavior for the other three cached functions:
  // - dict.getRefTerms({}, cb)
  // - dict.getDictInfos({}, cb)
  // - dict.loadFixedTerms([], {}, cb)
});
```

Specify options like:
```js
const CachedDictionary = cacher(Dictionary, { maxItems: 1000 });
```
## Use in the browser
Load the browser build via a `<script>` tag:

```
<script src="..."></script>
```

after which it is accessible as the global variable `VsmDictionaryCacher`.
Then it can be wrapped around a VsmDictionary, e.g. a `VsmDictionaryLocal`, like:

```
...
var dict = new (VsmDictionaryCacher(VsmDictionaryLocal)) (options);
dict.getMatchesForString(...);
```
## Details
This package provides a factory function that accepts any VsmDictionary
(sub)class, and returns a (further) subclass of it, which inserts
cache-handling code into several of its functions, as detailed below.
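First, a minimal sketch of this wrapping pattern. This is illustrative only,
not the package's actual code, and the internal field names (`_cache`,
`_queue`, etc.) are made up here; the later sketches in this section reuse
them:

```js
// Sketch of the subclass factory: it extends whatever VsmDictionary
// (sub)class it is given, and overrides the functions that should
// consult a cache before querying the underlying datastore.
function cacher (VsmDictionary, options = {}) {
  return class CachedDictionary extends VsmDictionary {
    constructor (...args) {
      super(...args);
      this.maxItems = options.maxItems || 0;  // 0 = unlimited (see Options).
      this._cache = new Map();  // String-match results, keyed on string+options.
      this._queue = new Map();  // Callbacks parked behind an ongoing, identical query.
    }
    // Overridden here (sketched per function further down): getMatchesForString(),
    // getEntryMatchesForString(), getRefTerms(), getDictInfos(), loadFixedTerms();
    // plus one new function: clearCache().
  };
}
```

In more detail: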
+ It speeds up requests for string-matches, to `getMatchesForString()`, in
  three ways (see the sketch after this item):
  - It stores results from requests to this function in a cache.
    These results are returned for subsequent requests that use the same
    search-string & options, instead of re-running the query.
    > This helps e.g. `vsm-autocomplete` avoid making duplicate requests to an
    > online dictionary server.
    > It creates a more responsive autocomplete when the user
    > types, and then backspaces.
  - It also prevents sending a second (or further) query to the underlying
    datastore, if that query has the same search-string & options as an
    ongoing query, whose results haven't arrived yet.
    Instead, it puts such identical queries in a queue, and when the first
    one's result comes in, it shares that error+result with the queued ones,
    almost immediately (i.e. on individual, next event-loops).
    > This helps e.g. `vsm-autocomplete` avoid making duplicate requests
    > when a user types and backspaces quickly, before any results could
    > come in.

    Note: when a query on the underlying storage fails, then no item is added
    to the cache, and the queued ones make no attempt to re-query.
    This also means that on error, the same error is returned by all queued
    requests.
  - It can also remember for which strings there were no 'normal entry'-type
    matches.
    Then for subsequently queried strings that start with such a 'no matches'
    string, it can immediately return an empty list for the 'entry-matches'
    too (but it still checks for refTerm/number/etc.-type matches).
    > This helps e.g. `vsm-autocomplete` avoid making unnecessary requests for
    > search-strings for which a substring already returned no entry-matches.
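The first two behaviors could look like this sketch of the overridden method
(hedged pseudocode, not the package's actual code; the real implementation
also applies `maxItems` eviction, described under Options):

```js
// Sketch of cache + queue: serve repeats from the cache, park duplicates
// of an in-flight query, and fan the first result out to all of them.
getMatchesForString (str, options, cb) {
  const key = str + '|' + JSON.stringify(options);  // Cache key: string + options.

  if (this._cache.has(key)) {                       // 1. Cache hit:
    return setTimeout(() => cb(null, this._cache.get(key)), 0);  // answer directly.
  }
  if (this._queue.has(key)) {                       // 2. Identical query ongoing:
    return this._queue.get(key).push(cb);           //    just park this callback.
  }

  this._queue.set(key, []);                         // 3. Cache miss: mark as ongoing.
  super.getMatchesForString(str, options, (err, res) => {
    if (!err)  this._cache.set(key, res);           // Cache successful results only.
    const queued = this._queue.get(key);
    this._queue.delete(key);
    cb(err, res);                                   // Answer the original caller, then
    queued.forEach(f =>                             // share err+result with each parked
      setTimeout(() => f(err, res), 0));            // one, on its own event-loop tick.
  });
}
```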
+ It maintains a cache for `getRefTerms()` (see the sketch after this item).
  - There are usually only a small number of refTerms, and they are just
    Strings. Therefore, at the first call for anything, it queries all of
    them, and puts them into a cache that is used for all further lookups.
    > This makes e.g. `vsm-autocomplete` not launch two queries per
    > search-string (i.e. one for entries, and one for refTerms).
    > And for cacheEmpty-hits, it will even serve all data from either cache
    > or computation.
  - It also catches and prevents any concurrent calls, until this cache data
    is received.
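A sketch of this load-all-once approach; `_refTermsCache`, `_refTermsQueue`
(an array, initialized in the constructor) and the `_filterRefTerms()`
helper, which would apply `options.filter` and paging locally, are all
hypothetical names:

```js
// Sketch: the first getRefTerms() call fetches *all* refTerms in one
// unpaginated query; concurrent calls wait in a queue, and all later
// calls are answered from the in-memory list.
getRefTerms (options, cb) {
  if (this._refTermsCache) {                     // Cache filled: serve locally.
    return setTimeout(() => cb(null, this._filterRefTerms(options)), 0);
  }
  this._refTermsQueue.push([options, cb]);       // Park this call.
  if (this._refTermsQueue.length > 1)  return;   // A load is already underway.

  super.getRefTerms({ perPage: Number.MAX_VALUE }, (err, res) => {
    if (!err)  this._refTermsCache = res.items;  // One query got all refTerms.
    const queued = this._refTermsQueue;
    this._refTermsQueue = [];
    queued.forEach(([opt, f]) => setTimeout(     // Answer every parked call.
      () => f(err, err ? null : this._filterRefTerms(opt)), 0));
  });
}
```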
+ It partially maintains a cache for `getDictInfos()` (see the sketch after
  this item).
  - It caches all dictInfo-objects that are returned from any queries to the
    underlying datastore, no matter what `options` were used.
  - Then it may use this cache:
    - It uses it _only_ for requests that filter for a list of dictIDs
      (i.e. that have an `options.filter.id`),
      _or_ for requests that ask for all dictInfos (i.e. that have no
      `options.filter`).
    - Then:
      • it collects all dictInfos with a cache-hit in the dictInfos-cache,
      • it sends a query _only_ for the dictInfos that had a cache-miss,
        _and_ that are not marked as being-queried by another concurrent
        `getDictInfos()` call,
      • after marking these as being-queried now too.
      + Note: queries to the underlying datastore are made explicitly
        unpaginated (i.e. with `options = { perPage: Number.MAX_VALUE, ... }`).
    - When all dictInfos that it depends on have come in (possibly as partial
      results from several other concurrent calls), then it finally returns
      its own, complete, assembled result.
    - The implementation of this is not trivial. But it ensures that for a
      large number of concurrent requests (which may be launched when an app
      starts up), no dictID is queried twice.
      And it works even if `clearCache()` (see below) is called during this
      process.
      > A `vsm-autocomplete` needs a corresponding dictInfo for each of its
      > string-matches.
      > So if its string-matches already came from cache-hits, then the above
      > makes all its other data also come only from cache.
      > If a `vsm-box` with a template, or several `vsm-box`es loaded on the
      > same page, would launch multiple concurrent requests for dictInfos,
      > then this caching may result in far fewer queries to the underlying
      > datastore.
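A simplified sketch of the dictID-filtered case. The names `_dictInfoCache`
(a Map) and `_pendingDictIDs` (a Set) are illustrative, and unlike the real
code, this version does not wait for IDs that other concurrent calls are
still fetching:

```js
// Sketch: answer a dictID-filtered getDictInfos() from the cache where
// possible, and query the underlying datastore only for the missing IDs.
getDictInfos (options, cb) {
  const ids    = options.filter.id;
  const misses = ids.filter(id => !this._dictInfoCache.has(id) &&
                                  !this._pendingDictIDs.has(id));
  misses.forEach(id => this._pendingDictIDs.add(id));  // Mark as being-queried.

  const finish = () => {        // Assemble the complete result from the cache.
    const items = ids.map(id => this._dictInfoCache.get(id)).filter(Boolean);
    cb(null, { items });
  };

  if (!misses.length)  return setTimeout(finish, 0);   // All were cache-hits.

  super.getDictInfos(           // Explicitly unpaginated, and only for misses.
    { filter: { id: misses }, perPage: Number.MAX_VALUE },
    (err, res) => {
      misses.forEach(id => this._pendingDictIDs.delete(id));
      if (err)  return cb(err);
      res.items.forEach(di => this._dictInfoCache.set(di.id, di));  // Cache each.
      finish();
    }
  );
}
```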
+ It enhances the cache management of `loadFixedTerms()` (see the sketch
  after this item).
  - Note: this function (of the VsmDictionary parent class) queries
    a VsmDictionary subclass's `getEntries()`, and puts the processed
    results in its own simple cache (in the VsmDictionary parent class).
  - But if a web page contains multiple `vsm-box`es based on a template,
    then each of them may call the (shared) VsmDictionary's `loadFixedTerms()`
    and launch a query. This would query and add results to the
    VsmDictionary's `fixedTermsCache`, no matter whether these results were
    in there already.
  - Therefore:
    • it maintains a list of fixedTerms (`idts`) that have ever been queried,
    • it removes, from a request's `idts`-argument, any that were queried
      before,
    • and then launches the query only for the remaining `idts`,
    • after marking them as 'pending/having-been-queried-now',
    • so that concurrent calls can be prevented from requesting anything
      twice.

    Note: the `options.z`, for z-object-pruning, is not taken into account
    here, because VsmDictionary's fixedTerms-cache does not take it into
    account either.
    > This prevents `loadFixedTerms()` from being called multiple times for
    > the same data, when loading multiple `vsm-box`es with the same template.
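A sketch of that filtering step. A fixedTerm identifier `idt` is an object
(an `id`, optionally with a `str`), so a serialized form is used as set key
here; `_queriedIdts` (a Set) is an illustrative name:

```js
// Sketch: drop fixedTerms that were queried before (or are pending), and
// forward only the remainder; the parent class's loadFixedTerms() then
// fills its own fixedTermsCache with the results.
loadFixedTerms (idts, options, cb) {
  const newIdts = idts.filter(
    idt => !this._queriedIdts.has(JSON.stringify(idt)));
  newIdts.forEach(                      // Mark them *before* querying, so that
    idt => this._queriedIdts.add(JSON.stringify(idt)));  // concurrent calls skip them.

  if (!newIdts.length)  return setTimeout(() => cb(null), 0);  // Nothing new to load.
  super.loadFixedTerms(newIdts, options, cb);
}
```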
## Options
An options object can be given as the second argument to the factory function
(see the example above), with these optional properties:
- `maxItems`: {Number}:
  This limits the number of items kept in the string-match cache (only).
  One item equals the result of one `getMatchesForString()` query (which is
  often a list of match-objects).
  When adding a new item to a full cache, the least recently added or
  accessed item gets removed first (see the sketch after this item).
  The default is 0, which means unlimited storage.
  + Note: this pertains to _string-match-objects_ only;
    so not to refTerms (they are small and few), not to dictInfos (same
    reason, and because the result of one query is spread out over multiple
    cache-items, which is hard to manage), and not to fixedTerms (same).
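Since a JavaScript `Map` iterates its keys in insertion order, this 'least
recently added or accessed' policy can be sketched with two hypothetical
helpers around the `_cache` Map:

```js
// Sketch: re-inserting a key on every access keeps the Map ordered from
// least to most recently added-or-accessed, so its first key is the one
// to evict when the cache is full (maxItems = 0 means unlimited).
_cacheSet (key, value) {
  if (this.maxItems > 0 && !this._cache.has(key) &&
      this._cache.size >= this.maxItems) {
    this._cache.delete(this._cache.keys().next().value);  // Evict the oldest.
  }
  this._cache.set(key, value);
}

_cacheGet (key) {
  if (!this._cache.has(key))  return undefined;
  const value = this._cache.get(key);
  this._cache.delete(key);        // Delete + re-insert moves this key to the
  this._cache.set(key, value);    // 'most recently used' end of the Map.
  return value;
}
```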
- `predictEmpties`: {Boolean}:
  If `true`, then it keeps a list of strings (per options-object) for which
  `getEntryMatchesForString()` returned no results (i.e. `{ items: [] }`).
  Then for any subsequent query (with the same options) for a string that
  starts with any such empty-returning string, it can assume that no results
  will be returned either.
  E.g. if a search for 'ab' returned no matching entries, then neither will
  'abc'. So it can avoid running that query, and immediately return the empty
  `{ items: [] }` for 'abc' (see the sketch after this list).
  The default is `true`.
  - Note: `maxItems` does not apply to this collection of strings either.
    But the collection gets cleared, like everything else, by a call
    to `clearCache()` (see below).
  - Note: this is handled in `getEntryMatchesForString()`, not in
    `getMatchesForString()`, because the latter may still add 'extra' matches
    (refTerm/number/fixedTerm), other than the 'entry'-type matches.
    - Example 1: after a call for "i" gives no entry-matches (and "i" ends up
      in the 'cacheEmpties'), a subsequent call for "it" should still return
      "it" as a refTerm-match.
      (Note that a refTerm only matches for a full, not partial, string
      match.)
    - Example 2: after a call for "1e" gave no results, a subsequent call
      for the valid number-string "1e5" should still return it as a result.
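A sketch of this prefix check; `_cacheEmpties` (a plain object that maps a
serialized options-object to an array of empty-returning strings) is an
illustrative name:

```js
// Sketch: remember strings that returned zero entry-matches, and answer any
// later string that extends one of them with an empty result right away.
getEntryMatchesForString (str, options, cb) {
  const optKey  = JSON.stringify(options);        // One list per options-object.
  const empties = this._cacheEmpties[optKey] || [];

  if (empties.some(s => str.startsWith(s))) {     // E.g. 'ab' empty => so is 'abc'.
    return setTimeout(() => cb(null, { items: [] }), 0);
  }
  super.getEntryMatchesForString(str, options, (err, res) => {
    if (!err && !res.items.length) {              // Remember this empty result.
      (this._cacheEmpties[optKey] = empties).push(str);
    }
    cb(err, res);
  });
}
```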
## Functions
An extra function is added to the VsmDictionary subclass:
- `clearCache()`:
This removes all data from the cache layer, including e.g. the list used by
`predictEmpties`.
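For example, an application might call it when the underlying datastore's
content may have changed:

```js
dict.clearCache();  // Removes all data from the cache layer, so subsequent
                    // calls query the underlying datastore afresh.
```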
## License
This project is licensed under the AGPL license - see [LICENSE.md](LICENSE.md).