https://github.com/zkry/p-search
Seach engine for Emacs
https://github.com/zkry/p-search
emacs search-engine
Last synced: 4 months ago
JSON representation
Seach engine for Emacs
- Host: GitHub
- URL: https://github.com/zkry/p-search
- Owner: zkry
- License: gpl-3.0
- Created: 2024-02-05T05:34:41.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2025-08-02T05:38:46.000Z (10 months ago)
- Last Synced: 2026-01-30T19:49:44.519Z (4 months ago)
- Topics: emacs, search-engine
- Language: Emacs Lisp
- Homepage:
- Size: 11.1 MB
- Stars: 117
- Watchers: 3
- Forks: 6
- Open Issues: 31
-
Metadata Files:
- Readme: README.org
- License: LICENSE
Awesome Lists containing this project
README
* p-search
p-search is an Emacs tool to find things. It combines concepts from
information retrievial and Bayesian search theory to assist a user
in finding documents.
Documentation: https://p-search.org/documentation/
[[./documents/screenshot.png]]
* Important Changes
- *2025-05-31*: Package names beginning with `psx-` have been renamed to `p-search-x-`
* Installation
https://p-search.org/documentation/Installation.html
p-search is installable via MELPA under the name p-search. You can
also install it as follows:
Using Quelpa:
#+begin_src lisp
(quelpa '(p-search :repo "zkry/p-search" :fetcher github))
#+end_src
Using Straight:
#+begin_src lisp
(use-package p-search :straight (:host github :repo "zkry/p-search"))
#+end_src
Using Elpaca:
#+begin_src lisp
(use-package p-search :elpaca (:host github :repo "https://github.com/zkry/p-search.git"))
#+end_src
* Usage
https://p-search.org/documentation/Getting-Started.html
A search session can be initiated with the =p-search= command. The
command will set up the session to search for files either in the
projects directory (see [[https://www.gnu.org/software/emacs/manual/html_node/emacs/Projects.html][project.el]]) if a project exists or the
current directory. Execute =p-search= with the prefix ~C-u~
to instantiate an empty session.
** The p-search buffer
The p-search session is composed of three main sections: Candidate
Generators, Priors, and Search Results.
[[./documents/p-search-demo-1.png]]
*** Candidate Generators
Candidate generators are the parts of the search session that
enumerate all possible search candidates. A search candidate is
an entity with a set of key/value properties, ='content= and ='name=
being mandetory. Other properties may exist which will allow you
to use additional prior functions. In the p-search session run
=p-search-add-candidate-generator= (~C~)to add a new candidate generator.
You can delete a prior with the command =p-search-kill-entry-at-point=.
*** Priors
The Priors section is the part where you add search criteria to
your session. Run =p-search-add-prior= (~P~) to add a prior function.
First you must select the type of prior you want to add. Then you
will have to configure the prior. It will first prompt you for
any fields that are mandetory.
After that, a new transient menu will appear, allowing you
configure the prior. Each prior function will have its own set of
inputs and options, but each one will let you set its *importance*
and whether the *complement* should be taken.
You can delete a prior with the command =p-search-kill-entry-at-point= (~k~).
Running =p-search-explain-dwm= (~x~) with the point on a prior
will display an explanation of the prior, showing a list of the
results it has generated.
*** Calculation
Each candidate document is given a score from each prior function
depending on how well the prior function matches.
So for example, suppose you have a text query search. The query
will rank each document on a scale from 0 to 1. This score is
then modified by the importance. If you assign a high importance,
then the probabilities will be pushed to the extremes. A low
importance pushes the probabilities to 0.5, thus lowering its impact.
So for example, if a text search query marked a document as highly
relevant, 0.7, but was given a low importance, its probability may
be modified to 0.55, thus lowering its impact. On the other hand,
if a text query matches poorly giving a score of 0.3 but its
importance is low, then its probability will be raised to perhaps
0.45.
#+begin_src
[CANDIDATE GENERATOR]
|
| [PRIOR_X] [PRIOR_Y]
|
|\-- DOC_A -> importance_X(Score_X_A) ✖ importance_Y(Score_Y_A)
|
|\-- DOC_B -> importance_X(Score_X_B) ✖ importance_Y(Score_Y_B) ...
|
\--- DOC_C -> importance_X(Score_X_C) ✖ importance_Y(Score_Y_C)
#+end_src
*** Text Search
Text search is a prominent component in p-search. While text
search functions the same way as other prior functions (resulting
in a score of 0 to 1), the mecahnisms behind it are more complex.
You can create a text query by selecting "text query" in the
transient menu when running =p-search-add-prior=.
You will then be prompted for your query. Depending on the query
you write, one or more processes will be created to perform the search.
As mentioned earlier, each search candidate document has a
property ='content=. The text search is performed on this field.
As you can probably immagine, having to search each document on a
single Emacs Lisp thread is slow, so each candidate generator
function can have a quicker method to perform the search. This is
why you see the search tool like =:grep= or =:rg= on the FILESYSTEM
candidate generator. When performing a text query on documents
coming from this, it will rely on this tool to perform the search.
For the text query, each search term is space separated. So if
you type =teacher student school= it will perform three separate
searches for the three terms. Each term will generate its own
score for each document and they will then be combined to form a
final score. You can use quotes to group words to search
something as a whole, thus ="teacher student school"= will perform
one search with the words in a sequence.
Unquoted terms will be processed into multiple variants and
searched in parallel. So for example =teacherStudentSchool= will
search both "teacherstudentschool" (case insensitive), but also
"teacher_student_school", "teacher-student-school" (with a lower
score), and the sepearate terms "teacher", "student", and "school"
(given even a lower score).
You can boost a term with =^= so that =teacher student^ school= will
give a boost to student. You can also specify a numeric boost, as
in =teacher student^2 school^3=.
You can search for terms that occur near to one another with the
=(term1 term2 ...)~= syntax. Depending on the value of
=p-search-default-near-line-length=, the items will be required to
be within a certain number of lines from one another.
** Observation
:PROPERTIES:
:ID: 360EC6A5-F76A-45E9-9797-F2992CE64FEC
:END:
p-search will only show you the first =p-search-top-n= values of
the search results. If you are not seeing relevant results you may
want to consider adding search criteria. You can also run the
command =p-search-observe= to lower the probability of a particular
result. Doing so will lower the probability of the item by
multiplying it by 0.3. With prefix =C-u p-search-observe=, you can
specify the probability. After you perform the observation the
probabilities will be recalculated and the results will update.
Running =p-search-explain-dwm= (~x~) with the point on a result
will display an explanation of the result, showing why it was
given the score it got.
** Saving Sessions
p-search contains a number of mechanims to speed up your searching
process. On the one hand, you can programatically create a command
and call various p-search functions to instantiate a session to
your liking. On the other, simply
bookmarking the session using the command =bookmark-set= (usually
bound =C-x r m=) will let you save the session, candidate
generators and priors, to quickly access in the future.
Another way to configure the behavior of p-search is by setting the
variable =p-search-default-command-behavior=. By setitng it's
value globally you can configure how the command =p-search=
behaves. You can also set the variable via a ".dir-locals.el"
file, like as follows, to have directory-local settings:
#+begin_src lisp
((p-search-mode . ((p-search-default-command-behavior . (:candidate-generator p-search-candidate-generator-filesystem :args ((base-directory . "~/dev/go/delve/cmd")))))))
#+end_src
You can run the command =p-search-show-session-preset= to see the
current session represented as a Lisp object. By passing this data
structure to the function =p-search-setup-buffer=, you can
programatically create the p-search session that you want.
** Extensions
p-search was designed to be extensible, both in what you can search
on and how the search is performed. Add =(require 'p-search-x-info)= to
load a p-search extension whilch lets you search on info files.
[[./documents/psx-info-demo.gif]]
This package adds a new candidate generator for info files. The
above example shows a search with two different info files.
p-search is meant to be more like a search-engine creator, rather
than a search-engine for for a specifc use case. Suppose you found
yourself searching the Emacs documentation often and you wanted to
create a search command for this. Doing so with p-search is easy.
#+begin_src lisp
(defun my/search-emacs (search-query)
(interactive "sSearchs Term: ")
(p-search-setup-buffer
`(:group ((:prior-template p-search-prior-query
:args ((query-string . ,search-query) (importance . medium)))
(:candidate-generator p-search-x-info-candidate-generator :args ((info-node . emacs)))
(:candidate-generator p-search-x-info-candidate-generator :args ((info-node . elisp)))))))
#+end_src
The above command will search both the emacs and elisp info
manuals. If you're trying to create a search command yourself and
are not sure what you should pass into the =p-search-setup-buffer=,
you can run the command =p-search-show-session-preset= on a
p-search buffer with your desired setup to see the data
representation of the search.