https://github.com/boppreh/bayesian
Utility for Bayesian reasoning
- Host: GitHub
- URL: https://github.com/boppreh/bayesian
- Owner: boppreh
- License: other
- Created: 2013-05-03T00:35:07.000Z (about 12 years ago)
- Default Branch: master
- Last Pushed: 2014-11-05T18:46:36.000Z (over 10 years ago)
- Last Synced: 2025-03-18T10:51:37.690Z (3 months ago)
- Language: Python
- Size: 301 KB
- Stars: 36
- Watchers: 2
- Forks: 10
- Open Issues: 0
Metadata Files:
- Readme: README.rst
- Changelog: CHANGES.txt
- License: LICENSE.txt
README
Bayesian
========

**bayesian** is a small Python utility to reason about probabilities.
It uses a Bayesian system to extract features, crunch belief updates and
spew likelihoods back. You can use either the high-level functions to
classify instances with supervised learning, or update beliefs manually
with the ``Bayes`` class.

If you want to simply classify and move files into the most fitting folder, run
this program from the command line, passing the root folder path as a parameter.
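
To make the belief update mentioned above concrete, here is a minimal sketch in plain Python that does not use this library. The helper name ``bayes_update`` is hypothetical and is not part of the ``bayesian`` API. The numbers are the textbook cancer-test figures (1% prior, 80% true positives, 9.6% false positives). The sketch assumes Bayes' rule in its simplest form: multiply each prior by the likelihood of the evidence under that hypothesis, then normalize so the results sum to 1.

```python
def bayes_update(priors, likelihoods):
    """Return posteriors proportional to prior * likelihood, normalized to sum to 1.

    Hypothetical helper for illustration only; not part of the bayesian package.
    """
    unnormalized = {label: priors[label] * likelihoods[label] for label in priors}
    total = sum(unnormalized.values())
    return {label: value / total for label, value in unnormalized.items()}


# Prior belief: 1% of patients have cancer.
priors = {'not cancer': 0.99, 'cancer': 0.01}
# Chance of a positive test under each hypothesis. Only the ratio matters,
# so percentages work as well as fractions.
likelihoods = {'not cancer': 9.6, 'cancer': 80}

posteriors = bayes_update(priors, likelihoods)
print(posteriors)
print('Most likely:', max(posteriors, key=posteriors.get))
```

With these inputs the posterior for ``'cancer'`` comes out to roughly 7.8%, so ``'not cancer'`` remains the most likely label despite the positive test.
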
High Level
----------

::

    from bayesian import classify, classify_file, classify_folder, classify_normal

    spams = ["buy viagra", "dear recipient", "meet sexy singles"] # etc
    genuines = ["let's meet tomorrow", "remember to buy milk"]
    message = "remember the meeting tomorrow"
    # Classified as "genuine" because of the words "remember" and "tomorrow".
    print(classify(message, {'spam': spams, 'genuine': genuines}))

    # Decides if the person with those measures is male or female.
    print(classify_normal({'height': 6, 'weight': 130, 'foot size': 8},
                          {'male': [{'height': 6, 'weight': 180, 'foot size': 12},
                                    {'height': 5.92, 'weight': 190, 'foot size': 11},
                                    {'height': 5.58, 'weight': 170, 'foot size': 12},
                                    {'height': 5.92, 'weight': 165, 'foot size': 10}],
                           'female': [{'height': 5, 'weight': 100, 'foot size': 6},
                                      {'height': 5.5, 'weight': 150, 'foot size': 8},
                                      {'height': 5.42, 'weight': 130, 'foot size': 7},
                                      {'height': 5.75, 'weight': 150, 'foot size': 9}]}))

    # Classifies "unknown_file" as either a Python or Java file, considering
    # you have directories with examples of each language.
    print(classify_file("unknown_file", ["java_files", "python_files"]))

    # Classifies every file under "folder" as either a Python or Java file,
    # considering you have subdirectories with examples of each language.
    print(classify_folder("folder"))

Low Level
---------

::

    from bayesian import Bayes

    print(' -- Spam Filter --')
    # Database with the number of sightings of each word in (genuine, spam)
    # emails.
    words_odds = {'buy': (5, 100), 'viagra': (1, 1000), 'meeting': (15, 2)}
    # Emails to be analyzed.
    emails = [
        "let's schedule a meeting for tomorrow",  # 100% genuine (meeting)
        "buy some viagra",  # 100% spam (buy, viagra)
        "buy coffee for the meeting",  # buy x meeting, should be genuine
    ]

    for email in emails:
        # Start with priors of 90% chance being genuine, 10% spam.
        # Probabilities are normalized automatically.
        b = Bayes([('genuine', 90), ('spam', 10)])
        # Update probabilities, using the words in the emails as events and the
        # database of chances to figure out the change.
        b.update_from_events(email.split(), words_odds)
        # Print the email and whether it's likely spam or not.
        print(email[:15] + '...', b.most_likely())

    print('')

    print(' -- Spam Filter With Email Corpus -- ')
    # Email corpus. Spam emails try to sell products, with the word "meeting"
    # thrown in once. Genuine emails are about meetings and buying milk.
    instances = {'spam': ["buy viagra", "buy cialis"] * 100 + ["meeting love"],
                 'genuine': ["meeting tomorrow", "buy milk"] * 100}

    # Use str.split to extract features/events/words from the corpus and build
    # the model.
    model = Bayes.extract_events_odds(instances, str.split)
    # Create a new Bayes instance with priors of 90% spam, 10% genuine.
    b = Bayes({'spam': .9, 'genuine': .1})
    # Update beliefs with features/events/words from an email.
    b.update_from_events("buy coffee for meeting".split(), model)
    # Print the email and whether it's likely spam or not.
    print("'buy coffee for meeting'", ':', b)

    print('')

    print(' -- Classic Cancer Test Problem --')
    # 1% chance of having cancer.
    b = Bayes([('not cancer', 0.99), ('cancer', 0.01)])
    # Test positive, 9.6% false positives and 80% true positives
    b.update((9.6, 80))
    print(b)
    print('Most likely:', b.most_likely())

    print('')

    print(' -- Are You Cheating? -- ')
    results = ['heads', 'heads', 'tails', 'heads', 'heads']
    events_odds = {'heads': {'honest': .5, 'cheating': .9},
                   'tails': {'honest': .5, 'cheating': .1}}
    b = Bayes({'cheating': .5, 'honest': .5})
    b.update_from_events(results, events_odds)
    print(b)

    def b():
        return Bayes((0.99, 0.01), labels=['not cancer', 'cancer'])

    # Random equivalent examples, all achieve the same result.
    b() * (9.6, 80)
    (b() * (9.6, 80)).opposite().opposite()
    b().update({'not cancer': 9.6, 'cancer': 80})
    b().update((9.6, 80))
    b().update_from_events(['pos'], {'pos': (9.6, 80)})
    b().update_from_tests([True], [(9.6, 80)])
    Bayes([('not cancer', 0.99), ('cancer', 0.01)]) * (9.6, 80)
    Bayes({'not cancer': 0.99, 'cancer': 0.01}) * {'not cancer': 9.6,
                                                   'cancer': 80}

Project details
---------------

:License: MIT
:Code: https://github.com/boppreh/bayesian/
:PyPI: https://pypi.python.org/pypi/Bayesian
:Issue tracker: https://github.com/boppreh/bayesian/issues