https://github.com/boppreh/bayesian

Utility for Bayesian reasoning

Last synced: 3 months ago

Host: GitHub
URL: https://github.com/boppreh/bayesian
Owner: boppreh
License: other
Created: 2013-05-03T00:35:07.000Z (about 12 years ago)
Default Branch: master
Last Pushed: 2014-11-05T18:46:36.000Z (over 10 years ago)
Last Synced: 2025-03-18T10:51:37.690Z (3 months ago)
Language: Python
Size: 301 KB
Stars: 36
Watchers: 2
Forks: 10
Open Issues: 0
Metadata Files:
- Readme: README.rst
- Changelog: CHANGES.txt
- License: LICENSE.txt

README

Bayesian
========

**bayesian** is a small Python utility to reason about probabilities.
It uses a Bayesian system to extract features, crunch belief updates and
spew likelihoods back. You can either use the high-level functions to
classify instances with supervised learning, or update beliefs manually
with the ``Bayes`` class.

If you simply want to classify files and move them into the best-fitting
folder, run this program from the command line, passing the root folder path
as a parameter.
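
As a rough illustration of that pipeline, the standalone sketch below splits
text into words, counts them per class, and multiplies a uniform prior by
smoothed word frequencies. It does not use or reproduce the library's code:
the names ``word_counts`` and ``posterior`` and the add-one smoothing are
assumptions made only for this example.

```python
from collections import Counter


def word_counts(examples):
    """Feature extraction: count how often each word appears in a class."""
    counts = Counter()
    for text in examples:
        counts.update(text.split())
    return counts


def posterior(text, classes):
    """Return normalized beliefs for each class given the words in text."""
    counts = {label: word_counts(examples)
              for label, examples in classes.items()}
    vocabulary = set()
    for class_counts in counts.values():
        vocabulary.update(class_counts)

    # Start from a uniform prior over the classes (an assumption).
    beliefs = {label: 1.0 for label in classes}
    for word in text.split():
        for label in beliefs:
            total = sum(counts[label].values())
            # Add-one smoothing so an unseen word never zeroes a belief.
            beliefs[label] *= (counts[label][word] + 1.0) / (total + len(vocabulary))

    # Normalize so the beliefs sum to 1.
    norm = sum(beliefs.values())
    return {label: value / norm for label, value in beliefs.items()}


classes = {'spam': ["buy viagra", "dear recipient", "meet sexy singles"],
           'genuine': ["let's meet tomorrow", "remember to buy milk"]}
result = posterior("remember the meeting tomorrow", classes)
print(max(result, key=result.get), result)
```

In this toy data the words "the" and "meeting" are unseen in both classes and
contribute the same factor to each, so only "remember" and "tomorrow" shift
the beliefs toward ``genuine``.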

  

High Level
----------

::

  from bayesian import classify, classify_file, classify_folder, classify_normal

  spams = ["buy viagra", "dear recipient", "meet sexy singles"] # etc
  genuines = ["let's meet tomorrow", "remember to buy milk"]
  message = "remember the meeting tomorrow"
  # Classify as "genuine" because of the words "remember" and "tomorrow".
  print(classify(message, {'spam': spams, 'genuine': genuines}))

  # Decides if the person with those measurements is male or female.
  print(classify_normal({'height': 6, 'weight': 130, 'foot size': 8},
                        {'male': [{'height': 6, 'weight': 180, 'foot size': 12},
                                  {'height': 5.92, 'weight': 190, 'foot size': 11},
                                  {'height': 5.58, 'weight': 170, 'foot size': 12},
                                  {'height': 5.92, 'weight': 165, 'foot size': 10}],
                         'female': [{'height': 5, 'weight': 100, 'foot size': 6},
                                    {'height': 5.5, 'weight': 150, 'foot size': 8},
                                    {'height': 5.42, 'weight': 130, 'foot size': 7},
                                    {'height': 5.75, 'weight': 150, 'foot size': 9}]}))

  # Classifies "unknown_file" as either a Python or Java file, assuming
  # you have directories with examples of each language.
  print(classify_file("unknown_file", ["java_files", "python_files"]))

  # Classifies every file under "folder" as either a Python or Java file,
  # assuming you have subdirectories with examples of each language.
  print(classify_folder("folder"))
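
To give an idea of what classifying numeric measurements can involve, here is
a minimal Gaussian naive Bayes sketch over the same data. It is not the
implementation behind ``classify_normal``: the helper names
``gaussian_log_pdf`` and ``classify_gaussian``, the equal class priors, and
the use of the sample variance are all assumptions of this example.

```python
import math
from statistics import mean, variance


def gaussian_log_pdf(x, mu, var):
    """Log density of a normal distribution with mean mu and variance var."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)


def classify_gaussian(instance, classes):
    """Pick the class whose per-feature normal fits best explain instance."""
    scores = {}
    for label, examples in classes.items():
        # Equal priors are assumed, so the prior term is omitted.
        score = 0.0
        for feature, value in instance.items():
            observed = [example[feature] for example in examples]
            # Assumes every feature varies within a class (variance > 0).
            score += gaussian_log_pdf(value, mean(observed), variance(observed))
        scores[label] = score
    return max(scores, key=scores.get)


people = {'male': [{'height': 6, 'weight': 180, 'foot size': 12},
                   {'height': 5.92, 'weight': 190, 'foot size': 11},
                   {'height': 5.58, 'weight': 170, 'foot size': 12},
                   {'height': 5.92, 'weight': 165, 'foot size': 10}],
          'female': [{'height': 5, 'weight': 100, 'foot size': 6},
                     {'height': 5.5, 'weight': 150, 'foot size': 8},
                     {'height': 5.42, 'weight': 130, 'foot size': 7},
                     {'height': 5.75, 'weight': 150, 'foot size': 9}]}
sample = {'height': 6, 'weight': 130, 'foot size': 8}
prediction = classify_gaussian(sample, people)
print(prediction)
```

Summing log densities instead of multiplying raw densities avoids underflow
when there are many features. For this sample the weight and foot size are far
from the male averages, so the sketch settles on ``female``.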

Low Level
---------

::

  # Makes print() behave the same on Python 2 and Python 3.
  from __future__ import print_function

  from bayesian import Bayes

  print(' -- Spam Filter --')
  # Database with the number of sightings of each word in (genuine, spam)
  # emails.
  words_odds = {'buy': (5, 100), 'viagra': (1, 1000), 'meeting': (15, 2)}
  # Emails to be analyzed.
  emails = [
      "let's schedule a meeting for tomorrow",  # 100% genuine (meeting)
      "buy some viagra",  # 100% spam (buy, viagra)
      "buy coffee for the meeting",  # buy x meeting, should be genuine
  ]

  for email in emails:
      # Start with priors of 90% chance being genuine, 10% spam.
      # Probabilities are normalized automatically.
      b = Bayes([('genuine', 90), ('spam', 10)])
      # Update probabilities, using the words in the emails as events and the
      # database of chances to figure out the change.
      b.update_from_events(email.split(), words_odds)
      # Print the email and whether it's likely spam or not.
      print(email[:15] + '...', b.most_likely())

  print('')

  print(' -- Spam Filter With Email Corpus -- ')
  # Email corpus. Two hundred spam emails about buying products, plus one
  # that mentions "meeting". Genuine emails are about meetings and buying
  # milk.
  instances = {'spam': ["buy viagra", "buy cialis"] * 100 + ["meeting love"],
               'genuine': ["meeting tomorrow", "buy milk"] * 100}
  # Use str.split to extract features/events/words from the corpus and build
  # the model.
  model = Bayes.extract_events_odds(instances, str.split)
  # Create a new Bayes instance with priors of 90% spam, 10% genuine.
  b = Bayes({'spam': .9, 'genuine': .1})
  # Update beliefs with features/events/words from an email.
  b.update_from_events("buy coffee for meeting".split(), model)
  # Print the email and the resulting beliefs.
  print("'buy coffee for meeting'", ':', b)

  print('')

  print(' -- Classic Cancer Test Problem --')
  # 1% chance of having cancer.
  b = Bayes([('not cancer', 0.99), ('cancer', 0.01)])
  # Test positive, 9.6% false positives and 80% true positives
  b.update((9.6, 80))
  print(b)
  print('Most likely:', b.most_likely())

  print('')

  print(' -- Are You Cheating? -- ')
  results = ['heads', 'heads', 'tails', 'heads', 'heads']
  events_odds = {'heads': {'honest': .5, 'cheating': .9},
                 'tails': {'honest': .5, 'cheating': .1}}
  b = Bayes({'cheating': .5, 'honest': .5})
  b.update_from_events(results, events_odds)
  print(b)

  def b():
      return Bayes((0.99, 0.01), labels=['not cancer', 'cancer'])

  # Random equivalent examples, all achieve the same result.
  b() * (9.6, 80)
  (b() * (9.6, 80)).opposite().opposite()
  b().update({'not cancer': 9.6, 'cancer': 80})
  b().update((9.6, 80))
  b().update_from_events(['pos'], {'pos': (9.6, 80)})
  b().update_from_tests([True], [(9.6, 80)])
  Bayes([('not cancer', 0.99), ('cancer', 0.01)]) * (9.6, 80)
  Bayes({'not cancer': 0.99, 'cancer': 0.01}) * {'not cancer': 9.6,
                                                 'cancer': 80}
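
The arithmetic behind the cancer and coin examples can be checked by hand with
a plain-Python sketch of a single belief update: multiply each belief by its
likelihood, then renormalize. The ``update`` function below is a hypothetical
stand-in written for this illustration and is not the ``Bayes`` class.

```python
def update(beliefs, likelihoods):
    """Multiply each belief by its likelihood, then renormalize to sum to 1."""
    unnormalized = {label: beliefs[label] * likelihoods[label]
                    for label in beliefs}
    total = sum(unnormalized.values())
    return {label: value / total for label, value in unnormalized.items()}


# Cancer test: 1% prior, 9.6% false positives, 80% true positives.
# Only the ratio between likelihoods matters, so percentages work as-is.
cancer = update({'not cancer': 0.99, 'cancer': 0.01},
                {'not cancer': 9.6, 'cancer': 80})

# Coin example: apply one update per observed flip.
events_odds = {'heads': {'honest': .5, 'cheating': .9},
               'tails': {'honest': .5, 'cheating': .1}}
coin = {'cheating': .5, 'honest': .5}
for flip in ['heads', 'heads', 'tails', 'heads', 'heads']:
    coin = update(coin, events_odds[flip])

print(cancer)
print(coin)
```

Under these numbers a positive test raises the belief in cancer from 1% to
roughly 7.8% (0.8 / 10.304), so "not cancer" stays the most likely hypothesis,
while four heads in five flips push the belief in cheating to about 68%.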

Project details
---------------

:License: MIT
:Code: https://github.com/boppreh/bayesian/
:PyPI: https://pypi.python.org/pypi/Bayesian
:Issue tracker: https://github.com/boppreh/bayesian/issues
