{"id":13441569,"url":"https://github.com/opencog/atomspace","last_synced_at":"2025-05-14T00:10:46.947Z","repository":{"id":31118561,"uuid":"34678084","full_name":"opencog/atomspace","owner":"opencog","description":"The OpenCog (hyper-)graph database and graph rewriting system","archived":false,"fork":false,"pushed_at":"2025-05-01T02:16:11.000Z","size":170843,"stargazers_count":871,"open_issues_count":76,"forks_count":242,"subscribers_count":83,"default_branch":"master","last_synced_at":"2025-05-01T03:21:14.526Z","etag":null,"topics":["atomspace","graph-database","graph-rewriting","knowledge-base","knowledge-graph","knowledge-representation","logic-programming","query-engine","query-language","relational-algebra","relational-database","rewrite-system","rewriting"],"latest_commit_sha":null,"homepage":"https://wiki.opencog.org/w/AtomSpace","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/opencog.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS","dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2015-04-27T16:32:57.000Z","updated_at":"2025-05-01T02:16:15.000Z","dependencies_parsed_at":"2025-01-06T04:00:30.642Z","dependency_job_id":"7e00c95c-99dd-4ae9-8313-546d601a4488","html_url":"https://github.com/opencog/atomspace","commit_stats":{"total_commits":28809,"total_committers":161,"mean_commits":"178.93788819875778","dds":0.4233399284945677,"last_synced_commit":"884d05157c27de852d26a3a6f8e924201075b6da"},"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositori
es/opencog%2Fatomspace","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/opencog%2Fatomspace/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/opencog%2Fatomspace/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/opencog%2Fatomspace/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/opencog","download_url":"https://codeload.github.com/opencog/atomspace/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254044357,"owners_count":22005142,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["atomspace","graph-database","graph-rewriting","knowledge-base","knowledge-graph","knowledge-representation","logic-programming","query-engine","query-language","relational-algebra","relational-database","rewrite-system","rewriting"],"created_at":"2024-07-31T03:01:35.564Z","updated_at":"2025-05-14T00:10:41.937Z","avatar_url":"https://github.com/opencog.png","language":"C++","funding_links":[],"categories":["C++","others","knowledge-graph","Others"],"sub_categories":[],"readme":"OpenCog AtomSpace\n=================\n\n[![CircleCI](https://circleci.com/gh/opencog/atomspace.svg?style=svg)](https://circleci.com/gh/opencog/atomspace)\n\nThe OpenCog AtomSpace is an in-RAM knowledge representation (KR)\ndatabase with an associated query engine and graph-re-writing system.\nIt is a kind of in-RAM generalized hypergraph (metagraph) database.\nMetagraphs offer more efficient, more flexible and more powerful ways\nof representing 
graphs: [a metagraph store is literally just-plain\nbetter than a graph store.](https://github.com/opencog/atomspace/blob/master/opencog/sheaf/docs/ram-cpu.pdf)\nOn top of this, the Atomspace provides a large variety of advanced\nfeatures not available anywhere else.\n\nThe AtomSpace is a platform for building Artificial General Intelligence\n(AGI) systems. It provides the central knowledge representation component\nfor OpenCog. As such, it is a fairly mature component, on which a lot of\nother systems are built, and which depend on it for stable, correct\noperation in a day-to-day production environment.\n\nThere are several dozen modules built on top of the AtomSpace. Notable\nones include:\n\n* [Store AtomSpaces to disk](https://github.com/opencog/atomspace-rocks)\n* [Network-distributed AtomSpace storage](https://github.com/opencog/atomspace-cog)\n* [Network shell to AtomSpaces, including a WebSocket API](https://github.com/opencog/cogserver)\n* [Sparse Vector/Matrix embeddings/access to graphs](https://github.com/opencog/matrix)\n* [Sensori-motor research](https://github.com/opencog/sensory)\n* [Language learning](https://github.com/opencog/learn)\n\nData as MetaGraphs\n==================\nIt is now commonplace to represent data as graphs; there are more graph\ndatabases than you can shake a stick at. What makes the AtomSpace\ndifferent? A dozen features that no other graph DB does, or has even\ndreamed of doing.\n\nBut, first: five things everyone else does:\n* Perform [graphical database queries](https://wiki.opencog.org/w/Pattern_engine),\n  returning results that satisfy a provided search pattern.\n* Arbitrarily complex patterns with an arbitrary number of variable\n  regions can be specified, by unifying multiple clauses.\n* Modify searches with conditionals, such as \"greater than\", and with\n  user callbacks into scheme, python or Haskell.\n* Perform **graph rewriting**: use search results to create new graphs.\n* Trigger execution of user callbacks... 
or of executable graphs (as\n  explained below).\n\nA key difference: the AtomSpace is a metagraph store, not a graph store.\nMetagraphs can efficiently represent graphs, but not the other way around.\nThis is carefully explained\n[here,](https://github.com/opencog/atomspace/blob/master/opencog/sheaf/docs/ram-cpu.pdf)\nwhich also gives a precise definition of what a metagraph is, and how it\nis related to a graph.  As a side-effect, metagraphs open up many\npossibilities not available to ordinary graph databases. These are\nlisted below.  These are things that no one else does:\n* **Search queries are graphs.**\n  (The API to the [pattern engine](https://wiki.opencog.org/w/Pattern_engine)\n  is a graph.) That is, every query, every search is also a graph. That\n  means one can store a collection of searches in the database, and\n  access them later. This allows a graph rule engine to be built up.\n* **Inverted searches.**\n  ([DualLink](https://wiki.opencog.org/w/DualLink).)\n  Normally, a search is like \"asking a question\" and \"getting an\n  answer\". For the inverted search, one \"has an answer\" and is looking\n  for all \"questions\" for which it is a solution. This is pattern\n  recognition, as opposed to pattern search. All chatbots do this as\n  a matter of course, to handle chat dialog. No chatbot can host\n  arbitrary graph data, or search it. The AtomSpace can. This is because\n  queries are also graphs, and not just data.\n* Both [**\"meet\" and \"join\"**](https://en.wikipedia.org/wiki/Join_and_meet)\n  searches are possible: One can perform a \"fill in the blanks\" search\n  (a meet, with [MeetLink](https://wiki.opencog.org/w/MeetLink))\n  and one can perform a \"what contains this?\" search (a join, with\n  [JoinLink](https://wiki.opencog.org/w/JoinLink).)\n* **Graphs are executable.** Graph vertex types include \"plus\", \"times\",\n  \"greater than\" and many other programming constructs. 
The resulting\n  graphs encode\n  [\"abstract syntax trees\"](https://en.wikipedia.org/wiki/Abstract_syntax_tree)\n  and the resulting language is called\n  [Atomese](https://wiki.opencog.org/w/Atomese).\n  It resembles the\n  [intermediate representation](https://en.wikipedia.org/wiki/Intermediate_representation)\n  commonly found in compilers, except that, here, it is explicitly exposed\n  to the user as a storable, queriable, manipulable, executable graph.\n* **Graphs are typed**\n  ([TypeNode](https://wiki.opencog.org/w/TypeNode) and\n  [type constructors](https://wiki.opencog.org/w/Type_constructor).)\n  Graph elements have types, and there are half a dozen type\n  constructors, including types for graphs that are functions. This\n  resembles programming systems that have type constructors, such as\n  CaML or Haskell.\n* **Graph nodes carry vectors**\n  [Values](https://wiki.opencog.org/w/Value) are mutable vectors of\n  data. Each graph element (vertex or edge, node or link) can host\n  an arbitrary collection of Values. That is, each graph element is\n  also a key-value database.\n* **Graphs specify flows**\n  Values can be static or dynamic.  For the dynamic case, a given\n  graph can be thought of as \"pipes\" or \"plumbing\"; the Values can\n  \"flow\" along that graph.  For example, the\n  [FormulaStream](https://wiki.opencog.org/w/FormulaStream) allows\n  numeric vector operations (\"formulas\") to be defined. Accessing\n  a FormulaStream provides the vector value *at that instant*.\n* **Unordered sets**\n  ([UnorderedLink](https://wiki.opencog.org/w/UnorderedLink).)\n  A graph vertex can be an unordered set (Think of a list of edges, but\n  they are not in any fixed order.) When searching for a matching\n  pattern, one must consider **all** permutations of the set. This is\n  easy, if the search has only one unordered set. This is hard, if\n  they are nested and inter-linked: it becomes a constraint-satisfaction\n  problem.  
The AtomSpace pattern engine handles all of these cases\n  correctly.\n* **Alternative sub-patterns**\n  ([ChoiceLink](https://wiki.opencog.org/w/ChoiceLink).)\n  A search query can include a menu of sub-patterns to be matched. Such\n  sets of alternatives can be nested and composed arbitrarily. (*i.e.*\n  they can contain variables, *etc.*)\n* **Globby matching**\n  ([GlobNode](https://wiki.opencog.org/w/GlobNode).)\n  One can match zero, one or more subgraphs with globs. This is similar\n  to the idea of globbing in a regex. Thus, a variable need not be\n  grounded by only one subgraph: a variable can be grounded by an\n  indeterminate range of subgraphs.\n* **Quotations** ([QuoteLink](https://wiki.opencog.org/w/QuoteLink).)\n  Executable graphs can be quoted.  This is similar to quotations in\n  functional programming languages. In this case, it allows queries\n  to search for other queries, without triggering the query that was\n  searched for. Handy for rule-engines that use rules to find other\n  rules.\n* **Negation as failure**\n  ([AbsentLink](https://wiki.opencog.org/w/AbsentLink).)\n  Reject matches to subgraphs having particular sub-patterns in them.\n  That is, find all graphs of some shape, except those having these\n  other sub-shapes.\n* **For-all predicate**\n  ([AlwaysLink](https://wiki.opencog.org/w/AlwaysLink).)\n  Require that all matches contain a particular subgraph or satisfy a\n  particular predicate.  For example: find all baskets that have only\n  red balls in them. This requires not only finding the baskets, making\n  sure they have balls in them, but also testing each and every ball in\n  a basket to make sure they are **all** of the same color.\n* **Frames (ChangeSets)**\n  Store a sequence of graph rewrites, changes of values as a single\n  changeset. The database itself is a collection of such changesets or\n  \"Frames\".  Very roughly, a changeset resembles a git commit, but for\n  the graph database. 
The word \"Frame\" is meant to evoke the idea of a\n  stackframe, or a Kripke frame: the graph state, at this moment. By\n  storing frames, it is possible to revert to earlier graph state. It is\n  possible to compare different branches and to explore different\n  rewrite histories starting from the same base graph.  Different\n  branches may be merged, forming a set-union of their contents. This\n  is useful for inference and learning algos, which explore long chains\n  of large, complex graph rewrites.\n\n\n### What it Isn't\nNewcomers often struggle with the AtomSpace, because they bring\npreconceived notions of what they think it should be, and it's not that.\nSo, a few things it is not.\n\n* **It's not JSON.**  So JSON is a perfectly good way of representing\n  structured data. JSON records data as `key:value` pairs, arranged\n  hierarchically, with braces, or as lists, with square brackets.\n  The AtomSpace is similar, except that there are no keys! The\n  AtomSpace still organizes data hierarchically, and provides lists,\n  but all entries are anonymous, nameless. Why? There are performance\n  (CPU and RAM usage) and other design benefits in not using explicit\n  named keys in the data structure. You can still have named values;\n  it is just that they are not required. There are several different\n  ways of importing JSON data into the AtomSpace. If your mental model\n  of \"data\" is JSON, then you will be confused by the AtomSpace.\n\n* **It's not SQL. It's also not noSQL**. Databases from 50 years ago\n  organized structured data into tables, where the `key` is the label\n  of a column, and different `values` sit in different rows. This is\n  more efficient than JSON, when you have many rows: you don't have to\n  store the same key over and over again, for each row. Of course,\n  tabular data is impractical if you have zillions of tables, each with\n  only one or two rows. 
That's one reason why JSON was invented.\n  The AtomSpace was designed to store *unstructured* data. You can\n  still store structured data in it; there are several different ways\n  of importing tabular data into the AtomSpace. If your mental model\n  of \"data\" is structured data, then you will be confused by the AtomSpace.\n\n* **It's not a vertex+edge store**. (Almost?) all graph databases\n  decompose graphs into lists of vertexes and edges. This is just fine,\n  if you don't use complex algorithms. The problem with this storage\n  format is locality: graph traversal becomes a game of repeatedly\n  looking up a specific vertex and then, a specific edge, each located\n  in a large table of vertexes and edges. This is non-local; it\n  requires large indexes on those tables (requires a lot of RAM),\n  and the lookups are CPU consuming. Graph traversal can be a\n  bottleneck. The AtomSpace [avoids much of this overhead by using\n  (hyper-/meta-)graphs.](https://github.com/opencog/atomspace/blob/master/opencog/sheaf/docs/ram-cpu.pdf)\n  This enables more effective and simpler\n  traversal algorithms, which in turn allows more sophisticated\n  search features to be implemented.  If your mental model of\n  graph data is lists of vertexes and edges, then you will be confused\n  by the AtomSpace.\n\n\n**What is it, then?** Most simply, the AtomSpace stores immutable,\nglobally unique, [typed](https://en.wikipedia.org/wiki/Type_theory)\n[s-expressions.](https://en.wikipedia.org/wiki/S-expression) The types\ncan be thought of as being like object-oriented classes, and many (not\nall) Atom types do have a corresponding C++ class. Each s-expression is\ncalled \"an Atom\". Each Atom is globally unique: there is only one copy,\never, of any given s-expression (Atom). It's almost just that simple,\nwith only one little twist: a (mutable) key-value database is attached\nto each Atom. 
Now, \"ordinary\" graph databases do this too: every vertex\nor edge can have \"attributes\" on it. The AtomSpace allows these\nattributes to be dynamic: to change in time or to \"flow\". The flow\nitself is described by a graph; thus, graphs can be thought of as\n\"plumbing\"; whereas the Values are like the \"fluid\" in these pipes.\nThis is much like the distinction between \"software\" and \"data\":\nsoftware describes algos, data is what moves through them. In the\nAtomSpace, the algos are explicit graphs. The Values are the data.\n\nThe AtomSpace borrows ideas and concepts from many different systems,\nincluding ideas from JSON, SQL and graph stores. The goal of the\nAtomSpace is to be general: to allow you to work with whatever style\nof data you want: structured or unstructured. As graphs, as tables,\nas objects. As lambda expressions, as abstract syntax trees, as\nprolog-like logical statements.  A place to store relational data\nobeying some relational algebra. As a place to store ontologies or\nmereologies or taxonomies. A place for syntactic (BNF-style)\nproductions or constraints or RDF/OWL-style schemas.\nIn a mix of declarative, procedural and functional styles.\nThe AtomSpace is meant to allow general knowledge representation,\nin any format.\n\nThe \"special extra twist\" of immutable graphs decorated with mutable\nvalues resembles a corresponding idea in logic: the split between\nlogical statements, and the truth values (valuations) attached to them.\nThis is useful not only for logic, but also for specifying data\nprocessing pipelines: the graph specifies the pipeline; the values are\nwhat flow through that pipeline. The graph is the \"code\"; the values\nare the data that the code acts on.\n\nAll this means that the AtomSpace is different and unusual.\nIt might be a bit outside of the comfort zone for most programmers.\nIt doesn't have API's that are instantly recognizable to users of\nthese other systems. 
There is a challenging learning curve involved.\nWe're sorry about that: if you have ideas for better API's that\nwould allow the AtomSpace to look more conventional, and be less\nintimidating to most programmers, then contact us!\n\n### Status and Invitation\n\nAs it turns out, knowledge representation is hard, and so the AtomSpace\nhas been (and continues to be) a platform for active scientific research\non knowledge representation, knowledge discovery and knowledge\nmanipulation.  If you are comfortable with extremely complex\nmathematical theory, and just also happen to be extremely comfortable\nwriting code, you are invited -- encouraged -- to join the project.\n\n### Related ideas\nA short list of some related concepts:\n\n* [Carnegie Mellon Binary Analysis Platform (BAP)](https://github.com/BinaryAnalysisPlatform/bap)\n  allows binary programs (viruses, etc.) to be disassembled and analyzed.\n  The disassembled program is stored as a graph in a database. The graph\n  can be analyzed, investigated, and even executed, to see what it does.\n  Thus, similar to the AtomSpace, but very highly specialized for binaries,\n  and nothing else.\n\n* [Modelica](https://en.wikipedia.org/wiki/Modelica) is a modelling\n  language for describing complex systems. Intended for describing\n  mechanical, electrical, electronic, hydraulic, thermal, control,\n  electric power and process-oriented systems. The descriptions are\n  static, object-oriented, file-based, and meant to be written by\n  humans. That is, the models are automated, but not the creation and\n  management of them. Not suitable for general graph structures.\n\n* The concept of graph programming.\n\n\nUsing Atomese and the AtomSpace\n===============================\nThe AtomSpace is not an \"app\". Rather, it is a knowledge-base platform.\nIt is probably easiest to think of it as kind-of-like an operating\nsystem kernel: you don't need to know how it works to use it. You\nprobably don't need to tinker with it. 
It just works, and it's there\nwhen you need it.\n\nEnd-users and application developers will want to use one of the existing\n\"app\" subsystems, or write their own.  Most of the existing AtomSpace \"apps\"\nare focused on various aspects of \"Artificial General Intelligence\". This\nincludes (unsupervised) natural-language learning, machine-learning,\nreasoning and induction, chatbots, robot control, perceptual subsystems\n(vision processing, sound input), genomic and proteomic data analysis,\ndeep-learning neural-net interfaces. These can be found in other github\nrepos, including:\n\n* [Unsupervised natural language learning](https://github.com/opencog/learn)\n  (learn repo)\n* [JSON, Python, Scheme network interfaces](https://github.com/opencog/cogserver)\n  (cogserver repo)\n\nZombie projects: these are half-dead; no one is currently working on them,\nbut they should still work and still provide useful capabilities.\n* [Genomic, proteomic data analysis](https://github.com/opencog/agi-bio)\n  (agi-bio repo) and various [MOZI.AI](https://github.com/mozi-ai) repos.\n* [Port of the MOSES machine learning to Atomese](https://github.com/opencog/as-moses)\n  (as-moses repo)\n* [Unified Rule Engine](https://github.com/opencog/ure) (ure repo)\n* [OpenAI Gym and Minecraft agents](https://github.com/opencog/rocca)\n  (rocca repo)\n\nDead projects: these are no longer maintained. 
They used to work, but have\nbeen abandoned for various theoretical and political reasons:\n* [Natural language chat, robot control](https://github.com/opencog/opencog)\n  (the opencog repo)\n* [ROS bridge to robots, vision subsystem, chat](https://github.com/opencog/ghost_bridge)\n  (ghost-bridge repo)\n* [Opencog on a Raspberry Pi](https://github.com/opencog/tinycog)\n  (tinycog repo)\n* [Probabilistic Logic Networks](https://github.com/opencog/pln) (pln repo)\n\n\nExamples, Documentation, Blog\n=============================\nIf you are impatient, a good way to learn the AtomSpace is to run the\nexample demos. [Start with these.](examples/atomspace) Then move on to\nthe [pattern-matcher examples](examples/pattern-matcher).\n\nDocumentation is on the OpenCog wiki. Good places to start are here:\n* [AtomSpace](https://wiki.opencog.org/w/AtomSpace)\n* [Atom types](https://wiki.opencog.org/w/Atom_types)\n* [Pattern matching](https://wiki.opencog.org/w/Pattern_matching)\n\nThe [OpenCog Brainwave blog](https://blog.opencog.org/) provides reading\nmaterial for what this is all about, and why.\n\nA Theoretical Overview\n======================\nAtomese is a collection of structural primitives meant to describe\nstructural relationships as they are witnessed in \"reality\". This\nincludes descriptions of physical nature, biological nature,\npsychological, social, cultural, political and economic, and, of course,\nmathematical and technological. So, software and programming.\n\n### Motivation\n\nThe idea of representing \"everything\" is as old as Aristotle. Set theory\nis an early mathematical framework. This is followed by combinators and\nlambda calculus, by means of which \"anything sayable can be said\". Modern\nmath offers Category Theory and Topos Theory, along with Proof Theory\nand Model Theory as ways of talking about \"anything\". 
The goals of\nmathematicians, however, are not the same as the more\nentrepreneurial-minded, and the latter have created the trillion dollar\ncomputer industry, with only token acknowledgement of the mathematical\nfoundations. The computer industry gives us relational databases,\nknowledge representation, upper ontologies, and now LLM's, transformers\nand weights as mechanisms by which \"anything\" can be represented.\n\nAtomese is an ongoing attempt to roll all of this up into one, and to do\nso in a way that makes general intelligence algorithmically accessible.\nUntil now, all attempts to extract structure from the universe are\ncomplex systems hand-crafted by human engineers. These might be\nfinancial credit-worthiness rating systems, or astronomical\nstellar-redshift analysis tools. The software for these systems is\nwritten by humans, applying conventional software development\nmethodologies, using conventional programming languages, designed to\nmake it easy for the human software engineer to perform their task.\n\nWhat if, instead, we ask: what would it take to make it easy for\nalgorithmic systems to automatically explore and extract structure? To\ncreate world-models that can be stored in short-term or long-term\nmemory, to process and transform sensory information, to drive motors\nand perform actions in the real world? That is, rather than having a\nsmall army of humans hand-crafting custom robots for others to use, to\ninstead provide a recursive infrastructure to allow, umm, err, robots to\ncraft themselves? This is the driving vision of Atomese.\n\n### History\n\nAtomese originally arose as an attempt by Ben Goertzel and company to\ncombine symbolic AI methods with probability theory, resulting in the\ndefinition of PLN, Probabilistic Logic Networks, articulated in several\nbooks devoted to the topic. 
In this articulation, the primitives of\nknowledge representation theory are mashed up with mathematical logic to\nprovide Nodes and Links, which are general enough to represent almost\nany kind of relational structure. The base object then becomes a\ncollection of graphs, or, more properly, hypergraphs. To be able to\nprocess, digest, reason and manipulate these, these are placed in a\n(hyper-)graph database, the AtomSpace.\n\nTo layer on probability theory onto what is otherwise a purely symbolic\nrepresentation of nature, the SimpleTruthValue is introduced. This is a\npair of floating-point numbers, representing the probability, and the\nconfidence of any given symbolic factual assertion. The goal is to\nsupport logical reasoning systems of any type, not only conventional\nBayesian inference, but any collection of rule systems and axioms, as\nmight be encountered in mathematical proof theory. This would include,\nfor example, any of the rich varieties of modal logic, but also fuzzy\nlogic, the so-called \"non-axiomatic reasoning systems\" and\nstatistical-mechanical systems like Markov logic.\n\nThe word-phrase TruthValue, and more generally Value, has its roots in\nmathematical logic, where any given assertion in first-order logic (or\nhigher-order logic) can be assigned a \"valuation\", indicating its\nbinary truth/falsehood. Probability theory forces a replacement of crisp\n0/1 by a floating-point number. Probabilistic logic (along with neural\nnets) famously has issues with converging rapidly enough to a given\nsolution. For this reason, an extra float is introduced, the\n\"confidence\". This helps, but is still not enough to capture the concept\nof an ensemble, e.g. a \"Bell curve\", a Gaussian, or more generally any\nkind of probability distribution: a \"histogram\" or more simply \"a vector\nof numbers\". 
This leads to the idea of a FloatValue, and then rapidly to\na Value in general, which is a vector of anything at all, representing\ntruths in any ensemble, hypothetical modal universe, a set of Bayesian\npriors, as the case may be. Of course, vectors of floats are the\nbread-n-butter of neural nets.\n\nParallel universes, such as the hypothetical worlds of modal logic,\nthermodynamic canonical ensembles, the infinite collection of Bayesian\npriors, or, god forbid, quantum-mechanical decompositions, are often\nimagined to live \"in parallel\" or to somehow co-exist temporally. In\nphysical reality, though, the changing network of relationships and\nlikelihoods is time-varying, and usually accessible only through sensory\ndevices, rather than through pure reason. This motivates the recasting\nof Values as streams that flow data. This relegates the AtomSpace to\nbeing a form of memory, a repository for world-models, while flowing\nstreams encapsulate the process of, well, \"processing information\". This\nfits well with present-day software theory, which includes descriptions\nof generators, futures and promises as software primitive constructs for\ncreating sensory agentic systems. The backends of large commercial\nwebsites use futures and promises as extremely low-level programming\nconstructs to implement millisecond reaction times when customers click\non their favorite TikTok influencer. The point of having streams in\nAtomese is not to be hopelessly abstract, but to capture an idea that is\nalready widespread in the design and development of agentic software\nsystems.\n\nThis brings Atomese to its present-day state: an infrastructure for\nsymbolic AI, together with a (hyper-)graph database, offering dynamic\nsensori-motor processing primitives. The hope is that this is an\nappropriate toolset for agentic systems that can reify, transform and\ntransmute their own content. 
It remains a research platform to figure\nout how this is possible, or, perhaps being more honest, if this is\npossible.\n\n### Atoms and Values\nOne of the primary conceptual distinctions in Atomese is between\n\"Atoms\" and \"Values\". The distinction is made for both usability and\nperformance.  Atoms are:\n\n* Used to represent graphs, networks, and long-term stable graphical relations.\n* Indexed (by the AtomSpace), which enables the rapid search and traversal of graphs.\n* Globally unique, and thus unambiguous anchor points for data.\n* Immutable: can only be created and destroyed, and are effectively static and unchanging.\n* Large, bulky, heavy-weight (because indexes are necessarily bulky).\n\nBy contrast, Values, and valuations in general, are:\n* A way of holding on to rapidly-changing data, including streaming data.\n* Hold \"truth values\" and \"probabilities\", which change over time as new\n  evidence is accumulated.\n* Provide a per-Atom key-value store (a mini noSQL database per-Atom).\n* Are not indexed, and are accessible only by direct reference.\n* Small, fast, fleeting (no indexes!)\n\nThus, for example, a piece of knowledge, or some proposition would be\nstored as an Atom.  As new evidence accumulates, the truth value of the\nproposition is adjusted. Other fleeting changes, or general free-form\nannotations can be stored as Values.  Essentially, the AtomSpace looks\nlike a database-of-databases; each atom is a key-value database; the\natoms are related to one-another as a graph. The graph is searchable,\neditable; it holds rules and relations and ontologies and axioms.\nValues are the data that stream and flow through this network, like\nwater through pipes. Atoms define the pipes, the connectivity. Values\nflow and change. 
See the blog entry\n[value flows](https://blog.opencog.org/2020/04/08/value-flows/) as\nwell as [Atom](https://wiki.opencog.org/w/Atom) and\n[Value](https://wiki.opencog.org/w/Value).\n\n### More info\nThe primary documentation for the atomspace and Atomese is here:\n\n* https://wiki.opencog.org/w/AtomSpace\n* https://wiki.opencog.org/w/Atomese\n* https://wiki.opencog.org/w/Atom\n* https://wiki.opencog.org/w/Value\n\nThe main project site is at https://opencog.org\n\n\nNew Developers; Pre-requisite skills\n====================================\nMost users should almost surely focus their attention on one of the\nhigh-level systems built on top of the AtomSpace. The rest of this\nsection is aimed at anyone who wants to work *inside* of the AtomSpace.\n\nMost users/developers should think of the AtomSpace as being kind-of-like\nan operating system kernel, or the guts of a database: it's complex, and\nyou don't need to know how the innards work to use the system. These\ninnards are best left to committed systems programmers and research\nscientists; there is no easy way for junior programmers to participate,\nat least, not without a lot of hard work and study.  It's incredibly\nexciting, though, if you know what you're doing.\n\nThe AtomSpace is a relatively mature system, and thus fairly complex.\nBecause other users depend on it, it is not very \"hackable\"; it needs\nto stay relatively stable.  Despite this, it is simultaneously a\nresearch platform for discovering the proper way of adequately\nrepresenting knowledge in a way that is useful for general intelligence.\nIt turns out that knowledge representation is not easy.  This project\nis a -good- excellent place to explore it, if you're interested in that\nsort of thing.\n\nExperience in any of the following areas will make things easier for\nyou; in fact, if you are good at any of these ... we want you. 
Bad.\n\n* Database internals; query optimization.\n* Logic programming; Prolog.\n* SAT-solving; Answer Set programming; Satisfiability Modulo Theories.\n* Programming language design \u0026 implementation.\n* Rule engines; reasoning; inference; parsing.\n* Theorem-proving systems; Type theory.\n* Compiler internals; code generation; code optimization; bytecode; VMs.\n* Operating systems; distributed database internals.\n* GPU processing pipelines, lighting-shading pipelines, CUDA, OpenCL.\n* Dataflow in GPUs for neural networks.\n\nBasically, Atomese is a mash-up of ideas taken from all of the above\nfields.  It's kind-of trying to do and be all of these, all at once,\nand to find the right balance between all of them. Again: the goal is\nknowledge representation for general intelligence: building something\nthat AGI developers can use.\n\nWe've gotten quite far; we've got a good, clean code-base, more-or-less,\nand we're ready to kick it to the next level. The above gives a hint of\nthe directions that are now open and ready to be explored.\n\nIf you don't have at least some fair grounding in one of the above,\nyou'll be lost, and find it hard to contribute.  If you do know something\nabout any of these topics, then please dive into the open bug list. Fixing\nbugs is the #1 best way of learning the internals of any system.\n\nKey Development Goals\n=====================\nLooking ahead, here are some key major projects.\n\n### Distributed Processing\nOne of the development goals for the 2021-2023 time frame\nis to gain experience with distributed data processing. Currently,\none can build simple distributed networks of AtomSpaces, by using\nthe [**StorageNode**](https://wiki.opencog.org/w/StorageNode) to\nspecify a remote AtomSpace. However, it is up to you to decide what\nkinds of data these AtomSpaces exchange with one another. Only two\nsimple pre-configured communication styles have been created: the\nread-thru and the write-thru proxies for the cogserver. 
These pass\nincoming data and results on to the next nodes in the network.\n\n### Cross-system Bridges\nBecause the AtomSpace can hold many different representational\nstyles, it is relatively easy to import data into the AtomSpace.\nThe low-brow way to do this is to write a script file that imports\nthe data. This is fine, but leads to data management issues: who's\ngot the master copy?\n\nThe goal of data bridges is to create new Atoms that allow live\naccess into other online systems. For example, if an SQL database\nholds a table of `(name, address, phone-number)`, it should be\npossible to map this into the AtomSpace, such that updates not\nonly alter the SQL table, live and online, but also such that\na query performed on the AtomSpace side translates into a query on\nthe SQL database side. This is not hard to do, but no one's done it\nyet.\n\nSimilarly, a live online bridge between the AtomSpace and popular\ngraph databases should also be possible. It's not clear if this\nshould use the [StorageNode](https://wiki.opencog.org/w/StorageNode)\nAPI mentioned above, or if it needs something else.\n\n\n### Exploring Values\n\nThe new Value system seems to provide a very nice way of working\nwith fast-moving high-frequency data.  It seems suitable for holding\non to live-video feeds and audio streams and piping them through\nvarious data-processing configurations. It looks to be a decent\nAPI for declaring the structure and topology of neural nets (e.g.\nTensorFlow).  However, it is more-or-less unused for these tasks.\nApparently, there is still some missing infrastructure, as well as\nsome important design decisions to be made. Developers have not begun\nto explore the depth and breadth of this subsystem, to exert pressure\non it.  Ratcheting up the tension by exploring new and better ways of\nusing and working with Values will be an important goal for the\n2021-2024 time-frame. 
See the\n[value flows](https://blog.opencog.org/2020/04/08/value-flows/) blog\nentry.\n\nA particularly important first step would be to build interfaces\nbetween Values and an audio DSP framework. This would allow AtomSpace\nstructures to control audio processing, thus enabling (for example)\nsound recognition (do I hear clapping? Cheers? Boos?) without having\nto hard-code a \"cheer recognizer\". This opens the door to using machine\nlearning to learn how to detect different kinds of audio events.\n\nThere is no particular need to limit oneself to audio: other kinds\nof data are possible (*e.g.* exploring the syntactic, hierarchical\npart-whole structure in images) but audio is perhaps easier!?\n\n\n### Sheaf theory\n\nMany important types of real-world data, including parses of natural\nlanguage and biochemical processes, resemble the abstract mathematical\nconcept of \"sheaves\", in the sense of sheaf theory.  One reason that\nthings like deep learning and neural nets work well is because some\nkinds of sheaves look like tensor algebras; thus one has things like\nWord2Vec and SkipGram models.  One reason why neural nets still\nstumble on natural language processing is because natural language\nonly kind-of-ish, partly looks like a tensor algebra. But natural\nlanguage looks a whole lot more like a sheaf (because things like\npre-group grammars and categorial grammars \"naturally\" look like\nsheaves.)  Thus, it seems promising to take the theory and all the\nbasic concepts of deep learning and neural nets, rip out the explicit\ntensor-algebra in those theories, and replace it with sheaves. A\n[crude sketch is here](/opencog/sheaf/docs/sheaves.pdf).\n\nSome primitive, basic infrastructure has been built. 
Huge remaining\nwork items are using neural nets to perform the tensor-like factorization\nof sheaves, and to redesign the rule engine to use sheaf-type theorem\nproving techniques.\n\nCurrent work is split between two locations: the \"sheaf\" subdirectory\nin this repo, and the [generate](https://github.com/opencog/generate)\nrepo.\n\nBuilding and Installing\n=======================\nThe Atomspace runs on more-or-less any flavor of GNU/Linux. It does not\nrun on any non-Linux operating systems (except maybe some of the BSD's).\nSorry!\n\nThere are a small number of pre-requisites that must be installed\nbefore it can be built.  Many users will find it easiest to use the\ninstall scripts provided in the [ocpkg repo](https://github.com/opencog/ocpkg).\nSome users may find some success with one of the\n[opencog Docker containers](https://github.com/opencog/docker).\nDevelopers interested in working on the AtomSpace must be able to build\nit manually. If you can't do that, all hope is lost.\n\n### Prerequisites\n\nTo build the OpenCog AtomSpace, the packages listed below are required.\nEssentially all Linux distributions will provide these packages.\n\n###### boost\n* C++ utilities package.\n* https://www.boost.org/ | `apt install libboost-dev`\n\n###### cmake\n* Build management tool; v3.0.2 or higher recommended.\n* https://www.cmake.org/ | `apt install cmake`\n\n###### cogutil\n* Common OpenCog C++ utilities.\n* https://github.com/opencog/cogutil\n* It uses exactly the same build procedure as this package. Be sure\n  to `sudo make install` at the end.\n\n###### guile\n* Embedded scheme REPL; version 3.0 or newer required.\n* https://www.gnu.org/software/guile/guile.html\n* For Debian/Ubuntu,  `apt install guile-3.0-dev`\n\n###### cxxtest\n* Unit test framework.\n* Required for running unit tests. Breaking unit tests is verboten!\n* https://cxxtest.com/ | `apt install cxxtest`\n\n### Optional Prerequisites\n\nThe following packages are optional. 
If they are not installed, some\noptional parts of the AtomSpace will not be built.  The `cmake` command,\nduring the build, will be more precise as to which parts will not be built.\n\n###### Cython\n* C bindings for Python. (Cython version 0.23 or newer)\n* Recommended, as many users enjoy using python.\n* https://cython.org | `apt install cython`\n\n###### Haskell\n* Haskell bindings (experimental).\n* Optional; almost no existing code makes use of Haskell.\n* https://www.haskell.org/\n\n###### OCaml\n* OCaml bindings (experimental).\n* Optional; almost no existing code makes use of OCaml.\n* https://www.ocaml.org/ | `apt install ocaml ocaml-findlib`\n\n###### Postgres\n* Distributed, multi-client networked storage.\n* Needed for \"remembering\" between shutdowns (and for distributed AtomSpace)\n* Optional; The RocksDB backend is recommended. Use the cogserver to get a\n  distributed atomspace.\n* https://postgres.org | `apt install postgresql postgresql-client libpq-dev`\n\n### Building AtomSpace\n\nBe sure to install the pre-requisites first!\nPerform the following steps at the shell prompt:\n```\n    cd to project root dir\n    mkdir build\n    cd build\n    cmake ..\n    make -j4\n    sudo make install\n    make -j4 check\n```\nLibraries will be built into subdirectories within build, mirroring\nthe structure of the source directory root.\n\n\n### Unit tests\n\nTo build and run the unit tests, from the `./build` directory enter\n(after building opencog as above):\n```\n    make -j4 check\n```\nMost tests (just not the database tests) can be run in parallel:\n```\n    make -j4 check ARGS=-j4\n```\nThe database tests *will* fail if run in parallel: they will step on\none-another, since they all set and clear the same database tables.\n\nSpecific subsets of the unit tests can be run:\n```\n    make test_atomese\n    make test_atomspace\n    make test_guile\n    make test_join\n    make test_python\n    make test_query\n```\n\n### Install\n\nAfter building, you 
MUST install the AtomSpace.\n```\n    sudo make install\n```\n\nWriting Atomese\n===============\nAtomese -- that is -- all of the different Atom types, can be thought\nof as the primary API to the AtomSpace.  Atoms can, of course, be\ncreated and manipulated with Atomese; but, in practice, programmers\nwill work with either Scheme (guile), Python, C++, OCaml or Haskell.\n\nThe simplest, most complete and extensive interface to Atoms and the\nAtomSpace is via Scheme, and specifically, the GNU Guile Scheme\nimplementation.  An extensive set of examples can be found in the\n[`/examples/atomspace`](/examples/atomspace) and the\n[`/examples/pattern-matcher`](/examples/pattern-matcher) directories.\n\nPython is more familiar than Scheme to most programmers, and it offers\nanother way of interfacing to the AtomSpace. Unfortunately, it is not\nas easy and simple to use as Scheme; it also has various technical issues.\nThus, it is significantly less-used than Scheme in the OpenCog project.\nNone-the-less, it remains vital for various applications. See the\n[`/examples/python`](/examples/python) directory for how to use Python\nwith the AtomSpace.\n\nTODO - Notes - Open Projects\n============================\n* Porting to Android: multiple issues described in\n  [cogroid/b-obstacles](https://github.com/cogroid/b-obstacles).\n  See also [bug 2995](https://github.com/opencog/atomspace/issues/2995).\n\n* Bug: Crashes on Arm v7a:\n  [bug 2944](https://github.com/opencog/atomspace/issues/2944).\n\nInteresting Reading\n===================\n* Seems that the AtomSpace is no longer alone in the hypergraph world!\n  As of 2022, one can find a Python library called\n  [HyperNetX](https://hypernetx.readthedocs.io/en/latest/).\n  Their documentation is even eerily similar to our own! Gee, how could\n  that happen?\n\n* Stop the presses! There's more! Extra! Extra! 
Read all about it!\n  [NetworkX](https://networkx.org/) is a python package for analyzing\n  complex networks.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopencog%2Fatomspace","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fopencog%2Fatomspace","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopencog%2Fatomspace/lists"}