{"id":19484943,"url":"https://github.com/opencog/sensory","last_synced_at":"2025-04-15T11:54:16.657Z","repository":{"id":234060397,"uuid":"788224604","full_name":"opencog/sensory","owner":"opencog","description":"Low-level sensory I/O Atoms","archived":false,"fork":false,"pushed_at":"2025-02-08T03:31:53.000Z","size":345,"stargazers_count":5,"open_issues_count":0,"forks_count":1,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-03-28T19:45:16.289Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/opencog.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-04-18T02:23:21.000Z","updated_at":"2025-03-06T18:06:36.000Z","dependencies_parsed_at":"2024-05-11T23:22:28.700Z","dependency_job_id":"084e5938-d64e-4735-a477-a2c4a7ce9eed","html_url":"https://github.com/opencog/sensory","commit_stats":null,"previous_names":["opencog/sensory"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/opencog%2Fsensory","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/opencog%2Fsensory/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/opencog%2Fsensory/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/opencog%2Fsensory/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/opencog","download_url":"https://codeload.github.com/opencog/sensory/tar.gz/refs/heads/mast
er","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":249067767,"owners_count":21207395,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-10T20:24:49.290Z","updated_at":"2025-04-15T11:54:16.637Z","avatar_url":"https://github.com/opencog.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"Sensory Atomese\n===============\nThis repo explores how perception and action within an external world\nmight work with the [AtomSpace](https://github.com/opencog/atomspace).\n\nTL;DR: Explores philosophical approaches to perception \u0026 action via\nactual, working code using low-level AtomSpace sensory I/O Atoms.\nA software conception of \"basal cognition\".\nThe experimental lab for this is \"perceiving\" filesystem files,\n\"moving\" through directories, and likewise for IRC chat streams.\n\nThis is part of a group of inter-related research projects:\n* [Agents](https://github.com/opencog/agents), which aims to make use\n  of the structures developed here.\n* [Motor](https://github.com/opencog/motor), which looks at the issues\n  raised below from a different perspective, perhaps simplifying the\n  problem into a more tractable and practical form.\n\nMotivation\n----------\nOpenCog has repeatedly attempted to build \"embodied AI systems\", by\nattaching symbolic and probabilistic AI reasoning systems to assorted\nrobot platforms. 
These include:\n* The [Hanson Robotics](https://www.hansonrobotics.com/) Sophia robot,\n  via [ROS and Blender](https://github.com/opencog/docker/tree/master/indigo).\n* [Minecraft](https://www.minecraft.net), via MineRL and Malmo, with the\n  Rational OpenCog Controlled Agent,\n  [ROCCA](https://github.com/opencog/rocca).\n* The Unity game engine, to create\n  [pet](https://www.youtube.com/watch?v=FEmpGRLwbqE)\n  [dog](https://www.youtube.com/watch?v=vZtnjKcrdZQ)\n  [avatars](https://www.youtube.com/watch?v=of-BahzS8qQ)\n  that\n  [talk](https://www.youtube.com/watch?v=ii-qdubNsx0).\n\nThese are the major efforts; there were half-a-dozen lesser efforts,\nincluding a soccer-playing robothon, held in Ethiopia. All of these\nfailed, although all were educational: the robothon was explicitly\narranged by students in Ethiopia as a part of university coursework.\n\nAll of these failed, in large part because not enough effort was\nput into understanding sensing and motion, and far too much on\nreasoning, planning and language. The ideas of sensing and movement\nseem trivial and obvious, and are implemented with brute-force hackery.\nNot worthy of intellectual effort, in contrast to the veneration\ngiven to reasoning and logic.  This is a fundamental mistake.\n\n\nPhilosophical Overview\n----------------------\nThe issue for any agent is being able to perceive the environment that\nit is in, and then being able to interact with this environment.\n\nThe goal in this project is to find a theory of a minimum viable\nlow-level sensory API. 
Having such a theory would clarify how learning\nsystems might learn how to use it, or, more generally, how to\n\"use things\", how to have a cause-and-effect in the universe.\n\nFor OpenCog, and, specifically, for OpenCog Atomese, all interaction,\nknowledge, data and reasoning is represented with and performed by\nAtoms (stored in a hypergraph database) and Values (transient data\nflowing through a network defined by Atoms).\n\nIt is not hard to generate Atoms, flow some Values around, and perform\nsome action (control a robot, say some words). The hard part is to\n(conceptually, philosophically) understand how an agent can represent\nthe external world (with Atoms and Values), and how it should go about\ndoing things. The agent needs to perceive the environment in some way\nthat results in an AtomSpace representation that is easy for agent\nsubsystems to work with. This perception must include the agent's\nawareness of the different kinds of actions it might be able to perform\nupon interacting with the external world.  That is, before one can\nattempt to model the external world, one must be able to model one's own\nability to perform action. This is where the boundary between the agent\nand the external world lies: the boundary of the controllable.\n\nTraditional human conceptions of senses include sight and hearing;\ntraditional ideas of action consist of moving (robot) limbs. See, for example,\nthe git repo [opencog/vision](https://github.com/opencog/vision)\nfor OpenCV-based vision Atomese. (Note: It is at version 0.0.2)\n\nThe task being tackled here is at once much simpler and much harder:\nexploring the unix filesystem, and interacting via chat. This might\nsound easy, trivially easy, even, if you're a software developer.\nThe hard part is this: how does an agent know what a \"file\" is?\nWhat a \"directory\" is? Actions it can perform are to walk the directory\ntree; but why? Is it somehow \"fun\" for the agent to walk directories\nand look at files? 
What should it do next? Read the same file again,\nor maybe try some other file? Will the agent notice that maybe some\nfile has changed? If it notices, what should it do? What does it mean,\npersonally, to the agent, that some file changed? Should it care? Should\nit be curious?\n\nThe agent can create files. Does it notice that it has created them?\nDoes it recognize those files as works of its own making? Should it\nread them, and admire the contents? Or perform some processing on them?\nOr is this like eating excrement? What part of the \"external world\"\n(the filesystem) is perceived to be truly external, and what part is\n\"part of the agent itself\"? What does it mean to exist and operate in\na world like this? What's the fundamental nature of action and\nperception?\n\nWhen an agent \"looks at\" a file, or \"looks at\" the list of users on\na chat channel, is this an action, or a perception? Both, of course:\nthe agent must make a conscious decision to look (take an action) and\nthen, upon taking that action, sense the results (get the text in the\nfile or the chat text). After this, it must \"perceive\" the results:\nfigure out what they \"mean\".\n\nThese are the questions that seem to matter, for agent design. The code\nin this git repo is some extremely low-level, crude Atomese interfaces\nthat try to expose these issues up into the AtomSpace.\n\nCurrently, four interfaces are being explored: a basic text-terminal,\na single-file reader/writer, a unix filesystem navigator, and an IRC\nchat interface. Hopefully, this is broad enough to expose some of the\ndesign issues. Basically, chat is not like a filesystem: there is a\nlarge variety of IRC commands, there are public channels, there are\nprivate conversations. They are bi-directional.  
The kind of sensory\ninformation coming from chat is just different than the sensory\ninformation coming from files (even though, as a clever software\nengineer, one could map chat I/O to a filesystem-style interface.)\nThe point here is not to be \"clever\", but to design action-perception\ncorrectly.  Trying to support very different kinds of sensorimotor\nsystems keeps us honest.\n\nTyped Pipes and Data Processing Networks\n----------------------------------------\nIn unix, there is the conception of a \"pipe\", having two endpoints. A\npair of unix processes can communicate \"data\" across a pipe, merely by\nopening each endpoint, and reading/writing data to it. Internet sockets\nare a special case of pipes, where the connected processes are running\non different computers somewhere on the internet.\n\nUnix pipes are not typed: there is no a priori way of knowing what kind\nof data might come flowing down them. Could be anything. For a pair of\nprocesses to communicate, they must agree on the message set passing\nthrough the pipe. The current solution to this is the IETF RFCs, which\nare a rich collection of human-readable documents describing datastream\nformats at the meta level. In a few rare cases, one can get a\nmachine-readable description of the data. An example of this is the DTD, the\n[Document Type Definition](https://en.wikipedia.org/wiki/Document_type_definition),\nwhich is used by web browsers to figure out what kind of HTML is being\ndelivered (although the DTD is meant to be general enough for \"any\" use.)\nOther examples include [Interface Description\nLanguages](https://en.wikipedia.org/wiki/Interface_description_language),\nthe X.500 and LDAP schemas, as well as SNMP.\n\nHowever, there is no generic way of asking a pipe \"hey mister pipe, what\nare you? 
What kind of data passes over you?\" or \"how do I communicate\nwith whatever is at the other end of this pipe?\" Usually, these\nquestions are resolved by some sort of hand-shaking and negotiation\nwhen two parties connect.\n\nThe experiment being done here, in this git repo, in this code-base, is\nto assign a type to a pipe. This replaces the earliest stages of\nprotocol negotiation: if a system wishes only to connect to pipes of type\n`FOO`, then it can find out what is available by examining (\"looking\nat\") the pipe description. The pipe description is a disjoint list\nof connector types: types that characterize how a connection can be\nmade.  If this list is\n`BAR+ or FOO+ or BLITZ+`, then we're good: the `or` is a disjunctive-or,\na menu choice of what is being served on that pipe. Upon opening that\npipe, some additional data descriptors might be served up, again in the\nform of a menu choice. If the communicating processes wish to exchange\ntext data, they eventually find `TEXT-` and `TEXT+`, which are two\nconnectors stating \"I'll send you text data\" and \"That's great, because\nI can receive text data\".\n\nSo far, so good. This is just plain-old ordinary computer science, so\nfar. The twist is that these data descriptors are being written as Link\nGrammar (LG) connector types. Link Grammar is a language parser: given a\ncollection of \"words\", to which a collection of connectors are attached,\nthe parser can connect up the connectors to create \"links\". The linkages\nare such that the endpoints always agree as to the type of the\nconnector.\n\nThe twist of using Link Grammar to create linkages changes the focus\nfrom pair-wise, peer-to-peer connections, to a more global network\nconnection perspective. 
A linkage is possible, only if all of the\nconnectors are connected, only if they are connected in a way that\npreserves the connector types (so that the two endpoints can actually\ntalk to one-another.)\n\nThis kind of capability is not needed for the Internet, or for\npeer-to-peer networks, which is why you don't see this \"in real life\".\nThat's because humans and sysadmins and software developers are smart\nenough to figure out how to connect what to what, and corporate\nexecutives can say \"make it so\". However, machine agents and \"bots\" are\nnot this smart.\n\nSo the aim of this project is to create a sensory-motor system, which\nself-describes using Link Grammar-style disjuncts. Each \"external world\"\n(the unix filesystem, IRC chat, a webcam or microphone, etc.) exposes\na collection of connectors that describe the data coming from that\nsensor (text, images ...) and a collection of connectors that describe\nthe actions that can be taken (move, open, ...) These connector-sets\nare \"affordances\" to the external world: they describe how an agent can\nwork with the sensori-motor interface to \"do things\" in the external\nworld.\n\nWiring\n------\nThe words \"wiring\", \"wire up\" and \"connect up\" are being used\nself-consciously.  This is what one does for electrical and electronic\ncircuits; but it is also what is done for plumbing, say, for a chemical\nrefining plant. One distinguishes the devices connected up, from the\ncurrent flowing on the wires. The current itself has properties:\ndifferent voltages, or, in the case of chemical processing, different\nsubstances.\n\nWiring is conventionally done by specifying netlists: lists of what is\nconnected to what, organized by type of wire. There's a multi-billion\ndollar industry dealing in\n[Electronic Design Automation (EDA)](https://en.wikipedia.org/wiki/Electronic_design_automation)\ntools. 
Prominent programming languages include\n[VHDL](https://en.wikipedia.org/wiki/VHDL) and\n[Verilog](https://en.wikipedia.org/wiki/Verilog) for digital circuits,\nand\n[SPICE](https://en.wikipedia.org/wiki/SPICE) for analog circuit design.\nVerilog has been adapted for mixed-signal design, and also for\n[synthetic biological circuits](https://en.wikipedia.org/wiki/Synthetic_biological_circuit),\nand so is relatively generic.\n\nThis project is explicitly performing a kind of wiring, but it is using\nAtomese, not Verilog, for that wiring. Could this project be done in\nVerilog? Possibly. When Atomese was being invented, this wasn't\nforeseen.\n\nThere is another example where \"wiring\" is commonly done: in\n[compilers](https://en.wikipedia.org/wiki/Compiler). A compiler takes a\nhigh-level language (C, C++, Java) and converts it to\n[assembly code](https://en.wikipedia.org/wiki/Assembly_language).\nPart of the magic of doing this is to describe CPU hardware, and\nspecifically the assembly instructions, as if they were electronic\ndevices, having inputs and generating outputs. The assembly instructions\nare then \"wired up\", so that data flows correctly through them. Thus, for\nexample, the ADD instruction has two inputs, one output, all of which\nare registers; the result must go into the register that the next\ninstruction is expecting, or must be routed to memory.\n\nExamples of such data-flow descriptions include gcc's\n[Register Transfer Language (RTL)](https://en.wikipedia.org/wiki/Register_transfer_language)\nand [GIMPLE](https://gcc.gnu.org/onlinedocs/gccint/GIMPLE.html).\nThere are many more: the idea of an\n[intermediate representation (IR)](https://en.wikipedia.org/wiki/Intermediate_representation)\nlanguage is generic in programming language design.\n\nJust like Verilog is intended for electronic circuits, the IR languages\nare intended for programming languages. Neither of these generalize very\nwell to domains outside of their original specification. 
Atomese is\nattempting to be the superset or generalization of all of these\ndifferent approaches to wiring. It is trying to capture the generic\nabstraction of \"what is wiring\" and \"what does it mean to hook things\ntogether\".\n\nTwo side-comments: it is not an accident that Atomese resembles GCC's\nRTL. Both capture something fundamental about wiring. There is also\nanother tickling analogy: Atomese Values are meant to be transient\nchanging things, while Atoms in the AtomSpace are meant to be static,\nwith the AtomSpace a repository, a database for these Atoms. This\nresembles the relationship between CPU registers, where computation takes\nplace, and system RAM, which holds and \"remembers\" stuff. One of the\nbasic rewrite rules in Atomese is the one that moves Values into Atoms,\nstoring them in the AtomSpace, and vice-versa, streams Atoms out of the\nAtomSpace, and into Values.\n\nSome generalizations and generalities: in chemistry, one has literal\natoms, which, like tinker toys or jigsaw-puzzle pieces, can hook up one\nto the other, to form complex molecules. The process is recursive:\ncertain complex molecules, such as DNA and proteins, can further hook up\nand participate in processing. The idea of hooking up and connecting is\npervasive. It's in chemistry certainly, and obviously in mechanical\nlinkages. One builds high-rises by connecting steel girders.\n\nOne of the sub-goals of this project is to understand the generic,\nmathematical, philosophical nature of what it means to \"hook things up\",\nin full abstractness.\n\n\nAuto-wiring and theorem proving\n-------------------------------\nThe goal of describing system blocks with some DTD or IDL language,\nand then hooking them up \"automatically\", is not new. 
Four approaches\nare worth mentioning.\n\n* [ProLog](https://en.wikipedia.org/wiki/Prolog) and\n  [Answer Set Programming (ASP)](https://en.wikipedia.org/wiki/Answer_set_programming).\n  In these systems, one asserts a collection of facts/assertions, and\n  then says \"let's go\". Prolog uses a chaining algorithm, and ASP uses\n  a [SAT solver](https://en.wikipedia.org/wiki/SAT_solver) to determine\n  a \"solution\" satisfying the constraints embodied in the facts. One\n  does _not_ get back the actual inference chain that was used. For\n  this project, we want to know *how* things got hooked together.\n  We're not interested in a true/false satisfiability answer, but\n  rather in finding out how to assemble a processing chain.\n\n* [Automated theorem provers](https://en.wikipedia.org/wiki/Automated_theorem_proving).\n  These are systems that, given a set of facts/assertions, together\n  with a collection of inference rules, provide an actual inference\n  chain, aka a \"proof\", that explicitly gives the steps from the\n  hypothesis (the input claims) to the result (that the input claims\n  are consistent and simultaneously satisfiable.) This gets closer to\n  what we want: we want the chain. However, unlike theorem provers,\n  we're not particularly interested in satisfiability, other than to\n  find out if a partially-assembled pipeline still has some missing\n  end-points. This is why we work with sheaves or jigsaws: we want to\n  know not only how to assemble the jigsaw pieces, but to also know\n  what the remaining, open (unconnected) connectors are. 
Theorem\n  provers do not provide such info.\n\n* The [SOAR Cognitive Architecture](https://en.wikipedia.org/wiki/Soar_%28cognitive_architecture%29).\n  This is a production system that applies production rules to state.\n  This captures a different aspect of what we want to do here: we want\n  to have data, and to apply rules to that data to transform it.\n  However, our data is not so much \"state\" as it is a \"stream\": think\n  of an audio or video stream. SOAR selects a rule (analogous to an\n  inference rule, in theorem proving) and applies it immediately, to\n  mutate the state. By contrast, we want to think of inference rules as\n  jigsaw pieces: how can they be assembled? Once assembled, then these\n  rules can be applied to not just \"state\" in a single-shot fashion,\n  but repeatedly, to a flowing stream of data (say, video, to find all\n  cats in the video).\n\n* [Programming language compilers](https://en.wikipedia.org/wiki/Compiler).\n  These are able to take high-level specifications and convert them\n  into an equivalent program written in\n  [assembly language](https://en.wikipedia.org/wiki/Assembly_language).\n  The assembly instructions can be thought of as jigsaws, and when they\n  are assembled, the mating rules must be closely followed: the output\n  registers in one instruction must attach to the inputs of another.\n  The resulting program has to be runnable, executable. This is exactly\n  what we want here: an assembly of interconnected jigsaws. However,\n  unlike a compiler, our \"assembly language\" consists of various\n  abstract processing components: filters, transforms, etc. It doesn't\n  run on a (real or virtual) CPU, but on an abstract machine. It is not\n  registers and RAM that the instructions/jigsaws act on, but\n  sensory data (again: think video/audio).  Worse yet, the instructions\n  aren't even fixed: new ones might get invented at any time. These\n  might supplement or replace old ones. 
Compilers also want a program,\n  written in a high-level language, as input. In this project, we won't\n  have such a program; our situation is closer to SOAR or ProLog or\n  theorem provers: we have a collection of jigsaws (instructions) to\n  assemble, but no high-level program to specify that assembly.\n\nThe above systems solve some of the aspects of what we want to do here,\nbut only some of them, and not in the format that we actually need.\nThe above should give a flavor of why we're embarked on the crazy\njourney we're on. No one else does this.\n\n\nAutonomous Agents\n-----------------\nThe sensori-motor system is just an interface. In between must lie a\nbunch of data-processing nodes that take \"inputs\" and convert them to\n\"outputs\". There are several ways to do this. One is to hand-code,\nhard-code these connections, to create a \"stimulus-response\" (SRAI)\ntype system. For each input (stimulus) some processing happens,\nand then some output is generated (response). A second way is to create\na dictionary of processing elements, each of which can take assorted\ninputs or outputs, defined by connector types. Link Grammar can then be\nused to obtain valid linkages between them. This approach resembles\nelectronics design automation (EDA): there is a dictionary of parts\n(resistors, capacitors, coils, transistors ... op amps, filters, ...)\neach able to take different kinds of connections. With guidance from the\n(human) user, the EDA tool selects parts from the dictionary, and hooks\nthem up in valid circuits. Here, Link Grammar takes the role of the EDA\ntool, generating only valid linkages. The (human) user still has to\nselect the \"LG words\" or \"EDA parts\", but LG/EDA does the rest,\ngenerating a \"netlist\" (in the case of EDA) or a \"linkage\" (in the case\nof LG).\n\n(Footnote:\n[Stimulus-response](https://en.wikipedia.org/wiki/Stimulus%E2%80%93response_model)\nis an old concept in psychology, dating back to Pavlov. 
SRAI is the name\ngiven to the rules used by the\n[AIML](https://en.wikipedia.org/wiki/Artificial_Intelligence_Markup_Language)\nchatbot language. The terminology is not accidental.)\n\nWhat if there is no human to guide parts selection and circuit design?\nYou can't just give an EDA tool a BOM (Bill of Materials) and say\n\"design some random circuit out of these parts\". Well, actually, you\ncan, if you use some sort of evolutionary programming system. Such\nsystems (e.g. [as-moses](https://github.com/opencog/as-moses)) are able\nto generate random trees, and then select the best/fittest ones for some\ngiven purpose. A collection of such trees is called a \"random forest\" or\n\"decision tree forest\", and, until a few years ago, random forests were\ncompetitive in the machine-learning industry, equaling the performance\nseen in deep-learning neural nets (DLNN).\n\n(Footnote: A collection of items, each given a score, is termed an\n[\"ensemble\"](https://en.wikipedia.org/wiki/Ensemble_(mathematical_physics))\nin statistical physics. Thus, a\n[random forest](https://en.wikipedia.org/wiki/Random_forest),\nwhere each tree is assigned a real-number fitness score, is an\nensemble of trees.  Ensembles are described by\n[partition functions](https://en.wikipedia.org/wiki/Partition_function_(mathematics)):\nroughly, functions that tell you how many items there are having a given\nscore. Partition functions tend to have a Gaussian (Bell curve)\ndistribution, with the mean of the curve given by an\n[Action](https://en.wikipedia.org/wiki/Action_principles).\nThis last is highly technical: the action describes not only the mean\nbut also how neighboring items are related.  
When the ensemble forms a\ncontinuum, the action can be used to obtain equations of motion; these\nare the\n[Hamilton-Jacobi equations](https://en.wikipedia.org/wiki/Hamilton%E2%80%93Jacobi_equation).\nThere is a very rich theory surrounding these, everything from geodesics\non Riemann surfaces to quantum field theory, with applications ranging\nfrom chemical reaction rates to petroleum exploration to\nmeasure-preserving dynamical systems. It is no accident that Karl Friston\nproposes free energy as the fundamental underlying theoretical principle\nthat can be used to understand AGI or general intelligence.  The\nensemble is a powerful concept, and when coupled to a fitness score and\nthe accompanying mathematical apparatus of Gibbs free energy and\nBoltzmann distributions, it appears to be pervasive. The issue here is\nthat we are still grasping at basic principles: the AtomSpace and Atomese\nallow for ensembles of representations of \"mechanical parts\" to be\ncreated, but it is not yet clear how to score them, how to create the\nIsing-like model of interacting, self-assembling components. We are\nstill very far away from being able to write down a generic action for\nAtomese.)\n\nDeep learning now outperforms random forests. Can we (how can we) attach\na DLNN system to the sensori-motor system being described here? Should\nwe, or is this a bad idea? Let's review the situation.\n\n* Yes, maybe hooking up DLNN to the sensory system here is a stupid\n  idea. Maybe it's just technically ill-founded, and there are easier\n  ways of doing this. But I don't know; that's why I'm doing these\n  experiments.\n\n* Has anyone ever built a DLNN for electronic circuit design? That is,\n  taken a training corpus of a million different circuit designs\n  (netlists), and created a new system that will generate new\n  electronic circuits for you? I dunno. Maybe.\n\n* Has anyone done this for software? 
Yes, GPT-4 (and I guess GitHub\n  Copilot) is capable of writing short sequences of valid software to\n  accomplish various tasks.\n\n* How should one think about \"training\"? I like to think of LLMs as\n  high-resolution photo-realistic snapshots of human language. What you\n  \"see\" when you interact with GPT-2 are very detailed models of things\n  that humans have written, things in the training set. What you see\n  in GPT-4 are not just the surface text-strings, but a layer or two\n  deeper into the structure, resembling human reasoning. That is, GPT-2\n  captures base natural language syntax (as a start), plus entities and\n  entity relationships and entity attributes (one layer down, past\n  surface syntax.) GPT-4 goes down one more layer, adequately capturing\n  some types of human reasoning (e.g. analogical reasoning about\n  entities). No doubt, GPT-5 will do an even better job of emulating\n  the kinds of human reasoning seen in the training corpus.  Is it\n  \"just emulating\" or is it \"actually doing\"? This is where the\n  industry experts debate, and I will ignore this debate.\n\n* DLNN training is a force-feeding of the training corpus down the\n  gullet of the network. Given some wiring diagram for the DLNN,\n  carefully crafted by human beings to have some specific number of\n  attention heads of a certain width, located at some certain depth,\n  maybe in several places, the training corpus is forced through the\n  circuit, arriving at a weight matrix via gradient descent. Just like a\n  human engineer designs an electronic circuit, so a human engineer\n  designs the network to be trained (using TensorFlow, or whatever).\n\nThe proposal here is to \"learn by walking around\". A decade ago, the MIT\nRobotics Lab (and others) demoed randomly-constructed virtual robots\nthat, starting from nothing, learned how to walk, run, climb, jump,\nnavigate obstacles. 
The training here is \"learning by doing\", rather\nthan \"here's a training corpus of digitized humans/animals walking,\nrunning, climbing, jumping\". There's no corpus of moves to emulate;\nthere's no single-shot learning of dance-steps from youtube videos.\nThe robots stumble around in an environment, until they figure out\nhow things work, how to get stuff done.\n\nThe proposal here is to do \"the same thing\", but instead of doing it\nin some 3D landscape (Gazebo, Stage/Player, Minecraft...) to instead\ndo it in a generic sensori-motor landscape.\n\nThus, the question becomes: \"What is a generic sensori-motor landscape?\"\nand \"how does a learning system interface to such a thing?\" This git\nrepo is my best attempt to try to understand these two questions, and to\nfind answers to them. Apologies if the current state is underwhelming.\n\n\nSelf-observing systems\n----------------------\nPerception need not be limited to \"the external world\"; one may also\nobserve oneself. Action need not be limited to the movement of limbs; it\ncan also be a control over one's own thoughts. Imagine the case of\n\"stewing in one's own juices\": a dreamlike state, where you ruminate\nover old memories of past events. As one does so, one selects, picks and\nchooses: fond memories are noted for their emotional content, are given\nfurther thought; boring memories stay unexamined, unless forced.\n\nThe perception-action system described above can be aimed not just at\nthe external world, but also at internal state. Imagine, for example, a\ntrained LSTM or maybe an LLM that is placed into a \"dream state\", where\nit generates a sequence of free-association outputs. An agent can watch\nover this stream, and, in response to certain outputs, it can \"prompt\"\nthe system, to guide further \"reminiscences\". More directly, it can\n\"choose\" to force the system to concentrate on a specific topic. To\n\"focus one's thoughts\". 
This kind of hierarchical layering, where an\nagent steers the thoughts of an underlying system, results in a form of\nself-awareness and self-control.\n\nThus, in addition to the previously-described perception-action agents\n(e.g. traversing the file system or interacting via chat) one can build\nan agent that monitors the state of a neural network, and then controls\nit, via prompts, or perhaps much more directly with gates (tanh/sigmoid\nblending.)\n\n\nRelated ideas\n-------------\nA distantly related set of ideas can be found in the [SOAR Cognitive\nArchitecture](https://en.wikipedia.org/wiki/Soar_%28cognitive_architecture%29)\nfrom Laird, Newell \u0026 Rosenbloom. Examples of SOAR agents can be\nseen in the [SOAR Agent github repo](https://github.com/SoarGroup/Agents/).\nSOAR is a production rule system, and so resembles the AtomSpace query\nsubsystem. The manner in which SOAR rules are applied resembles the\n[OpenCog Unified Rule Engine (URE)](https://github.com/opencog/ure)\n(which is unsupported and now deprecated) and also\n[OpenCog Probabilistic Logic Networks (PLN)](https://github.com/opencog/pln)\n(also unsupported \u0026 deprecated).\n\nThere are many differences between SOAR and Atomese:\n* SOAR production rules are written in ASCII (in the SOAR language), and\n  are stored in flat files.  AtomSpace production rules are written in\n  Atomese, and stored in the AtomSpace.\n* SOAR production rules are applied to \"state\", and work within a\n  \"context\". This is similar to the AtomSpace, which can store state.\n  However, the AtomSpace can also store the rules themselves (as\n  \"state\") and the rules can be applied to external data streams\n  (e.g. audio, video).\n* SOAR production rules are crafted by humans, encoding knowledge.\n  Atomese rules are meant to be algorithmically generated and\n  assembled.\n* SOAR state mutation is boolean-valued: either something is done, or\n  it isn't. 
There does not appear to be any concept of Bayesian\n  possible-worlds.\n* SOAR appears to use a very simplistic forward-chaining approach to\n  inference. For any given SOAR state, a collection of possible rules\n  is determined. Of these, one rule is selected, and then it is\n  applied to mutate the state. It has long been recognized that other\n  kinds of chaining are interesting: not just forward chaining, but also\n  backwards. One might not just chain (as in Prolog), but ask for\n  constraint satisfaction (as in ASP and automated theorem provers).\n* The OpenCog URE and PLN elements are capable of performing chaining.\n  They are currently deprecated/obsolescent, for a variety of technical\n  and philosophical reasons.\n\nA nice, quick \u0026 easy overview of SOAR can be found here:\n\"[An Introduction to the Soar Cognitive\nArchitecture](https://acs.ist.psu.edu/ist597/pst-soar%20v14.2.pdf)\",\nTony Kalus and Frank Ritter (2010)\n\n### CGW Wires\nThere was a much earlier attempt at wiring with Atomese, from 2008,\ntermed \"Cog Graphical Wires\" (CGW). It never went anywhere. Right\nidea, wrong time.  The description still sounds sexy, and mirrors\nthe above. It can be found in the now-deleted directory\n[CGW Wires](https://github.com/opencog/atomspace/tree/3f58a2cdd7891da074ee48bd517c7f656ff12b14/opencog/scm/wires).\nThat code was inspired by a paper:\n* ''The Art of the Propagator'', Alexey Radul; Gerald Jay Sussman,\n  MIT Technical Report MIT-CSAIL-TR-2009-002\n  http://dspace.mit.edu/handle/1721.1/44215\n\nI haven't read that paper in over a decade. Perhaps it has some gems.\n\n\n### Status\n***Version 0.3.1*** -- Experimental. Basic demos actually work. Overall\nlow-level parts of the architecture and implementation seem ok-ish. The\nupper-level parts have not yet been designed. 
The grand questions above\nremain mysterious, but are starting to clarify.\n\nProvides:\n* Basic interactive terminal I/O stream.\n* Basic File I/O stream.\n* Prototype Filesystem navigation stream.\n* Prototype IRC chatbot stream.\n\nThe [Architecture Overview](Architecture.md) provides a more detailed\nand specific description of how the system is supposed to look, and\nhow it is to work, when it gets farther along. The\n[Design Diary](Design.md) documents the thought process used to obtain\ncode that actually works and does what it needs to do.\n\nSee the [examples](examples) directory for working examples.\n\nThe [AtomSpace Bridge](https://github.com/opencog/atomspace-bridge)\nprovides an API between the AtomSpace and SQL. It almost conforms to\nthe system design here, but not quite. It should be ported over to\nthe interfaces here.\n\nThe [Vision subsystem](https://github.com/opencog/vision) provides\nan Atomese API for OpenCV. It is a proof-of-concept. It should be\nported over to the interfaces here.\n\n### Design Overview\nThe Atomese agent framework needs to have some way of interacting\nwith its environment. Obviously, reading, writing, seeing, hearing.\nMore narrowly: the ability to read a text file in the local file system.\nThe ability to read directory contents, to move through directories.\nThe ability to behave as a chatbot, e.g. on IRC, but also as a\njavascript chatbot running in a web-page. Also possibly running\nfree on twitter, discord, youtube.\n\nThe goal here is to prove the very lowest layers, just the glue,\nto convert that stuff into Atomese Atoms that higher-layer Atomese\nagents make use of to communicate with, interact with the external\nworld.\n\nThe [Architecture Overview](Architecture.md) provides a detailed\ndescription of how this can work.  
A general overview can be found\nin the AGI 2022 paper:\n[Purely Symbolic Induction of Structure](https://github.com/opencog/learn/tree/master/learn-lang-diary/agi-2022/grammar-induction.pdf).\n\nGeneral system architecture is discussed in a number of places,\nincluding the various PDF and LyX files located at:\n* [AtomSpace Sheaves \u0026 Graphs](https://github.com/opencog/atomspace/tree/master/opencog/sheaf)\n* [OpenCog Learn Project](https://github.com/opencog/learn) and\n  especially the \"diary\" subdirectory there.\n\nSee also:\n* [Atomese Agents Project](https://github.com/opencog/agents). This is\n  in the pre-prototype phase, but is the current focus of attention.\n\n### Design specifics\nDetails of the design in this git repo are explored in several places:\n\n* [Architecture](Architecture.md) -- Architecture overview.\n* [Design Overview](Design.md) -- Current design \u0026 TODO List.\n* [IRChatStream](opencog/atoms/irc/README.md) -- IRC chat design.\n* [TextFileStream](opencog/atoms/filedir/README.md) -- Directory navigation design.\n* [TerminalStream](opencog/atoms/terminal/README.md) -- Interactive terminal design.\n\n### Build and Install\nThis git repo follows the same directory structure and coding\nconventions used in other OpenCog/AtomSpace projects. This cannot be\ncompiled before installing the prerequisite\n[AtomSpace](https://github.com/opencog/atomspace). So build and\ninstall that first.\n\nThen:\n```\nmkdir build; cd build; cmake ..\nmake -j\nsudo make install\n```\n\n### Examples\nSee the [examples](examples) directory. The simplest example is for\npinging text between two xterms. Other examples include opening,\nreading \u0026 writing a single text file, navigating the file system,\nand a basic IRC echobot.\n\nIt will probably be useful to read the\n[Architecture Overview](Architecture.md) first.\n\n***Important*** All of this is pre-alpha! 
These examples are too\nlow-level; the intent is to eventually automate the process for hooking\nup sensors to motors. Basic design work continues. But for now, these\nshow some of the low-level infrastructure; the high-level stuff is still\nmissing.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopencog%2Fsensory","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fopencog%2Fsensory","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopencog%2Fsensory/lists"}