gB:Channel Theory

Collection	GlossariumBITri
Author	Julio Ostalé
Editor	Julio Ostalé
Year	2010
Volume	1
Number	1
ID	86
Object type	Theory
Domain	Computer Science Logic Semantics
es	teoría de canales
fr	théorie des canaux
de	Kanaltheorie

Channel Theory (also known as the Theory of Information Channels, the Theory of Information Flow or simply IF-Theory) is a logico-mathematical theory that models the flow of information among components of a so-called “distributed system”. Barwise and Seligman (1997) is the standard source. There are previous versions of the theory that are acknowledged by the same name; in the last section we will deal with that problem.

Formulation of the subject matter

There is a fundamental question that channel theory tries to answer: “How is it possible for one thing to carry information about another?” (Barwise and Seligman 1997: xi). Since entities convey information about each other as far as they are classified by abstract states, and moreover the conveyed information depends also on certain background of connections (between things) and regularities (between abstract states), any answer to a particular instance of the previous question has to fit the following scheme (Barwise and Seligman 1997: 13).

Information report:

The fact that a is in the abstract state F carries the information that b is in the abstract state G with respect to certain relationships that link a and b on the one hand, F and G on the other.

It does not matter what a, b, F, G are. It might be the case that a, b are objects and F, G are properties (as in monary predicate logic); perhaps a, b are situations whereas F, G are situation types (as in situation theory); maybe a, b are different instants a system goes by, while F, G are system descriptions in the form of tuples consisting of numbers (as in mathematical modelling). The point is that every part of a distributed system consists of a collection of tokens {a₁, a₂,...} as well as a collection of types {F₁, F₂,...}; both collections relate to each other by means of a classificatory relation, giving rise to items of the form “a is F”.

This account of information reports goes back to Dretske (1981). It was partially developed in the theory of situations of Barwise and Perry (1983), which Devlin (1991) updates. In situation theory, regularities betweenF and G were studied under the name of “constraints”, but physical connections between a and b were hardly taken into account. Restrictions serve to explain appropriately the relativity of information flow, while the combination of restrictions and connections seems to be the key to understand his fallibility.

Information flow in a distributed system

Even though information is not defined, it is assumed as something that “flows” among the components of a system. Such components may be distant from one another in time and space; furthermore, they can be very different one to each other. That is why it is said that the system is “distributed” (in computer science this term has another meaning). Example: all the noticeboards, travel tickets and trains that make up a railway network form together a distributed system.

There are systematic correlations among components in every distributed system. They are “regularities” that support the system's information flow, which in turn can be modelled by different theoretical constructs we call “information channels”.

There are four principles of information flow. They lead the mathematical development of the theory.

Information flow results from regularities in a distributed system.
Information flow crucially involves both types and their particulars.
It is by virtue of regularities among connections that information about some components of a distributed system carries information about other components.
The regularities of a given distributed system are relative to its analysis in terms of information channels.

Let us see how to formalize the concepts of distributed system and information channel in such a way that they match the above four principles.

Information channels

Parts of a distributed system are modelled as classifications. A classification A is a structure (A,T,R) where A and T are non-empty sets of tokens and types respectively, and R is a relation from A to T. There might be tokens classified by several types, as well as types that classify several tokens. If a is in A and t is inT, then eRt means that a is of type t.

A classification provides the vocabulary (via T) and the context (via R) whereby it is possible to speak about each component of the system. Typically, different tokens of a classification can be seen as the same physical system across different time points; types would be state descriptions of the system.

Two classifications A₁=(A₁,T₁,R₁) and A₂=(A₂,T₂,R₂) can be related one to another by means of an infomorphism f from A₁ to A₂, where A₁ is the domain and A₂ the codomain of f. Intuitively, an infomorphism is a “part-to-whole” informational relationship. It is built up by two functions f = (f⁺,f^–) that go in opposite directions (see diagram) and fulfill the following condition: f^–(a)R₁t if and only if aR₂f⁺(t) for all a in A₂ and t in T₁. This implies that the image of type t says in A₂ what t says in A₁.

(1)

true

Vertical lines represent classificatory relations; horizontal arrows are functions. Since the direction of f⁺determines that of f we can also write:

(2)

true

We do not consider subscripts in R₁ and R₂ whenever it does not give rise to misunderstanding. In the diagrams we can do without the expressions R₁ and R₂.

Barwise and Seligman (1997: 34, 76) define an information channel as a collection of infomorphisms that share the same codomain. We can also say that a channel consists of a set {A₁,...,A_n} of classifications that represent the parts of the distributed system, a classification C (the core) that represents the system as a hole, and a set of infomorphisms {f₁,...,f_n} that go from each of the parts onto C. Classifications in {A₁,...,A_n} can be repeated. Tokens in C are the connections of the system: of every c in C it is said that it connects the tokens to which c is related by means of {f₁,...,f_n}. Parts {A₁,...,A_n} inform about each other as long as they all are parts of C.

Every channel models those conditions that make the information flow possible in a distributed system, which in turn can be modelled by different information channels. A distributed system D is a collection of elements informing about each other. Formally, D consists of an indexed class cla(D) of classifications together with a class inf(D) of infomorphisms whose domains and codomains are all in cla(D).

An information channel K covers a distributed system D if and only if cla(D) are the classifications of the channel and for every infomorphism f in inf(D) there are infomorphisms from both the domain and codomain of f to the core of K such that the diagram formed by these three infomorphisms commutes. The underlying idea is that all classifications in the distributed system are informational parts of the core whose channel covers the system. In Barwise and Seligman (1997: 89-97) it is shown how to construct an information channel out of a distributed system.

An information channel with four components could be e.g. a flashlight of which we consider the bulb (A₁), the switch (A₂), the batteries (A₃) and the case (A₄). The corresponding diagram:

(3)

true

Information flows across the channel: switch being ON and battery being charged inform that the bulb is lite unless the case is broken; battery working properly informs that the bulb can be either lite or unlite, etc.

It is possible to simplify a channel so that it contains only two classifications and one infomorphism. In order to do that we get together its parts A₁, A₂, A₃... in a sole classification by means of a “sum” that generates the classification [A₁+A₂+A₃...] where all the information that the parts of the channel previously contained separately is now contained in a single classification. In the case of a channel with two parts:

(4)

true

Tokens of [A₁+A₂] are ordered pairs that combine all the tokens in A₁ and A₂. The type set of [A₁+A₂] is the disjoint union generated by types in A₁ and A₂. A token is of certain type if any of its components was of that type. Infomorphisms from the parts to the sum and from the sum to the core are defined so that the diagram commutes.

Information flow: the ideal case

Information channels tell us why the information flows within a distributed system; which are the conditions of possibility of information. The logical apparatus we present in this section and the next one is suitable for studying how that information flows.

Every classification A is equipped with its own “theory”, namely the class of regularities among types that are supported by the tokens. How to formalize this idea of regularity that depends on the idea of classification? IfT₁, T₂ are subsets of T, then a token a of A satisfies the pair (T₁,T₂) if and only if aRt for all t in T₁ implies aRt for some t in T₂. Every pair (T₁,T₂) satisfied by some token is a regularity.

The theory Th(A) generated by A is a structure (T, ⇒) consisting of the set T of types in A together with a consequence relation ⇒ comprised by all regularities in A. Given a theory, we write T₁⇒T₂ and say that T₁ implies T₂ whenever (T₁,T₂) is a regularity of the theory. Relation ⇒ obey the logical properties of identity, monotony and cut that characterize deductive inference.

Once we have the concepts of classification, theory, infomorphism and information channel, it is feasible to try out a first analysis of information flow. Let be given a channel K wherein two classifications A₁ and A₂ inform one about another in virtue of their informational memebership C. The diagram looks like this:

(5)

true

Initial proposal: Let a₁ be of type t₁ in A₁ and a₂ of type t₂ in A₂. Then a₁'s being of type t₁ in A₁ informs that a₂ is of type t₂ in A₂, relatively to the channel K, if and only if a₁ and a₂are connected through some token in C and moreoverf⁺(t₁) implies f⁺(t₂) in the teory Th(C) (Barwise and Seligman 1997: 35).

This first analysis bears in mind regularities in C instead of regularities among the parts of the system. This is because we have adopted a viewpoint external to this system, assuming as well that we are given complete information about its regularities. We have idetified that information with Th(C). But in practice it is unlikely, if not impossible, that we know all these regularities. That's why it is convenient to revise the previous analysis: we have to assume an internal viewpoint with respect to the system, wherein we are so to speak considering just a part of the system; from observation of that part -together with our incomplete and fallible knowledge of the system as a whole- we have to extract information about other parts of the system. How to do this? By means of local logics.

Information flow: the practical case

Given a classification A, its theory Th(A) is sound and complete. But we can consider logical systems associated to A that are neither sound nor complete. It is precisely what local logics are all about. Motivation for considering such systems comes from the study of situations in which we have the theory of a “proximal” classification A₁ but we want to reason about a “distal” classification A₂ from what we know about A₁. Example: we drive a car and the proximal classification consists of the speedometer, counting machine, gasoline indicator, and so on, whereas the distal classification is the engine.

(6)

true

In general, for all infomorphism f from A to B there are two rules Intro-f and Elim-f for moving regularities from A to B and from B to A respectively. Intro-f translates T₁^–f ⇒T₂^–f in A into T₁ ⇒ T₂ in B. Elim-f translates T₁^f⇒T₂^f in B into T₁ ⇒ T₂ in A. By means of Intro-f, validity is preserved, while non-validity is not; by means of Elim-f, non-validity is preserved, while validity is not. Closer analysis of rules Intro-f and Elim-f suggests that we should generalize the concept of theory in order to cover logical systems that are possibly unsound or incomplete.

In the car example, as we apply Intro-f₁ to the theory Th(A₁) we get a consistent theory that might not be complete, and as we apply Elim-f₂ to that theory we get a third one (this time over A₂) that might be unsound or incomplete or both.

A local logic L = (A, ⇒, N) consists of a classification A, a binary relation => on type sets from A satisfying identity, monotony and cut, as well as a subset N of “normal” tokens from A satisfying all pairs (T₁,T₂) such that T₁⇒T₂. Logic L is sound if every token is normal; it is complete if for every pair (T₁,T₂) satisfied by every normal token, it is true that T₁⇒T₂. The sound and complete local logic of A is Log(A), which is but a generalization of Th(A).

If we have in the previous diagram a local logic L(A₁) associated to A₁, we can define the logic f₁[L(A₁)] generated by Intro-f₁ from L(A₁) and associated to C, as well as the logic f₂^–[f₁[L(A₁)]] generated by Elim-f₂ from the former logic and associated to A₂. The last logic might be unsound or incomplete, but it is all we have to reason about A₂ from the starting point of the local logic L(A₁). In general, it can be proved that every local logic associated to a classification is the local logic induced by some binary channel, i.e. for every classification A₂ and local logic associated to it there exists a classificationA₁ and a channel linking both classifications such that the local logic associated to A₂ is of the form f₂^–[f₁[Log(A₁)]].

Does this fact bear any relation at all with our logical model of information flow? Let us suppose there is a channel equipped with two components, as in the former diagram, but this time we do not have Log(C) for the core C. What we have is a local logic L on C that might be either unsound or incomplete. Hence C can be seen as the distant classification of a new channel whose core is C' and whose “proximal” classification O (for observer) supports the logic Log(O), which is the logic the observer uses to reason C. It turns out that L on Cequals to g₂^–1[g₁[Log(O)]]

(7)

true

As we take L instead of Th(C) we overcome the initial proposal because we assume now that our knowledge about C is incomplete and fallible, for it is the knowledge of an observer that tries to acquire information about A2from direct access to A₁. But we still have to define the flow of information on the basis of regularities taking place among parts of the system, not on the basis (as in the initial proposal) of regularities among images of types within the core of the channel. In order to move forward we have to simplify the channel by means of the sum operation until we get a channel made up by a single component A and a single infomorphism f (otherwise we should have regularities within each part of the system instead of regularities among those parts). At this point we use the rule Elim-f for getting f^–1(L), a local logic on A that happens to be the “distributed logic” or logic that codifies the information flow of the channel. In other words:

Basic proposal: Given a channel with only one infomorphism, its distributed logic is the inverse image of the local logic associated to the core (Barwise and Seligman 1997: 183).

This proposal is somehow less explicit than the previous one in that it does not mention the “information report” of the first section. However, it is obviously coherent with such a scheme. To see it you only have to work out the basic proposal having into account the concepts involved in the sum of classifications.

Fallibility in the flow of information

Whether a pair of types (T₁,T₂) is a regularity or not, with respect to a distributed system, depends on the information channel we use to model the very system (recall principle 4). And this in turn depends on the analysis we adopt, as well as the tokens and types being assumed. As soon as we fix a distributed system, changing the channel might imply a change in the regularities we are to accept, hence a change in the flow of information our logical model captures.

A way of restricting the number of regularities in a channel K is to “refine” it by means of a channel K that has the same parts as 'K yet a different core C'' that lies between C' and the parts in such a way that the following diagram commutes.

(8)

true

A straightforward case is that where functions in r are identities and C' contains more tokens than C. From this case it should be obvious that, the more refined a channel, the more reliable the information it supports, since the number of connections between tokens of different parts of the system increases. With respect to the types: by Intro-r every regularity in K is a regularity in 'K as well; now, by Elim-r not every regularity in K is a regularity in K'. This means that whenever a regularity inK fails (because of exceptions) we do not have to seek alternative logics but alternative channels.

Suppose that A₁ classifies the switch of a flashlight at different times, whereas A₂ classifies the bulb of the very same flashlight again at those times. If C does not take into account connections with the battery, it is a regularity that the image of the type “on” implies the image of the type “lite”. As soon as a flashlight with a discharged battery plays the role of a counterexample to that regularity, we do not have to seek a new logic (e.g. non-monotonic), rather we should try to define a new core C' that takes into account the battery as a relevant component of the flashlight, thus the new channel would not admit as a regularity that the image of “on” implies the image of “lite”.

Two versions of the theory

There are two versions of channel theory. The second one is a development of the first one, which in turn stems from situation theory. Both versions originate in the collaborative work of Jon Barwise and Jerry Seligman during the 1990s.

First version. The first published paper is Barwise (1992). There it is suggested that situation theory cannot explain fallibility in the information flow because it considers relationships between types of situations without paying attention to relationships between concrete situations. Such relations are introduced and the resulting model is analyzed. Barwise (1993) is a much more sophisticated exposition. Seligman (1990, 1991a, 1991b) had developed very similar ideas to those of Barwise independently. From collaboration of these two authors arise the technical paper Barwise and Seligman (1993) and the more philosophical Barwise and Seligman (1994). This version of the theory was summarized in the survey paper Moss and Seligman (1994).

Second version. The first and still standard reference is Barwise and Seligman (1997), where the previous version of the theory is reformulated in the mathematical framework of category theory, in particular the theory of Chu spaces (Barr 1979; Pratt 1999). Algebraic constructions over Chu spaces provide the semantics of the theory. Barwise (1997) investigate linkages to modal logic, whereas Barwise (1999) is an application of the theory to the study of non-monotonic reasoning. Seligman (2009), in turn, is an attempt of merging the second version of channel theory with Shannon's statistical theory of signal transmission and codification (1948).

Pérez-Montoro (2000, 2007) takes the viewpoint of information content in his comprehensive survey of Shannon, Dretske, situation theory and the first version of channel theory. Restall (2005) deals with the first version of the theory from a logical perspective. Some recent surveys of information theories, like Devlin (2001) or Bremer and Cohnitz (2004), devote a separate chapter to the second version of channel theory.

Related resources

Ontologos: http://www.ontologos.org/
The Information Flow Framework: http://suo.ieee.org/IFF/

References

BARR, M. (1979). *-Autonomous Categories, with an Appendix by Po Hsiang Chu. Lecture Notes in Mathematics 752. Heidelberg: Springer-Verlag.
BARWISE, J. (1992). “Information links in domain theory”. In: BROOKES, S. et al. (eds.). Mathematical Foundations of Programming Semantics, Lecture Notes in Computer Science 598, pp. 168–192. Berlin / Heidelberg / New York: Springer-Verlag.
BARWISE, J. (1993). “Constraints, channels and the flow of information”. In: ACZEL, P. et al. (eds.). Situation Theory and Its Applications. Volume 3, CSLI Lecture Notes 37, pp. 3–27. Stanford: CSLI Publications.
BARWISE, J. (1997). “Information and Impossibilities”. Notre Dame Journal of Formal Logic, Vol. 38(4), pp. 488–515.
BARWISE, J. (1999). “State-spaces, local logics and non-monotonicity”. In: MOSS, L. S. et al. (eds.). Logic, Language and Computation. Volume 2, CSLI Lecture Notes 96, pp. 1–20. Stanford: CSLI Publications.
BARWISE, J. & PERRY, J. (1983). Situations and Attitudes. Cambridge, MA: Bradford Books / The MIT Press.
BARWISE, J. & SELIGMAN, J. (1993). “Imperfect Information Flow”. In: VARDI, M. (ed.). Proceedings. Eight Annual IEEE Symposium on Logic in Computer Science, pp. 252–260. Montreal: IEEE Computer Society Press.
BARWISE, J. & SELIGMAN, J. (1994). “The rights and wrongs of natural regularity”. In: TOMBERLIN, J. E. (ed.). Philosophical Perspectives, 8, Logic and language, pp. 331–364. Atascadero, CA: Ridgeview.
BARWISE, J. & SELIGMAN, J. (1997). Information Flow: The Logic of Distributed Systems. Cambridge: Cambridge University Press.
BREMER, M. & COHNITZ, D. (2004). Information and Information Flow. Frankfurt / Lancaster: Ontos Verlag.
DEVLIN, K. (1991). Logic and Information. Cambridge: Cambridge University Press.
DEVLIN, K. (2001). The Mathematics of Information. [Online] Helsinki (Finland): European School of Logic, Language and Information. <http://www.helsinki.fi/esslli/courses/Logicinfo.html> [Consulted: 18/12/2009]
DRETSKE, F. (1981). Knowledge and the Flow of Information. Cambridge, MA: The MIT Press.
MOSS, L. S. & SELIGMAN, J. (1994). “Classification domains and information links: a brief survey”. In: VAN EIJCK, J. & VISSER, A. (eds.). Logic and Information Flow, pp. 112–124. Cambridge, MA / London: The MIT Press.
PÉREZ-MONTORO, M. (2000). El fenómeno de la información. Una aproximación conceptual al flujo informativo. Madrid: Trotta.
PÉREZ-MONTORO, M. (2007). The Phenomenon of Information. A Conceptual Approach to Information Flow. Medford, NJ: The Scarecrow Press, Inc. [English translation of Pérez-Montoro (2000).]
PRATT, V. (1999). Chu Spaces. [Online] Coimbra (Portugal): School on Category Theory and Applications. <http://boole.stanford.edu/pub/coimbra.pdf> [Consulted: 18/12/2009]
RESTALL, G. (2005). “Logics, situations and channels”. Journal of Cognitive Science, Vol. 6, pp. 125–150. [Publisher in 2006. A previous version entitled “Notes on Situation Theory and Channel Theory” was available online in 1996 at author’s web page]
SELIGMAN, J. (1990). Perspectives: A Relativistic Approach to the Theory of Information. PhD Thesis. Centre for Cognitive Studies. Edinburgh: University of Edinburgh.
SELIGMAN, J. (1991a). “Perspectives in Situation Theory”. In: COOPER, R. et al. (eds.). Situation Theory and Its Applications. Volume 1, CSLI Lecture Notes 22, pp. 147–191. Stanford: CSLI Publications.
SELIGMAN, J. (1991b). “Physical situations and information flow”. In: BARWISE, J. et al. (eds.). Situation Theory and Its Applications. Volume 2, CSLI Lecture Notes 26, pp. 257–292. Stanford: CSLI Publications.
SELIGMAN, J. (2009). “Channels: From Logic to Probability”. In: SOMMARUGA, G. (ed.). Formal Theories of Information, pp. 193–233. Berlin / Heidelberg / New York: Springer-Verlag.
SHANNON, C. E. (1948). “A Mathematical Theory of Communication”. Bell System Technical Journal, Vol. 27 (July, October), pp. 379–423, 623–656.