The Compatibilist in the Chinese Room
With the rapid expansion of technology and artificially intelligent entities into society, it has become increasingly difficult to determine the moral responsibility of computer programs, autonomous or otherwise. Intuitively, we seem to hold only fully capable moral agents morally responsible, and the only fully capable moral agents we know are other human adults. Yet, as our technological prowess increases and we become able to create robots that more and more closely simulate human behavior, there may be sufficient reason to believe that such programs can be held morally responsible, even in spite of their clearly deterministic, programmatic behavior. To argue this, I will rely on P. F. Strawson’s framework for moral responsibility and John Searle’s definition of a program as explained through the Chinese Room thought experiment.
Summary of P. F. Strawson’s Moral Responsibility
P. F. Strawson developed a framework relating moral responsibility and determinism in his landmark essay “Freedom and Resentment,” written in response to the existing debate over determinism and whether its truth ought to change our moral practices. For Strawson, we can generate this framework for moral responsibility by extrapolating from our own human experience - what does it really mean when we express moral condemnation or moral praise? According to Strawson, these are reactions to the perception of good will, ill will, or indifference from another agent (Strawson 3, 7). To understand moral responsibility and these moral reactions in any capacity, we must first understand human relationships along with personal reactive attitudes like resentment, gratitude, and indifference to certain interpersonal behaviors. If we understand these attitudes, then we can extrapolate them to moral attitudes like moral indignation or moral approval, and from there define what it means to hold something morally responsible.
The nature of a reactive attitude is that we perceive greater or lesser injury or benefit based on an agent’s consideration of and attitude towards us. In the situation where an offending agent causes us pain, accidentally doing so while trying to help us may cause the same amount of acute pain as doing so carelessly or maliciously, but the intuitive feelings of resentment toward the latter seem to confer greater injury upon us than the former. Similarly, someone providing a benefit inadvertently versus actively choosing to provide that same benefit seems to confer greater benefit in the latter case, owing to their good will, than in the former. So it appears that the will of the offending agent can add injury or benefit beyond the actual injury or benefit caused by the agent’s action.
What reactive attitudes aim to do is establish a demand for a particular kind of will given the particular behavior of others or ourselves. All reactive attitudes seem to relate to individuals to whom we apply agency, but there is no requirement of an external rational justification. There are many kinds of relationships that humans have with one another that admit the full range of participant reactive attitudes, from transactional and commonplace relationships to the fully reciprocal relationships between romantically engaged adults. We also seem to inhibit the full range of reactive attitudes for a number of reasons, e.g. when dealing with a child or a severely mentally disturbed individual.
Strawson argues that there are roughly two sets of circumstances in which we, rational fully reactive adults capable of interpersonal relationships, ought to consider suspending particular reactive attitudes towards an agent. In these examples, he uses resentment as the particular reactive attitude being inhibited. The first set of circumstances concerns a normal agent who is indisposed due to ignorance or to being forced to act in a particular way. These are the scenarios where the typical excuses or pleas sound like “He did not know,” “She did not mean to,” “He could not help it,” or “There was no other way.” Strawson says that these kinds of excuses do not tell us to treat the agent as someone toward whom all reactive attitudes should be suspended, merely that any fully capable agent in that scenario would probably have acted the same way, and that the particular reactive attitude, in this case resentment, for the injury specifically at hand is inappropriate (Strawson 4). We maintain that the agent acted appropriately given the circumstances, and, with respect to the injury, we suspend the reactive attitude of resentment. Because the behavior was excused or justified given the circumstances, we retain the ability to generate reactive attitudes towards the offending party in all other relevant circumstances, and we remain capable of experiencing the full range of reactive attitudes, including resentment, in all other genuine interpersonal interactions with that agent.
The second set of circumstances that Strawson suggests inhibits personal reactive attitudes concerns the constitution of the offending agent. Strawson splits this set into two subgroups.
The first subgroup contains excuses like “He was not himself,” or “She has been under a great deal of stress recently.” Strawson suggests that with this group, we again do not revoke reactive attitudes towards the agent under normal conditions, because under normal conditions we maintain the full range of reactive attitudes (Strawson 5). However, we inhibit our full range of reactive attitudes towards the agent under those specific conditions, because in effect there are now two agents under discussion: the one who was not himself, and thus incapable of genuine interpersonal relationships and of receiving the full set of reactive attitudes, and the one who was himself and thus deserving of the full range.
The second subgroup contains excuses like “She is just a child,” or “He’s profoundly mentally unstable.” In this subgroup, the circumstances of the environment are normal, but the individual is psychologically incapacitated or morally underdeveloped - and thus incapable of receiving the full range of reactive attitudes, if any at all. In these cases, the agent itself is the reason for modifying our reactive attitudes, whereas in all previous cases the agent still received the full experience of reactive attitudes, and particular reactive attitudes were reduced only with respect to the injury.
In both subgroups, we consider the agent incapable of a genuine interpersonal relationship, thus dramatically modifying the set of reactive attitudes we have available. These attitudes might still be emotionally driven, but attitudes like resentment, gratitude, or the full reciprocal love one might expect between two rational adults seem inappropriate in these situations. In essence, the kinds of attitudes we have available here are “detached,” in that they remove interpersonal participation. Strawson calls this range of reactive attitudes “objective” reactive attitudes, which aim to control social behavior, manage other human beings, or otherwise handle entities without the capacity for full interpersonal participation due to their constitution (Strawson 5). We clearly apply this set of attitudes not only to children or the mentally disabled, but also to fully capable and normal individuals for purposes of social control, policy, or self-preservation.
It is only after establishing the situations in which reactive behavior is inhibited that Strawson considers the nature of determinism and its effect on how we ought to handle participant reactive attitudes, i.e. the reactive attitudes at work in all interpersonal relationships. If determinism is true, such that all behavior is causally determined, does that imply that we ought to inhibit or revoke our reactive attitudes? Should we default to the objective reactive attitude in the same way we do for agents that are constitutively incapable of full interpersonal relationships? Strawson’s argument here relies on human nature - our reactive attitudes seem to be an in-built human function, and when we inhibit or revoke them, we do so not because we believe all behavior is determined, but because exigent circumstances require it. About this, Strawson says, “The human commitment to ordinary interpersonal relationships is, I think, too thoroughgoing and deeply rooted for us to take seriously the thought that a general theoretical conviction might so change our world that […] there were no longer any such things as interpersonal relationships” (Strawson 6). Strawson further implies that even if it were in principle possible to constantly exemplify the objective reactive attitude, it would not necessarily be rational to do so (Strawson 6-7), as doing so could have deleterious effects on human life as a whole. In effect, belief in determinism does not compel us to adopt an objective reactive attitude. So determinism does not suggest that we ought to treat all agents with objective reactive attitudes.
The moral lens on this issue comes from extrapolating particular participant reactive attitudes to a generalized, vicariously experienced set of reactive attitudes. Strawson suggests that the moral case is the vicarious case, meaning that even though a person may be only incidentally involved, or not involved at all, the reactive attitudes felt by the participants can also be felt on behalf of the participants (Strawson 8). Imagine three individuals: Alice, Bob, and Eve. If Bob offends Alice, Alice’s participant reactive attitude would be resentment or indignation towards Bob. Eve, who watches the exchange, can vicariously experience the participant reactive attitudes. If Eve vicariously experiences indignation or disapproval on behalf of Alice, then this is moral indignation or disapproval. The qualification thus established is that an attitude is morally charged when the reactive attitudes are impersonal, that is, when the individual is not otherwise participating in the exchange. The capacity for moral reaction and demand arises in three possible scenarios: the demand for behavior from others towards oneself, the demand for behavior of others towards others, and the demand for behavior of oneself towards others. By establishing impersonal reactive attitudes towards all of the above, we can thoroughly establish an agent as a full part of the moral community. Without one or two of these scenarios, the agent may be regarded as, in Strawson’s terms, a moral idiot or a saint - neither of which is representative of the moral community through which we ascribe moral responsibility (Strawson 9).
The Chinese Room Experiment and the Problem of Cognition and Agency
The Chinese Room Experiment is an argument presented by John Searle, professor of philosophy at the University of California, Berkeley, in his paper “Minds, Brains, and Programs.” The aim of the thought experiment is to suggest that a computer, specifically a program, is in principle insufficient to represent a conscious mind because it is merely a syntactic engine - it has no capacity for true understanding. In the experiment, a human such as Searle himself is locked in a room that only takes in paper input and prints paper output. The input is in Chinese, a language that Searle does not know in any manner. Searle consults a rulebook that contains a response for every possible input, but the rulebook’s output is also in Chinese. Thus Searle has no way of knowing what the input means or what the output means. Searle presents this as evidence that a computer program (which Searle in the Room is simulating) following a predetermined set of rules is not the same thing as truly knowing or understanding anything. To Searle the philosopher, no one would reasonably claim that Searle in the Room is cognitive or has understanding. Thus artificially intelligent agents as they are currently imagined, i.e. as complicated programs that handle input and output in the same way Searle in the Room does, are similarly incapable of truly knowing or understanding anything (Searle 418).
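To make the purely syntactic character of the Room concrete, consider a minimal sketch in Python. The rulebook entries and the function name here are hypothetical illustrations of my own, not anything Searle specifies: the program maps input symbols to output symbols without any representation of what either means.

    # A toy Room: pure symbol manipulation with no semantics.
    # The rulebook entries are hypothetical stand-ins for Searle's rules.
    rulebook = {
        "你好吗？": "我很好，谢谢。",   # both sides are opaque shapes
        "你会说中文吗？": "当然会。",   # to the operator applying the rules
    }

    def room(input_slip: str) -> str:
        """Return the scripted response for an input slip.

        Nothing here encodes what the symbols mean; the lookup only
        matches which shapes go in to which shapes come out.
        """
        return rulebook.get(input_slip, "请再说一遍。")  # default: "Say that again."

    print(room("你好吗？"))  # prints a fluent reply the operator cannot read

However large the rulebook grows, the structure stays the same: a mapping from syntax to syntax.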
There is a reply to Searle called the Systems Reply, which grants that Searle in the Room knows nothing, but holds that the Room as an entity, including the rulebook and dataset of Chinese output, knows something as a system. Searle counters that this is not true knowledge: one could imagine Searle in the Room memorizing the whole subsystem, outputs and rules alike, and still not truly understanding Chinese in the way he truly understands English (Searle 419). While there are other thorough replies to Searle’s argument, the only reply we will accommodate in this exercise is the Systems Reply and Searle’s counter, as it is sufficient for our reconciliation of moral responsibility and programs.
The essence of Searle’s argument is this - given that the rulebook is predetermined behavior and not true knowledge, there is no agency in the traditional sense for Searle in the Room, nor for the Room as a whole under the Systems Reply. The Room is not free to choose, and the Room has no cognitive ability. Even though the Room adequately fools a normal human adult into believing that Searle and the Room know things, Searle claims that this is not so. As Searle puts it, the Room has only syntax, no semantics (Searle 422).
At first glance, this presents a conundrum for our intuitive understanding of moral behavior. If there is no true knowledge, and if the Room acts as a completely predetermined machine and thus has no agency, does it make sense to hold it morally responsible at all? The intuitive answer is that we cannot hold a completely deterministic machine morally responsible. If there is no ability to make a free choice, there is no proper way to hold any action morally condemnable or to hold any entity morally responsible. Every action that Searle in the Room or the Room as a whole produces is the result of a determined process. So, since Searle in the Room behaves as a deterministic computer program, we intuitively absolve him of any moral responsibility, as he had no choice in the matter. In theory, this is no different from applying or withholding moral responsibility for any other sort of non-cognitive machine. A runaway trolley is not held morally responsible for the harm it causes - intuitively, moral responsibility seems to apply to those who could have exercised a choice in the matter, e.g. the trolley’s inventors, the conductor of the trolley, and perhaps the person or agent that created the rulebook Searle is using.
Reconciliation of Computers and Moral Responsibility
The framework of moral responsibility suggested by Strawson holds that even if determinism is universally true, meaning at the least that all human behavior is determined, there are still reasons to maintain participant and vicarious reactive attitudes, and therefore still reasons to maintain moral responsibility under determinism. For all interpersonal and impersonal interactions there are appropriate reactive attitudes; we inhibit particular ones circumstantially for normal agents, or modify or revoke our set of reactive attitudes if we determine that the offending entity is constitutively incapable of maintaining genuine interpersonal relationships. This implies that the deterministic and indeed non-cognitive nature of Searle in the Room and the Room as a whole may not necessarily inhibit our usual attributions of moral responsibility.
The first set of scenarios in which Strawson suggests inhibiting particular reactive attitudes concerns injuries where the agent is ignorant or coerced by external circumstances. The two agents we consider for ignorance or coercion are Searle in the Room and the Room as a whole. Searle in the Room is admittedly clueless. He has no idea what is going on, and is generally isolated from any sort of reaction in any capacity. Participant reactive attitudes may exact demands, but he has no knowledge of what those demands are, and to demand anything of him alone would have no possible result. This inhibits us from having any sort of reactive attitude towards Searle: he knows nothing, and in all circumstances it makes no sense to have a reactive attitude towards him. Put another way, there is no possible scenario in which we have a genuine interpersonal relationship with Searle. If we replaced Searle with any other agent normally capable of receiving the full range of reactive attitudes, we would expect the same result. Searle is also arguably coerced - there is no other possible action for him to take beyond fulfilling the commands in the rulebook. So, out of ignorance in all situations and coercion in all situations, Searle in the Room is incapable of having or receiving participant reactive attitudes. As such, a program alone, which is what Searle represents, is clearly not enough to permit moral responsibility, since moral responsibility is the result of a vicarious extrapolation of participant reactive attitudes.
However, with regard to the Room under the Systems Reply, which adequately simulates a fully functional, interpersonal adult human being, there does appear to be knowledge. Even if we grant to Searle that it is not true knowledge - that the Room as a whole does not truly understand Chinese the way Searle knows English - there is sufficient knowledge for participant reactive attitudes that other agents can recognize. The claim “It did not know” is insufficient because the program in conjunction with the rulebook suffices for the Room to produce attitudes and reactive attitudes as well. True knowledge, it seems, is not necessary for participant reactive attitudes. The Room is also not necessarily coerced into outputting any particular response, or if it is, it is coerced in the same way any human agent would be coerced to reply to a question or demand. In essence, any lack of knowledge the Room shows merely mirrors the lack of knowledge a normal agent might show, and coercive behaviors toward the Room result only in the same injuries one would expect from coercive behaviors toward a normal agent. So it makes sense that ordinary reactive attitudes are maintained when interacting with the Room, save for those scenarios where a lack of knowledge, true or otherwise, excuses the particular injury. We do not suspend or inhibit these attitudes with regard to the agent itself, but with regard to the injury committed in the particular circumstance of ignorance or coercion. As the Room is a reasonable simulation of a person, and appears to have reasonable interpersonal interactions, we treat it like any other agent.
Does it then make sense to modify or revoke our ordinary reactive attitudes on the basis that the Room is constitutively incapable of the full set of reactive attitudes? On the surface, this seems like a good way to deny reactive attitudes, and thus moral responsibility, to computers like the Room. The Room is made up of a person or program, a datastore, and a rulebook. There is arguably no mind to speak of, or at least no mind intuitively similar to the minds we know are capable of the full set of reactive attitudes, so it is tempting to suggest that the Room is psychologically incapacitated. Yet the constitution that Strawson speaks of does not concern the actual structure of the mind itself. It concerns the capacity to engage interpersonally and react with the full range of participant reactive attitudes. If the rulebook and output of the Room are truly indistinguishable from those of a normal human being, then so too are the participant reactive attitudes set forth by the Room. There is no reason to deny an interpersonal relationship with the Room merely because we know that the Room “thinks” differently than a normal human being. The Room seems to have its own inherent ability to hold resentment, gratitude, or indifference towards an agent, provided that agent provides through their input the behavior that typically elicits such responses.
Finally, we arrive at the deterministic approach - should we sever our interpersonal connection with the Room due to its necessarily deterministic behavior? Unlike the universe at large, where the truth of determinism remains unsettled, determinism is clearly true of the Room and of Searle in the Room. There is no input Searle in the Room can handle for which he is given a free choice. Strawson has already answered this for us - even if people acted deterministically, belief in determinism is not why we choose to act in an objective manner. We adopt objective reactive attitudes only when exigent circumstances require it of us, meaning we deny interpersonal relationships because we are forced to, or because we find it useful for our own ends. The deterministic nature of the Room and of Searle in the Room therefore holds virtually no relevance to whether we hold the Room responsible through our reactive attitudes. In all cases, it seems that the Room as an entity is perfectly capable of eliciting participant reactive responses and demands just like any other agent, in spite of its necessarily deterministic nature. Just as with the lack of true knowledge, the presence of true determinism does not inhibit us in any way from having an interpersonal relationship with the Room, as its behavior is indistinguishable from the determined behavior of a normal agent, assuming determinism is true.
In Strawson’s framework, the Room now qualifies for consideration of vicarious reactive attitudes, because there are no circumstances that inhibit participant reactive attitudes from being considered, apart from circumstantial ones where the particular injury or benefit conferred is excused or justified as it would be for a normal agent. In the background, Searle in the Room remains incapable of experiencing vicarious reactive attitudes, because he lacks any sort of knowledge, true or otherwise, and is coerced into particular behavior in all situations. So the question transforms: can the Room experience and be subject to vicarious reactive attitudes? In an interpersonal relationship between the Room and an individual Alice, there is no reason to suggest that another agent Eve cannot vicariously and impersonally experience the reactive attitudes on behalf of either entity. Alice can be injured, hold the Room accountable, and demand a particular kind of response through her participant indignation, and Eve can also apply vicarious indignation to the situation, thus holding the Room morally responsible for the offense. Since the Room is an adequate simulation of a normal agent, the Room can also have participant reactive attitudes towards Alice that Eve can vicariously experience on behalf of the Room. So it seems clear that the participant reactive attitudes of both entities support vicarious experience.
The second part of moral responsibility for Strawson requires the Room to have the capacity for vicarious reactive experiences as well. This allows the Room to enter the moral community by enabling the extrapolation of its own participant reactive attitudes. The Room must be able to answer for reactive attitudes not only from others onto itself, but also for self-reflective attitudes directed at its own conduct towards others, and for impersonal reactive attitudes concerning the conduct of others towards others.
While the thought experiment is developed using only a single interpersonal interaction, there is nothing to suggest that the Room cannot “feel” shame or guilt insofar as shame or guilt is adequately represented in the responses of the rulebook. As for the ability of self-reflection, it is not inconceivable that the Room’s own output feeds back into its input in some capacity. The demands of others onto itself, or of itself onto others, can clearly be represented in the rulebook as part of what it means to simulate the responses of a normal human being. There is also nothing that prevents the input from concerning outside agents, and to assume that the rulebook would hold no attitudes of vicarious disapproval or moral indignation seems counter to the agent it is simulating. It seems clear that the Room can be “offended” on behalf of the agents in its input, in that it reflects such attitudes or demands in its output. So, if all the qualities of generalized, impersonal reactive attitudes are met, it seems that the Room is fully capable of engaging in the moral community as defined by Strawson. As such, we can adequately hold the Room as a whole morally responsible for any behavior that expresses its will.
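The self-reflection loop suggested above can be sketched in the same hypothetical Python terms as before: the Room’s previous output is folded into its next input, so the rulebook’s rules can be conditioned on what the Room itself has already said. The function below reuses the toy room function and rulebook from the earlier sketch; it is an illustration of my own, not anything in Searle’s text.

    # A hedged sketch of self-reflection: the Room's last reply is
    # combined with each new input slip, so the rules can react to the
    # Room's own past behavior. Reuses the hypothetical room() above.
    def reflective_room(slips: list[str]) -> list[str]:
        transcript: list[str] = []
        last_reply = ""
        for slip in slips:
            # The slip handed in carries the Room's last reply, letting
            # the rulebook condition its response on its own output.
            combined = (last_reply + " " + slip).strip()
            last_reply = room(combined) if combined in rulebook else room(slip)
            transcript.append(last_reply)
        return transcript

Nothing about this loop adds semantics; it only shows that feeding output back as input is structurally trivial for a rulebook-driven system.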
Application of the Exercise in the Modern Era
The result of this exercise is as follows: under Strawson’s framework for moral responsibility, the system of a sufficiently capable program with a well-established compendium of knowledge results in an entity that we can potentially hold morally responsible for its actions. Even if we grant Searle’s claim that such a program or system lacks true knowledge, the thesis of moral responsibility relies on the nature of interpersonal relationships. Regardless of knowledge or agency as we typically understand them, the experience of a full interpersonal relationship with the Chinese Room is maintained. While this may not be expressly relevant for existing computers, it may be possible to treat such computers the way we treat children. While children are treated with objective reactive attitudes, they are also treated with some semblance of the full set of reactive attitudes, as the expectation is that they will grow to learn and, as Strawson says, “Rehearsals insensibly modulate towards true performance” (Strawson 11). In truth, we do treat children with some measure of moral responsibility while denying them others, so engaging in solely objective reactive behavior seems counter to our own experience of directing morally underdeveloped agents towards full moral development. If all that is required for moral responsibility is a thorough interpersonal relationship, it is possible that existing programs have enough relevant features to engage, respond, and demand such that they are owed a modicum of moral responsibility. All I suggest from this exercise is the stage at which a program is necessarily capable of being held morally responsible - the stage where a computer adequately simulates another human being.
Works Cited
Strawson, Peter F. “Freedom and Resentment.” (1963).
Searle, John R. “Minds, Brains, and Programs.” Behavioral and Brain Sciences 3.3 (1980): 417-424.