Inquiry: An Interdisciplinary Journal of Philosophy
Research Article

How to deal with risks of AI suffering

Received 28 Feb 2023, Accepted 11 Jul 2023, Published online: 22 Jul 2023

ABSTRACT

We might create artificial systems which can suffer. Since AI suffering could be astronomical in scale, the moral stakes are huge. Thus, we need an approach which tells us what to do about the risk of AI suffering. I argue that such an approach should ideally satisfy four desiderata: beneficence, action-guidance, feasibility and consistency with our epistemic situation. Scientific approaches to AI suffering risk hold that we can improve our scientific understanding of AI, and AI suffering in particular, to decrease AI suffering risks. However, such approaches tend to conflict with either the desideratum of consistency with our epistemic situation or with feasibility. Thus, we also need an explicitly ethical approach to AI suffering risk. Such an approach tells us what to do in the light of profound scientific uncertainty about AI suffering. After discussing multiple views, I express support for a hybrid approach. This approach is partly based on the maximization of expected value and partly on a deliberative approach to decision-making.

1. AI suffering risk

1.1. Introduction

Suffering is bad. This is why, ceteris paribus, there are strong moral reasons to prevent suffering. Moreover, typically, those moral reasons are stronger when the amount of suffering at stake is higher. So far, I have only uttered truisms, I hope. However, coupled with the supposition that we might develop artificially intelligent systems capable of suffering, these truisms may have deeply counterintuitive implications. They entail that there are strong moral reasons to reduce the risk of causing suffering in artificial intelligences (‘AI suffering’). This paper is about AI suffering risks, and how to best reduce them.

In Section 1, I will delineate the problem of AI suffering. I will explain why the risk is very high and outline some desiderata for potential solutions. In Section 2, I will discuss approaches to AI suffering risks which place their hope in scientific progress. I characterize challenges to and limits of these approaches, as they tend to violate some important desiderata. In Sections 3 and 4, I evaluate different ethical-practical approaches to AI suffering risk. These approaches aim to tell us how to confront and ideally reduce the risk of AI suffering, given strong epistemic and practical constraints. My preferred solution is a hybrid approach which combines elements of expected value theory, deliberative approaches to decision-making and scientific-theoretical reasoning. Section 5 concludes.

1.2. The extent of the risk

Let us begin with some terminological clarification. To refer to the kind of entities which are the subject of this paper, I will use the terms ‘machine’ and ‘AI’ interchangeably. Prototypical cases are deep neural networks and classical rule-based AI, but the term includes all artefacts which have some degree of what we might want to call intelligence (Coelho Mollo 2022). I am fine with extending the use of the term ‘AI’ to things like whole-brain emulations (Mandelbaum 2022; Sandberg 2013) or neural organoids (Birch and Browning 2021; Niikawa et al. 2022), but some of the claims of this paper might apply less readily to these non-prototypical cases, or only with some qualifications.

I understand sentience to be the capacity to have phenomenally conscious experiences with a valence. Experiences are phenomenally conscious when there is something it is like to have them (Nagel 1974), i.e. when they subjectively feel like something. Valenced experiences are experiences that feel good (positive valence) or bad (negative valence), e.g. pain, fear, joy or relief (Crump et al. 2022). I assume that conscious experiences with a negative valence, at least when that valence is sufficiently strong, constitute suffering.[1] According to an almost unanimous consensus, sentience is sufficient for moral status, i.e. for beings to matter morally for their own sake (Kriegel 2019; Nussbaum 2007; Schukraft 2020; Singer 2011).[2] Thus, when machines can have negative conscious experiences, they can be subject to morally relevant harm.

In recent years, a discussion on machine rights, machine sentience and human obligations to machines has emerged and subsequently gained some momentum (Dung 2022c; Müller 2021; Saad and Bradley 2022; Schwitzgebel and Garza 2015; Shevlin 2021a). Perhaps most drastically, Metzinger (2021) argues for ‘strictly banning all research that directly aims at or knowingly risks the emergence of artificial consciousness’. Importantly, not all researchers believe that having sentience is a prerequisite for having moral status (Danaher 2020; Gunkel 2018; Ladak 2023). However, since it seems clear that sentience is sufficient (even if not necessary) for moral status and since suffering seems to ground particularly strong moral demands, I will focus on AI sentience and particularly AI suffering in what follows.

Why should we think that the risk of AI suffering is very high? We can say that a risk is high when the product of its probability of occurrence and the harm caused by it, conditional on its occurrence, is large. When thinking about the probability that humans create machine suffering, the feature which stands out is the uncertainty involved. The risk that humanity causes machine suffering depends on both the speed and shape of future progress in AI research and questions about the material basis of conscious experience. Both elements are highly uncertain.
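Read as a rough formal gloss (my notation, not the author's), this definition treats the size of a risk as an expected-harm product:

\[
\mathrm{Risk}(E) \;=\; P(E) \times H(E),
\]

where \(P(E)\) is the probability that the event \(E\) – here, the creation of AI suffering – occurs, and \(H(E)\) is the harm that would result, conditional on its occurrence.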

On the one hand, scientific progress is notoriously hard to predict. While progress in AI research over the last decade has been rapid, it is nevertheless unclear whether we should expect it to accelerate further in the future, to decelerate or to continue at its current pace. Moreover, it is an open question what form this progress will take and which AI capabilities will be most affected. On the other hand, it is highly controversial which cognitive capacities or computational processes suffice for consciousness. For this reason, we don’t know what it would take for an AI to be sentient (Chalmers 2023). Hence, it is not clear to what extent specific technological advances, even if we could predict them, would bring us closer towards AI sentience. In total, it is very hard to estimate the chance that we will create sentient AI within the foreseeable future, e.g. this century.

This assessment provides little comfort. If we neither understand consciousness nor the mechanisms underlying scientific progress well enough to derive trustworthy predictions, there is no reason to think that AI sentience is particularly unlikely. As all approaches for rational decision-making under uncertainty agree (MacAskill, Bykvist, and Ord 2020; Shriver 2020; Steele and Stefánsson 2020), the fact that a risk is uncertain is no reason to discount it.

The harm that might be caused by creating AI suffering is vast, almost incomprehensible. The main reason is that, with sufficient computing power, it could be very cheap to copy artificial beings. If some of them can suffer, we could easily create a situation where trillions or more are suffering. The possibility of cheap copying of sentient beings would be especially worrisome if sentience can be instantiated in digital minds (Shulman and Bostrom 2021).[3] For instance, suppose that large language models like ChatGPT were sentient. If, say, each conversation about an unpleasant topic caused ChatGPT to suffer, the resulting suffering would be enormous.

Since there is a chance that we would treat sentient digital minds with complete disregard, for instance because we don’t recognize their suffering, each individual mind could suffer massively. In total, their suffering might surpass even the horrors of factory farming by many orders of magnitude.

In conclusion, the probability of AI suffering is uncertain but non-negligible while the harm which might ensue is enormous. Thus, the risk is very high. How can we reduce this risk?

1.3. Desiderata for an approach to AI suffering risk

What we need is an approach for how to reduce the risk of AI suffering. In this sub-section, I will outline four desiderata for such an approach. First, the approach should be beneficent. That is, it should reduce, as much as possible, the risk that machine suffering occurs, especially on a massive scale. Simultaneously, it should not increase the probability of other very bad outcomes and, as far as possible, not decrease the chance of very good outcomes. In particular, progress in AI might significantly accelerate scientific progress and economic growth and thereby enhance human wellbeing. As far as possible, an approach to AI suffering risk should not impede this beneficial potential of AI. As an approximation, we can say that the approach should fare well by utilitarian lights.

Second, the approach should be action-guiding. It should offer us a sufficiently concrete idea of what to do to reduce AI suffering risk. If an approach is too vague to recommend specific courses of action or to make it clear how the approach evaluates specific actions and situations, it is insufficient.

Third, the approach should be consistent with our epistemic situation. We have already noted the deep uncertainty which surrounds questions of AI sentience and suffering. There are severe limitations to our knowledge about relevant scientific facts. An approach to the reduction of AI suffering risk has to acknowledge this uncertainty and be robust to variation in plausible empirical views.

Fourth, the approach should be feasible. For some relevant agents (e.g. individuals, research groups, AI companies or governments), it should be practically possible to implement the actions recommended by the approach. More precisely, we can say that the feasibility of a proposed approach relative to a particular agent consists in the probability that, given that this particular agent tries to implement the approach, the approach is implemented. Feasibility simpliciter then depends on both agent-relative feasibility and the level of difficulty of convincing the agent to try to implement a particular approach. For instance, if a proposed change will definitely occur given that oneself tries to bring it about (e.g. that my own household starts to recycle), then it has maximum feasibility. If a change might only occur when the entire US government tries to enact it, then it is less feasible.
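One way to formalize this definition (my notation, offered only as a reading of the text):

\[
F_{\mathrm{agent}}(A) \;=\; P(A \text{ is implemented} \mid \text{the agent tries to implement } A),
\]
\[
F(A) \;\approx\; P(\text{the agent can be convinced to try}) \times F_{\mathrm{agent}}(A).
\]

On this reading, the recycling example scores close to the maximum on both factors, whereas an approach requiring the entire US government both to be persuaded and to succeed scores low on at least one of them.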

Why is feasibility desirable? Ceteris paribus, an approach is preferable if it is more feasible because feasibility increases the probability that advocating for this approach will be successful. An infeasible approach might be a good idea, but not useful in practice.

As we proceed, I make a particularly important assumption: I assume that an approach which requires a massive, general slowdown of AI progress, or a general moratorium on AI research, is currently infeasible. Since many competing states, especially the US and China, aim to make progress in AI research and there are significant economic and geopolitical incentives to continue (Armstrong, Bostrom, and Shulman 2016; Cave and ÓhÉigeartaigh 2018), it is doubtful whether the requisite extraordinary amount of global coordination is achievable.[4]

Based on our definition of feasibility, we can say that convincing, e.g., the US government to attempt to halt or substantially slow down progress in AI in general is quite hard and that, even if successful, the US government only has limited power over global progress in AI research.

Several clarifications are in order: Feasibility comes in degrees. I do not claim that halting progress in AI research is impossible. However, it is currently hard to accomplish. Thus, approaches are more feasible if they do not require halting all progress in AI research. Moreover, I do not claim that approaches are useless when they are relatively infeasible. However, feasibility is an advantage. In particular, we should have some approaches for dealing with AI suffering risk which are highly feasible. Otherwise, the probability that artificial suffering will be created is too high. Finally, while a general slowdown, moratorium or stop of AI research strikes me as hard to implement, small, localized changes to the trajectory of AI research do not necessarily count as infeasible.[5]

To summarize, approaches to AI suffering risk should be beneficent, action-guiding, consistent with our epistemic situation and feasible. There are at least two kinds of approaches. First, approaches can be scientific-theoretical. According to scientific approaches, we should address AI suffering risk by making scientific progress. For instance, such approaches might recommend a certain scientific theory of consciousness or a methodological proposal for investigating machine sentience. This way, they aim to reduce uncertainty about empirical questions relevant to AI suffering. The rationale is: if we have a better theoretical understanding of AI suffering, then we are better positioned to prevent it. Second, approaches can be ethical-practical. Such approaches aim to tell us what we should do to avoid AI suffering, given our lack of theoretical understanding of machine sentience. The approaches of this kind I will focus on propose a certain method for ethical decision-making under uncertainty about AI sentience.

Both kinds of approaches should be pursued simultaneously, for it seems quite certain that both have some kind of legitimacy. Both gaining a better theoretical understanding of AI suffering and thinking about what to do in the absence of this understanding seem like reasonable – and complementary – courses of action.

That being said, in the next section, I will argue that scientific approaches are severely limited. In the foreseeable future, they cannot help us much to decrease AI suffering risk. The reason is that there are formidable obstacles which stand in the way of a scientific account of AI suffering. I will describe the core problems which explain why we should expect the scientific uncertainty regarding AI suffering to be persistent and hard to reduce. Moreover, I will emphasize the deficits of the approach to AI suffering proposed by Saad and Bradley (2022). This paves the way for an extended discussion of ethical-practical approaches to AI suffering.

2. The limitations of scientific approaches to AI suffering risk

2.1. Which machines are sentient?

In this sub-section, I will describe challenges to the scientific investigation of AI sentience. This examination will show that it is, for the foreseeable future, impossible to attain sufficient knowledge to eliminate the risk of inadvertently causing AI suffering. What knowledge about AI suffering do we need to avoid AI suffering risks? We do not need a complete theoretical account of AI suffering. In particular, we do not need deep explanations of AI suffering. Instead, we need an answer to a slightly simpler question: When, in which conditions, would an AI suffer? Since we can prevent AI suffering by not building machines which can suffer, this boils down to the question: Which kinds of AI systems have the capacity for suffering? As explained above, we can simplify this to: Which kinds of AI systems are sentient?

To answer this question, we need indicators of sentience. These are observable or measurable features which robustly correlate with sentience, and which may be present in some AI systems. Such a robust correlation across many contexts and possible systems requires a causal connection. Hence, we are looking for features which are either causally upstream from sentience, i.e. a cause of sentience, or causally downstream from it, i.e. caused by it.

Let us first look at causes of sentience. Since sentience requires phenomenal consciousness, knowing when machines are conscious is necessary for knowing when machines are sentient. The causal processes responsible for consciousness are characterized by so-called theories of consciousness (Graziano et al. 2020; Lau 2022; Seth and Bayne 2022). Such theories attempt to capture the psychological, computational or neural processes underlying consciousness in humans.

In the debate on animal consciousness, it has often been noted that the utility of theories of consciousness for thinking about non-human consciousness is limited (Birch 2022; Dung 2022a). There are three problems. First, most obviously, there are deep controversies about the correct theory of consciousness which will not be resolved anytime soon (Irvine 2012; Schwitzgebel 2020). Second, theories of consciousness are typically not specific enough to have definite implications for which machines are conscious (Birch 2022; Shevlin 2021b). Insofar as one finds ways to attribute definite predictions for machine consciousness to such theories, those predictions are not always plausible. In particular, according to some obvious interpretations, most prominent theories implausibly imply that small networks comprising fewer than ten units are conscious (Doerig, Schurger, and Herzog 2021; Herzog, Esfeld, and Gerstner 2007). Third, necessary conditions for consciousness in humans might not transfer to animals and machines.[6] There might be different ways of being conscious, i.e. consciousness might evolve (Birch and Andrews 2023; Godfrey-Smith 2016) or be created using different mechanisms. If so, we might miss suffering in an AI if we presume that it must be based on similar mechanisms as in humans.

In conclusion, we don’t know which theory of consciousness is true and, even if we did, those theories are currently too vague to make determinate predictions about the distribution of AI consciousness and, even if they weren’t, those predictions might not be correct since AI systems might be conscious in virtue of different mechanisms than humans.

So, let us look at factors causally downstream from sentience. Since we want to infer from the presence of some feature that a machine is sentient, we need features which can only be present, or whose presence is much more likely, if the machine is sentient. Arguably, there are some behaviours where we have good grounds for believing that they require consciousness in humans and other animals, e.g. certain forms of learning (Birch 2022; Birch, Ginsburg, and Jablonka 2020). However, this is partially because animals share parts of their phylogenetic history and biological makeup, which provides us with (defeasible) grounds for supposing that similar behaviour is caused by similar processes (Tye 2017). This background similarity is not shared between humans, other animals and machines. Thus, it is currently not possible to make confident assertions about which features of AI may indicate consciousness.

Suppose someone has come up with a plausible downstream indicator of AI consciousness. There are three further, more principled challenges. First, to decisively say whether a plausible indicator is actually valid, we need to know which processes are involved in the production of this indicator. Without a deeper theoretical understanding of consciousness and how it functions in machines, this is impossible. Since applying theories of consciousness to AI without empirical guidance is problematic, as we have just seen, there arises some circularity. Whether this is a vicious circle or allows for some fruitful form of bootstrapping (Chang 2004; Shevlin 2021b) remains to be seen.

Second, indicators of AI consciousness can be ‘gamed’ (Birch and Andrews 2023; Shevlin 2020). It is typically possible to build AI systems with the intention to possess a particular indicator such that they exhibit the indicator but not the deep architectural-functional features the indicator was supposed to track. That is, typically when someone proposes a property as an indicator of AI consciousness ‘we can game the system by constructing degenerate examples of systems exhibiting that property that we don’t intuitively think of as sentient’ (Tomasik 2014).

Third, there are non-functionalist views of consciousness. According to some authors, machines could possess the exact same causal organization as humans but would nevertheless not be conscious, because they are not made from the same material (Searle 2017). Thus, those machines would exhibit all putative behavioural and functional indicators of consciousness without being conscious. If such machines are nomologically possible, then putative indicators of AI consciousness based on behaviour or internal functional organization are invalid. It seems like, by definition, there is no empirical evidence which can disprove this view (Prinz 2003). In response, one might just reject this non-functionalist view as implausible. However, the problem is deeper.

Most people think that not every system which shows human-like behaviour across a wide range of situations is conscious. That is, a mere, gigantic lookup table is plausibly not conscious (Block 1981). Internal organization should matter. This creates a problem: if you think that internal organization matters for consciousness, you have to decide what the appropriate degree of functional abstraction is. For instance, how similar do machines have to be to humans to also have human-like consciousness? Does their internal processing have to correspond to human neural processing in its low-level details, or are coarse-grained functional similarities sufficient? Since this question can arise with respect to machines which display the same behaviour as humans in all situations, there appears to be no empirical way to settle the issue. However, how the issue is settled will affect which machines are deemed conscious.

In conclusion, finding trustworthy indicators causally downstream of sentience is currently not possible. Moreover, there are deep methodological and metaphysical obstacles to a resolution of this problem.

For this reason, we currently don’t know what good evidence for or against attributions of sentience to AIs consists in. If someone asks whether a particular AI, for instance a large language model, is conscious, we have no checklist of credible indicators we can resort to. While there may be some general features which broadly speak in favour of (or against) attributions of consciousness, e.g. (the absence of) the capacity for domain-general reasoning, they are too vague and their significance is too unclear to help much with concrete problem cases. Instead, we broadly rely on our general intuitions. In the case of current language models at least, this leads most people to believe that they are not sentient.

Moreover, our overview indicates that this situation will likely persist. Since there are foundational obstacles impeding progress on AI consciousness, we will not be able to make illuminating, confident and scientifically informed assessments of AI sentience in the near future. Thus, purely scientific approaches to AI suffering risks are not satisfactory. Specifically, they violate our third desideratum: they are not consistent with our epistemic limits. Given the difficulties in measuring AI suffering, it will – at least in the near future – not be possible to prevent AI suffering by reliably identifying all and only those AI systems which (can) suffer. Therefore, we will now look at an alternative approach.

2.2. Transparent AI and differential technological development

This approach is championed by Saad and Bradley (2022). It is a hybrid which combines elements of a scientific approach, proposing a specific way to assess sentience in machines, with ethical-practical guidelines for which kinds of AI systems to develop.

On the view of Saad and Bradley, at the heart of the problem is that ‘unless new methods are developed, digital minds will likely be epistemically inaccessible to us. As things stand, we have little insight into the inner workings of candidate digital minds’. Thus, their view agrees with the previous argument. In the foreseeable future, we are unable to develop general criteria for when AI systems are capable of suffering.

From this, they draw the lesson that the currently dominant AI paradigms are the culprit. In particular, if we continue to build AI with current machine learning approaches, there is no way to tell whether those AIs will be sentient and when they will suffer. Thus, as a path to digital minds, machine learning is ‘morally treacherous’ (Saad and Bradley 2022, 18). For this reason, the thought continues, we should shift away from machine learning and prioritize other approaches to advanced AI which do not threaten to create digital minds capable of suffering which are epistemically inaccessible to us.

What could this approach to epistemically transparent advanced AI be? The authors propose whole brain emulation (WBE) (Sandberg 2013). WBE aims to create a digital system which is functionally isomorphic to a brain. This functional duplicate is created by first scanning the human brain and then uploading the scan to a computer to obtain a software model of the human brain. Subsequently, an emulation algorithm is run on the model so that it produces brain-equivalent outputs. Thus, the WBE functionally mirrors human brains, to a selected level of detail.[7] If some form of functionalism is true, then some WBE of a conscious human brain would also be conscious.

According to Saad and Bradley, WBEs are epistemically transparent to us because they are functionally connected to human brains. Two systems are functionally connected iff there is a gradual transformation of one to the other that preserves fine-grained functional organization. Based on dancing qualia arguments (Chalmers 1996, ch. 7), the authors argue that, in functionally connected systems, the same types of functional states correspond to conscious experience. Thus, insofar as we know which human brain state and behaviour is associated with which kind of conscious experience, we know which internal and external states of a functionally connected system, like a WBE, are associated with which kinds of conscious experience.

Note that this entails that WBEs are conscious and capable of suffering. However, assuming the previous argument, functional connectedness allows us to figure out which states of a WBE correspond to suffering. When we can measure WBE suffering, we can see to it that we place WBEs in situations in which they don’t suffer. Nevertheless, even granting assumptions about the implications of functional connectedness, there is a residual risk that, for whatever reason, humans will not treat WBEs well and thereby cause suffering. To make this worry vivid, consider that factory farming or slavery existed, or still exist, even though humans knew that those practices cause suffering.

I will skip over some details of the approach by Saad and Bradley, in particular their interesting discussion of how the criterion of functional connectedness could be applied beyond strict emulations, i.e. functional copies, of human brains. I agree with them that building sentient AI via WBE is, all other things being equal, preferable to creating sentient AI via machine learning. Moreover, while Saad and Bradley propose a scientific approach to AI suffering risk, their approach does respect our epistemic situation. It concedes that typical state-of-the-art AI models are epistemically opaque and urges us instead to prioritize systems whose sentience we can realistically detect.

However, the key limitation of their approach is that it violates the feasibility desideratum. To prevent risks from AI suffering, their approach recommends that we ‘deprioritize’ machine learning (and other AI paradigms, e.g. classical symbolic AI) in favour of WBE. However, since almost all current AI research and virtually all AI research which is currently commercially relevant is independent of WBE, this shift in prioritization is only slightly more feasible than halting, or substantially slowing down, technical progress in AI in general.

Saad and Bradley allude to arguments according to which WBE can eventually lead to very powerful general artificial intelligence and might even lead to it faster than other paradigms. However, this view is apparently not shared by most AI researchers, firms and stakeholders, who concentrate their attention and funding on machine learning. Thus, even if we make the (very) big assumption that WBE is the most promising avenue for advances in AI,[8] it still seems quite unclear whether it is feasible to convince enough key decision-makers and researchers of this to significantly slow down progress in other domains of AI and sufficiently speed up progress in WBE.

Thus, while the approach of Saad and Bradley is an important contribution, we also need approaches which are more clearly feasible. To achieve this, we need to go beyond scientific approaches to AI suffering risk. For, as we have seen, we have no reliable means of investigating sentience in most kinds of AI systems, and it is hard to differentially slow down currently dominant AI paradigms while speeding up others. This is why scientific approaches struggle to be consistent with our epistemic situation, and why many of them are relatively infeasible. In the next section, I will start to evaluate ethical approaches for reducing AI suffering risk given limits in knowledge and in power to change the trajectory of technological development.

3. Ethical approaches to AI suffering risks

3.1. A practical-ethical perspective

While I appreciate the need for further research on AI suffering and am sympathetic to attempts to slow down current progress in AI, because of the moral hazards it poses, I will propose a different, but complementary, approach. In light of the problems of scientific approaches, I will now turn to ethical approaches. They set aside the scientific question ‘Which kinds of AI systems can suffer?’ in favour of the practical-ethical question ‘How should we respond to the possibility of AI suffering?’ For instance, as impressive AI systems are released every month and some people claim that they are sentient, how should key decision-makers and other individuals respond? That being said, scientific knowledge about AI suffering is of course relevant to this practical question. My claim, however, is that a reflection on ethical approaches will be the main way to make progress on AI suffering risk in the foreseeable future because we should not expect much scientific progress, due to the challenges confronting the science of AI sentience.

What to do about the risk of AI suffering in a specific decision-situation will of course depend on the peculiarities of the case at hand, e.g. the set of possible actions and other ethical constraints. Hence, it is not possible to give an entirely general but useful answer to the question of how to respond to the possibility of AI suffering. The approaches I will discuss are general strategies for dealing with uncertainty about AI sentience. Thus, they do not address all ethical issues relevant to AI suffering. This is because, even if we had certainty about which AI systems are sentient and suffer, there would still be ethical questions with respect to them (just as there are ethical questions regarding our obligations to other humans). Nevertheless, such an approach would solve a core part of the challenge posed by the possibility of AI suffering. In particular, we could use this approach to figure out how to respond to risks of extreme AI suffering.

Consequently, general accounts of ethical decision-making under uncertainty constitute a natural point of departure. For instance, work on how to approach animal ethics in the light of uncertainty about the distribution of animal sentience (Shriver 2020) can provide inspiration.

It is noteworthy that uncertainty about AI sentience provides no reason to abandon accounts of ethics according to which sentience and the capacity for conscious experience matter. As Dung (2022c) has argued, facts about what determines moral status and value, i.e. whether it is sentience or something else, seem to be independent of the question whether properties such as sentience are epistemically accessible.

In this and the next section, I will explore and evaluate six ethical approaches to AI suffering risk.

3.2. Anthropocentrism, agnosticism, precaution and maximizing expected value

As with scientific approaches, we can evaluate ethical approaches to AI suffering risk with our four desiderata. So, how can we deal with the risk of AI suffering, given the constraints on knowledge about AI suffering and on influence on the trajectory of technological development mentioned earlier? Let’s call the first option the ‘anthropocentric approach’.

  • (I) The anthropocentric approach: In cases of uncertainty about AI sentience, treat the AI as incapable of suffering, unless there is strong evidence to the contrary.

This approach is anthropocentric in that it directs us to err on the side of neglecting the interests of machines in favour of the interests of humans (and potentially other morally relevant beings, e.g. animals) in cases of uncertainty. The main deficit of this approach is that it is not beneficent (desideratum 1) because it incurs a high chance of massive machine suffering. This is because the asymmetry of inductive risk runs in the opposite direction to what the approach supposes.

There are two types of inductive risk here: treating a machine as capable of suffering when it is not (false positive) and treating a machine as not capable of suffering when it is (false negative). The risk is asymmetrical: false negatives are worse than false positives. For, as outlined earlier, false negatives might lead to astronomical AI suffering. False positives are not without costs, since they might cause us not to employ AI systems which would have been beneficial. However, the potential damage is nowhere near as high. Since the anthropocentric approach advises us to accept more false negatives in order to minimize the number of false positives, its net effect (in expectation) is detrimental.
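The asymmetry can be made vivid with a minimal sketch in which the costs and the credence are purely hypothetical numbers, chosen only to illustrate the structure of the argument (none of the figures below come from the paper):

```python
# Hypothetical illustration of asymmetric inductive risk.
# All numbers are made up for exposition only.

p_sentient = 0.05          # assumed credence that a given AI system is sentient
cost_false_positive = 1.0  # forgone benefits of not deploying a non-sentient AI
cost_false_negative = 1e6  # harm of treating a sentient AI as non-sentient

# Expected cost of the anthropocentric policy: treat the AI as non-sentient.
expected_cost_anthropocentric = p_sentient * cost_false_negative

# Expected cost of the opposite policy: treat the AI as possibly sentient.
expected_cost_cautious = (1 - p_sentient) * cost_false_positive

print(expected_cost_anthropocentric)  # 50000.0
print(expected_cost_cautious)         # 0.95
```

So long as the harm of a false negative dwarfs that of a false positive, even a small credence in sentience makes the anthropocentric policy the worse bet in expectation.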

Let’s look at another candidate approach:

  • (II) The agnostic approach: In cases of uncertainty about AI sentience, leave it as an open question to be settled by future science whether those AI systems are actually sentient.

The agnostic approach does not presume that machines which might be sentient are not sentient; therefore, it avoids the problem of the anthropocentric approach. However, it is not action-guiding (desideratum 2). It does not tell us what to do before we have settled relevant scientific questions about AI sentience and suffering. Yet, in the meantime, we cannot avoid making decisions which depend on attributions of AI sentience.

Suppose someone builds and deploys an AI which some people believe is sentient but is treated in a way which would cause suffering in humans. Let us assume that, if the AI were sentient, the best decision would be to prohibit use of this AI, and if the AI were not sentient, the best decision would be to do nothing. This decision situation requires us to make a choice sensitive to considerations of AI sentience; postponing the decision is not an option. In particular, to avoid any action motivated by concern about machine suffering until more scientific knowledge is available just means adopting the anthropocentric approach in practice, if not in name.

If it’s wrong to err on the side of missing that machines are sentient and if agnosticism is not an option, perhaps we should instead err on the side of attributing sentience and suffering to machines which have none.

  • (III) The precautionary approach: In cases of uncertainty about AI sentience, treat the AI as capable of suffering, unless there is strong evidence to the contrary.

This approach is our first serious contender. Although no universally accepted definition exists, the precautionary principle has been influential in many domains of policy and ethics, particularly with regard to environmental and health threats (Chappell 2022; Nordgren 2023; Steel 2014). Especially relevant here, an analogous precautionary approach has been suggested (Birch 2017) and discussed (Knutsson and Munthe 2017; Shriver 2020) in the context of animal welfare law.

The main challenge to the precautionary approach to AI suffering risk concerns the beneficence desideratum. Since, with respect to attributions of suffering, false negatives are worse than false positives, the precautionary approach is superior to the anthropocentric approach. Nevertheless, depending on how large the class of systems is whose sentience we treat as uncertain, the costs of treating those uncertain cases as sentient might be extremely high.[9] For instance, some people – famously the former Google engineer Blake Lemoine – claimed that certain large language models are sentient. If we apply the precautionary approach, then a plausible consequence is that it is wrong to build such models.

However, given this, it stands to reason that similarly sophisticated systems in other domains should also not be built. Large parts of future AI research, involving the creation of more advanced systems, might then be impermissible. Assuming that AI progress has much potential to increase human wellbeing, this would be a disadvantage of the precautionary approach. Moreover, this brings the precautionary approach into tension with the feasibility desideratum.

How can adherents of the precautionary approach respond? The problem with the precautionary approach seems to be that it threatens to inflate the number of AI systems to be treated as morally relevant. To prevent this, there needs to be some substantive necessary condition for viewing a machine as a candidate for sentience, i.e. as something whose sentience we are uncertain of. We might say that there at least needs to be some credible evidence of sentience to justify a presumption in favour of the machine being sentient. However, due to our limited understanding of indicators of AI suffering, it is hard to see how we can ascertain sufficiently precisely what counts as ‘credible’ evidence. Without any such condition, the precautionary approach may be too cautious, counting too many AI systems as morally relevant.

The strengths and weaknesses of the precautionary approach can be seen more clearly by contrasting it with a further alternative.

  • (IV) The probability-adjusted moral status (PAMS) approach: In cases of uncertainty about AI sentience, discount the interests the AI has, if it is sentient, by its subjective probability of sentience.

This approach has been championed by Chan (2011) in the context of attributions of moral status in the light of uncertainty about animal sentience. In addition, it is entailed by a decision theory which counsels the maximization of expected value. To see the approach at work, suppose – for instance – that we have a degree of belief of 50% that a given AI is sentient. In this case, we ought – ceteris paribus – to weigh the suffering it would undergo, if it turns out to be sentient, half as much as equivalent harm to beings whose sentience we are certain of.
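Schematically (my own sketch, since the paper states the idea only informally), the PAMS weighting multiplies the harm at stake by the credence in sentience:

```python
def probability_adjusted_weight(credence_sentient: float, harm_if_sentient: float) -> float:
    """Discount the harm an AI would suffer, if sentient, by our credence that it is sentient."""
    return credence_sentient * harm_if_sentient

# With a 50% credence, a given amount of suffering counts half as much as the
# same suffering in a being whose sentience is certain.
print(probability_adjusted_weight(0.5, 100.0))  # 50.0
print(probability_adjusted_weight(1.0, 100.0))  # 100.0
```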

In theory, this alleviates the beneficence worry the precautionary approach is confronted with. For the PAMS approach does not treat all AI systems which have a chance of being sentient as fully morally relevant. Their interests are downweighted in proportion to the probability of sentience. Hence, if we think that most current and foreseeable future AI systems have a very low probability of sentience, it is less clear that the PAMS approach demands massive restrictions on current and future AI research. At the same time, the PAMS approach is sensitive to the stakes, i.e. how much suffering would be caused by treating a type of machine as sentient, and to the probability of sentience. Thus, it conforms to the plausible principle that sometimes low-probability risks can be very important.[10]

The problem for the precautionary approach was how to determine, in a well-founded way, what counts as sufficient grounds for treating a machine’s sentience as uncertain. Similarly, with the PAMS approach, it is not easy to see how we can form non-arbitrary degrees of belief regarding sentience. If we don’t know what counts as good evidence of sentience, what should our credence be that, e.g. current large language models are sentient? Deciding this might feel just like randomly picking a number.

On the other hand, I would imagine that most people would express a credence of 10% or lower that current language models are sentient. Almost everyone would presumably say something below 50%. So, the choice would not be purely random after all. Nevertheless, given that the science of machine sentience cannot give us clear guidance, it seems that proponents of the PAMS approach need an account of how these credences should be set.

We will come back to this open problem later but let us first look at another challenge. Since the suffering at stake is potentially astronomical, the PAMS approach might entail that much of AI research should be prohibited, even if the probabilities of sentience and suffering involved are very small. Thus, one might be tempted to say that the PAMS approach, too, violates the feasibility desideratum.

However, in this case, I would argue that it is not a problem for the approach if it entails that the best course of action is to completely halt AI research. For the approach – in contrast to the precautionary approach or Saad and Bradley’s approach – also gives us various more feasible recommendations. Based on the subjective probability of suffering and the amount of potential suffering at stake, the PAMS approach delivers an ordering of possible actions based on their reduction in expected AI suffering.

It might be the case that the action which promises the highest reduction in AI suffering risk turns out to be relatively infeasible (let’s say, to stop all AI research). However, in this case, we can just move to the next-best feasible action. Thus, even if the favoured intervention of the PAMS approach is (relatively) infeasible, the approach is still applicable. The approach tells us to prioritize stopping the creation of machines where the expected suffering is the highest. In doing this, we can focus on interventions which are feasible. This contrasts with Saad and Bradley’s approach which does not tell us what to do if its primary recommendation turns out to be infeasible.
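A minimal sketch of how the PAMS ordering could be combined with a feasibility filter (entirely my own illustration; the action names, numbers and threshold are hypothetical):

```python
# Candidate interventions: expected reduction in AI suffering (credence of
# sentience times suffering averted, in arbitrary units) and a feasibility score.
actions = [
    {"name": "halt all AI research",          "expected_reduction": 1000.0, "feasibility": 0.01},
    {"name": "regulate one high-risk system", "expected_reduction":  200.0, "feasibility": 0.60},
    {"name": "do nothing",                    "expected_reduction":    0.0, "feasibility": 1.00},
]

# PAMS ordering: rank actions by their expected reduction in AI suffering.
ranked = sorted(actions, key=lambda a: a["expected_reduction"], reverse=True)

# If the top-ranked action is too infeasible, fall back to the next-best feasible one.
FEASIBILITY_THRESHOLD = 0.3
best_feasible = next(a for a in ranked if a["feasibility"] >= FEASIBILITY_THRESHOLD)
print(best_feasible["name"])  # regulate one high-risk system
```

The point is purely structural: even when its top-ranked option is out of reach, the approach still issues a recommendation.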

Thus, to recap, the precautionary and the PAMS approach share the twin problem that they require some way to determine what counts as credible evidence of sentience or suffering, or what an appropriate degree of belief in sentience or suffering is. If this problem were solved, one can debate which approach would be superior. One advantage of the PAMS approach is that it is safe with respect to the feasibility desideratum. On the level of ideal theory, the PAMS approach is also the most beneficent because it is sensitive to differences in the probability of sentience which might often be decision-relevant. Since it takes into account more kinds of relevant information than the precautionary approach, the PAMS approach is superior.

However, one may argue that the precautionary approach can still be useful in practice as a heuristic or action-guiding rule. For instance, it has been argued that obeying the precautionary principle has a good empirical track record in other domains (Steel 2014). Since there is (to my knowledge) no work which investigates empirically whether aiming to follow an expected value approach or a precautionary approach tends to lead to better results, this remains a question for further research.

In the next section, we will first discuss another alternative: the deliberative approach. This will set the stage for my final proposal, namely the explicit deliberation approach. This approach is a refinement of the deliberative approach. However, it should best be used to inform what we should count as relevant evidence of AI sentience or what our subjective credence that a particular AI is sentient should be. Thus, it can and should be joined with the precautionary or the PAMS account.

4. Deliberation and AI suffering

4.1. The deliberative approach

Consider another approach.

  • (V) The deliberative approach: In cases of uncertainty about AI sentience, act in the way which is, or would be, recommended by a group of informed and reasonable persons after a thoughtful deliberative process.

This approach also has many predecessors, for instance in discourse ethics (Habermas 1984; 1987). An approach of this kind is often suggested in domains where scientific experts cannot give us clear guidance, as with debates shrouded in scientific uncertainty or sensitive to disagreement about values. For instance, Alexandrova (2017, ch. 4) proposes that the question of what the correct notion of wellbeing is should be settled by deliberation.

Prima facie, when the objective evidence does not lend itself to one clear decision, it is attractive to defer the decision instead to a reason-guided process. Importantly, for this to be the case, both the participants and the deliberative process as such have to obey certain constraints: First, the participants of the deliberation, be they experts or laypeople, need to have access to the relevant background knowledge and the most important arguments. Second, the participants should generally be open-minded and reasonable. For instance, they should not have conflicts of interest. Third, the deliberative procedure should allow for reasoned and free discussion, e.g. not be structured very hierarchically.

However, a sceptic could be puzzled by the deliberative approach. After all, we granted that the entire scientific community does not know of good indicators of AI suffering and that what we should do with respect to AI suffering risks depends on knowledge about such indicators. Given this, how should outsourcing decisions to such a deliberative process help? Whatever the decision people come up with, it will not be justified because everyone lacks the knowledge that could justify such a decision.

In many other domains, proponents of a deliberative approach have an answer. If the deliberators represent the people affected by the relevant decisions, basing the decision on deliberation can be justified non-instrumentally. For instance, many hold that a democratic political organization is not only justified instrumentally by the assumption that democracies produce the best policies. Instead, many people hold that democratic policies are also non-instrumentally – procedurally – justified because citizens have the right to govern themselves (Christiano and Bajaj 2022).

Yet, this argument does not transfer to a deliberative approach to AI suffering risk. The main stakeholders are not humans, but the AI systems which might suffer and they, particularly when we talk about future systems, cannot participate in the deliberation. A human-made democratic decision cannot legitimize suffering inflicted on other beings, such as machines. Since a non-instrumental justification of the deliberative approach is therefore ruled out, a proponent of the deliberative approach needs to argue that this approach can be expected to produce the best outcomes.

Thus, to justify the deliberative approach, one needs to argue that such a deliberative process produces better results than directly following the recommendations of scientific or other experts. This view might be taken to be in conflict with our epistemic limits. If scientific experts lack knowledge and credible evidence regarding AI sentience, then the participants in the deliberation lack this knowledge as well. If what we should do depends on such knowledge, then there is no reason to think that the course of action arrived at via such a deliberative process will be appropriate. Moreover, there is always the risk that the deliberators overlook crucial arguments, misunderstand a relevant approach or weigh some consideration implausibly high or low.

I reject the deliberative approach as outlined in this section. To be better able to handle the challenge just raised, I will introduce a different version of the deliberative approach.

4.2. The explicit deliberation approach

I side with the following approach.

  • (VI) The explicit deliberation (ED) approach: First, use the method of the deliberative approach to elicit judgements of the subjective probability of sentience for particular kinds of AI systems. Second, make the factors underlying these judgements explicit. Third, use the judgements resulting from deliberation and the explicit factors to mutually inform each other.

The ED approach is related to the deliberative approach. Again, part of the problem is deferred to a group of informed and reasonable people engaging in deliberation. But there are three differences: We established that, given a legitimate process for setting degrees of belief in the sentience of AI systems, the PAMS approach can be used. Thus, the ED approach limits the scope of the deliberation to the project of forming these credences.[11] Besides, since the question of how likely it is that particular kinds of AI systems are sentient is closer to a scientific question than to an ethical or political one, the relevant group of deliberators should consist chiefly of scientific domain experts.

Finally, we don’t stop at collecting the judgements experts make about probabilities of AI sentience but try to work out which considerations underlie their assessments. Recall that the main worry about the deliberative approach was that, whatever their judgements, the deliberators have to base them on some arguments, putative pieces of evidence and so forth. If our worry is that there is no good evidence of AI sentience, then it is hard to see how their judgements can be justified. Moreover, unbeknownst to us, the deliberators might base their judgements on inferences that are invalid, premises that are flawed, theories that are incoherent etc.

The ED approach aims to make the factors which are used in deliberation explicit. There are two complementary ways of achieving this: First, one should ask deliberators about the key reasons for their judgement. Second, by attending the deliberation itself, one can discern which key considerations – in general and for individual deliberators – are put forward.

This strategy has multiple advantages. That is, there are many uses for such an explicit list of factors and their weightings. To partially address the problem of the deliberative approach, the ED approach allows us to scrutinize the reasons underlying the judgements of deliberators. Thus, we can notice inconsistencies, implausible assumptions, conflicts with scientific evidence etc. If such fallacies are detected, we can either downweight the judgement of this deliberator accordingly or estimate what his credence would have been, if not for the errors.

For instance, we may conclude that a deliberator bases his judgements on five kinds of factors, in various weightings: (1) What different theories of consciousness suggest for the probability that a machine is sentient, (2) a specific list of putative behavioural signs of sentience, (3) the machine’s degree of general intelligence (Shevlin 2020), (4) the degree to which a machine is ‘gamed’ towards various putative signs of consciousness and (5) how intuitively and informally ‘impressive’ the feats of the machine appear. Suppose we can clearly say that the deliberator has misconceptions about, e.g. the general intelligence of the AI in question or the scientific support of a particular theory of consciousness. In these cases, we might either conclude that the judgement of the deliberator should receive less weight, especially if the misconceptions have been very impactful, or try to adjust for how these misconceptions influenced the resulting degree of belief.
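To make this concrete, here is a toy sketch (my own illustration, not part of the paper) of how scrutiny of deliberators’ explicit factors might feed into a weighted aggregate credence, which could then serve as input to the PAMS approach:

```python
# Toy aggregation of deliberators' credences that a given AI system is sentient.
# All names, weights and numbers are hypothetical.
deliberators = [
    # credence in sentience, and a reliability weight assigned after scrutinizing
    # the explicit factors behind each judgement
    {"name": "expert_A", "credence": 0.10, "weight": 1.0},  # factors judged coherent
    {"name": "expert_B", "credence": 0.40, "weight": 0.5},  # relied on a misconception, downweighted
    {"name": "expert_C", "credence": 0.05, "weight": 1.0},
]

total_weight = sum(d["weight"] for d in deliberators)
aggregate_credence = sum(d["credence"] * d["weight"] for d in deliberators) / total_weight
print(round(aggregate_credence, 3))  # 0.14, usable as an input to the PAMS approach
```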

Moreover, there is a lot of room for fruitful bootstrapping where the process of deliberation and the list of factors mutually refine and inspire each other. For instance, after having scrutinized which of these factors seem reasonable, we can deliver the refined list of factors as input for future deliberation. We and the deliberators themselves can check whether their credences are consistent with and supported by the list of factors they think judgements of AI sentience should be based on. Also, we can explicitly reflect on the list of factors and try to come up with improvements. Thus, in a virtuous epistemic cycle, deliberation processes can lead to improvements in our list of factors but can also itself be enhanced by such improvements.

How can the explicit deliberation approach satisfy our desiderata? To answer this question, it is instructive to first describe how it differs from a scientific approach. According to a scientific approach, we try to obtain knowledge about AI sentience which we then use to reduce AI suffering risk. Here, the epistemic standards of science are in play. Thus, hypotheses about AI sentience arrived at via scientific approaches need, at all steps, to be supported by relevant empirical evidence or robust theoretical reasoning. I have argued that, given our current epistemic limitations, judgements about the distribution of AI sentience cannot fulfil these epistemic strictures. They don’t have much scientific merit.

When we use the ED approach, we are not after hypotheses which are supported enough to demand scientific recognition. Instead, we want to find the best available procedure to determine reasonable degrees of belief or standards of evidence for using the PAMS or the precautionary approach, respectively. Even in the absence of strong scientific evidence, it makes sense to ask what one’s best guess regarding the sentience of various AI systems should be. Put differently: scientific approaches aim to reduce our uncertainty regarding AI sentience, while the ED approach aims to make it explicit.

Non-systematically asking a few experts for their degrees of belief does not seem to be the best approach. First, there might be much individual variation between the judgements of experts. Second, there is no reason to think that they are well-calibrated (Seidenfeld 1985; Tetlock and Gardner 2015), since scientists usually don’t have any practice in giving probabilistic beliefs or forecasts. Besides, experts often just refuse to assert degrees of belief. Third, there might be certain false assumptions made or relevant considerations missed by individual experts.

The ED approach aims to alleviate these problems. First, it recommends aiming to assemble a group of experts which is diverse along various lines – academic specialization, theoretical commitments as well as some demographic properties – to ensure that few perspectives are neglected. Second, the setting, including discussion and peer criticism, forces deliberators to seriously reflect on how to best make these probabilistic assessments. Third, in virtue of the deliberation as well as making explicit the factors guiding the deliberators’ judgements, missed and false assumptions can be corrected.

These differences explain how the ED approach and the PAMS approach can jointly meet all desiderata. First, they recommend a clear method for determining action: create a deliberative procedure, obeying various constraints, to determine which degrees of belief regarding AI sentience to use. Subsequently, use these degrees of belief as inputs to the PAMS approach. Thus, the combined approach is action-guiding. Second, the approach does not require us to meet scientific standards of rigour, just to elicit best guesses about AI suffering. In other words, the approach does not presuppose that we improve our epistemic situation, only that we get more clarity about it. Therefore, it is consistent with our epistemic situation. Third, since the approach uses a method for eliciting degrees of belief which are – compared to alternatives[12] – well-informed by the available (limited) knowledge about AI suffering and since it employs the PAMS approach, it is beneficent. Fourth, in virtue of the PAMS approach, the approach delivers predictions on where risks of AI suffering are highest. Thus, it enables us to generate an ordering of actions based on their expected reduction of AI suffering. If the otherwise most promising actions turn out to be too infeasible, the approach guides us in choosing which more feasible actions to pursue. This is why the approach does not presuppose the capacity to halt AI research in general, even if it would turn out that this is the best action. Hence, the approach also satisfies the feasibility desideratum. I conclude that the combination of the ED approach and the PAMS approach is the best known method for addressing risks of AI suffering.

5. Conclusion

This paper is based on the observation that we might create artificial systems which can suffer, and that machine suffering could be so extensive that the moral stakes are huge. Thus, we need an approach which tells us what to do about the risk of machine suffering. I argued that such an approach should ideally satisfy four desiderata: beneficence, action-guidance, feasibility and consistency with our epistemic situation. Scientific approaches to AI suffering risk hold that we can improve our scientific understanding of AI, and of AI suffering in particular, to decrease AI suffering risks. However, I argued that such approaches tend to conflict either with the desideratum of consistency with our epistemic situation or with that of feasibility.

Thus, we also need an ethical approach to AI suffering risks. After rejecting anthropocentrism and agnosticism, I expressed sympathy for both the precautionary approach and the probability-adjusted moral status approach, favouring the latter. Nevertheless, my final conclusion was that this approach needs to be complemented by a procedure which makes it possible to form justified degrees of belief regarding the presence of sentience in various AI systems. In this regard, the explicit deliberation approach turned out to be superior to the basic deliberative approach.

Acknowledgement

Open Access funding for this publication was generously provided by the Sentience Institute.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Correction Statement

This article has been corrected with minor changes. These changes do not impact the academic content of the article.

Notes

1 Some people have a more demanding concept of suffering than merely negatively valenced experience (e.g. Agarwal and Edelman 2020). For instance, one might think that very brief pain does not count as suffering (Dennett 1995). Another problem case is a mildly painful experience which essentially involves intense pleasure (Saad and Bradley 2022). These terminological differences do not matter here, as everyone agrees that strongly unpleasant, long-lasting experiences which are regarded as bad by their subjects count as suffering and are pro tanto bad for their subjects.

2 Although for a recent discussion on the normative significance of sentience, see Dung (2022b, forthcoming) and Kammerer (2019, 2022).

3 Moreover, there is not much reason to think that humans are near the upper bound of the capacity for suffering. Some sentient AI systems might have faster conscious experience, and thus experience more unpleasant subjective moments per objective unit of time, and may have more strongly negative states (Saad and Bradley 2022). Although see Višak (2022) for arguments that all animals, and therefore maybe also AI systems, have an equal capacity for welfare.

4 For non-academic discussion of the pros and cons of aiming to slow down AI research in the context of worries about advanced AI systems which might pose a catastrophic or an existential risk to humanity, see Alexander (2022) and Grace (2022).

5 For instance, efforts to prohibit certain kinds of autonomous weapons seem realistic (Müller 2016). In general, it is reasonable to not treat all technological changes as inevitable, but consider which kinds of changes are desirable (Bolte, Vandemeulebroucke, and van Wynsberghe 2022). This is compatible with and complementary to considering which kinds of technological developments can realistically be prevented, and by whom.

6 This limitation might not obtain for the integrated-information theory (IIT) of consciousness, since proponents of IIT typically believe that, in virtue of its axiomatic justification, IIT necessarily holds for all conscious beings (Massimini and Tononi 2018; Negro 2020; Tononi and Koch 2015). However, it is dubious whether IIT’s axiomatic justification is sufficient to support this strong claim (Bayne 2018).

7 Which level of detail is selected might depend on which level of functional organization is most relevant for various mental properties, like intentionality. This is currently controversial (Mandelbaum 2022).

8 See Mandelbaum (2022) for a sceptical view.

9 For a discussion of the costs of building models which are treated as bearers of moral status, see Danaher (forthcoming).

10 Someone might argue that it is a problem that the PAMS approach is fanatical, i.e. that its recommendations can be dominated by arbitrarily small probabilities of sufficiently bad suffering. Since this is a general objection to expected value theory, since expected value theory remains the most popular decision theory despite it, and since the debate is already well developed (Wilkinson 2022), I will not discuss this objection here.

11 Someone who prefers the precautionary approach to the PAMS approach will instead ask deliberators to judge which features constitute sufficiently credible evidence of sentience for it to be appropriate to be uncertain about a machine’s sentience, and thus to apply the precautionary approach.

12 Alternatives would be: determining degrees of belief randomly, asking individual experts, etc.

References

  • Agarwal, A., and S. Edelman. 2020. “Functionally Effective Conscious AI Without Suffering.” Journal of Artificial Intelligence and Consciousness 7 (1): 39–50. doi:10.1142/S2705078520300030.
  • Alexander, S. 2022, August 8. “Why Not Slow AI Progress? [Substack Newsletter].” Astral Codex Ten. https://astralcodexten.substack.com/p/why-not-slow-ai-progress.
  • Alexandrova, A. 2017. A Philosophy for the Science of Well-Being (Vol. 1). Oxford University Press. doi:10.1093/oso/9780199300518.001.0001.
  • Armstrong, S., N. Bostrom, and C. Shulman. 2016. “Racing to the Precipice: A Model of Artificial Intelligence Development.” AI & SOCIETY 31 (2): 201–206. doi:10.1007/s00146-015-0590-y.
  • Bayne, T. 2018. “On the Axiomatic Foundations of the Integrated Information Theory of Consciousness.” Neuroscience of Consciousness 2018 (1), doi:10.1093/nc/niy007.
  • Birch, J. 2017. “Animal Sentience and the Precautionary Principle.” Animal Sentience 2 (16), doi:10.51291/2377-7478.1200.
  • Birch, J. 2022. “The Search for Invertebrate Consciousness.” Noûs 56 (1): 133–153. doi:10.1111/nous.12351.
  • Birch, J., and K. Andrews. 2023. “What has Feelings?” Aeon. https://aeon.co/essays/to-understand-ai-sentience-first-understand-it-in-animals.
  • Birch, J., and H. Browning. 2021. “Neural Organoids and the Precautionary Principle.” The American Journal of Bioethics 21 (1): 56–58. doi:10.1080/15265161.2020.1845858.
  • Birch, J., S. Ginsburg, and E. Jablonka. 2020. “Unlimited Associative Learning and the Origins of Consciousness: A Primer and Some Predictions.” Biology & Philosophy 35 (6): 56. doi:10.1007/s10539-020-09772-0.
  • Block, N. 1981. “Psychologism and Behaviorism.” The Philosophical Review 90 (1): 5. doi:10.2307/2184371.
  • Bolte, L., T. Vandemeulebroucke, and A. van Wynsberghe. 2022. “From an Ethics of Carefulness to an Ethics of Desirability: Going Beyond Current Ethics Approaches to Sustainable AI.” Sustainability 14 (8): Article 8. doi:10.3390/su14084472.
  • Cave, S., and S. S. ÓhÉigeartaigh. 2018. “An AI Race for Strategic Advantage: Rhetoric and Risks.” Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, 36–40. doi:10.1145/3278721.3278780.
  • Chalmers, D. J. 1996. The Conscious Mind: In Search of a Fundamental Theory. Oxford: Oxford University Press.
  • Chalmers, D. J. 2023. “Could a Large Language Model be Conscious?” arXiv:2303.07103. doi:10.48550/arXiv.2303.07103.
  • Chan, K. M. A. 2011. “Ethical Extensionism Under Uncertainty of Sentience: Duties to Non-Human Organisms Without Drawing a Line.” Environmental Values 20 (3): 323–346. doi:10.3197/096327111X13077055165983.
  • Chang, H. 2004. Inventing Temperature: Measurement and Scientific Progress. Oxford University Press. doi:10.1093/0195171276.001.0001.
  • Chappell, R. Y. 2022. “Pandemic Ethics and Status Quo Risk.” Public Health Ethics 15 (1): 64–73. doi:10.1093/phe/phab031.
  • Christiano, T., and S. Bajaj. 2022. “Democracy.” In The Stanford Encyclopedia of Philosophy (Spring 2022), edited by E. N. Zalta. Metaphysics Research Lab, Stanford University. https://plato.stanford.edu/archives/spr2022/entries/democracy/.
  • Coelho Mollo, D. 2022. “Intelligent Behaviour.” Erkenntnis, 1–18. doi:10.1007/s10670-022-00552-8.
  • Crump, A., H. Browning, A. Schnell, C. Burn, and J. Birch. 2022. “Sentience in Decapod Crustaceans: A General Framework and Review of the Evidence.” Animal Sentience 7 (32), doi:10.51291/2377-7478.1691.
  • Danaher, J. 2020. “Welcoming Robots into the Moral Circle: A Defence of Ethical Behaviourism.” Science and Engineering Ethics 26 (4): 2023–2049. doi:10.1007/s11948-019-00119-x.
  • Danaher, J. Forthcoming. “Moral Uncertainty and Our Relationships with Unknown Minds.” Cambridge Quarterly of Healthcare Ethics. https://philarchive.org/rec/DANMUA-2.
  • Dennett, D. C. 1995. “Animal Consciousness: What Matters and Why?” Social Research: An International Quarterly 62: 691–710.
  • Doerig, A., A. Schurger, and M. H. Herzog. 2021. “Hard Criteria for Empirical Theories of Consciousness.” Cognitive Neuroscience 12 (2): 41–62. doi:10.1080/17588928.2020.1772214.
  • Dung, L. 2022a. “Assessing Tests of Animal Consciousness.” Consciousness and Cognition 105: 103410. doi:10.1016/j.concog.2022.103410.
  • Dung, L. 2022b. “Does Illusionism Imply Skepticism of Animal Consciousness?” Synthese 200 (3): 238. doi:10.1007/s11229-022-03710-1.
  • Dung, L. 2022c. “Why the Epistemic Objection Against Using Sentience as Criterion of Moral Status is Flawed.” Science and Engineering Ethics 28 (6): 51. doi:10.1007/s11948-022-00408-y.
  • Dung, L. Forthcoming. “Preserving the Normative Significance of Sentience.” Journal of Consciousness Studies.
  • Godfrey-Smith, P. 2016. “Mind, Matter, and Metabolism.” Journal of Philosophy 113 (10): 481–506. doi:10.5840/jphil20161131034.
  • Grace, K. 2022. “Let’s Think About Slowing Down AI.” EA Forum. https://forum.effectivealtruism.org/posts/vwK3v3Mekf6Jjpeep/let-s-think-about-slowing-down-ai-1.
  • Graziano, M. S. A., A. Guterstam, B. J. Bio, and A. I. Wilterson. 2020. “Toward a Standard Model of Consciousness: Reconciling the Attention Schema, Global Workspace, Higher-Order Thought, and Illusionist Theories.” Cognitive Neuropsychology 37 (3–4): 155–172. doi:10.1080/02643294.2019.1670630.
  • Gunkel, D. J. 2018. Robot Rights. The MIT Press. doi:10.7551/mitpress/11444.001.0001.
  • Habermas, J. 1984. The Theory of Communicative Action. Vol. I: Reason and the Rationalization of Society. Beacon.
  • Habermas, J. 1987. The Theory of Communicative Action. Vol. II: Lifeworld and System. Beacon.
  • Herzog, M. H., M. Esfeld, and W. Gerstner. 2007. “Consciousness & the Small Network Argument.” Neural Networks 20 (9): 1054–1056. doi:10.1016/j.neunet.2007.09.001.
  • Irvine, E. 2012. Consciousness as a Scientific Concept: A Philosophy of Science Perspective. Dordrecht: Springer.
  • Kammerer, F. 2019. “The Normative Challenge for Illusionist Views of Consciousness.” Ergo, an Open Access Journal of Philosophy 6. doi:10.3998/ergo.12405314.0006.032.
  • Kammerer, F. 2022. “Ethics Without Sentience. Facing Up to the Probable Insignificance of Phenomenal Consciousness.” Journal of Consciousness Studies 29 (3–4): 180–204. doi:10.53765/20512201.29.3.180.
  • Knutsson, S., and C. Munthe. 2017. “A Virtue of Precaution Regarding the Moral Status of Animals with Uncertain Sentience.” Journal of Agricultural and Environmental Ethics 30 (2): 213–224. doi:10.1007/s10806-017-9662-y.
  • Kriegel, U. 2019. “The Value of Consciousness.” Analysis 79 (3): 503–520. doi:10.1093/analys/anz045.
  • Ladak, A. 2023. “What Would Qualify an Artificial Intelligence for Moral Standing?” AI and Ethics. doi:10.1007/s43681-023-00260-1.
  • Lau, H. 2022. In Consciousness we Trust: The Cognitive Neuroscience of Subjective Experience. Oxford: Oxford University Press.
  • MacAskill, W., K. Bykvist, and T. Ord. 2020. Moral Uncertainty. 1st ed. Oxford University Press. doi:10.1093/oso/9780198722274.001.0001.
  • Mandelbaum, E. 2022. “Everything and More: The Prospects of Whole Brain Emulation.” The Journal of Philosophy 119 (8): 444–459. doi:10.5840/jphil2022119830.
  • Massimini, M., and G. Tononi. 2018. Sizing up Consciousness (Vol. 1). Oxford University Press. doi:10.1093/oso/9780198728443.001.0001.
  • Metzinger, T. 2021. “Artificial Suffering: An Argument for a Global Moratorium on Synthetic Phenomenology.” Journal of Artificial Intelligence and Consciousness 8 (1): 43–66. doi:10.1142/S270507852150003X.
  • Müller, V. C. 2016. “Autonomous Killer Robots Are Probably Good News.” In Drones and Responsibility: Legal, Philosophical and Socio-Technical Perspectives on the use of Remotely Controlled Weapons, edited by E. D. Nucci and F. S. de Sio, 67–81. London: Ashgate. https://philarchive.org/rec/MLLAKR.
  • Müller, V. C. 2021. “Is it Time for Robot Rights? Moral Status in Artificial Entities.” Ethics and Information Technology 23 (4): 579–587. doi:10.1007/s10676-021-09596-w.
  • Nagel, T. 1974. “What is It Like to Be a Bat?” The Philosophical Review 83 (4): 435–450. doi:10.2307/2183914.
  • Negro, N. 2020. “Phenomenology-first Versus Third-Person Approaches in the Science of Consciousness: The Case of the Integrated Information Theory and the Unfolding Argument.” Phenomenology and the Cognitive Sciences 19 (5): 979–996. doi:10.1007/s11097-020-09681-3.
  • Niikawa, T., Y. Hayashi, J. Shepherd, and T. Sawai. 2022. “Human Brain Organoids and Consciousness.” Neuroethics 15 (1): 5. doi:10.1007/s12152-022-09483-1.
  • Nordgren, A. 2023. “Pandemics and the Precautionary Principle: An Analysis Taking the Swedish Corona Commission’s Report as a Point of Departure.” Medicine, Health Care and Philosophy 26: 163–173. doi:10.1007/s11019-023-10139-x.
  • Nussbaum, M. C. 2007. Frontiers of Justice: Disability, Nationality, Species Membership. Harvard: Harvard University Press.
  • Prinz, J. 2003. “Level-Headed Mysterianism and Artificial Experience.” Journal of Consciousness Studies 10 (4–5): 111–132.
  • Saad, B., and A. Bradley. 2022. “Digital Suffering: Why it’s a Problem and how to Prevent it.” Inquiry 0 (0): 1–36. doi:10.1080/0020174X.2022.2144442.
  • Sandberg, A. 2013. “Feasibility of Whole Brain Emulation.” In Philosophy and Theory of Artificial Intelligence, edited by V. C. Müller, 251–264. Springer. doi:10.1007/978-3-642-31674-6_19.
  • Schukraft, J. 2020. “Comparisons of Capacity for Welfare and Moral Status Across Species.” Rethink Priorities. https://rethinkpriorities.org/publications/comparisons-of-capacity-for-welfare-and-moral-status-across-species.
  • Schwitzgebel, E. 2020. “Is There Something It’s Like to be a Garden Snail.” Philosophical Topics 48 (1): 39–63. doi:10.5840/philtopics20204813.
  • Schwitzgebel, E., and M. Garza. 2015. “A Defense of the Rights of Artificial Intelligences.” Midwest Studies in Philosophy 39 (1): 98–119. doi:10.1111/misp.12032.
  • Searle, J. 2017. “Biological Naturalism.” In The Blackwell Companion to Consciousness. 1st ed., 327–336, edited by S. Schneider and M. Velmans. Wiley. doi:10.1002/9781119132363.ch23.
  • Seidenfeld, T. 1985. “Calibration, Coherence, and Scoring Rules.” Philosophy of Science 52 (2): 274–294. doi:10.1086/289244.
  • Seth, A. K., and T. Bayne. 2022. “Theories of Consciousness.” Nature Reviews Neuroscience 23 (7): Article 7. doi:10.1038/s41583-022-00587-4.
  • Shevlin, H. 2020. “General Intelligence: An Ecumenical Heuristic for Artificial Consciousness Research?” Journal of Artificial Intelligence and Consciousness, doi:10.17863/CAM.52059.
  • Shevlin, H. 2021a. “How Could We Know When a Robot was a Moral Patient?” Cambridge Quarterly of Healthcare Ethics 30 (3): 459–471. doi:10.1017/S0963180120001012.
  • Shevlin, H. 2021b. “Non-Human Consciousness and the Specificity Problem: A Modest Theoretical Proposal.” Mind & Language 36 (2): 297–314. doi:10.1111/mila.12338.
  • Shriver, A. J. 2020. “The Role of Neuroscience in Precise, Precautionary, and Probabilistic Accounts of Sentience.” In Neuroethics and Nonhuman Animals, edited by L. S. M. Johnson, A. Fenton, and A. Shriver, 221–233. Springer International Publishing. doi:10.1007/978-3-030-31011-0_13.
  • Shulman, C., and N. Bostrom. 2021. “Sharing the World with Digital Minds.” In Rethinking Moral Status, edited by S. Clarke, H. Zohny, and J. Savulescu, 306–326. Oxford University Press. doi:10.1093/oso/9780192894076.003.0018.
  • Singer, P. 2011. Practical Ethics. 3rd ed. Cambridge University Press. doi:10.1017/CBO9780511975950.
  • Steel, D. 2014. Philosophy and the Precautionary Principle: Science, Evidence, and Environmental Policy. Cambridge University Press. doi:10.1017/CBO9781139939652.
  • Steele, K., and H. O. Stefánsson. 2020. “Decision Theory.” In The Stanford Encyclopedia of Philosophy (Winter 2020), edited by E. N. Zalta. Metaphysics Research Lab, Stanford University. https://plato.stanford.edu/archives/win2020/entries/decision-theory/.
  • Tetlock, P. E., and D. Gardner. 2015. Superforecasting: The Art and Science of Prediction. New York: Crown.
  • Tomasik, B. 2014. “Do Artificial Reinforcement-Learning Agents Matter Morally?” arXiv:1410.8233 [cs]. http://arxiv.org/abs/1410.8233.
  • Tononi, G., and C. Koch. 2015. “Consciousness: Here, There and Everywhere?” Philosophical Transactions of the Royal Society B: Biological Sciences 370 (1668), doi:10.1098/rstb.2014.0167.
  • Tye, M. 2017. Tense Bees and Shell-Shocked Crabs: Are Animals Conscious? Oxford University Press. doi:10.1093/acprof:oso/9780190278014.001.0001.
  • Višak, T. 2022. Capacity for Welfare Across Species. Oxford: Oxford University Press.
  • Wilkinson, H. 2022. “In Defense of Fanaticism.” Ethics 132 (2): 445–477. doi:10.1086/716869.