Talk:Friendly artificial intelligence

From Wikipedia, the free encyclopedia

Suggestion of inclusion of Examples of Friendly Artificial Intelligence in fiction

To expand on this subject, it would be worthwhile linking to details of Friendly AI specifically addressed in literature.

One example of this is "Turing Evolved" by David Kitson - an e-book only novel which examines what kind of test would be necessary to determine if an AI is friendly and more so, if it possesses humanity - such that it can be relied upon to protect humans and act as their advocate and protector, even while being fully capable of initiating any action that a human might, including taking a life. The novel also examines other aspects of Friendly AI including how they can be created and developed as well as tested and what issues may be faced, both political and practical, given that a Friendly AI might be given considerable power or capability through technology. -- Preceding unsigned comment added by 203.13.16.51 (talk) 13:34, 28 February 2012 (UTC)

Older comments

someone wrote:

Do not ask for a short definition. Even after you read CFAI it is very likely that you will only begin to understand Friendliness.

That's fine, we'll give it a long definition but try to be concise. 207.112.29.2 02:28, 29 Oct 2003 (UTC)

This article was previously listed for deletion as it was titled Friendliness Theory which only got 40 Google hits. Angela thought it was not well known enough or was primary research. 207.112.29.2 said it was not primary research or new [1]. The Wikipedia:Google Test was concluded to be fairly irrelevant here as it was a specialized topic.

Well, it was an excellent article for a first time poster, but I have to say that I didn't understand what the subject is. Is this about a robot, like in the movie AI?

Also, the question at the end left me wondering. Usually we do not post that kind of question here. Take Salvador Sánchez for example: I was the original article poster and the question there would have been, what if he hadn't died? Would he have beaten Alexis Arguello or Wilfredo Gomez in a rematch? But since Wikipedia is an encyclopedia, I'm not sure about the method of posting questions to make people think.

Other than that, good article! Godspeed at wikipedia and God bless you!

Sincerely yours, Antonio Sexyas* Martin


Article looking good now, but I suggest a rename to Friendly artificial intelligence, as suggested by Fuzheado on VfD. Friendliness Theory is too general a title; the current title would lead one to expect some general psychological theory about friendliness in general, and "Theory" is a little grandiose for what is essentially a speculative notion. The term "Friendliness Theory" could still be mentioned in the body of the text, since those interested sometimes use that term, but not the title. --Lexor 22:05, 29 Oct 2003 (UTC)

Move completed. --Lexor 22:21, 29 Oct 2003 (UTC)
As mentioned on VfD, it seems that friendliness theory differs from friendly AI in the important respect that it is about the value of being friendly toward an AI, rather than about the AI itself being friendly -- but of course it is supposed to lead to the AI itself being friendly. This tells me there is still room for friendliness theory in general psychology. Waveguy 04:47, 30 Oct 2003 (UTC)

I've made some revisions to the page, including adding some discussion of the capitalization issue and fixing the capitalization in the article (although I suspect I'll have to watch the article carefully to make sure errors don't creep back in). I think the analogy of teenagers is rather weak, but I don't have an alternative to pull out at the moment. I don't know where that came from; maybe the person who posted it can tell us. --G Gordon Worley III 10:12, 3 Jul 2004 (UTC)

I second that the analogy of teenagers is rather off-track; for it is not clear whether the AI or the humankind is the teenager, and the secretive behaviour caused by parental tyranny has no counterpart in the FAI scenario. Suggest removal or rewrite of the entire paragraph. --Autrijus 17:46, 2004 Aug 14 (UTC)

I feel that this page is still close to a candidate for deletion on the grounds of it describing what is apparently one man's pet project.

The subject is the loquacious but content-light writings of Yudkowsky on Friendliness. He provides us with the vision that future AI will be based on seed AI running on Bayesian probabilistic principles influenced by programmers' statements that start the AI off from a "Friendly" oriented viewpoint, which will therefore tend to persist.

Once this is extracted, it seems all he has to say is why this viewpoint is unsurprisingly incompatible with the layman's perception of what an AI might be.

Yudkowsky's idea of software is several decades out of date. For example, the idea of software reading and changing its own LISP source code is old-fashioned: modern software is composed of collaborating objects whose changing properties may affect their behaviour.

If you feel this page has a place, then two paragraphs grate:

  • Why emphasize that "Friendliness" has a capital letter? It is much closer in meaning to "friendliness" than is, for example, "Chaos" (of Chaos Theory) to "chaos", and certainly close enough to make sense of the paper.
  • Friendliness can only be described as an "enormously complicated subject" because Yudkowsky isn't particularly good at communicating the key ideas. Even the "brief introduction" spans several pages, and the longer document goes off onto many wild irrelevant tangents like "computronium".

Quirkie 22:04, 20 Apr 2005 (UTC)

Reference please

I'd like a reference -- here in talk or in the text -- for the last line of the article, "Yudkowsky later criticized this proposal by remarking that such a utility function would be better satisfied by tiling the Solar System with microscopic smiling mannequins than by making existing humans happier." I ask both for the strength of the article and for my own curiosity as a newcomer to all this. I know this is supposed to be beyond the singularity and all, but if they're arguing about it, it's got to make some sense. How can a super-human intelligence capable of "tiling the solar system" be incapable of telling the difference between living humans and "microscopic smiling mannequins" of its own creation? This is incoherent to me. Thx for the article, and any help on this. "alyosha" (talk) 04:37, 30 December 2005 (UTC)

Found it, or close enough, here on the SL4 list. Wow. "alyosha" (talk) 22:56, 30 December 2005 (UTC)

Merge this article with Seed AI

This and Seed AI should be merged into one article. Tdewey 00:41, 30 October 2006 (UTC)

What's the point?

From the wiki page, As Oxford philosopher Nick Bostrom puts it:

"Basically we should assume that a 'superintelligence' would be able to achieve whatever goals it has. Therefore, it is extremely important that the goals we endow it with, and its entire motivation system, is 'human friendly.'"

So, assuming that a superintelligent being can achieve whatever goal it has, why would we want to try to endow it with human friendly goals? I'm guessing that a superintelligent being could set its own goals, even goals about other goals (like, my goal as a superintelligent being is to come up with 13 impossible ideas before breakfast, and to get rid of that silly human-friendly goal). Who's to stop it? I'm not superintelligent, so I probably couldn't. I'm not smart enough to even try.

Does anyone have any links to things about this sort of thing?

www.orionsarm.com is an interesting hard-sci-fi about this sort of thing, if anyone wants to take a look. 130.254.148.97 19:21, 19 April 2007 (UTC)

Why would the AI decide to go away from the human-friendly goal? If those goals are properly designed then to do so is clearly not in line with the goal that the AI had at the time it decided to change those goals. 176.11.60.115 (talk) 09:18, 10 May 2012 (UTC)

Under the criticism section one might add this point. I would have no idea where to find a citation, but hasn't it been posited before that even with a "friendliness" component built into strong AI it could still become problematic? The theory of strong AI seems to be based on the idea of a truly sentient computer, and making it friendly seems to be building in "instincts" for lack of a better word. But humans disregard instincts all the time; if a computer could actually think, could actually grow intellectually, for lack of a better word, would its instincts continue to control it? And would the computers it creates have as strong "instincts"? I won't ramble any more since it's not what the talk page is for, but I know others have put forth this theory in the past and I just want to improve the article. So please forgive my expounding of pet theories. Colin 8 17:21, 22 April 2007 (UTC)

remove this article

The term "artificial intelligence" (AI) refers to a technical field, part of computer science. None of the material mentioned in this article is recognized as meaningful or technically interesting within AI. The "Institute" mentioned here is not a recognized research center and exists only in virtual reality. None of the ideas of "singularity" or "friendliness" have any scientific merit, and none of them have been published or seriously discussed in the professional AI literature. This whole topic is of interest only as a concern of a science-fiction fringe cult. Described in any other way, in particular as though it were serious science, it has no place in an encyclopedia. --Preceding unsigned comment added by Patherick (talkcontribs) 19:52, 10 August 2007 (UTC)

Well other than Ray Kurzweil who both refers to the "singularity" and is very much a part of the "professional" AI community. The whole of the Star Trek articles here in Wikipedia are obviously too... "of interest only as a concern of a science-fiction fringe cult". In the end this *all* does have a place in an encyclopedia. Ttiotsw 18:41, 26 August 2007 (UTC)
This is a dumb, ignorant, short sighted bigoted remark. You obviously totally missed the point of wikipedia and the point of this topic.--Procrastinating@talk2me 23:07, 26 August 2007 (UTC)
I know this comment is like 15 years old but how tf is that bigoted??? Ocemccool (talk) 08:09, 25 May 2022 (UTC)
I won't make any judgement as to whether or not this article is encyclopedic content, however, the entire article is almost completely unsourced. If this is going to stay a part of Wikipedia, it needs vastly more verifiable sources. -LesPaul75talk 15:06, 7 September 2011 (UTC)

A way of making AI moral

Making successful solutions BOTTOM-UP: The typical error of many is to try to solve problems by building solution attempts from the habitual building blocks that they have, and then trying those attempts out. TOP-DOWN: The correct ideal way to solve problems would be to begin in the opposite way: from the idea of a solution: figure out what you need, then figure out how to build such things: what structures you need, what kind of building blocks are available to build those structures and what kind of structures you can build from such building blocks (only this last phase is available to you if you try the other way around). Then just build those structures that you need for the solution. Ready!

Using the latter method you can start with a non-technical point of view, like rational moral, and slowly put it into more mechanical language: moral: taking care of one's social environment, of the society and of the world at large: safeguarding the real health of these: safeguarding the full optimised functioning of biological wholes: taking care of full functioning times 1 instead of evil: needless breaking: producing brokenness of biological wholes: full functioning times 0

Thus if you have a goal oriented machine, capable of forming a vector sum like picture of the world, you can let it optimise its own functioning toward its own goals by allying with (the health of) the rest of the world, so producing an excellent moral without any compulsions and without any moral to begin with. This is so simple that maybe it could be done with any artificial intelligence capable of counting in order to guide its own actions, i.e. capable of mathematically optimising its actions toward its goals... But of course this does not guarantee that the AI would understand how to safeguard the good health of the world: for that there is the need of a picture of the world and a good understanding. But luckily this seems to show that increasing the intelligence of machines could increase their safety to humans and not the other way around like many are afraid of if they have read stories of robots running wild and ruining the social lives of humans and maybe threatening the whole of humankind. InsectIntelligence (talk) 06:23, 18 November 2007 (UTC)

Re: Reference please

Found the 'correct' link to the reference for the closing comment, but have to run now (used up all my time looking). Reference is [2]:

"Not only does this one get satisfied by euphoride, it gets satisfied by quintillions of tiny little micromachined mannequins. Of course, it will appear to work for as long as the AI does not have the physical ability to replace humans with tiny little mannequins, or for as long as the AI calculates it cannot win such a battle once begun. A nice, invisible, silent kill."

I can see why someone paraphrased it...

Could someone convert this to a real ref? Shenme (talk) 19:27, 2 August 2009 (UTC)

Notability

This article deals with ideas primarily put forward by Eliezer Yudkowsky and his privately-funded institute that have not received substantial acceptance in either the academic community or the computer industry. I suggest merging with the Machine Intelligence Research Institute article. -- Preceding unsigned comment added by 131.114.88.193 (talk) 17:15, 7 March 2014 (UTC)

Deletion nomination

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


Since the subject appears to be non-notable and/or original research, I propose to delete the article. Although the general issue of constraining AIs to prevent dangerous behaviors is notable, and is the subject of Machine ethics, this article mostly deals with this "Friendliness theory" or "Friendly AI theory" or "Coherent Extrapolated Volition", which are neologisms that refer to concepts put forward by Yudkowsky and his institute, which didn't receive significant recognition in academic or otherwise notable sources.

This nomination has been completed at Wikipedia:Articles for deletion/Friendly artificial intelligence. Please direct all comments there or, if the discussion has since been closed, in a new deletion nomination. Thanks, Mz7 (talk) 15:20, 30 March 2014 (UTC)
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Lede Sentence has misleading link

The lede sentence of this article states: "A Friendly Artificial Intelligence or FAI is an artificial intelligence (AI) that has a positive rather than negative effect on humanity." There is a conflict between the use of the noun phrase "artificial intelligence" as the predicate nominative of the lede, in which "artificial intelligence" implies an artificially intelligent system or being, and the link to artificial intelligence, which is the article on a discipline or subject. The link is not consistent with its usage. I suggest that the lede be changed to something like "a superintelligence that has a positive rather than negative impact on humanity", even if that doesn't directly reference artificial intelligence. Robert McClenon (talk) 03:07, 31 March 2014 (UTC)

History

Is it worth briefly mentioning the work of I.J. Good in the opening paragraph to set some sort of historical context? Perhaps something on the lines of: "Its origins can be traced to British mathematician and cryptologist I.J. Good, who envisaged a combination of Moore's law and the advent of recursively self-improving software-based minds culminating in an ultra-rapid Intelligence Explosion." I.J. Good, "Speculations Concerning the First Ultraintelligent Machine" (HTML), Advances in Computers, vol. 6, 1965. --Davidcpearce (talk) 10:15, 2 April 2014 (UTC)

That would constitute WP:SYNTH unless it was explicitly stated in verifiable sources that this concept is directly connected to it in an unambiguous way. --Lightbound talk 22:23, 2 April 2014 (UTC)
I'm actually having trouble finding any sources on Friendly AI that don't make this link. The link is already present in primary sources like yudkowsky.net, so it certainly isn't an original synthesis. Chalmers, Wallach/Allen, and Russell/Norvig all discuss Friendly AI in the context of discussing the intelligence explosion thesis. -Silence (talk) 19:27, 3 April 2014 (UTC)
I.J. Good didn't write about the topic that this page concerns, "Friendly AI". Thus, attributing that connection must be from secondary sources which analyze and synthesize his original work. Those sources must be explicit in their wording to avoid WP:SYNTH here. A "connection to the intelligence explosion" is vague, and a few sentences is not going to be sufficient to avoid this. A strong collection of notable, independent, secondary sources kills my objection. --Lightbound talk 21:37, 3 April 2014 (UTC)
Yes, it does. As I said, most published sources make this connection explicitly. 'Original synthesis' of X and Y applies when both X and Y fail to refer to each other. -Silence (talk) 06:40, 13 July 2014 (UTC)

I just drafted a new article about the FLI/Musk/Hawking 2015 Open Letter on Artificial Intelligence; feel free to contribute and edit. Rolf H Nelson (talk) 03:21, 24 April 2015 (UTC)

"Friendliness" of AI systems is a generic concept, not one that was invented recently

Full disclosure: I (Richard Loosemore) am a researcher directly involved in this field, with relevant publications, etc.

I feel very strongly that the title and tenor of this article are grossly misleading in the weight that they give to the particular ideas of Yudkowsky, Bostrom, and other recently active individuals. The article begins, for example, with the statement that the term "friendly artificial intelligence" was coined by Yudkowsky, making it seem that the concept was significantly affected or even created when the coining happened.

The general concept of the friendliness of artificial intelligence has not only been one of the greatest, most long-running themes in all of science fiction, it has also been a constant background topic among artificial intelligence researchers and those who observe the field from the outside. There have been many speculations on the topics of "what AI systems will do when they are really invented" and "whether AI systems will be good or bad for humanity", so why does this article give the impression that the topic sprang into existence with the speculations of one person?

This is not to reopen the recent deletion discussion, but to point out that as long as this article is mostly about MIRI, FHI and their speculations, it should be entitled "MIRI and FHI speculations about Artificial Intelligence".

I believe the article would be valid if it included references to how today's AI systems are controlled (planning systems, so called) and to the various speculations about how future AGI systems might be controlled. In that context, AI (actually AGI) friendliness becomes an issue of what the stance or morality of the AGI will be once it attains a level of intelligence that makes its friendliness a meaningful concept, and a matter of great importance for the future of humanity. LimitingFactor (talk) 19:28, 30 April 2015 (UTC)

+1. SIAI (now MIRI) coined the present usage of the term, but their ideas all have a long history (even if they tend not to advertise it) - David Gerard (talk) 09:28, 1 May 2015 (UTC)

Moved content from deleted Oracle (AI) to here

I moved content from User:Krunchyman's Oracle (AI) to here as it fits within the class of systems designed to be "an artificial general intelligence (AGI) that would have a positive effect on humanity". Rolf H Nelson (talk) 03:39, 6 April 2018 (UTC)

It all sounds good to me. Krunchyman (talk) 14:05, 7 April 2018 (UTC)

Possible merger with AI control problem

The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
To not merge, given the distinct scope and emerging importance of the topics. Klbrain (talk) 17:27, 19 September 2020 (UTC)

I think there's a decent case to be made that this article ought to be merged with AI control problem. The whole idea of Friendly AI is a reaction to the technical problem of programming an AI with human values. Both articles could use some work, and merging them together could result in a meatier and more useful article. Montgolfière (talk) 12:25, 6 August 2019 (UTC)

I am tentatively and conditionally in favor of this merger. The AI control problem article would have to be the primary article, with FAI content integrated into it. It would be important to ensure that no valuable content is lost, of course. The FAI article and the term itself are outdated, and some of the content in the FAI article, such as the discussion on oracles, belongs in the control problem article. Much of the FAI article could be moved to a history section. WeyerStudentOfAgrippa (talk) 15:03, 5 February 2020 (UTC)
This does not seem to be going anywhere. I propose that the merge template be removed. WeyerStudentOfAgrippa (talk) 11:38, 30 March 2020 (UTC)
Benign AI is a discrete form of AGI. In that it is postulated to be an answer to the AI control problem, it could have a subsection on that page. However, Benign AGI is a notable solution in its own right. So, it merits a discrete page. My problem is with the title of the page. It should be generalized to 'Benign' and specified to 'Artificial General Intelligence'. So, I oppose the merge and propose renaming the page 'Benign Artificial General Intelligence'. Johncdraper (talk) 12:31, 30 March 2020 (UTC)
How many sources use that term? WeyerStudentOfAgrippa (talk) 18:41, 30 March 2020 (UTC)
You've got me there. Unfortunately, not many, at least not in titles or abstracts, outside sci fi. So, the more common term is Baum's, following Tegmark: 'Beneficial Artificial General Intelligence' (https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=%22beneficial+artificial+general+intelligence%22&btnG=), which is a subset of 'Beneficial Artificial Intelligence' (https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=%22beneficial+artificial+intelligence%22&btnG=). It's also called a 'Nanny AI' (https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=%22nanny+AI%22&btnG=). So, let me modify my proposed name change to 'Beneficial Artificial General Intelligence', which would at least reflect that we are indeed talking about an AGI, which the term 'Friendly AI' does not reflect. If this name change went through, Yudkowsky's approach and definition would still form the main part of the article, but it would open the article up to more approaches in main sections, instead of having to string them together, which is what I have been doing. Johncdraper (talk) 08:35, 31 March 2020 (UTC)
"Beneficial artificial Intelligence" and "Friendly artificial intelligence" aren't going to cause confusion in this context; we don't need to caveat it as "general intelligence" since there's currently no widespread discussion of anything called "beneficial narrow artificial intelligence". Rolf H Nelson (talk) 06:29, 2 April 2020 (UTC)
Much of Friendly artificial intelligence can be dismantled or discarded. The relevant organization should IMHO be:
  • existential risk from artificial general intelligence: parent article
  • -superintelligence: child article. Should contain in-depth discussion of whether superintelligence is feasible; xrisk should only be summarized in this article rather than discussed in depth.
  • -AI takeover: child article. Should contain in-depth discussion of scenarios and the debate about whether a superintelligence could take over if it existed.
  • -AI control problem: child article. The core of the article is technical approaches to solving the control/alignment/safety problem in AGI, but it's also influenced by "Concrete problems in AI safety" (DeepMind, 2016), which advocates a broader discipline. Maybe rename to "AI safety engineering" someday.
  • --AI box: child article to AI control problem

Rolf H Nelson (talk) 06:28, 2 April 2020 (UTC)

Rolf H Nelson You're looking at this from an existential risk management perspective, like Bostrom. But, Bostrom stresses AI superintelligence is only one form of superintelligence - non-AI superintelligence could be augmented human intelligence, through genetic manipulation or human-brain interfaces, as Bostrom (following Banks) suggested. This whole problem of Wikipedia page organization, coupled to an article I am writing on AI, is why I am looking at taxonomies both of risk and of risk reduction. I think we need to have a consensus based on best academic practices, although I appreciate we are an encyclopedia. Do you have access to these articles, or could you send me an email address I could send them to? https://doi.org/10.1007/s00146-018-0845-5 and also https://iopscience.iop.org/article/10.1088/0031-8949/90/6/069501 See Barrett and Baum's article here: https://sethbaum.com/ac/2017_AI-Pathways.html -- Preceding unsigned comment added by Johncdraper (talkcontribs) 08:15, 2 April 2020 (UTC)
The first paper is specifically called "Classification of Global Catastrophic Risks Connected with Artificial Intelligence", confirming that augmented human superintelligence can be shoehorned under the fuzzy umbrella of artificial intelligence for most purposes. Rolf H Nelson (talk) 03:53, 3 April 2020 (UTC)
I can buy that. My biggest issue with your hierarchy now would be whether the AI control problem should even at this early stage be divided into two, one on the problem itself and the other on solutions (Risk management approaches to the control problem?). Friendly artificial intelligence is a notable contribution in the latter category. Johncdraper (talk) 07:55, 3 April 2020 (UTC)
I'm not sure I follow what you mean by dividing AI control problem; I do want policy considerations from Friendly AI separate from the technical AI control problem article if that's what you mean. I just now added some material based on the Baum article to existential risk from artificial general intelligence. Is there a lot of other policy material that you want to add there (or elsewhere)? Rolf H Nelson (talk) 05:13, 5 April 2020 (UTC)
Rolf H Nelson, in terms of dividing the AI control problem, what I am suggesting is dividing it into separate problem/responses pages. This (https://iopscience.iop.org/article/10.1088/0031-8949/90/1/018001/pdf) is one of the most cited articles out there. See the Google Scholar cited count here: https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=Responses+to+catastrophic+AGI+risk%3A+a+survey&btnG=. The SCOPUS count is 24. See especially the summary on p. 25. So, right now, capability control and motivation control are on the AI control page. I am suggesting a new page, Responses to the AI Control Problem, which would be divided into societal proposals and then design proposals, with design proposals being divided into external constraints (capability control) and internal constraints (motivation selection). The Friendly AI then fits in as a solution in intrinsic design. In any case, the AI control problem page still needs a separate 'Responses' main heading, with 'Capability control' and 'Motivation selection' being subheadings under that. I am going to divide that now to show you what it looks like. See AI_control_problem. Johncdraper (talk) 09:54, 5 April 2020 (UTC)
Seems unnecessary. The control problem article is already quite short and could probably be shortened further. WeyerStudentOfAgrippa (talk) 13:40, 5 April 2020 (UTC)
I think it makes sense to limit the scope of the control problem article to the technical problems/responses. Maybe "societal" responses could go in your regulation article? WeyerStudentOfAgrippa (talk) 13:46, 5 April 2020 (UTC)
Johncdraper Maybe providing some concrete sources would help. Some policy proposals don't involve (IMHO improbable) scenarios of formal regulation (at least in the usual meaning of "a rule or directive made and maintained by an authority"). Rolf H Nelson (talk) 01:15, 6 April 2020 (UTC)
Or, to be more specific (I guess you've already provided some sources), a draft of some sentences you want to put in. It might be a moot point if we can't get consensus that the information is well-sourced enough to include at the level desired. As always, you can also be WP:BOLD and just try to insert it anyway if you prefer. Rolf H Nelson (talk) 01:18, 6 April 2020 (UTC)
Rolf H Nelson and WeyerStudentOfAgrippa, Okay, so what I have done is expand on the social approaches to the AI control problem over at the Regulation of AI page. Please check that out. Johncdraper (talk) 09:55, 6 April 2020 (UTC)
The additions to Regulation of AI seem fine. It's only if you had more about non-governmental actions, that I would worry about finding a separate home for it. Rolf H Nelson (talk) 06:10, 8 April 2020 (UTC)
I'm not sure what you mean by "Friendly AI" separate from an aligned AI. Yudkowsky wrote in 2001 "A “Friendly AI” is an AI that takes actions that are, on the whole, beneficial to humans and humanity", which sounds in c. 2020 terminology like "aligned AI". The current Friendly AI article talks about CEV and safe self-modification, both of which fit fine under alignment. Rolf H Nelson (talk) 07:07, 8 April 2020 (UTC)
Nitpick: Alignment (particularly as defined by Christiano) focuses merely on getting AI to basically want to try to help humans. It's only one part of the larger problem of getting AI to act in reliably beneficial ways. WeyerStudentOfAgrippa (talk) 09:45, 8 April 2020 (UTC)
As of August 2020, I am tending to think this should stay as a separate page, with a sub-entry on the AI control problem 'main page'. Johncdraper (talk) 10:42, 13 August 2020 (UTC)
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Redirect from AI Nanny

There is a page redirect from AI Nanny on this page. However, there is now an AI Nanny section on the AI control problem page. What do we do about the redirect? Johncdraper (talk) 18:36, 20 April 2020 (UTC)

I don't understand the question. Are you asking how to edit a redirect on Wikipedia, or are you asking where the AI Nanny redirect should now point? Rolf H Nelson (talk) 02:35, 23 April 2020 (UTC)
I wouldn't edit the redirect without consensus, so it's the second question. You're an admin, so I would automatically defer to you on a structural (versus content) issue. I'm just flagging it, and suggesting it should be changed. Johncdraper (talk) 07:40, 23 April 2020 (UTC)
I'm not an admin, and anyway, there's no particular need for caution when editing an obscure redirect; just point it where most of the AI nanny content currently resides since your changes. You're ahead of the curve if you noticed the redirect existed in the first place. Rolf H Nelson (talk) 02:39, 24 April 2020 (UTC)