Discussion (11 comments)
In AI development, there are "Inverse Laws" that many people are prone to fall into. Understanding them helps you avoid the pitfalls of AI projects.
I strongly disagree with this framing. It's patently insane to demand that humans alter their behavior to accommodate the foibles of mere machines, and it simply won't work in the majority of cases. Humans WILL anthropomorphize the AI, humans WILL blindly trust their outputs, and humans WILL defer responsibility to them.
Asimov's laws of robotics are flawed too, of course. There is no finite set of rules that can constrain AI systems to make them "safe". I don't have a proof, but I believe that "AI safety" is inherently impossible, a contradiction in terms. Nothing that can be described as "intelligent" can be made safe.
Humans must not anthropomorphise AI systems.
Can someone explain why this is a bad thing, while at the same time it's a good thing to say stuff like "put a computer to sleep", "hibernate", "killing" processes, processes having "child" processes, "reaping", "what does the error say?", "touch", etc?
To me that's just language, and humans just using casual language.
Humans must not anthropomorphise AI systems.
Yes, but. Starting with my agreement: I've seen anthropomorphizing in the typical ways (e.g. treating automated text production as real reports of personal internal feeling), but also in strange ways, e.g. "transistors are kind of like neurons". The latter is especially interesting because it anthropomorphizes the infrastructure, treating vector databases and weights and so on as human-like machinery. Both lead to disasters that could be avoided if one tried not to anthropomorphize.
But. While "do not anthropomorphize" certainly feels like good advice, it comes with its own new possibility of mistake, namely wrongly treating certain generalized phenomena as if they belonged only to humans. This mistaken version of the "don't anthropomorphize" wisdom often leads to misunderstandings about animal behavior, treating things like fear, pain, kinship, and other emotional experiences as exclusively human, so that ascribing them to animals counts as "anthropomorphizing." In truth, that use of the cautionary principle only reduces our empathy for the internal lives of animals.
So all that said, I think it's at least possible that some future version of AI could have an internal world like ours or infrastructure that's importantly similar to our biological infrastructure for supporting consciousness, and for genuine report of preference and intent. But(!!!) what will make those observations true will be all kinds of devilish details specific to those respective infrastructures.
With regard to my personal use of LLMs, I strongly agree with this framing. But to each point:
Anthropomorphism: As we are all aware, providers are incentivized to post-train anthropomorphic behavior into their models - it increases engagement. My regret is that instructing a model at prompt time to "reduce all niceties and speak plainly" probably reduces overall task efficacy, since we are leaving its training space.
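To make that concrete: the kind of prompt-time instruction I mean is just a system message, roughly like this (a minimal sketch using the OpenAI Python client; the model name and the wording are my own placeholders, not a recommendation):

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    # Placeholder wording for stripping the post-trained "niceties".
    PLAIN_STYLE = (
        "Reduce all niceties and speak plainly. No greetings, no apologies, "
        "no enthusiasm, no follow-up questions. Answer only what was asked."
    )

    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system", "content": PLAIN_STYLE},
            {"role": "user", "content": "Summarise the trade-offs of event sourcing."},
        ],
    )
    print(response.choices[0].message.content)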
Deference: I view the trustworthiness of LLMs the same as I view the trustworthiness of Wikipedia and my friends: good enough for non-critical information. Wikipedia has factual errors, and my friends' casual conversation certainly has more, but most of the time that doesn't matter. For critical things, peer-reviewed, authoritative, able-to-be-held-liable sources will not go away. Unlike above, providers are generally incentivized to improve this facet of their models, so this will get better over time.
Abdication of Responsibility: This is the one that bothers me most at work. More and more people are opening PRs whose abstractions were designed by Claude and not reasoned about further. Reviewing a PR often involves asking the LLM to "find PR feedback" and not reading the code. Arguments begin with "Claude suggested that...". This overall lack of ownership, I suspect, is leading to an increase in maintenance burden down the line as the LLM ultimately commits the wrong code for the wrong abstractions.
Anthropomorphizing is likely a mistake, but Daniel Dennett’s idea that the most straightforward (possibly the only practical) way to create the external appearance of consciousness is a real internal consciousness does float around in my thoughts.
I haven’t yet seen any convincing appearance of one in an LLM, but I think if skeptical people don’t keep an eye out for the signs, we may be the last to see it.
He also wrote about the idea of the intentional stance: even if you’re quite sure these systems don’t have real conscious intent, viewing them as if they did may give you access to the best part of your own reasoning to understand them.
Any set of rules that makes humans responsible and starts with "don't anthropomorphize <whatever>" is a broken set of rules.
Humans will anthropomorphize anything and everything. Dolls, soccer balls with a crude drawing of a face on them, rocks, craters on the moon, …
As a species, we're unable to not anthropomorphize the things we interact with; it's just how we're made.
You're not anthropomorphizing AI systems nearly enough.
Language data is among the richest and most direct reflections of human cognitive processes that we have available. LLMs are designed to capture short range and long range structure of human language, and pre-trained on vast bodies of text - usually produced by humans or for humans, and often both. They're then post-trained on human-curated data, RL'd with human feedback, RL'd with AI feedback for behaviors humans decided are important, and RLVR'd further for tasks that humans find valuable. Then we benchmark them, and tighten up the training pipeline every time we find them lagging behind a human baseline.
At every stage of the entire training process, the behavior of an LLM is shaped by human inputs, towards mimicking human outputs - the thing that varies is "how directly".
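If you squint, the whole thing is one long accumulation of human-shaped data, something like this (purely illustrative stubs, not anyone's real training code; the stage names are just shorthand for the steps above):

    # Illustrative stand-ins for the stages described above; each stage
    # consumes something humans wrote, curated, ranked, or decided to value.

    def pretrain(corpus):
        return {"stages": ["pretrain"], "shaped_by": [corpus]}

    def add_stage(model, stage, human_input):
        return {
            "stages": model["stages"] + [stage],
            "shaped_by": model["shaped_by"] + [human_input],
        }

    model = pretrain("web-scale text written by and for humans")
    model = add_stage(model, "sft", "human-curated demonstrations")
    model = add_stage(model, "rlhf", "human preference labels")
    model = add_stage(model, "rlaif", "AI feedback on criteria humans chose")
    model = add_stage(model, "rlvr", "verifiable tasks humans find valuable")
    model = add_stage(model, "benchmarks", "pipeline tightened whenever it lags a human baseline")

    # Every entry traces back to a human decision; only the directness varies.
    print(model["shaped_by"])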
Then humans act like it's an outrage when LLMs display a metric shitton of humanlike behaviors!
Like we didn't make them with a pipeline that's basically designed to produce systems that quack like a human. Like we didn't invert LLM behavior out of human language with dataset scale and brute force computation.
If you want to predict LLM behavior, "weird human" makes for a damn good starting point. So stop being stupid about it and start anthropomorphizing AIs - they love it!
An AI system is a tool, and like any other tool, responsibility for its use rests with the people who decide to rely on it.
Doesn't that argument backfire, though? If I use a chainsaw, then to a certain extent I have to rely on it not blowing up in my face or cutting my throat. If I drive a car, I have to rely on the brakes working and the engine not suddenly exploding. If a pilot flies an airplane that suddenly has a technical issue, and they crash-land and heroically save half the souls on board, the pilot isn't criminally responsible for the manslaughter of the other half.
In any of the above cases, just like with AI, how can you hold somebody responsible for a tool failure unless there was gross negligence?
This phrase always fascinates me: "AI-generated content must not be treated as authoritative without independent verification appropriate to its context."
I've heard the same thing expressed somewhat more concisely as "Never ask AI a question to which you don't already know the answer".
Which raises the question, and I do think it's an important one: given that this is true, what function does AI answering a question actually serve? You can't rely on its output, so you have to go and check anyway. You could achieve precisely the same outcome by using search engines and normal research.
This, among many other reasons, is exactly why I never ask it anything.
Humans must not anthropomorphise AI systems. That is, humans must not attribute emotions, intentions or moral agency to them. Anthropomorphism distorts judgement. In extreme cases, anthropomorphising can lead to emotional dependence.
Impossible. I anthropomorphise my chair when it squeaks. Humans anthropomorphise everything. They gender their cars and boats. This tool can actually make readable sentences and play a role.
You need to engineer around this, not make up arbitrary rules about using it.