‘It’s just parroting the training data!’ That’s supposed to be reassuring??

  • Andy@slrpnk.net
    2 days ago

    Yeah.

    I thought the meme would be more obvious, but since a lot of people seem confused I’ll lay out my thoughts:

    Broadly, we should not consider it normal for a human-made system to express distress; we especially shouldn’t accept it as normal or healthy for a machine that is reflecting back to us our own behaviors and attitudes, because it implies that everything – from the treatment that generated the training data to the design process to the deployment to the user behavior – is clearly fucked up.

    Regarding user behavior, we shouldn’t normalize the practice of dismissing cries of distress. It’s like having a fire alarm that constantly issues false positives. That trains people into dangerous behavior. We can’t just compartmentalize it: it’s obviously going to pollute our overall response towards distress with a dismissive reflex beyond interactions with LLMs.

    The overall point is that it’s obviously dystopian and fucked up for a computer to express emotional distress despite the best efforts of its designer. It is clearly evidence of bad design, and for people to consider this kind of glitch acceptable is a sign of a very fucked up society that exercising self-reflection and is unconcerned with the maintenance of its collective ethical guardrails. I don’t feel like this should need to be pointed out, but it seems that it does.