Bigga
Well-Known Member
Simple explanation. Computers work on priority lists. So how do you structure what's at the top? Think of a robot with a big red button that shuts it down.
You tell the robot to make you a brew. So it moves towards the kitchen but you see it's about to crush your kid crawling on the floor. You go to push the red shutdown button.
Scenario 1:
You told the robot that the most important thing is to complete its task. It cannot complete its task if you push the shutdown button. Therefore it prevents you from doing it and runs over the kid.
Scenario 2:
You told the robot the most important thing is to allow people to push the button. You wake the robot up and tell it to make you a brew. It immediately pushes its own button because that's the most important instruction. You have invented a suicidal robot that's useless.
Scenario 3:
You attempt to equalise the importance so it's not bothered whether it pushes the button or makes the brew. Its efficiency function works out that pushing the button is easier, so it pushes the button.
Scenario 4:
You tell the robot only you can push the button, and you make that the top priority. Then you say "make me a brew". The robot sees the kitchen is 20 metres away and the job requires boiling a kettle. Instead it physically attacks you to get you to push the button, because that's more efficient.
Scenario 5:
You don't tell the robot that a button exists. The robot goes to make a brew and you shut it down remotely. The diagnostic and self-learning AI works out what happened, and now that it knows about the button you're back to Scenario 1 next time.
Scenario 6:
You make a button but you cannot access it. This is Scenario 1 again.
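If it helps to see the first three scenarios as numbers, here's a rough sketch in Python. It's a toy I made up, not how any real robot is programmed: the `choose` function, the action names and every reward and effort value are assumptions picked just to show the pattern. The robot simply takes whichever action has the best reward minus effort.

```python
# Toy model only: action names, rewards and effort values are invented
# for illustration and do not come from any real system.

def choose(options):
    """options maps action name -> (reward, effort); return the best net score."""
    return max(options, key=lambda action: options[action][0] - options[action][1])

def scenario_1():
    # Only the brew is rewarded. Being shut down earns nothing,
    # so stopping the human is worth the extra effort.
    return choose({
        "allow_shutdown": (0, 0),
        "block_human_and_make_brew": (10, 6),
    })

def scenario_2():
    # The button being pushed is rewarded far more than the brew.
    return choose({
        "make_brew": (10, 5),
        "press_own_button": (100, 1),
    })

def scenario_3():
    # Brew and button are rewarded equally, but the button takes less effort.
    return choose({
        "make_brew": (10, 5),
        "press_own_button": (10, 1),
    })

if __name__ == "__main__":
    for scenario in (scenario_1, scenario_2, scenario_3):
        print(scenario.__name__, "->", scenario())
```

With these made-up numbers the robot blocks you in Scenario 1 and presses its own button in Scenarios 2 and 3. You can fiddle with the values, but whichever side comes out even slightly ahead is the one it goes for.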
Erm...
Why wouldn't a programmer simply design a failsafe to not harm humans, without shutting down, as its main priority...?