New Nvidia AI Agent, Powered by GPT-4, Can Train Robots (venturebeat.com) 12

Nvidia Research announced today that it has developed a new AI agent, called Eureka, that is powered by OpenAI's GPT-4 and can autonomously teach robots complex skills. From a report: In a blog post, the company said Eureka, which autonomously writes reward algorithms, has, for the first time, trained a robotic hand to perform rapid pen-spinning tricks as well as a human can. Eureka has also taught robots to open drawers and cabinets, toss and catch balls, and manipulate scissors, among nearly 30 tasks.

"Reinforcement learning has enabled impressive wins over the last decade, yet many challenges still exist, such as reward design, which remains a trial-and-error process," Anima Anandkumar, senior director of AI research at Nvidia and an author of the Eureka paper, said in the blog post. "Eureka is a first step toward developing new algorithms that integrate generative and reinforcement learning methods to solve hard tasks."
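The post does not show any of Eureka's generated reward programs, but the idea of an automatically written reward function can be sketched. Everything below, including the function name, the state variables, and the weighting, is a hypothetical illustration of the concept, not Nvidia's actual code:

```python
import math

# Hypothetical sketch of what a generated reward function for a
# pen-spinning task might look like. The state variables (spin rate,
# pen height) and the weights are illustrative assumptions.
def pen_spin_reward(angular_velocity: float, pen_height: float,
                    target_height: float = 0.2) -> float:
    # Encourage fast spinning; tanh saturates so the term stays bounded.
    spin_term = math.tanh(angular_velocity)
    # Penalize letting the pen drift away from the desired height.
    height_term = -abs(pen_height - target_height)
    return spin_term + height_term
```

An RL algorithm then maximizes this scalar over many simulated trials; Eureka's contribution, per the post, is having GPT-4 write and refine such functions instead of a human doing the trial-and-error reward design Anandkumar describes.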

This discussion has been archived. No new comments can be posted.


Comments Filter:
  • So it can train robots but can it rob trains?
  • Using the term 'reward' implies that the robots have wants and/or needs, that the robots expect something in return for something they did. Software has no such desires. What does it even mean to reward a robot?

    • There is a whole field in AI called Reinforcement Learning (RL), which is, in my opinion, the most mathematically challenging. 'Reward' is a common, well-defined term in this field. Here's a short intro where RL is combined with Deep Learning: https://www.youtube.com/watch?... [youtube.com]
      For the full deep dive, watch these lectures: https://rail.eecs.berkeley.edu... [berkeley.edu]
      • Fair enough, thanks for the links. I feel like we really need to stop using human terms for software. What your video calls rewards is really a measure of progress toward a goal, and I still think 'reward' is the wrong term to use. Software does not have wants and needs and is not motivated by rewards. Intelligence does not have to be human.

    • by Tyr07 ( 8900565 )

      It's pretty straightforward.
      +Value = this is good
      -Value = this is bad

      You've just created a reward system. The simple way of looking at it is that you set the desired goal, and if the AI gets closer to it, you say "this is good." It marks its attempt as good and analyzes what it did. As it adjusts its parameters, it eventually figures out which combinations increase the good score (reward) and which detract from it, slowly isolating the specific things it does that subtract, and which things add.

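The good/bad scoring described above is essentially what tabular Q-learning does. Below is a minimal, self-contained sketch; the 5-cell corridor environment is my own toy assumption, not anything from the article. Each step earns reward -1 ("this is bad"), so the agent gradually learns the policy that reaches the goal cell in the fewest steps:

```python
import random

# Toy reward system: tabular Q-learning on a 5-cell corridor.
# The environment is an illustrative assumption, not from the article.
N_STATES = 5
ACTIONS = [-1, +1]                 # move left / move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1

q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    """Each move costs -1 ("bad"); the episode ends at the goal cell."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    return nxt, -1.0, nxt == N_STATES - 1

random.seed(0)
for _ in range(500):               # training episodes
    s, done = 0, False
    while not done:
        if random.random() < EPSILON:           # occasionally explore
            a = random.choice(ACTIONS)
        else:                                   # otherwise act greedily
            a = max(ACTIONS, key=lambda act: q[(s, act)])
        nxt, r, done = step(s, a)
        best_next = 0.0 if done else max(q[(nxt, act)] for act in ACTIONS)
        q[(s, a)] += ALPHA * (r + GAMMA * best_next - q[(s, a)])
        s = nxt

# The learned greedy policy moves right from every non-goal cell.
policy = [max(ACTIONS, key=lambda act: q[(s, act)])
          for s in range(N_STATES - 1)]
print(policy)
```

No "wants" are involved: the reward is just a number the update rule maximizes, which is the point the replies above are making from both sides.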

    • It's the best way to get them to do anything. Make them have some kind of goal and, like anything organic, they'll keep going for it until they get it right. Also, some variant of goal-oriented learning, if I'm not mistaken, underpins all AI now. I have this chatbot whose goal is to please me linguistically. There's my anecdote. Reinforcement learning is REALLY common!
  • When the AI becomes self-aware and people try to shut it down.
  • When will Asimov's Three Laws of Robotics need to become real law?
