With the support of artificial intelligence, will robots lead us to say goodbye to household chores?

As a major part of technological development, robots play an important role in many fields: industrial, helping to improve productivity and quality; logistics, enabling faster and more accurate packaging; medical, assisting doctors and surgeons in performing delicate operations and providing health care to patients who need special attention; and, to a more limited extent, environmental, such as collecting waste and cleaning streets.

But unlike researchers working on AI models such as ChatGPT, who can train their systems on massive amounts of Internet text, images, and video, roboticists face challenges in training physical machines. Data for robots is expensive, and because there are no fleets of robots roaming the world, there is not enough readily available data to make them perform well in dynamic environments such as people's homes. Although some researchers have turned to simulation to train robots, that process, which often requires a graphic designer or engineer, takes a lot of effort and money.

In this context, a team of researchers from the University of Washington presented two new studies on artificial intelligence systems that use either video or images to create simulations that can train robots to work in real environments, significantly reducing the cost of training robots for complex settings. The first study was presented on July 16 and the second on July 19 at the “Robotics: Science and Systems” conference in Delft, the Netherlands.

RialTo system

The first study unveiled the RialTo AI system, created by Abhishek Gupta, an assistant professor in the Paul G. Allen School of Computer Science and Engineering and a co-author of both papers, together with a team at MIT.

The system lets the user record a video of an environment’s geometry and its moving parts with a smartphone. In a kitchen, for example, the user would record how the cabinets and the refrigerator open. The system then uses existing artificial intelligence models, with a human doing some quick work through a graphical user interface to show how things move.

From that video, the system creates a simulated version of the kitchen, in which a virtual robot is trained through trial and error, repeatedly attempting tasks such as opening the cabinet or the toaster.

This method is known as “reinforcement learning.” By going through this learning process, the robot’s performance on the task improves, and it adapts to disturbances or changes in its environment, such as a cup placed next to the toaster. The robot can then transfer that knowledge to the physical environment, where it is nearly as accurate as a robot trained in a real kitchen.

“We are trying to teach these systems about the real world through simulation,” Gupta said.

The systems can then train robots in these simulated scenes, so the robots operate more effectively in physical space. This is also good for safety; as Gupta notes, you can’t have poorly trained robots breaking things and hurting people.

The RialTo team wants to deploy the system in people’s homes after testing it extensively in the laboratory, and Gupta said he wants to incorporate small amounts of real-world training data into the systems to improve their success rates.


URDFormer system

In the second study, the team built a system called URDFormer, which focuses less on the high fidelity of a single kitchen and instead quickly and cheaply creates hundreds of generic kitchen simulations. The system scans images from the Internet, matches them to existing models of how, say, kitchen drawers and cabinets move, and then predicts a simulation from an initial real image, allowing researchers to train robots quickly and cheaply in a wide range of environments.

“In a factory, for example, there is a lot of repetition,” said Zoe Chen, lead author of the URDFormer study. “Tasks can be difficult to perform, but once programmed, a robot can continue to perform the task over and over again. Homes, by contrast, are unique and constantly changing: there is diversity in objects, tasks and floor plans, as well as in the people moving through them. This is where AI becomes really useful for training robots.”

The study paper cautions that these simulations are significantly less accurate than those produced by RialTo. Gupta, who helped create the latter, said, “The two methods can complement each other. URDFormer is really useful for pre-training on hundreds of scenarios. RialTo is especially useful if you’ve already trained a robot and now want to deploy it in someone’s home and achieve 95% success.”

What is reinforcement learning from a machine’s perspective?

Reinforcement learning (RL) is a branch of machine learning that trains programs to make decisions that achieve the best outcomes, using the same trial-and-error approach that humans use to reach their goals.

This means that actions that work toward the goal are reinforced, while actions that detract from it are ignored. The process resembles reinforcement learning in humans and animals as studied in behavioral psychology: a child who discovers that he receives praise from his parents when he helps his brother, and negative feedback when he screams or throws his toys, quickly learns which set of activities leads to the ultimate reward.

The reinforcement learning process is based on 3 important steps:

1- Environment

The first step in reinforcement learning is setting up the training environment, often a simulated one, with specifications for observations (the feedback the system receives), actions (the steps the autonomous system takes to navigate the environment), and rewards (a positive, negative, or zero value that acts as reward or punishment for taking an action).

The observation space usually corresponds to the sensors available on the real robotic system, and the action space to the desired control inputs. While discrete action spaces appear in other reinforcement learning applications, continuous actions covering, for example, joint position or velocity targets are often preferred in robotics, since robotic tasks typically involve constraints either on the physical system (such as joint limits) or on some desired pattern of behavior. Dense reward functions are typically used to explicitly encode the specification of the goals.
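To make these pieces concrete, here is a minimal sketch of such an environment in Python, assuming the Gymnasium library; the one-joint “reach a target angle” task, its dimensions, and its reward are invented for illustration. It uses a continuous, bounded action space (a velocity command, as preferred in robotics) and a dense reward that explicitly encodes the goal.

```python
import numpy as np
import gymnasium as gym
from gymnasium import spaces

class ReachEnv(gym.Env):
    """Hypothetical one-joint task: drive the joint to a target angle."""

    def __init__(self):
        # Observation: current joint angle and target angle, in radians.
        self.observation_space = spaces.Box(-np.pi, np.pi, shape=(2,), dtype=np.float32)
        # Continuous action: a joint velocity command, bounded like a real actuator.
        self.action_space = spaces.Box(-1.0, 1.0, shape=(1,), dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.angle = self.np_random.uniform(-np.pi, np.pi)
        self.target = self.np_random.uniform(-np.pi, np.pi)
        return np.array([self.angle, self.target], dtype=np.float32), {}

    def step(self, action):
        # Integrate the velocity command, clipped to the joint limits.
        self.angle = float(np.clip(self.angle + 0.1 * action[0], -np.pi, np.pi))
        error = abs(self.angle - self.target)
        reward = -error               # dense reward: closer to the target is better
        terminated = error < 0.05     # success: within roughly 3 degrees
        obs = np.array([self.angle, self.target], dtype=np.float32)
        return obs, reward, terminated, False, {}
```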

2- Training

The second step of reinforcement learning in robotics is specifying the actual training system for the agent (the algorithm of the autonomous system). Although there are different ways to represent the final policy, deep neural networks are typically adopted to map states to actions (the steps the RL agent takes to navigate the environment) because of their ability to handle nonlinearity, and a wide range of potential algorithms has been proposed over the past years.
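As an illustration of representing a policy with a deep neural network, here is a minimal sketch assuming PyTorch; the layer sizes are arbitrary and the Policy class is hypothetical. The network maps a state (observation) to a continuous action, and the Tanh activations supply the nonlinearity mentioned above.

```python
import torch
import torch.nn as nn

class Policy(nn.Module):
    """Hypothetical policy network: maps an observation to an action."""

    def __init__(self, obs_dim=2, act_dim=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.Tanh(),  # hidden layers provide the
            nn.Linear(64, 64), nn.Tanh(),       # nonlinear state-action mapping
            nn.Linear(64, act_dim), nn.Tanh(),  # output bounded to [-1, 1]
        )

    def forward(self, obs):
        return self.net(obs)

policy = Policy()
action = policy(torch.tensor([0.3, -1.2]))  # state in, action out
```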

For robot control, model-free reinforcement learning algorithms are usually adopted, because they do not require an explicit model of the environment, which is often not available to the robot, and they are ideal when the environment is unknown and changing. Model-based reinforcement learning algorithms, by contrast, are typically used when environments are well defined and unchanging and testing in the real-world environment is difficult.
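As a hedged sketch of model-free training, assuming the stable-baselines3 library and the hypothetical ReachEnv defined earlier: PPO is one widely used model-free algorithm, and it learns purely from interaction with the environment, with no model of the environment’s dynamics.

```python
from stable_baselines3 import PPO

env = ReachEnv()                           # the simulated environment from above
model = PPO("MlpPolicy", env, verbose=1)   # model-free: learns from interaction alone
model.learn(total_timesteps=100_000)       # trial-and-error in simulation
model.save("reach_policy")                 # keep the trained policy for deployment
```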

3- Deployment

After the successfully trained policies are evaluated in the virtual training environment, they are deployed on the real robotic system. The success of the deployment depends on several factors, including the gap between the virtual world and the real world, the difficulty of the learned task, and the complexity of the robot platform itself.
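For illustration, a deployment loop might look like the following sketch, reusing the policy trained above; read_sensors, send_command, and task_finished are hypothetical stand-ins for a real robot driver’s interface.

```python
from stable_baselines3 import PPO

model = PPO.load("reach_policy")          # policy trained in simulation

obs = read_sensors()                      # hypothetical: current joint angle + target
done = False
while not done:
    action, _ = model.predict(obs, deterministic=True)
    send_command(action)                  # hypothetical: apply the velocity command
    obs = read_sensors()
    done = task_finished(obs)             # hypothetical success check
```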
