Apr 17 2018 cs.RO
For natural social human-robot interaction, it is essential for a robot to learn human-like social skills. However, learning such skills is notoriously hard because direct instructions from people to teach a robot are rarely available. In this paper, we propose an intrinsically motivated reinforcement learning framework in which an agent receives intrinsic motivation-based rewards through an action-conditional predictive model. Using the proposed method, the robot learned social skills from human-robot interaction experiences gathered in real uncontrolled environments. The results indicate that the robot not only acquired human-like social skills but also made more human-like decisions, on a test dataset, than a robot that received direct rewards for task achievement.
For safe, natural and effective human-robot social interaction, it is essential to develop a system that allows a robot to demonstrate perceivable responsive behaviors to complex human behaviors. We introduce the Multimodal Deep Attention Recurrent Q-Network (MDARQN), with which the robot exhibits human-like social interaction skills after 14 days of interacting with people in an uncontrolled real-world environment. On each of the 14 days, the system gathered robot interaction experiences with people through a trial-and-error method and then trained the MDARQN on these experiences using an end-to-end reinforcement learning approach. The results of interaction-based learning indicate that the robot has learned to respond to complex human behaviors in a perceivable and socially acceptable manner.
For robots to coexist with humans in a social world like ours, it is crucial that they possess human-like social interaction skills. Programming a robot to possess such skills is a challenging task. In this paper, we propose a Multimodal Deep Q-Network (MDQN) to enable a robot to learn human-like interaction skills through a trial and error method. This paper aims to develop a robot that gathers data during its interaction with a human and learns human interaction behaviour from the high-dimensional sensory information using end-to-end reinforcement learning. This paper demonstrates that the robot was able to learn basic interaction skills successfully, after 14 days of interacting with people.
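The trial-and-error learning described above rests on the Q-learning update. The following is a minimal sketch of the standard one-step Q-learning target only, not the paper's multimodal network; the function name `q_learning_targets` and the discount value are assumptions made for illustration.

```python
def q_learning_targets(rewards, next_q_values, terminal, gamma=0.99):
    """Standard one-step Q-learning targets: r + gamma * max_a' Q(s', a').

    rewards:        list of scalar rewards, one per transition
    next_q_values:  list of lists, Q(s', a') for every action a'
    terminal:       list of booleans, True if s' ended the episode
    gamma:          discount factor (hypothetical default)
    """
    targets = []
    for r, q_next, done in zip(rewards, next_q_values, terminal):
        # No bootstrapping from the next state once the episode has ended.
        bootstrap = 0.0 if done else gamma * max(q_next)
        targets.append(r + bootstrap)
    return targets


# Two hypothetical transitions: one ongoing, one terminal.
targets = q_learning_targets(
    rewards=[1.0, 0.0],
    next_q_values=[[0.5, 2.0], [1.0, 3.0]],
    terminal=[False, True],
    gamma=0.9,
)
```

In a deep Q-network, these targets would be regressed against the network's prediction for the action actually taken; that training loop is omitted here.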
We introduce p-equivalence by asymptotic probabilities, a weak almost-equivalence based on zero-one laws in finite model theory. In this paper, we consider the computational complexity of p-equivalence problems for regular languages and provide the following results. First, we establish a robustness property of p-equivalence and give a logical characterization of it. The characterization is useful for deriving algorithms for p-equivalence problems by coupling it with standard results from descriptive complexity. Second, we determine the computational complexities of the p-equivalence problems using the logical characterization. These complexities are the same as those of the (full) equivalence problems. Finally, we apply the proofs for p-equivalence to some generalized equivalences.
Nov 18 2015 cs.RO
Probabilistic completeness is an important property in motion planning. Although it has been established with clear assumptions for geometric planners, the panorama of completeness results for kinodynamic planners is still incomplete, as most existing proofs rely on strong assumptions that are difficult, if not impossible, to verify on practical systems. In this paper, we focus on an important class of kinodynamic planners, namely those that interpolate trajectories in the state space. We provide a proof of probabilistic completeness for these planners under assumptions that can be readily verified from the system's equations of motion and the user-defined interpolation function. Our proof relies crucially on a property of interpolated trajectories, termed second-order continuity (SOC), which we show is tightly related to the ability of a planner to benefit from denser sampling. We analyze the impact of this property in simulations on a low-torque pendulum. Our results show that a simple RRT using a second-order continuous interpolation swiftly finds a solution, while it is impossible for the same planner using standard Bézier curves (which are not SOC) to find any solution.
Oct 13 2015 cs.RO
We propose a method for checking and enforcing multi-contact stability based on the Zero-tilting Moment Point (ZMP). The key to our development is the generalization of ZMP support areas to take into account (a) frictional constraints and (b) multiple non-coplanar contacts. We introduce and investigate two kinds of ZMP support areas. First, we characterize and provide a fast geometric construction for the support area generated by valid contact forces, with no other constraint on the robot motion. We call this set the full support area. Next, we consider the control of humanoid robots using the Linear Pendulum Mode (LPM). We observe that the constraints stemming from the LPM induce a shrinking of the support area, even for walking on horizontal floors. We propose an algorithm to compute the new area, which we call pendular support area. We show that, in the LPM, having the ZMP in the pendular support area is a necessary and sufficient condition for contact stability. Based on these developments, we implement a whole-body controller and generate feasible multi-contact motions where an HRP-4 humanoid locomotes in challenging multi-contact scenarios.
Jan 21 2015 cs.RO
Humanoid robots locomote by making and breaking contacts with their environment. A crucial problem is therefore to find precise criteria for a given contact to remain stable or to break. For rigid surface contacts, the most general criterion is the Contact Wrench Condition (CWC). To check whether a motion satisfies the CWC, existing approaches take into account a large number of individual contact forces (for instance, one at each vertex of the support polygon), which is computationally costly and prevents the use of efficient inverse-dynamics methods. Here we argue that the CWC can be explicitly computed without reference to individual contact forces, and give closed-form formulae in the case of rectangular surfaces -- which is of practical importance. It turns out that these formulae simply and naturally express three conditions: (i) Coulomb friction on the resultant force, (ii) ZMP inside the support area, and (iii) bounds on the yaw torque. Conditions (i) and (ii) are already known, but condition (iii) is, to the best of our knowledge, novel. It is also of particular interest for biped locomotion, where undesired foot yaw rotations are a known issue. We also show that our formulae yield simpler and faster computations than existing approaches for humanoid motions in single support, and demonstrate their consistency in the OpenHRP simulator.
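Conditions (i) and (ii) above can be checked directly from the resultant contact wrench. The sketch below assumes the wrench is expressed at the center of a rectangular contact lying in the plane z = 0 of the contact frame, with half-dimensions `half_length` and `half_width`; these names and the function `check_friction_and_zmp` are hypothetical. The yaw-torque bounds of condition (iii) are not stated in the abstract and are deliberately left out.

```python
import math


def check_friction_and_zmp(force, torque, mu, half_length, half_width):
    """Check conditions (i) and (ii) for a resultant contact wrench.

    force, torque: 3-tuples (x, y, z) in the contact frame, torque taken
                   at the center of the rectangle (assumed convention)
    mu:            Coulomb friction coefficient
    half_length:   half-size of the rectangle along x (hypothetical name)
    half_width:    half-size of the rectangle along y (hypothetical name)

    Condition (iii), the yaw-torque bound, is NOT checked here.
    """
    fx, fy, fz = force
    tx, ty, _tz = torque
    if fz <= 0.0:
        return False  # the contact can only push on the surface
    # (i) Coulomb friction on the resultant force.
    friction_ok = math.hypot(fx, fy) <= mu * fz
    # (ii) ZMP inside the support rectangle. With torque = p x force and
    # p = (px, py, 0), we get tx = py * fz and ty = -px * fz.
    zmp_x = -ty / fz
    zmp_y = tx / fz
    zmp_ok = abs(zmp_x) <= half_length and abs(zmp_y) <= half_width
    return friction_ok and zmp_ok
```

For example, a purely vertical force of 100 N with zero torque passes both checks, while adding a tangential force of 60 N with `mu = 0.5` violates condition (i).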
Nov 18 2014 cs.RO
Path-velocity decomposition is an intuitive yet powerful approach to address the complexity of kinodynamic motion planning. The difficult trajectory planning problem is solved in two separate, simpler, steps: first, find a path in the configuration space that satisfies the geometric constraints (path planning), and second, find a time-parameterization of that path satisfying the kinodynamic constraints. A fundamental requirement is that the path found in the first step should be time-parameterizable. Most existing works fulfill this requirement by enforcing quasi-static constraints in the path planning step, resulting in a significant loss of completeness. We propose a method that enables path-velocity decomposition to discover truly dynamic motions, i.e. motions that are not quasi-statically executable. At the heart of the proposed method is a new algorithm -- Admissible Velocity Propagation -- which, given a path and an interval of reachable velocities at the beginning of that path, computes exactly and efficiently the interval of all the velocities the system can reach after traversing the path while respecting the system's kinodynamic constraints. Combining this algorithm with usual sampling-based planners then gives rise to a family of new trajectory planners that can appropriately handle kinodynamic constraints while retaining the advantages associated with path-velocity decomposition. We demonstrate the efficiency of the proposed method on some difficult kinodynamic planning problems, where, in particular, quasi-static methods are guaranteed to fail.
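The idea of propagating an interval of reachable velocities along a path can be illustrated on a toy model. The sketch below is not the Admissible Velocity Propagation algorithm itself: it assumes constant bounds on the path acceleration and a constant speed limit, whereas the actual method handles general kinodynamic constraints. All names (`propagate_velocity_interval`, `sd_limit`, and so on) are hypothetical.

```python
import math


def propagate_velocity_interval(sd_min, sd_max, path_length,
                                acc_min, acc_max, sd_limit, n_steps=100):
    """Toy interval propagation along a path of given length.

    sd denotes the path velocity ds/dt. Assumed (simplified) constraints:
    acc_min <= d2s/dt2 <= acc_max and 0 <= sd <= sd_limit everywhere.
    Returns (sd_min_end, sd_max_end), or None if the path cannot be
    traversed from any velocity in the starting interval.
    """
    ds = path_length / n_steps
    # Work with x = sd^2, because dx/ds = 2 * d2s/dt2 is then linear.
    lo, hi = sd_min ** 2, sd_max ** 2
    cap = sd_limit ** 2
    if lo > cap:
        return None  # every starting velocity already breaks the limit
    hi = min(hi, cap)
    for _ in range(n_steps):
        lo = max(0.0, lo + 2.0 * acc_min * ds)  # brake as hard as allowed
        hi = min(cap, hi + 2.0 * acc_max * ds)  # accelerate as hard as allowed
        if lo > hi:
            return None  # the reachable interval became empty
    return math.sqrt(lo), math.sqrt(hi)
```

A sampling-based planner could chain such intervals from one path segment to the next and discard a segment whenever the returned interval is empty.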
A new inverse iteration algorithm for computing all the eigenvectors of a real symmetric tridiagonal matrix on parallel computers is developed. Classical inverse iteration uses modified Gram-Schmidt orthogonalization, which is sequential and causes a bottleneck in parallel computing. In this paper, the use of the compact WY representation is proposed in the orthogonalization process of the inverse iteration with the Householder transformation. This change drastically reduces the synchronization cost in parallel computing. The new algorithm is evaluated on both an 8-core and a 32-core parallel computer, and it is shown to be much faster than the classical inverse iteration algorithm in computing all the eigenvectors of matrices whose dimension is several thousand.
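For reference, the classical scheme that the abstract takes as its baseline can be sketched as follows. This is a minimal dense NumPy illustration of inverse iteration with modified Gram-Schmidt, not the proposed compact WY variant. It assumes the eigenvalues are already known, uses a dense solver instead of a tridiagonal one, and the function name, shift size, and iteration count are all hypothetical choices.

```python
import numpy as np


def inverse_iteration_mgs(T, eigenvalues, n_iter=3, seed=0):
    """Classical inverse iteration with modified Gram-Schmidt (MGS).

    T:           real symmetric (tridiagonal) matrix as a dense array
    eigenvalues: precomputed eigenvalues of T
    Returns a matrix whose k-th column approximates the eigenvector
    associated with eigenvalues[k].
    """
    rng = np.random.default_rng(seed)
    n = T.shape[0]
    V = np.zeros((n, len(eigenvalues)))
    for k, lam in enumerate(eigenvalues):
        # Shift slightly off the eigenvalue so the system is not singular.
        shift = lam + 1e-10 * max(1.0, abs(lam))
        A = T - shift * np.eye(n)
        x = rng.standard_normal(n)
        for _ in range(n_iter):
            x = np.linalg.solve(A, x)
            # MGS: remove components along earlier eigenvectors one at a
            # time. This loop is the sequential part the abstract refers to.
            for j in range(k):
                x -= (V[:, j] @ x) * V[:, j]
            x /= np.linalg.norm(x)
        V[:, k] = x
    return V


# Hypothetical test matrix: 8 x 8 tridiagonal with 2 on the diagonal and
# -1 on the off-diagonals.
n = 8
T = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
lam = np.linalg.eigvalsh(T)
V = inverse_iteration_mgs(T, lam)
```

Each new vector must wait for all earlier ones before it can be orthogonalized, which is why this step limits parallel speed-up.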