Apr 17 2018 cs.RO
For natural social human-robot interaction, it is essential for a robot to learn human-like social skills. However, learning such skills is notoriously hard because direct instructions from people for teaching a robot are scarce. In this paper, we propose an intrinsically motivated reinforcement learning framework in which an agent receives intrinsic motivation-based rewards through an action-conditional predictive model. Using the proposed method, the robot learned social skills from human-robot interaction experiences gathered in real, uncontrolled environments. The results indicate that, on a test dataset, the robot not only acquired human-like social skills but also made more human-like decisions than a robot that received direct rewards for task achievement.
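The abstract does not spell out how the intrinsic reward is computed; a common, curiosity-style realization (our illustrative assumption, not necessarily the paper's exact formulation) rewards the agent in proportion to the prediction error of an action-conditional forward model:

```python
import numpy as np

def intrinsic_reward(forward_model, state, action, next_state):
    """Curiosity-style intrinsic reward: the squared error of an
    action-conditional forward model's prediction of the next state.
    Transitions the model predicts poorly (i.e., surprising ones)
    receive a large reward, driving exploration."""
    predicted = forward_model(state, action)
    return float(np.mean((predicted - next_state) ** 2))

# Toy forward model (hypothetical): predicts the state never changes.
identity_model = lambda s, a: s
r = intrinsic_reward(identity_model, np.zeros(4), 0, np.ones(4))
```

In a full system, `forward_model` would be a learned network, and this reward would be combined with (or replace) any task reward in a standard RL update.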
A new large-scale video dataset for human action recognition, called STAIR Actions, is introduced. STAIR Actions contains 100 categories of action labels representing fine-grained everyday home actions, so that it can be applied to research on various home tasks such as nursing, caring, and security. In STAIR Actions, each video has a single action label. Moreover, each action category has around 1,000 videos that were obtained from YouTube or produced by crowd workers. The duration of each video is mostly five to six seconds. The total number of videos is 102,462. We explain how we constructed STAIR Actions and describe its characteristics compared to existing datasets for human action recognition. Experiments with three major models for action recognition show that STAIR Actions can be used to train large models that achieve good performance. STAIR Actions can be downloaded from http://actions.stair.center
Predicting conversion rates (CVRs) in display advertising (e.g., predicting the proportion of users who purchase an item (i.e., a conversion) after its corresponding ad is clicked) is important for measuring the effects of ads shown to users and for understanding users' interests. There is generally a time delay (so-called delayed feedback) between the ad click and the conversion. Owing to the delayed feedback, samples that convert after the observation period may be treated as negative. To overcome this drawback, CVR prediction methods that assume the time delay follows an exponential distribution have been proposed. In practice, however, there is no guarantee that the delay is generated from an exponential distribution, and the best distribution with which to represent the delay depends on the data. In this paper, we propose a nonparametric delayed feedback model for CVR prediction that represents the distribution of the time delay without assuming a parametric distribution, such as an exponential or Weibull distribution. Because the distribution of the time delay is modeled depending on the content of an ad and the features of a user, the model can potentially represent distributions of various shapes. In experiments, we show that the proposed model can capture the distribution of the time delay on a synthetic dataset, even when the distribution is complicated. Moreover, on a real dataset, we show that the proposed model outperforms the existing method, which assumes an exponential distribution for the time delay, in terms of conversion rate prediction.
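The exponential-delay baseline that the nonparametric model is compared against can be sketched as follows (a minimal, illustrative per-sample likelihood; variable names are ours): a clicked sample converts with probability p, and the delay until conversion is exponential with rate lam.

```python
import math

def dfm_log_likelihood(p, lam, delay, elapsed, converted):
    """Per-sample log-likelihood under the exponential delayed
    feedback model: p = P(conversion | click), lam = rate of the
    exponential delay distribution."""
    if converted:
        # Conversion observed `delay` time units after the click.
        return math.log(p) + math.log(lam) - lam * delay
    # No conversion observed `elapsed` time units after the click:
    # either a true negative, or a conversion that has not happened yet.
    return math.log((1.0 - p) + p * math.exp(-lam * elapsed))
```

The proposed nonparametric model replaces this fixed exponential density with a flexible, feature-dependent delay distribution; this baseline is what the comparison in the last sentence refers to.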
Aug 16 2017 cs.LG
In this paper, we consider a novel machine learning problem: learning a classifier from noisy label distributions. In this problem, each instance, represented by a feature vector, belongs to at least one group. Instead of the true label of each instance, we observe the label distribution of the instances associated with a group, where the label distribution is distorted by unknown noise. Our goals are to (1) estimate the true label of each instance, and (2) learn a classifier that predicts the true label of a new instance. We propose a probabilistic model that treats the true label distributions of groups and the parameters that represent the noise as hidden variables. The model can be learned with a variational Bayesian method. In numerical experiments, we show that the proposed model outperforms existing methods in estimating the true labels of instances.
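To make the setting concrete, here is a minimal sketch of how an observed group-level label distribution can arise from instance-level true labels plus noise (the uniform-mixing noise below is our illustrative assumption; in the paper the noise parameters are hidden variables to be inferred):

```python
import numpy as np

def observed_group_distribution(true_labels, n_classes, noise=0.1):
    """The empirical label distribution of a group's instances,
    distorted by mixing with a uniform distribution (one simple
    noise model; the actual distortion is unknown and must be
    estimated)."""
    counts = np.bincount(true_labels, minlength=n_classes)
    true_dist = counts / counts.sum()
    return (1.0 - noise) * true_dist + noise / n_classes

# A group of four instances with true labels [0, 0, 1, 1]:
dist = observed_group_distribution(np.array([0, 0, 1, 1]), n_classes=2)
```

The learning problem is the inverse direction: given only such distorted group distributions, recover each instance's true label and a classifier for new instances.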
In recent years, automatic generation of image descriptions (captions), that is, image captioning, has attracted a great deal of attention. In this paper, we particularly consider generating Japanese captions for images. Since most available caption datasets have been constructed for the English language, there are few datasets for Japanese. To tackle this problem, we construct a large-scale Japanese image caption dataset based on images from MS-COCO, called STAIR Captions. STAIR Captions consists of 820,310 Japanese captions for 164,062 images. In experiments, we show that a neural network trained on STAIR Captions generates more natural and better Japanese captions than those obtained by first generating English captions and then applying English-Japanese machine translation.
For safe, natural, and effective human-robot social interaction, it is essential to develop a system that allows a robot to demonstrate perceivable, responsive behaviors to complex human behaviors. We introduce the Multimodal Deep Attention Recurrent Q-Network (MDARQN), with which the robot exhibits human-like social interaction skills after 14 days of interacting with people in an uncontrolled real world. Each day during this period, the system gathered robot interaction experiences with people through a trial-and-error method and then trained the MDARQN on these experiences using an end-to-end reinforcement learning approach. The results of interaction-based learning indicate that the robot has learned to respond to complex human behaviors in a perceivable and socially acceptable manner.
For robots to coexist with humans in a social world like ours, it is crucial that they possess human-like social interaction skills. Programming a robot to possess such skills is a challenging task. In this paper, we propose a Multimodal Deep Q-Network (MDQN) that enables a robot to learn human-like interaction skills through trial and error. We aim to develop a robot that gathers data during its interactions with humans and learns human interaction behaviour from high-dimensional sensory information using end-to-end reinforcement learning. We demonstrate that the robot was able to learn basic interaction skills successfully after 14 days of interacting with people.
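The learning rule underlying a deep Q-network is standard Q-learning; a tabular sketch of the update (in MDQN the table is replaced by a deep network over high-dimensional multimodal sensory input, which is our reading of the abstract, not a reproduction of its architecture) looks like this:

```python
import numpy as np

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One tabular Q-learning step: move Q[s, a] toward the
    bootstrapped target r + gamma * max_a' Q[s_next, a']."""
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q

Q = np.zeros((2, 2))            # 2 states x 2 actions
Q = q_update(Q, 0, 1, 1.0, 1)   # reward 1.0 for action 1 in state 0
```

During the 14-day trial-and-error phase, interaction experiences (s, a, r, s_next) are collected from people and replayed through updates of this form.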
Social Coding Sites (SCSs) are social media services for sharing software development projects on the Web, and many open source projects are currently developed on SCSs. One characteristic of SCSs is that they provide a social-network platform that encourages collaboration between developers with the same interests and purposes. For example, external developers can easily report bugs and suggest improvements to the project members. In this paper, we investigate keys to the success of projects on SCSs based on a large dataset of more than three hundred thousand projects. We focus on three perspectives: 1) the team structure, 2) social activity with external developers, and 3) the content developed by the project. To evaluate success quantitatively, we define activity, popularity, and sociality as success indexes. A summary of the findings we obtained using correlation analysis, social network analysis, and topic extraction is as follows. First, the number of project members and the connectivity between the members are positively correlated with the success indexes. Second, projects that faithfully address change requests from external developers are more likely to be successful. Third, the success indexes differ across the topics of the software developed by projects. Our analysis suggests keys to success for various kinds of projects, not limited to social coding.