Topic Tag: video


 ClickBAIT: Click-based Accelerated Incremental Training of Convolutional Neural Networks

  

Today’s general-purpose deep convolutional neural networks (CNNs) for image classification and object detection are trained offline on large static datasets. Some applications, however, will require training in real time on live video streams with a human in the loop. We refer to this class of…


 Shared Learning : Enhancing Reinforcement in Q-Ensembles

  

Deep Reinforcement Learning has been able to achieve amazing successes in a variety of domains from video games to continuous control by trying to maximize the cumulative reward. However, most of these successes rely on algorithms that require a large amount of data to train in order to obtain resu…


 Robust Physical-World Attacks on Deep Learning Models

   

Although deep neural networks (DNNs) perform well in a variety of applications, they are vulnerable to adversarial examples resulting from small-magnitude perturbations added to the input data. Inputs modified in this way can be mislabeled as a target class in targeted attacks or as a random class …


 End-to-End United Video Dehazing and Detection

 

The recent development of CNN-based image dehazing has revealed the effectiveness of end-to-end modeling. However, extending the idea to end-to-end video dehazing has not been explored yet. In this paper, we propose an End-to-End Video Dehazing Network (EVD-Net), to exploit the temporal consistency…


 Build your own Machine Learning Visualizations with the new TensorBoard API

  

Posted by Chi Zeng and Justine Tunney, Software Engineers, Google Brain Team. When we open-sourced TensorFlow in 2015, it included TensorBoard, a suite of visualizations for inspecting and understanding your TensorFlow models and runs. TensorBoard included a small, predetermined set of visualization…


 Robust Emotion Recognition from Low Quality and Low Bit Rate Video: A Deep Learning Approach

 

Emotion recognition from facial expressions is tremendously useful, especially when coupled with smart devices and wireless multimedia applications. However, the inadequate network bandwidth often limits the spatial resolution of the transmitted video, which will heavily degrade the recognition rel…


 Recurrent Ladder Networks

  

We propose a recurrent extension of the Ladder networks whose structure is motivated by the inference required in hierarchical latent variable models. We demonstrate that the recurrent Ladder is able to handle a wide variety of complex learning tasks that benefit from iterative inference and tempor…


 A multi-agent reinforcement learning model of common-pool resource appropriation

  

Humanity faces numerous problems of common-pool resource appropriation. This class of multi-agent social dilemma includes the problems of ensuring sustainable use of fresh water, common fisheries, grazing pastures, and irrigation systems. Abstract models of common-pool resource appropriation based …


 Human Pose Forecasting via Deep Markov Models

 

Human pose forecasting is an important problem in computer vision with applications to human-robot interaction, visual surveillance, and autonomous driving. Usually, forecasting algorithms use 3D skeleton sequences and are trained to forecast for a few milliseconds into the future. Long-range forec…


 Multi-label Class-imbalanced Action Recognition in Hockey Videos via 3D Convolutional Neural Networks

  

Automatic analysis of video is one of the most complex problems in the fields of computer vision and machine learning. A significant part of this research deals with human activity recognition (HAR), since humans and the activities they perform generate most of the video semantics. Video-ba…


 Fast Image Processing with Fully-Convolutional Networks

 

We present an approach to accelerating a wide variety of image processing operators. Our approach uses a fully-convolutional network that is trained on input-output pairs that demonstrate the operator’s action. After training, the original operator need not be run at all. The trained network …


 Expressions in Virtual Reality

   

Posted by Steven Hickson, Software Engineering Intern, and Nick Dufour and Avneesh Sud, Software Engineers, Machine Perception. Recently, Google Machine Perception researchers, in collaboration with Daydream Labs and YouTube Spaces, presented a solution for virtual headset ‘removal’ for mixed realit…


 Shrinking data for surgical training

Technique that reduces video files to one-tenth their initial size enables speedy analysis of laparoscopic procedures. By Larry Hardesty | MIT News Office


 Roboschool

We are releasing Roboschool: open-source software for robot simulation, integrated with OpenAI Gym. Three control policies running on three different robots, racing each other in Roboschool. You can re-enact this scene by running agent_zoo/demo_race…


 Computer learns to recognize sounds by watching video

Machine-learning system doesn’t require costly hand-annotated data. By Larry Hardesty | MIT News Office