2019
@conference{Setti2019ismr,
title = {A Multirobots Teleoperated Platform for Artificial Intelligence Training Data Collection in Minimally Invasive Surgery},
author = {Francesco Setti and Elettra Oleari and Alice Leporini and Diana Trojaniello and Alberto Sanna and Umberto Capitanio and Francesco Montorsi and Andrea Salonia and Riccardo Muradore},
editor = {IEEE},
doi = {10.1109/ISMR.2019.8710209},
isbn = {978-1-5386-7825-1},
year = {2019},
date = {2019-05-09},
pages = {1-7},
abstract = {Dexterity and perception capabilities of surgical robots may soon be improved by cognitive functions that can support surgeons in decision making and performance monitoring, and enhance the impact of automation within the operating rooms. Nowadays, the basic elements of autonomy in robotic surgery are still not well understood and their mutual interaction is unexplored. The current classification of autonomy encompasses six basic levels: Level 0: no autonomy; Level 1: robot assistance; Level 2: task autonomy; Level 3: conditional autonomy; Level 4: high autonomy; Level 5: full autonomy. The practical meaning of each level and the necessary technologies to move from one level to the next are the subject of intense debate and development. In this paper, we discuss the first outcomes of the European funded project Smart Autonomous Robotic Assistant Surgeon (SARAS). SARAS will develop a cognitive architecture able to make decisions based on pre-operative knowledge and on scene understanding via advanced machine learning algorithms. To reach this ambitious goal, which will allow us to attain Levels 1 and 2, it is of paramount importance to collect reliable data to train the algorithms. We will present the experimental setup to collect the data for a complex surgical procedure (Robotic Assisted Radical Prostatectomy) on very sophisticated manikins (i.e. phantoms of the inflated human abdomen). The SARAS platform allows the main surgeon and the assistant to teleoperate two independent two-arm robots. The data acquired with this platform (videos, kinematics, audio) will be used in our project and will be released (with annotations) for research purposes.},
keywords = {Artificial Intelligence, Cognitive control, Computer Science, Laparoscopy, machine learning, Robot, Robotic surgery, Surgery, Teleoperation},
pubstate = {published},
tppubtype = {conference}
}
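The abstract above enumerates the six-level classification of autonomy in robotic surgery (Level 0 to Level 5). As a minimal sketch, not part of the SARAS platform, those levels could be encoded as an enumeration for annotation or logging code; the class and member names are hypothetical.

from enum import IntEnum

class AutonomyLevel(IntEnum):
    NO_AUTONOMY = 0           # Level 0: no autonomy
    ROBOT_ASSISTANCE = 1      # Level 1: robot assistance
    TASK_AUTONOMY = 2         # Level 2: task autonomy
    CONDITIONAL_AUTONOMY = 3  # Level 3: conditional autonomy
    HIGH_AUTONOMY = 4         # Level 4: high autonomy
    FULL_AUTONOMY = 5         # Level 5: full autonomy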
@conference{Hernansanz2019cras,
title = {A physical/virtual platform for hysteroscopy training},
author = {Albert Hernansanz and Martínez and Rovira and Alicia Casals},
editor = {CRAS 2019},
doi = {10.5281/zenodo.3373297},
year = {2019},
date = {2019-03-21},
booktitle = {Proceedings of the 9th Joint Workshop on New Technologies for Computer/Robot Assisted Surgery},
abstract = {This work presents HysTrainer (HT), a training module for hysteroscopy, which is part of the generic endoscopic training platform EndoTrainer (ET). This platform merges both technologies, with the benefits of having a physical anatomic model and computer assistance for augmented reality and objective assessment. Further to the functions of a surgical trainer, EndoTrainer provides an integral education, training and evaluation platform.},
keywords = {Computer Science, Endoscopy, Laparoscopy, Robot, Robotic surgery, Surgery, Surgical robots, Training},
pubstate = {published},
tppubtype = {conference}
}
2018
@conference{Roberti2018iecon,
title = {An energy saving approach to active object recognition and localization},
author = {Andrea Roberti and Riccardo Muradore and Paolo Fiorini and Marco Cristani and Francesco Setti},
editor = {IECON 2018},
doi = {10.1109/IECON.2018.8591411},
year = {2018},
date = {2018-10-21},
organization = {Annual Conference of the IEEE Industrial Electronics Society (IECON), Washington, DC, USA},
abstract = {We propose an active object recognition (AOR) strategy explicitly suited to work with a real robotic arm. So far, AOR policies on robotic arms have focused on heterogeneous constraints, most of them related to classification accuracy, classification confidence, number of moves, etc., discarding the physical and energetic constraints a real robot has to fulfill. Our strategy addresses this discrepancy with a POMDP-based AOR algorithm that explicitly considers manipulability and energetic terms in the planning optimization. The manipulability term prevents the robotic arm from encountering singularities, which require expensive and straining backtracking steps; the energetic term deals with the arm's gravity compensation in static conditions, which is crucial in AOR policies where time is spent on the classifier belief update before making the next move. Several experiments have been carried out on a redundant, 7-DoF Panda arm manipulator on a multi-object recognition task. This allows us to appreciate the improvement of our solution with respect to competitors evaluated only in simulation.},
keywords = {Active object recognition, Artificial Intelligence, Computer Science, Learning, Object recognition, Pattern Recognition, POMDP, Robotics},
pubstate = {published},
tppubtype = {conference}
}
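The entry above describes a POMDP planner whose objective adds manipulability and energetic terms to the usual recognition-driven criteria. The following is a minimal Python sketch of how such a next-view score could be assembled; it is not the authors' implementation, and the weights, helper names and the use of the Yoshikawa manipulability index are assumptions.

import numpy as np

def manipulability(jacobian: np.ndarray) -> float:
    # Yoshikawa manipulability index: larger values keep the arm away from singularities.
    return float(np.sqrt(np.linalg.det(jacobian @ jacobian.T)))

def view_score(info_gain: float, jacobian: np.ndarray, gravity_torques: np.ndarray,
               w_manip: float = 0.5, w_energy: float = 0.1) -> float:
    # Reward expected recognition gain and manipulability; penalise the static
    # gravity-compensation effort the arm must hold while the belief is updated.
    energy_cost = float(np.sum(np.abs(gravity_torques)))
    return info_gain + w_manip * manipulability(jacobian) - w_energy * energy_cost

A planner would evaluate view_score for every reachable candidate viewpoint and move the arm to the highest-scoring one before the next observation.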
@conference{Roberti2018bmvc,
title = {Recognition self-awareness for active object recognition on depth images},
author = {Andrea Roberti and Marco Carletti and Francesco Setti and Umberto Castellani and Paolo Fiorini and Marco Cristani},
editor = {British Machine Vision Conference (BMVC), Newcastle upon Tyne, UK (bmvc2018.org)},
note = {Spotlight presentation, 2% acceptance rate},
url = {http://bmvc2018.org/contents/papers/0593.pdf},
doi = {10.5281/zenodo.3362923},
year = {2018},
date = {2018-09-06},
organization = {BMVC 2018},
abstract = {We propose an active object recognition framework that introduces the recognition self-awareness, which is an intermediate level of reasoning to decide which views to cover during the object exploration. This is built first by learning a multi-view deep 3D object classifier; subsequently, a 3D dense saliency volume is generated by fusing together single-view visualization maps, these latter obtained by computing the gradient map of the class label on different image planes. The saliency volume indicates which object parts the classifier considers more important for deciding a class. Finally, the volume is injected in the observation model of a Partially Observable Markov Decision Process (POMDP). In practice, the robot decides which views to cover, depending on the expected ability of the classifier to discriminate an object class by observing a specific part. For example, the robot will look for the engine to discriminate between a bicycle and a motorbike, since the classifier has found that part as highly discriminative. Experiments are carried out on depth images with both simulated and real data, showing that our framework predicts the object class with higher accuracy and lower energy consumption than a set of alternatives.},
keywords = {3D object classifier, Artificial Intelligence, Computer Science, Object exploration, Object recognition, POMDP},
pubstate = {published},
tppubtype = {conference}
}
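The entry above builds a 3D saliency volume by fusing single-view gradient maps and then injects it into the POMDP observation model. Below is a hedged Python sketch of the fusion step only, assuming a voxel grid around the object and a hypothetical project_voxel callable that maps a voxel to its pixel in a given view (or None if not visible); it is not the authors' code.

import numpy as np

def fuse_saliency_volume(view_maps, project_voxel, grid_shape):
    # view_maps: list of (H, W) saliency/gradient maps, one per viewpoint
    # project_voxel(view_index, voxel_index) -> (row, col) pixel, or None if occluded
    volume = np.zeros(grid_shape, dtype=np.float64)
    counts = np.zeros(grid_shape, dtype=np.int64)
    for v, smap in enumerate(view_maps):
        for idx in np.ndindex(grid_shape):
            pix = project_voxel(v, idx)
            if pix is not None:
                volume[idx] += smap[pix]
                counts[idx] += 1
    volume /= np.maximum(counts, 1)            # average over the views that see each voxel
    return volume / max(volume.max(), 1e-12)   # normalise so the volume can weight observations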
@article{Singh2018tpnet,
title = {Predicting action tubes},
author = {Gurkirt Singh and Suman Saha and Fabio Cuzzolin},
editor = {ECCV 2018 Workshop on Anticipating Human Behaviour (AHB 2018), Munich, Germany, Sep 2018},
url = {http://openaccess.thecvf.com/content_ECCVW_2018/papers/11131/Singh_Predicting_Action_Tubes_ECCVW_2018_paper.pdf},
doi = {10.5281/zenodo.3362942},
year = {2018},
date = {2018-08-23},
abstract = {In this work, we present a method to predict an entire `action tube' (a set of temporally linked bounding boxes) in a trimmed video just by observing a smaller subset of it. Predicting where an action is going to take place in the near future is essential to many computer vision based applications such as autonomous driving or surgical robotics. Importantly, it has to be done in real-time and in an online fashion. We propose a Tube Prediction network (TPnet) which jointly predicts the past, present and future bounding boxes along with their action classification scores. At test time TPnet is used in a (temporal) sliding window setting, and its predictions are put into a tube estimation framework to construct/predict the video long action tubes not only for the observed part of the video but also for the unobserved part. Additionally, the proposed action tube predictor helps in completing action tubes for unobserved segments of the video. We quantitatively demonstrate the latter ability, and the fact that TPnet improves state-of-the-art detection performance, on one of the standard action detection benchmarks - J-HMDB-21 dataset.},
note = {Proceedings of the ECCV 2018 Workshop on Anticipating Human Behaviour (AHB 2018), Munich, Germany, Sep 2018},
keywords = {Artificial Intelligence, Computer Science, Computer vision, Object recognition, Pattern Recognition, Robot, Robotics},
pubstate = {published},
tppubtype = {article}
}
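The entry above uses TPnet in a temporal sliding-window setting to extend an action tube into the unobserved part of the video. A rough sketch of that outer loop is given below, with tpnet standing in for a trained model that returns present boxes, predicted future boxes and class scores for a window of frames; this is an assumption-laden illustration, not the released code.

from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2)

def predict_tube(frames: List, tpnet: Callable, window: int = 8) -> List[Box]:
    # Observe the video one window at a time; keep confirmed boxes for observed
    # frames and append provisional future-box predictions for unobserved frames.
    tube: List[Box] = []
    for t in range(0, len(frames) - window + 1):
        present_boxes, future_boxes, scores = tpnet(frames[t:t + window])
        tube = tube[:t] + list(present_boxes)   # boxes for frames t .. t+window-1
        tube += list(future_boxes)              # provisional boxes beyond the window
    return tube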
@proceedings{Singh2018tramnet,
title = {TraMNet - Transition Matrix Network for Efficient Action Tube Proposals},
author = {Gurkirt Singh and Suman Saha and Fabio Cuzzolin},
url = {https://arxiv.org/abs/1808.00297},
year = {2018},
date = {2018-08-01},
abstract = {Current state-of-the-art methods solve spatio-temporal action localisation by extending 2D anchors to 3D-cuboid proposals on stacks of frames, to generate sets of temporally connected bounding boxes called action micro-tubes. However, they fail to consider that the underlying anchor proposal hypotheses should also move (transition) from frame to frame, as the actor or the camera do. Assuming we evaluate n 2D anchors in each frame, the number of possible transitions from each 2D anchor to the next, for a sequence of f consecutive frames, is in the order of O(n^f), expensive even for small values of f. To avoid this problem we introduce a Transition-Matrix-based Network (TraMNet) which relies on computing transition probabilities between anchor proposals while maximising their overlap with ground truth bounding boxes across frames, and enforcing sparsity via a transition threshold. As the resulting transition matrix is sparse and stochastic, this reduces the proposal hypothesis search space from O(n^f) to the cardinality of the thresholded matrix. At training time, transitions are specific to cell locations of the feature maps, so that a sparse (efficient) transition matrix is used to train the network. At test time, a denser transition matrix can be obtained either by decreasing the threshold or by adding to it all the relative transitions originating from any cell location, allowing the network to handle transitions in the test data that might not have been present in the training data, and making detection translation-invariant. Finally, we show that our network is able to handle sparse annotations such as those available in the DALY dataset, while allowing for both dense (accurate) or sparse (efficient) evaluation within a single model. We report extensive experiments on the DALY, UCF101-24 and Transformed-UCF101-24 datasets to support our claims.},
keywords = {Computer Science, Computer vision, Electrical Engineering, Image processing, Pattern Recognition, Robot, Robotics, Systems Science, Visual processing},
pubstate = {published},
tppubtype = {proceedings}
}
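The entry above restricts the O(n^f) anchor-transition hypothesis space with a sparse, stochastic transition matrix estimated from ground-truth overlaps. The Python sketch below illustrates that estimation step under simplified assumptions (a flat list of anchors, per-track lists of ground-truth boxes); it is not the TraMNet training code.

import numpy as np

def iou(a, b):
    # Intersection-over-union of two (x1, y1, x2, y2) boxes.
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-12)

def transition_matrix(anchors, gt_tracks, threshold=0.05):
    # Count how often the best-matching anchor moves from i (frame t) to j (frame t+1)
    # along ground-truth tracks, then threshold and row-normalise the counts.
    n = len(anchors)
    counts = np.zeros((n, n))
    for track in gt_tracks:                      # track: list of per-frame GT boxes
        best = [max(range(n), key=lambda i: iou(anchors[i], box)) for box in track]
        for i, j in zip(best[:-1], best[1:]):
            counts[i, j] += 1
    probs = counts / np.maximum(counts.sum(axis=1, keepdims=True), 1)
    probs[probs < threshold] = 0.0               # enforce sparsity via the transition threshold
    return probs / np.maximum(probs.sum(axis=1, keepdims=True), 1e-12)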
@proceedings{Behl2018bmvc,
title = {Incremental Tube Construction for Human Action Detection},
author = {Harkirat Singh Behl and Michael Sapienza and Gurkirt Singh and Suman Saha and Fabio Cuzzolin and Philip H. S. Torr},
editor = {British Machine Vision Conference (BMVC), Newcastle upon Tyne, UK},
url = {https://arxiv.org/abs/1704.01358},
year = {2018},
date = {2018-07-23},
abstract = {Current state-of-the-art action detection systems are tailored for offline batch-processing applications. However, for online applications like human-robot interaction, current systems fall short, either because they only detect one action per video, or because they assume that the entire video is available ahead of time. In this work, we introduce a real-time and online joint-labelling and association algorithm for action detection that can incrementally construct space-time action tubes on the most challenging action videos in which different action categories occur concurrently. In contrast to previous methods, we solve the detection-window association and action labelling problems jointly in a single pass. We demonstrate superior online association accuracy and speed (2.2ms per frame) as compared to the current state-of-the-art offline systems. We further demonstrate that the entire action detection pipeline can easily be made to work effectively in real-time using our action tube construction algorithm.},
keywords = {Action detection, Artificial Intelligence, Computer Science, Computer vision, Detection, Pattern Recognition, Robot},
pubstate = {published},
tppubtype = {proceedings}
}
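The entry above constructs action tubes incrementally, solving detection-window association and action labelling jointly in a single online pass. The sketch below shows one greedy per-frame update in that spirit; it is illustrative only (not the authors' algorithm), and any IoU function, such as the one in the previous sketch, can be passed as iou_fn.

def update_tubes(tubes, detections, iou_fn, iou_min=0.3):
    # One online step: extend existing tubes with the current frame's detections,
    # matching on spatial overlap and action label, then start new tubes for the rest.
    unmatched = list(detections)                 # each detection: (box, label, score)
    for tube in tubes:                           # tube: {'boxes': [...], 'label': str, 'score': float}
        best, best_ov = None, iou_min
        for det in unmatched:
            ov = iou_fn(tube['boxes'][-1], det[0])
            if det[1] == tube['label'] and ov > best_ov:
                best, best_ov = det, ov
        if best is not None:
            tube['boxes'].append(best[0])
            tube['score'] += best[2]
            unmatched.remove(best)
    for box, label, score in unmatched:          # unmatched detections seed new tubes
        tubes.append({'boxes': [box], 'label': label, 'score': score})
    return tubes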