In field robotics, particularly in the agricultural sector, precise localization presents a challenge due to the constantly changing nature of the environment. Simultaneous Localization and Mapping algorithms can provide an effective estimation of a robot's position, but their long-term performance may be impacted by false data associations. Additionally, alternative strategies such as the use of RTK-GPS can also have limitations, such as dependence on external infrastructure. To address these challenges, this paper introduces a novel stability scan filter. This filter can learn and infer the motion status of objects in the environment, allowing it to identify the most stable objects and use them as landmarks for robust robot localization in a continuously changing environment. The proposed method involves an unsupervised point-wise labelling of LiDAR frames by utilizing temporal observations of the environment, as well as a regression network called Long-Term Stability Network (LTS-NET) to learn and infer the long-term motion status of 3D LiDAR points. Experiments demonstrate the ability of the stability scan filter to infer the motion stability of objects on a real agricultural long-term dataset. Results show that by only utilizing points belonging to long-term stable objects, the localization system exhibits reliable and robust localization performance for long-term missions compared to using the entire LiDAR frame points.
Achieving a robust long-term deployment with mobile robots in the agriculture domain is both a demanded and challenging task. The possibility to have autonomous platforms in the field performing repetitive tasks, such as monitoring or harvesting crops, collides with the difficulties posed by the always-changing appearance of the environment due to seasonality. With this scope in mind, we report an ongoing effort in the long-term deployment of an autonomous mobile robot in a vineyard, with the main objective of acquiring what we call the Bacchus Long-Term (BLT) data set. This data set consists of multiple sessions recorded in the same area of a vineyard but at different points in time, covering a total of 7 months to capture the whole canopy growth from March until September. The recorded multimodal data set is acquired with the main focus on pushing the development and evaluation of different mapping and localization algorithms for long-term autonomous robot operation in the agricultural domain. Hence, besides the data set, we also present an initial study in long-term localization using four different sessions belonging to four different months with different plant stages. We identify that state-of-the-art localization methods can only cope partially with the amount of change in the environment, making the proposed data set suitable to establish a benchmark on which the robotics community can test its methods. On our side, we anticipate two solutions aimed at extracting stable temporal features for improving long-term 4D localization results. The BLT data set is available at https://lncn.ac/lcas-blt.
Weed management is one of the major challenges in vineyard cultivation, as weeds can cause significant losses and intense competition with the crop. In this direction, the development of an automated weed monitoring process will provide valuable data for understanding weed management practices. This paper presents a new approach for detecting weeds within vineyards. It introduces an innovative robotic system that detects and maps the distribution of weeds in real time using a deep learning algorithm. The developed model was tested under various conditions with different levels of weed growth and performed accurately in cases where weeds had distinct boundaries.
We consider the problem of image-based visual servoing (IBVS) under inelastic visibility constraints that arise from the field-of-view (FOV) of the camera. The target is affected by possible aperiodic impulse perturbations. Under the assumption that a minimum time needs to elapse after each impulse, the proposed controller guarantees that all image features will remain strictly inside the FOV and will converge to the desired feature values with prescribed performance characteristics, related to accuracy and convergence time, between any two consecutive impulses. Simulation results clarify and verify the theoretical findings.
In this work we consider uncertain impulsive systems in Brunovsky canonical form with possibly aperiodic impulses. Following the prescribed performance control methodology, a state feedback controller is designed to guarantee that between any two consecutive impulses, the output tracking error will converge to a neighborhood of zero of predefined size, in no greater than a user selected fixed-time. In addition, all signals in the closed-loop are bounded. Simulations clarify and verify the approach.
Agricultural multi-month dataset with overlapping paths for mapping and localisation algorithms for autonomous robots
In March 2022, we started the long-term data acquisition campaign at the Ktima Gerovassiliou vineyard. The vineyard extends over more than 100 ha on the outskirts of Epanomi, Greece.
As shown in the image below (taken in March and June), agricultural environments present seasonal changes, repetitive structures, uneven terrain and different weather conditions, which make achieving long-term autonomy for robots a challenging problem.
Motivated by these challenging conditions and the lack of an agricultural dataset in the literature, we present the BLT dataset. Its primary objective is to push developments and evaluations of different mapping and localisation algorithms for long-term autonomous robots operating in agricultural fields. However, we believe that thanks to its temporal aspect, the dataset can also be used for phenotyping and crop mapping tasks.
Available from https://lcas.lincoln.ac.uk/wp/research/data-sets-software/blt/
In this research, we present an end-to-end data-driven pipeline for determining the long-term stability status of objects within a given environment, specifically distinguishing between static and dynamic objects. Understanding object stability is key for mobile robots since long-term stable objects can be exploited as landmarks for long-term localisation. Our pipeline includes a labelling method that utilizes historical data from the environment to generate training data for a neural network. Rather than utilizing discrete labels, we propose the use of point-wise continuous label values, indicating the spatio-temporal stability of individual points, to train a point cloud regression network named LTS-NET. Our approach is evaluated on point cloud data from two parking lots in the NCLT dataset, and the results show that our proposed solution outperforms direct training of a classification model for static vs dynamic object classification.
This paper presents a comprehensive review of ground agricultural robotic systems and applications with special focus on harvesting that span research and commercial products and results, as well as their enabling technologies. The majority of literature concerns the development of crop detection, field navigation via vision and their related challenges. Health monitoring, yield estimation, water status inspection, seed planting and weed removal are frequently encountered tasks. Regarding robotic harvesting, apples, strawberries, tomatoes and sweet peppers are mainly the crops considered in publications, research projects and commercial products. The reported harvesting agricultural robotic solutions, typically consist of a mobile platform, a single robotic arm/manipulator and various navigation/vision systems. This paper reviews reported development of specific functionalities and hardware, typically required by an operating agricultural robot harvester; they include (a) vision systems, (b) motion planning/navigation methodologies (for the robotic platform and/or arm), (c) Human-Robot-Interaction (HRI) strategies with 3D visualization, (d) system operation planning & grasping strategies and (e) robotic end-effector/gripper design. Clearly, automated agriculture and specifically autonomous harvesting via robotic systems is a research area that remains wide open, offering several challenges where new contributions can be made.
Kinesthetic teaching allows direct skill transfer from the human to the robot and has been widely used to teach single-arm tasks intuitively. In the bi-manual case, simultaneously moving both end-effectors is challenging due to the high physical and cognitive load imposed on the user. Thus, previous works on bi-manual task teaching resort to less intuitive methods by teaching each arm separately. This in turn requires motion synthesis and synchronization before execution. In this work, we leverage knowledge from the relative task space to facilitate a kinesthetic demonstration by guiding both end-effectors, which is a more human-like and intuitive way of performing bi-manual tasks. Our method utilizes the notion of virtual fixtures and inertia minimization in the null space of the task. The controller is experimentally validated in a bi-manual task which involves the drawing of a preset line on a workpiece utilizing two KUKA IIWA7 R800 robots. Results from ten participants were compared with a gravity compensation scheme, demonstrating improved performance.
Automation of vineyard cultivation requires mobile robots to maintain an accurate localization system. The paper introduces a stereo vision-based Graph-Simultaneous Localization and Mapping (Graph-SLAM) pipeline custom-tailored to the specificities of vineyard fields. Graph-SLAM is reinforced with a Loop Closure Detection (LCD) based on semantic segmentation of the vine trees. The Mask R-CNN network is applied to segment the trunk regions of images, on which unique visual features are extracted. These features are used to populate the bag of visual words (BoVWs) retained on the formulated graph. A nearest neighbor search is applied to each query trunk-image to associate each unique feature descriptor with the corresponding node in the graph using a voting procedure. We apply a probabilistic method to select the most suitable loop closing pair and, upon an LCD appearance, the 3D points of the trunks are employed to estimate the loop closure constraint added to the graph. The traceable features on trunk segments drastically reduce the number of retained BoVWs, which in turn significantly expedites the loop closure and graph optimization, rendering our method suitable for large-scale mapping in vineyards. The pipeline has been evaluated on several data sequences gathered from real vineyards, in different seasons, when the appearance of the vine trees varies significantly, and exhibited robust mapping over long distances.
We design a state-feedback controller to impose prescribed performance attributes on the output regulation error for uncertain nonlinear systems, in the presence of unknown time-varying delays appearing in both the state and the control input signals, provided that an upper bound on those delays is known. The proposed controller achieves a pre-specified minimum convergence rate and maximum steady-state error, and keeps all closed-loop signals bounded. We prove that the error is confined strictly within a delayed version of the constructed performance envelope, which depends on the difference between the actual state delay and its corresponding upper bound. Nevertheless, the maximum value of the output regulation error at steady state remains unaltered, exactly as pre-specified by the constructed performance functions. Furthermore, the controller does not incorporate knowledge regarding the nonlinearities of the controlled system, and is of low complexity in the sense that no hard calculations (analytic or numerical) are required to produce the control signal. Simulation results validate the theoretical findings.
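For readers less familiar with the prescribed performance framework that this line of work builds on, a minimal sketch of the standard formulation is given below; the exact performance functions and error transformations used in the paper may differ, so this is illustrative only.

```latex
% Standard prescribed performance envelope (illustrative sketch, not the
% paper's exact construction): the tracking error e(t) is confined within
% a shrinking funnel
    -\rho(t) < e(t) < \rho(t), \qquad t \ge 0,
% built from an exponentially decaying performance function
    \rho(t) = (\rho_0 - \rho_\infty)\, e^{-l t} + \rho_\infty ,
% where \rho_0 > |e(0)| bounds the transient, \rho_\infty > 0 sets the
% maximum allowable steady-state error, and l > 0 lower-bounds the
% convergence rate.
```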
In this work, we present TS-Rep, a self-supervised method that learns representations from multi-modal, varying-length time series sensor data from real robots. TS-Rep is based on a simple yet effective technique for triplet learning, where we randomly split the time series into two segments to form the anchor and positive, while selecting random subseries from the other time series in the mini-batch to construct negatives. We additionally use the nearest neighbour in the representation space to increase the diversity of the positives. For evaluation, we perform a clusterability analysis on representations of three heterogeneous robotics datasets. The learned representations are then applied to anomaly detection, where our method consistently performs well. A classifier trained on TS-Rep learned representations outperforms unsupervised methods and performs close to the fully-supervised methods for terrain classification. Furthermore, we show that TS-Rep is, on average, the fastest method to train among the baselines.
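A rough sketch of the triplet construction described above is given below; it is illustrative only, and details such as segment lengths, the nearest-neighbour positive augmentation and batching in TS-Rep are omitted.

```python
import numpy as np

def sample_triplet(batch, rng=None):
    """Illustrative triplet sampling in the spirit of TS-Rep (not the authors' code).

    batch: list of variable-length time series, each an array of shape (T_i, D).
    Anchor and positive are the two halves of one randomly chosen series; the
    negative is a random subseries taken from another series in the mini-batch.
    """
    rng = rng or np.random.default_rng()
    i = rng.integers(len(batch))
    series = batch[i]
    split = rng.integers(1, len(series))            # random split point
    anchor, positive = series[:split], series[split:]

    j = rng.choice([k for k in range(len(batch)) if k != i])
    other = batch[j]
    length = rng.integers(1, len(other) + 1)        # random subseries length
    start = rng.integers(0, len(other) - length + 1)
    negative = other[start:start + length]
    return anchor, positive, negative
```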
Work focused on deploying an autonomous robot in a vineyard at specific time intervals to record sensor data:
- Long-term robust deployment of the robot in the wild, navigating on a topological map.
- Building Agri-KITTI, a long-term database of robot sensor data spanning various seasons.
- Adopting the database as a benchmark for SLAM algorithms in agricultural environments.
Long-term autonomy is one of the most demanded capabilities looked for in a robot. The possibility to perform the same task over and over on a long temporal horizon, offering a high standard of reproducibility and robustness, is appealing. Long-term autonomy can play a crucial role in the adoption of robotic systems for precision agriculture, for example in assisting humans in monitoring and harvesting crops in a large orchard. With this scope in mind, we report an ongoing effort in the long-term deployment of an autonomous mobile robot in a vineyard for data collection across multiple months. The main aim is to collect data from the same area at different points in time so as to be able to analyse the impact of the environmental changes on the mapping and localisation tasks. In this work, we present a map-based localisation study taking 4 data sessions. We identify expected failures when the pre-built map visually differs from the environment's current appearance, and we anticipate LTS-Net, a solution aimed at extracting stable temporal features for improving long-term 4D localisation results.
Topological maps have proven to be an effective representation for outdoor robot navigation. These typically consist of a set of nodes that represent physical locations of the environment and a set of edges representing the robot's ability to move between these locations. They allow planning to be more efficient and make it easy to define different robot navigation behaviours depending on the location. In the literature, topological maps are sometimes manually created in a 2D occupancy map previously built by a robot, but this is not very practical or scalable when it has to be done in a 50 ha vineyard with hundreds of rows. Other works focus on vine row classification, mainly using colour vegetation indices; however, this assumes there is a green canopy, which is not always the case depending on the time of the year. Focusing only on the rows also leaves other non-traversable structures such as fences, buildings and poles unmapped. To overcome the aforementioned limitations, we propose a pipeline that uses UAV imagery as an input to create a topological map of the vineyard where an AGV has to be deployed.
In this work, a control scheme for approaching and unveiling a partially occluded object of interest is proposed. The control scheme is based only on the classified point cloud obtained by the in-hand camera attached to the robot's end effector. It is shown that the proposed controller reaches the vicinity of the object, progressively unveiling the neighborhood of each visible point of the object of interest. It can therefore potentially achieve the complete unveiling of the object. The proposed control scheme is evaluated through simulations and experiments with a UR5e robot with an in-hand RealSense camera on a mock-up vine setup for unveiling the stem of a grape cluster.
With the incorporation of autonomous robotic platforms in various areas (industry, agriculture, etc.), numerous mundane operations have been assisted or fully automated. Since the dawn of humanity, the highly demanding working environment of agriculture has driven the development of techniques and machinery that can cope with each case. Moreover, new technologies (from high-performance motors to optimization algorithms) have been implemented and tested in this field. Every cultivation season, there are several operations that contribute to crop development and must take place at least once; one of them is weeding. In every cultivated crop, plants develop that are not part of the cultivation. These weeds, in most cases, have a negative impact on the crop and have to be removed. Traditionally, weeding takes place either by hand (smaller cultivations) or with the use of herbicides (larger cultivations). In the latter case, the dosage and timing are pre-defined, and they do not take into consideration the growth stage and the allocation of weeds within the field.
In this work, a novel approach for intra-row (between the vine plants) weeding in real vineyard fields is developed and presented. All the experiments, both for data collection and algorithm testing, took place in a high-value vineyard that produces numerous types of wine. The focus of this work was to implement an accurate real-time weed detection and segmentation model using a deep learning algorithm, in order to optimize the weed detection procedure in the intra-row space of the vineyard. This approach consists of two essential sub-systems. The first one is the robotic platform that embeds all the necessary sensors (GPS, LiDAR, IMU, RGB camera) and the required computational power for the detection algorithm. The second one is the developed algorithm for weed detection. The developed algorithms were tested on many datasets from vineyards with different levels of weed development. In order to properly validate the algorithm, the unseen data were acquired in different time periods, with variations in both camera angle and vine variety. The results show that the proposed technique gives promising results in various field conditions.
Weed management is one of the major challenges in viticulture, as weeds can cause significant yield losses and severe competition with the cultivation. In this direction, the development of an automated procedure for weed monitoring will provide useful data for understanding weed management practices. In this work, a new image-based technique was developed in order to provide maps based on the weeds' height in the inter-row paths of the vineyards. The developed algorithms were tested on many datasets from vineyards with different levels of weed development. The results show that the proposed technique gives promising results in various field conditions.
Robotic grasping in highly cluttered environments remains a challenging task due to the lack of collision-free grasp affordances. In such conditions, non-prehensile actions could help to increase such affordances. We propose a multi-fingered push-grasping policy that creates enough space for the fingers to wrap around an object to perform a stable power grasp, using a single primitive action. Our approach learns a direct mapping from visual observations to actions and is trained in a fully end-to-end manner. To achieve more efficient learning, we decouple the action space by learning separately the robot hand pose and finger configuration. Experiments in simulation demonstrate that the proposed push-grasping policy achieves a higher grasp success rate than baselines and can generalize to unseen objects. Furthermore, although training is performed in simulation, the learned policy is robustly transferred to a real environment without a significant drop in success rate. Qualitative results, code, pre-trained models and simulation environments are available at https://robot-clutter.github.io/ppg.
We consider the problem of controlling, with prescribed performance, uncertain Euler-Lagrange systems in the presence of aperiodic impulses affecting the system state. Between any two consecutive impulse time instants, we guarantee that the output tracking error converges to a predefined and arbitrarily small region of interest, within a prespecified fixed time. Furthermore, all signals in the closed loop are bounded. The magnitude of the impulses and their time of appearance are unknown in advance; yet a known minimum time interval is required to elapse before the appearance of a new impulse. Simulation results clarify and verify the theoretical findings.
Trajectory tracking in the orientation space utilizing unit quaternions yields nonlinear error dynamics, as opposed to the Cartesian position case. In this work, we study trajectory tracking in the orientation space utilizing the most popular quaternion error representations and angular velocity errors. By selecting the error functions carefully, we show exponential convergence in a region of attraction containing large initial errors. We further show that, under certain conditions frequently encountered in practice, the formulation respecting the geometric characteristics of the quaternion manifold and its tangent space yields linear tracking dynamics, allowing us to guarantee a desired tracking performance by gain selection without tuning. Simulation and experimental results are provided.
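As an illustration of one commonly used quaternion error representation (the paper compares several; the logarithmic-map error shown below is only an example and not necessarily the formulation adopted there), with unit quaternions q_d (desired) and q (current):

```latex
q_e = q_d \otimes q^{-1} = (\eta_e, \boldsymbol{\epsilon}_e), \qquad
e_o = 2\log(q_e) =
\begin{cases}
  2\arccos(\eta_e)\,\dfrac{\boldsymbol{\epsilon}_e}{\lVert\boldsymbol{\epsilon}_e\rVert}, & \boldsymbol{\epsilon}_e \neq \mathbf{0},\\[4pt]
  \mathbf{0}, & \boldsymbol{\epsilon}_e = \mathbf{0},
\end{cases}
```

where the scalar and vector parts of the error quaternion are denoted by eta_e and epsilon_e, respectively.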
In recent years, the use of robotics technology in agriculture has been constantly increasing. Robotic platforms apply automation and robotics in the agricultural field to relieve workers of manual and heavy tasks. These devices have already started to transform many aspects of agriculture and are hesitantly finding their way to the market. Robotic solutions that can provide alternative routes to weed management may therefore offer a transformational enabling technology to mitigate biotic and abiotic stresses on crop production; for example, automatic weeding robots are a preferable substitute for chemical herbicides for removing weeds. One of the most impactful biotic factors in agriculture is weeds, causing important yield losses in every cultivation. Integrated weed management, coupled with the use of robotic platforms (UGVs), allows for effective weed management with a methodology that is beneficial for the environment. The detection of weed spots in a cultivation can be achieved by combining image acquisition by the UGV with further processing by specific algorithms. These algorithms can then be used for weed control by autonomous robotic systems via mechanical procedures or herbicide spraying.
Weed management is one of the major challenges in viticulture, as weeds can cause significant yield losses and severe competition with vines. One of the cheapest and most effective methods remains chemical weed control; however, several adverse effects and risks may arise. Different methods, such as tillage, thermal treatment, mulching and cover crops, can be included in a weed control strategy, depending on the environmental conditions, soil and crop. It is known that mechanical methods are among the most cost-effective weed management methods in vineyards.
Monitoring weeds in different vineyards will provide a useful database for understanding weed management practices. In this direction, this paper presents a system for a weed detection robot. The objective is to enable the weed detection robot to navigate autonomously in the inter-row spaces of the crop for automatic weed control, reducing labor cost and time. In this paper, various image processing techniques, using an RGB-D camera, were examined in order to: i) detect the path between two rows of the vineyard and ii) locate the weeds based on various a priori characteristics. As a pre-processing step, the real-time data from the RGB-D camera were transformed into different color spaces in order to mitigate the noise that could occur. Subsequently, the examined algorithms and techniques were tested on numerous data aggregated from real vineyards with different levels of weed development. Finally, the developed algorithm was tested by deploying it on a UGV platform, with promising results.
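As a rough illustration of the color-space pre-processing step mentioned above, the sketch below converts a color frame to HSV and thresholds green vegetation; the actual color spaces, thresholds and denoising used in the paper are not reproduced here, and the file name and threshold values are hypothetical.

```python
import cv2
import numpy as np

# Illustrative pre-processing sketch (not the paper's exact pipeline):
# convert the RGB-D color frame to HSV and keep a rough "green vegetation" band.
frame_bgr = cv2.imread("vineyard_row.png")            # hypothetical input frame
hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)

lower_green = np.array([35, 40, 40])                  # assumed bounds, tune per dataset
upper_green = np.array([85, 255, 255])
mask = cv2.inRange(hsv, lower_green, upper_green)

# Morphological opening suppresses small, noisy responses in the mask.
kernel = np.ones((5, 5), np.uint8)
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
```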
Many visual scene understanding applications, especially in visual servoing settings, may require high quality object mask predictions for the accurate undertaking of various robotic tasks. In this work we investigate a setting where separate instance labels for all objects under view are required, but the available instance segmentation methods produce object masks inferior to a semantic segmentation algorithm. Motivated by the need to add instance label information to the higher fidelity semantic segmentation output, we propose an anisotropic label diffusion algorithm that propagates instance labels predicted by an instance segmentation algorithm inside the semantic segmentation masks. Our method leverages local topological and color information to propagate the instance labels, and is guaranteed to preserve the semantic segmentation mask. We evaluate our method on a challenging grape bunch detection dataset, and report experimental results that showcase the applicability of our method.
The problem of motion planning in obstacle cluttered environments is an important task in robotics. In the literature several methodologies exist to address the problem. In this work we consider using the feedback-based approach, where the solution comes from designing a controller capable of guaranteeing trajectory tracking with obstacle avoidance. Commonly, all respective studies consider simplified robot dynamics, which is usually insufficient in practical applications. In this work we focus on the collision avoidance problem with respect to a moving spherical object. We assume knowledge of a nominal controller that achieves tracking of a desired trajectory in the absence of obstacles, and we design an auxiliary control scheme to guarantee that the robot’s end-effector will always operate in a safe distance from the moving obstacle’s surface. The controller we develop does not take into account the actual robot dynamics, thus constituting a truly model-free approach. Experimental studies conducted on a KUKA LWR4+ robotic manipulator clarify and verify the proposed control scheme.
Spectroscopy is a widespread technique used in many scientific fields, such as food production. The use of hyperspectral data, specifically in the visible and near infrared (VNIR) and short-wave infrared (SWIR) regions, in grape production is of great interest. Due to its fine spectral resolution, hyperspectral analysis can contribute to both fruit monitoring and quality control at all stages of maturity in a simple and inexpensive way. This work presents an application of a contact probe spectrometer that covers the VNIR–SWIR spectrum (350–2500 nm) for the quantitative estimation of wine grape ripeness. A total of 110 samples of the grape vine Syrah (Vitis vinifera Syrah) variety were collected over the 2020 harvest and pre-harvest seasons from Ktima Gerovassiliou, located in Northern Greece. Their total soluble solids content (°Brix) was measured in situ using a refractometer. Two different machine learning algorithms, namely partial least squares regression (PLS) and random forest (RF), were applied along with several spectral pre-processing methods in order to predict the °Brix content from the VNIR–SWIR hyperspectral data. Additionally, the most important features of the spectrum were identified, as indicated by the most accurate models. The performance of the different models was examined in terms of the following metrics: coefficient of determination (R2), root mean square error (RMSE) and ratio of performance to interquartile distance (RPIQ). The values of R2 = 0.90, RMSE = 1.51 and RPIQ = 4.41 for PLS, and 0.92, 1.34 and 4.96 for RF, respectively, indicate that by using a portable VNIR–SWIR spectrometer it is possible to estimate wine grape maturity in situ.
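A minimal sketch of how such regression models could be fitted and scored is given below; it is illustrative only, with placeholder data, and it does not reproduce the paper's spectral pre-processing, cross-validation or hyperparameter tuning.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Placeholder data: X holds VNIR-SWIR spectra (n_samples, n_bands), y the measured Brix.
X, y = np.random.rand(110, 2151), 15.0 + 10.0 * np.random.rand(110)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

def report(name, y_true, y_pred):
    rmse = mean_squared_error(y_true, y_pred) ** 0.5
    q1, q3 = np.percentile(y_true, [25, 75])
    rpiq = (q3 - q1) / rmse          # ratio of performance to interquartile distance
    print(f"{name}: R2={r2_score(y_true, y_pred):.2f}  RMSE={rmse:.2f}  RPIQ={rpiq:.2f}")

pls = PLSRegression(n_components=10).fit(X_tr, y_tr)
report("PLS", y_te, pls.predict(X_te).ravel())

rf = RandomForestRegressor(n_estimators=500, random_state=0).fit(X_tr, y_tr)
report("RF", y_te, rf.predict(X_te))
```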
In this work, we present a comparative analysis of the trajectories estimated from various Simultaneous Localization and Mapping (SLAM) systems in a simulation environment for vineyards. The vineyard environment is challenging for SLAM methods due to visual appearance changes over time, uneven terrain, and repeated visual patterns. For this reason, we created a simulation environment specifically for vineyards to help study SLAM systems in such a challenging environment. We evaluated the following SLAM systems: LIO-SAM, StaticMapping, ORBSLAM2, and RTAB-MAP in four different scenarios. The mobile robot used in this study was equipped with 2D and 3D lidars, an IMU, and an RGB-D camera (Kinect v2). The results show good and encouraging performance of RTAB-MAP in such an environment.
Mobile bimanual manipulation in a dynamic and uncertain environment requires the continuous and fast adjustment of the robot motion for the satisfaction of the constraints imposed by the task, the robot itself and the environment. We formulate the pick-and-place task as a sequence of mobile manipulation tasks with a combination of relative, global and local targets. Distributed distance sensors on the robot are utilized to sense the surroundings and facilitate collision avoidance with dynamic and static obstacles. We propose an approach to kinematically control the robot by solving a priority constrained optimization problem online. Experimental results on the YuMi bimanual robot mounted on the Ridgeback mobile platform validate the performance of the proposed approach.
Prehensile robotic grasping of a target object in clutter is challenging because, in such conditions, the target touches other objects, resulting in the lack of collision-free grasp affordances. To address this problem, we propose a modular reinforcement learning method which uses continuous actions to totally singulate the target object from its surrounding clutter. A high-level policy selects between pushing primitives, which are learned separately. Prior knowledge is effectively incorporated into learning, through action primitives and feature selection, increasing sample efficiency. Experiments demonstrate that the proposed method considerably outperforms the state-of-the-art methods in the singulation task. Furthermore, although training is performed in simulation, the learned policy is robustly transferred to a real environment without a significant drop in success rate. Finally, singulation tasks in different environments are addressed by easily adding a new primitive and by retraining only the high-level policy.
Extracting a known target object from a pile of other objects in a cluttered environment is a challenging robotic manipulation task encountered in many robotic applications. In such conditions, the target object touches or is covered by adjacent obstacle objects, thus rendering traditional grasping techniques ineffective. In this paper, we propose a pushing policy aiming at singulating the target object from its surrounding clutter, by means of lateral pushing movements of both the neighboring objects and the target object until sufficient 'grasping room' has been achieved. To achieve the above goal, we employ reinforcement learning, and particularly Deep Q-learning (DQN), to learn optimal push policies by trial and error. A novel Split DQN is proposed to improve the learning rate and increase the modularity of the algorithm. Experiments show that, although learning is performed in a simulated environment, the transfer of the learned policies to a real environment is effective thanks to robust feature selection. Finally, we demonstrate that the modularity of the algorithm allows the addition of extra primitives without retraining the model from scratch.
Dynamic Movement Primitives (DMPs) have been extensively applied in various robotic tasks thanks to their generalization and robustness properties. However, the successful execution of a given task may necessitate the use of different motion patterns that take into account not only the initial and target positions but also features relating to the overall structure and layout of the scene. To make DMPs applicable to a wider range of tasks and further automate their use, we design a framework combining deep residual networks with DMPs that can encapsulate different motion patterns of a planar task, provided through human demonstrations on the RGB image plane. We can then automatically infer from new raw RGB visual input the appropriate DMP parameters, i.e. the weights that determine the motion pattern and the initial/target positions. We compare our method against another state-of-the-art method for inferring DMPs from images and carry out experimental validations in two different planar tasks.
The dataset is defined by a set of detailed technical specifications intended to capture high-quality data in a grapevine setting. The Intel Realsense D435 and the ORBBEC ASTRA sensors were used in this dataset, both of which were positioned 850 mm above ground and approximately 725 mm from the vine plants. The Intel Realsense D435 takes RGB images with a resolution of 1920 x 1080 pixels and depth information at 1280 x 720 pixels. This sensor captured 300 consecutive single-shot images while mounted on a tripod on the ground. Furthermore, a continuous stream of images at 30 frames per second (fps) was obtained, yielding approximately 3000 images while the sensor was moved by a human operator. The ORBBEC ASTRA, which had a resolution of 640 x 480 pixels for RGB and depth, used a similar operational setup, producing a continuous stream of images at 30 fps while being moved by a human operator. These technical details highlight the dataset's precise design, resulting in a diverse collection of static and dynamic images suitable for a variety of applications, including grape detection and analysis in agricultural contexts. All of the data were collected at Ktima Gerovassiliou in Epanomi, Thessaloniki.
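For reference, a minimal sketch of how D435 streams matching these specifications could be configured with the librealsense Python bindings (pyrealsense2) is shown below; recording, alignment and the ORBBEC ASTRA setup are omitted, and this is not the acquisition code actually used for the dataset.

```python
import pyrealsense2 as rs

# Configure the D435 for 1920x1080 color and 1280x720 depth at 30 fps,
# matching the acquisition settings described above (illustrative sketch only).
pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.color, 1920, 1080, rs.format.bgr8, 30)
config.enable_stream(rs.stream.depth, 1280, 720, rs.format.z16, 30)

pipeline.start(config)
try:
    frames = pipeline.wait_for_frames()
    color_frame = frames.get_color_frame()
    depth_frame = frames.get_depth_frame()
finally:
    pipeline.stop()
```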
Below, there is a folder containing the stationary images and a sample folder containing the moving images of the dataset. For the full dataset please contact Stefanos Papadam (sgpapadam@iti.gr), Ioannis Mariolis (ymariolis@iti.gr), Georgia Peleka (gpeleka@iti.gr).
The dataset consists of three distinct groups of RGB images, each designed for training and validation purposes. The groups were separated based on the camera sensor used to collect the data. The first set contains 151 images, with 101 for training and 50 for validation. The first group is contained in the C1 folder. These images were taken with a ZED Mini camera held by hand at a constant distance of one meter from the vineyard. The resolution varies within this set, with images available in 1024 x 1024 or 2208 x 1242 pixels. The second collection encompasses 236 RGB images, of which 155 are designated for training and 81 for validation. The second group is contained in the C2 and C3 folders. These images were taken with an Intel Realsense D435 camera, capturing shots of the grapes positioned 20 to 50 cm away from the camera. The resolution for this set is standardized at 1920 x 1080 pixels, with the camera securely mounted on one of the robot's arms. The third and final group consists of 1320 images, with 1092 allocated for training and 228 for validation. This group is split across the I1, I2 and I3 folders. Captured using a ZED2 camera, these images were obtained by an inspection platform located 1 meter away from the vineyard, maintaining a resolution of 1920 x 1080 pixels. Together, these diverse sets form a comprehensive RGB dataset for the development and assessment of vineyard-related algorithms. This dataset is used for detecting the grapes in the images. All of the data were collected at Ktima Gerovassiliou in Epanomi, Thessaloniki.
Below, there are six separate folders with the training data (which can be merged into one) and one folder containing the validation data. Every image has a txt file containing the image's grape annotations. In each annotation line, the first number represents the annotation's class (0 for grape), the second and third numbers are the normalized (with respect to the image dimensions) coordinates of the bounding box's upper-left corner, and the fourth and fifth numbers are the bounding box's normalized width and height, respectively.
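A minimal sketch of how one annotation file could be parsed and the boxes converted back to pixel coordinates, assuming the layout described above; the file name in the usage comment is hypothetical.

```python
from pathlib import Path

def load_grape_boxes(txt_path, img_width, img_height):
    """Parse one annotation file: class, normalized upper-left x, y, then width, height."""
    boxes = []
    for line in Path(txt_path).read_text().splitlines():
        if not line.strip():
            continue
        cls, x, y, w, h = line.split()
        boxes.append({
            "class": int(cls),                 # 0 = grape
            "x": float(x) * img_width,         # upper-left corner in pixels
            "y": float(y) * img_height,
            "w": float(w) * img_width,
            "h": float(h) * img_height,
        })
    return boxes

# Example (hypothetical file name; C2/C3 images are 1920 x 1080):
# boxes = load_grape_boxes("C2/img_0001.txt", 1920, 1080)
```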
The dataset consists of multiple sessions, each with a unique variety or a combination of varieties and captured using a specific set of technical specifications. The data were collected at a vineyard in the summer of 2023. The distance between the camera and the vines ranges from 60 to 80 centimeters for the RGBD images, and from 80 to 100 centimeters for the hyperspectral images. The RGBD images have a resolution of 2208 x 1242 pixels, which provides detailed visual data for analysis. For the hyperspectral camera, the grayscale (panchromatic) images have a resolution of 1000 x 1000 pixels, with the respective co-registered cube files having a spatial resolution of 50 x 50 pixels. The spectral dimension spans 138 bands, with a spectral range from 450 to 998 nm, i.e., a spectral resolution of 4 nm, allowing for a more detailed analysis of the spectral information. The hyperspectral data was captured in "RAW" mode, with the grabbed values corresponding to digital numbers (DN).
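Given these specifications, the center wavelength of each band can be recovered directly from its index; the helper below assumes the 138 bands are uniformly spaced over 450-998 nm at the stated 4 nm resolution.

```python
import numpy as np

# 138 bands spanning 450-998 nm at 4 nm spacing: band i is centered at 450 + 4*i nm.
N_BANDS, LAMBDA_MIN_NM, STEP_NM = 138, 450.0, 4.0
wavelengths_nm = LAMBDA_MIN_NM + STEP_NM * np.arange(N_BANDS)   # 450, 454, ..., 998

def band_index(wavelength_nm):
    """Nearest band index for a requested wavelength in nanometers."""
    return int(round((wavelength_nm - LAMBDA_MIN_NM) / STEP_NM))

# e.g. band_index(722) == 68, and wavelengths_nm[68] == 722.0
```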
The data acquisition process uses specific sensors, with RGB and depth images captured by the ZED2 stereo camera. For the hyperspectral component, the FireflEYE 185 camera by Cubert GmbH is employed, ensuring precision and accuracy in capturing spectral data across the different varieties and sessions within the dataset. All of the data were collected at Gerovassiliou Field in Epanomi, Thessaloniki.
The dataset includes 12 sessions of Malagouzia variety, 3 sessions of Priknadi variety, 2 sessions of Chardonnay variety, 1 session of Monemvasia variety, 2 sessions of consecutive Malagouzia and Monemvasia varieties and 3 sessions of consecutive Priknadi and Monemvasia varieties.
It is worth noting that the two cameras are registered with each other using the matrices R and T, which are included in the dataset to achieve exact alignment. Both RGB and hyperspectral images are used exclusively for detection purposes, specifically to identify grapes within their respective fields of view.
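A minimal sketch of how the provided R and T could be applied to map 3D points from one camera frame to the other is shown below; the direction of the transform (RGBD to hyperspectral or the reverse) and the exact storage format of R and T should be checked against the dataset documentation.

```python
import numpy as np

def transform_points(points_xyz, R, T):
    """Apply the rigid transform p' = R @ p + T to an (N, 3) array of 3D points.

    R is the 3x3 rotation and T the 3-element translation shipped with the
    dataset; which camera frame is the source and which the target must be
    confirmed from the dataset documentation.
    """
    points_xyz = np.asarray(points_xyz, dtype=float)
    R = np.asarray(R, dtype=float)
    T = np.asarray(T, dtype=float).reshape(1, 3)
    return points_xyz @ R.T + T
```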
CERTH/ITI coordinated the setup and collection of RGBD data, while AUTH/RSSGIS guided the collection of Hyperspectral data, demonstrating a collaborative effort in compiling this comprehensive dataset for grapevine analysis.
Below, there is a sample of the dataset structure, provided due to the large size of the full dataset. We uploaded a complete sample of the Malagouzia variety, since the remaining data has a similar form. For the full dataset with all the varieties, please contact Stefanos Papadam (sgpapadam@iti.gr), Ioannis Mariolis (ymariolis@iti.gr), Georgia Peleka (gpeleka@iti.gr).
The stem detection dataset includes RGB images that were collected specifically for training and validation of stem detection algorithms. There are a total of 573 RGB images, 508 of which were used for training and 65 for validation. These images were taken with an Intel Realsense D435 camera positioned at a variable distance of 20 to 50 centimeters from the grapes. The images' resolution is set to 1920 x 1080 pixels, resulting in a detailed and comprehensive visual dataset for the development and evaluation of stem detection algorithms. All of the data were collected at Ktima Gerovassiliou in Epanomi, Thessaloniki.
Below, there are three separate folders with the training data (which can be merged into one) and one folder containing the validation data. Every image has a txt file containing the image's stem annotations. In each annotation line, the first number represents the annotation's class (0 for stem), and the remaining numbers are the normalized (with respect to the image dimensions) coordinates of the stem's mask.
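A minimal sketch of how one stem annotation file could be parsed into pixel-coordinate polygons is given below; it assumes the mask coordinates alternate x, y (as in the common YOLO segmentation layout), which should be verified against the data, and the image size defaults to the stated 1920 x 1080 resolution.

```python
from pathlib import Path

def load_stem_masks(txt_path, img_width=1920, img_height=1080):
    """Parse one annotation file: class id followed by a normalized mask polygon.

    Assumes the remaining numbers alternate x, y (YOLO-style segmentation
    layout); verify this assumption against the dataset before relying on it.
    """
    polygons = []
    for line in Path(txt_path).read_text().splitlines():
        values = line.split()
        if len(values) < 3:
            continue
        cls, coords = int(values[0]), [float(v) for v in values[1:]]
        points = [(coords[i] * img_width, coords[i + 1] * img_height)
                  for i in range(0, len(coords) - 1, 2)]
        polygons.append({"class": cls, "points": points})   # 0 = stem
    return polygons
```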
In this work, a methodology to find the best view of a grape stem and approach angle in order to crop it is proposed. The control scheme is based only on a classified point cloud obtained by the in-hand camera attached to the robot's end effector without continuous stem tracking. It is shown that the proposed controller finds and reaches the optimal view point and subsequently the stem fast and efficiently, accelerating the overall harvesting procedure. The proposed control scheme is evaluated through experiments in the lab with a UR5e robot with an in-hand RealSense camera on a mock-up vine.
Most existing robotic harvesters utilize a unimanual approach, with a single arm grasping and detaching the crop either via a detachment movement or by stem cutting with an especially designed gripper/cutter end-effector. However, such unimanual solutions cannot be applied to sensitive crops and cluttered environments such as grapes, where obstacles may occlude the stem, leaving no space for the cutter's placement. In such cases, the solution would require a bimanual robot that visually unveils the stem while manipulating the grasped crop to create cutting affordances. Considering vertical trellis setups for grapes, a dual-arm coordinated motion control methodology is proposed in this work for reaching a stem pre-cut state. The camera-equipped arm with the cutter reaches the stem, visually unveiling it, while the second arm moves the grasped grape toward the surrounding free space, facilitating its stem cutting. In-lab experimental validation and extensive evaluation in a real vineyard with the BACCHUS bimanual harvesting platform demonstrate the performance of the proposed approach.