Construction equipment operators are inevitably exposed to hazards when working in extreme environments (Kim et al., 2017). Teleoperation of construction equipment can effectively help an operator complete a task while avoiding dangerous conditions (Wang and Dunston, 2006). Equipment teleoperation has been applied in many domains, such as space exploration, military defense, underwater operation, telerobotics in forestry and mining, telesurgery, and telepresence robots (Lichiardopol, 2007). For example, Woo-Keun et al. (2004) combined force and motion commands in a space robot teleoperation system. Kot and Novák (2018) employed virtual reality and the Oculus Rift HMD in the Tactical Robotic System. These examples illustrate the potential of teleoperation in construction to reduce operational risks and extend the range of construction activities.

Construction equipment teleoperation is still an open research area and is not yet used in practice. Limited situational awareness in a teleoperating environment is one of the main reasons that hinder its application (Hong et al., 2020). Situational awareness is defined as “the perception of the elements in the environment within a volume of time and space, the comprehension of their meaning, and the projection of their status in the near future” (Endsley, 1988). During teleoperation, an operator has no direct perception of the environment and has to rely on visual information from one or several teleoperation screens. The operator’s perceptual processing is decoupled from the physical environment, resulting in low situational awareness that may lead to collisions and other accidents (Woods et al., 2004).

Existing studies have explored a variety of means to improve perceptual awareness, among which the application of virtual annotation (VA) has demonstrated significant potential. A VA can present critical information from sensors as a visual cue to assist teleoperation, compensating for the operator’s limited situational awareness. Research and practical examples have been reported in tourist and navigation systems (Orlosky et al., 2014; Williams et al., 2017), surgery training systems (Andersen et al., 2016), and augmented reality (AR)-based entertainment applications (Larabi, 2018; Tylecek and Fisher, 2018).

However, existing VA systems may not be directly applicable to construction equipment teleoperation. Challenges remain because of some unique features of operating a piece of construction equipment. An important one relates to human attention allocation. When using a VA-based tourist system, a user can devote as much attention as necessary to viewing and understanding the VA. In contrast, a construction equipment operator must devote enough attention to the operating task, usually under stressful conditions. A VA may be ignored if it fails to draw the operator’s attention or, on the other hand, may be highly disruptive. In addition, unlike the surgery system, construction equipment operation often involves frequent changes in locations and scenes, which may require more attention from an operator.

Many VA-related studies and applications in the construction field focus on function-oriented technologies and rarely consider the problem from a human-oriented perspective (Hong et al., 2021). Understanding how an operator’s visual system responds to different VAs during construction equipment teleoperation remains a challenge. Many design features of a VA, such as shape, format, size, and appearing location, have been found to affect the driver’s understanding and therefore the effectiveness of VA use. Subjects wearing a head-mounted display suggested that text annotations be placed below the center of the screen (Orlosky et al., 2014). Highlighted lines around the edges of obstacles are easier to understand and react to than radar maps (Hong et al., 2020).

This work aims to build a visual attention map for construction equipment teleoperation that depicts how an operator allocates visual attention during operation with VAs. The visual attention map can provide a scientific basis for understanding an operator’s visual attention allocation mechanism under stressful working conditions. It also informs design strategies for practitioners seeking to improve the user interface of next-generation teleoperated equipment.

Related Work

Virtual Annotation Design

Virtual annotation systems have been used in many fields, such as aircraft operation, navigation, and surgery. A VA can supply otherwise-inaccessible information to improve an operator’s situational awareness in the teleoperation context. For instance, during simulated aircraft operation, the GPS usually uses text and graphics to annotate traffic conditions and route information (Christoph et al., 2007). The intuitive graphic annotations in the Surgical Wound Closure Training System show the exact grip point of the scalpel and the direction of the cut, giving the trainee effective guidance on surgical operations and procedures (Andersen et al., 2016). The navigation system designed by Bolton et al. (2015) adopted anchored annotations to highlight landmarks and improved response times and success rates by 43.1% and 26.2%, respectively.

A well-designed VA can facilitate an operator’s spatial understanding while requiring a manageable level of cognitive load. Meanwhile, it has been reported that processing VAs during operation may distract operators and affect operating performance. Text annotations on head-mounted displays can distract subjects and interfere with a reading task, potentially reducing performance (Orlosky et al., 2014). Several subjects in an excavator teleoperation experiment reported that the virtual annotations were distracting during operation and harmed performance (Hong et al., 2020).

Existing studies have identified several important design features, such as format, size, and position, which may play a critical role in a user’s mental process of understanding VAs. Representative VA formats include image (Shapira et al., 2008; Fritsche et al., 2017), sign (Ziaei et al., 2011), and text or video with various properties (Hori and Shimizu, 1999). Both single and multiple formats have been studied in existing work. For instance, a single textual format was used to obtain hypertext information for creating virtual reality concept maps (Verlinden et al., 1993). Yeh et al. (2013) used multiple formats, including color, text, and digits, to explore the effects on collaborative tasks. Pennington (2001) designed a cross-shaped VA and a ring-shaped VA to indicate stopping the action and whistling to warn the workers.

The main criterion for determining the size of VAs is that they should remind people to the greatest extent possible without interfering with the rest of the display (Hori and Shimizu, 1999). Some works fixed the size of the VA, such as images of 640 × 480 pixels (Grasset et al., 2012), while other experiments adopted VAs with flexible sizes. Results have shown that larger VAs are more likely to be detected and responded to by subjects (Orlosky et al., 2014).

With different VA appearing or anchoring positions, users experience different levels of distraction, which affects task performance. In the experiment by Driewer et al. (2005), the anchoring position of the VA changed according to the screen, and the central position received the most attention from the subjects. Highlighting the edges of an obstacle in the forward field of view is more visible to the operator than a radar map in the upper right corner (Hong et al., 2020). In an experiment where participants wore head-mounted displays to read newspapers while walking, participants usually placed text annotations below the center of the screen, avoiding the top left and right corners (Orlosky et al., 2014).

Other aspects, such as color and contrast, are also important factors when designing a VA. The association of traffic signal colors (red, yellow, and green) with meanings such as prohibitions or stops at intersections is globally recognized (Pennington, 2001), just as detected obstacles and danger zones turn red on maps (Driewer et al., 2005). On the other hand, it has been found that humans may focus only on areas of relatively high visual saliency and ignore other areas and views (Sato et al., 2020).

In the context of teleoperating construction equipment, a VA system can help with object identification and target detection on a dynamic construction site. Operators can obtain spatial information about the surrounding environment with the help of VAs. However, presenting VAs to the operator raises another question: how does an operator’s visual system allocate attention between the VA and the work scene?

Human Visual Attention

Researchers have assumed an underlying relationship between attention allocation and teleoperation performance (Riley et al., 2004). Attention has been divided into four categories: preattention, inattention, divided attention, and focused attention (Matthews et al., 2003), and different attention levels lead to different information acceptance (Kahneman, 1973). At the preattention stage, people process objects that are not inherently available for later processing and thus do not affect awareness. Inattention leaves a person not conscious of a perceptual stimulus, although the information may still affect behavior (Fernandez-Duque and Thornton, 2000). Divided attention distributes attention over several objects, and focused attention uses all attentional resources to focus on one stimulus (Matthews et al., 2003).

The information processing of VAs during operation is potentially related to an operator’s visual attention allocation mechanism. In teleoperation, information is mainly obtained through vision, and human attention determines what people attend to or ignore (Anderson, 1980). Attention may be especially important when operators must focus on VAs to gain an accurate assessment of the situation. Sometimes, operators may be susceptible to the saliency effect. For example, salient information from one location may draw most of the operator’s attention, so that information from other locations is ignored (Thomas and Wickens, 2001).

The existing literature has proposed a bottom-up framework for studying visual attention (Bergen and Julesz, 1983). It emphasizes exploring factors that attract attention, such as color and movement (El-Nasr and Yan, 2006). Related studies can be divided into two groups based on whether the research medium is static abstract images or abstract videos with changing backgrounds (Rea et al., 2017). Static images used to be applied in natural scenarios, and videos are usually used in complex scenes with free movement (Chun, 2000; Burke et al., 2005).

Studying human visual attention requires a proper selection of measures. Researchers have adopted different metrics for evaluation, such as response rate, task accuracy with trajectory, work efficiency with time, operation time, number of collisions, and response time (Chen et al., 2007; Menchaca-Brandan et al., 2007; Long et al., 2011; Zornitza et al., 2014; Wallmyr et al., 2019). Among these studies, some have given different weights to the evaluation indices depending on their importance.

With computer vision techniques emerging in the past decade, some researchers have explored the human visual attention mechanism in 2D and 3D fields. Many experimental results are presented as visual attention maps or statistical charts. A visual attention map summarizes the areas of an image most frequently viewed by a group of subjects (Corredor et al., 2017). For example, El-Nasr and Yan (2006) used 2D and 3D games as experimental tasks to obtain two-dimensional and three-dimensional attention maps and then analyzed eye-movement patterns. A dynamic and sometimes hazardous construction site often requires a teleoperator to integrate information from the site scene and VA signs. An operation task already places a certain amount of cognitive load on an operator, so how much attention can the operator afford to spare for processing VAs? Investigating the visual attention allocation mechanism and building an attention map is of great importance in such a context.


A virtual teleoperation platform was developed to carry out the experiment designed for this study. It allows the user to perform an excavating task repeatedly. Different VAs may appear during the experiment, and the user must take a specific action according to the VA that appears. The operating data were recorded throughout the experiment.

Virtual Annotation Design

The design of the VA in this study follows several principles. First, a VA shall convey straightforward information that any operator can understand at first sight. Two shapes, ring and cross, are tested in this experiment (refer to Figure 1). The ring-shaped VA requires the operator to push the honk button while excavating, and the cross-shaped VA requires the operator to cease operation until the VA vanishes. Such a design ensures that an operator can easily understand a VA as long as it is seen. Accordingly, the generated map primarily presents information about the allocation of an operator’s visual attention rather than a complex combination of visual attention, cognitive load, or other factors involved in “thinking.”

Figure 1. Two types of VA: (A) ring-shaped, (B) cross-shaped.

Second, the VA should appear in the right location with a proper size so that it is seen with limited interference to the operator’s view of the work scene. The VA in this study randomly takes one of three sizes: small, medium, and large (Figure 2). The VA appears randomly at any location on the teleoperation screen to investigate the impact of location. In addition, we designed a colorless worksite with red VAs to avoid potential interference from different color contrasts on the site.

Figure 2. Different VA sizes: small, medium, and large.

Experiment Design

The experiment consists of three sessions. Before the experiment started, subjects were required to fill out a pre-task questionnaire to provide information about gender, age, and previous 3D gaming experience. The first session presents all subjects with a short video introducing excavator operation and control (Figure 3). Each subject is then given 5 min to become familiar with the operation.

Determine 3. Introduction video screenshots.

In the second session, participants were informed that the purpose of the test is to move the balls from one trench to another as fast as possible while performing actions according to the VAs that randomly appear on the screen. Subjects are then given 2 min to practice operating with VAs.

The third session is the formal test of 10 min. Figure 4 illustrates the interaction mechanism between the subject and the system. The system initiates the task and begins to display the cross- and ring-shaped VAs at random locations with a random interval of 3–9 s throughout the experiment. The subject operates the excavator through two joysticks. When a VA appears, the subject must respond within 6 s; otherwise, the VA disappears, and the case is counted as a failed VA response. The numbers of balls moved and correct VA responses are displayed in the top left corner of the screen.
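The display rule of the formal session can be sketched in a few lines of Python. This is a minimal illustration, not the Unity implementation used in the study: the names `VAEvent`, `generate_schedule`, and `is_failed` are hypothetical, positions are normalized to the range 0 to 1, and the 3–9 s interval is assumed to be measured between successive VA onsets, which the text does not specify.

```python
import random
from dataclasses import dataclass
from typing import List, Optional

SESSION_S = 600.0          # formal test length (10 min)
GAP_RANGE_S = (3.0, 9.0)   # random interval between VAs
TIMEOUT_S = 6.0            # response window before a VA disappears


@dataclass
class VAEvent:
    onset_s: float   # time at which the VA appears
    shape: str       # "cross" or "ring"
    size: str        # "small", "medium", or "large"
    x: float         # normalized horizontal position (assumption)
    y: float         # normalized vertical position (assumption)


def generate_schedule(seed: int = 0) -> List[VAEvent]:
    """Draw VA onsets, shapes, sizes, and positions for one session."""
    rng = random.Random(seed)
    events: List[VAEvent] = []
    t = rng.uniform(*GAP_RANGE_S)
    while t < SESSION_S:
        events.append(VAEvent(
            onset_s=t,
            shape=rng.choice(("cross", "ring")),
            size=rng.choice(("small", "medium", "large")),
            x=rng.random(),
            y=rng.random(),
        ))
        t += rng.uniform(*GAP_RANGE_S)  # assumed onset-to-onset gap
    return events


def is_failed(response_delay_s: Optional[float]) -> bool:
    """A VA counts as failed if no response arrives within the window.

    Wrong actions (for example, honking at a cross) are not modeled here.
    """
    return response_delay_s is None or response_delay_s > TIMEOUT_S
```

Fixing the seed makes a schedule reproducible across subjects; whether the study used the same or a different random sequence per subject is not stated.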

Figure 4. Interaction mechanism of the experiment.

Experimental Platform

The teleoperation platform is deployed on a computer with a 3.70 GHz Intel(R) Core(TM) processor, 64 GB RAM, and an NVIDIA GeForce RTX 2080 Ti with 11,048 MB VRAM. The excavator simulation software is developed in Unity. The UML class diagram in Figure 5 illustrates the architecture of the software. The excavator model, including the excavator’s motion control, was downloaded from GitHub. The experiment adopts a teleoperation view from inside the cockpit.

Determine 5. UML class diagram of the software program platform.

A pilot test with three participants was conducted before the formal experiment to ensure that the system functions properly. After the formal test, all screen video recordings were carefully reviewed to ensure that the collected data were accurate.


The subjects were recruited from the pool of Zhejiang University students through invitations and flyers. A total of 20 subjects were recruited for the experiments, including 10 females and 10 males. The mean age of the subjects was 23.5 years. None of the participants had construction equipment operation experience. The 3D gaming experience is divided into three types: “never or rarely play,” “not very often but better than the first type,” and “frequently play and good at 3D games,” as suggested by El-Nasr and Yan (2006). Most subjects had previous 3D game experience (Figure 6).

Figure 6. Previous 3D gaming experience of subjects.

Human Visual Attention Assessment Indices

Response rate and response time are analyzed as the two primary assessment indices. Response rate is the ratio of correct responses to the total number of VA presentations (correct plus failed responses). Response time is the duration between the appearance of a VA and the subject’s response to it. The response rate directly measures the subject’s performance, and the response time indicates the difficulty of processing a VA. In addition, we recorded how many balls were moved by each subject as an assessment of excavating productivity.
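As a minimal sketch of how the two indices could be computed from a trial log, the Python snippet below uses a hypothetical `Trial` record and invented values. It assumes that the response rate is the share of presented VAs answered correctly, which is consistent with the reported rates lying between 0 and 1, and that response time is averaged over correct responses only; neither detail is confirmed by the text.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass
class Trial:
    onset_s: float               # time the VA appeared
    response_s: Optional[float]  # time of the response, None if missed
    correct: bool                # True if the required action came in time


def response_rate(trials: List[Trial]) -> float:
    """Share of presented VAs with a correct response (assumed definition)."""
    if not trials:
        raise ValueError("no trials")
    return sum(t.correct for t in trials) / len(trials)


def mean_response_time(trials: List[Trial]) -> float:
    """Mean delay between VA onset and response, over correct trials only."""
    delays = [t.response_s - t.onset_s
              for t in trials
              if t.correct and t.response_s is not None]
    if not delays:
        raise ValueError("no correct responses")
    return mean(delays)


# Hypothetical log for one subject (values invented for illustration).
log = [
    Trial(10.0, 12.0, True),
    Trial(20.0, 23.0, True),
    Trial(30.0, None, False),   # VA disappeared without a response
    Trial(40.0, 41.0, True),
]
```

For this invented log, three of four VAs are answered correctly, and the correct responses take 2, 3, and 1 s.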


Descriptive Statistics Results

The descriptive statistics of gender and 3D game experience are shown in Figure 7. Since only two subjects frequently play 3D games, we combined the two groups of “not very often” and “frequently play.” No clear pattern was found.

Figure 7. Descriptive statistics of gender and 3D game experience.

The response rate, response time, and excavation productivity of each subject were submitted to a t-test, as listed in Table 1. Gender shows no significant effect on response rate, response time, or excavation productivity. Those who play more 3D games tended to respond more quickly (p = 0.054), but the result was not statistically significant.
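The text does not state which variant of the t-test was used. As one plausible illustration, the sketch below computes Welch's two-sample t statistic and its approximate degrees of freedom using only the Python standard library. The function name `welch_t` and the two groups of values are hypothetical and are not the study's data.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return Welch's t statistic and its approximate degrees of freedom."""
    na, nb = len(a), len(b)
    # Squared standard errors of the two group means (sample variance / n).
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical mean response times in seconds (invented for illustration).
frequent_players = [1.0, 2.0, 3.0, 4.0, 5.0]
rare_players = [2.0, 3.0, 4.0, 5.0, 6.0]
t_stat, dof = welch_t(frequent_players, rare_players)
```

Turning the statistic into a p-value requires the Student's t distribution, which the standard library does not provide; `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test and returns the p-value directly.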

Table 1. T-test results.

Response Rate

Table 2 lists the response results. The cross-shaped VA has a better response rate than the ring-shaped VA.

Table 2. Descriptive statistics of response rates.

Figure 8 shows the correct responses for different sizes of VA. The radius of the dots (40 mm) in the scatter chart is estimated based on the vision span theory (Frey and Bosse, 2018). The coordinate system in Figure 8 matches the resolution of the teleoperation screen, and the origin is the center of the screen. Each scattered point marks the position where the corresponding VA appeared on the screen. Figure 9 includes both correct and failed responses. The blue dots stand for correct responses and the red dots for failed responses.

Figure 8. Visualization of correct response numbers: (A) cross VA, (B) ring VA.

Figure 9. Visualization of all response numbers: (A) cross VA, (B) ring VA.

To better visualize the result, we divided the screen into an 8 × 12 grid and calculated an adjusted correct response rate for each grid cell by subtracting the number of failed responses from the number of correct responses. Figure 10 renders the result as a contour map, using a spectrum from warm to cold colors to represent the adjusted correct response values from high to low.
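The grid computation can be illustrated with a short Python sketch. Several details are assumptions rather than facts from the study: the screen resolution of 1920 × 1080 pixels, the reading of “8 × 12” as 8 rows by 12 columns, the origin at the screen center with y pointing upward, and the function name `adjusted_grid`.

```python
from typing import Iterable, List, Tuple

# (x, y, correct): VA position in pixels relative to the screen center,
# and whether the subject responded correctly.
Response = Tuple[float, float, bool]


def adjusted_grid(responses: Iterable[Response],
                  width: int = 1920, height: int = 1080,
                  rows: int = 8, cols: int = 12) -> List[List[int]]:
    """Per-cell count of correct responses minus failed responses."""
    grid = [[0] * cols for _ in range(rows)]
    for x, y, correct in responses:
        # Shift the origin from the screen center to the top-left corner,
        # then scale to cell indices; clamp points lying on the border.
        col = int((x + width / 2) / width * cols)
        row = int((height / 2 - y) / height * rows)
        col = min(cols - 1, max(0, col))
        row = min(rows - 1, max(0, row))
        grid[row][col] += 1 if correct else -1
    return grid


# Hypothetical responses (invented for illustration).
demo = [
    (0.0, 0.0, True),
    (0.0, 0.0, True),
    (0.0, 0.0, False),
    (-960.0, 540.0, False),   # top-left corner of the screen
]
demo_grid = adjusted_grid(demo)
```

The resulting matrix can be passed to any contour-plotting routine to obtain a map similar in spirit to Figure 10. Because the value is a difference of counts, cells that received few VAs stay near zero regardless of how well subjects responded there.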

Figure 10. Visual attention map and corresponding view.

The map identifies four types of areas, as shown in Figure 10. Areas 1 and 4 are close to the edge of the screen. In particular, area 4 corresponds to the blind spot of excavator operation, where the excavator’s boom blocks the view. An operator rarely needs to look at these areas to perform an excavation task. Both have a low adjusted response rate, as expected. Area 2 is near and around the foveal vision region and has the highest response rate. The excavating action mostly happens within this area. An operator must pay enough attention to this area for proper interaction between the excavator and the environment. In addition, subareas A and B within area 2 have high response rates. Subarea A corresponds to the score billboard, and subarea B corresponds to the location of the two trenches for digging and dumping, respectively. It makes sense that an operator pays more attention to these subareas. What remains to be explained is area 3, which is located in the foveal region but presents the lowest response rate.

Response Time

Table 3 shows that most response times are less than 5 s. In general, the response time of the ring VA is longer than that of the cross VA, and the response time is shorter when the size is larger.

Desk 3. Descriptive statistics of response time.

Figure 11 shows the scatter diagrams of the response time. The radius of each dot is calculated by dividing 40 mm by the corresponding response time, so a large radius stands for a short response time. As shown in Figure 11, when a VA appears at the edge of the screen, the operator’s response time is prolonged accordingly. As the size increases, the number of larger dots also increases. The cross VA, on average, needed a longer response time. It should be noted that the cross VA leads to a better response rate, according to Table 2. Figure 12 is the contour map for response time. No clear pattern can be found.
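The dot-size rule for the response-time scatter diagrams amounts to a one-line formula. The helper below is a hypothetical illustration that assumes response times are expressed in seconds.

```python
BASE_RADIUS_MM = 40.0  # dot radius used in the response-rate scatter charts


def dot_radius_mm(response_time_s: float) -> float:
    """Dot radius for the response-time chart: 40 mm divided by the time."""
    if response_time_s <= 0:
        raise ValueError("response time must be positive")
    return BASE_RADIUS_MM / response_time_s
```

Under this rule a response after 2 s is drawn with a 20 mm radius and a response after 5 s with an 8 mm radius, so slow responses shrink toward a point while responses faster than 1 s are drawn larger than the 40 mm reference dot.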

Determine 11. Visualization of response time: (A) cross VA, (B) ring VA.

Figure 12. Map of the adjusted successful response time.

Data Interpretation and Discussion

This study investigated human visual attention with a VA-aided teleoperation system. The results revealed that human attention allocation changed frequently with the different VA properties. This section analyzes the mechanism of human attention allocation in detail.

Visual Attention During Excavator Operation

Figure 10 shows a clear pattern of an operator’s visual attention during the excavating task. A major finding is that the operating task significantly influences an operator’s visual attention. In this experiment, an operator needs to move balls from the left trench to the right trench by performing bucket digging, boom lifting, cabin rotation, and bucket dumping. The operator’s gaze during these actions mainly fell into area 2, especially subarea B in Figure 10. The high response rate in subarea A also supports this finding. In addition, it matches existing knowledge about human visual attention that the best area for the human eye to recognize objects is ±10° horizontally and −30° to +10° around the standard line of sight in the vertical direction (Ren et al., 2012).

The influence of the operating task on attention allocation is likely to override the effect of color contrast. The site background is white in the experiment, and the excavator part is yellow. The red VA should be more conspicuous against the white background than against the yellow background. However, the experiment did not show a difference in performance based on the background color.

The reason for the low response rate in area 3 remains unclear. After carefully reviewing the experiment video recordings several times, we still cannot identify a solid reason. We can only speculate that the saliency effect may contribute to this phenomenon. Although area 3 is in the center of the screen, an operator’s visual attention is drawn to the trenches and the moving bucket most of the time. The trenches and the bucket trace form a ring around the center area, and the center area, just like areas 1 and 4, receives less attention from the operator. However, further investigation is required to validate this speculation. In addition, sensing data, such as eye-movement tracking, electroencephalography (EEG), and electromyography (EMG), could be collected during excavation tasks, as suggested by Lee et al. (2022). The sensing data could provide an opportunity for more direct observation.

Cross Virtual Annotations vs. Ring Virtual Annotations

According to Table 2, the cross VA shows a much better performance in response rate. Although the cross and ring are two common VA shapes used in many existing studies, we observed remarkable differences in this experiment. The ring VA requires the operator to push the honk button, and the cross VA requires the operator to cease operation. Many subjects exhibited a “thinking” process when they saw a ring VA, but very few needed to spend time “thinking” for a cross VA. A possible reason is that the cross shape usually means “stop” in the cultural background and in many practical scenes, such as traffic lights and no-trespassing signs. In addition, the VA color in this study is red, which may reinforce the impression of “stop.” On the other hand, no matter how easy we imagine it is to push the honk button in response to a ring VA, the difficulty level rises dramatically when a subject is under stress during excavator operation.

A practical implication is that we need to carefully consider human common sense and cultural backgrounds when designing VAs. The effect of any small additional cognitive load imposed on an operator under stressful working conditions may be magnified.

Visual Attention by Virtual Annotation Size

Intuitively, as the VA size increases, subjects are more likely to detect VAs. Some experimental data in Table 2 and Figure 8 support this intuitive assumption; however, it seems that the marginal positive effect of increasing VA size is decreasing. For the three sizes, the average response rates are 0.900, 0.914, and 0.915 for the cross-shaped VA and 0.571, 0.577, and 0.526 for the ring-shaped VA, respectively. The data show a trend of improvement from the small to the medium size but not from the medium to the large size.

Considering the vision span theory, according to which the human field of view with sufficient reading resolution typically spans about 6 degrees of arc, the medium-sized VA in this study seems to be close to the ideal maximum size. This raises a critical question: what is the most appropriate VA size? We suggest a larger size in practice. In this experiment, the subjects expect VAs to appear during operation and are very likely to have dedicated a certain amount of attention to VAs. Because VAs may not show up in a regular pattern in a practical scene, a VA may need a more conspicuous way to present itself.
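To relate a visual angle to a physical size on the screen, the standard geometric relation s = 2·d·tan(θ/2) can be used, where d is the viewing distance and θ the visual angle. The sketch below is illustrative only: the text does not report the viewing distance, so the 600 mm used in the example is an assumed value, and `span_on_screen_mm` is a hypothetical helper.

```python
import math


def span_on_screen_mm(viewing_distance_mm: float,
                      visual_angle_deg: float = 6.0) -> float:
    """Width on the screen subtended by a visual angle at a given distance."""
    half_angle_rad = math.radians(visual_angle_deg) / 2.0
    return 2.0 * viewing_distance_mm * math.tan(half_angle_rad)


# Assumed viewing distance of 600 mm (not reported in the text).
span_mm = span_on_screen_mm(600.0)
```

At the assumed 600 mm, a 6° span covers roughly 63 mm of screen width. Because the span grows linearly with distance, an appropriate VA size in millimeters or pixels depends on the seating position and display used.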


The overarching goal of this study was to investigate the operator’s visual attention when VAs are present during excavator teleoperation. A visual attention map is built based on the experiment results, considering the effects of VA size, shape, and appearing location. It is observed that the excavating task influences an operator’s visual attention, and the shape of the VA plays a critical role in allocating visual attention. It is also speculated that the benefit of increasing VA size may have an asymptotic level, and the optimal size is to be studied in the future.

A major question is why there is an attention vacuum area in the center of vision. We suggest future investigations with more subjects, eye-movement tracking, and physiological measurement devices. Testing on different types of construction equipment will also be helpful.

Data Availability Statement

The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author/s.

Ethics Statement

The studies involving human participants were reviewed and approved by the Human Research Ethics Committee. The patients/participants provided their written informed consent to participate in this study.

Author Contributions

JF: writing the draft, methodology, and data analysis. XL: data collection and editing. XS: conceptualization, editing, and supervision. All authors contributed to the article and approved the submitted version.


This research was funded by the Center for Balance Architecture, Zhejiang University, China and the National Natural Science Foundation of China (grant no. 71971196).

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.






