AI computer vision in RPA

Virtual desktop infrastructures (VDIs) based on Citrix, VMware or Windows Remote Desktop pose particular challenges for Robotic Process Automation (RPA). In general, RPA relies on so-called selectors, which use the underlying properties of a user interface (UI) and its elements. Text fields and buttons in native desktop applications are identified this way, and correctly configured selectors lead to reliable and robust automations. In a virtual environment, however, the UI we see is just an image streamed from the remote desktop, so no selectors can be identified. Automation efforts therefore fall back on optical character recognition (OCR) or «image matching». These methods are not as reliable as selectors, but they can still produce usable results. Note, however, that even minor changes to the UI or a different screen resolution can cause inaccuracies and errors that break the automation.
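To illustrate why image matching is sensitive to the screen resolution, here is a minimal Python sketch. It is not how any RPA product implements image matching. It assumes a naive, pixel-exact template search, and the `button` pattern and the `screen` grid are made-up toy data.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]  # grayscale pixel values, row by row


def find_exact(screen: Image, template: Image) -> Optional[Tuple[int, int]]:
    """Return (row, col) of the first pixel-exact occurrence of template, else None."""
    th, tw = len(template), len(template[0])
    for r in range(len(screen) - th + 1):
        for c in range(len(screen[0]) - tw + 1):
            if all(screen[r + i][c:c + tw] == template[i] for i in range(th)):
                return (r, c)
    return None


def scale_nearest(img: Image, factor: int) -> Image:
    """Upscale by an integer factor using nearest-neighbour sampling."""
    return [[px for px in row for _ in range(factor)]
            for row in img for _ in range(factor)]


# Hypothetical 2x3 "button" pattern placed on a blank 6x6 screen.
button: Image = [[9, 1, 9],
                 [1, 9, 1]]
screen: Image = [[0] * 6 for _ in range(6)]
for i, row in enumerate(button):
    for j, px in enumerate(row):
        screen[2 + i][2 + j] = px

print(find_exact(screen, button))                    # (2, 2)
print(find_exact(scale_nearest(screen, 2), button))  # None
```

The same button is still visible on the scaled screen, but the stored template no longer fits pixel for pixel. Real image matching tools usually tolerate some deviation, yet the underlying dependency on the rendered pixels remains.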

«AI Computer Vision» from UiPath addresses these challenges. An algorithm (1) enables human-like recognition of user interfaces by «using a mixture of AI, OCR, fuzzy matching of texts and an anchor system that holds everything together». As a result, the Computer Vision activities enable the software robot to visually identify the elements of the transmitted remote desktop image. AI Computer Vision does not rely on image matching, which makes the automated process steps in the workflow resistant to interface adjustments or a changed screen resolution. Beyond virtual environments, the Computer Vision activities can also be used for element recognition in other cases, such as SAP, Silverlight, PDFs or images in general.
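The idea of combining fuzzy text matching with an anchor can be sketched in a few lines of Python. This is only an illustration of the general technique and not UiPath's actual algorithm, which is not described here. The `Element` type, the coordinates, the threshold and the helper names are all assumptions made for the example; the text comparison uses `difflib.SequenceMatcher` from the Python standard library.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Optional


@dataclass
class Element:
    kind: str  # hypothetical element type, e.g. "label" or "input"
    text: str  # text as read by OCR (may contain recognition errors)
    x: int     # left edge in pixels
    y: int     # top edge in pixels


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def find_label(elements: List[Element], wanted: str,
               threshold: float = 0.75) -> Optional[Element]:
    """Return the label whose OCR text is most similar to `wanted`, if similar enough."""
    labels = [e for e in elements if e.kind == "label"]
    best = max(labels, key=lambda e: similarity(e.text, wanted), default=None)
    if best is not None and similarity(best.text, wanted) >= threshold:
        return best
    return None


def input_right_of(elements: List[Element], anchor: Element,
                   max_row_offset: int = 10) -> Optional[Element]:
    """Return the nearest input field to the right of the anchor on roughly the same row."""
    candidates = [e for e in elements
                  if e.kind == "input"
                  and e.x > anchor.x
                  and abs(e.y - anchor.y) <= max_row_offset]
    return min(candidates, key=lambda e: e.x - anchor.x, default=None)


# Made-up detection result; OCR has misread "m" as "rn" in the first label.
elements = [
    Element("label", "Custorner Nurnber", 40, 100),
    Element("input", "", 220, 102),
    Element("label", "Order Date", 40, 140),
    Element("input", "", 220, 141),
]

anchor = find_label(elements, "Customer Number")
field = input_right_of(elements, anchor) if anchor else None
print(field)
```

Because the target field is located relative to a label and the label is matched by similarity rather than exact text, the lookup in this sketch still works when OCR misreads a few characters or when the whole form moves on the screen.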

To use the Computer Vision activities in UiPath Studio, the official UiPath package «UiPath.ComputerVision.Activities» has to be added. A demo of AI Computer Vision can be found at this link:


(1) The algorithm is already integrated in the software and is improved by UiPath itself, not by the user. There is therefore no lead time and no need to train the machine learning algorithm yourself before you can use the activities. If someone from the UiPath community discovers an element in a virtual environment that cannot yet be identified, the element or area can be reported to UiPath with a simple function. This improves AI Computer Vision for all developers and users.

Routinuum GmbH – your integration partner?

Our company specializes in the conception and implementation of RPA projects and is a UiPath partner. We support our customers in all project phases in order to anchor the possibilities of this technology in their organization for the long term. Have we piqued your interest? Please contact us without obligation for more information on the topic or to discuss possible uses in your company.
