AI researchers develop computer vision method for highly accurate dichotomous image segmentation

For many years, the computer vision datasets underlying many artificial intelligence (AI) models have provided annotations accurate enough to meet the needs of machine perception systems. However, AI has now reached an era in which sensitive human-computer interaction and immersive virtual experiences demand highly accurate results from computer vision algorithms. Image segmentation, one of the most fundamental computer vision techniques, is key to helping robots perceive and understand the outside world.

Across applications including image editing, 3D reconstruction, augmented reality (AR), satellite image analysis, medical image processing, and robot manipulation, image segmentation can describe targets more precisely than image classification or object detection. Depending on how directly these applications act on physical objects, they can be categorized as “light” (such as image editing and image analysis) or “heavy” (such as manufacturing and surgical robots).

“Light” applications may be more tolerant of segmentation failures and deviations, because these issues mainly add labor and time costs, usually within acceptable limits. In contrast, deviations or failures in “heavy” applications are more likely to have catastrophic consequences, such as physical damage to objects or injuries, sometimes fatal, to people and animals. Models for these applications must therefore be accurate and reliable. Because of their limited accuracy and robustness, most current segmentation models remain poorly suited to “heavy” applications, which keeps segmentation approaches from playing a more important role in a broader range of applications.

The researchers call this task dichotomous image segmentation (DIS), which aims to segment objects from natural images with very high accuracy. They aim to address both “heavy” and “light” applications within one general framework. Existing image segmentation tasks, however, mainly focus on segmenting objects with particular characteristics, such as salient, camouflaged, or meticulous objects, or objects of specific categories. Because most of these tasks share the same input/output formats and rarely use mechanisms designed exclusively for segmenting their targets, almost all of them are dataset dependent.

Unlike semantic segmentation, the proposed DIS task usually focuses on images with one or a few targets, which makes it more feasible to capture complete and accurate information about each target. It is therefore highly promising to develop a category-independent DIS task that accurately segments objects of varying structural complexity, regardless of their characteristics.
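To make the output format concrete: a dichotomous result is a binary mask in which every pixel is either foreground or background, with no category labels. The following is a minimal sketch, assuming a model that emits a per-pixel foreground probability. The `probs` grid, the 0.5 threshold, and the function names are illustrative assumptions and do not come from the paper.

```python
def binarize(probs, threshold=0.5):
    """Turn a grid of foreground probabilities into a 0/1 mask."""
    return [[1 if p >= threshold else 0 for p in row] for row in probs]


def iou(pred, truth):
    """Intersection over union of two equally sized 0/1 masks."""
    inter = sum(p & t for pr, tr in zip(pred, truth) for p, t in zip(pr, tr))
    union = sum(p | t for pr, tr in zip(pred, truth) for p, t in zip(pr, tr))
    # Two empty masks agree perfectly, so treat that case as IoU = 1.
    return inter / union if union else 1.0


# Hypothetical 3x3 probability map and hand-made ground-truth mask.
probs = [
    [0.1, 0.8, 0.9],
    [0.2, 0.7, 0.4],
    [0.0, 0.1, 0.6],
]
truth = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
]

mask = binarize(probs)
print(mask)              # binary foreground/background mask
print(iou(mask, truth))  # overlap between prediction and ground truth
```

Intersection over union is used here only as a familiar overlap score; it is not presented as the paper's evaluation protocol.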

The researchers highlight the following contributions:

  1. DIS5K, a large-scale, extendable DIS dataset that pairs 5,470 high-resolution images with highly accurate binary segmentation masks.
  2. IS-Net, a simple baseline built with intermediate supervision, which reduces overfitting in high-dimensional feature spaces by enforcing direct feature synchronization.
  3. A new metric, human correction efforts (HCE), which estimates the number of human interventions needed to correct faulty regions.
  4. A DIS benchmark built on the new DIS5K, making it the most comprehensive DIS investigation to date.
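The idea behind HCE, counting how much manual fixing a predicted mask would need, can be illustrated with a deliberately crude proxy. The sketch below is not the paper's HCE definition, which this article does not reproduce. It simply assumes that each connected region of false-positive or false-negative pixels costs one correction. All function names and the sample masks are hypothetical.

```python
from collections import deque


def count_regions(grid):
    """Count 4-connected regions of 1s using breadth-first search."""
    if not grid:
        return 0
    height, width = len(grid), len(grid[0])
    seen = [[False] * width for _ in range(height)]
    regions = 0
    for r in range(height):
        for c in range(width):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            regions += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and grid[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return regions


def correction_effort_proxy(pred, truth):
    """Assumed proxy: one fix per connected false-positive or false-negative region."""
    fp = [[int(p == 1 and t == 0) for p, t in zip(pr, tr)] for pr, tr in zip(pred, truth)]
    fn = [[int(p == 0 and t == 1) for p, t in zip(pr, tr)] for pr, tr in zip(pred, truth)]
    return count_regions(fp) + count_regions(fn)


# Hypothetical ground truth and prediction (1 = foreground, 0 = background).
truth = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
pred = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
]

print(correction_effort_proxy(pred, truth))  # two false-positive regions plus one false-negative region
```

A count of this kind rewards predictions whose errors are few and compact, which is the intuition the metric's description points to. A faithful implementation should follow the definition in the paper.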

The dataset is expected to be released soon along with the model on the GitHub repository mentioned below.

This article is a summary written by Marktechpost Staff based on the research paper 'Highly Accurate Dichotomous Image Segmentation'. All credit for this research goes to the researchers on this project. Check out the paper, GitHub repository, and project page.


James G. Williams