Artificial intelligence (AI) is not a panacea for effortlessly solving the planet’s environmental problems. AI still sparks passionate and dystopian predictions within some parts of the academic community, especially in the natural sciences. For some, the existence of AI tools means an existential threat to human creativity.10 Concerns about the increasing environmental costs of carbon emissions1 and water use demanded by information and communication technologies are also on the horizon. These viewpoints, however, overlook the advantages of employing AI in biodiversity research.
It is time to address the elephant in the room. In the catastrophic scenario of declining species numbers in the Anthropocene, computer scientists and biologists must work together for a deeper understanding of Earth’s biota. Solving our shared environmental problems will require collaboration between major companies and academic research groups. It is a two-way path: we need AI developments that meet the demands of biologists, ecologists, botanists, and zoologists and, at the same time, minimal standardization of species datasets (species description templates, georeferences, molecular markers, metadata) that allows the effective training of AI-based tools for scientific purposes.
Recognizing biodiversity is more than a matter of terminology discussed in seminars in natural history museums or outdated university departments. It is estimated that there are nearly 100 million species on the planet, but only approximately two million have been formally described. To a somewhat shocking degree, we do not even know what we do not know. Among many other benefits, understanding the richness of this unknown biota can represent an economic asset, benefiting pharmacological and medical industries as well as serving as a cornerstone for deep tech companies, which can explore biodiversity sustainably while respecting environmental integrity and the knowledge of indigenous and traditional populations. AI could be used to revolutionize ecosystem conservation and biodiversity description by analyzing vast and varied data sources, ranging from assemblages of fossilized trilobites and extinct dinosaurs to the myriad morphological attributes of a single insect wing.
Biological taxonomy, the science of identifying, describing, and classifying organisms, has a long heritage of using technology. For example, modern biologists use software for illustration and digital photography, and for ecological, phylogenetic, and biogeographical analysis. Computational tools to aid the preparation of species descriptions date back to the 1970s. Dallwitz’s program2 for constructing identification keys is an example. Over the years, this system has evolved into DELTA (DEscription Language for TAxonomy), which serves as a comprehensive system for encoding species descriptions for computer processing. Computer-assisted biological taxonomy remains a prominent topic in the field.6,8
Today, the biologists’ workflow is a dynamic blend of traditional methods and technological advancements. While the core principles of the activity continue to be rooted in meticulous human-based observation and classification (fieldwork, specimen collection and mounting, manual species identification, collection curation), the integration of digital tools has streamlined and enhanced the process. Considering the advance of generative AI (GenAI), we have all the ingredients to develop efficient and consistent AI-based routines that will replace systems such as DELTA in species recognition and description, improving precision and comparability and accelerating the process of biodiversity recognition and documentation.
AI has already made significant strides in the field of biological taxonomy. Deep learning and computer vision allied to sensors have been used to validate image-based taxonomic identification and to develop public and curated reference databases.3 Well-established machine learning approaches, such as convolutional neural networks (CNNs) and random forests, have helped recognize patterns from images and identify insect species.4,7 We are currently investigating the power of Vision Transformer (ViT) methods5 to identify and classify species, considering the intrinsic morphological complexity of insect groups, our target taxon. However, a gap exists between current computational approaches in biology and the state of the art in GenAI research, suggesting ample room for further advancement. From the biological point of view, computer scientists who understand the immensity of the issues related to diversity loss and climate change are greatly needed.
We face interesting opportunities when using GenAI for semi-automated species description from photographs and illustrations, preparation of structured taxonomic papers from notes and information extracted from simple sheets, and construction of character lists for evolutionary and phylogenetic analyses. Some popular AI tools based on large language models (LLMs), such as ChatGPT and Bard/Gemini, are not yet fine-tuned enough to produce scientifically accurate results, but the initial outputs are exciting. In fact, the current generation of LLMs can identify morphological body patterns in images, even when organisms are camouflaged in their natural habitat. However, they cannot definitively determine whether a specific entity belongs to a recognized species among a wide variety of biological groups, especially the most diverse ones, such as insects.
In standard taxonomic procedure, dichotomous identification keys are used by biologists to classify specimens in particular taxonomic categories (order, family, genus, and species, to name a few) based on observation under optical microscopes, scanning microscopes, and stereomicroscopes. This meticulous activity is time-consuming and not error-free. If efficient and accurate AI tools that are less prone to the variation seen among human analysts could be developed, they could have a huge impact on the near future of biological taxonomy. As species identification is fundamental for diversity measurements used in environmental conservation strategies, as well as medical and epidemiological analyses, boosting efficiency in taxonomy is crucial in the contemporary context of climate change and its adverse consequences on natural environments.
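For readers from computer science, a dichotomous key can be thought of as a binary decision tree in which each node is a yes/no question about an observable character. The following is a minimal Python sketch under that assumption; the `Couplet` class, the `identify` function, and the toy key to insect orders are hypothetical illustrations with deliberately simplified characters, not a published key or the format used by DELTA.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Couplet:
    """One step of a key: a yes/no question with two outcomes (hypothetical type)."""
    question: str
    if_yes: Union["Couplet", str]  # either a further couplet or a taxon name
    if_no: Union["Couplet", str]


def identify(key, answer):
    """Walk the key from its root until a taxon name is reached.

    `answer` maps a question string to a truthy/falsy value; it stands in
    for the analyst's observation of the specimen.
    Returns the taxon and the list of (question, reply) steps taken,
    so the path to the identification can be reviewed afterwards.
    """
    path = []
    node = key
    while isinstance(node, Couplet):
        reply = bool(answer(node.question))
        path.append((node.question, reply))
        node = node.if_yes if reply else node.if_no
    return node, path


# Toy key with simplified characters, for illustration only.
TOY_KEY = Couplet(
    "Only one pair of functional wings?",
    "Diptera",
    Couplet(
        "Forewings hardened into elytra?",
        "Coleoptera",
        "Other",
    ),
)

# Observations recorded for one specimen (assumed values).
observations = {
    "Only one pair of functional wings?": False,
    "Forewings hardened into elytra?": True,
}
taxon, path = identify(TOY_KEY, observations.get)
print(taxon)
for question, reply in path:
    print(f"  {question} -> {'yes' if reply else 'no'}")
```

Returning the path alongside the taxon is a deliberate choice: it records which characters led to the identification, which is what a second analyst would need to check the result.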
An even more complex task is describing species from scratch. Given that the work of taxonomists to document new species necessarily involves high-definition photos, electron micrographs, and illustrations followed by detailed morphological descriptions, the development of AI-based tools that recognize patterns in images, compare them with known species, identify new species, and produce structured taxonomic descriptions would significantly speed up the recognition of biota, especially in countries with few professional biologists and insufficient funding for basic research. The computational challenge involves the developers’ recognition of the peculiarities of biological studies and the importance of detail beyond identifying general patterns.
Biological taxonomy is the first step toward the understanding of species’ relationships and evolutionary history. Since a significant portion of the biota that once existed on the planet will never be known, gaps in the reconstruction of evolutionary trees are common. Data augmentation driven by GenAI could play a relevant role here precisely because, based on the recognition of morphological patterns in described species, it could generate new data to train models and suggest putative species that would help explain critical evolutionary transitions that have happened in the four billion years since the origin of life on Earth. Aided by AI, knowing the past of the planet’s biota, notably the periods of mass extinctions in which a significant part of life disappeared, could allow us to build conceptual and practical tools to deal with the biodiversity crisis we are experiencing now.
As most taxonomic research happens in front of a computer, regardless of the time spent in fieldwork or at the bench, the training of the next generation of biologists will have to consider AI’s ubiquity. Despite the progress, biologists still approach the reliability of AI cautiously. Concerns linger regarding the possibility of errors and inaccuracies in automated processes: the fear of an AI mishap leading to flawed taxonomy and subsequent academic repercussions is palpable. Valan et al.9 provided questions and answers and a case study about how taxonomists can confidently use off-the-shelf CNNs. The main issue is that biologists have not fully accepted this perspective. In this sense, while embracing technological advancements is essential to tackle the huge scale of the problem, we also need ways of automating activities without removing humans from the review and error correction processes. Any technological tool aimed at revolutionizing biodiversity studies must balance automation and human oversight, ensuring accuracy, reliability, and user trust.
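One common way to keep humans in the review loop is to route low-confidence automated identifications to an expert queue instead of accepting them outright. The sketch below illustrates that idea only; the `Prediction` record, the `triage` function, and the 0.9 threshold are assumptions made for the example, and a real model's scores would need calibration before any threshold could be trusted.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """Output of some image classifier for one specimen (hypothetical record)."""
    specimen_id: str
    taxon: str
    confidence: float  # model score in [0, 1]; not necessarily a calibrated probability


def triage(predictions, threshold=0.9):
    """Split predictions into provisionally accepted ones and an expert review queue.

    The threshold is an illustrative value, not a recommendation.
    """
    accepted, needs_review = [], []
    for p in predictions:
        if p.confidence >= threshold:
            accepted.append(p)
        else:
            needs_review.append(p)
    # Least confident first, so experts see the hardest cases at the top.
    needs_review.sort(key=lambda p: p.confidence)
    return accepted, needs_review


batch = [
    Prediction("spec-001", "Taxon A", 0.97),
    Prediction("spec-002", "Taxon B", 0.41),
    Prediction("spec-003", "Taxon A", 0.72),
]
accepted, needs_review = triage(batch)
print("accepted:", [p.specimen_id for p in accepted])
print("for expert review:", [p.specimen_id for p in needs_review])
```

Even the accepted identifications remain provisional in this design; periodic spot checks by a taxonomist would still be needed to catch confident mistakes.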
The Anthropocene presents an unparalleled challenge to human civilization. The recent tragedy in the Brazilian state of Rio Grande do Sul, in which nearly 90% of the state’s cities and two million people were affected by historic rains and floods, is a clear example of how ignoring serious environmental policies can have disastrous social, economic, and environmental consequences. Dealing with the current environmental crisis is pivotal for humanity’s future, and the collaborative efforts of computer scientists and biologists are essential in this regard. The ability to solve biodiversity-related problems through computational thinking will depend on the developers’ understanding of the biological contexts in which the problems exist. In a nutshell, training on massive datasets and publishing appealing methods are not enough. In biological sciences, the debate about AI should transcend technical advancements alone. As we move forward, we need to ensure AI is a bridge rather than a barrier hindering our pursuit of understanding and preserving the natural world.