Five auto-annotation workflows. Each workflow begins with a descriptor text input. (i) DM sends the descriptor to a text-mining tool, which annotates it with a set of ontology terms. (ii) DCM asks the LLM to preprocess the descriptor into multiple short concepts (blue pieces), which are then annotated by the text-mining tool. (iii) DE uses the LLM to obtain an embedding vector of the descriptor and compares it to a pre-calculated database containing the embedding vectors of all ontology terms. The descriptor is annotated with the terms whose vectors are most similar to the descriptor's vector. (iv) DCE performs the same LLM preprocessing step as DCM to obtain concepts. It then uses the LLM to obtain an embedding vector of each concept and compares each one to the embedding vector database of all ontology terms. The descriptor is annotated with the terms whose vectors are most similar to those of one or more concepts. (v) DCRAG first runs the DE and DCE workflows to obtain a list of candidate ontology terms. It then asks the LLM to choose the most appropriate terms for the descriptor from that list of candidates.
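The embedding-based lookup in DE and DCE can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the caption does not name a similarity measure, so cosine similarity is used here, and the function names (`annotate_de`, `annotate_dce`), the term identifiers, and the three-dimensional vectors are hypothetical stand-ins for real LLM embeddings and ontology terms.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def top_k_terms(query_vec, term_vecs, k):
    """Return the k term identifiers whose vectors are most similar to query_vec."""
    ranked = sorted(term_vecs, key=lambda t: cosine(query_vec, term_vecs[t]), reverse=True)
    return ranked[:k]

def annotate_de(descriptor_vec, term_vecs, k=2):
    """DE-style lookup: one embedding for the whole descriptor."""
    return top_k_terms(descriptor_vec, term_vecs, k)

def annotate_dce(concept_vecs, term_vecs, k=1):
    """DCE-style lookup: one embedding per concept; keep the union of the top terms."""
    annotations = []
    for vec in concept_vecs:
        for term in top_k_terms(vec, term_vecs, k):
            if term not in annotations:  # preserve first-seen order, drop duplicates
                annotations.append(term)
    return annotations

# Hypothetical pre-calculated "database" of ontology term embeddings (toy values).
TERM_VECS = {
    "term:A": [1.0, 0.0, 0.0],
    "term:B": [0.0, 1.0, 0.0],
    "term:C": [0.0, 0.0, 1.0],
}

if __name__ == "__main__":
    print(annotate_de([0.9, 0.4, 0.1], TERM_VECS, k=2))
    print(annotate_dce([[1.0, 0.1, 0.0], [0.0, 0.2, 1.0]], TERM_VECS, k=1))
```

In this sketch, a DCRAG-style workflow would pool the outputs of `annotate_de` and `annotate_dce` into a candidate list and pass that list to the LLM for the final selection. That step is omitted because it depends on the specific model and prompt.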