Two Sets of Models for Entity Detection:
NuNER & NuNER Zero
NuNER
NuNER-v2.0
This model provides the best embeddings for NER in English. RoBERTa-base fine-tuned on an expanded version of the NuNER dataset.
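As a sketch of how these embeddings might be extracted with the Hugging Face `transformers` API (the pooling choice here is an assumption; the model card may recommend a different strategy, such as concatenating the last two hidden layers):

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Load the fine-tuned RoBERTa-base encoder from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("numind/NuNER-v2.0")
model = AutoModel.from_pretrained("numind/NuNER-v2.0")

text = "NuMind is based in Paris."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# One embedding per token; these feed a downstream NER head or classifier.
token_embeddings = outputs.last_hidden_state  # shape: (1, seq_len, hidden_size)
print(token_embeddings.shape)
```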
NuNER_multilingual-v0.1
Best embeddings for multilingual NER.
Multilingual BERT fine-tuned on an artificially annotated multilingual subset of the OSCAR dataset.
NuNER
The OG.
Few-shot version of the collection.
Entity recognition encoder pre-training via LLM-annotated data.
NuNER Zero
NuNER_Zero
Reliably handles long entities of 5 tokens or more.
Trained on a diverse dataset tailored for real-life use cases.
NuNER_Zero-span
Span-prediction version of NuNER_Zero.
For entities shorter than 12 tokens.
NuNER_Zero-4k
Long-context version of NuNER_Zero.
For applications where context size matters.
Comparison of NuNER with LLMs. Dashed curves indicate in-context learning and solid curves indicate fine-tuning.