Entity Extraction

Description: Evaluates an LLM's ability to identify and extract specific entities from extensive text, given predefined definitions and potential values for each entity.

Number of Samples: 131

Language: Polish

Provider: OLX Group

Evaluation Method: F1 score, measuring how accurately the LLM extracts entities from the given text. Ground truth was first labeled by a domain expert, then reviewed thoroughly and finalized by a second expert.
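
For illustration, an F1 score for this kind of task could be computed roughly as in the sketch below. The benchmark does not state how matches are counted or how scores are averaged, so this assumes exact matching on (entity, value) pairs and micro-averaging across samples. The function name `micro_f1` and the entity names in the example are hypothetical and are not taken from the benchmark data.

```python
from typing import Dict, List

# One extraction per sample: entity name -> extracted value.
Extraction = Dict[str, str]


def micro_f1(predictions: List[Extraction], references: List[Extraction]) -> float:
    """Micro-averaged F1 over exact (entity, value) pairs.

    Assumption: a prediction counts as correct only if both the entity
    name and its value match the ground truth exactly.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")

    tp = fp = fn = 0
    for pred, gold in zip(predictions, references):
        pred_pairs = set(pred.items())
        gold_pairs = set(gold.items())
        tp += len(pred_pairs & gold_pairs)   # extracted and correct
        fp += len(pred_pairs - gold_pairs)   # extracted but wrong or spurious
        fn += len(gold_pairs - pred_pairs)   # present in ground truth but missed

    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example with made-up entities.
predictions = [{"brand": "Audi", "color": "red"}, {"brand": "BMW"}]
references = [{"brand": "Audi", "color": "blue"}, {"brand": "BMW", "fuel": "diesel"}]
score = micro_f1(predictions, references)
```

In this example there are 2 true positives, 1 false positive, and 2 false negatives, giving a precision of 2/3, a recall of 1/2, and an F1 of 4/7 (about 0.57). A macro-averaged or per-entity variant would give different numbers.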

Data Collection Period: April 2024 - May 2024

Last updated: June 19, 2024


Have a unique use-case you’d like to test?

We want to evaluate how LLMs perform on your specific, real-world task. You might discover that a small, open-source model delivers the performance you need at a lower cost than proprietary models. We can also add custom filters to give you more insight into LLM capabilities. Each time a new model is released, we'll provide you with updated performance results.
