Score gets bigger than 1, threshold not functional #910

Open
marconett opened this issue Mar 12, 2025 · 1 comment

marconett commented Mar 12, 2025

Describe the bug

According to the docs (https://docs.orama.com/open-source/usage/search/introduction#what-does-the-search-method-return), score should be between 0 and 1.

Using the example data from the threshold doc (https://docs.orama.com/open-source/usage/search/threshold), scores are between 0 and 1 and filtering results based on thresholds works.

But with the data I am working with, the score gets bigger than 1, which also makes the threshold useless.

As an example, I used the stopwords from this library to show this behavior.

To Reproduce

import { create, insertMultiple, search } from "@orama/orama"
import { stopwords } from '@orama/stopwords/english'

const db = create({
  schema: {
    title: 'string',
  },
})

const getRandomWord = () => ' ' + stopwords[Math.floor(Math.random() * stopwords.length)];

insertMultiple(db, [
  ...stopwords.map(word => ({ title: word })),
  ...stopwords.map(word => ({ title: word + getRandomWord() }))
]);

const result = search(db, {
  term: 'her',
  threshold: 0,
});

console.log(result.hits.map(hit => {
  return {
    title: hit.document.title,
    score: hit.score,
  }
}));

Output:

[
  { title: 'hers', score: 6.584305791656615 },
  { title: 'herself', score: 6.584305791656615 },
  { title: "here's", score: 6.584305791656615 },
  { title: 'here', score: 6.227383875508277 },
  { title: 'her', score: 5.9423872295572 },
  { title: 'her her', score: 5.9423872295572 },
  { title: 'herself which', score: 3.70476794591908 },
  { title: "here's your", score: 3.70476794591908 },
  { title: "how's hers", score: 3.70476794591908 },
  { title: 'before herself', score: 3.70476794591908 }
]

Expected behavior

  • Score should be between 0 and 1.
  • Threshold should work as documented.

Environment Info

OS: MacOS 15.3.2
Node: 18.20.5
Orama: 3.1.2

Affected areas

Search

Additional context

No response

@micheleriva
Member

Hi @marconett, we just released Orama v3.1.6 with a fix for the threshold. Would you mind testing whether your issue is solved? As for scores being > 1, I should probably update the docs. We could rescale the scores to be between 0 and 1, but I'm not sure what the technical advantage would be. Please tell me if I missed something!
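
For readers who need scores in the 0 to 1 range today, the rescaling mentioned above could be done in user code after `search` returns. The following is only a minimal sketch under stated assumptions: `normalizeScores` and `normalizedScore` are hypothetical names that are not part of Orama, and dividing by the largest score in the result set is just one possible rescaling. The resulting values are relative to a single query, so they are not comparable across queries, and this filter is separate from Orama's `threshold` option.

```javascript
// Hypothetical helper, not part of Orama: rescales hit scores into [0, 1]
// by dividing each score by the largest score in the same result set.
function normalizeScores(hits) {
  if (hits.length === 0) return [];
  const maxScore = Math.max(...hits.map((hit) => hit.score));
  // Guard against division by zero when every score is 0.
  if (maxScore <= 0) {
    return hits.map((hit) => ({ ...hit, normalizedScore: 0 }));
  }
  return hits.map((hit) => ({ ...hit, normalizedScore: hit.score / maxScore }));
}

// Sample hits copied from the output reported in this issue.
const sampleHits = [
  { document: { title: 'hers' }, score: 6.584305791656615 },
  { document: { title: 'her' }, score: 5.9423872295572 },
  { document: { title: 'herself which' }, score: 3.70476794591908 },
];

const normalized = normalizeScores(sampleHits);

// Keep only hits at or above a relative cutoff (0.9 is an arbitrary example value).
const strongHits = normalized.filter((hit) => hit.normalizedScore >= 0.9);
console.log(strongHits.map((hit) => hit.document.title));
```

In real use, `sampleHits` would be replaced by `result.hits` from the reproduction above. The top hit always gets a normalized score of 1, so a cutoff like this expresses "close to the best match" rather than an absolute relevance level.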
