Statistically improbable phrase

A statistically improbable phrase (SIP) is a phrase or set of words that occurs more frequently in a document (or collection of documents) than in some larger corpus. Amazon.com uses this concept in determining keywords for a given book or chapter, since keywords of a book or chapter are likely to appear disproportionately within that section.
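The definition above can be illustrated with a minimal sketch that scores each phrase by the ratio of its relative frequency in the document to its relative frequency in a larger corpus. This is only an illustration under stated assumptions, not Amazon.com's actual method, which the article does not describe. The function names `ngrams` and `improbable_phrases`, the fixed phrase length `n`, whitespace tokenization, and the add-one style `smoothing` parameter are all hypothetical choices made for the example.

```python
from collections import Counter


def ngrams(text, n=2):
    """Split text on whitespace and return its lowercase word n-grams."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def improbable_phrases(document, corpus, n=2, top=3, smoothing=1.0):
    """Rank document phrases by document rate divided by corpus rate.

    Assumption: additive smoothing keeps phrases that never appear in
    the corpus from causing a division by zero.
    """
    doc_counts = Counter(ngrams(document, n))
    corpus_counts = Counter(ngrams(corpus, n))
    doc_total = sum(doc_counts.values())
    corpus_total = sum(corpus_counts.values())
    vocab = len(set(doc_counts) | set(corpus_counts))

    scores = {}
    for phrase, count in doc_counts.items():
        doc_rate = count / doc_total
        corpus_rate = (corpus_counts[phrase] + smoothing) / (
            corpus_total + smoothing * vocab
        )
        scores[phrase] = doc_rate / corpus_rate

    # Highest ratio first: phrases far more common here than in the corpus.
    return sorted(scores, key=scores.get, reverse=True)[:top]
```

Called with a short document and a larger background text, the function returns the phrases that are most over-represented in the document. A phrase such as "in the", which is common in both, receives a ratio near 1 and ranks low.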

Source: Wikipedia — Statistically improbable phrase (CC BY-SA 4.0)
