Hasty Briefs (beta)

Half a million 'Words with Spaces' missing from dictionaries

6 hours ago
  • #word-games
  • #linguistics
  • #dictionaries
  • Nearly half a million compound phrases with spaces are missing from dictionaries.
  • Traditional dictionaries focus on individual words, ignoring many multi-word expressions (MWEs).
  • Merriam-Webster covers only 18% of the top 10,000 MWEs, dropping to 7% across the top 100,000.
  • Wiktionary improves coverage but still misses many MWEs, especially transparent and semi-opaque ones.
  • English has about 250 billion possible two-word combinations, of which 15% are plausible.
  • Some MWEs carry conceptual weight beyond their parts, like 'hot dog' or 'red tape'.
  • Print dictionaries cover only 2-3% of MWEs, while Wiktionary covers 30%.
  • Named entities and technical terms are better covered by Wikipedia and Wiktionary.
  • The line between 'word' and 'phrase' is fuzzy, with some phrases functioning as single semantic units.
  • Lexicographers used substitutability tests to decide which phrases to include, favoring established terms.
  • Wiktionary volunteers, unconstrained by space, focused on phrasal verbs and everyday idioms.
  • For word games, an MWE could be playable if it names a thing; length alone should not decide.
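
The coverage figures above (for example, 18% of the top 10,000 MWEs, falling to 7% at 100,000) are the kind of number you get by checking a frequency-ranked phrase list against a dictionary's headwords at several rank cutoffs. The sketch below shows one way that calculation might look. The function name `coverage_at_cutoffs`, the case-insensitive exact-match rule, and the toy data are all assumptions made for illustration; they are not taken from the original study.

```python
def coverage_at_cutoffs(ranked_mwes, headwords, cutoffs):
    """Return {cutoff: share of the top-`cutoff` phrases found in `headwords`}.

    `ranked_mwes` is assumed to be sorted from most to least frequent.
    Matching is a simple case-insensitive exact match (an assumption;
    a real study would likely normalize spelling and inflection too).
    """
    known = {h.lower() for h in headwords}
    coverage = {}
    for n in cutoffs:
        top = ranked_mwes[:n]
        if not top:
            coverage[n] = 0.0
            continue
        hits = sum(1 for phrase in top if phrase.lower() in known)
        coverage[n] = hits / len(top)
    return coverage


# Toy data, invented purely for illustration.
ranked = [
    "hot dog", "red tape", "ice cream", "boiling water", "swimming pool",
    "front door", "bus stop", "phone call", "coffee table", "parking lot",
]
toy_dictionary = ["hot dog", "red tape", "ice cream", "bus stop"]

print(coverage_at_cutoffs(ranked, toy_dictionary, [2, 5, 10]))
# prints {2: 1.0, 5: 0.6, 10: 0.4}
```

As in the reported figures, coverage in this toy example shrinks as the cutoff grows, because the less frequent phrases are the ones a dictionary is least likely to list.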