Hasty Briefs (beta)

Merriam-Webster and Unstructured Data Processing

9 days ago
  • #data processing
  • #dictionary
  • #unstructured data
  • Merriam-Webster's dictionary creation process involves collecting and curating unstructured data through 'reading and marking' by editors.
  • Editors then structure that data manually by writing new definitions or revising existing ones, a labor-intensive but high-value step.
  • Ancillary features like etymology and pronunciations add further value to the dictionary.
  • Successful data projects follow a pattern: collect unstructured data, structure it, and offer subsidiary datasets.
  • Google Search and cryptic crossword datasets are cited as examples that follow the same pattern.
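
The three-step pattern in the summary (collect unstructured data, structure it, offer subsidiary datasets) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `Entry` schema, the function names `collect`, `structure`, and `subsidiary`, the invented word "blorp", and the placeholder definition and origin note. It does not describe Merriam-Webster's actual tooling, and the definition step stays manual: the code only attaches text that a human editor supplies.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Entry:
    """A structured record for one word (hypothetical schema)."""
    word: str
    citations: list = field(default_factory=list)
    definition: str = ""  # written by a human editor, not generated here


def collect(passages, marked_words):
    """Step 1: gather raw passages that contain a marked (lowercase) word."""
    citations = defaultdict(list)
    for passage in passages:
        # Crude tokenization: split on whitespace, trim common punctuation.
        tokens = {t.strip(".,;:!?\"'").lower() for t in passage.split()}
        for word in marked_words:
            if word in tokens:
                citations[word].append(passage)
    return citations


def structure(citations, definitions):
    """Step 2: attach editor-written definitions to the collected citations."""
    return {
        word: Entry(word, cites, definitions.get(word, ""))
        for word, cites in citations.items()
    }


def subsidiary(entries, origin_notes):
    """Step 3: derive an ancillary dataset keyed to the structured entries."""
    return {w: origin_notes[w] for w in entries if w in origin_notes}


# Invented sample data; "blorp" is a made-up word.
passages = [
    "The blorp was audible.",
    "She heard a blorp, then silence.",
    "A quiet day.",
]
citations = collect(passages, {"blorp"})
entries = structure(citations, {"blorp": "placeholder definition"})
extras = subsidiary(entries, {"blorp": "placeholder origin note"})
```

The separation mirrors the summary: `collect` is cheap and automatable, `structure` depends on manual input, and `subsidiary` only exists once the structured entries do.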