Hasty Briefs (beta)

Making 10M government PDF documents searchable

14 days ago
  • #Search Tool
  • #PDF
  • #Government Documents
  • Government organizations frequently distribute documents as PDF files because they are easy to forward and print.
  • Finding and accessing specific PDFs among millions of files can be challenging.
  • GovScape, a research project by the University of Washington and Boston University, offers a search interface over the PDFs captured in the End of Term Web Archive’s 2020 crawl.
  • The GovScape code is open source and available on GitHub.
  • The tool is expected to become increasingly important over time.