Making 10M government PDF documents searchable
14 days ago
- #Search Tool
- #Government Documents
- Government organizations frequently distribute documents as PDF files because they are easy to forward and print.
- Finding and accessing specific PDFs among millions of files can be challenging.
- GovScape, a research project by the University of Washington and Boston University, offers a search interface over the PDFs captured in the End of Term Web Archive’s 2020 crawl.
- The GovScape code is open source and available on GitHub.
- The tool is expected to become increasingly important over time.