Every GitHub Object Has Two IDs

4 months ago
  • #Software Development
  • #Reverse Engineering
  • #GitHub API
  • GitHub identifies every object in two ways: a node ID, used by the GraphQL API, and an integer database ID, which appears in URLs.
  • Node IDs are base64-encoded; once decoded, the database ID can be extracted with a bitmask.
  • Decoding a node ID yielded a 96-bit integer whose lower 32 bits hold the database ID.
  • GitHub uses both a legacy and a new node ID format; older repositories still carry the simpler, text-based legacy format.
  • Newer IDs use MessagePack for binary serialization, encoding repository and object IDs in an array.
  • The author wrote a function that converts a node ID to a database ID by decoding it and taking the last element of the array.
  • The exploration showed that GitHub still mixes legacy and new ID formats, which adds complexity to the system.