Every GitHub Object Has Two IDs
4 months ago
- #Software Development
- #Reverse Engineering
- #GitHub API
- GitHub gives every object two separate IDs: a node ID (used by the GraphQL API) and a database ID (used in URLs).
- Node IDs are base64-encoded; once decoded, the database ID can be extracted with a bitmask.
- Decoding a node ID yields a 96-bit integer, with the database ID embedded in the lower 32 bits.
- GitHub has legacy and new ID formats, with older repositories using a simpler, text-based format.
- Newer IDs use MessagePack for binary serialization, encoding repository and object IDs in an array.
- The resulting function converts a node ID to a database ID by decoding it and taking the last element of the array.
- GitHub uses a mix of legacy and new ID formats, which adds complexity to its ID system.
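
The conversion summarized in the points above can be sketched roughly as follows. This is a minimal illustration, not the author's actual function. All function names are hypothetical. It assumes that new-format IDs look like `<PREFIX>_<base64 payload>` and that the array holds only unsigned integers. The bitmask shortcut is only valid when the payload is exactly 12 bytes (96 bits). The sample IDs are built by hand for the demo and do not refer to real GitHub objects.

```python
import base64
import re

UINT_WIDTHS = {0xCC: 1, 0xCD: 2, 0xCE: 4, 0xCF: 8}  # MessagePack uint8..uint64 tags


def b64decode_unpadded(text: str) -> bytes:
    """Decode base64 (standard or URL-safe alphabet), restoring stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def database_id_by_mask(payload: bytes) -> int:
    """Bitmask route: read a 96-bit payload as one integer, keep the low 32 bits."""
    if len(payload) != 12:
        raise ValueError("bitmask shortcut assumes a 96-bit payload")
    return int.from_bytes(payload, "big") & 0xFFFFFFFF


def decode_uint_array(payload: bytes) -> list[int]:
    """Tiny MessagePack subset: a fixarray whose items are all unsigned integers."""
    if not payload or not 0x90 <= payload[0] <= 0x9F:
        raise ValueError("expected a MessagePack fixarray")
    items, pos = [], 1
    for _ in range(payload[0] & 0x0F):  # low nibble of the tag is the item count
        tag = payload[pos]
        if tag <= 0x7F:  # positive fixint: the tag byte is the value itself
            items.append(tag)
            pos += 1
        elif tag in UINT_WIDTHS:
            width = UINT_WIDTHS[tag]
            items.append(int.from_bytes(payload[pos + 1 : pos + 1 + width], "big"))
            pos += 1 + width
        else:
            raise ValueError(f"unsupported MessagePack tag 0x{tag:02x}")
    return items


def node_id_to_database_id(node_id: str) -> int:
    """Return the database ID for either ID format (layouts are assumptions)."""
    _prefix, sep, encoded = node_id.partition("_")
    if sep:  # assumed new format: "<PREFIX>_<base64 MessagePack array>"
        return decode_uint_array(b64decode_unpadded(encoded))[-1]
    # assumed legacy format: base64 of text that ends in the decimal database ID
    text = b64decode_unpadded(node_id).decode("ascii")
    match = re.search(r"\d+$", text)
    if match is None:
        raise ValueError("legacy node ID does not end in digits")
    return int(match.group())


# Hand-built sample IDs (hypothetical, for illustration only).
sample_payload = (
    bytes([0x93, 0x00])                               # 3-item array, first item 0
    + bytes([0xCE]) + (1296269).to_bytes(4, "big")    # repository ID as uint32
    + bytes([0xCE]) + (123456789).to_bytes(4, "big")  # object ID as uint32
)
sample_new_id = "PR_" + base64.urlsafe_b64encode(sample_payload).decode().rstrip("=")
sample_legacy_id = base64.b64encode(b"010:Repository1296269").decode()

print(node_id_to_database_id(sample_new_id))     # 123456789
print(database_id_by_mask(sample_payload))       # 123456789
print(node_id_to_database_id(sample_legacy_id))  # 1296269
```

Both routes agree on the 12-byte sample because a 32-bit MessagePack integer occupies the final four bytes of the payload. Reading the last array element is the safer choice, since it does not depend on the payload having that exact length.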