GitHub Scrape
Hi, please read this note. I am Siddharth, a first-year MIIS student at Carnegie Mellon’s Language Technologies Institute. I am working on a neural information retrieval project with Prof. Jamie Callan that involves collecting commit data along with the pre-commit and post-commit states of the changed files. I will be querying GitHub’s API for this data on several large repositories (such as PyTorch and TensorFlow) and storing it locally. Later, I plan to optimize this by cloning each repository and building the dataset locally, but for the proof of concept I am using the API (albeit at a small scale)....
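For readers wondering what "pre and post commit file states" could mean in terms of API requests, here is a minimal sketch. It is an illustration only, not the project's actual code. It assumes a commit dictionary shaped like the response of GitHub's REST "get a commit" endpoint (fields `sha`, `parents`, and `files` with `filename`, `status`, and optionally `previous_filename`), and it only plans which repository-contents URLs to request rather than making any network calls. The function names `contents_url` and `plan_file_state_requests` are hypothetical, and merge commits are simplified by looking at the first parent only.

```python
from urllib.parse import quote

API_ROOT = "https://api.github.com"


def contents_url(owner, repo, path, ref):
    """Build a repository-contents URL for one file at one commit."""
    # quote() keeps "/" intact by default, so nested paths stay readable.
    return f"{API_ROOT}/repos/{owner}/{repo}/contents/{quote(path)}?ref={ref}"


def plan_file_state_requests(owner, repo, commit):
    """Return, per changed file, the URLs for its pre- and post-commit state.

    `commit` is assumed to be the parsed JSON of a single-commit response.
    """
    sha = commit["sha"]
    parents = commit.get("parents", [])
    # Simplification: only the first parent is treated as the "pre" state.
    parent_sha = parents[0]["sha"] if parents else None

    plan = []
    for changed in commit.get("files", []):
        status = changed["status"]
        new_path = changed["filename"]
        # Renamed files had a different path before the commit.
        old_path = changed.get("previous_filename", new_path)

        # An added file has no pre-commit state; a removed file has no post state.
        pre = None
        if status != "added" and parent_sha is not None:
            pre = contents_url(owner, repo, old_path, parent_sha)
        post = None
        if status != "removed":
            post = contents_url(owner, repo, new_path, sha)

        plan.append({"path": new_path, "pre_url": pre, "post_url": post})
    return plan


# Hand-written example payload (not real data from any repository).
example_commit = {
    "sha": "bbb",
    "parents": [{"sha": "aaa"}],
    "files": [
        {"filename": "src/a.py", "status": "modified"},
        {"filename": "new.txt", "status": "added"},
        {"filename": "old.txt", "status": "removed"},
    ],
}
example_plan = plan_file_state_requests("some-owner", "some-repo", example_commit)
```

Each entry in `example_plan` pairs a changed file with the URL of its state at the parent commit and at the commit itself, which is the data the project stores locally. Cloning the repository, as planned for later, would avoid these per-file requests altogether.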