cusp-project-files writeup

Before running any scripts, install the dependencies. If you have root privileges, simply run ./dependency-scripts/root-install.dependencies.sh. Without root privileges, first make sure conda is installed: if it is not, run ./dependency-scripts/install-conda.sh and restart your terminal so that conda is available. Then run ./dependency-scripts/no_root-install-dependencies.sh to install the remaining requirements. In either case, you must have some version of both Python 2.x and Python 3.x installed for CodeQL to work correctly.
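The choice between the two install paths can be sketched as a small helper. This is a minimal sketch, not part of the repository: the function names `install_steps` and `missing_pythons` are hypothetical, treating an effective UID of 0 as "root privileges" is an assumption (sudo access would not be detected), and so is the assumption that the interpreters are exposed on PATH as `python2` and `python3`. The script paths are copied verbatim from the paragraph above.

```python
import os
import shutil


def install_steps(is_root, has_conda):
    """Return the dependency scripts to run, in order (hypothetical helper)."""
    if is_root:
        return ["./dependency-scripts/root-install.dependencies.sh"]
    steps = []
    if not has_conda:
        # The terminal must be restarted after this script so conda is usable.
        steps.append("./dependency-scripts/install-conda.sh")
    steps.append("./dependency-scripts/no_root-install-dependencies.sh")
    return steps


def missing_pythons():
    """List the interpreters CodeQL needs that are not found on PATH."""
    # Assumption: the interpreters are named `python2` and `python3`.
    return [name for name in ("python2", "python3") if shutil.which(name) is None]


if __name__ == "__main__":
    # Assumption: effective UID 0 stands in for "root privileges" (POSIX only).
    steps = install_steps(os.geteuid() == 0, shutil.which("conda") is not None)
    print("\n".join(steps))
    print("missing interpreters:", missing_pythons())
```

Keeping the decision in a pure function of two booleans makes it easy to check each case without touching the system.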

After this step, the scripts that perform static source code analysis with CodeQL will be available in codeql-cli and queries. The included queries were downloaded on 04/19/2022, but you still need to fetch CodeQL version 2.8.5. To do so, run ./setup-codeql.sh. You will be prompted to authenticate via gh auth login, so it is recommended to have a GitHub token ready; you can create one by logging into GitHub in a browser and navigating here.

Afterwards, create the required directory layout with ./setup-dirs.sh, which creates the following directories:

  • pr-nums
  • metadata
  • rands
  • topcc-prs
  • repos
  • databases
  • rand-results
  • cc-results

For a one-click run that generates results, repos.txt must list each target repository in <repo-author>/<repo-name> format, one per line. An example repos.txt:

OctoPrint/OctoPrint
Ultimaker/Cura
Klipper3d/klipper
kubernetes-client/python
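Because every later step depends on repos.txt being well formed, it can help to validate it first. The sketch below is illustrative only: `parse_repos` is a hypothetical helper, and the character set allowed in author and repository names is an assumption.

```python
import re

# Assumption: names contain only letters, digits, '.', '_' and '-'.
REPO_LINE = re.compile(r"^[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+$")


def parse_repos(text):
    """Split repos.txt content into (author, name) pairs; reject malformed lines."""
    repos = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # tolerate blank lines
        if not REPO_LINE.match(line):
            raise ValueError(
                f"line {lineno}: expected <repo-author>/<repo-name>, got {line!r}"
            )
        author, name = line.split("/")
        repos.append((author, name))
    return repos


example = """OctoPrint/OctoPrint
Ultimaker/Cura
Klipper3d/klipper
kubernetes-client/python
"""

if __name__ == "__main__":
    for author, name in parse_repos(example):
        print(author, name)
```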

After this is done, simply run ./iter-repos.sh, which calls collect-metadata.sh and run-codeql.sh, in that order. No additional arguments are needed: iter-repos.sh reads repos.txt and passes the entries to the respective scripts. To speed up static analysis, increase the number of threads by changing the -j flag in run-codeql.sh (lines 27-28 and 40-41), which is currently set to 6 for an 8-core machine.
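The order of operations described above can be written out as a dry-run plan. This is only a sketch of the control flow, not a reimplementation of iter-repos.sh: `build_plan` is a hypothetical name, and both the per-repository interleaving and the idea that each script takes the `<repo-author>/<repo-name>` string as its single argument are assumptions, since the actual arguments are not documented here.

```python
def build_plan(repos):
    """Return (script, repo) steps: metadata collection, then static analysis."""
    plan = []
    for repo in repos:
        # Assumption: both scripts run for one repo before moving to the next.
        plan.append(("./collect-metadata.sh", repo))
        plan.append(("./run-codeql.sh", repo))
    return plan


if __name__ == "__main__":
    # Print the plan instead of executing anything.
    for script, repo in build_plan(["OctoPrint/OctoPrint", "Ultimaker/Cura"]):
        print(script, repo)
```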

Repo metadata is stored in ./metadata/<repo-name>-data.csv, the randomly selected PRs to be analyzed are listed in ./rands/<repo-name>-prnums.txt, and the 15 PRs with the most code churn are listed in ./topcc-prs/<repo-name>-ccprs.txt.
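The naming scheme for these three files can be captured in a small lookup. This is a minimal sketch: `output_paths` is a hypothetical helper, and it assumes `<repo-name>` is the part of a repos.txt entry after the slash.

```python
def output_paths(repo):
    """Map '<repo-author>/<repo-name>' to the three per-repo files."""
    # Assumption: <repo-name> is everything after the slash.
    name = repo.split("/")[-1]
    return {
        "metadata": f"./metadata/{name}-data.csv",
        "random_prs": f"./rands/{name}-prnums.txt",
        "top_churn_prs": f"./topcc-prs/{name}-ccprs.txt",
    }


if __name__ == "__main__":
    for kind, path in output_paths("Klipper3d/klipper").items():
        print(kind, path)
```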

Static analysis results for the repos listed in repos.txt can be found in ./rand-results/<repo-name> for the randomly selected PRs, and in ./cc-results/<repo-name> for the 15 PRs with the most code churn.
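The two PR sets referred to above, a random sample and the 15 PRs with the most churn, could be drawn from per-PR metadata roughly as follows. This is a sketch under loud assumptions, not the repository's method: the field names `number`, `additions`, and `deletions` are hypothetical, churn is taken to mean additions plus deletions, and the random sample size is left as a parameter because the source does not state it.

```python
import random


def top_churn(prs, k=15):
    """Return the k PR numbers with the largest churn, highest first."""
    # Assumption: churn = lines added + lines deleted.
    ranked = sorted(
        prs, key=lambda pr: pr["additions"] + pr["deletions"], reverse=True
    )
    return [pr["number"] for pr in ranked[:k]]


def random_prs(prs, k, seed=None):
    """Return up to k distinct PR numbers chosen uniformly at random."""
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    numbers = [pr["number"] for pr in prs]
    return rng.sample(numbers, min(k, len(numbers)))


if __name__ == "__main__":
    # Synthetic data purely for illustration.
    demo = [{"number": i, "additions": i * 10, "deletions": i} for i in range(1, 21)]
    print(top_churn(demo))
    print(random_prs(demo, 5, seed=0))
```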

To run the most up-to-date version of these tools, download the latest codeql-cli here and the latest queries repository here.

However, because of the limitations of this study, the only queries used to collect data are the Python queries categorized as error queries. The list can be found at ./queries/python/ql/src/q.txt.
