cusp-project-files writeup

Before running any scripts, install the dependencies. If you have root privileges, run ./dependency-scripts/root-install.dependencies.sh. If you do not have root privileges and do not have conda installed, run ./dependency-scripts/install-conda.sh and restart your terminal so that conda is available. Then run ./dependency-scripts/no_root-install-dependencies.sh to install the remaining requirements. In either case, you must also have some version of both Python 2.x and Python 3.x installed for CodeQL to work correctly.
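The choice between the install paths can be sketched as a small shell helper. This is only an illustration: plan_dependency_install is a hypothetical name, not part of this repo, and it assumes that "root privileges" means a user id of 0. It prints the scripts to run, in order, without running them.

```shell
# Hypothetical helper (not part of this repo): prints, in order, the
# dependency scripts to run for a given situation.
#   $1 = numeric user id (assumption: 0 means root privileges)
#   $2 = "yes" if conda is already installed, anything else otherwise
plan_dependency_install() {
    uid=$1
    have_conda=$2
    if [ "$uid" -eq 0 ]; then
        echo "./dependency-scripts/root-install.dependencies.sh"
        return 0
    fi
    if [ "$have_conda" != "yes" ]; then
        # The terminal must be restarted after this script finishes.
        echo "./dependency-scripts/install-conda.sh"
    fi
    echo "./dependency-scripts/no_root-install-dependencies.sh"
}
```

For the current user it could be called as plan_dependency_install "$(id -u)" "$(command -v conda >/dev/null 2>&1 && echo yes || echo no)".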

Static source code analysis relies on CodeQL. The repo already includes queries downloaded on 04/19/2022, but you still need to get CodeQL version 2.8.5. To do so, run ./setup-codeql.sh. After this step, the scripts used for the analysis are available in codeql-cli and queries. It is recommended to have a GitHub token ready for when gh auth login prompts for one; a token can be generated by logging into GitHub in the browser and navigating to the token settings page.

Next, create the required directories with ./setup-dirs.sh, which creates the following:

pr-nums
metadata
rands
topcc-prs
repos
databases
rand-results
cc-results
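
To confirm that the setup step succeeded before starting a run, a check along the following lines could be used. It is a sketch only: missing_dirs is a hypothetical helper, not part of this repo, and it assumes the directories are created directly under the given base directory.

```shell
# Hypothetical helper (not part of this repo): prints the name of every
# expected directory that does not exist under the base directory.
#   $1 = base directory to check (defaults to the current directory)
missing_dirs() {
    base=${1:-.}
    for d in pr-nums metadata rands topcc-prs repos databases rand-results cc-results; do
        if [ ! -d "$base/$d" ]; then
            echo "$d"
        fi
    done
}
```

Running missing_dirs . from the repo root prints nothing when all eight directories are present.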

For a one-click run that generates results, repos.txt must list each target repo on its own line in the <repo-author>/<repo-name> format. For example, repos.txt could contain:

OctoPrint/OctoPrint
Ultimaker/Cura
Klipper3d/klipper
kubernetes-client/python
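
A quick format check on repos.txt before a long run might look like the sketch below. The helper invalid_repo_lines is hypothetical, the character classes are an assumption about which characters appear in author and repo names, and skipping blank lines is also an assumption (the repo's scripts may treat them differently).

```shell
# Hypothetical helper (not part of this repo): prints every non-empty
# line of the given file that is not of the form <repo-author>/<repo-name>.
#   $1 = path to the repo list, e.g. repos.txt
invalid_repo_lines() {
    # Assumption: authors use letters, digits and hyphens; repo names
    # may also contain dots and underscores. Exactly one slash is allowed.
    grep -v '^$' "$1" | grep -Ev '^[A-Za-z0-9-]+/[A-Za-z0-9._-]+$'
    # grep exits non-zero when it prints nothing; that is not an error here.
    return 0
}
```

Calling invalid_repo_lines repos.txt prints nothing for the example list above; a full URL or a bare repo name without an author would be printed as invalid.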

Once repos.txt is ready, run ./iter-repos.sh, which calls collect-metadata.sh and run-codeql.sh in that order. No additional arguments are necessary, as iter-repos.sh passes them to the respective scripts using repos.txt. To speed up static analysis, increase the number of threads by changing the -j flag in run-codeql.sh (lines 27-28 and 40-41); it is currently set to 6 for an 8-core machine.
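
One way to pick a value for -j on a different machine is sketched below. The rule of leaving two cores free is an assumption inferred from the 6-of-8 default, not something the repo states, and suggest_threads is a hypothetical helper.

```shell
# Hypothetical helper (not part of this repo): suggests a thread count
# for a machine with the given number of cores.
#   $1 = number of CPU cores
suggest_threads() {
    cores=$1
    # Assumption: keep two cores free, mirroring 6 threads on 8 cores.
    threads=$((cores - 2))
    if [ "$threads" -lt 1 ]; then
        threads=1
    fi
    echo "$threads"
}
```

On most systems the core count can be obtained with getconf _NPROCESSORS_ONLN, so suggest_threads "$(getconf _NPROCESSORS_ONLN)" gives a starting value to write into run-codeql.sh by hand.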

Repo metadata is stored in ./metadata/<repo-name>-data.csv, the randomly selected PRs to be analyzed are listed in ./rands/<repo-name>-prnums.txt, and the 15 PRs with the highest code churn are listed in ./topcc-prs/<repo-name>-ccprs.txt.
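
The file locations for a given repos.txt entry can be derived as in the following sketch. It assumes that <repo-name> is the part of the entry after the slash; output_paths is a hypothetical helper, not part of this repo.

```shell
# Hypothetical helper (not part of this repo): prints the metadata,
# random-PR and top-churn-PR file paths for one repos.txt entry.
#   $1 = entry in <repo-author>/<repo-name> form
output_paths() {
    # Assumption: <repo-name> is everything after the last slash.
    name=${1##*/}
    echo "./metadata/${name}-data.csv"
    echo "./rands/${name}-prnums.txt"
    echo "./topcc-prs/${name}-ccprs.txt"
}
```

For example, output_paths Klipper3d/klipper lists the three files to look for after a run on that repo.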

Static analysis results for the repos listed in repos.txt are stored in ./rand-results/<repo-name> for the randomly selected PRs and in ./cc-results/<repo-name> for the 15 PRs with the highest code churn.

If you would like to run the most up-to-date version of these tools, download the latest codeql-cli release and the latest queries repository.

Note, however, that per the limitations of this study, the only queries used to collect data are the Python queries categorized as error queries. The list can be found at ./queries/python/ql/src/q.txt.

jayala-29/cusp-project-files
