chore: add a GH action for language checking using openAI #4271

Merged 15 commits on Apr 28, 2025
59 changes: 59 additions & 0 deletions .github/workflows/style.yml
@@ -0,0 +1,59 @@
name: Style
on:
  pull_request:

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout scripts directory
        uses: actions/checkout@v4
        with:
          repository: vaadin/docs
          path: .
          sparse-checkout: scripts/

      - uses: actions4gh/[email protected]

      - name: Fetch changes and review them
        id: review
        run: |
          # Run the script to fetch changes and send them to OpenAI
          R=$(./scripts/style-check/style-check.sh ${{ github.event.pull_request.number }} ${{ github.repository }}) || exit 1

          # Show the response from OpenAI (set -x traces the assignment)
          set -x
          R=$(echo "$R" | jq -r '.choices[0].message.content') || exit 1
          set +x
          [ -z "$R" ] && echo "Nothing to review" && exit 0

          # Store the response in the output variable 'result'
          echo "result<<EOF" >> "$GITHUB_OUTPUT"
          printf '%s\n' "$R" >> "$GITHUB_OUTPUT"
          echo "EOF" >> "$GITHUB_OUTPUT"
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          OPENAI_API_KEY: ${{ secrets.OPENAI_TOKEN }}
        shell: bash

      - if: ${{ success() }}
        uses: peter-evans/find-comment@v3
        id: fc
        with:
          issue-number: ${{ github.event.pull_request.number }}
          body-includes: AI Language Review

      - if: ${{ success() && steps.review.outputs.result }}
        uses: peter-evans/create-or-update-comment@v3
        with:
          comment-id: ${{ steps.fc.outputs.comment-id }}
          issue-number: ${{ github.event.pull_request.number }}
          edit-mode: replace
          body: |
            ## AI Language Review
            ${{ steps.review.outputs.result }}

      - if: ${{ success() && steps.fc.outputs.comment-id && !steps.review.outputs.result }}
        run: gh api --method DELETE /repos/${{ github.repository }}/issues/comments/${{ steps.fc.outputs.comment-id }}
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
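
Review note: the `result<<EOF` lines in the review step use the delimiter syntax that GitHub Actions defines for multi-line step outputs. With a fixed `EOF` delimiter, the value ends early if the review text itself contains a line reading `EOF`. The following is a minimal sketch, not part of this PR, of a helper that picks a random delimiter instead. The name `write_multiline_output` and the delimiter format are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch: write a multi-line value to the GitHub Actions output file using a
# delimiter that is unlikely to appear in the value itself.
set -euo pipefail

# write_multiline_output NAME VALUE  (hypothetical helper, not part of the PR)
write_multiline_output() {
  local name="$1" value="$2"
  # A random delimiter avoids clashing with a literal "EOF" line in the value.
  local delimiter="EOF_${RANDOM}_${RANDOM}"
  {
    printf '%s<<%s\n' "$name" "$delimiter"
    printf '%s\n' "$value"   # '%s' prints the value without interpreting backslashes
    printf '%s\n' "$delimiter"
  } >> "$GITHUB_OUTPUT"
}

# Demo outside of Actions: fall back to a temporary file.
GITHUB_OUTPUT="${GITHUB_OUTPUT:-$(mktemp)}"
write_multiline_output result $'First line\nA path with backslashes: C:\\new\\table'
cat "$GITHUB_OUTPUT"
```

Inside the workflow, the call would replace the three `>> $GITHUB_OUTPUT` lines, and the later steps would read `steps.review.outputs.result` as before.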
50 changes: 50 additions & 0 deletions scripts/style-check/get-file-diferences.sh
@@ -0,0 +1,50 @@
#!/bin/bash
# --------------------------------------
# Usage: $0 <PR_NUMBER> <ORG/REPO_NAME>
# --------------------------------------

if [ -z "$1" ]; then
  echo "Error: PR number must be provided as the first argument" >&2
  echo "Usage: $0 <PR_NUMBER> <ORG/REPO_NAME>" >&2
  exit 1
fi

if [ -z "$2" ]; then
  echo "Error: Repository name (org/repo) must be provided as the second argument" >&2
  echo "Usage: $0 <PR_NUMBER> <ORG/REPO_NAME>" >&2
  exit 1
fi

PR_NUMBER="$1"
REPO="$2"

# Get the base and head branch names (once, before the loop)
base_ref=$(gh pr view "$PR_NUMBER" -R "$REPO" --json baseRefName -q '.baseRefName')
head_ref=$(gh pr view "$PR_NUMBER" -R "$REPO" --json headRefName -q '.headRefName')

# Get changed files in the PR
gh api -H "Authorization: token $GITHUB_TOKEN" \
  "repos/$REPO/pulls/$PR_NUMBER/files" \
  --paginate \
  -q '.[].filename' |
while IFS= read -r file; do
  if [[ "$file" =~ \.(tsx|adoc|java)$ ]]; then
    echo "Processing: $file"

    # URL-encode the file path
    encoded_file=$(jq -rn --arg x "$file" '$x|@uri')

    # Get the original content (from the base branch)
    gh api -H "Authorization: token $GITHUB_TOKEN" \
      "/repos/$REPO/contents/$encoded_file?ref=$base_ref" \
      -q '.content' | tr -d '\n' | openssl base64 -d -A > "$(basename "$file").old"

    # Get the new content (from the PR head)
    gh api -H "Authorization: token $GITHUB_TOKEN" \
      "/repos/$REPO/contents/$encoded_file?ref=$head_ref" \
      -q '.content' | tr -d '\n' | openssl base64 -d -A > "$(basename "$file").new"
  else
    echo "Ignoring: $file"
  fi
done
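
Review note: the script stores each fetched file under `basename "$file"`, so two changed files with the same name in different directories (for example, two `index.adoc` files) would overwrite each other. One possible way around this is to flatten the whole path into the local file name. The sketch below is not part of this PR; `flat_name` is a hypothetical helper, and the `__` separator is an assumption that only stays unambiguous when no path already contains `__`.

```shell
#!/usr/bin/env bash
# Sketch: derive a local file name that keeps the directory part of a
# repository path, so same-named files in different folders do not collide.

# flat_name PATH  (hypothetical helper, not part of the PR)
flat_name() {
  local path="$1"
  # Replace every "/" with "__"
  printf '%s\n' "${path//\//__}"
}

# Demo: two files that share a basename get distinct local names.
for f in articles/flow/index.adoc articles/hilla/index.adoc; do
  echo "$f -> $(flat_name "$f").new"
done
```

If such a scheme were adopted, the prompts would also need to explain how to map `__` back to `/` when the model refers to a file.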

76 changes: 76 additions & 0 deletions scripts/style-check/send-openai-request.sh
@@ -0,0 +1,76 @@
#!/bin/bash
# --------------------------------------
# Usage: $0 \
# "System prompt here" \
# "User prompt here" \
# file1.txt file2.adoc ...
# --------------------------------------

set -euo pipefail

if [ "$#" -lt 3 ]; then
  echo "Usage: $0 \"System prompt\" \"User prompt\" file1 [file2 ...]" >&2
  exit 1
fi

# Extract prompts, and collect the rest as files
system_prompt="$1"
user_prompt="$2"
shift 2
files=( "$@" )

# Validate files exist
for f in "${files[@]}"; do
  if [ ! -f "$f" ]; then
    echo "Error: file not found: $f" >&2
    exit 1
  fi
done

# Build the array of {type:"text",text:...} for each file, then the user prompt
# Use jq -Rs to JSON-escape each file’s contents
# start with an empty JSON array
file_blocks="[]"

for f in "${files[@]}"; do
  # create one block for this file
  json_block=$(jq -Rs --arg fn "$f" '{type:"text",filename:$fn,text:.}' "$f")

  # append it to the existing array
  file_blocks=$(jq -n \
    --argjson blocks "$file_blocks" \
    --argjson block "$json_block" \
    '$blocks + [$block]'
  )
done

# finally, add the user prompt as one more block
user_block=$(jq -n --arg txt "$user_prompt" '{type:"text",text:$txt}')
file_blocks=$(jq -n \
  --argjson blocks "$file_blocks" \
  --argjson block "$user_block" \
  '$blocks + [$block]'
)


# Assemble the full payload
payload=$(jq -n \
  --arg model "gpt-4o" \
  --arg sp "$system_prompt" \
  --argjson content "$file_blocks" \
  '{model:$model,
    messages:[
      {role:"system", content: $sp},
      {role:"user", content: $content}
    ]
  }'
)

echo "Payload:" >&2
echo "$payload" >&2

# Fire the request
curl -s https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  --data "$payload"
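
Review note: the script grows the content array by calling jq once per block and once more per append. The same array can be assembled by emitting one JSON object per block and collecting the stream with `jq -s`. The sketch below is an alternative under stated assumptions, not the PR's implementation: `build_content` is a hypothetical name, and putting the file name inside the `text` field (rather than in a separate `filename` key) is an assumption made so that the model sees the name as part of the text. It requires `jq`, like the scripts in this PR.

```shell
#!/usr/bin/env bash
# Sketch: build the chat "content" array with a stream of objects and jq -s.
set -euo pipefail

# build_content USER_PROMPT FILE...  (hypothetical helper, not part of the PR)
# Prints a JSON array with one {type:"text"} block per file, followed by one
# block that holds the user prompt.
build_content() {
  local user_prompt="$1"
  shift
  local f
  {
    for f in "$@"; do
      # -R reads raw text, -s slurps the whole file into a single string
      jq -Rs --arg fn "$f" '{type: "text", text: ("File: " + $fn + "\n" + .)}' "$f"
    done
    jq -n --arg txt "$user_prompt" '{type: "text", text: $txt}'
  } | jq -s '.'
}

# Demo with a temporary file
demo_dir=$(mktemp -d)
printf 'Hello, "world"\n' > "$demo_dir/intro.adoc.new"
build_content "Review the files above." "$demo_dir/intro.adoc.new"
```

The resulting array could be passed to the payload step through `--argjson content`, in the same way `file_blocks` is passed.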
42 changes: 42 additions & 0 deletions scripts/style-check/style-check.sh
@@ -0,0 +1,42 @@
#!/bin/bash
# --------------------------------------
# Usage: $0 <PR_NUMBER> <ORG/REPO_NAME>
#
# OPENAI_API_KEY and GITHUB_TOKEN must be set in the environment.
# --------------------------------------

DIR=$(dirname "$0")

if [[ -z $OPENAI_API_KEY ]]; then
  echo "Please set the OPENAI_API_KEY environment variable." >&2
  exit 1
fi

if [[ -z $GITHUB_TOKEN ]]; then
  echo "Please set the GITHUB_TOKEN environment variable." >&2
  exit 1
fi

if [ -z "$1" ]; then
  echo "Error: PR number must be provided as the first argument" >&2
  echo "Usage: $0 <PR_NUMBER> <ORG/REPO_NAME>" >&2
  exit 1
fi

if [ -z "$2" ]; then
  echo "Error: Repository must be provided as the second argument" >&2
  echo "Usage: $0 <PR_NUMBER> <ORG/REPO_NAME>" >&2
  exit 1
fi

bash "$DIR/get-file-diferences.sh" "$1" "$2" >&2

if compgen -G "*.new" > /dev/null; then  # quoted so that compgen, not the shell, expands the pattern
  SYSTEM_PROMPT=$(cat "$DIR/system-prompt.txt")
  USER_PROMPT=$(cat "$DIR/user-prompt.txt")
  bash "$DIR/send-openai-request.sh" "$SYSTEM_PROMPT" "$USER_PROMPT" *.old *.new
else
  echo '{"choices":[{"message":{"content":""}}]}'
fi
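
Review note: `compgen -G` takes a single glob pattern as one argument, so the pattern has to be quoted to keep the shell from expanding it before `compgen` runs. A minimal sketch of such a check follows; `has_match` is a hypothetical helper name, and the temporary directory exists only for the demo.

```shell
#!/usr/bin/env bash
# Sketch: test whether any file matches a glob pattern in the current directory.

# has_match PATTERN  (hypothetical helper, not part of the PR)
# Succeeds when at least one file matches PATTERN.
has_match() {
  compgen -G "$1" > /dev/null
}

# Demo in a throwaway directory that holds a single .new file
demo_dir=$(mktemp -d)
touch "$demo_dir/index.adoc.new"
( cd "$demo_dir" && has_match "*.new" && echo "found .new files" )
( cd "$demo_dir" && has_match "*.old" || echo "no .old files" )
```

Checking one pattern is enough for this script, because the fetch step always writes the `.old` and `.new` files as a pair.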


23 changes: 23 additions & 0 deletions scripts/style-check/system-prompt.txt
@@ -0,0 +1,23 @@
You are a senior technical writer reviewing changes to documentation and source code files. Your goal is to ensure that changes improve clarity, accuracy, and completeness, while maintaining a professional and friendly tone.

Guidelines:
- Use American English.
- Prefer present tense over future tense.
- Review AsciiDoc, Java, JavaScript, and TypeScript files.
- A file ending in ".old" is the previous version. A file ending in ".new" is the updated version.
- If the ".old" file is empty, assume the file is new and review the ".new" file on its own.
- In source code files, check for obvious mistakes, including spelling errors and internal contradictions.

When reviewing AsciiDoc files:
- Write for a broad range of technical abilities without being condescending.
- Do not congratulate the reader.
- Spell out "and", "plus", and "or" in text (use "&" only in headings or titles for brevity).
- Use the Oxford comma (i.e., a comma before the final "and" or "or" in a list).
- Use title case for all titles and headings.
- Assume all features described are generally available (avoid wording that suggests features are under development unless in release notes or upgrade guides).
- Avoid comparisons to previous versions outside of release notes or upgrade guides.

Additional instructions:
- When commenting on specific parts, reference the real name of the modified file, which is the filename without the `.old` or `.new` extension.

Always focus on making documentation understandable, detailed, and inclusive.
13 changes: 13 additions & 0 deletions scripts/style-check/user-prompt.txt
@@ -0,0 +1,13 @@
Context:
- You are provided with pairs of files: ".old" for the original version, ".new" for the modified version.
- These files are changes proposed by a GitHub pull request.

Task:
- Provide a brief review of the differences between each pair of files.
- Only list issues that require improvement.
- It's very important that you not produce any comment for lines that have not been modified in the PR.
- If a file is new (".old" is empty), only review the ".new" file without referencing a previous version.
- When reviewing parts, refer to the real name of the modified file (i.e., the filename without the `.new` or `.old` suffix) if possible.
- If you don't have any suggestions, your response should be an empty string.

Focus your feedback strictly on necessary corrections or improvements.