@sourcery-ai sourcery-ai bot commented Nov 2, 2023

Branch main refactored by Sourcery.

If you're happy with these changes, merge this Pull Request using the Squash and merge strategy.

See our documentation here.

Run Sourcery locally

Reduce the feedback loop during development by using the Sourcery editor plugin:

Review changes via command line

To manually merge these changes, make sure you're on the main branch, then run:

git fetch origin sourcery/main
git merge --ff-only FETCH_HEAD
git reset HEAD^

Help us improve this pull request!

@sourcery-ai sourcery-ai bot requested a review from hkhdair November 2, 2023 03:28
Comment on lines -73 to +80
- filename = os.path.join(TRANSCRIPT_FOLDER, video_id + ".json")
-
- metadata = {}
- metadata["speaker"] = ""
- metadata["title"] = playlist_item["snippet"]["title"]
- metadata["videoId"] = playlist_item["snippet"]["resourceId"]["videoId"]
- metadata["description"] = playlist_item["snippet"]["description"]
-
+ filename = os.path.join(TRANSCRIPT_FOLDER, f"{video_id}.json")
+
+ metadata = {
+     "speaker": "",
+     "title": playlist_item["snippet"]["title"],
+     "videoId": playlist_item["snippet"]["resourceId"]["videoId"],
+     "description": playlist_item["snippet"]["description"],
+ }

Function gen_metadata refactored with the following changes:
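
As a minimal sketch of the dictionary-literal form in the hunk above: the `playlist_item` keys mirror the ones the diff reads, while the value of `TRANSCRIPT_FOLDER`, the sample item, and the function's return value are assumptions made for illustration, not the project's actual code.

```python
import os

# Assumed value; the real constant is defined elsewhere in the project.
TRANSCRIPT_FOLDER = "transcripts"


def gen_metadata(playlist_item):
    """Build the target filename and metadata dict for one playlist item."""
    snippet = playlist_item["snippet"]
    video_id = snippet["resourceId"]["videoId"]

    # f-string instead of string concatenation, as in the refactored hunk.
    filename = os.path.join(TRANSCRIPT_FOLDER, f"{video_id}.json")

    # One dict literal instead of an empty dict plus four assignments.
    metadata = {
        "speaker": "",
        "title": snippet["title"],
        "videoId": video_id,
        "description": snippet["description"],
    }
    # Returning both values is an assumption of this sketch.
    return filename, metadata


# Hypothetical playlist item with only the fields the hunk touches.
sample_item = {
    "snippet": {
        "title": "Intro talk",
        "description": "Opening session",
        "resourceId": {"videoId": "abc123"},
    }
}
filename, metadata = gen_metadata(sample_item)
```

Both forms produce the same dictionary; the literal simply makes the full set of keys visible in one place.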


  video_id = playlist_item["snippet"]["resourceId"]["videoId"]
- filename = os.path.join(TRANSCRIPT_FOLDER, video_id + ".json.vtt")
+ filename = os.path.join(TRANSCRIPT_FOLDER, f"{video_id}.json.vtt")

Function get_transcript refactored with the following changes:

- # Get the next page token from the response and create a new request object
- next_page_token = response.get("nextPageToken")
- if next_page_token:
+ if next_page_token := response.get("nextPageToken"):

Lines 149-167 refactored with the following changes:

This removes the following comments (why?):

# Get the next page token from the response and create a new request object
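
A minimal sketch of a pagination loop using the assignment expression from this hunk, with the removed comment kept in place. Assignment expressions require Python 3.8 or later. `fetch_page`, `FAKE_PAGES`, and `collect_items` are hypothetical stand-ins for the real API request and loop, which this excerpt does not show.

```python
# Hypothetical stand-in for the real API: maps a page token to a response dict.
FAKE_PAGES = {
    None: {"items": ["a", "b"], "nextPageToken": "p2"},
    "p2": {"items": ["c"], "nextPageToken": "p3"},
    "p3": {"items": ["d"]},  # last page: no "nextPageToken" key
}


def fetch_page(page_token=None):
    """Return the fake response for the given page token."""
    return FAKE_PAGES[page_token]


def collect_items():
    """Walk every page and gather the items in order."""
    items = []
    response = fetch_page()
    while True:
        items.extend(response["items"])
        # Get the next page token from the response and request that page;
        # a missing (or empty) token means there are no more pages.
        if next_page_token := response.get("nextPageToken"):
            response = fetch_page(next_page_token)
        else:
            break
    return items


all_items = collect_items()
```

The walrus form binds and tests the token in one step, so the explanatory comment can stay directly above the `if` if the team wants to keep it.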

  word_count = len(words)
  if word_count > 0:
-     append_text = " ".join(words[0 : int(word_count * PERCENTAGE_OVERLAP)])
+     append_text = " ".join(words[:int(word_count * PERCENTAGE_OVERLAP)])

Function append_text_to_previous_segment refactored with the following changes:
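
To illustrate the slice in this hunk, here is a small sketch that takes the leading fraction of a segment's words. The value of `PERCENTAGE_OVERLAP`, the helper name `leading_overlap`, and the sample text are assumptions; the real function also does work that this excerpt does not show.

```python
# Assumed value for illustration; the project defines its own constant.
PERCENTAGE_OVERLAP = 0.25


def leading_overlap(text):
    """Return the first PERCENTAGE_OVERLAP share of the words in text."""
    words = text.split()
    word_count = len(words)
    if word_count > 0:
        # words[:n] and words[0:n] select the same elements.
        return " ".join(words[:int(word_count * PERCENTAGE_OVERLAP)])
    return ""


sample = "w0 w1 w2 w3 w4 w5 w6 w7"
overlap = leading_overlap(sample)  # 8 words * 0.25 -> first 2 words
```

Because `int()` truncates, a short segment can yield an empty overlap: with the assumed 0.25, any text of three words or fewer returns an empty string.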

  if current_seconds < seg_finish_seconds and total_tokens < MAX_TOKENS:
      # add the text to the transcript
-     text += current_text + " "
+     text += f"{current_text} "

Function parse_json_vtt_transcript refactored with the following changes:

Comment on lines -146 to +148
- if len(time_value) == 3:
-     h, m, s = time_value
-     return int(h) * 3600 + int(m) * 60 + int(s)
- else:
+ if len(time_value) != 3:
      return 0
+ h, m, s = time_value
+ return int(h) * 3600 + int(m) * 60 + int(s)

Function convert_time_to_seconds refactored with the following changes:
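
A runnable sketch of the guard-clause form from the hunk above. The function body follows the diff; the assumption that callers pass the parts of an `HH:MM:SS` string split on `:` is mine, since the call site is not shown here.

```python
def convert_time_to_seconds(time_value):
    """Convert [hours, minutes, seconds] parts to a total number of seconds.

    Returns 0 for any input that does not have exactly three parts.
    """
    # Guard clause: reject unexpected shapes first, then handle the main case
    # without an else branch.
    if len(time_value) != 3:
        return 0
    h, m, s = time_value
    return int(h) * 3600 + int(m) * 60 + int(s)


# Assumed call pattern: split an "HH:MM:SS" string before converting.
full = convert_time_to_seconds("01:02:03".split(":"))  # 3600 + 120 + 3
short = convert_time_to_seconds("02:03".split(":"))    # only two parts
```

Behavior is unchanged by the refactor: a two-part `MM:SS` value still returns 0 silently, and a seconds field with a fractional part (for example `"03.500"`) would still raise `ValueError` from `int()`.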

  """This function removes the text from each dictionary in the list."""
  return [
-     {k: v for k, v in seg.items() if k != "text" and k != "description"}
+     {k: v for k, v in seg.items() if k not in ["text", "description"]}

Function remove_text refactored with the following changes:
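
A minimal sketch of the refactored comprehension in context. The docstring and the filter come from the hunk; the parameter name `segments`, the closing `for seg in segments` clause, and the sample data are assumptions, since the diff shows only part of the function.

```python
def remove_text(segments):
    """This function removes the text from each dictionary in the list."""
    return [
        # Keep every key except the two bulky text fields.
        {k: v for k, v in seg.items() if k not in ["text", "description"]}
        for seg in segments  # assumed: the hunk does not show this clause
    ]


# Hypothetical segment data for illustration.
sample_segments = [
    {"videoId": "abc123", "start": "00:00:00", "text": "hello", "description": "d"},
    {"videoId": "abc123", "start": "00:01:00", "text": "world"},
]
stripped = remove_text(sample_segments)
```

The function builds new dictionaries, so the input list and its segments are left unmodified.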

  # create multiple threads to process the queue
  threads = []
- for i in range(PROCESSING_THREADS):
+ for _ in range(PROCESSING_THREADS):

Lines 228-228 refactored with the following changes:

  # create multiple threads to process the queue
  threads = []
- for i in range(PROCESSOR_THREADS):
+ for _ in range(PROCESSOR_THREADS):

Lines 179-179 refactored with the following changes:

Comment on lines -193 to +196
- if len(time_value) == 3:
-     h, m, s = time_value
-     return int(h) * 3600 + int(m) * 60 + int(s)
- else:
+ if len(time_value) != 3:
      return 0
+ h, m, s = time_value
+ return int(h) * 3600 + int(m) * 60 + int(s)

Function convert_time_to_seconds refactored with the following changes:
