Sourcery refactored main branch #1
base: main
Conversation
from pyspark.sql import functions as F
from faker import Faker
from collections import OrderedDict
Lines 22-65 refactored with the following changes:
- Use f-string instead of string concatenation [×4] (use-fstring-for-concatenation)
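The refactored version of this hunk is not reproduced above, so as a rough sketch, this is the kind of rewrite use-fstring-for-concatenation performs; the variable names below are hypothetical and not taken from this repository.

# Hypothetical illustration of use-fstring-for-concatenation.
first_name, record_id = "Ada", 42

# Before: values glued together with str() and +
label = "record_" + str(record_id) + "_" + first_name + ".json"

# After: one f-string, no explicit str() calls needed
label = f"record_{record_id}_{first_name}.json"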
except:
    print('File already exists')
Found the following improvement in Lines 47-59:
except:
    print('File already exists')
Found the following improvement in Lines 50-62:
print('---------------------------------------------------')
print('Processing Record Number: ', rec_cnt)

# Define the full API call for current record in the DataFrame
full_url = url_part1 + str(row['lat']) + "&lon=" + str(row['lon']) + url_part2 + api_key
Lines 67-89 refactored with the following changes:
- Use f-string instead of string concatenation [×3] (use-fstring-for-concatenation)
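The refactored line itself is not shown in this hunk; as a sketch, the f-string form of the full_url assembly above would look roughly like the following (same variables as in the original code, exact formatting may differ from Sourcery's output).

# Approximate f-string form of the URL assembly shown above; the f-string
# formats row['lat'] and row['lon'] automatically, so str() is unnecessary.
full_url = f"{url_part1}{row['lat']}&lon={row['lon']}{url_part2}{api_key}"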
Before:
    else:
        if profile == 'default':
            config = ProfileConfigProvider().get_config()
        else:
            config = ProfileConfigProvider(profile).get_config()

After:
    elif profile == 'default':
        config = ProfileConfigProvider().get_config()
    else:
        config = ProfileConfigProvider(profile).get_config()
Function create_api_client refactored with the following changes:
- Merge else clause's nested if statement into elif (merge-else-if-into-elif)
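As a standalone illustration of that rule, here is a hypothetical function (not code from this repository) shown before and after the rewrite.

# Hypothetical example of merge-else-if-into-elif, shown in isolation.

def describe_before(n):
    if n < 0:
        return "negative"
    else:
        if n == 0:              # nested if inside the else clause
            return "zero"
        else:
            return "positive"

def describe_after(n):
    if n < 0:
        return "negative"
    elif n == 0:                # nested if merged into elif
        return "zero"
    else:
        return "positive"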
Before:
    return (
        train_data
    )

After:
    return features.filter(features.issue_year <= 2015)
Function train_data refactored with the following changes:
- Inline variable that is immediately returned (inline-immediately-returned-variable)
Before:
    valid_data = features.filter(features.issue_year > 2015)

    return (
        valid_data
    )

After:
    return features.filter(features.issue_year > 2015)
Function valid_data refactored with the following changes:
- Inline variable that is immediately returned (inline-immediately-returned-variable)
Before:
    df = dlt.read_stream("sales_orders_cleaned").where("city == 'Los Angeles'")
    df = df.select(df.city, df.order_date, df.customer_id, df.customer_name, explode(df.ordered_products).alias("ordered_products_explode"))

    dfAgg = df.groupBy(df.order_date, df.city, df.customer_id, df.customer_name, df.ordered_products_explode.curr.alias("currency"))\
        .agg(sum(df.ordered_products_explode.price).alias("sales"), sum(df.ordered_products_explode.qty).alias("qantity"))

    return dfAgg

After:
    df = dlt.read_stream("sales_orders_cleaned").where("city == 'Los Angeles'")
    df = df.select(df.city, df.order_date, df.customer_id, df.customer_name, explode(df.ordered_products).alias("ordered_products_explode"))

    return df.groupBy(
        df.order_date,
        df.city,
        df.customer_id,
        df.customer_name,
        df.ordered_products_explode.curr.alias("currency"),
    ).agg(
        sum(df.ordered_products_explode.price).alias("sales"),
        sum(df.ordered_products_explode.qty).alias("qantity"),
    )
Function sales_order_in_la refactored with the following changes:
- Inline variable that is immediately returned (inline-immediately-returned-variable)
Before:
    df = dlt.read_stream("sales_orders_cleaned").where("city == 'Chicago'")
    df = df.select(df.city, df.order_date, df.customer_id, df.customer_name, explode(df.ordered_products).alias("ordered_products_explode"))

    dfAgg = df.groupBy(df.order_date, df.city, df.customer_id, df.customer_name, df.ordered_products_explode.curr.alias("currency"))\
        .agg(sum(df.ordered_products_explode.price).alias("sales"), sum(df.ordered_products_explode.qty).alias("qantity"))

    return dfAgg

After:
    df = dlt.read_stream("sales_orders_cleaned").where("city == 'Chicago'")
    df = df.select(df.city, df.order_date, df.customer_id, df.customer_name, explode(df.ordered_products).alias("ordered_products_explode"))

    return df.groupBy(
        df.order_date,
        df.city,
        df.customer_id,
        df.customer_name,
        df.ordered_products_explode.curr.alias("currency"),
    ).agg(
        sum(df.ordered_products_explode.price).alias("sales"),
        sum(df.ordered_products_explode.qty).alias("qantity"),
    )
Function sales_order_in_chicago refactored with the following changes:
- Inline variable that is immediately returned (inline-immediately-returned-variable)
Before:
    fname = self.filename + '/tweets_' + str(file_timestamp) + '.json'

    f = open(fname, 'w')
    for tweet in self.tweet_stack:
        f.write(jsonpickle.encode(tweet._json, unpicklable=False) + '\n')
    f.close()

After:
    fname = f'{self.filename}/tweets_{str(file_timestamp)}.json'

    with open(fname, 'w') as f:
        for tweet in self.tweet_stack:
            f.write(jsonpickle.encode(tweet._json, unpicklable=False) + '\n')
Function TweetStream.write_file refactored with the following changes:
- Use f-string instead of string concatenation [×3] (use-fstring-for-concatenation)
- Use `with` when opening file to ensure closure (ensure-file-closed)
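For context on why ensure-file-closed matters, here is a minimal sketch (hypothetical records and output path, not the repository's TweetStream class) of what the with statement buys you: it behaves roughly like a try/finally that always closes the handle, so an exception raised mid-write no longer skips the close or leaks the file descriptor.

import json

# Hypothetical data and file name, made up for this sketch.
records = [{"id": 1}, {"id": 2}]

# Roughly what "with open(fname, 'w') as f:" guarantees:
f = open('out.json', 'w')
try:
    for rec in records:
        f.write(json.dumps(rec) + '\n')
finally:
    f.close()   # runs even if f.write() raises; the original code skipped close on error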
Sourcery Code Quality Report

❌ Merging this PR will decrease code quality in the affected files by 0.01%.

Here are some functions in these files that still need a tune-up:

Legend and Explanation

The emojis denote the absolute quality of the code:

The 👍 and 👎 indicate whether the quality has improved or gotten worse with this pull request. Please see our documentation here for details on how these metrics are calculated. We are actively working on this report - lots more documentation and extra metrics to come! Help us improve this quality report!
Branch `main` refactored by Sourcery.

If you're happy with these changes, merge this Pull Request using the Squash and merge strategy.
See our documentation here.
Run Sourcery locally
Reduce the feedback loop during development by using the Sourcery editor plugin:
Review changes via command line
To manually merge these changes, make sure you're on the `main` branch, then run:

Help us improve this pull request!