Contributor

@SalemJorden SalemJorden commented Apr 19, 2024

Adds a BigQuery DataFrames sample for the "Single time-series forecasting from Google Analytics data" tutorial, step two (optional): visualize the time series you want to forecast.

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Fixes #<issue_number_goes_here> 🦕

@product-auto-label product-auto-label bot added the size: s, api: bigquery, and samples labels Apr 19, 2024
# Start by selecting the data you'll use for training. `read_gbq` accepts
# either a SQL query or a table ID. Since this example selects from multiple
# tables via a wildcard, use SQL to define this data. Watch issue
# https://github.com/googleapis/python-bigquery-dataframes/issues/169
Collaborator

Wildcard tables are now supported. You aren't using SQL here.

# [START bigquery_dataframes_single_timeseries_forecasting_model_tutorial]
import bigframes.pandas as bpd

# Start by selecting the data you'll use for training. `read_gbq` accepts
Collaborator

In "Step two" (https://cloud.google.com/bigquery/docs/arima-single-time-series-forecasting-tutorial#step_two_optional_visualize_the_time_series_you_want_to_forecast) we aren't doing any training yet.

Instead, this sentence from the SQL version seems more applicable:

The FROM bigquery-public-data.google_analytics_sample.ga_sessions_* clause indicates that you are querying the ga_sessions_* tables in the google_analytics_sample dataset.

Please rephrase that to apply to what you're doing here.
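
Taken together, the two review comments above suggest passing the wildcard table ID straight to `read_gbq` and describing the read rather than any training. A minimal sketch of what that could look like follows. The function name `load_ga_sessions` and the comment wording are illustrative assumptions, not the text that was merged, and calling the function requires the `bigframes` package and Google Cloud credentials.

```python
# Hypothetical sketch; the names and comment wording are assumptions.
TABLE_ID = "bigquery-public-data.google_analytics_sample.ga_sessions_*"


def load_ga_sessions():
    """Read the wildcard table into a BigQuery DataFrames DataFrame."""
    # Imported lazily so this module loads without bigframes installed.
    import bigframes.pandas as bpd

    # The table ID indicates that you are reading the ga_sessions_* tables
    # in the google_analytics_sample dataset. Per the review comment above,
    # wildcard tables are supported, so no SQL query is needed.
    return bpd.read_gbq(TABLE_ID)
```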

'bigquery-public-data.google_analytics_sample.ga_sessions_*'
)
parsed_date = bpd.to_datetime(df.date, format= "%Y%m%d", utc = True)
total_visits = df.groupby(["date"])["parsed_date"].sum()
Collaborator

In our 1:1 we did a series groupby to calculate the number of visits per day. It is possible to do the same with a DataFrame groupby, but if so, you'll need to select just the "visits" field here before calling sum(), which is slightly more convoluted since visits is a subfield of a struct.
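
The series groupby described above can be sketched locally with plain pandas, whose API BigQuery DataFrames mirrors. This is an illustration on made-up rows, not the sample's code: the three rows are invented, the struct column is modeled as one Python dict per row, and with `bigframes` the subfield would need a struct-aware selection instead of `map`.

```python
import pandas as pd

# Made-up rows standing in for ga_sessions_*; "totals" is a struct in
# BigQuery, modeled here as a dict per row.
df = pd.DataFrame(
    {
        "date": ["20160801", "20160801", "20160802"],
        "totals": [{"visits": 1}, {"visits": 1}, {"visits": 1}],
    }
)

# Parse the YYYYMMDD strings into UTC timestamps.
parsed_date = pd.to_datetime(df["date"], format="%Y%m%d", utc=True)

# Select only the "visits" subfield before aggregating.
visits = df["totals"].map(lambda totals: totals["visits"])

# Series groupby: group the visits by the parsed date and sum per day.
total_visits = visits.groupby(parsed_date).sum()
```

Grouping the visits Series by the parsed dates avoids both problems in the reviewed line: nothing is summed over a date column, and the result is indexed by day, which is the shape a time-series plot needs.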

@tswast tswast mentioned this pull request Apr 22, 2024
4 tasks
@tswast tswast marked this pull request as ready for review April 22, 2024 20:35
@tswast tswast requested review from a team as code owners April 22, 2024 20:35

snippet-bot bot commented Apr 22, 2024

Here is the summary of changes.

You are about to add 1 region tag.

This comment is generated by snippet-bot.
If you find problems with this result, please file an issue at:
https://github.com/googleapis/repo-automation-bots/issues.
To update this comment, add snippet-bot:force-run label or use the checkbox below:

  • Refresh this comment

@tswast tswast added the owlbot:run label Apr 22, 2024
@gcf-owl-bot gcf-owl-bot bot removed the owlbot:run label Apr 22, 2024
@tswast tswast added the owlbot:run label Apr 22, 2024
@gcf-owl-bot gcf-owl-bot bot removed the owlbot:run label Apr 22, 2024
Collaborator

tswast commented Apr 23, 2024

Looks like the e2e tests passed! 🎉 But there was a flake in the Kokoro presubmit tests that is unrelated to this change. Re-running the tests should hopefully let us merge.

@tswast tswast changed the title from "Docs: Single Time Series Forecasting Code Sample Step 2" to "docs: add the first sample for the Single time-series forecasting from Google Analytics data tutorial" Apr 23, 2024
@tswast tswast enabled auto-merge (squash) April 23, 2024 14:56
@tswast tswast added the owlbot:run label Apr 23, 2024
@gcf-owl-bot gcf-owl-bot bot removed the owlbot:run label Apr 23, 2024
@tswast tswast merged commit 2b84c4f into googleapis:main Apr 23, 2024