Timezones with date only indexes in Athena

My data is stored in Athena, and indexed by partitions denoting the date with daily granularity, i.e. `2025-01-01` is a single partition.

Up until recently I've been able to run queries against the `dt` column which hits the correct partition and scans the correct amount of data for the dates requested in the query.

```
dt: {
  sql: `from_iso8601_timestamp(dt)`, // or date_parse(dt, '%Y-%m-%d')
  type: `time`,
},
```

Now I need to query the data taking timezone into account. Obviously the above doesn't have time included so it'll select the wrong data.

```
dt: {
  sql: `from_iso8601_timestamp(dt || 'T' || time)`,
  type: `time`,
},
```

The problem here is that now the partition isn't hit because there's no where clause directly referencing `dt` such as `WHERE dt = '2025-01-01'`.

```sql
-- hits the partition
select * from e where from_iso8601_timestamp(dt) BETWEEN ...

-- does not hit the partition
select * from e where from_iso8601_timestamp(dt || 'T' || time) BETWEEN ...
```

The problem I'm facing is that using Cube's traditional means, I'm unable to select against the `dt` column.

The way that I've currently got this working is using `queryRewrite` to add filters on a string version of `dt` which overlaps the date range set by the timeDimesions query by adding or subtracting a day where necessary.

```javascript
queryRewrite: (query) => {
  const tdDt = query.timeDimensions.find(
    ({ dimension }) => dimension === 'EventsTesting.dt'
  );

  const offset = dayjs().tz(query.timezone).utcOffset();

  const from = dayjs(tdDt.dateRange[0]).subtract(offset > 0 ? 1 : 0, 'days');
  const to = dayjs(tdDt.dateRange[1]).add(offset < 0 ? 1 : 0, 'day');

  query.filters.push({
    member: 'EventsTesting.dtString',
    operator: 'gte',
    values: [from.format('YYYY-MM-DD')],
  });

  query.filters.push({
    member: 'EventsTesting.dtString',
    operator: 'lte',
    values: [to.format('YYYY-MM-DD')],
  });

  return query;
}
```

Resulting in

```sql
WHERE
  (
    from_iso8601_timestamp(dt || 'T' || time) >= from_iso8601_timestamp(?)
    AND from_iso8601_timestamp(dt || 'T' || time) <= from_iso8601_timestamp(?)
  )
  AND ("events_testing".dt >= ?)
  AND ("events_testing".dt <= ?)
```



Update: This does not work with preagg building as it doesn't use queryRewrite so building preaggs still scans all my data because cube is not letting me just add a where clause to `dt` in Athena's expected format to match a partition


**To Reproduce**
Steps to reproduce the behavior:

Using Athena as the driver, run the test below in vitest and observe results.


**Expected behavior**
I'm not entirely sure what the best course of action here is as obviously this is partly to do with Athena.

I tried to use various time dimensions rather than filters to manipulate the query prior to reaching the above solution. I could only manipulate the left side of the from 

For example take the below filter function - I can't slice the time values off because cube modifies them 

```javascript
      FILTER_PARAMS.EventsTesting.dt.filter((from, to) => {
        const fromDate = from.substr(0, 10);
        const toDate = to.substr(0, 10);
        return `from_iso8601_timestamp(dt || 'T' || time)
            BETWEEN from_iso8601_timestamp(${from})
            AND from_iso8601_timestamp(${to})
            AND dt BETWEEN ${fromDate} AND ${toDate}
          `;
      })
```

and it ends up like so 

```sql
(from_iso8601_timestamp(dt || 'T' || time)
            BETWEEN from_iso8601_timestamp('2025-01-31T13:00:00.000Z')
            AND from_iso8601_timestamp('2025-02-01T12:59:59.999Z')
            AND dt BETWEEN '2025-01-31T13:00:00.000Z' AND '2025-02-01T12:59:59.999Z'
          )
```

I suppose some better ability to control how dimensions (particularly time dimensions) are managed by cube maybe is the way here?

Perhaps I'm going about the whole thing wrong or it's just a missing feature relating to the Athena Driver which has been overlooked or something. It's felt like a huge struggle to get to the above so maybe it's the API, the DX, the Documentation or just me but I'm hoping to open a discussion about it in some way.

**Minimally reproducible Cube Schema**
In case your bug report is data modelling related please put your minimally reproducible Cube Schema here.
You can use selects without tables in order to achieve that as follows.

Use Athena as the driver.

```javascript
cube(`EventsTesting`, {
  // LA 7,8,9,10
  // NY 6,7,8,9
  // SYD 3,4,5,6
  // UTC 5,6,7,8
  // ATH 4,5,6,7

  data_source: `events`,

  sql: `
    WITH data AS (
        SELECT * FROM (
            VALUES
                ('2025-01-31', '01:00:00', 1),
                ('2025-01-31', '07:00:00', 2),
                ('2025-01-31', '15:00:00', 3),
                ('2025-01-31', '23:00:00', 4),
                ('2025-02-01', '01:00:00', 5),
                ('2025-02-01', '07:00:00', 6),
                ('2025-02-01', '15:00:00', 7),
                ('2025-02-01', '23:00:00', 8),
                ('2025-02-02', '01:00:00', 9),
                ('2025-02-02', '07:00:00', 10),
                ('2025-02-02', '15:00:00', 11),
                ('2025-02-02', '23:00:00', 12),
                ('2025-02-03', '01:00:00', 13),
                ('2025-02-03', '07:00:00', 14),
                ('2025-02-03', '15:00:00', 15),
                ('2025-02-03', '23:00:00', 16)
        ) AS t(dt, time, event_id)
    )
    SELECT e.*
    FROM data e
    WHERE ${FILTER_GROUP(
      FILTER_PARAMS.EventsTesting.dt.filter(
        (from, to) =>
          `from_iso8601_timestamp(dt || 'T' || time)
            BETWEEN from_iso8601_timestamp(${from})
            AND from_iso8601_timestamp(${to})`
      )
    )}
  `,

  dimensions: {
    eventId: {
      sql: `event_id`,
      type: `number`,
      primaryKey: true,
    },

    dt: {
      sql: `from_iso8601_timestamp(dt || 'T' || time || 'Z')`,
      type: `time`,
    },

    dtString: {
      public: false,
      sql: `dt`,
      type: `string`,
    },
  },
});
```

And a vitest file

```javascript
import { test, expect } from 'vitest';
import fetch from 'node-fetch';

test.concurrent.each([
  { timezone: 'America/Los_Angeles', ids: [7, 8, 9, 10] },
  { timezone: 'America/New_York', ids: [6, 7, 8, 9] },
  { timezone: 'Australia/Sydney', ids: [3, 4, 5, 6] },
  { timezone: 'Europe/London', ids: [5, 6, 7, 8] },
  { timezone: 'Europe/Athens', ids: [4, 5, 6, 7] },
])(
  'can return results in $timezone',
  { timeout: 10000 },
  async ({ timezone, ids }) => {
    const query = {
      timezone: timezone,
      measures: [],
      timeDimensions: [
        {
          dimension: 'EventsTesting.dt',
          dateRange: ['2025-02-01', '2025-02-01'],
        },
      ],
      dimensions: ['EventsTesting.eventId'],
    };

    // Create URL params
    const params = new URLSearchParams();
    params.append('query', JSON.stringify(query));

    // Make the request
    const response = await fetch(
      `http://localhost:4000/cubejs-api/v1/load?${params}`,
      {
        method: 'GET',
        headers: {
          Authorization: 'token',
        },
      }
    );

    const { data } = await response.json();

    expect(
      data.map((d) => Number(d['EventsTesting.eventId'])).sort((a, b) => a - b)
    ).toEqual(ids);
  }
);
```

**Version:**
[e.g. 1.2.0]

**Additional context**
Add any other context about the problem here.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Timezones with date only indexes in Athena #9368

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Timezones with date only indexes in Athena #9368

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions