Commit c88cc8f

Author: Shinichi Takii
Merge pull request shinichi-takii#23 from shinichi-takii/feature/add-generate-bq-ddl
add bq-ddl generate func
2 parents c98a271 + e58acae commit c88cc8f

File tree

7 files changed (+422, -80 lines changed)


CHANGELOG.md

Lines changed: 9 additions & 0 deletions
@@ -1,5 +1,14 @@
 # Changelog
 
+## 1.2.0
+- Add `DdlParseTable.to_bigquery_ddl` function.
+    - BigQuery DDL (CREATE TABLE) statement generate function.
+- Add `DdlParseColumn.bigquery_legacy_data_type` property.
+    - Get BigQuery Legacy SQL data property.
+    - Alias of `DdlParseColumn.bigquery_data_type` property.
+- Add `DdlParseColumn.bigquery_standard_data_type` property.
+    - Get BigQuery Standard SQL data property.
+
 ## 1.1.3
 - Add support inline comment.
 - Add support constraint name with quotes.

README.md

Lines changed: 30 additions & 20 deletions
@@ -8,16 +8,16 @@
 [![Requirements Status](https://requires.io/github/shinichi-takii/ddlparse/requirements.svg?branch=master)](https://requires.io/github/shinichi-takii/ddlparse/requirements/?branch=master)
 [![License](https://img.shields.io/badge/License-BSD%203--Clause-blue.svg)](https://github.com/shinichi-takii/ddlparse/blob/master/LICENSE.md)
 
-*DDL parase and Convert to BigQuery JSON schema module, available in Python.*
+*DDL parse and Convert to BigQuery JSON schema and DDL statements module, available in Python.*
 
 ----
 
 ## Features
 
 - DDL parse and get table schema information.
   - Currently, only the `CREATE TABLE` statement is supported.
+- Convert to [BigQuery JSON schema](https://cloud.google.com/bigquery/docs/schemas#creating_a_json_schema_file) and [BigQuery DDL statements](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language).
 - Supported databases are MySQL, PostgreSQL, Oracle, Redshift.
-- Convert to [BigQuery JSON schema](https://cloud.google.com/bigquery/docs/schemas#creating_a_json_schema_file).
 
 ## Requirement

@@ -50,15 +50,15 @@ $ pip install ddlparse --upgrade
 ### Example
 
 ```python
-from ddlparse import DdlParse
+from ddlparse.ddlparse import DdlParse
 
 sample_ddl = """
 CREATE TABLE My_Schema.Sample_Table (
-  ID integer PRIMARY KEY,
-  NAME varchar(100) NOT NULL,
-  TOTAL bigint NOT NULL,
-  AVG decimal(5,1) NOT NULL,
-  CREATED_AT date, -- Oracle 'DATE' -> BigQuery 'DATETIME'
+  Id integer PRIMARY KEY,
+  Name varchar(100) NOT NULL,
+  Total bigint NOT NULL,
+  Avg decimal(5,1) NOT NULL,
+  Created_At date, -- Oracle 'DATE' -> BigQuery 'DATETIME'
   UNIQUE (NAME)
 );
 """
@@ -111,18 +111,28 @@ print(table.to_bigquery_fields(DdlParse.NAME_CASE.upper))
 
 print("* COLUMN *")
 for col in table.columns.values():
-    print("name = {} : data_type = {} : length = {} : precision(=length) = {} : scale = {} : constraint = {} : not_null = {} : PK = {} : unique = {} : BQ {}".format(
-        col.name,
-        col.data_type,
-        col.length,
-        col.precision,
-        col.scale,
-        col.constraint,
-        col.not_null,
-        col.primary_key,
-        col.unique,
-        col.to_bigquery_field()
-        ))
+    col_info = []
+    col_info.append("name = {}".format(col.name))
+    col_info.append("data_type = {}".format(col.data_type))
+    col_info.append("length = {}".format(col.length))
+    col_info.append("precision(=length) = {}".format(col.precision))
+    col_info.append("scale = {}".format(col.scale))
+    col_info.append("constraint = {}".format(col.constraint))
+    col_info.append("not_null = {}".format(col.not_null))
+    col_info.append("PK = {}".format(col.primary_key))
+    col_info.append("unique = {}".format(col.unique))
+    col_info.append("bq_data_type = {}".format(col.bigquery_data_type))
+    col_info.append("bq_legacy_data_type = {}".format(col.bigquery_legacy_data_type))
+    col_info.append("bq_standard_data_type = {}".format(col.bigquery_standard_data_type))
+    col_info.append("BQ {}".format(col.to_bigquery_field()))
+    print(" : ".join(col_info))
+
+print("* DDL (CREATE TABLE) statements *")
+print(table.to_bigquery_ddl())
+
+print("* DDL (CREATE TABLE) statements - dataset name, table name and column name to lower case / upper case *")
+print(table.to_bigquery_ddl(DdlParse.NAME_CASE.lower))
+print(table.to_bigquery_ddl(DdlParse.NAME_CASE.upper))
 
 print("* Get Column object (case insensitive) *")
 print(table.columns["total"])
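The refactored example above swaps one long format call for a build-a-list-then-join pattern, which is easier to extend as new properties are added. A standalone sketch of that pattern, with a plain dict standing in for a real `DdlParseColumn` object (an assumption for illustration):

```python
# Standalone sketch of the list-plus-join reporting pattern from the
# README example. The dict stands in for a DdlParseColumn object.
col = {"name": "Total", "data_type": "bigint", "not_null": True}

col_info = []
col_info.append("name = {}".format(col["name"]))
col_info.append("data_type = {}".format(col["data_type"]))
col_info.append("not_null = {}".format(col["not_null"]))

report = " : ".join(col_info)
print(report)  # name = Total : data_type = bigint : not_null = True
```

Appending one labelled entry per property keeps each line short and lets the separator live in exactly one place.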

README.rst

Lines changed: 62 additions & 32 deletions
@@ -1,18 +1,36 @@
 DDL Parse
 =========
 
-`PyPI version <https://pypi.python.org/pypi/ddlparse>`__ `Python
-version <https://pypi.python.org/pypi/ddlparse>`__ `Travis CI Build
-Status <https://travis-ci.org/shinichi-takii/ddlparse>`__ `Coveralls
-Coverage
-Status <https://coveralls.io/github/shinichi-takii/ddlparse?branch=master>`__
-`codecov Coverage
-Status <https://codecov.io/gh/shinichi-takii/ddlparse>`__ `Requirements
-Status <https://requires.io/github/shinichi-takii/ddlparse/requirements/?branch=master>`__
-`License <https://github.com/shinichi-takii/ddlparse/blob/master/LICENSE.md>`__
-
-*DDL parase and Convert to BigQuery JSON schema module, available in
-Python.*
+.. image:: https://img.shields.io/pypi/v/ddlparse.svg
+   :target: https://pypi.python.org/pypi/ddlparse
+   :alt: PyPI version
+
+.. image:: https://img.shields.io/pypi/pyversions/ddlparse.svg
+   :target: https://pypi.python.org/pypi/ddlparse
+   :alt: Python version
+
+.. image:: https://travis-ci.org/shinichi-takii/ddlparse.svg?branch=master
+   :target: https://travis-ci.org/shinichi-takii/ddlparse
+   :alt: Travis CI Build Status
+
+.. image:: https://coveralls.io/repos/github/shinichi-takii/ddlparse/badge.svg?branch=master
+   :target: https://coveralls.io/github/shinichi-takii/ddlparse?branch=master
+   :alt: Coveralls Coverage Status
+
+.. image:: https://codecov.io/gh/shinichi-takii/ddlparse/branch/master/graph/badge.svg
+   :target: https://codecov.io/gh/shinichi-takii/ddlparse
+   :alt: codecov Coverage Status
+
+.. image:: https://requires.io/github/shinichi-takii/ddlparse/requirements.svg?branch=master
+   :target: https://requires.io/github/shinichi-takii/ddlparse/requirements/?branch=master
+   :alt: Requirements Status
+
+.. image:: https://img.shields.io/badge/License-BSD%203--Clause-blue.svg
+   :target: https://github.com/shinichi-takii/ddlparse/blob/master/LICENSE.md
+   :alt: License
+
+*DDL parse and Convert to BigQuery JSON schema and DDL statements
+module, available in Python.*
 
 --------------
 
@@ -21,9 +39,11 @@ Features
 
 - DDL parse and get table schema information.
   - Currently, only the ``CREATE TABLE`` statement is supported.
-- Supported databases are MySQL, PostgreSQL, Oracle, Redshift.
 - Convert to `BigQuery JSON
-  schema <https://cloud.google.com/bigquery/docs/schemas#creating_a_json_schema_file>`__.
+  schema <https://cloud.google.com/bigquery/docs/schemas#creating_a_json_schema_file>`__
+  and `BigQuery DDL
+  statements <https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language>`__.
+- Supported databases are MySQL, PostgreSQL, Oracle, Redshift.
 
 Requirement
 -----------
@@ -66,15 +86,15 @@ Example
 
 .. code:: python
 
-    from ddlparse import DdlParse
+    from ddlparse.ddlparse import DdlParse
 
     sample_ddl = """
     CREATE TABLE My_Schema.Sample_Table (
-      ID integer PRIMARY KEY,
-      NAME varchar(100) NOT NULL,
-      TOTAL bigint NOT NULL,
-      AVG decimal(5,1) NOT NULL,
-      CREATED_AT date, -- Oracle 'DATE' -> BigQuery 'DATETIME'
+      Id integer PRIMARY KEY,
+      Name varchar(100) NOT NULL,
+      Total bigint NOT NULL,
+      Avg decimal(5,1) NOT NULL,
+      Created_At date, -- Oracle 'DATE' -> BigQuery 'DATETIME'
       UNIQUE (NAME)
     );
     """
@@ -127,18 +147,28 @@ Example
 
     print("* COLUMN *")
     for col in table.columns.values():
-        print("name = {} : data_type = {} : length = {} : precision(=length) = {} : scale = {} : constraint = {} : not_null = {} : PK = {} : unique = {} : BQ {}".format(
-            col.name,
-            col.data_type,
-            col.length,
-            col.precision,
-            col.scale,
-            col.constraint,
-            col.not_null,
-            col.primary_key,
-            col.unique,
-            col.to_bigquery_field()
-            ))
+        col_info = []
+        col_info.append("name = {}".format(col.name))
+        col_info.append("data_type = {}".format(col.data_type))
+        col_info.append("length = {}".format(col.length))
+        col_info.append("precision(=length) = {}".format(col.precision))
+        col_info.append("scale = {}".format(col.scale))
+        col_info.append("constraint = {}".format(col.constraint))
+        col_info.append("not_null = {}".format(col.not_null))
+        col_info.append("PK = {}".format(col.primary_key))
+        col_info.append("unique = {}".format(col.unique))
+        col_info.append("bq_data_type = {}".format(col.bigquery_data_type))
+        col_info.append("bq_legacy_data_type = {}".format(col.bigquery_legacy_data_type))
+        col_info.append("bq_standard_data_type = {}".format(col.bigquery_standard_data_type))
+        col_info.append("BQ {}".format(col.to_bigquery_field()))
+        print(" : ".join(col_info))
+
+    print("* DDL (CREATE TABLE) statements *")
+    print(table.to_bigquery_ddl())
+
+    print("* DDL (CREATE TABLE) statements - dataset name, table name and column name to lower case / upper case *")
+    print(table.to_bigquery_ddl(DdlParse.NAME_CASE.lower))
+    print(table.to_bigquery_ddl(DdlParse.NAME_CASE.upper))
 
     print("* Get Column object (case insensitive) *")
    print(table.columns["total"])

ddlparse/__init__.py

Lines changed: 2 additions & 2 deletions
@@ -7,8 +7,8 @@
 
 from .ddlparse import *
 
-__copyright__ = 'Copyright (C) 2018 Shinichi Takii'
-__version__ = '1.1.3'
+__copyright__ = 'Copyright (C) 2018-2019 Shinichi Takii'
+__version__ = '1.2.0'
 __license__ = 'BSD-3-Clause'
 __author__ = 'Shinichi Takii'
 __author_email__ = '[email protected]'

ddlparse/ddlparse.py

Lines changed: 85 additions & 8 deletions
@@ -27,6 +27,7 @@ def __init__(self, source_database=None):
     def source_database(self):
         """
         Source database option
+
         :param source_database: enum DdlParse.DATABASE
         """
         return self._source_database
@@ -51,14 +52,14 @@ def name(self):
     def name(self, name):
         self._name = name
 
-    def _get_name(self, name_case=DdlParseBase.NAME_CASE.original):
+    def get_name(self, name_case=DdlParseBase.NAME_CASE.original):
         """
         Get Name converted case
 
         :param name_case: name case type
-            * NAME_CASE.original : Return to no convert
-            * NAME_CASE.lower : Return to lower
-            * NAME_CASE.upper : Return to upper
+            * DdlParse.NAME_CASE.original : Return to no convert
+            * DdlParse.NAME_CASE.lower : Return to lower
+            * DdlParse.NAME_CASE.upper : Return to upper
 
         :return: name
         """
@@ -161,7 +162,7 @@ def unique(self, flag):
 
     @property
     def bigquery_data_type(self):
-        """Get BigQuery data type"""
+        """Get BigQuery Legacy SQL data type"""
 
         # BigQuery data type = {source_database: [data type, ...], ...}
         BQ_DATA_TYPE_DIC = OrderedDict()
@@ -205,6 +206,27 @@ def bigquery_data_type(self):
 
         raise ValueError("Unknown data type : '{}'".format(self._data_type))
 
+    @property
+    def bigquery_legacy_data_type(self):
+        """Get BigQuery Legacy SQL data type"""
+
+        return self.bigquery_data_type
+
+    @property
+    def bigquery_standard_data_type(self):
+        """Get BigQuery Standard SQL data type"""
+
+        legacy_data_type = self.bigquery_data_type
+
+        if legacy_data_type == "INTEGER":
+            return "INT64"
+        elif legacy_data_type == "FLOAT":
+            return "FLOAT64"
+        elif legacy_data_type == "BOOLEAN":
+            return "BOOL"
+
+        return legacy_data_type
+
     @property
     def bigquery_mode(self):
         """Get BigQuery constraint"""
@@ -214,7 +236,7 @@ def bigquery_mode(self):
     def to_bigquery_field(self, name_case=DdlParseBase.NAME_CASE.original):
         """Generate BigQuery JSON field define"""
 
-        return '{{"name": "{}", "type": "{}", "mode": "{}"}}'.format(self._get_name(name_case), self.bigquery_data_type, self.bigquery_mode)
+        return '{{"name": "{}", "type": "{}", "mode": "{}"}}'.format(self.get_name(name_case), self.bigquery_data_type, self.bigquery_mode)
 
 
 class DdlParseColumnDict(OrderedDict, DdlParseBase):
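`to_bigquery_field` builds one JSON object per column; the doubled braces in the format string escape the literal `{` and `}`. A quick standalone check (with hypothetical sample values) that the produced string parses as valid JSON:

```python
import json

# Same format-string shape as to_bigquery_field; '{{' / '}}' emit
# literal braces. "Id"/"INTEGER"/"REQUIRED" are sample values.
field = '{{"name": "{}", "type": "{}", "mode": "{}"}}'.format(
    "Id", "INTEGER", "REQUIRED")

parsed = json.loads(field)
print(parsed)  # {'name': 'Id', 'type': 'INTEGER', 'mode': 'REQUIRED'}
```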
@@ -245,7 +267,16 @@ def append(self, column_name, data_type_array=None, constraint=None, source_data
         return column
 
     def to_bigquery_fields(self, name_case=DdlParseBase.NAME_CASE.original):
-        """Generate BigQuery JSON fields define"""
+        """
+        Generate BigQuery JSON fields define
+
+        :param name_case: name case type
+            * DdlParse.NAME_CASE.original : Return to no convert
+            * DdlParse.NAME_CASE.lower : Return to lower
+            * DdlParse.NAME_CASE.upper : Return to upper
+
+        :return: BigQuery JSON fields define
+        """
 
         bq_fields = []
 
@@ -267,6 +298,7 @@ def __init__(self, source_database=None):
     def source_database(self):
         """
         Source database option
+
         :param source_database: enum DdlParse.DATABASE
         """
         return super().source_database
@@ -300,10 +332,54 @@ def columns(self):
         return self._columns
 
     def to_bigquery_fields(self, name_case=DdlParseBase.NAME_CASE.original):
-        """Generate BigQuery JSON fields define"""
+        """
+        Generate BigQuery JSON fields define
+
+        :param name_case: name case type
+            * DdlParse.NAME_CASE.original : Return to no convert
+            * DdlParse.NAME_CASE.lower : Return to lower
+            * DdlParse.NAME_CASE.upper : Return to upper
+
+        :return: BigQuery JSON fields define
+        """
 
         return self._columns.to_bigquery_fields(name_case)
 
+    def to_bigquery_ddl(self, name_case=DdlParseBase.NAME_CASE.original):
+        """
+        Generate BigQuery CREATE TABLE statements
+
+        :param name_case: name case type
+            * DdlParse.NAME_CASE.original : Return to no convert
+            * DdlParse.NAME_CASE.lower : Return to lower
+            * DdlParse.NAME_CASE.upper : Return to upper
+
+        :return: BigQuery CREATE TABLE statements
+        """
+
+        if self.schema is None:
+            dataset = "dataset"
+        elif name_case == self.NAME_CASE.lower:
+            dataset = self.schema.lower()
+        elif name_case == self.NAME_CASE.upper:
+            dataset = self.schema.upper()
+        else:
+            dataset = self.schema
+
+        cols_def = []
+        for col in self.columns.values():
+            cols_def.append("{name} {type}{not_null}".format(
+                name=col.get_name(name_case),
+                type=col.bigquery_standard_data_type,
+                not_null=" NOT NULL" if col.not_null else "",
+            ))
+
+        return "#standardSQL\nCREATE TABLE `project.{dataset}.{table}`\n(\n  {colmns_define}\n)".format(
+            dataset=dataset,
+            table=self.get_name(name_case),
+            colmns_define=",\n  ".join(cols_def),
+        )
+
 
 class DdlParse(DdlParseBase):
     """DDL parser"""
@@ -356,6 +432,7 @@ def __init__(self, ddl=None, source_database=None):
     def source_database(self):
         """
         Source database option
+
         :param source_database: enum DdlParse.DATABASE
         """
         return super().source_database
