Releases · dimajix/flowman

29 Mar 04:52

kupferk

0.23.1

aeaf752

github-154: Fix failing migration when PK requires change due to data type
github-156: Recreate indexes when data type of column changes
github-155: Project level configs are used outside job
github-157: Fix UPSERT operations for SQL Server
github-158: Improve non-nullability of primary key column
github-160: Use sensible defaults for default documenter
github-161: Improve schema caching during execution
github-162: ExpressionColumnCheck does not work when results contain NULL values
github-163: Implement new column length quality check

18 Mar 17:12

kupferk

0.23.0

55199f2

The main feature of this version is a significant improvement of the new documentation system, which now also includes column-level lineage. The automatically generated documentation is a valuable artifact that helps both developers and business experts understand the data models and transformations. Flowman projects can also specify quality checks (such as NOT NULL conditions, foreign key relationships, or arbitrary SQL expressions), which are not only included in the documentation but also executed on the real data.
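
To make the three kinds of quality checks more concrete, here is a minimal sketch in plain Python with SQLite. It is not Flowman's check syntax or implementation; the tables `customers` and `orders`, the helper `failing_rows`, and the check names are hypothetical and exist only for illustration. Each check is expressed as a query that counts the rows violating it.

```python
import sqlite3

# Hypothetical sample data, not taken from any Flowman project.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE orders (id INTEGER, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 3, 12.5), (12, NULL, -4.0);
""")


def failing_rows(sql):
    """Run a query that counts the rows violating a check."""
    return conn.execute(sql).fetchone()[0]


# Illustrative check definitions: one query per check.
checks = {
    # NOT NULL condition on orders.customer_id
    "not_null": "SELECT COUNT(*) FROM orders WHERE customer_id IS NULL",
    # Foreign key relationship orders.customer_id -> customers.id
    "foreign_key": """
        SELECT COUNT(*) FROM orders o
        WHERE o.customer_id IS NOT NULL
          AND NOT EXISTS (SELECT 1 FROM customers c WHERE c.id = o.customer_id)
    """,
    # Arbitrary SQL expression: every amount must be positive
    "expression": "SELECT COUNT(*) FROM orders WHERE NOT (amount > 0)",
}

results = {name: failing_rows(sql) for name, sql in checks.items()}
for name, count in results.items():
    print(f"{name}: {count} failing row(s)")
```

Counting violations rather than returning a single pass/fail flag keeps the result useful for documentation, since a report can show how many records failed each check.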

Moreover, support for SQL databases has been improved again with the introduction of temporary staging tables, which are used to perform updates within a transactional commit.
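
The staging-table pattern can be sketched roughly as follows. This is a generic illustration using Python's built-in `sqlite3` module, not Flowman's JDBC implementation; the table names `target` and `staging` and the function `upsert_via_staging` are hypothetical. New records are first written to a temporary table, and only the final merge into the target table runs inside a single transaction, so readers never observe a half-applied update.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany("INSERT INTO target VALUES (?, ?)", [(1, "old"), (2, "keep")])
conn.commit()


def upsert_via_staging(conn, rows):
    """Load rows into a temporary staging table, then merge them in one transaction."""
    conn.execute("CREATE TEMP TABLE staging (id INTEGER PRIMARY KEY, value TEXT)")
    # Step 1: bulk-load the staging table; the target is untouched so far.
    conn.executemany("INSERT INTO staging VALUES (?, ?)", rows)
    conn.commit()
    try:
        # Step 2: merge into the target. The connection context manager
        # commits on success and rolls back if the statement fails.
        with conn:
            conn.execute(
                "INSERT OR REPLACE INTO target SELECT id, value FROM staging"
            )
    finally:
        # Step 3: the staging table is no longer needed either way.
        conn.execute("DROP TABLE staging")


upsert_via_staging(conn, [(1, "new"), (3, "added")])
final_rows = conn.execute("SELECT id, value FROM target ORDER BY id").fetchall()
print(final_rows)
```

SQLite's `INSERT OR REPLACE` stands in here for the database-specific UPSERT or MERGE statement that a production system would issue.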

Detailed Changes

github-148: Support staging table for all JDBC relations
github-120: Use staging tables for UPSERT and MERGE operations in JDBC relations
github-147: Add support for PostgreSQL
github-151: Implement column level lineage in documentation
github-121: Correctly apply documentation, before/after and other common attributes to templates
github-152: Implement new 'cast' mapping

01 Mar 15:01

kupferk

0.22.0

80a9ec4

Add new sqlserver relation
Implement new documentation subsystem
Change default build to Spark 3.2.1 and Hadoop 3.3.1
Add new drop target for removing tables
Speed up project loading by reusing Jackson mapper
Implement new jdbc metric sink
Implement schema cache in Executor to speed up documentation and similar tasks
Add new config variables flowman.execution.mapping.schemaCache and flowman.execution.relation.schemaCache
Add new config variable flowman.default.target.verifyPolicy to ignore empty tables during VERIFY phase
Implement initial support for indexes in JDBC relations

24 Feb 16:53

kupferk

0.21.2

ba5a982

Fix importing projects

24 Feb 12:33

kupferk

0.21.1

20eba69

flowexec now returns different exit codes depending on the processing result
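
A calling script can branch on such an exit code, for example in a scheduler or CI job. The sketch below is a hedged illustration in Python: the concrete exit codes returned by flowexec are not listed in this note, so it only assumes the common convention that 0 means success. The function `run_and_classify` is hypothetical, and a stand-in Python command is used instead of a real flowexec invocation.

```python
import subprocess
import sys


def run_and_classify(cmd):
    """Run a command and turn its exit code into a short status message."""
    completed = subprocess.run(cmd)
    # Assumption: exit code 0 means success, anything else signals a problem.
    if completed.returncode == 0:
        return "success"
    return f"failed with exit code {completed.returncode}"


# Stand-in commands that simply exit with a chosen code; in practice the
# command list would invoke flowexec with the arguments of your project.
demo_ok = run_and_classify([sys.executable, "-c", "raise SystemExit(0)"])
demo_fail = run_and_classify([sys.executable, "-c", "raise SystemExit(3)"])
print(demo_ok)
print(demo_fail)
```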

26 Jan 14:48

kupferk

0.21.0

3033035

This is a minor release with only a few noticeable changes, but some internal refactorings.

Fix wrong dependencies in Swagger plugin
Implement basic schema inference for local CSV files
Implement new stack mapping
Improve error messages of local CSV parser

07 Jan 06:16

kupferk

0.20.1

c139042

Implement detection of dependencies introduced by schema

05 Jan 16:32

kupferk

0.20.0

25eeb60

Fix detection of Derby metastore to truncate comment lengths.
Add new config variable flowman.default.relation.input.columnMismatchPolicy (default is IGNORE)
Add new config variable flowman.default.relation.input.typeMismatchPolicy (default is IGNORE)
Add new config variable flowman.default.relation.output.columnMismatchPolicy (default is ADD_REMOVE_COLUMNS)
Add new config variable flowman.default.relation.output.typeMismatchPolicy (default is CAST_ALWAYS)
Improve handling of _SUCCESS files for detecting (non-)dirty directories
Implement new merge target
Implement merge operation for Delta relations
Implement merge operation for JDBC relations (only for some databases, i.e. MS SQL)
Add new config variable flowman.execution.target.useHistory (default is false)
Change the semantics of config variable flowman.execution.target.forceDirty (default is false)
Add new -d / --dirty option for explicitly marking individual targets as dirty

14 Dec 11:18

kupferk

0.19.0

c2d2748

Add build profile for Hadoop 3.3
Add build profile for Spark 3.2
Allow SQL expressions as dimensions in aggregate mapping
Update Hive views when the resulting schema would change
Add new mapping cache command to FlowShell
Support embedded connection definitions
Much improved Flowman History Server
Fix wrong metric names with TemplateTarget
Implement more template types for connection, schema, dataset, assertion and measure
Implement new measure target for creating custom metrics for measuring data quality
Add new config option flowman.execution.mapping.parallelism

13 Oct 17:37

kupferk

0.18.0

27de909

Improve automatic schema migration for Hive and JDBC relations
Improve support of CHAR(n) and VARCHAR(n) types. Those types will now be propagated to Hive with newer Spark versions
Support writing to dynamic partitions for file relations, Hive tables, JDBC relations and Delta tables
Fix the name of some config variables (floman.* => flowman.*)
Add new config variables flowman.default.relation.migrationPolicy and flowman.default.relation.migrationStrategy
Add plugin for supporting DeltaLake (https://delta.io), which provides deltaTable and deltaFile relation types
Fix non-deterministic column order in schema mapping, values mapping and values relation
Mark Hive dependencies as 'provided', which reduces the size of dist packages
Significantly reduce size of AWS dependencies in AWS plugin
Add new build profile for Cloudera CDP-7.1
Improve Spark configuration of LocalSparkSession and TestRunner
Update Spark 3.0 build profile to Spark 3.0.3
Upgrade Impala JDBC driver from 2.6.17.1020 to 2.6.23.1028
Upgrade MySQL JDBC driver from 8.0.20 to 8.0.25
Upgrade MariaDB JDBC driver from 2.2.4 to 2.7.3
Upgrade several Maven plugins to latest versions
Add new config option flowman.workaround.analyze_partition to work around CDP 7.1 issues
Fix migrating Hive views to tables and vice-versa
Add new option "-j " to allow running multiple job instances in parallel
Add new option "-j " to allow running multiple tests in parallel
Add new uniqueKey assertion
Add new schema assertion
Update Swagger libraries for swagger schema
Implement new openapi plugin to support OpenAPI 3.0 schemas
Add new readHive mapping
Add new simpleReport and report hook
Implement new templates
