Stringifying of object data is extremely slow #1137

Open

rs0h opened this issue Aug 7, 2019 · 4 comments

rs0h commented Aug 7, 2019

Fetching 15,000 rows of the SDO_GEOMETRY data type takes so long it is almost impossible to wait for, and fetchArraySize has no effect on the fetch of object data.

However, the same code is fast when fetching regular data types.

const oracledb = require('oracledb');
const dbConfig = require('./dbconfig.js');
const { performance } = require('perf_hooks');

connection = await oracledb.getConnection(dbConfig);
oracledb.fetchArraySize = 1000;
let t0 = performance.now();
result = await connection.execute(
  `select geometry from g where rownum < 15000`,
  [],
  { outFormat: oracledb.OUT_FORMAT_OBJECT }
);

console.log(result.rows.length);
let count = 0;
let all = "";
for (const row of result.rows) {
  // column names are upper case by default with OUT_FORMAT_OBJECT
  all += JSON.stringify(row.GEOMETRY);
  count++;
}
let t1 = performance.now();
console.log("Fetch " + (t1 - t0) + " milliseconds.");

output:

14999 
Fetch 604606.555001 milliseconds.
> process.platform
'win32'
> process.version
'v12.6.0'
> process.arch
'x64'
> require('oracledb').versionString
'4.0.0'
>

Oracle Database version = 11.2.0.3
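
For reference, fetchArraySize can also be passed per call as an execute() option rather than set globally; a minimal sketch with the same query (although, as noted above, it makes no difference for object columns):

result = await connection.execute(
  `select geometry from g where rownum < 15000`,
  [],
  // per-call option, equivalent to setting oracledb.fetchArraySize globally
  { fetchArraySize: 1000, outFormat: oracledb.OUT_FORMAT_OBJECT }
);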

rs0h added the question label Aug 7, 2019
cjbj (Member) commented Aug 7, 2019

With the increased availability of object access, I was expecting this to get more visibility; I mentioned the performance of objects in the recent office hours session. There is definitely overhead in the pickling/unpickling of objects coming from the database, even before they reach the node-oracledb code. Can you give us a runnable test case (i.e. with data) to review and also send over to the Object team, to see if they can make improvements?

cjbj (Member) commented Aug 8, 2019

Can you use pipelined table functions? The presentation Best Practices, Tips and Tricks with Oracle Spatial and Graph looks interesting. We showed something similar with dbms_output.

PS. Later Oracle versions have fixes like this spatial fix.
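
A minimal sketch of that approach (the names clob_tab and geom_as_gml are made up for illustration, and the table g is from the query above): convert each geometry to a GML CLOB inside the database with SDO_UTIL.TO_GMLGEOMETRY, so node-oracledb fetches plain strings instead of SDO_GEOMETRY objects:

// one-time setup: a collection type and a pipelined function that
// streams the geometries out as GML CLOBs (hypothetical names)
await connection.execute(`CREATE OR REPLACE TYPE clob_tab AS TABLE OF CLOB`);
await connection.execute(
  `CREATE OR REPLACE FUNCTION geom_as_gml RETURN clob_tab PIPELINED IS
   BEGIN
     FOR r IN (SELECT geometry FROM g WHERE ROWNUM < 15000) LOOP
       PIPE ROW (SDO_UTIL.TO_GMLGEOMETRY(r.geometry));  -- CLOB, not SDO_GEOMETRY
     END LOOP;
     RETURN;
   END;`);

// fetch the CLOBs directly as strings
const result = await connection.execute(
  `SELECT COLUMN_VALUE AS gml FROM TABLE(geom_as_gml())`,
  [],
  { outFormat: oracledb.OUT_FORMAT_OBJECT,
    fetchInfo: { GML: { type: oracledb.STRING } } }
);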

rs0h (Author) commented Aug 8, 2019

Christopher,

Thank you for the presentation.

The point is to use the functionality as designed, without workarounds. I could use pipelining or other techniques to speed up data extraction, but honestly that is not what I want to do.

I will send you the SQLLDR sample data by email.

anthony-tuininga (Member) commented Aug 23, 2019

Using the sample data you provided, I ran tests with 11.2, 12.2 and 19.3 using the following code (modified slightly from what you provided):

const oracledb = require('oracledb');
const dbConfig = require('./dbconfig.js');
const { performance } = require('perf_hooks');
const fs = require('fs');

console.log("Platform:", process.platform);
console.log("Version:", process.version);
console.log("Arch:", process.arch);
console.log("OracleDB:", oracledb.versionString);
console.log("Client:", oracledb.oracleClientVersionString);

async function run() {
  let connection;

  try {

    let result;

    connection = await oracledb.getConnection(dbConfig);
    console.log("Database:", connection.oracleServerVersionString);
    let t0 = performance.now();
    result = await connection.execute(
      `select geometry from highway_line where rownum <= 15000`,
      [],
      {outFormat: oracledb.OUT_FORMAT_OBJECT }
    );

    console.log("Num Rows:", result.rows.length);

    let t1 = performance.now();
    console.log("  execute -> " + (t1 - t0).toFixed(0) + " milliseconds.");

    let count = 0;
    let all = "";

    // Stringify rows (attribute access on the fetched DbObjects)
    t0 = performance.now();
    for (const row of result.rows) {
      all += JSON.stringify(row.GEOMETRY);
      count++;
    }
    t1 = performance.now();
    console.log("  stringify " + count + " rows -> " + (t1 - t0).toFixed(0) + " milliseconds.");

    // write file (fs.writeFile is asynchronous: this times only the
    // initiation of the write, not the write itself)
    t0 = performance.now();
    fs.writeFile('./data.json', all, function (err) {
      if (err) throw err;
    });
    t1 = performance.now();
    console.log("  write file " + (t1 - t0).toFixed(0) + " milliseconds.");

    // deep copy each row to a plain JavaScript object in a new array
    t0 = performance.now();
    count = 0;
    let newa = [];
    for (const row of result.rows) {
      newa.push(JSON.parse(JSON.stringify(row)));
      count++;
    }
    t1 = performance.now();
    console.log("  copy " + count + " to new array -> " + (t1 - t0).toFixed(0) + " milliseconds.");

    // stringify new array
    t0 = performance.now();
    count = 0;
    all = "";
    for (const row of newa) {
      all += JSON.stringify(row.GEOMETRY);
      count++;
    }
    t1 = performance.now();
    console.log("  stringify " + count + " newarray -> " + (t1 - t0).toFixed(0) + " milliseconds.");

    // write new array to 2nd file (again asynchronous)
    t0 = performance.now();
    fs.writeFile('./datanewa.json', all, function (err) {
      if (err) throw err;
    });
    t1 = performance.now();
    console.log("  write file2 -> " + (t1 - t0).toFixed(0) + " milliseconds.");

  } catch (err) {
    console.error(err);
  } finally {
    if (connection) {
      try {
        await connection.close();
      } catch (err) {
        console.error(err);
      }
    }
  }
}

run();

I got the following results for 11.2:

Platform: linux
Version: v12.9.0
Arch: x64
OracleDB: 4.0.1
Client: 11.2.0.4.0
Database: 11.2.0.4.0
Num Rows: 15000
  execute -> 249 milliseconds.
  stringify 15000 rows -> 5334 milliseconds.
  write file 0 milliseconds.
  copy 15000 to new array -> 5493 milliseconds.
  stringify 15000 newarray -> 39 milliseconds.
  write file2 -> 0 milliseconds.

And very similar results for 19.3:

Platform: linux
Version: v12.9.0
Arch: x64
OracleDB: 4.0.1
Client: 19.3.0.0.0
Database: 19.3.0.0.0
Num Rows: 15000
  execute -> 259 milliseconds.
  stringify 15000 rows -> 4220 milliseconds.
  write file 0 milliseconds.
  copy 15000 to new array -> 4266 milliseconds.
  stringify 15000 newarray -> 39 milliseconds.
  write file2 -> 0 milliseconds.

This demonstrates that fetching these objects is not particularly slow at all. Stringifying them is the slow operation, and that could definitely be optimized a lot more than it is right now. We'll look into that. The README.txt that you provided, however, showed similar code with the execute and fetch substantially slower than what I am seeing. I'm not sure whether that is configuration, platform or something else entirely. Hopefully your DBA can help you figure that one out!
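
A minimal sketch of what those timings imply (an illustration, not an official workaround): the first full traversal of a fetched DbObject pays the native attribute-access cost, so if the rows will be serialized or traversed more than once, copy them to plain JavaScript objects first and reuse the copies:

// the expensive step is the first full traversal of each DbObject;
// the plain copies make every later traversal cheap
const plainRows = result.rows.map(r => JSON.parse(JSON.stringify(r)));  // ~5 s above, paid once
const json = JSON.stringify(plainRows);       // fast (tens of milliseconds above)
const jsonAgain = JSON.stringify(plainRows);  // fast again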

sharadraju changed the title from Fetch object data is extremely slow to Stringifying of object data is extremely slow on Mar 3, 2025