Connection Hang Intermittently in AWS Lambda #1734

shubsaini09 · 2025-04-02T17:21:25Z

I am using AWS Lambda with provisioned concurrency. I am creating pool as part of Lambda initialization. As part of this process, I create a pool with min 1 and max 100. Whenever the request comes I use getConnection() to get the connection and then execute() to run our insert query. Sometimes I see that code fails at the line where we do getConnection() without any error. Our lambda is attached to the APIG, as per the APIG logs, I see that lambda returned null. I have added detailed logging in our Lambda but no trace of any error. The console statement before getConnection() gets printed but not after, not even in catch block.

One of the pattern that I noticed is that this happens only when there is no request for sometime and then request comes. The first request fails but all subsequent requests are successful. I understand that connection might be stale and may not be available but I have a custom retry logic in place which would retry if the error comes but there is not error thrown to retry.

// index.mjs

import { createDBConnection } from './dbConnection.js';
import { insertData } from './insertData.js';

const poolConfig = {
	user,
	password,
	connectString,
	poolMax,
	poolMin,
	poolIncrement,
	transportConnectTimeout,
	poolAlias,
	enableStatistics: true
};
await createDBConnection(1, poolConfig, 'aliasName' );

export const handler = async (event) => {
    const dbResponse = await insertData(event.body);
    return { statusCode: 200, body: JSON.stringify(dbResponse) };
}

// insertData.js

const insertVoiceBotdata = async (payload) => {
	const binds = getQueryAndData(payload); // gets the query and create the formatted binds
	const dbResponse = await executeQuery(QUERY, binds, 'aliasName', true);
	return {
		status: dbResponse.rowsAffected ? dbResponse.rowsAffected > 0 : false,
		status_description: dbResponse.message,
	};
};

// dbConnection.js

const oracledb = require('oracledb');

oracledb.outFormat = oracledb.OUT_FORMAT_OBJECT;

async function createDBConnection(retryAttempt, poolConfig, poolAlias) {
	try {
		oracledb.getPool(poolAlias);
		return true;
	} catch (error) {
		try {
			await oracledb.createPool(poolConfig);
		} catch (connectionError) {
			console.error('Error creating pool: ', connectionError);
			if (retryAttempt > Number(retries)) {
				throw connectionError;
			}
			const delayInMilliseconds = Number(baseDelayInMilliseconds) * 2 ** retryAttempt;
			await new Promise((resolve) => { setTimeout(resolve, delayInMilliseconds); });
			const nextRetryAttempt = retryAttempt + 1;
			return createDBConnection(nextRetryAttempt, poolConfig, poolAlias);
		}
	}
}

const executeQuery = async (query, binds, poolAlias, autoCommit = false) => {
	let connection;
	try {
                console.log('getting connection...'); // printing
		connection = await oracledb.getConnection(poolAlias);
		console.log('received connection...'); // not printing
		const queryResponse = await connection.execute(query, binds, { autoCommit });
		console.log('queryResponse...', queryResponse);
		return queryResponse;
	} catch (error) {
		console.error('Error executing query:', error);
		return error;
	} finally {
		try {
			if (connection) await connection.close();
		} catch (closeError) {
			console.error('Error closing connection:', closeError);
		}
	}
};

module.exports = {
	executeQuery,
	createDBConnection,
};

Any help is appreciated. Thanks

Oracledb version - 6.7.0
platform.arch - x86_64
process.platform - linux

The text was updated successfully, but these errors were encountered:

sreguna · 2025-04-03T04:37:18Z

This could be due to a firewall dropping idle connections. Can you set expireTime to 1 and check?

shubsaini09 · 2025-04-03T19:58:22Z

Ok will add expireTime: 1 to my pool config and let you know the results.

Meanwhile, i tried something that works but the puzzle remains unsolved with original approach. I moved the code of DB inititilization from outside the handler to inside.

// index.mjs

import { insertData } from './insertData.js';

export const handler = async (event) => {
    const dbResponse = await insertData(event.body);
    return { statusCode: 200, body: JSON.stringify(dbResponse) };
}

// insertData.js

import { createDBConnection } from './dbConnection.js';

const initDb = async() => {
   const poolConfig = {
	user,
	password,
	connectString,
	poolMax,
	poolMin,
	poolIncrement,
	transportConnectTimeout,
	poolAlias,
	enableStatistics: true
    };
    await createDBConnection(1, poolConfig, 'aliasName' );
}

const insertVoiceBotdata = async (payload) => {
        await initDb();
	const binds = getQueryAndData(payload); // gets the query and create the formatted binds
	const dbResponse = await executeQuery(QUERY, binds, 'aliasName', true);
	return {
		status: dbResponse.rowsAffected ? dbResponse.rowsAffected > 0 : false,
		status_description: dbResponse.message,
	};
};

shubsaini09 · 2025-04-11T17:57:36Z

@sreguna expireTime did not work for me.

any suggestions @cjbj @sharadraju @Bigous

sreguna · 2025-04-15T06:26:05Z

Please also consider setting the pooPingTimeout parameter. The default is 5000 msecs. You may want to set it to a lower value. Depending on the number of open connections in the pool(n) getConnection may take upto n*poolPingTimeout msecs to return a connection, if the connections are all in a bad state. Since your poolMax is 100 this could potentially cause a hang of upto 8-9 minutes.
You could also reduce poolMax to a lower value and check if the hang occurs.

shubsaini09 · 2025-04-16T02:23:01Z

@sreguna Lambda freezes the environment after the request finishes so poolPingTimeout will not work. Why will high poolMax value cause a hang?

sreguna · 2025-04-16T07:23:00Z

@shubsaini09 poolPingTimeout will be used when trying to get a connection from the pool using getConnection(). Internally a healthy check(ping) is done if the connection has been idle for more than 60 secs(default). The idle time can be configured through the poolPingInterval parameter. If there is no response to the ping within the poolPingTimeout interval, the connection is forcefully closed. More details regarding connection pool health checks can be found here https://node-oracledb.readthedocs.io/en/latest/user_guide/connection_handling.html#connection-pool-pinging.

My point regarding poolMax was, when you do a getConnection it may take a while to return as all the idle connections in the pool may endup going through the poolPingTimeout interval before the call returns.

sharadraju · 2025-04-17T09:18:36Z

@shubsaini09 Please note that we do not test on AWS Lambda and hence it will be difficult for us to reproduce this issue. Having said that, you can upgrade to the latest node-oracledb version (6.8), set the poolPingInterval parameter to a lower value as suggested by @sreguna and see if the hangs still happen

shubsaini09 added the question label Apr 2, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Connection Hang Intermittently in AWS Lambda #1734

Connection Hang Intermittently in AWS Lambda #1734

shubsaini09 commented Apr 2, 2025 •

edited

Loading

sreguna commented Apr 3, 2025

shubsaini09 commented Apr 3, 2025

shubsaini09 commented Apr 11, 2025 •

edited

Loading

sreguna commented Apr 15, 2025 •

edited

Loading

shubsaini09 commented Apr 16, 2025

sreguna commented Apr 16, 2025 •

edited

Loading

sharadraju commented Apr 17, 2025 •

edited

Loading

Connection Hang Intermittently in AWS Lambda #1734

Connection Hang Intermittently in AWS Lambda #1734

Comments

shubsaini09 commented Apr 2, 2025 • edited Loading

sreguna commented Apr 3, 2025

shubsaini09 commented Apr 3, 2025

shubsaini09 commented Apr 11, 2025 • edited Loading

sreguna commented Apr 15, 2025 • edited Loading

shubsaini09 commented Apr 16, 2025

sreguna commented Apr 16, 2025 • edited Loading

sharadraju commented Apr 17, 2025 • edited Loading

shubsaini09 commented Apr 2, 2025 •

edited

Loading

shubsaini09 commented Apr 11, 2025 •

edited

Loading

sreguna commented Apr 15, 2025 •

edited

Loading

sreguna commented Apr 16, 2025 •

edited

Loading

sharadraju commented Apr 17, 2025 •

edited

Loading