Re: Speed dblink using alternate libpq tuple storage

Lists: pgsql-hackers
From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
To: markokr(at)gmail(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com, shigeru(dot)hanada(at)gmail(dot)com, mmoncure(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-16 05:24:19
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Hello, sorry for long absense.

As far as I see, on an out-of-memory in getAnotherTuple() makes
conn->result->resultStatus = PGRES_FATAL_ERROR and
qpParseInputp[23]() skips succeeding 'D' messages consequently.

When exception raised within row processor, pg_conn->inCursor
always positioned in consistent and result->resultStatus ==
PGRES_TUPLES_OK.

The choices of the libpq user on that point are,

- Continue to read succeeding tuples.

Call PQgetResult() to read 'D' messages and hand it to row
processor succeedingly.

- Throw away the remaining results.

Call pqClearAsyncResult() and pqSaveErrorResult(), then call
PQgetResult() to skip over the succeeding 'D' messages. (Of
course the user can't do that on current implement.)

To make the users able to select the second choice (I think this
is rather major), we should only provide and export the new PQ*
function to do that, I think.

void
PQskipRemainingResult(PGconn *conn)
{
pqClearAsyncResult(conn);

/* conn->result is always NULL here */
pqSaveErrorResult(conn);

/* Skip over remaining 'D' messages. * /
PQgetResult(conn);
}

User may write code with this function.

...
PG_TRY();
{
...
res = PQexec(conn, "....");
...
}
PG_CATCH();
{
PQskipRemainingResult(conn);
goto error;
}
PG_END_TRY();

Of cource, this is applicable to C++ client in the same manner.

try {
...
res = PQexec(conn, "....");
...
} catch (const myexcep& ex) {
PQskipRemainingResult(conn);
throw ex;
}

By the way, where should I insert this function ?

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com, shigeru(dot)hanada(at)gmail(dot)com, mmoncure(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-16 08:17:21
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Feb 16, 2012 at 02:24:19PM +0900, Kyotaro HORIGUCHI wrote:
> As far as I see, on an out-of-memory in getAnotherTuple() makes
> conn->result->resultStatus = PGRES_FATAL_ERROR and
> qpParseInputp[23]() skips succeeding 'D' messages consequently.
>
> When exception raised within row processor, pg_conn->inCursor
> always positioned in consistent and result->resultStatus ==
> PGRES_TUPLES_OK.
>
> The choices of the libpq user on that point are,
>
> - Continue to read succeeding tuples.
>
> Call PQgetResult() to read 'D' messages and hand it to row
> processor succeedingly.
>
> - Throw away the remaining results.
>
> Call pqClearAsyncResult() and pqSaveErrorResult(), then call
> PQgetResult() to skip over the succeeding 'D' messages. (Of
> course the user can't do that on current implement.)

There is also third choice, which may be even more popular than
those ones - PQfinish().

> To make the users able to select the second choice (I think this
> is rather major), we should only provide and export the new PQ*
> function to do that, I think.

I think we already have such function - PQsetRowProcessor().
Considering the user can use that to install skipping callback
or simply set some flag in it's own per-connection state,
I suspect the need is not that big.

But if you want to set error state for skipping, I would instead
generalize PQsetRowProcessorErrMsg() to support setting error state
outside of callback. That would also help the external processing with
'return 2'. But I would keep the requirement that row processing must
be ongoing, standalone error setting does not make sense. Thus the name
can stay.

There seems to be 2 ways to do it:

1) Replace the PGresult under PGconn. This seems ot require that
PQsetRowProcessorErrMsg takes PGconn as argument instead of
PGresult. This also means callback should get PGconn as
argument. Kind of makes sense even.

2) Convert current partial PGresult to error state. That also
makes sense, current use ->rowProcessorErrMsg to transport
the message to later set the error in caller feels weird.

I guess I slightly prefer 2) to 1).

--
marko


From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
To: markokr(at)gmail(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com, shigeru(dot)hanada(at)gmail(dot)com, mmoncure(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-21 05:11:35
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Hello,

I don't have any attachment to PQskipRemainingResults(), but I
think that some formal method to skip until Command-Complete('C')
without breaking session is necessary because current libpq does
so.

====
> On Thu, Feb 16, 2012 at 02:24:19PM +0900, Kyotaro HORIGUCHI wrote:
> > The choices of the libpq user on that point are,
> >
> > - Continue to read succeeding tuples.
> > - Throw away the remaining results.
>
> There is also third choice, which may be even more popular than
> those ones - PQfinish().

That's it. I've implicitly assumed not to tear off the current
session.

> I think we already have such function - PQsetRowProcessor().
> Considering the user can use that to install skipping callback
> or simply set some flag in it's own per-connection state,

PQsetRowProcessor() sets row processor not to PGresult but
PGconn. So using PGsetRowProcessor() makes no sense for the
PGresult on which the user currently working. Another interface
function is needed to do that on PGresult.

But of course the users can do that by signalling using flags
within their code without PQsetRowProcessor or
PQskipRemainingResults.

Or returning to the beginning implement using PG_TRY() to inhibit
longjmp out of the row processor itself makes that possible too.

Altough it is possible in that ways, it needs (currently)
undocumented (in human-readable langs :-) knowledge about the
client protocol and the handling manner of that in libpq which
might be changed with no description in the release notes.

> I suspect the need is not that big.

I think so, too. But current implement of libpq does so for the
out-of-memory on receiving result rows. So I think some formal
(documented, in other words) way to do that is indispensable.

> But if you want to set error state for skipping, I would instead
> generalize PQsetRowProcessorErrMsg() to support setting error state
> outside of callback. That would also help the external processing with
> 'return 2'. But I would keep the requirement that row processing must
> be ongoing, standalone error setting does not make sense. Thus the name
> can stay.

mmm.. I consider that the cause of the problem proposed here is
the exceptions raised by certain server-side functions called in
row processor especially in C/C++ code. And we shouldn't use
PG_TRY() to catch it there where is too frequently executed. I
think 'return 2' is not applicable for the case. Some aid for
non-local exit from row processors (PQexec and the link from
users's sight) is needed. And I think it should be formal.

> There seems to be 2 ways to do it:
>
> 1) Replace the PGresult under PGconn. This seems ot require that
> PQsetRowProcessorErrMsg takes PGconn as argument instead of
> PGresult. This also means callback should get PGconn as
> argument. Kind of makes sense even.
>
> 2) Convert current partial PGresult to error state. That also
> makes sense, current use ->rowProcessorErrMsg to transport
> the message to later set the error in caller feels weird.
>
> I guess I slightly prefer 2) to 1).

The former might be inappropreate from the point of view of the
`undocumented knowledge' above. The latter seems missing the
point about exceptions.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com, shigeru(dot)hanada(at)gmail(dot)com, mmoncure(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-21 07:41:18
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 21, 2012 at 02:11:35PM +0900, Kyotaro HORIGUCHI wrote:
> I don't have any attachment to PQskipRemainingResults(), but I
> think that some formal method to skip until Command-Complete('C')
> without breaking session is necessary because current libpq does
> so.

We have such function: PQgetResult(). Only question is how
to flag that the rows should be dropped.

> > I think we already have such function - PQsetRowProcessor().
> > Considering the user can use that to install skipping callback
> > or simply set some flag in it's own per-connection state,
>
> PQsetRowProcessor() sets row processor not to PGresult but
> PGconn. So using PGsetRowProcessor() makes no sense for the
> PGresult on which the user currently working. Another interface
> function is needed to do that on PGresult.

If we are talking about skipping incoming result rows,
it's PGconn feature, not PGresult. Because you want to do
network traffic for that, yes?

Also, as row handler is on connection, then changing it's
behavior is connection context, not result.

> But of course the users can do that by signalling using flags
> within their code without PQsetRowProcessor or
> PQskipRemainingResults.
>
> Or returning to the beginning implement using PG_TRY() to inhibit
> longjmp out of the row processor itself makes that possible too.
>
> Altough it is possible in that ways, it needs (currently)
> undocumented (in human-readable langs :-) knowledge about the
> client protocol and the handling manner of that in libpq which
> might be changed with no description in the release notes.

You might be right that how to do it may not be obvious.

Ok, lets see how it looks. But please do it like this:

- PQgetRowProcessor() that returns existing row processor.

- PQskipResult() that just replaces row processor, then calls
PQgetResult() to eat the result. It's behaviour fully
matches PQgetResult() then.

I guess the main thing that annoys me with skipping is that
it would require additional magic flag inside libpq.
With PQgetRowProcessor() it does not need anymore, it's
just a helper function that user can implement as well.

Only question here is what should happen if there are
several incoming resultsets (several queries in PQexec).
Should we leave to user to loop over them?

Seems there is 2 approaches here:

1) int PQskipResult(PGconn)
2) int PQskipResult(PGconn, int skipAll)

Both cases returning:
1 - got resultset, there might be more
0 - PQgetResult() returned NULL, connection is empty
-1 - error

Although 1) mirrors regular PGgetResult() better, most users
might not know that function as they are using sync API.
They have simpler time with 2). So 2) then?

--
marko


From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
To: markokr(at)gmail(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com, shigeru(dot)hanada(at)gmail(dot)com, mmoncure(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-21 09:42:34
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Hello,

> > I don't have any attachment to PQskipRemainingResults(), but I
> > think that some formal method to skip until Command-Complete('C')
> > without breaking session is necessary because current libpq does
> > so.
>
> We have such function: PQgetResult(). Only question is how
> to flag that the rows should be dropped.

I agree with it. I did this by conn->result->resultStatus ==
PGRES_FATAL_ERROR that instructs pqParseInput[23]() to skip
calling getAnotherTuple() but another means to do the same thing
without marking error is needed.

> Also, as row handler is on connection, then changing it's
> behavior is connection context, not result.

OK, current implement copying the pointer to the row processor
from PGconn to PGresult and getAnotherTuple() using it on
PGresult to avoid unintended replacement of row processor by
PQsetRowProcessor(), and I understand that row processor setting
should be in PGconn context and the change by PGsetRowProcessor()
should immediately become effective. That's right?

> Ok, lets see how it looks. But please do it like this:
>
> - PQgetRowProcessor() that returns existing row processor.

This seems also can be done by the return value of
PQsetRowProcessor() (currently void). Anyway, I provide this
function and also change the return value of PQsetRowProcessor().

> - PQskipResult() that just replaces row processor, then calls
> PQgetResult() to eat the result. It's behaviour fully
> matches PQgetResult() then.

There seems to be two choices, one is that PQskipResult()
replaces the row processor with NULL pointer and
getAnotherTuple() calls row processor if not NULL. And the
another is PQskipResult() sets the empty function as row
processor. I do the latter for the present.

This approach does needless call of getAnotherTuple(). Seeing if
the pointer to row processor is NULL in pqParseInput[23]() could
avoid this extra calls and may reduce the time for skip, but I
think it is no need to do so for such rare cases.

> I guess the main thing that annoys me with skipping is that
> it would require additional magic flag inside libpq.
> With PQgetRowProcessor() it does not need anymore, it's
> just a helper function that user can implement as well.

Ok.

> Only question here is what should happen if there are
> several incoming resultsets (several queries in PQexec).
> Should we leave to user to loop over them?
>
> Seems there is 2 approaches here:
>
> 1) int PQskipResult(PGconn)
> 2) int PQskipResult(PGconn, int skipAll)
>
> Both cases returning:
> 1 - got resultset, there might be more
> 0 - PQgetResult() returned NULL, connection is empty
> -1 - error
>
> Although 1) mirrors regular PGgetResult() better, most users
> might not know that function as they are using sync API.
> They have simpler time with 2). So 2) then?

Let me confirm the effects of this function. Is the below
description right?

- PQskipResult(conn, false) makes just following PQgetResult() to
skip current bunch of rows and the consequent PQgetResult()'s
gathers rows as usual.

- PQskipResult(conn, true) makes all consequent PQgetResult()'s
to skip all the rows.

If this is right, row processor should stay also in PGresult
context. PQskipResult() replaces the row processor in PGconn when
the second parameter is true, and in PGresult for false.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
To: markokr(at)gmail(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com, shigeru(dot)hanada(at)gmail(dot)com, mmoncure(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-21 09:55:46
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Sorry, I should fix a wrong word selection..

> Let me confirm the effects of this function. Is the below
> description right?
>
> - PQskipResult(conn, false) makes just following PQgetResult() to
> skip current bunch of rows and the consequent PQgetResult()'s
> gathers rows as usual.

- PQskipResult(conn, false) makes just following PQgetResult() to
skip current bunch of rows and the succeeding PQgetResult()'s
gathers rows as usual. ~~~~~~~~~~

> - PQskipResult(conn, true) makes all consequent PQgetResult()'s
> to skip all the rows.

- PQskipResult(conn, true) makes all succeeding PQgetResult()'s
to skip all the rows. ~~~~~~~~~~

--
Kyotaro Horiguchi
NTT Open Source Software Center


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com, shigeru(dot)hanada(at)gmail(dot)com, mmoncure(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-21 09:56:50
Message-ID: CACMqXCJeR-4qT=sA+O7CW9L7d+-qNq06N-Drvi-ocgoMbusweQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 21, 2012 at 11:42 AM, Kyotaro HORIGUCHI
<horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp> wrote:
>> Also, as row handler is on connection, then changing it's
>> behavior is connection context, not result.
>
> OK, current implement copying the pointer to the row processor
> from PGconn to PGresult and getAnotherTuple() using it on
> PGresult to avoid unintended replacement of row processor by
> PQsetRowProcessor(), and I understand that row processor setting
> should be in PGconn context and the change by PGsetRowProcessor()
> should immediately become effective. That's right?

Note I dropped the row processor from under PGresult.
Please don't put it back there.

>> Ok, lets see how it looks.  But please do it like this:
>>
>> - PQgetRowProcessor() that returns existing row processor.
>
> This seems also can be done by the return value of
> PQsetRowProcessor() (currently void). Anyway, I provide this
> function and also change the return value of PQsetRowProcessor().

Note you need processorParam as well.
I think it's simpler to rely on PQgetProcessor()

>> - PQskipResult() that just replaces row processor, then calls
>>   PQgetResult() to eat the result.  It's behaviour fully
>>   matches PQgetResult() then.
>
> There seems to be two choices, one is that PQskipResult()
> replaces the row processor with NULL pointer and
> getAnotherTuple() calls row processor if not NULL. And the
> another is PQskipResult() sets the empty function as row
> processor. I do the latter for the present.

Yes, let's avoid NULLs.

> This approach does needless call of getAnotherTuple(). Seeing if
> the pointer to row processor is NULL in pqParseInput[23]() could
> avoid this extra calls and may reduce the time for skip, but I
> think it is no need to do so for such rare cases.

We should optimize for common case, which is non-skipping
row processor.

>> Only question here is what should happen if there are
>> several incoming resultsets (several queries in PQexec).
>> Should we leave to user to loop over them?
>>
>> Seems there is 2 approaches here:
>>
>> 1) int PQskipResult(PGconn)
>> 2) int PQskipResult(PGconn, int skipAll)
>>
>> Both cases returning:
>>     1 - got resultset, there might be more
>>     0 - PQgetResult() returned NULL, connection is empty
>>    -1 - error
>>
>> Although 1) mirrors regular PGgetResult() better, most users
>> might not know that function as they are using sync API.
>> They have simpler time with 2).  So 2) then?
>
> Let me confirm the effects of this function. Is the below
> description right?
>
> - PQskipResult(conn, false) makes just following PQgetResult() to
>  skip current bunch of rows and the consequent PQgetResult()'s
>  gathers rows as usual.

Yes.

> - PQskipResult(conn, true) makes all consequent PQgetResult()'s
>  to skip all the rows.
>
> If this is right, row processor should stay also in PGresult
> context. PQskipResult() replaces the row processor in PGconn when
> the second parameter is true, and in PGresult for false.

No, let's keep row processor only under PGconn.

--
marko


From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
To: markokr(at)gmail(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com, shigeru(dot)hanada(at)gmail(dot)com, mmoncure(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-21 10:13:55
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Hello,

> Note I dropped the row processor from under PGresult.
> Please don't put it back there.

I overlooked that. I understand it.

> > This seems also can be done by the return value of
> > PQsetRowProcessor() (currently void). Anyway, I provide this
> > function and also change the return value of PQsetRowProcessor().
>
> Note you need processorParam as well.
> I think it's simpler to rely on PQgetProcessor()

Hmm. Ok.

> > Let me confirm the effects of this function. Is the below
> > description right?
> >
> > - PQskipResult(conn, false) makes just following PQgetResult() to
> >  skip current bunch of rows and the consequent PQgetResult()'s
> >  gathers rows as usual.
>
> Yes.
>
> > - PQskipResult(conn, true) makes all consequent PQgetResult()'s
> >  to skip all the rows.

Well, Is this right?

> > If this is right, row processor should stay also in PGresult
> > context. PQskipResult() replaces the row processor in PGconn when
> > the second parameter is true, and in PGresult for false.
>
> No, let's keep row processor only under PGconn.

Then, Should I add the stash for the row processor (and needless
for param) to recall after in PGconn?

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com, shigeru(dot)hanada(at)gmail(dot)com, mmoncure(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-21 10:44:35
Message-ID: CACMqXCLJWhq=2DnTa8iahvdVTYyc_-uts_jPJRva-YOiXLyVRA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 21, 2012 at 12:13 PM, Kyotaro HORIGUCHI
<horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp> wrote:
>> > - PQskipResult(conn, true) makes all consequent PQgetResult()'s
>> >  to skip all the rows.
>
> Well, Is this right?

Yes, call getResult() until it returns NULL.

>> > If this is right, row processor should stay also in PGresult
>> > context. PQskipResult() replaces the row processor in PGconn when
>> > the second parameter is true, and in PGresult for false.
>>
>> No, let's keep row processor only under PGconn.
>
> Then, Should I add the stash for the row processor (and needless
> for param) to recall after in PGconn?

PQskipResult:
- store old callback and param in local vars
- set do-nothing row callback
- call PQgetresult() once, or until it returns NULL
- restore old callback
- return 1 if last result was non-NULL, 0 otherwise

--
marko


From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-21 11:15:07
Message-ID: CAM103DuBvgCta46RJgkwhMvTZd9VkPxYOhZ238Y+f8KmV18WXQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Thank you. Everything seems clear.
Please wait for a while.

> PQskipResult:
> - store old callback and param in local vars
> - set do-nothing row callback
> - call PQgetresu
>
> On Tue, Feb 21, 2012 at 12:13 PM, Kyotaro HORIGUCHI
> <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp> wrote:
> >> > - PQskipResult(conn, true) makes all consequent PQgetResult()'s
> >> > to skip all the rows.
> >
> > Well, Is this right?
>
> Yes, call getResult() until it returns NULL.
>
> >> > If this is right, row processor should stay also in PGresult
> >> > context. PQskipResult() replaces the row processor in PGconn when
> >> > the second parameter is true, and in PGresult for false.
> >>
> >> No, let's keep row processor only under PGconn.
> >
> > Then, Should I add the stash for the row processor (and needless
> > for param) to recall after in PGconn?
>
> PQskipResult:
> - store old callback and param in local vars
> - set do-nothing row callback
> - call PQgetresult() once, or until it returns NULL
> - restore old callback
> - return 1 if last result was non-NULL, 0 otherwise
>
> --
> marko
>
>


From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-21 15:27:57
Message-ID: CAM103DtAC+Dc7=C1mDF=u7jE5sn-=aeirNYjTT+YWu0MXkS7Uw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Hello, this is ... nth version of the patch.

> Thank you. Everything seems clear.
> Please wait for a while.

libpq-fe.h
- PQskipResult() is added instead of PGskipRemainigResults().

fe-exec.c
- PQskipResult() is added instead of PGskipRemainigResults().
- PQgetRowProcessor() is added.

dblink.c
- Use PQskipReslt() to skip remaining rows.
- Shorten the path from catching exception to rethrowing it. And
storeInfo.edata has been removed due to that.
- dblink_fetch() now catches exceptions properly.

libpq.sgml
- PQskipResult() is added instead of PGskipRemainigResults().
- Some misspelling has been corrected.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
libpq_rowproc_20120222.patch application/octet-stream 23.4 KB
libpq_rowproc_doc_20120222.patch application/octet-stream 9.5 KB
dblink_use_rowproc_20120222.patch application/octet-stream 12.8 KB
early_exit_20120222.diff application/octet-stream 2.1 KB

From: Marko Kreen <markokr(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-21 23:36:30
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 22, 2012 at 12:27:57AM +0900, Kyotaro HORIGUCHI wrote:
> fe-exec.c
> - PQskipResult() is added instead of PGskipRemainigResults().

It must free the PGresults that PQgetResult() returns.

Also, please fix 2 issues mentined here:

http://archives.postgresql.org/message-id/CACMqXCLvpkjb9+c6sqJXitMHvrRCo+yu4q4bQ--0d7L=vw62Yg@mail.gmail.com

--
marko


From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
To: markokr(at)gmail(dot)com
Cc: greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-23 10:14:03
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Hello, this is new version of the patch.

# This patch is based on the commit
# 2bbd88f8f841b01efb073972b60d4dc1ff1f6fd0 @ Feb 13 to avoid the
# compile error caused by undeclared LEAKPROOF in kwlist.h.

> It must free the PGresults that PQgetResult() returns.

I'm sorry. It slipped out of my mind. Add PQclear() for the
return value.

> Also, please fix 2 issues mentined here:

- PQsetRowProcessorErrMsg() now handles msg as const string.

- Changed the order of the parameters of the type PQrowProcessor.
New order is (PGresult *res, PGrowValue *columns, void *params).

# PQsetRowProcessorErrMsg outside of callback is not implemented.

- Documentation and dblink are modified according to the changes
above.

By the way, I would like to ask you one question. What is the
reason why void* should be head or tail of the parameter list?

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
libpq_rowproc_20120223.patch text/x-patch 23.5 KB
libpq_rowproc_doc_20120223.patch text/x-patch 9.5 KB
dblink_use_rowproc_20120223.patch text/x-patch 12.8 KB
early_exit_20120223.diff text/x-patch 2.1 KB

From: Marko Kreen <markokr(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-23 12:34:16
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Feb 23, 2012 at 07:14:03PM +0900, Kyotaro HORIGUCHI wrote:
> Hello, this is new version of the patch.

Looks good.

> By the way, I would like to ask you one question. What is the
> reason why void* should be head or tail of the parameter list?

Aesthetical reasons:

1) PGresult and PGrowValue belong together.

2) void* is probably the context object for handler. When doing
object-oriented programming in C the main object is usually first.
Like libpq does - PGconn is always first argument.

But as libpq does not know the actual meaning of void* for handler,
is can be last param as well.

When I wrote the demo code, I noticed that it is unnatural to have
void* in the middle.

Last comment - if we drop the plan to make PQsetRowProcessorErrMsg()
usable outside of handler, we can simplify internal usage as well:
the PGresult->rowProcessorErrMsg can be dropped and let's use
->errMsg to transport the error message.

The PGresult is long-lived structure and adding fields for such
temporary usage feels wrong. There is no other libpq code between
row processor and getAnotherTuple, so the use is safe.

--
marko


From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
To: markokr(at)gmail(dot)com
Cc: greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-24 10:53:14
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Hello, this is new version of the patch.

> > By the way, I would like to ask you one question. What is the
> > reason why void* should be head or tail of the parameter list?
>
> Aesthetical reasons:

I got it. Thank you.

> Last comment - if we drop the plan to make PQsetRowProcessorErrMsg()
> usable outside of handler, we can simplify internal usage as well:
> the PGresult->rowProcessorErrMsg can be dropped and let's use
> ->errMsg to transport the error message.
>
> The PGresult is long-lived structure and adding fields for such
> temporary usage feels wrong. There is no other libpq code between
> row processor and getAnotherTuple, so the use is safe.

I almost agree with it. Plus, I think it is no reason to consider
out of memory as particular because now row processor becomes
generic. But the previous patch does different process for OOM
and others, but I couldn't see obvious reason to do so.

- PGresult.rowProcessorErrMes is removed and use PGconn.errMsg
instead with the new interface function PQresultSetErrMes().

- Now row processors should set message for any error status
occurred within. pqAddRow and dblink.c is modified to do so.

- getAnotherTuple() sets the error message `unknown error' for
the condition rp == 0 && ->errMsg == NULL.

- Always forward input cursor and do pqClearAsyncResult() and
pqSaveErrorResult() when rp == 0 in getAnotherTuple()
regardless whether ->errMsg is NULL or not in fe-protocol3.c.

- conn->inStart is already moved to the start point of the next
message when row processor is called. So forwarding inStart in
outOfMemory1 seems redundant. I removed it.

- printfPQExpBuffer() compains for variable message. So use
resetPQExpBuffer() and appendPQExpBufferStr() instead.

=====
- dblink does not use conn->errorMessage before, and also now...
all errors are displayed as `Error occurred on dblink connection...'.

- TODO: No NLS messages for error messages.

- Somehow make check yields error for base revision. So I have
not done that.

- I have no idea how to do test for protocol 2...

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
libpq_rowproc_20120224.patch text/x-patch 23.1 KB
libpq_rowproc_doc_20120224.patch text/x-patch 9.4 KB
dblink_use_rowproc_20120224.patch text/x-patch 13.0 KB
early_exit_20120224.diff text/x-patch 2.0 KB

From: Marko Kreen <markokr(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-24 12:11:45
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 24, 2012 at 07:53:14PM +0900, Kyotaro HORIGUCHI wrote:
> Hello, this is new version of the patch.
>
> > > By the way, I would like to ask you one question. What is the
> > > reason why void* should be head or tail of the parameter list?
> >
> > Aesthetical reasons:
>
> I got it. Thank you.
>
> > Last comment - if we drop the plan to make PQsetRowProcessorErrMsg()
> > usable outside of handler, we can simplify internal usage as well:
> > the PGresult->rowProcessorErrMsg can be dropped and let's use
> > ->errMsg to transport the error message.
> >
> > The PGresult is long-lived structure and adding fields for such
> > temporary usage feels wrong. There is no other libpq code between
> > row processor and getAnotherTuple, so the use is safe.
>
> I almost agree with it. Plus, I think it is no reason to consider
> out of memory as particular because now row processor becomes
> generic. But the previous patch does different process for OOM
> and others, but I couldn't see obvious reason to do so.

On OOM, the old result is freed to have higher chance that
constructing new result succeeds. But if we want to transport
error message, we need to keep old PGresult around. Thus
two separate paths.

This also means your new code is broken, the errmsg becomes
invalid after pqClearAsyncResult().

It's better to restore old two-path error handling.

> - Now row processors should set message for any error status
> occurred within. pqAddRow and dblink.c is modified to do so.

I don't think that should be required. Just use a dummy msg.

> - getAnotherTuple() sets the error message `unknown error' for
> the condition rp == 0 && ->errMsg == NULL.

Ok. I think most client will want to drop connection
on error from rowproc, so exact message does not matter.

> - Always forward input cursor and do pqClearAsyncResult() and
> pqSaveErrorResult() when rp == 0 in getAnotherTuple()
> regardless whether ->errMsg is NULL or not in fe-protocol3.c.

Ok. Although skipping packet on OOM does is dubious,
we will skip all further packets anyway, so let's be
consistent on problems.

There is still one EOF in v3 getAnotherTuple() - pqGetInt(tupnfields),
please turn that one also to protocolerror.

> - conn->inStart is already moved to the start point of the next
> message when row processor is called. So forwarding inStart in
> outOfMemory1 seems redundant. I removed it.
>
> - printfPQExpBuffer() compains for variable message. So use
> resetPQExpBuffer() and appendPQExpBufferStr() instead.

Instead use ("%s", errmsg) as argument there. libpq code
is noisy enough, no need to add more.

> - I have no idea how to do test for protocol 2...

I have a urge to test with "rm fe-protocol2.c"...

--
marko


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Let's drop V2 protocol
Date: 2012-02-24 13:52:10
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 24, 2012 at 02:11:45PM +0200, Marko Kreen wrote:
> On Fri, Feb 24, 2012 at 07:53:14PM +0900, Kyotaro HORIGUCHI wrote:
> > - I have no idea how to do test for protocol 2...
>
> I have a urge to test with "rm fe-protocol2.c"...

Now I tested with 7.3.21 and the non-error case works fine. Error state
does not - and not because patch is buggy, but because it has never
worked - V2 protocol has no working concept of skipping packets because
pending error state.

On OOM, V2 code does:

conn->inStart = conn->inEnd;

and hopes for the best, but it does not work, because on short results
it moves past ReadyForQuery, on long results it moves into middle of
some packet.

With user-specified row processor, we need to have a working
error state handling. With some surgery, it's possible to
introduce something like

if (conn->result->resultStatus != PGRES_TUPLES_OK)

into various places in the code, to ignore but still
parse the packets. But it will be rather non-trivial patch.

So could we like, uh, not do it and simply drop the V2 code?

Ofcourse, the row-processor patch does not make the situation worse,
so we could just say "don't use custom row processor with V2 servers",
but it still raises the question: "Does anyone have pre-7.4
servers around and if yes, then why does he need to use 9.2 libpq
to access those?"

--
marko


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Let's drop V2 protocol
Date: 2012-02-24 14:26:36
Message-ID: CAHyXU0wwR_nQZ3Dy+ZjCQ9_tr9E3FLRz_JAmP2y8gem8cYAj0w@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 24, 2012 at 7:52 AM, Marko Kreen <markokr(at)gmail(dot)com> wrote:
> On Fri, Feb 24, 2012 at 02:11:45PM +0200, Marko Kreen wrote:
>> On Fri, Feb 24, 2012 at 07:53:14PM +0900, Kyotaro HORIGUCHI wrote:
>> > - I have no idea how to do test for protocol 2...
>>
>> I have a urge to test with "rm fe-protocol2.c"...
>
> Now I tested with 7.3.21 and the non-error case works fine.  Error state
> does not - and not because patch is buggy, but because it has never
> worked - V2 protocol has no working concept of skipping packets because
> pending error state.
>
> On OOM, V2 code does:
>
>   conn->inStart = conn->inEnd;
>
> and hopes for the best, but it does not work, because on short results
> it moves past ReadyForQuery, on long results it moves into middle of
> some packet.
>
> With user-specified row processor, we need to have a working
> error state handling.  With some surgery, it's possible to
> introduce something like
>
>   if (conn->result->resultStatus != PGRES_TUPLES_OK)
>
> into various places in the code, to ignore but still
> parse the packets.  But it will be rather non-trivial patch.
>
> So could we like, uh, not do it and simply drop the V2 code?
>
>
> Ofcourse, the row-processor patch does not make the situation worse,
> so we could just say "don't use custom row processor with V2 servers",
> but it still raises the question: "Does anyone have pre-7.4
> servers around and if yes, then why does he need to use 9.2 libpq
> to access those?"

I think it's plausible that very old client libraries could connect to
a modern server. But it's pretty unlikely to have a 9.2 app contact
an ancient server IMO.

merlin


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Let's drop V2 protocol
Date: 2012-02-24 14:40:26
Message-ID: CACMqXCKmCONWb4Qejto+YSg6d3rvcnA7EbQd_E73UodNK6u1qQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 24, 2012 at 4:26 PM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> On Fri, Feb 24, 2012 at 7:52 AM, Marko Kreen <markokr(at)gmail(dot)com> wrote:
>> So could we like, uh, not do it and simply drop the V2 code?
>
> I think it's plausible that very old client libraries could connect to
> a modern server.  But it's pretty unlikely to have a 9.2 app contact
> an ancient server IMO.

We can drop it from libpq but keep the server-side support,
the codebase is different.

--
marko


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Let's drop V2 protocol
Date: 2012-02-24 16:46:19
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Marko Kreen <markokr(at)gmail(dot)com> writes:
> On Fri, Feb 24, 2012 at 4:26 PM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>> I think it's plausible that very old client libraries could connect to
>> a modern server. But it's pretty unlikely to have a 9.2 app contact
>> an ancient server IMO.

> We can drop it from libpq but keep the server-side support,
> the codebase is different.

I believe it's still somewhat common among JDBC users to force
V2-protocol connections as a workaround for over-eager prepared
statement planning. Although I think the issue they're trying to dodge
will be fixed as of 9.2, that doesn't mean the server-side support can
go away.

As for taking it out of libpq, I would vote against doing that as long
as we have pg_dump support for pre-7.4 servers. Now, I think it would
be entirely reasonable to kill pg_dump's support for pre-7.3 servers,
because that would simplify life in numerous ways for pg_dump; but 7.4
was not a big compatibility break for pg_dump so it seems a bit
arbitrary to kill its support for 7.3 specifically.

As near as I can tell the argument here is basically that we don't want
to try to fix unfixable bugs in the V2-protocol code. Well, yeah,
they're unfixable ... why do you think we invented V3? But they are
what they are, and as long as you don't run into those limitations the
protocol does still work. There are plenty of features that require V3
already, so I see no reason not to classify the row-processor stuff as
one more.

regards, tom lane


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Let's drop V2 protocol
Date: 2012-02-24 17:11:08
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 24, 2012 at 11:46:19AM -0500, Tom Lane wrote:
> As for taking it out of libpq, I would vote against doing that as long
> as we have pg_dump support for pre-7.4 servers. Now, I think it would
> be entirely reasonable to kill pg_dump's support for pre-7.3 servers,
> because that would simplify life in numerous ways for pg_dump; but 7.4
> was not a big compatibility break for pg_dump so it seems a bit
> arbitrary to kill its support for 7.3 specifically.

So we need to maintain V2 protocol in libpq to specifically support 7.3?

What's so special about 7.3?

> As near as I can tell the argument here is basically that we don't want
> to try to fix unfixable bugs in the V2-protocol code. Well, yeah,
> they're unfixable ... why do you think we invented V3? But they are
> what they are, and as long as you don't run into those limitations the
> protocol does still work. There are plenty of features that require V3
> already, so I see no reason not to classify the row-processor stuff as
> one more.

Agreed. But still - having to reorg the never-used V2 code in parallel
to actual development in V3 code makes all changes in the area
unnecessary harder.

--
marko


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Let's drop V2 protocol
Date: 2012-02-24 17:17:46
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Marko Kreen <markokr(at)gmail(dot)com> writes:
> On Fri, Feb 24, 2012 at 11:46:19AM -0500, Tom Lane wrote:
>> As for taking it out of libpq, I would vote against doing that as long
>> as we have pg_dump support for pre-7.4 servers. Now, I think it would
>> be entirely reasonable to kill pg_dump's support for pre-7.3 servers,
>> because that would simplify life in numerous ways for pg_dump; but 7.4
>> was not a big compatibility break for pg_dump so it seems a bit
>> arbitrary to kill its support for 7.3 specifically.

> So we need to maintain V2 protocol in libpq to specifically support 7.3?

> What's so special about 7.3?

From pg_dump's viewpoint, the main thing about 7.3 is it's where we
added server-tracked object dependencies. It also has schemas, though
I don't recall at the moment how much effort pg_dump has to spend on
faking those for an older server.

regards, tom lane


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Marko Kreen <markokr(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Let's drop V2 protocol
Date: 2012-02-24 18:24:58
Message-ID: CAHyXU0zVHr47Y6o-Ahg9wx8Hgh_naGwu+Xm3mNfZy35Xrr=nWg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 24, 2012 at 10:46 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I believe it's still somewhat common among JDBC users to force
> V2-protocol connections as a workaround for over-eager prepared
> statement planning.  Although I think the issue they're trying to dodge
> will be fixed as of 9.2, that doesn't mean the server-side support can
> go away.

good point. I just went through this. The JDBC driver has a
'prepareThreshold' directive that *mostly* disables server side
prepared statements so you can work with tools like pgbouncer.
Unfortunately, it doesn't work for explicit transaction control
statements so that you have to downgrade jdbc protocol or patch the
driver (which is really the better way to go, but I can understand why
that won't fly for a lot of people).

merlin


From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
To: markokr(at)gmail(dot)com
Cc: greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-27 08:20:30
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Hello,

> On OOM, the old result is freed to have higher chance that
> constructing new result succeeds. But if we want to transport
> error message, we need to keep old PGresult around. Thus
> two separate paths.

Ok, I understood the aim. But now we can use non-local exit to do
that for not only asynchronous reading (PQgetResult()) but
synchronous (PQexec()). If we should provide a means other than
exceptions to do that, I think it should be usable for both
syncronous and asynchronous reading. conn->asyncStatus seems to
be used for the case.

Wow is the modification below?

- getAnotherTuple() now returns 0 to continue as before, and 1
instead of EOF to signal EOF state, and 2 to instruct to exit
immediately.

- pqParseInput3 set conn->asyncStatus to PGASYNC_BREAK for the
last case,

- then PQgetResult() returns immediately when
asyncStatus == PGASYNC_TUPLES_BREAK after parseInput() retunes.

- and PQexecFinish() returns immediately if PQgetResult() returns
with aysncStatys == PGASYNC_TUPLES_BREAK.

- PGgetResult() sets asyncStatus = PGRES_TUPLES_OK if called with
asyncStatus == PGRES_TUPLES_BREAK

- New libpq API PQisBreaked(PGconn*) returns if asyncStatus ==
PGRES_TUPLES_BREAK can be used to check if the transfer is breaked.

> Instead use ("%s", errmsg) as argument there. libpq code
> is noisy enough, no need to add more.

Ok. I will do so.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-27 10:20:07
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Feb 27, 2012 at 05:20:30PM +0900, Kyotaro HORIGUCHI wrote:
> Hello,
>
> > On OOM, the old result is freed to have higher chance that
> > constructing new result succeeds. But if we want to transport
> > error message, we need to keep old PGresult around. Thus
> > two separate paths.
>
> Ok, I understood the aim. But now we can use non-local exit to do
> that for not only asynchronous reading (PQgetResult()) but
> synchronous (PQexec()). If we should provide a means other than
> exceptions to do that, I think it should be usable for both
> syncronous and asynchronous reading. conn->asyncStatus seems to
> be used for the case.
>
> Wow is the modification below?
>
> - getAnotherTuple() now returns 0 to continue as before, and 1
> instead of EOF to signal EOF state, and 2 to instruct to exit
> immediately.
>
> - pqParseInput3 set conn->asyncStatus to PGASYNC_BREAK for the
> last case,
>
> - then PQgetResult() returns immediately when
> asyncStatus == PGASYNC_TUPLES_BREAK after parseInput() retunes.
>
> - and PQexecFinish() returns immediately if PQgetResult() returns
> with aysncStatys == PGASYNC_TUPLES_BREAK.
>
> - PGgetResult() sets asyncStatus = PGRES_TUPLES_OK if called with
> asyncStatus == PGRES_TUPLES_BREAK
>
> - New libpq API PQisBreaked(PGconn*) returns if asyncStatus ==
> PGRES_TUPLES_BREAK can be used to check if the transfer is breaked.

I don't get where are you going with such changes. Are you trying to
make V2 code survive row-processor errors? (Only V2 has concept of
"EOF state".) Then forget about it. It's more important to not
destabilize V3 code.

And error from row processor is not something special from other errors.
Why does it need special state? And note that adding new PGASYNC or
PGRES state needs review of all if() and switch() statements in the
code that use those fields... So there must serious need for it.
At the moment I don't see such need.

I just asked you to replace ->rowProcessorErrMsg with ->errMsg
to get rid of unnecessary field.

But if you want to do bigger cleanup, then you can instead make
PQsetRowProcessorErrMsg() more generic:

PQsetErrorMessage(PGconn *conn, const char *msg)

Current code has the tiny problem that it adds new special state between
PQsetRowProcessorErrMsg() and return from callback to getAnotherTuple()
where getAnotherTuple() sets actual error state. When the callback
exits via exception, the state can leak to other code. It should not
break anything, but it's still annoying.

Also, with the PQgetRow() patch, the need for doing complex processing
under callback has decreased and the need to set error outside callback
has increased.

As a bonus, such generic error-setting function would lose any extra
special state introduced by row-processor patch.

Previously I mentioned that callback would need to have additional
PGconn* argument to make connection available to callback to use such
generic error setting function, but now I think it does not need it -
simple callbacks don't need to set errors and complex callback can make
the PGconn available via Param. Eg. the internal callback should set
Param to PGconn, instead keeping NULL there.

--
marko


From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
To: markokr(at)gmail(dot)com
Cc: greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-28 02:59:02
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Hello.

I will show you fixed version patch later, please wait for a
while.

======
> It's more important to not destabilize V3 code.

Ok, I withdraw that agreeing with that point. And I noticed that
the proposal before is totally a crap becuase I have mixed up
asyncStatus with resultStatus in it.

> And error from row processor is not something special from
> other errors. Why does it need special state?

I'm sorry to have upset the discussion. What I wanted there is a
means other than exceptions to exit out of PQexec() by row
processor trigger without discarding the result built halfway,
like async.

> I just asked you to replace ->rowProcessorErrMsg with ->errMsg
> to get rid of unnecessary field.

Ok, I will remove extra code.

> Also, with the PQgetRow() patch, the need for doing complex processing
> under callback has decreased and the need to set error outside callback
> has increased.
>
> As a bonus, such generic error-setting function would lose any extra
> special state introduced by row-processor patch.

That sounds nice. I will show you the patch without it for the
present, then try to include.

> Previously I mentioned that callback would need to have additional
> PGconn* argument to make connection available to callback to use such
> generic error setting function, but now I think it does not need it -
> simple callbacks don't need to set errors and complex callback can make
> the PGconn available via Param. Eg. the internal callback should set
> Param to PGconn, instead keeping NULL there.

I agree with it.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
To: markokr(at)gmail(dot)com
Cc: greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-28 08:04:44
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

This is the new version of the patch.
It is not rebased to the HEAD because of a build error.

> It's better to restore old two-path error handling.

I restorerd "OOM and save result" route. But it seems needed to
get back any amount of memory on REAL OOM as the comment in
original code says.

So I restored the meaning of rp == 0 && errMsg == NULL as REAL
OOM which is to throw the async result away and the result will
be preserved if errMsg is not NULL. `unknown error' has been
removed.

As the result, if row processor returns 0 the parser skips to the
end of rows and returns the working result or an error result
according to whether errMsg is set or not in the row processor.

> I don't think that should be required. Just use a dummy msg.

Considering the above, pqAddRow is also restored to leave errMsg
NULL on OOM.

> There is still one EOF in v3 getAnotherTuple() -
> pqGetInt(tupnfields), please turn that one also to
> protocolerror.

pqGetInt() returns EOF only when it wants additional reading from
network if the parameter `bytes' is appropreate. Non-zero return
from it seems should be handled as EOF, not a protocol error. The
one point I had modified bugilly is also restored. The so-called
'protocol error' has been vanished eventually.

> Instead use ("%s", errmsg) as argument there. libpq code
> is noisy enough, no need to add more.

done

Is there someting left undone?

By the way, I noticed that dblink always says that the current
connection is 'unnamed' in messages the errors in
dblink_record_internal(at)dblink(dot) I could see that
dblink_record_internal defines the local variable conname = NULL
and pass it to dblink_res_error to display the error message. But
no assignment on it in the function.

It seemed properly shown when I added the code to set conname
from PG_GETARG_TEXT_PP(0) if available, in other words do that
just after DBLINK_GET_CONN/DBLINK_GET_NAMED_CONN's. It seems the
dblink's manner... This is not included in this patch.

Furthurmore dblink_res_error looks only into returned PGresult to
display the error and always says only `Error occurred on dblink
connection..: could not execute query'..

Is it right to consider this as follows?

- dblink is wrong in error handling. A client of libpq should
see PGconn by PQerrorMessage() if (or regardless of whether?)
PGresult says nothing about error.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
libpq_rowproc_20120228.patch text/x-patch 23.8 KB
libpq_rowproc_doc_20120228.patch text/x-patch 9.6 KB
dblink_use_rowproc_20120228.patch text/x-patch 13.7 KB
early_exit_20120228.diff text/x-patch 2.0 KB

From: Marko Kreen <markokr(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-03-01 16:14:32
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 28, 2012 at 05:04:44PM +0900, Kyotaro HORIGUCHI wrote:
> > There is still one EOF in v3 getAnotherTuple() -
> > pqGetInt(tupnfields), please turn that one also to
> > protocolerror.
>
> pqGetInt() returns EOF only when it wants additional reading from
> network if the parameter `bytes' is appropreate. Non-zero return
> from it seems should be handled as EOF, not a protocol error. The
> one point I had modified bugilly is also restored. The so-called
> 'protocol error' has been vanished eventually.

But it's broken in V3 protocol - getAnotherTuple() will be called
only if the packet is fully read. If the packet contents do not
agree with packet header, it's protocol error. Only valid EOF
return in V3 getAnotherTuple() is when row processor asks
for early exit.

> Is there someting left undone?

* Convert old EOFs to protocol errors in V3 getAnotherTuple()

* V2 getAnotherTuple() can leak PGresult when handling custom
error from row processor.

* remove pqIsnonblocking(conn) check when row processor returned 2.
I missed that it's valid to call PQisBusy/PQconsumeInput/PQgetResult
on sync connection.

* It seems the return codes from callback should be remapped,
(0, 1, 2) is unusual pattern. Better would be:

-1 - error
0 - stop parsing / early exit ("I'm not done yet")
1 - OK ("I'm done with the row")

* Please drop PQsetRowProcessorErrMsg() / PQresultSetErrMsg().
Main problem is that it needs to be synced with error handling
in rest of libpq, which is unlike the rest of row processor patch,
which consists only of local changes. All solutions here
are either ugly hacks or too complex to be part of this patch.

Also considering that we have working exceptions and PQgetRow,
I don't see much need for custom error messages. If really needed,
it should be introduced as separate patch, as the area of code it
affects is completely different.

Currently the custom error messaging seems to be the blocker for
this patch, because of raised complexity when implementing it and
when reviewing it. Considering how unimportant the provided
functionality is, compared to rest of the patch, I think we should
simply drop it.

My suggestion - check in getAnotherTuple whether resultStatus is
already error and do nothing then. This allows internal pqAddRow
to set regular "out of memory" error. Otherwise give generic
"row processor error".

> By the way, I noticed that dblink always says that the current
> connection is 'unnamed' in messages the errors in
> dblink_record_internal(at)dblink(dot) I could see that
> dblink_record_internal defines the local variable conname = NULL
> and pass it to dblink_res_error to display the error message. But
> no assignment on it in the function.
>
> It seemed properly shown when I added the code to set conname
> from PG_GETARG_TEXT_PP(0) if available, in other words do that
> just after DBLINK_GET_CONN/DBLINK_GET_NAMED_CONN's. It seems the
> dblink's manner... This is not included in this patch.
>
> Furthurmore dblink_res_error looks only into returned PGresult to
> display the error and always says only `Error occurred on dblink
> connection..: could not execute query'..
>
> Is it right to consider this as follows?
>
> - dblink is wrong in error handling. A client of libpq should
> see PGconn by PQerrorMessage() if (or regardless of whether?)
> PGresult says nothing about error.

Yes, it seems like bug.

--
marko


From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
To: markokr(at)gmail(dot)com
Cc: greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-03-06 02:13:45
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Hello, I'm sorry for the abesnce.

> But it's broken in V3 protocol - getAnotherTuple() will be called
> only if the packet is fully read. If the packet contents do not
> agree with packet header, it's protocol error. Only valid EOF
> return in V3 getAnotherTuple() is when row processor asks
> for early exit.

Original code of getAnotherTuple returns EOF when the bytes to
be read is not fully loaded. I understand that this was
inappropriately (redundant checks?) written at least for the
pqGetInt() for the field length in getAnotherTuple. But I don't
understand how to secure the rows (or table data) fully loaded at
the point of getAnotherTuple called...

Nevertheles the first pgGetInt() can return EOF when the
previsous row is fully loaded but the next row is not loaded so
the EOF-rerurn seems necessary even if the each row will passed
after fully loaded.

> * Convert old EOFs to protocol errors in V3 getAnotherTuple()

Ok. I will do that.

> * V2 getAnotherTuple() can leak PGresult when handling custom
> error from row processor.

mmm. I will confirm it.

> * remove pqIsnonblocking(conn) check when row processor returned 2.
> I missed that it's valid to call PQisBusy/PQconsumeInput/PQgetResult
> on sync connection.

mmm. EOF from getAnotherTuple makes PQgetResult try furthur
reading until asyncStatus != PGASYNC_BUSY as far as I saw. And It
seemed to do so when I tried to remove 'return 2'. I think that
it is needed at least one additional state for asyncStatus to
work EOF as desied here.

> * It seems the return codes from callback should be remapped,
> (0, 1, 2) is unusual pattern. Better would be:
>
> -1 - error
> 0 - stop parsing / early exit ("I'm not done yet")
> 1 - OK ("I'm done with the row")

I almost agree with it. I will consider the suggestion related to
pqAddRow together.

> * Please drop PQsetRowProcessorErrMsg() / PQresultSetErrMsg().
> Main problem is that it needs to be synced with error handling
> in rest of libpq, which is unlike the rest of row processor patch,
> which consists only of local changes. All solutions here
> are either ugly hacks or too complex to be part of this patch.

Ok, I will take your advice.

> Also considering that we have working exceptions and PQgetRow,
> I don't see much need for custom error messages. If really needed,
> it should be introduced as separate patch, as the area of code it
> affects is completely different.

I agree with it.

> Currently the custom error messaging seems to be the blocker for
> this patch, because of raised complexity when implementing it and
> when reviewing it. Considering how unimportant the provided
> functionality is, compared to rest of the patch, I think we should
> simply drop it.

Ok.

> My suggestion - check in getAnotherTuple whether resultStatus is
> already error and do nothing then. This allows internal pqAddRow
> to set regular "out of memory" error. Otherwise give generic
> "row processor error".

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-03-06 08:13:31
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Mar 06, 2012 at 11:13:45AM +0900, Kyotaro HORIGUCHI wrote:
> > But it's broken in V3 protocol - getAnotherTuple() will be called
> > only if the packet is fully read. If the packet contents do not
> > agree with packet header, it's protocol error. Only valid EOF
> > return in V3 getAnotherTuple() is when row processor asks
> > for early exit.
>
> Original code of getAnotherTuple returns EOF when the bytes to
> be read is not fully loaded. I understand that this was
> inappropriately (redundant checks?) written at least for the
> pqGetInt() for the field length in getAnotherTuple. But I don't
> understand how to secure the rows (or table data) fully loaded at
> the point of getAnotherTuple called...

Look into pqParseInput3():

if (avail < msgLength)
{
...
return;
}

> > * remove pqIsnonblocking(conn) check when row processor returned 2.
> > I missed that it's valid to call PQisBusy/PQconsumeInput/PQgetResult
> > on sync connection.
>
> mmm. EOF from getAnotherTuple makes PQgetResult try furthur
> reading until asyncStatus != PGASYNC_BUSY as far as I saw. And It
> seemed to do so when I tried to remove 'return 2'. I think that
> it is needed at least one additional state for asyncStatus to
> work EOF as desied here.

No. It's valid to do PQisBusy() + PQconsumeInput() loop until
PQisBusy() returns 0. Otherwise, yes, PQgetResult() will
block until final result is available. But thats OK.

--
marko


From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
To: markokr(at)gmail(dot)com
Cc: greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-03-07 00:43:54
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Hello,

> But I don't understand how to secure the rows (or table data)
> fully loaded at the point of getAnotherTuple called...

I found how pqParseInput ensures the entire message is loaded
before getAnotherTuple called.

fe-protocol3.c:107
| avail = conn->inEnd - conn->inCursor;
| if (avail < msgLength)
| {
| if (pqCheckInBufferSpace(conn->inCursor + (size_t)msgLength, conn))

So now I convinced that the whole row data is loaded at the point
that getAnotherTuple is called. I agree that getAnotherTuple
should not return EOF to request for unloaded part of the
message.

Please wait for a while for the new patch.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
To: markokr(at)gmail(dot)com
Cc: greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-03-07 01:04:28
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Hello,

Nevertheless, the problem that exit from parseInput() caused by
non-zero return of getAnotherTuple() results in immediate
re-enter into getAnotherTuple() via parseInput() and no other
side effect is still there. But I will do that in the next patch,
though.

> So now I convinced that the whole row data is loaded at the point
^am
> that getAnotherTuple is called. I agree that getAnotherTuple
> should not return EOF to request for unloaded part of the
> message.
>
> Please wait for a while for the new patch.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
To: markokr(at)gmail(dot)com
Cc: greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-03-07 06:14:57
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Hello, this is new version of the patch.

# early_exit.diff is not included for this time and maybe also
# later. The set of the return values of PQrowProcessor looks
# unnatural if the 0 is removed.

> * Convert old EOFs to protocol errors in V3 getAnotherTuple()

done.

> * V2 getAnotherTuple() can leak PGresult when handling custom
> error from row processor.

Custom error message is removed from both V2 and V3.

> * remove pqIsnonblocking(conn) check when row processor returned 2.
> I missed that it's valid to call PQisBusy/PQconsumeInput/PQgetResult
> on sync connection.

done. This affects PQisBusy, but PQgetResult won't be affected as
far as I see. I have no idea for PQconsumeInput()..

> * It seems the return codes from callback should be remapped,
> (0, 1, 2) is unusual pattern. Better would be:
>
> -1 - error
> 0 - stop parsing / early exit ("I'm not done yet")
> 1 - OK ("I'm done with the row")

done.

This might be described more precisely as follows,

| -1 - error - erase result and change result status to
| - FATAL_ERROR All the rest rows in current result
| - will skipped(consumed).
| 0 - stop parsing / early exit ("I'm not done yet")
| - getAnotherTuple returns EOF without dropping PGresult.
# - We expect PQisBusy(), PQconsumeInput()(?) and
# - PQgetResult() to exit immediately and we can
# - call PQgetResult(), PQskipResult() or
# - PQisBusy() after.
| 1 - OK ("I'm done with the row")
| - save result and getAnotherTuple returns 0.

The lines prefixed with '#' is the desirable behavior I have
understood from the discussion so far. And I doubt that it works
as we expected for PQgetResult().

> * Please drop PQsetRowProcessorErrMsg() / PQresultSetErrMsg().

done.

> My suggestion - check in getAnotherTuple whether resultStatus is
> already error and do nothing then. This allows internal pqAddRow
> to set regular "out of memory" error. Otherwise give generic
> "row processor error".

Current implement seems already doing this in
parseInput3(). Could you give me further explanation?

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
libpq_rowproc_20120307.patch text/x-patch 22.1 KB
libpq_rowproc_doc_20120307.patch text/x-patch 8.1 KB
dblink_use_rowproc_20120307.patch text/x-patch 13.5 KB

From: Marko Kreen <markokr(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-03-07 20:21:57
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 07, 2012 at 03:14:57PM +0900, Kyotaro HORIGUCHI wrote:
> > * remove pqIsnonblocking(conn) check when row processor returned 2.
> > I missed that it's valid to call PQisBusy/PQconsumeInput/PQgetResult
> > on sync connection.
>
> done. This affects PQisBusy, but PQgetResult won't be affected as
> far as I see. I have no idea for PQconsumeInput()..

PQconsumeInput does not parse the row and there is no need to
do anything with PQgetResult().

> > * It seems the return codes from callback should be remapped,
> > (0, 1, 2) is unusual pattern. Better would be:
> >
> > -1 - error
> > 0 - stop parsing / early exit ("I'm not done yet")
> > 1 - OK ("I'm done with the row")
>
> done.
>
> This might be described more precisely as follows,
>
> | -1 - error - erase result and change result status to
> | - FATAL_ERROR All the rest rows in current result
> | - will skipped(consumed).
> | 0 - stop parsing / early exit ("I'm not done yet")
> | - getAnotherTuple returns EOF without dropping PGresult.
> # - We expect PQisBusy(), PQconsumeInput()(?) and
> # - PQgetResult() to exit immediately and we can
> # - call PQgetResult(), PQskipResult() or
> # - PQisBusy() after.
> | 1 - OK ("I'm done with the row")
> | - save result and getAnotherTuple returns 0.
>
> The lines prefixed with '#' is the desirable behavior I have
> understood from the discussion so far. And I doubt that it works
> as we expected for PQgetResult().

No, the desirable behavior is already implemented and documented -
the "stop parsing" return code affects only PQisBusy(). As that
is the function that does the actual parsing.

The main plus if such scheme is that we do not change the behaviour
of any existing APIs.

> > * Please drop PQsetRowProcessorErrMsg() / PQresultSetErrMsg().
>
> done.

You optimized libpq_gettext() calls, but it's wrong - they must
wrap the actual strings so that the strings can be extracted
for translating.

Fixed in attached patch.

> > My suggestion - check in getAnotherTuple whether resultStatus is
> > already error and do nothing then. This allows internal pqAddRow
> > to set regular "out of memory" error. Otherwise give generic
> > "row processor error".
>
> Current implement seems already doing this in
> parseInput3(). Could you give me further explanation?

The suggestion was about getAnotherTuple() - currently it sets
always "error in row processor". With such check, the callback
can set the error result itself. Currently only callbacks that
live inside libpq can set errors, but if we happen to expose
error-setting function in outside API, then the getAnotherTuple()
would already be ready for it.

See attached patch, I did some generic comment/docs cleanups
but also minor code cleanups:

- PQsetRowProcessor(NULL,NULL) sets Param to PGconn, instead NULL,
this makes PGconn available to pqAddRow.

- pqAddRow sets "out of memory" error itself on PGconn.

- getAnotherTuple(): when callback returns -1, it checks if error
message is set, does nothing then.

- put libpq_gettext() back around strings.

- dropped the error_saveresult label, it was unnecessary branch.

- made functions survive conn==NULL.

- dblink: changed skipAll parameter for PQskipResult() to TRUE,
as dblink uses PQexec which can send several queries.

- dblink: refreshed regtest result, as now we get actual
connection name in error message.

- Synced PQgetRow patch with return value changes.

- Synced demos at https://github.com/markokr/libpq-rowproc-demos
with return value changes.

I'm pretty happy with current state. So tagging it
ReadyForCommitter.

--
marko

Attachment Content-Type Size
libpq_rowproc_doc_2012-03-07v2.diff text/x-diff 9.6 KB
libpq_rowproc_2012-03-07v2.diff text/x-diff 26.0 KB
dblink_rowproc_2012-03-07v2.diff text/x-diff 16.2 KB
pqgetrow-v2.diff text/x-diff 8.1 KB

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
To: markokr(at)gmail(dot)com
Cc: greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-03-08 05:35:21
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Hello,

> > # - We expect PQisBusy(), PQconsumeInput()(?) and
> > # - PQgetResult() to exit immediately and we can
> > # - call PQgetResult(), PQskipResult() or
> > # - PQisBusy() after.
> > | 1 - OK ("I'm done with the row")
> > | - save result and getAnotherTuple returns 0.
> >
> > The lines prefixed with '#' is the desirable behavior I have
> > understood from the discussion so far. And I doubt that it works
> > as we expected for PQgetResult().
>
> No, the desirable behavior is already implemented and documented -
> the "stop parsing" return code affects only PQisBusy(). As that
> is the function that does the actual parsing.

I am satisfied with your answer. Thank you.

> The main plus if such scheme is that we do not change the behaviour
> of any existing APIs.

I agree with the policy.

> You optimized libpq_gettext() calls, but it's wrong - they must
> wrap the actual strings so that the strings can be extracted
> for translating.

Ouch. I remember to have had an build error about that before...

> Fixed in attached patch.

I'm sorry to annoy you.

> > > My suggestion - check in getAnotherTuple whether resultStatus is
> > > already error and do nothing then. This allows internal pqAddRow
> > > to set regular "out of memory" error. Otherwise give generic
> > > "row processor error".
..
> The suggestion was about getAnotherTuple() - currently it sets
> always "error in row processor". With such check, the callback
> can set the error result itself. Currently only callbacks that
> live inside libpq can set errors, but if we happen to expose
> error-setting function in outside API, then the getAnotherTuple()
> would already be ready for it.

I see. And I found it implemented in your patch.

> See attached patch, I did some generic comment/docs cleanups
> but also minor code cleanups:
>
> - PQsetRowProcessor(NULL,NULL) sets Param to PGconn, instead NULL,
> - pqAddRow sets "out of memory" error itself on PGconn.
> - getAnotherTuple(): when callback returns -1, it checks if error
> - dropped the error_saveresult label, it was unnecessary branch.

Ok, I've confirmed them.

> - put libpq_gettext() back around strings.
> - made functions survive conn==NULL.
> - dblink: refreshed regtest result, as now we get actual

Thank you for fixing my bugs.

> - dblink: changed skipAll parameter for PQskipResult() to TRUE,
> as dblink uses PQexec which can send several queries.

I agree to the change. I intended to allow to receive the results
after skipping the current result for failure. But that seems not
only not very likely, but also to be something dangerous.

> - Synced PQgetRow patch with return value changes.
>
> - Synced demos at https://github.com/markokr/libpq-rowproc-demos
> with return value changes.

> I'm pretty happy with current state. So tagging it
> ReadyForCommitter.

Thank you very much for all your help.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-03-22 22:07:16
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Marko Kreen <markokr(at)gmail(dot)com> writes:
> [ patches against libpq and dblink ]

I think this patch needs a lot of work.

AFAICT it breaks async processing entirely, because many changes have been
made that fail to distinguish "insufficient data available as yet" from
"hard error". As an example, this code at the beginning of
getAnotherTuple:

/* Get the field count and make sure it's what we expect */
if (pqGetInt(&tupnfields, 2, conn))
! return EOF;

is considering three cases: it got a 2-byte integer (and can continue on),
or there aren't yet 2 more bytes available in the buffer, in which case it
should return EOF without doing anything, or pqGetInt detected a hard
error and updated the connection error state accordingly, in which case
again there is nothing to do except return EOF. In the patched code we
have:

/* Get the field count and make sure it's what we expect */
if (pqGetInt(&tupnfields, 2, conn))
! {
! /* Whole the message must be loaded on the buffer here */
! errmsg = libpq_gettext("protocol error\n");
! goto error_and_forward;
! }

which handles neither the second nor third case correctly: it thinks that
"data not here yet" is a hard error, and then makes sure it is an error by
destroying the parsing state :-(. And if in fact pqGetInt did log an
error, that possibly-useful error message is overwritten with an entirely
useless "protocol error" text.

I don't think the error return cases for the row processor have been
thought out too well either. The row processor is not in charge of what
happens to the PGresult, and it certainly has no business telling libpq to
just "exit immediately from the topmost libpq function". If we do that
we'll probably lose sync with the data stream and be unable to recover use
of the connection at all. Also, do we need to consider any error cases
for the row processor other than out-of-memory? If so it might be a good
idea for it to have some ability to store a custom error message into the
PGconn, which it cannot do given the current function API.

In the same vein, I am fairly uncomfortable with the blithe assertion that
a row processor can safely longjmp out of libpq. This was never foreseen
in the original library coding and there are any number of places that
that might break, now or in the future. Do we really need to allow it?
If we do, it would be a good idea to decorate the libpq functions that are
now expected to possibly longjmp with comments saying so. Tracing all the
potential call paths that might be aborted by a longjmp is an essential
activity anyway.

Another design deficiency is PQskipResult(). This is badly designed for
async operation because once you call it, it will absolutely not give back
control until it's read the whole query result. (It'd be better to have
it set some kind of flag that makes future library calls discard data
until the query result end is reached.) Something else that's not very
well-designed about it is that it might throw away more than just incoming
tuples. As an example, suppose that the incoming data at the time you
call it consists of half a dozen rows followed by an ErrorResponse. The
ErrorResponse will be eaten and discarded, leaving the application no clue
why its transaction has been aborted, or perhaps even the entire session
cancelled. What we probably want here is just to transiently install a
row processor that discards all incoming data, but the application is
still expected to eventually fetch a PGresult that will tell it whether
the server side of the query completed or not.

In the dblink patch, given that you have to worry about elogs coming out
of BuildTupleFromCStrings and tuplestore_puttuple anyway, it's not clear
what is the benefit of using malloc rather than palloc to manage the
intermediate buffers in storeHandler --- what that seems to mostly
accomplish is increase the risk of permanent memory leaks. I don't see
much value in per-column buffers either; it'd likely be cheaper to just
palloc one workspace that's big enough for all the column strings
together. And speaking of leaks, doesn't storeHandler leak the
constructed tuple on each call, not to mention whatever might be leaked by
the called datatype input functions?

It also seems to me that the dblink patch breaks the case formerly handled
in materializeResult() of a PGRES_COMMAND_OK rather than PGRES_TUPLES_OK
result. The COMMAND case is supposed to be converted to a single-column
text result, and I sure don't see where that's happening now.

BTW, am I right in thinking that some of the hunks in this patch are meant
to fix a pre-existing bug of failing to report the correct connection
name? If so, it'd likely be better to split those out and submit as a
separate patch, instead of conflating them with a feature addition.

Lastly, as to the pqgetrow patch, the lack of any demonstrated test case
for these functions makes me uncomfortable as to whether they are well
designed. Again, I'm unconvinced that the error handling is good or that
they work sanely in async mode. I'm inclined to just drop these for this
go-round, and to stick to providing the features that we can test via the
dblink patch.

regards, tom lane


From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: markokr(at)gmail(dot)com, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-03-23 09:08:04
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Thank you for picking up.

> is considering three cases: it got a 2-byte integer (and can continue on),
> or there aren't yet 2 more bytes available in the buffer, in which case it
> should return EOF without doing anything, or pqGetInt detected a hard
> error and updated the connection error state accordingly, in which case
> again there is nothing to do except return EOF. In the patched code we
> have:
...
> which handles neither the second nor third case correctly: it thinks that
> "data not here yet" is a hard error, and then makes sure it is an error by
> destroying the parsing state :-(.

Marko and I think that, in protocol 3, all bytes of the incoming
message should have been surely loaded when entering
getAnotherTuple(). The following part In pqParseInput3() does
this.

| if (avail < msgLength)
| {
| /*
| * Before returning, enlarge the input buffer if needed to hold
| * the whole message.
| (snipped)..
| */
| if (pqCheckInBufferSpace(conn->inCursor + (size_t) msgLength,
| conn))
| {
| /*
| * XXX add some better recovery code...
| (snipped)..
| */
| handleSyncLoss(conn, id, msgLength);
| }
| return;

So, if cursor state is broken just after exiting
getAnotherTuple(), it had already been broken BEFORE entering
getAnotherTuple() according to current disign. That is the
'protocol error' means. pqGetInt there should not detect any
errors except for broken message.

> error, that possibly-useful error message is overwritten with an entirely
> useless "protocol error" text.

Plus, current pqGetInt seems to set its own error message only
for the wrong parameter 'bytes'.

On the other hand, in protocol 2 (to be removed ?) the error
handling mechanism get touched, because full-load of the message
is not guraranteed.

> I don't think the error return cases for the row processor have been
> thought out too well either. The row processor is not in charge of what
> happens to the PGresult,

Default row processor stuffs PGresult with tuples, another (say
that of dblink) leave it empty. Row processor manages PGresult by
the extent of their own.

> and it certainly has no business telling libpq to just "exit
> immediately from the topmost libpq function". If we do that
> we'll probably lose sync with the data stream and be unable to
> recover use of the connection at all.

I don't think PGresult has any charge of error handling system in
current implement. The phrase 'exit immediately from the topmost
libpq function' should not be able to be seen in the patch.

The exit routes from row processor are following,

- Do longjmp (or PG_PG_TRY-CATCH mechanism) out of the row
processor.

- Row processor returns 0 when entered from PQisBusy(),
immediately exit from PQisBusy().

Curosor consistency will be kept in both case. The cursor already
be on the next to the last byte of the current message.

> Also, do we need to consider any error cases for the row
> processor other than out-of-memory? If so it might be a good
> idea for it to have some ability to store a custom error
> message into the PGconn, which it cannot do given the current
> function API.

It seems not have so strong necessity concerning dblink or
PQgetRow comparing to expected additional complexity around. So
this patch does not include it.

> In the same vein, I am fairly uncomfortable with the blithe assertion that
> a row processor can safely longjmp out of libpq. This was never foreseen
> in the original library coding and there are any number of places that
> that might break, now or in the future. Do we really need to allow it?

To protect row processor from longjmp'ing out, I enclosed the
functions potentially throw exception by PG_TRY-CATCH clause in
the early verson. This was totally safe but the penalty was not
negligible because the TRY-CATCH was passed for every row.

> If we do, it would be a good idea to decorate the libpq
> functions that are now expected to possibly longjmp with
> comments saying so. Tracing all the potential call paths that
> might be aborted by a longjmp is an essential activity anyway.

Concerning now but the future, I can show you the trail of
confirmation process.

- There is no difference between with and without the patch at
the level of getAnotherTuple() from the view of consistency.

- Assuming pqParseInput3 detects the next message has not come
after getAnotherTuple returned. It exits immediately on reading
the length of the next message. This is the same condition to
longjumping.

> if (pqGetInt(&msgLength, 4, conn))
> return;

- parseInput passes it through and immediately exits in
consistent state.

- The caller of PQgetResult, PQisBusy, PQskipResult, PQnotifies,
PQputCopyData, pqHandleSendFailure gain the control finally. I
am convinced that the async status at the time must be
PGASYNC_BUSY and the conn cursor in consistent state.

So the ancestor of row processor is encouraged to call
PQfinish, PQgetResult, PQskipResult after getting longjmped in
the document. These functions should resolve the intermediate
status described above created by longjmp by restarting parsing
the stream afterwards

And about the future, altough it is a matter of cource that every
touch on the code will cause every destruction, longjmp stepping
over libpq internal code seems something complicated which is
enough to cause trouble. I will marking as 'This function is
skipped over by the longjmp invoked in the descendents.' (Better
expressions are welcome..)

> Another design deficiency is PQskipResult(). This is badly
> designed for async operation because once you call it, it will
> absolutely not give back control until it's read the whole
> query result. (It'd be better to have it set some kind of flag
> that makes future library calls discard data until the query
> result end is reached.)

If this function is called just after getting longjmp from row
processor, the async status of the connection at the time must be
PGASYNC_BUSY. So I think this function should always returns even
if the longjmp takes place at the last row in a result. There
must be following 'C' message if not broken.

> Something else that's not very well-designed about it is that
> it might throw away more than just incoming tuples. As an
> example, suppose that the incoming data at the time you call it
> consists of half a dozen rows followed by an ErrorResponse.
> The ErrorResponse will be eaten and discarded, leaving the
> application no clue why its transaction has been aborted, or
> perhaps even the entire session cancelled.

If the caller needs to see the ErrorResposes following, I think
calling PQgetResult seems enough.

> What we probably want here is just to transiently install a
> row processor that
> discards all incoming data, but the application is still
> expected to eventually fetch a PGresult that will tell it
> whether the server side of the query completed or not.

The all-dicarding row processor should not be visible to the
user. The users could design the row processor to behave so if
they want. This is mere a shortcut but it seems difficult to do
so without.

> In the dblink patch, ... it's not clear what is the benefit of
> using malloc rather than palloc to manage the intermediate
> buffers in storeHandler --- what that seems to mostly
> accomplish is increase the risk of permanent memory leaks.

Hmm. I thought that palloc is heavier than malloc, and the
allocated blocks looked well controlled so that there won't be
likely to leak. I will change to use palloc's counting the
discussion about block granurality below.

> I don't see much value in per-column buffers either; it'd
> likely be cheaper to just palloc one workspace that's big
> enough for all the column strings together.

I had hated to count the total length prior to copying the
contents in the early version. In the latest the total
requirement of the memory is easily obatined. So it seems good to
alloc together. I'll do so.

> And speaking of leaks, doesn't storeHandler leak the
> constructed tuple on each call, not to mention whatever might
> be leaked by the called datatype input functions?

I copied the process in the original materializeResult into
storeHandler. tuplestore_donestoring was removed because it is
obsoleted. So I think it is no problem about it. Am I wrong?

> It also seems to me that the dblink patch breaks the case
> formerly handled in materializeResult() of a PGRES_COMMAND_OK
> rather than PGRES_TUPLES_OK result. The COMMAND case is
> supposed to be converted to a single-column text result, and I
> sure don't see where that's happening now.

I'm sorry. I will restore that route.

> BTW, am I right in thinking that some of the hunks in this
> patch are meant to fix a pre-existing bug of failing to report
> the correct connection name? If so, it'd likely be better to
> split those out and submit as a separate patch, instead of
> conflating them with a feature addition.

Ok. I will split it.

> Lastly, as to the pqgetrow patch, the lack of any demonstrated
> test case for these functions makes me uncomfortable as to
> whether they are well designed. Again, I'm unconvinced that
> the error handling is good or that they work sanely in async
> mode. I'm inclined to just drop these for this go-round, and
> to stick to providing the features that we can test via the
> dblink patch.

Testing pqGetRow via dblink?

Do you mean 'drop these' as pqGetRow? So, this part might be
droppable apart from the rest.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-03-24 00:22:24
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

I saw Kyotaro already answered, but I give my view as well.

On Thu, Mar 22, 2012 at 06:07:16PM -0400, Tom Lane wrote:
> AFAICT it breaks async processing entirely, because many changes have been
> made that fail to distinguish "insufficient data available as yet" from
> "hard error". As an example, this code at the beginning of
> getAnotherTuple:
>
> /* Get the field count and make sure it's what we expect */
> if (pqGetInt(&tupnfields, 2, conn))
> ! return EOF;
>
> is considering three cases: it got a 2-byte integer (and can continue on),
> or there aren't yet 2 more bytes available in the buffer, in which case it
> should return EOF without doing anything, or pqGetInt detected a hard
> error and updated the connection error state accordingly, in which case
> again there is nothing to do except return EOF. In the patched code we
> have:
>
> /* Get the field count and make sure it's what we expect */
> if (pqGetInt(&tupnfields, 2, conn))
> ! {
> ! /* Whole the message must be loaded on the buffer here */
> ! errmsg = libpq_gettext("protocol error\n");
> ! goto error_and_forward;
> ! }
>
> which handles neither the second nor third case correctly: it thinks that
> "data not here yet" is a hard error, and then makes sure it is an error by
> destroying the parsing state :-(. And if in fact pqGetInt did log an
> error, that possibly-useful error message is overwritten with an entirely
> useless "protocol error" text.

No, "protocol error" really is only error case here.

- pqGetInt() does not set errors.

- V3 getAnotherTuple() is called only if packet is fully in buffer.

> I don't think the error return cases for the row processor have been
> thought out too well either. The row processor is not in charge of what
> happens to the PGresult, and it certainly has no business telling libpq to
> just "exit immediately from the topmost libpq function". If we do that
> we'll probably lose sync with the data stream and be unable to recover use
> of the connection at all. Also, do we need to consider any error cases
> for the row processor other than out-of-memory?

No, the rule is *not* "exit to topmost", but "exit PQisBusy()".

This is exactly so that if any code that does not expect row-processor
behaviour continues to work.

Also, from programmers POV, this also means row-processor callback causes
minimal changes to existing APIs.

> If so it might be a good
> idea for it to have some ability to store a custom error message into the
> PGconn, which it cannot do given the current function API.

There already was such function, but it was row-processor specific hack
that could leak out and create confusion. I rejected it. Instead there
should be generic error setting function, equivalent to current libpq
internal error setting.

But such generic error setting function would need review all libpq
error states as it allows error state appear in new situations. Also
we need to have well-defined behaviour of client-side errors vs. incoming
server errors.

Considering that even current cut-down patch is troubling committers,
I would definitely suggest postponing such generic error setter to 9.3.

Especially as it does not change anything coding-style-wise.

> In the same vein, I am fairly uncomfortable with the blithe assertion that
> a row processor can safely longjmp out of libpq. This was never foreseen
> in the original library coding and there are any number of places that
> that might break, now or in the future. Do we really need to allow it?
> If we do, it would be a good idea to decorate the libpq functions that are
> now expected to possibly longjmp with comments saying so. Tracing all the
> potential call paths that might be aborted by a longjmp is an essential
> activity anyway.

I think we *should* allow exceptions, but in limited number of APIs.

Basically, the usefulness for users vs. effort from our side
is clearly on the side of providing it.

But its up to us to define what the *limited* means (what needs
least effort from us), so that later when users want to use exceptions
in callback, they need to pick right API.

Currently it seems only PQexec() + multiple SELECTS can give trouble,
as previous PGresult is kept in stack. Should we unsupport
PQexec or multiple SELECTS?

But such case it borken even without exceptions - or at least
very confusing. Not sure what to do with it.

In any case, "decorating" libpq functions is wrong approach. This gives
suggestion that caller of eg. PQexec() needs to take care of any possible
behaviour of unknown callback. This will not work. Instead allowed
functions should be simply listed in row-processor documentation.

Basically custom callback should be always matched by caller that
knows about it and knows how to handle it. Not sure how to put
such suggestion into documentation tho'.

> Another design deficiency is PQskipResult(). This is badly designed for
> async operation because once you call it, it will absolutely not give back
> control until it's read the whole query result. (It'd be better to have
> it set some kind of flag that makes future library calls discard data
> until the query result end is reached.) Something else that's not very
> well-designed about it is that it might throw away more than just incoming
> tuples. As an example, suppose that the incoming data at the time you
> call it consists of half a dozen rows followed by an ErrorResponse. The
> ErrorResponse will be eaten and discarded, leaving the application no clue
> why its transaction has been aborted, or perhaps even the entire session
> cancelled. What we probably want here is just to transiently install a
> row processor that discards all incoming data, but the application is
> still expected to eventually fetch a PGresult that will tell it whether
> the server side of the query completed or not.

I guess it's designed for rolling connection forward in exception
handler... And it's blocking-only indeed. Considering that better
approach is to drop the connection, instead trying to save it,
it's usefulness is questionable.

I'm OK with dropping it.

> Lastly, as to the pqgetrow patch, the lack of any demonstrated test case
> for these functions makes me uncomfortable as to whether they are well
> designed. Again, I'm unconvinced that the error handling is good or that
> they work sanely in async mode. I'm inclined to just drop these for this
> go-round, and to stick to providing the features that we can test via the
> dblink patch.

I simplified the github test cases and attached as patch.

Could you please give more concrete critique of the API?

Main idea behing PQgetRow is that it does not replace any existing
API function, instead acts as addition:

* Sync case: PQsendQuery() + PQgetResult - PQgetRow should be called
before PQgetResult until it returns NULL, then proceed with PQgetResult
to get final state.

* Async case: PQsendQuery() + PQconsumeInput() + PQisBusy() + PQgetResult().
Here PQgetRow() should be called before PQisBusy() until it returns
NULL, then proceed with PQisBusy() as usual.

It only returns rows, never any error state PGresults.

Main advantage of including PQgetRow() together with low-level
rowproc API is that it allows de-emphasizing more complex parts of
rowproc API (exceptions, early exit, skipresult, custom error msg).
And drop/undocument them or simply call them postgres-internal-only.

--
marko

Attachment Content-Type Size
rowproc-demos.diff text/x-diff 10.8 KB

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: markokr(at)gmail(dot)com, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-03-26 08:54:46
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Hello, This is new version of patch for dblink using row processor.

- Use palloc to allocate temporaly memoriy blocks.

- Memory allocation is now done in once. Preallocate a block of
initial size and palloc simplified reallocation code.

- Resurrected the route for command invoking. And small
adjustment of behavior on error.

- Modification to fix connection name missing bug is removed out
to another patch.

- Commenting on the functions skipped over by lonjmp is
withholded according to Marko's discussion.

- rebased to e8476f46fc847060250c92ec9b310559293087fc

dblink_use_rowproc_20120326.patch - dblink row processor patch.
dblink_connname_20120326.patch - dblink connname fix patch.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
dblink_use_rowproc_20120326.patch text/x-patch 13.0 KB
dblink_connname_20120326.patch text/x-patch 625 bytes

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: markokr(at)gmail(dot)com, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-03-27 02:20:42
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

I'm sorry to have coded a silly bug.

The previous patch has a bug in realloc size calculation.
And separation of the 'connname patch' was incomplete in regtest.
It is fixed in this patch.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
dblink_use_rowproc_20120327.patch text/x-patch 13.1 KB
dblink_connname_20120327.patch text/x-patch 1.2 KB

From: Marko Kreen <markokr(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-03-27 20:41:18
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Mar 24, 2012 at 02:22:24AM +0200, Marko Kreen wrote:
> Main advantage of including PQgetRow() together with low-level
> rowproc API is that it allows de-emphasizing more complex parts of
> rowproc API (exceptions, early exit, skipresult, custom error msg).
> And drop/undocument them or simply call them postgres-internal-only.

I thought more about exceptions and PQgetRow and found
interesting pattern:

- Exceptions are problematic if always allowed. Although PQexec() is
easy to fix in current code, trying to keep to promise of "exceptions
are allowed from everywhere" adds non-trivial maintainability overhead
to future libpq changes, so instead we should simply fix documentation.
Especially as I cannot see any usage scenario that would benefit from
such promise.

- Multiple SELECTs from PQexec() are problematic even without
exceptions: additional documentation is needed how to detect
that rows are coming from new statement.

Now the interesting one:

- PQregisterProcessor() API allows changing the callback permanently.
Thus breaking any user code which simply calls PQexec()
and expects regular PGresult back. Again, nothing to fix
code-wise, need to document that callback should be set
only for current query, later changed back.

My conclusion is that row-processor API is low-level expert API and
quite easy to misuse. It would be preferable to have something more
robust as end-user API, the PQgetRow() is my suggestion for that.
Thus I see 3 choices:

1) Push row-processor as main API anyway and describe all dangerous
scenarios in documentation.
2) Have both PQgetRow() and row-processor available in <libpq-fe.h>,
PQgetRow() as preferred API and row-processor for expert usage,
with proper documentation what works and what does not.
3) Have PQgetRow() in <libpq-fe.h>, move row-processor to <libpq-int.h>.

I guess this needs committer decision which way to go?

Second conclusion is that current dblink row-processor usage is broken
when user uses multiple SELECTs in SQL as dblink uses plain PQexec().
Simplest fix would be to use PQexecParams() instead, but if keeping old
behaviour is important, then dblink needs to emulate PQexec() resultset
behaviour with row-processor or PQgetRow().

--
marko


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: markokr(at)gmail(dot)com, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-03-29 21:56:50
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp> writes:
> I'm sorry to have coded a silly bug.
> The previous patch has a bug in realloc size calculation.
> And separation of the 'connname patch' was incomplete in regtest.
> It is fixed in this patch.

I've applied a modified form of the conname update patch. It seemed to
me that the fault is really in the DBLINK_GET_CONN and
DBLINK_GET_NAMED_CONN macros, which ought to be responsible for setting
the surrounding function's conname variable along with conn, rconn, etc.
There was actually a second error of the same type visible in the dblink
regression test, which is also fixed by this more general method.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-03-29 22:56:30
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Marko Kreen <markokr(at)gmail(dot)com> writes:
> My conclusion is that row-processor API is low-level expert API and
> quite easy to misuse. It would be preferable to have something more
> robust as end-user API, the PQgetRow() is my suggestion for that.
> Thus I see 3 choices:

> 1) Push row-processor as main API anyway and describe all dangerous
> scenarios in documentation.
> 2) Have both PQgetRow() and row-processor available in <libpq-fe.h>,
> PQgetRow() as preferred API and row-processor for expert usage,
> with proper documentation what works and what does not.
> 3) Have PQgetRow() in <libpq-fe.h>, move row-processor to <libpq-int.h>.

I still am failing to see the use-case for PQgetRow. ISTM the entire
point of a special row processor is to reduce the per-row processing
overhead, but PQgetRow greatly increases that overhead. And it doesn't
reduce complexity much either IMO: you still have all the primary risk
factors arising from processing rows in advance of being sure that the
whole query completed successfully. Plus it conflates "no more data"
with "there was an error receiving the data" or "there was an error on
the server side". PQrecvRow alleviates the per-row-overhead aspect of
that but doesn't really do a thing from the complexity standpoint;
it doesn't look to me to be noticeably easier to use than a row
processor callback.

I think PQgetRow and PQrecvRow just add more API calls without making
any fundamental improvements, and so we could do without them. "There's
more than one way to do it" is not necessarily a virtue.

> Second conclusion is that current dblink row-processor usage is broken
> when user uses multiple SELECTs in SQL as dblink uses plain PQexec().

Yeah. Perhaps we should tweak the row-processor callback API so that
it gets an explicit notification that "this is a new resultset".
Duplicating PQexec's behavior would then involve having the dblink row
processor throw away any existing tuplestore and start over when it
gets such a call.

There's multiple ways to express that but the most convenient thing
from libpq's viewpoint, I think, is to have a callback that occurs
immediately after collecting a RowDescription message, before any
rows have arrived. So maybe we could express that as a callback
with valid "res" but "columns" set to NULL?

A different approach would be to add a row counter to the arguments
provided to the row processor; then you'd know a new resultset had
started if you saw rowcounter == 0. This might have another advantage
of not requiring the row processor to count the rows for itself, which
I think many row processors would otherwise have to do.

regards, tom lane


From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: markokr(at)gmail(dot)com, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-03-30 04:30:35
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

> I've applied a modified form of the conname update patch. It seemed to
> me that the fault is really in the DBLINK_GET_CONN and
> DBLINK_GET_NAMED_CONN macros, which ought to be responsible for setting
> the surrounding function's conname variable along with conn, rconn, etc.
> There was actually a second error of the same type visible in the dblink
> regression test, which is also fixed by this more general method.

Come to think of it, the patch is mere a verifying touch for the
bug mingled with the other part of the dblink patch to all
appearances. I totally agree with you. It should be dropped for
this time and done another time.

I'am sorry for bothering you by such a damn thing.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-03-30 15:52:47
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Mar 29, 2012 at 06:56:30PM -0400, Tom Lane wrote:
> Marko Kreen <markokr(at)gmail(dot)com> writes:
> > My conclusion is that row-processor API is low-level expert API and
> > quite easy to misuse. It would be preferable to have something more
> > robust as end-user API, the PQgetRow() is my suggestion for that.
> > Thus I see 3 choices:
>
> > 1) Push row-processor as main API anyway and describe all dangerous
> > scenarios in documentation.
> > 2) Have both PQgetRow() and row-processor available in <libpq-fe.h>,
> > PQgetRow() as preferred API and row-processor for expert usage,
> > with proper documentation what works and what does not.
> > 3) Have PQgetRow() in <libpq-fe.h>, move row-processor to <libpq-int.h>.
>
> I still am failing to see the use-case for PQgetRow. ISTM the entire
> point of a special row processor is to reduce the per-row processing
> overhead, but PQgetRow greatly increases that overhead.

No, decreasing CPU overhead is minor win. I guess in realistic
application, like dblink, you can't even measure the difference.

The *major* win comes from avoiding buffering of all rows in PGresult.
Ofcourse, this is noticeable only with bigger resultsets.
I guess such buffering pessimizes memory usage: code always
works on cold cache. And simply keeping RSS low is good for
long-term health of a process.

Second major win is avoiding the need to use cursor with small chunks
to access resultset of unknown size. Thus stalling application
until next block arrives from network.

The "PGresult *PQgetRow()" is for applications that do not convert
rows immediately to some internal format, but keep using PGresult.
So they can be converted to row-by-row processing with minimal
changes to actual code.

Note that the PGrowValue is temporary struct that application *must*
move data away from. If app internally uses PGresult, then it's
pretty annoying to invent a new internal format for long-term
storage.

But maybe I'm overestimating the number of such applications.

> And it doesn't
> reduce complexity much either IMO: you still have all the primary risk
> factors arising from processing rows in advance of being sure that the
> whole query completed successfully.

It avoids the complexity of:

* How to pass error from callback to upper code

* Needing to know how exceptions behave

* How to use early exit to pass rows to upper code one-by-one,
(by storing the PGresult and PGrowValue in temp place
and later checking their values)

* How to detect that new resultset has started. (keeping track
of previous PGresult or noting some quirky API behaviour
we may invent for such case)

* Needing to make sure the callback does not leak to call-sites
that expect regular libpq behaviour.
("Always call PQregisterRowProcessor(db, NULL, NULL) after query finishes" )
["But now I'm in exception handler, how do I find the connection?"]

I've now reviewed the callback code and even done some coding with it
and IMHO it's too low-level to be end-user-API.

Yes, the "query-may-still-fail" complexity remains, but thats not unlike
the usual "multi-statement-transaction-is-not-guaranteed-to-succeed"
complexity.

Another compexity that remains is "how-to-skip-current-resultset",
but that is a problem only on sync connections and the answer is
simple - "call PQgetResult()". Or "call PQgetRow/PQrecvRow" if
user wants to avoid buffering.

> Plus it conflates "no more data"
> with "there was an error receiving the data" or "there was an error on
> the server side".

Well, current PQgetRow() is written with style: "return only single-row
PGresult, to see errors user must call PQgetResult()". Basically
so that user it forced to fall back familiar libpq usage pattern.

It can be changed, so that PQgetRow() returns also errors.

Or we can drop it and just keep PQrecvRow().

> PQrecvRow alleviates the per-row-overhead aspect of
> that but doesn't really do a thing from the complexity standpoint;
> it doesn't look to me to be noticeably easier to use than a row
> processor callback.

> I think PQgetRow and PQrecvRow just add more API calls without making
> any fundamental improvements, and so we could do without them. "There's
> more than one way to do it" is not necessarily a virtue.

Please re-read the above list of problematic situations that this API
fixes. Then, if you still think that PQrecvRow() is pointless, sure,
let's drop it.

We can also postpone it to 9.3, to poll users whether they want
easier API, or is maximum performance important. (PQrecvRow()
*does* have few cycles of overhead compared to callbacks.)

Only option that we have on the table for 9.2 but not later
is moving the callback API to <libpq-int.h>.

> > Second conclusion is that current dblink row-processor usage is broken
> > when user uses multiple SELECTs in SQL as dblink uses plain PQexec().
>
> Yeah. Perhaps we should tweak the row-processor callback API so that
> it gets an explicit notification that "this is a new resultset".
> Duplicating PQexec's behavior would then involve having the dblink row
> processor throw away any existing tuplestore and start over when it
> gets such a call.
>
> There's multiple ways to express that but the most convenient thing
> from libpq's viewpoint, I think, is to have a callback that occurs
> immediately after collecting a RowDescription message, before any
> rows have arrived. So maybe we could express that as a callback
> with valid "res" but "columns" set to NULL?
>
> A different approach would be to add a row counter to the arguments
> provided to the row processor; then you'd know a new resultset had
> started if you saw rowcounter == 0. This might have another advantage
> of not requiring the row processor to count the rows for itself, which
> I think many row processors would otherwise have to do.

Try to imagine how final documentation will look like.

Then imagine documentation for PGrecvRow() / PQgetRow().

--
marko


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-03-30 15:59:12
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Marko Kreen <markokr(at)gmail(dot)com> writes:
> On Thu, Mar 29, 2012 at 06:56:30PM -0400, Tom Lane wrote:
>> Marko Kreen <markokr(at)gmail(dot)com> writes:
>>> Second conclusion is that current dblink row-processor usage is broken
>>> when user uses multiple SELECTs in SQL as dblink uses plain PQexec().

>> Yeah. Perhaps we should tweak the row-processor callback API so that
>> it gets an explicit notification that "this is a new resultset".
>> Duplicating PQexec's behavior would then involve having the dblink row
>> processor throw away any existing tuplestore and start over when it
>> gets such a call.
>>
>> There's multiple ways to express that but the most convenient thing
>> from libpq's viewpoint, I think, is to have a callback that occurs
>> immediately after collecting a RowDescription message, before any
>> rows have arrived. So maybe we could express that as a callback
>> with valid "res" but "columns" set to NULL?
>>
>> A different approach would be to add a row counter to the arguments
>> provided to the row processor; then you'd know a new resultset had
>> started if you saw rowcounter == 0. This might have another advantage
>> of not requiring the row processor to count the rows for itself, which
>> I think many row processors would otherwise have to do.

> Try to imagine how final documentation will look like.

> Then imagine documentation for PGrecvRow() / PQgetRow().

What's your point, exactly? PGrecvRow() / PQgetRow() aren't going to
make that any better as currently defined, because there's noplace to
indicate "this is a new resultset" in those APIs either.

regards, tom lane


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-03-30 16:04:59
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Mar 30, 2012 at 11:59:12AM -0400, Tom Lane wrote:
> Marko Kreen <markokr(at)gmail(dot)com> writes:
> > On Thu, Mar 29, 2012 at 06:56:30PM -0400, Tom Lane wrote:
> >> Marko Kreen <markokr(at)gmail(dot)com> writes:
> >>> Second conclusion is that current dblink row-processor usage is broken
> >>> when user uses multiple SELECTs in SQL as dblink uses plain PQexec().
>
> >> Yeah. Perhaps we should tweak the row-processor callback API so that
> >> it gets an explicit notification that "this is a new resultset".
> >> Duplicating PQexec's behavior would then involve having the dblink row
> >> processor throw away any existing tuplestore and start over when it
> >> gets such a call.
> >>
> >> There's multiple ways to express that but the most convenient thing
> >> from libpq's viewpoint, I think, is to have a callback that occurs
> >> immediately after collecting a RowDescription message, before any
> >> rows have arrived. So maybe we could express that as a callback
> >> with valid "res" but "columns" set to NULL?
> >>
> >> A different approach would be to add a row counter to the arguments
> >> provided to the row processor; then you'd know a new resultset had
> >> started if you saw rowcounter == 0. This might have another advantage
> >> of not requiring the row processor to count the rows for itself, which
> >> I think many row processors would otherwise have to do.
>
> > Try to imagine how final documentation will look like.
>
> > Then imagine documentation for PGrecvRow() / PQgetRow().
>
> What's your point, exactly? PGrecvRow() / PQgetRow() aren't going to
> make that any better as currently defined, because there's noplace to
> indicate "this is a new resultset" in those APIs either.

Have you looked at the examples? PQgetResult() is pretty good hint
that one resultset finished...

--
marko


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-03-30 16:30:24
Message-ID: CACMqXCKi1RjBgeY2mr=xns-WW-aGywXvMXO9Mk6Rb6XRdarrig@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Mar 30, 2012 at 7:04 PM, Marko Kreen <markokr(at)gmail(dot)com> wrote:
> Have you looked at the examples?  PQgetResult() is pretty good hint
> that one resultset finished...

Ok, the demos are around this long thread and hard to find,
so here is a summary of links:

Original design mail:

http://archives.postgresql.org/message-id/[email protected]

First patch with quick demos:

http://archives.postgresql.org/message-id/[email protected]

Demos as diff:

http://archives.postgresql.org/message-id/[email protected]

Demos/experiments/tests (bit messier than the demos-as-diffs):

https://github.com/markokr/libpq-rowproc-demos

Note - the point is that user *must* call PQgetResult() when resultset ends.
Thus also the "PQgetRow() does not return errors" decision.

I'll put this mail into commitfest page too, seems I've forgotten to
put some mails there.

--
marko


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-03-30 21:18:42
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Marko Kreen <markokr(at)gmail(dot)com> writes:
> On Wed, Mar 07, 2012 at 03:14:57PM +0900, Kyotaro HORIGUCHI wrote:
>>> My suggestion - check in getAnotherTuple whether resultStatus is
>>> already error and do nothing then. This allows internal pqAddRow
>>> to set regular "out of memory" error. Otherwise give generic
>>> "row processor error".

>> Current implement seems already doing this in
>> parseInput3(). Could you give me further explanation?

> The suggestion was about getAnotherTuple() - currently it sets
> always "error in row processor". With such check, the callback
> can set the error result itself. Currently only callbacks that
> live inside libpq can set errors, but if we happen to expose
> error-setting function in outside API, then the getAnotherTuple()
> would already be ready for it.

I'm pretty dissatisfied with the error reporting situation for row
processors. You can't just decide not to solve it, which seems to be
the current state of affairs. What I'm inclined to do is to add a
"char **" parameter to the row processor, and say that when the
processor returns -1 it can store an error message string there.
If it does so, that's what we report. If it doesn't (which we'd detect
by presetting the value to NULL), then use a generic "error in row
processor" message. This is cheap and doesn't prevent the row processor
from using some application-specific error reporting method if it wants;
but it does allow the processor to make use of libpq's error mechanism
when that's preferable.

regards, tom lane


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-03-30 21:59:40
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Mar 30, 2012 at 05:18:42PM -0400, Tom Lane wrote:
> Marko Kreen <markokr(at)gmail(dot)com> writes:
> > On Wed, Mar 07, 2012 at 03:14:57PM +0900, Kyotaro HORIGUCHI wrote:
> >>> My suggestion - check in getAnotherTuple whether resultStatus is
> >>> already error and do nothing then. This allows internal pqAddRow
> >>> to set regular "out of memory" error. Otherwise give generic
> >>> "row processor error".
>
> >> Current implement seems already doing this in
> >> parseInput3(). Could you give me further explanation?
>
> > The suggestion was about getAnotherTuple() - currently it sets
> > always "error in row processor". With such check, the callback
> > can set the error result itself. Currently only callbacks that
> > live inside libpq can set errors, but if we happen to expose
> > error-setting function in outside API, then the getAnotherTuple()
> > would already be ready for it.
>
> I'm pretty dissatisfied with the error reporting situation for row
> processors. You can't just decide not to solve it, which seems to be
> the current state of affairs. What I'm inclined to do is to add a
> "char **" parameter to the row processor, and say that when the
> processor returns -1 it can store an error message string there.
> If it does so, that's what we report. If it doesn't (which we'd detect
> by presetting the value to NULL), then use a generic "error in row
> processor" message. This is cheap and doesn't prevent the row processor
> from using some application-specific error reporting method if it wants;
> but it does allow the processor to make use of libpq's error mechanism
> when that's preferable.

Yeah.

But such API seems to require specifying allocator, which seems ugly.
I think it would be better to just use Kyotaro's original idea
of PQsetRowProcessorError() which nicer to use.

Few thoughts on the issue:

----------

As libpq already provides quite good coverage of PGresult
manipulation APIs, then how about:

void PQsetResultError(PGresult *res, const char *msg);

that does:

res->errMsg = pqResultStrdup(msg);
res->resultStatus = PGRES_FATAL_ERROR;

that would also cause minimal fuss in getAnotherTuple().

------------

I would actually like even more:

void PQsetConnectionError(PGconn *conn, const char *msg);

that does full-blown libpq error logic. Thus it would be
useful everywherewhere in libpq. But it seems bit too disruptive,
so I would like a ACK from a somebody who knows libpq better.
(well, from you...)

-----------

Another thought - if we have API to set error from *outside*
of row-processor callback, that would immediately solve the
"how to skip incoming resultset without buffering it" problem.

And it would be usable for PQgetRow()/PQrecvRow() too.

--
marko


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-03-30 22:13:50
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Marko Kreen <markokr(at)gmail(dot)com> writes:
> On Fri, Mar 30, 2012 at 05:18:42PM -0400, Tom Lane wrote:
>> I'm pretty dissatisfied with the error reporting situation for row
>> processors. You can't just decide not to solve it, which seems to be
>> the current state of affairs. What I'm inclined to do is to add a
>> "char **" parameter to the row processor, and say that when the
>> processor returns -1 it can store an error message string there.

> But such API seems to require specifying allocator, which seems ugly.

Not if the message is a constant string, which seems like the typical
situation (think "out of memory"). If the row processor does need a
buffer for a constructed string, it could make use of some space in its
"void *param" area, for instance.

> I think it would be better to just use Kyotaro's original idea
> of PQsetRowProcessorError() which nicer to use.

I don't particularly care for that idea because it opens up all sorts of
potential issues when such a function is called at the wrong time.
Moreover, you have to remember that the typical situation here is that
we're going to be out of memory or otherwise in trouble, which means
you've got to be really circumspect about what you assume will work.
Row processors that think they can do a lot of fancy message
construction should be discouraged, and an API that requires
construction of a new PGresult in order to return an error is right out.
(This is why getAnotherTuple is careful to clear the failed result
before it tries to build a new one. But that trick isn't going to be
available to an external row processor.)

regards, tom lane


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-03-30 22:26:47
Message-ID: CACMqXC+FiqoUzHopXWczJUJUPDFS5C1RCx+jtpe1GNFrUD-uyg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Mar 31, 2012 at 1:13 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Marko Kreen <markokr(at)gmail(dot)com> writes:
>> On Fri, Mar 30, 2012 at 05:18:42PM -0400, Tom Lane wrote:
>>> I'm pretty dissatisfied with the error reporting situation for row
>>> processors.  You can't just decide not to solve it, which seems to be
>>> the current state of affairs.  What I'm inclined to do is to add a
>>> "char **" parameter to the row processor, and say that when the
>>> processor returns -1 it can store an error message string there.
>
>> But such API seems to require specifying allocator, which seems ugly.
>
> Not if the message is a constant string, which seems like the typical
> situation (think "out of memory").  If the row processor does need a
> buffer for a constructed string, it could make use of some space in its
> "void *param" area, for instance.

If it's specified as string that libpq does not own, then I'm fine with it.

>> I think it would be better to just use Kyotaro's original idea
>> of PQsetRowProcessorError() which nicer to use.
>
> I don't particularly care for that idea because it opens up all sorts of
> potential issues when such a function is called at the wrong time.
> Moreover, you have to remember that the typical situation here is that
> we're going to be out of memory or otherwise in trouble, which means
> you've got to be really circumspect about what you assume will work.
> Row processors that think they can do a lot of fancy message
> construction should be discouraged, and an API that requires
> construction of a new PGresult in order to return an error is right out.
> (This is why getAnotherTuple is careful to clear the failed result
> before it tries to build a new one.  But that trick isn't going to be
> available to an external row processor.)

Kyotaro's original idea was to assume out-of-memory if error
string was not set, thus the callback needed to set the string
only when it really had something to say.

--
marko


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-03-30 22:35:30
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Marko Kreen <markokr(at)gmail(dot)com> writes:
> On Sat, Mar 31, 2012 at 1:13 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Not if the message is a constant string, which seems like the typical
>> situation (think "out of memory"). If the row processor does need a
>> buffer for a constructed string, it could make use of some space in its
>> "void *param" area, for instance.

> If it's specified as string that libpq does not own, then I'm fine with it.

Check. Let's make it "const char **" in fact, just to be clear on that.

>> (This is why getAnotherTuple is careful to clear the failed result
>> before it tries to build a new one. But that trick isn't going to be
>> available to an external row processor.)

> Kyotaro's original idea was to assume out-of-memory if error
> string was not set, thus the callback needed to set the string
> only when it really had something to say.

Hmm. We could still do that in conjunction with the idea of returning
the string from the row processor, but I'm not sure if it's useful or
just overly cute.

[ thinks... ] A small advantage of assuming NULL means that is that
we could postpone the libpq_gettext("out of memory") call until after
clearing the overflowed PGresult, which would greatly improve the odds
of getting a nicely translated result and not just ASCII. Might be
worth it just for that.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-03-31 15:50:27
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Marko Kreen <markokr(at)gmail(dot)com> writes:
> On Thu, Mar 29, 2012 at 06:56:30PM -0400, Tom Lane wrote:
>> Yeah. Perhaps we should tweak the row-processor callback API so that
>> it gets an explicit notification that "this is a new resultset".
>> Duplicating PQexec's behavior would then involve having the dblink row
>> processor throw away any existing tuplestore and start over when it
>> gets such a call.
>>
>> There's multiple ways to express that but the most convenient thing
>> from libpq's viewpoint, I think, is to have a callback that occurs
>> immediately after collecting a RowDescription message, before any
>> rows have arrived. So maybe we could express that as a callback
>> with valid "res" but "columns" set to NULL?
>>
>> A different approach would be to add a row counter to the arguments
>> provided to the row processor; then you'd know a new resultset had
>> started if you saw rowcounter == 0. This might have another advantage
>> of not requiring the row processor to count the rows for itself, which
>> I think many row processors would otherwise have to do.

I had been leaning towards the second approach with a row counter,
because it seemed cleaner, but I thought of another consideration that
makes the first way seem better. Suppose that your row processor has to
do some setup work at the start of a result set, and that work needs to
see the resultset properties (eg number of columns) so it can't be done
before starting PQgetResult. In the patch as submitted, the
only way to manage that is to keep enough state to recognize that the
current row processor call is the first one, which we realized is
inadequate for multiple-result-set cases. With a row counter argument
you can do the setup whenever rowcount == 0, which fixes that. But
neither of these methods deals correctly with an empty result set!
To make that work, you need to add extra logic after the PQgetResult
call to do the setup work the row processor should have done but never
got a chance to. So that's ugly, and it makes for an easy-to-miss bug.
A call that occurs when we receive RowDescription, independently of
whether the result set contains any rows, makes this a lot cleaner.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-04-01 21:51:19
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

I've been thinking some more about the early-termination cases (where
the row processor returns zero or longjmps), and I believe I have got
proposals that smooth off most of the rough edges there.

First off, returning zero is really pretty unsafe as it stands, because
it only works more-or-less-sanely if the connection is being used in
async style. If the row processor returns zero within a regular
PQgetResult call, that will cause PQgetResult to block waiting for more
input. Which might not be forthcoming, if we're in the last bufferload
of a query response from the server. Even in the async case, I think
it's a bad design to have PQisBusy return true when the row processor
requested stoppage. In that situation, there is work available for the
outer application code to do, whereas normally PQisBusy == true means
we're still waiting for the server.

I think we can fix this by introducing a new PQresultStatus, called say
PGRES_SUSPENDED, and having PQgetResult return an empty PGresult with
status PGRES_SUSPENDED after the row processor has returned zero.
Internally, there'd also be a new asyncStatus PGASYNC_SUSPENDED,
which we'd set before exiting from the getAnotherTuple call. This would
cause PQisBusy and PQgetResult to do the right things. In PQgetResult,
we'd switch back to PGASYNC_BUSY state after producing a PGRES_SUSPENDED
result, so that subsequent calls would resume parsing input.

With this design, a suspending row processor can be used safely in
either async or non-async mode. It does cost an extra PGresult creation
and deletion per cycle, but that's not much more than a malloc and free.

Also, we can document that a longjmp out of the row processor leaves the
library in the same state as if the row processor had returned zero and
a PGRES_SUSPENDED result had been returned to the application; which
will be a true statement in all cases, sync or async.

I also mentioned earlier that I wasn't too thrilled with the design of
PQskipResult; in particular that it would encourage application writers
to miss server-sent error results, which would inevitably be a bad idea.
I think what we ought to do is define (and implement) it as being
exactly equivalent to PQgetResult, except that it temporarily installs
a dummy row processor so that data rows are discarded rather than
accumulated. Then, the documented way to clean up after deciding to
abandon a suspended query will be to do PQskipResult until it returns
null, paying normal attention to any result statuses other than
PGRES_TUPLES_OK. This is still not terribly helpful for async-mode
applications, but what they'd probably end up doing is installing their
own dummy row processors and then flushing results as part of their
normal outer loop. The only thing we could do for them is to expose
a dummy row processor, which seems barely worthwhile given that it's
a one-line function.

I remain of the opinion that PQgetRow/PQrecvRow aren't adding much
usability-wise.

regards, tom lane


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-04-01 23:03:27
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Apr 01, 2012 at 05:51:19PM -0400, Tom Lane wrote:
> I've been thinking some more about the early-termination cases (where
> the row processor returns zero or longjmps), and I believe I have got
> proposals that smooth off most of the rough edges there.
>
> First off, returning zero is really pretty unsafe as it stands, because
> it only works more-or-less-sanely if the connection is being used in
> async style. If the row processor returns zero within a regular
> PQgetResult call, that will cause PQgetResult to block waiting for more
> input. Which might not be forthcoming, if we're in the last bufferload
> of a query response from the server. Even in the async case, I think
> it's a bad design to have PQisBusy return true when the row processor
> requested stoppage. In that situation, there is work available for the
> outer application code to do, whereas normally PQisBusy == true means
> we're still waiting for the server.
>
> I think we can fix this by introducing a new PQresultStatus, called say
> PGRES_SUSPENDED, and having PQgetResult return an empty PGresult with
> status PGRES_SUSPENDED after the row processor has returned zero.
> Internally, there'd also be a new asyncStatus PGASYNC_SUSPENDED,
> which we'd set before exiting from the getAnotherTuple call. This would
> cause PQisBusy and PQgetResult to do the right things. In PQgetResult,
> we'd switch back to PGASYNC_BUSY state after producing a PGRES_SUSPENDED
> result, so that subsequent calls would resume parsing input.
>
> With this design, a suspending row processor can be used safely in
> either async or non-async mode. It does cost an extra PGresult creation
> and deletion per cycle, but that's not much more than a malloc and free.

I added extra magic to PQisBusy(), you are adding extra magic to
PQgetResult(). Not much difference.

Seems we both lost sight of actual usage scenario for the early-exit
logic - that both callback and upper-level code *must* cooperate
for it to be useful. Instead, we designed API for non-cooperating case,
which is wrong.

So the proper approach would be to have new API call, designed to
handle it, and allow early-exit only from there.

That would also avoid any breakage of old APIs. Also it would avoid
any accidental data loss, if the user code does not have exactly
right sequence of calls.

How about PQisBusy2(), which returns '2' when early-exit is requested?
Please suggest something better...

--
marko


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-04-01 23:23:06
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Marko Kreen <markokr(at)gmail(dot)com> writes:
> Seems we both lost sight of actual usage scenario for the early-exit
> logic - that both callback and upper-level code *must* cooperate
> for it to be useful. Instead, we designed API for non-cooperating case,
> which is wrong.

Exactly. So you need an extra result state, or something isomorphic.

> So the proper approach would be to have new API call, designed to
> handle it, and allow early-exit only from there.

> That would also avoid any breakage of old APIs. Also it would avoid
> any accidental data loss, if the user code does not have exactly
> right sequence of calls.

> How about PQisBusy2(), which returns '2' when early-exit is requested?
> Please suggest something better...

My proposal is way better than that. You apparently aren't absorbing my
point, which is that making this behavior unusable with every existing
API (whether intentionally or by oversight) isn't an improvement.
The row processor needs to be able to do this *without* assuming a
particular usage style, and most particularly it should not force people
to use async mode.

An alternative that I'd prefer to that one is to get rid of the
suspension return mode altogether. However, that leaves us needing
to document what it means to longjmp out of a row processor without
having any comparable API concept, so I don't really find it better.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-04-03 15:33:38
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

I've whacked the libpq part of this patch around to the point where I'm
reasonably satisfied with it (attached), and am now looking at the
dblink part. I soon realized that there's a rather nasty issue with the
dblink patch, which is that it fundamentally doesn't work for async
operations. In an async setup what you would do is dblink_send_query(),
then periodically poll with dblink_is_busy(), then when it says the
query is done, collect the results with dblink_get_result(). The
trouble with this is that PQisBusy will invoke the standard row
processor, so by the time dblink_get_result runs it's way too late to
switch row processors.

I thought about fixing that by installing dblink's custom row processor
permanently, but that doesn't really work because we don't know the
expected tuple column datatypes until we see the call environment for
dblink_get_result().

A hack on top of that hack would be to collect the data into a
tuplestore that contains all text columns, and then convert to the
correct rowtype during dblink_get_result, but that seems rather ugly
and not terribly high-performance.

What I'm currently thinking we should do is just use the old method
for async queries, and only optimize the synchronous case.

I thought for awhile that this might represent a generic deficiency
in the whole concept of a row processor, but probably it's mostly
down to dblink's rather bizarre API. It would be unusual I think for
people to want a row processor that couldn't know what to do until
after the entire query result is received.

regards, tom lane

Attachment Content-Type Size
libpq-rowproc-20120403.patch text/x-patch 78.0 KB

From: Marko Kreen <markokr(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-04-03 18:47:40
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Apr 01, 2012 at 07:23:06PM -0400, Tom Lane wrote:
> Marko Kreen <markokr(at)gmail(dot)com> writes:
> > So the proper approach would be to have new API call, designed to
> > handle it, and allow early-exit only from there.
>
> > That would also avoid any breakage of old APIs. Also it would avoid
> > any accidental data loss, if the user code does not have exactly
> > right sequence of calls.
>
> > How about PQisBusy2(), which returns '2' when early-exit is requested?
> > Please suggest something better...
>
> My proposal is way better than that. You apparently aren't absorbing my
> point, which is that making this behavior unusable with every existing
> API (whether intentionally or by oversight) isn't an improvement.
> The row processor needs to be able to do this *without* assuming a
> particular usage style,

I don't get what kind of usage scenario you think of when you
"early-exit without assuming anything about upper-level usage."

Could you show example code where it is useful?

The fact remains that upper-level code must cooperate with callback.
Why is it useful to hijack PQgetResult() to do so? Especially as the
PGresult it returns is not useful in any way and the callback still needs
to use side channel to pass actual values to upper level.

IMHO it's much better to remove the concept of early-exit from public
API completely and instead give "get" style API that does the early-exit
internally. See below for example.

> and most particularly it should not force people
> to use async mode.

Seems our concept of "async mode" is different. I associate
PQisnonblocking() with it. And old code, eg. PQrecvRow()
works fine in both modes.

> An alternative that I'd prefer to that one is to get rid of the
> suspension return mode altogether.

That might be good idea. I would prefer to postpone controversial
features instead having hurried design for them.

Also - is there really need to make callback API ready for *all*
potential usage scenarios? IMHO it would be better to make sure
it's good enough that higher-level and easier to use APIs can
be built on top of it. And thats it.

> However, that leaves us needing
> to document what it means to longjmp out of a row processor without
> having any comparable API concept, so I don't really find it better.

Why do you think any new API concept is needed? Why is following rule
not enough (for sync connection):

"After exception you need to call PQgetResult() / PQfinish()".

Only thing that needs fixing is storing lastResult under PGconn
to support exceptions for PQexec(). (If we need to support
exceptions everywhere.)

---------------

Again, note that if we would provide PQrecvRow()-style API, then we *don't*
need early-exit in callback API. Nor exceptions...

If current PQrecvRow() / PQgetRow() are too bloated for you, then how about
thin wrapper around PQisBusy():

/* 0 - need more data, 1 - have result, 2 - have row */
int PQhasRowOrResult(PGconn *conn, PGresult **hdr, PGrowValue **cols)
{
int gotrow = 0;
PQrowProcessor oldproc;
void *oldarg;
int busy;

/* call PQisBusy with our callback */
oldproc = PQgetRowProcessor(conn, &oldarg);
PQsetRowProcessor(conn, hasrow_cb, &flag);
busy = PQisBusy(conn);
PQsetRowProcessor(conn, oldproc, oldarg);

if (gotrow)
{
*hdr = conn->result;
*cols = conn->rowBuf;
return 2;
}
return busy ? 0 : 1;
}

static int hasrow_cb(PGresult *res, PGrowValue *columns, void *param)
{
int *gotrow = param;
*gotrow = 1;
return 0;
}

Also, instead hasrow_cb(), we could have integer flag under PGconn that
getAnotherTuple() checks, thus getting rid if it even in internal
callback API.

Yes, it requires async-style usage pattern, but works for both
sync and async connections. And it can be used to build even
simpler API on top of it.

Summary: it would be better to keep "early-exit" internal detail,
as the actual feature needed is processing rows outside of callback.
So why not provide API for it?

--
marko


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-04-03 21:32:25
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Marko Kreen <markokr(at)gmail(dot)com> writes:
> The fact remains that upper-level code must cooperate with callback.
> Why is it useful to hijack PQgetResult() to do so?

Because that's the defined communications channel. We're not
"hijacking" it. If we're going to start using pejorative words here,
I will assert that your proposal hijacks PQisBusy to make it do
something substantially different than the traditional understanding
of it (more about that below).

> Seems our concept of "async mode" is different. I associate
> PQisnonblocking() with it.

Well, there are really four levels to the API design:

* Plain old PQexec.
* Break down PQexec into PQsendQuery and PQgetResult.
* Avoid waiting in PQgetResult by testing PQisBusy.
* Avoid waiting in PQsendQuery (ie, avoid the risk of blocking
on socket writes) by using PQisnonblocking.

Any given app might choose to run at any one of those four levels,
although the first one probably isn't interesting for an app that would
care about using a suspending row processor. But I see no reason that
we should design the suspension feature such that it doesn't work at the
second level. PQisBusy is, and should be, an optional state-testing
function, not something that changes the set of things you can do with
the connection.

> IMHO it's much better to remove the concept of early-exit from public
> API completely and instead give "get" style API that does the early-exit
> internally.

If the row processor has an early-exit option, it hardly seems
reasonable to me to claim that that's not part of the public API.

In particular, I flat out will not accept a design in which that option
doesn't work unless the current call came via PQisBusy, much less some
entirely new call like PQhasRowOrResult. It's unusably fragile (ie,
timing sensitive) if that has to be true.

There's another way in which I think your proposal breaks the existing
API, which is that without an internal PQASYNC_SUSPENDED state that is
cleared by PQgetResult, it's unsafe to probe PQisBusy multiple times
before doing something useful. That shouldn't be the case. PQisBusy
is supposed to test whether data is ready for the application to do
something with --- it should not throw away data until a separate call
has been made to consume the data. Changing its behavior like that
would risk creating subtle bugs in existing event-loop logic. A
possibly useful analogy is that select() and poll() don't clear a
socket's read-ready state, you have to do a separate read() to do that.
There are good reasons for that separation of actions.

> Again, note that if we would provide PQrecvRow()-style API, then we *don't*
> need early-exit in callback API. Nor exceptions...

AFAICS, we *do* need exceptions for dblink's usage. So most of what's
at stake here is not avoidable, unless you're proposing we put this
whole set of patches off till 9.3 so we can think it over some more.

regards, tom lane


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-04-03 22:20:42
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Apr 03, 2012 at 05:32:25PM -0400, Tom Lane wrote:
> Marko Kreen <markokr(at)gmail(dot)com> writes:
> > The fact remains that upper-level code must cooperate with callback.
> > Why is it useful to hijack PQgetResult() to do so?
>
> Because that's the defined communications channel. We're not
> "hijacking" it. If we're going to start using pejorative words here,
> I will assert that your proposal hijacks PQisBusy to make it do
> something substantially different than the traditional understanding
> of it (more about that below).

And I would agree with it - and I already did:

Seems we both lost sight of actual usage scenario for the early-exit
logic - that both callback and upper-level code *must* cooperate
for it to be useful. Instead, we designed API for non-cooperating case,
which is wrong.

> > Seems our concept of "async mode" is different. I associate
> > PQisnonblocking() with it.
>
> Well, there are really four levels to the API design:
>
> * Plain old PQexec.
> * Break down PQexec into PQsendQuery and PQgetResult.
> * Avoid waiting in PQgetResult by testing PQisBusy.
> * Avoid waiting in PQsendQuery (ie, avoid the risk of blocking
> on socket writes) by using PQisnonblocking.
>
> Any given app might choose to run at any one of those four levels,
> although the first one probably isn't interesting for an app that would
> care about using a suspending row processor. But I see no reason that
> we should design the suspension feature such that it doesn't work at the
> second level. PQisBusy is, and should be, an optional state-testing
> function, not something that changes the set of things you can do with
> the connection.

Thats actually nice overview. I think our basic disagreement comes
from how we map the early-exit into those modes.

I want to think of the early-exit row-processing as 5th and 6th modes:

* Row-by-row processing on sync connection (PQsendQuery() + ???)
* Row-by-row processing on async connection (PQsendQuery() + ???)

But instead you want work with almost no changes on existing modes.

And I don't like it because as I've said it previously, the upper
level must know about callback and handle it properly, so it does
not make sense keep existing loop structure in stone.

Instead we should design the new mode that is logical for user
and also logical inside libpq.

> > IMHO it's much better to remove the concept of early-exit from public
> > API completely and instead give "get" style API that does the early-exit
> > internally.
>
> If the row processor has an early-exit option, it hardly seems
> reasonable to me to claim that that's not part of the public API.

Please keep in mind the final goal - nobody is interested in
"early-exit" on it's own, the interesting goal is processing rows
outside of libpq.

And the early-exit return code is clumsy hack to make it possible
with callbacks.

But why should we provide complex way of achieving something,
when we can provide easy and direct way?

> In particular, I flat out will not accept a design in which that option
> doesn't work unless the current call came via PQisBusy, much less some
> entirely new call like PQhasRowOrResult. It's unusably fragile (ie,
> timing sensitive) if that has to be true.

Agreed for PQisBusy, but why is PQhasRowOrResult() fragile?

It's easy to make PQhasRowOrResult more robust - let it set flag under
PGconn and let getAnotherTuple() do early exit on it's own, thus keeping
callback completely out of loop. Thus avoiding any chance user callback
can accidentally trigger the behaviour.

> There's another way in which I think your proposal breaks the existing
> API, which is that without an internal PQASYNC_SUSPENDED state that is
> cleared by PQgetResult, it's unsafe to probe PQisBusy multiple times
> before doing something useful. That shouldn't be the case. PQisBusy
> is supposed to test whether data is ready for the application to do
> something with --- it should not throw away data until a separate call
> has been made to consume the data. Changing its behavior like that
> would risk creating subtle bugs in existing event-loop logic. A
> possibly useful analogy is that select() and poll() don't clear a
> socket's read-ready state, you have to do a separate read() to do that.
> There are good reasons for that separation of actions.

It does look like argument for new "modes" for row-by-row processing,
preferably with new API calls.

Because adding "magic" to existing APIs seems to only cause confusion.

> > Again, note that if we would provide PQrecvRow()-style API, then we *don't*
> > need early-exit in callback API. Nor exceptions...
>
> AFAICS, we *do* need exceptions for dblink's usage. So most of what's
> at stake here is not avoidable, unless you're proposing we put this
> whole set of patches off till 9.3 so we can think it over some more.

My point was that if we have an API for processing rows outside libpq,
the need for exceptions for callbacks is mostly gone. Converting dblink
to PQhasRowOrResult() should not be hard.

I don't mind keeping exceptions, but only if we don't need to add extra
complexity just for them.

--
marko


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-04-04 14:46:28
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Marko Kreen <markokr(at)gmail(dot)com> writes:
> On Tue, Apr 03, 2012 at 05:32:25PM -0400, Tom Lane wrote:
>> Well, there are really four levels to the API design:
>> * Plain old PQexec.
>> * Break down PQexec into PQsendQuery and PQgetResult.
>> * Avoid waiting in PQgetResult by testing PQisBusy.
>> * Avoid waiting in PQsendQuery (ie, avoid the risk of blocking
>> on socket writes) by using PQisnonblocking.

> Thats actually nice overview. I think our basic disagreement comes
> from how we map the early-exit into those modes.
> I want to think of the early-exit row-processing as 5th and 6th modes:

> * Row-by-row processing on sync connection (PQsendQuery() + ???)
> * Row-by-row processing on async connection (PQsendQuery() + ???)

> But instead you want work with almost no changes on existing modes.

Well, the trouble with the proposed PQgetRow/PQrecvRow is that they only
work safely at the second API level. They're completely unsafe to use
with PQisBusy, and I think that is a show-stopper. In your own terms,
the "6th mode" doesn't work.

More generally, it's not very safe to change the row processor while a
query is in progress. PQskipResult can get away with doing so, but only
because the entire point of that function is to lose data, and we don't
much care whether some rows already got handled differently. For every
other use-case, you have to set up the row processor in advance and
leave it in place, which is a guideline that PQgetRow/PQrecvRow violate.

So I think the only way to use row-by-row processing is to permanently
install a row processor that normally returns zero. It's possible that
we could provide a predefined row processor that acts that way and
invite people to install it. However, I think it's premature to suppose
that we know all the details of how somebody might want to use this.
In particular the notion of cloning the PGresult for each row seems
expensive and not obviously more useful than direct access to the
network buffer. So I'd rather leave it as-is and see if any common
usage patterns arise, then add support for those patterns.

>> In particular, I flat out will not accept a design in which that option
>> doesn't work unless the current call came via PQisBusy, much less some
>> entirely new call like PQhasRowOrResult. It's unusably fragile (ie,
>> timing sensitive) if that has to be true.

> Agreed for PQisBusy, but why is PQhasRowOrResult() fragile?

Because it breaks if you use PQisBusy *anywhere* in the application.
That's not just a bug hazard but a loss of functionality. I think it's
important to have a pure "is data available" state test function that
doesn't cause data to be consumed from the connection, and there's no
way to have that if there are API functions that change the row
processor setting mid-query. (Another way to say this is that PQisBusy
ought to be idempotent from the standpoint of the application --- we
know that it does perform work inside libpq, but it doesn't change the
state of the connection so far as the app can tell, and so it doesn't
matter if you call it zero, one, or N times between other calls.)

regards, tom lane


From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Marko Kreen <markokr(at)gmail(dot)com>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-04-04 17:28:41
Message-ID: CAM103DtrZUskPo7Au3PDsSer7SQ2F8VGx=0DgQfFUdSzo5ckew@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Hello, This is the new version of dblink patch.

- Calling dblink_is_busy prevents row processor from being used.

- some PGresult leak fixed.

- Rebased to current head.

> A hack on top of that hack would be to collect the data into a
> tuplestore that contains all text columns, and then convert to the
> correct rowtype during dblink_get_result, but that seems rather ugly
> and not terribly high-performance.
>
> What I'm currently thinking we should do is just use the old method
> for async queries, and only optimize the synchronous case.

Ok, I agree with you except for performance issue. I give up to use
row processor for async query with dblink_is_busy called.

> I thought for awhile that this might represent a generic deficiency
> in the whole concept of a row processor, but probably it's mostly
> down to dblink's rather bizarre API.  It would be unusual I think for
> people to want a row processor that couldn't know what to do until
> after the entire query result is received.

I hope so. Thank you.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
dblink_rowproc_20120405.patch application/octet-stream 28.9 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: Marko Kreen <markokr(at)gmail(dot)com>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-04-04 19:17:37
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp> writes:
>> What I'm currently thinking we should do is just use the old method
>> for async queries, and only optimize the synchronous case.

> Ok, I agree with you except for performance issue. I give up to use
> row processor for async query with dblink_is_busy called.

Yeah, that seems like a reasonable idea.

Given the lack of consensus around the suspension API, maybe the best
way to get the underlying libpq patch to a committable state is to take
it out --- that is, remove the "return zero" option for row processors.
Since we don't have a test case for it in dblink, it's hard to escape
the feeling that we may be expending a lot of effort for something that
nobody really wants, and/or misdesigning it for lack of a concrete use
case. Is anybody going to be really unhappy if that part of the patch
gets left behind?

regards, tom lane


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-04-04 21:18:44
Message-ID: CACMqXC+T6au3zHZBnnYNnSHogqXEXNYhBjE-+00PQsQmeuGBjA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Apr 4, 2012 at 10:17 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Given the lack of consensus around the suspension API, maybe the best
> way to get the underlying libpq patch to a committable state is to take
> it out --- that is, remove the "return zero" option for row processors.
> Since we don't have a test case for it in dblink, it's hard to escape
> the feeling that we may be expending a lot of effort for something that
> nobody really wants, and/or misdesigning it for lack of a concrete use
> case.  Is anybody going to be really unhappy if that part of the patch
> gets left behind?

Agreed.

--
marko


From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Marko Kreen <markokr(at)gmail(dot)com>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-04-04 22:04:18
Message-ID: CAM103Dsgq1dmVngtmZd4+faY8w=pWajtHtFERDS6rAziSsP8-g@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

I'm afraid not re-initializing materialize_needed for the next query
in the latest dblink patch.
I will confirm that and send the another one if needed in a few hours.

# I need to catch the train I usually get on..

> Hello, This is the new version of dblink patch.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-04-04 22:41:00
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Marko Kreen <markokr(at)gmail(dot)com> writes:
> On Wed, Apr 4, 2012 at 10:17 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Given the lack of consensus around the suspension API, maybe the best
>> way to get the underlying libpq patch to a committable state is to take
>> it out --- that is, remove the "return zero" option for row processors.

> Agreed.

Done that way.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: Marko Kreen <markokr(at)gmail(dot)com>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-04-04 22:42:29
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp> writes:
> I'm afraid not re-initializing materialize_needed for the next query
> in the latest dblink patch.
> I will confirm that and send the another one if needed in a few hours.

I've committed a revised version of the previous patch. I'm not sure
that the case of dblink_is_busy not being used is interesting enough
to justify contorting the logic, and I'm worried about introducing
corner case bugs for that.

regards, tom lane


From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp, markokr(at)gmail(dot)com, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-04-05 00:28:58
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

> > I'm afraid not re-initializing materialize_needed for the next query
> > in the latest dblink patch.

I've found no need to worry about the re-initializing issue.

> I've committed a revised version of the previous patch.

Thank you for that.

> I'm not sure that the case of dblink_is_busy not being used is
> interesting enough to justify contorting the logic, and I'm
> worried about introducing corner case bugs for that.

I'm afraid of indefinite state by mixing sync and async queries
or API call sequence for async query out of my expectations
(which is rather narrow).

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

== My e-mail address has been changed since Apr. 1, 2012.


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-04-05 16:30:04
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Apr 04, 2012 at 06:41:00PM -0400, Tom Lane wrote:
> Marko Kreen <markokr(at)gmail(dot)com> writes:
> > On Wed, Apr 4, 2012 at 10:17 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> Given the lack of consensus around the suspension API, maybe the best
> >> way to get the underlying libpq patch to a committable state is to take
> >> it out --- that is, remove the "return zero" option for row processors.
>
> > Agreed.
>
> Done that way.

Minor cleanups:

* Change callback return value to be 'bool': 0 is error.
Currently the accepted return codes are 1 and -1,
which is weird.

If we happen to have the 'early-exit' logic in the future,
it should not work via callback return code. So keeping the 0
in reserve is unnecessary.

* Support exceptions in multi-statement PQexec() by storing
finished result under PGconn temporarily. Without doing it,
the result can leak if callback longjmps while processing
next result.

* Add <caution> to docs for permanently keeping custom callback.

This API fragility is also reason why early-exit (if it appears)
should not work via callback - instead it should give safer API.

--
marko

Attachment Content-Type Size
rowproc-cleanups.diff text/x-diff 6.4 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-04-05 16:49:37
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Marko Kreen <markokr(at)gmail(dot)com> writes:
> Minor cleanups:

> * Change callback return value to be 'bool': 0 is error.
> Currently the accepted return codes are 1 and -1,
> which is weird.

No, I did it that way intentionally, with the thought that we might add
back return code zero (or other return codes) in the future. If we go
with bool then sensible expansion is impossible, or at least ugly.
(I think it was you that objected to 0/1/2 in the first place, no?)

> If we happen to have the 'early-exit' logic in the future,
> it should not work via callback return code.

This assumption seems unproven to me, and even if it were,
it doesn't mean we will never have any other exit codes.

> * Support exceptions in multi-statement PQexec() by storing
> finished result under PGconn temporarily. Without doing it,
> the result can leak if callback longjmps while processing
> next result.

I'm unconvinced this is a good thing either.

regards, tom lane


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-04-05 17:05:16
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Apr 05, 2012 at 12:49:37PM -0400, Tom Lane wrote:
> Marko Kreen <markokr(at)gmail(dot)com> writes:
> > Minor cleanups:
>
> > * Change callback return value to be 'bool': 0 is error.
> > Currently the accepted return codes are 1 and -1,
> > which is weird.
>
> No, I did it that way intentionally, with the thought that we might add
> back return code zero (or other return codes) in the future. If we go
> with bool then sensible expansion is impossible, or at least ugly.
> (I think it was you that objected to 0/1/2 in the first place, no?)

Well, I was the one who added the 0/1/2 in the first place,
then I noticed that -1/0/1 works better as a triple.

But the early-exit from callback creates unnecessary fragility
so now I'm convinced we don't want to do it that way.

> > If we happen to have the 'early-exit' logic in the future,
> > it should not work via callback return code.
>
> This assumption seems unproven to me, and even if it were,
> it doesn't mean we will never have any other exit codes.

I cannot even imagine why we would want new codes, so it seems
like unnecessary mess in API.

But it's a minor issue, so if you intend to keep it, I won't
push it further.

> > * Support exceptions in multi-statement PQexec() by storing
> > finished result under PGconn temporarily. Without doing it,
> > the result can leak if callback longjmps while processing
> > next result.
>
> I'm unconvinced this is a good thing either.

This is less minor issue. Do we support longjmp() or not?

Why are you convinced that it's not needed?

--
marko