Skip to content

Add Option to Keep Query Prefixes #187

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

vitalis-wiens
Copy link

This PR extends the options paramaters of the SparqlGenerator.

The newly introduced option parameter keepQueryPrefixes allows to combine the used prefixes of the query body and the once mentioned in the Prefix defintions.

The use-case is the following:

Some database vendors can provide hints to the database using Prefixes, however the current available options (used Prefixes or all Prefixes) either ignore the optimization prefixes or add all available prefixes which results in a very large prefix definitions and more challenging debuging of queries.

Related Issue: #186

* Added an option to the generator to keep query prefixes even if they are not used in the query
@vitalis-wiens
Copy link
Author

Hi @RubenVerborgh, metaphacts contributed this fix and we would highly appreciate if it would be included in and distributed through a new sparql.js release. Do you have any plans/schedule for the next release or other dependencies? Just asking for planning purpose. Thanks

@RubenVerborgh RubenVerborgh self-assigned this Mar 10, 2025
Copy link
Owner

@RubenVerborgh RubenVerborgh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vitalis-wiens Thanks for this PR! I like the case at #186, but have some different suggestions for implementation.

The current proposal has a weird interaction between allPrefixes and keepQueryPrefixes, and does not leave room for an intermediate case where the user wants some prefixes to always be present, and some prefixes to only be present when used.

What do you think of an interface along the lines of { allPrefixes: ['db_opt'] } as a third option, where:

  1. false means only prefixes that are actually used
  2. true means all prefixes, regardless of usage
  3. String[] means always include the listed prefixes, others as used

An alternative proposal could be { fixedPrefixes: { db_opt: 'https://db.control.param_10' } }.

We'll also need an addition to the README.

@@ -90,10 +90,12 @@ Generator.prototype.toQuery = function (q) {
Generator.prototype.baseAndPrefixes = function (q) {
var base = q.base ? ('BASE <' + q.base + '>' + this._newline) : '';
var prefixes = '';
var queryPrefixes=Object.keys(q.prefixes || {});
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
var queryPrefixes=Object.keys(q.prefixes || {});
var queryPrefixes = Object.keys(q.prefixes || {});

@@ -451,6 +453,7 @@ function mapJoin(array, sep, func, self) {
/**
* @param options {
* allPrefixes: boolean,
* keepQueryPrefixes: boolean,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you call it fixedPrefixes and make it a key/value object?

var parsedQuery = parser.parse('PREFIX db_opt: <https://db.control.param_10> \n SELECT * WHERE {?s a owl:Class}');
var generator = new SparqlGenerator({allPrefixes: false, keepQueryPrefixes: true});
var generatedQuery = generator.stringify(parsedQuery);
var expectedQuery ='PREFIX db_opt: <https://db.control.param_10>\nPREFIX owl: <http://www.w3.org/2002/07/owl#>\nSELECT * WHERE { ?s <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> owl:Class. }';
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
var expectedQuery ='PREFIX db_opt: <https://db.control.param_10>\nPREFIX owl: <http://www.w3.org/2002/07/owl#>\nSELECT * WHERE { ?s <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> owl:Class. }';
var expectedQuery = 'PREFIX db_opt: <https://db.control.param_10>\nPREFIX owl: <http://www.w3.org/2002/07/owl#>\nSELECT * WHERE { ?s <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> owl:Class. }';

@aschwarte10
Copy link

Hi @RubenVerborgh

thanks for the quick response and the additional perspective. I can follow your reasoning of allowing the user to provide a list of "fixedPrefixes" that are always added to the query (independent of their usages inside the query). Whether this is added by overloading the allPrefixes setting, or by introducing a new dedicated option fixedPrefixes is for me a bit matter of taste. For clarity from an interface perspective I personally would prefer the additional fixedPrefixes option.

Our original use-case / improvement request is however slightly different from "fixedPrefixes":

we want to configure the generator to explicitly keep all those prefixes in the query prologue that are defined by the user (i.e., in the original query string). The reason for that in our case is that some database vendors use specialized prefixes as a kind of query hint.

In addition (what we are already able to do now): we want to inject a fixed list of prefixes, but from those only keep the ones that are used in the query. Note that this is already technically possible now in sparqljs.

Considering this: what do you think of the following abstract implementation sketch on configuration level

GeneratorSettings {
    // whether to render all prefixes known to the context of the query (independent of the usage)
    allPrefixes: true | false

    // list of additional prefixes to always add (independent of usage)
    fixedPrefixes: List<Prefix> 
 
    // keep all prefixes of the query prologue as specified by the user in the original query string (independent of the usage)
    keepQueryPrefixes: true | false
}

@vitalis-wiens would be able to do an iteration on the PR if this proposal makes sense to you

@RubenVerborgh
Copy link
Owner

Considering this: what do you think of the following abstract implementation sketch on configuration level

I don't understand the interaction between the three options? Are they fully independent?

In particular, the generation has no notion about original query string.

* Added option to the generator to use a fixed prefix list which is added to generated query string
* Adjusted tests for the new option
@vitalis-wiens
Copy link
Author

Hi @RubenVerborgh I have prepared a second commit with the suggested implementation for a new options parameter
fixedPrefixes

I will try to summarize the different use-cases, expected behavior and interactions between the options

  • With the current PR we would have three options to control the generation of the prefix block of the query string
      1. allPrefixes: boolean
      1. keepQueryPrefixes: booean
      1. fixedPrefixes: object ( similar to prefix list that is used in the Parser )

Possible Combinations of options parameters
Definitions:

  • All Prefixes : prefixes defined by the query and the parser
  • Used Prefixes : subset from All Prefixes where only the used once are written to the query string
  • Fixed Prefixes : subset from All Prefixes
  • Query Prefixes : Prefixes defined in the query
keepQueryPrefixes fixedPrefixes allPrefixes Results
false null false Used Prefixes
false null true All Prefixes
false listOfFixedPrefixes false Used Prefixes + Fixed Prefixes (without duplicates)
false listOfFixedPrefixes true All Prefixes
true null false Used Prefixes + Query Prefixes (without duplicates)
true null true All Prefixes
true listOfFixedPrefixes false Used Prefixes + Fixed Prefixes + Query Prefixes (without duplicates)
true listOfFixedPrefixes true All Prefixes

In our use case we would like to go for the version where we have used prefixes and the ones defined in the query.

Form debugging and experimenting this are maybe relevant pointers:

q = parser.parse(queryString) parses the query string

q.prefixes is the list of query defined prefixes but q has also a prototype where the prefixes from the parser are added.

Note: In the current implementation I assume that the fixed prefixes is a subset of all prefixes
This means if neither the parser configuration nor the query provide a prefix which is mentioned in the fixedPrefixes it will not be exported. If we want to support the scenario where the fixed prefix list does not have an intersection with All Prefixes I could extend the PR once more :)

I hope this explanation is helpfull to find a path forward

On a side note: For the next release could you also consider the following PR #184 ?

@RubenVerborgh
Copy link
Owner

@vitalis-wiens Thanks for the detailed explanation!

Regarding the table, unless I'm missing something, we have 8 possible input combinations, but only 4 possible output combinations (and I actually think that's 3, because used + fixed + query prefixes sounds like simply “all” to me). That's why I proposed the more simple configuration above.

allPrefixes Generated prefixes
false The query.prefixes that occur in the query body
true Everything from `query.prefixes
Array/Object The query.prefixes that occur in the query body OR are listed in the Array

The misunderstanding might stem from the fact that all of your tests (and perhaps your use case) start from SparqlParser. But that's not the common case. You can manipulate the input to generator.stringify.

I think this simple extension of the allPrefixes flag allows you to do everything you need, without the complexity of 3 flags that are not orthogonal.

prefixes += 'PREFIX ' + key + ': <' + q.prefixes[key] + '>' + this._newline;
}
// ensure that we do not add prefixes multiple times
if (addedPrefixes.includes(key)) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How could that happen when q.prefixes does not change?

The option `allPrefixes` can now receive an object for the list of prefixes

This prefixes will be added to the query string if they are either provided by the query or are defined in the Parser
@vitalis-wiens
Copy link
Author

Hi @RubenVerborgh, thanks for the explanation and the pointer about use case

I have now revised the PR:

  • it extends the allPrefixes with an opbject that should hold the prefixes
  • in the funtion baseAndPrefixes we now have an object fixedPrefixes which is extracted from the options

I have also revised the tests

  • the first test just provides a static prefix list to the generator options
  • the second test provides a dynamic prefix list which is extracted from the query after it has been parsed --> this show how our use-case can be realized with this option

@aschwarte10
Copy link

Hi @RubenVerborgh , did you have a chance to review the latest iteration / refinement from Vitalis?

We would ideally want to integrate the new version of sparqljs with an improvement for this issue in our upcoming release, where we are aiming for end of March.

Any feedback highly appreciated allowing us to align the plans

@RubenVerborgh
Copy link
Owner

Thanks; will get to this ASAP; I'm a volunteer on this and it's teaching semester.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants