C++: Multi-Level Member Function Calls Not Modeled as DataFlow::Node #19457

Open
mcc0612mcc0612 opened this issue May 2, 2025 · 1 comment
Labels
question Further information is requested

Comments

@mcc0612mcc0612
Description of the issue

In my CodeQL analysis, I ran into an issue where chained (multi-level) member function calls are not modeled as DataFlow::Node.

Here is a minimal example to reproduce the issue:

class A {
    public:
        void doSomething() {}
};

class B {
    public:
        A getA() {
            return a;
        }
    private:
        A a;
};

int main() {
    B b;
    b.getA().doSomething();
    return 0;
}

Specifically, findNodeforGetA finds the call to getA() modeled as a DataFlow::Node, but findNodeforDoSomething finds no corresponding node for the call to doSomething().

Here is my query to find corresponding nodes:

Expr findNodeforGetA() {
    exists(Call c, DataFlow::Node node
      | node.asExpr() = c and
        resolveCall(c).getName() = "getA"
      | result = node.asExpr()
    )
}

Expr findNodeforDoSomething() {
    exists(Call c, DataFlow::Node node
      | node.asExpr() = c and
        resolveCall(c).getName() = "doSomething"
      | result = node.asExpr()
    )
}

So, is this behavior intentional, or is it caused by something else?

More context:
My goal is to check the dominance relationship between function calls. For example, given the following code:

b.getA().doSomething();
doSomethingElse();

I want to check whether the call to A::doSomething dominates the call to doSomethingElse, using the following predicate:

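predicate defaultDominate(DataFlow::Node dom, DataFlow::Node sub) {
  exists(IRBlock b1, int i1, IRBlock b2, int i2 |
    dom.hasIndexInBlock(b1, i1) and
    sub.hasIndexInBlock(b2, i2) and
    (
      // Same block: dom must come first.
      b1 = b2 and
      i1 < i2
      or
      // Different blocks: IRBlock.dominates is reflexive, so use the strict
      // variant to keep the same-block case from bypassing the index check.
      b1.strictlyDominates(b2)
    )
  )
}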

Because no DataFlow::Node is found for the call to doSomething(), I cannot use the defaultDominate predicate to analyze the dominance relationship.
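
To make the dominance question concrete, here is a minimal C++ sketch of the two cases the predicate is meant to tell apart. The `trace` vector and the functions `dominated` and `notDominated` are hypothetical names used only for illustration, and the member call is simplified to a free function:

```cpp
#include <string>
#include <vector>

// Hypothetical trace used only to observe which calls ran, in order.
static std::vector<std::string> trace;

void doSomething()     { trace.push_back("doSomething"); }
void doSomethingElse() { trace.push_back("doSomethingElse"); }

// doSomething() dominates doSomethingElse(): every path that reaches the
// second call has already executed the first.
void dominated() {
    doSomething();
    doSomethingElse();
}

// doSomething() does NOT dominate doSomethingElse(): when `flag` is false,
// the second call is reached without the first having run.
void notDominated(bool flag) {
    if (flag) {
        doSomething();
    }
    doSomethingElse();
}
```

A dominance query should report the pair of calls in `dominated` but not the pair in `notDominated`.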

@mcc0612mcc0612 mcc0612mcc0612 added the question Further information is requested label May 2, 2025
@mcc0612mcc0612 mcc0612mcc0612 changed the title Multi-Level Function Pointer Calls Not Modeled as DataFlow::Node C++: Multi-Level Function Pointer Calls Not Modeled as DataFlow::Node May 2, 2025
@mcc0612mcc0612 mcc0612mcc0612 changed the title C++: Multi-Level Function Pointer Calls Not Modeled as DataFlow::Node C++: Multi-Level Member Function Calls Not Modeled as DataFlow::Node May 2, 2025
@intrigus-lgtm
Contributor

The problem you're encountering is that DataFlow nodes only exist for operations that can carry flow.
If a call does not return a value, as with the void doSomething() here, there is no flow out of it, so CodeQL doesn't create a dataflow node for that call.

You aren't the first to run into this problem, and as far as I know it isn't documented prominently.
(In my humble opinion, it would make sense to also create dataflow nodes for such calls, because many APIs work at the dataflow level rather than the AST level.)
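
As a rough illustration of that distinction, the sketch below contrasts a void member call with one that returns a value. It reuses the class names from the issue; `doSomethingInt`, the `calls` counter, and `demo` are hypothetical additions for contrast only. The comments describe where a value exists in the C++ program. Whether CodeQL creates a node for a given call is best confirmed by running a query against the database.

```cpp
// Mirrors the classes from the issue; doSomethingInt() and `calls` are
// hypothetical additions used only to contrast the two kinds of call.
class A {
public:
    void doSomething() { ++calls; }          // returns nothing
    int doSomethingInt() { return ++calls; } // returns a value
    int calls = 0;
};

class B {
public:
    A getA() { return a; }  // returns a copy of `a`, so the call has a result
    A a;
};

int demo() {
    B b;
    b.getA().doSomething();             // void call: no result to flow anywhere
    int r = b.getA().doSomethingInt();  // the call's result flows into `r`
    return r + b.a.calls;               // b.a is untouched: getA() returned copies
}
```

In both statements the qualifier `b.getA()` produces a value, which matches the observation in the question that getA() is found as a node while doSomething() is not.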

Instead of working on the dataflow level you can use the AST level to find the dominance relation:

/**
 * @kind alert
 */

import cpp

predicate interesting(FunctionCall fc, FunctionCall fc2) {
  fc.getTarget().hasName("doSomething") and
  fc2.getTarget().hasName("fooVoid") and
  dominates(fc, fc2)
}

from FunctionCall fc, FunctionCall fc2
where interesting(fc, fc2)
select fc2, "$@ dominates $@", fc, fc.toString(), fc2, fc2.toString()

Run it on this slightly modified version of your example file:

class A {
    public:
        void doSomething() {}
};

class B {
    public:
        A getA() {
            return a;
        }
    private:
        A a;
};

void fooVoid() {}
int fooInt() { return 1; }

int main() {
    B b;
    b.getA().doSomething();
    fooVoid();
    return fooInt();
}

Your original problem can also be shown by this CodeQL query:

import cpp
import semmle.code.cpp.dataflow.new.DataFlow

boolean hasDataFlowNode(FunctionCall fc) {
  if exists(DataFlow::Node node | node.asExpr() = fc) then result = true else result = false
}

from FunctionCall fc
select "hasDataFlowNode: " + hasDataFlowNode(fc), fc
