
v1.6.2 can't nslookup httproute endpoint IP from the fabric pod shell #3353


Closed
mathematicalsystems opened this issue May 2, 2025 · 7 comments
Labels
community waiting for response Waiting for author's response

Comments

@mathematicalsystems

Describe the bug
When I curl my Pod's endpoint directly from the host server, I get the correct response:
$ curl http://10.128.19.135:19000
Running Spring service authentication version 0.0.0

but I can't nslookup that IP from inside the NGINX Fabric pod. This behaviour prevents nginx-gateway HTTPRoutes from connecting to the service.

firewalld is disabled.

$ kubectl exec -it $NGINX_FABRIC_POD_NAME -c nginx -n nginx-gateway -- /bin/sh
/ $ cat /etc/nginx/conf.d/http.conf

....

upstream mathematicalsystems-spring_com-mathematicalsystems-api-authentication-service_80 {
random two least_conn;
zone mathematicalsystems-spring_com-mathematicalsystems-api-authentication-service_80 512k;
server 10.128.19.135:19000;
}

/ $ nslookup 10.128.19.135:19000
Server: 10.96.0.10
Address: 10.96.0.10:53

** server can't find 10.128.19.135:19000: NXDOMAIN

** server can't find 10.128.19.135:19000: NXDOMAIN
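
Note: nslookup expects a hostname or a bare IP, not an IP:port pair, so the NXDOMAIN above is expected regardless of connectivity. The syntactically valid reverse lookup would be:

/ $ nslookup 10.128.19.135

though a plain Pod IP usually has no PTR record, so this can still fail even in a healthy cluster.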

/ $ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0@if13: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1450 qdisc noqueue state UP qlen 1000
link/ether 2e:16:00:ce:95:3e brd ff:ff:ff:ff:ff:ff
inet 10.128.19.136/32 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::2c16:ff:fece:953e/64 scope link
valid_lft forever preferred_lft forever

/ $ curl http://10.128.19.135:19000
curl: (28) Failed to connect to 10.128.19.135 port 19000 after 134045 ms: Could not connect to server

To Reproduce
Steps to reproduce the behavior:

  1. Deploy some service
  2. Create an HTTPRoute (a hypothetical sketch follows below)
  3. Try to curl the nginx-fabric endpoint for that service
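
For illustration, a minimal HTTPRoute sketch consistent with the generated nginx config shown below; the hostname, path, namespace, Service name, and port are taken from that config, while the route name and parentRef Gateway are assumptions:

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: authentication-route            # hypothetical name
  namespace: mathematicalsystems-spring
spec:
  parentRefs:
    - name: gateway                     # assumed Gateway name
      namespace: nginx-gateway
  hostnames:
    - mathematicalsystems.com
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /mathematicalsystems-authentication/v1
      backendRefs:
        - name: com-mathematicalsystems-api-authentication-service
          port: 80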

Expected behavior
Requests routed through the HTTPRoute should reach the backend Service, just as curling the Pod endpoint directly from the host does.

Your environment

  • Version of the NGINX Gateway Fabric - release version or a specific commit. The first line of the nginx-gateway container logs includes the commit info.
    $ kubectl logs $NGINX_FABRIC_POD_NAME -n nginx-gateway -c nginx-gateway
    {"level":"info","ts":"2025-05-02T16:31:11Z","msg":"Starting NGINX Gateway Fabric in static mode","version":"1.6.2","commit":"532db6a20b2912fe397211eef9f8d564d46a4bdd","date":"2025-03-11T17:28:32Z","dirty":"false"}

  • Version of Kubernetes
    $ kubectl version
    Client Version: v1.32.4
    Kustomize Version: v5.5.0
    Server Version: v1.32.4

  • Kubernetes platform (e.g. Minikube or GCP)
    self deployed cluster

  • Details on how you expose the NGINX Gateway Fabric Pod (e.g. Service of type LoadBalancer or port-forward)
    --set service.type=NodePort

  • Logs of NGINX container: kubectl -n nginx-gateway logs -l app=nginx-gateway -c nginx
    $ kubectl -n nginx-gateway logs -l app=nginx-gateway -c nginx
    No resources found in nginx-gateway namespace.

  • NGINX Configuration: kubectl -n nginx-gateway exec <gateway-pod> -c nginx -- nginx -T
    $ kubectl -n nginx-gateway exec $NGINX_FABRIC_POD_NAME -c nginx -- nginx -T
    2025/05/02 19:51:55 [notice] 98#98: js vm init njs: 00007F281640DA80
    nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
    # configuration file /etc/nginx/nginx.conf:
    load_module /usr/lib/nginx/modules/ngx_http_js_module.so;
    include /etc/nginx/main-includes/*.conf;

worker_processes auto;

pid /var/run/nginx/nginx.pid;

events {
worker_connections 1024;
}

http {
include /etc/nginx/conf.d/*.conf;
include /etc/nginx/mime.types;
js_import /usr/lib/nginx/modules/njs/httpmatches.js;

default_type application/octet-stream;

proxy_headers_hash_bucket_size 512;
proxy_headers_hash_max_size 1024;
server_names_hash_bucket_size 256;
server_names_hash_max_size 1024;
variables_hash_bucket_size 512;
variables_hash_max_size 1024;

sendfile on;
tcp_nopush on;

server_tokens off;

server {
listen unix:/var/run/nginx/nginx-status.sock;
access_log off;

location /stub_status {
stub_status;
}
}
}

stream {
variables_hash_bucket_size 512;
variables_hash_max_size 1024;

map_hash_max_size 2048;
map_hash_bucket_size 256;

log_format stream-main '$remote_addr [$time_local] '
'$protocol $status $bytes_sent $bytes_received '
'$session_time "$ssl_preread_server_name"';
access_log /dev/stdout stream-main;
include /etc/nginx/stream-conf.d/*.conf;
}

# configuration file /etc/nginx/main-includes/main.conf:

error_log stderr info;

# configuration file /etc/nginx/conf.d/config-version.conf:

server {
listen unix:/var/run/nginx/nginx-config-version.sock;
access_log off;

location /version {
return 200 9;
}
}

# configuration file /etc/nginx/conf.d/http.conf:
http2 on;

# Set $gw_api_compliant_host variable to the value of $http_host unless $http_host is empty, then set it to the value
# of $host. We prefer $http_host because it contains the original value of the host header, which is required by the
# Gateway API. However, in an HTTP/1.0 request, it's possible that $http_host can be empty. In this case, we will use
# the value of $host. See http://nginx.org/en/docs/http/ngx_http_core_module.html#var_host.
map $http_host $gw_api_compliant_host {
'' $host;
default $http_host;
}

# Set $connection_header variable to upgrade when the $http_upgrade header is set, otherwise, set it to close. This
# allows support for websocket connections. See https://nginx.org/en/docs/http/websocket.html.
map $http_upgrade $connection_upgrade {
default upgrade;
'' close;
}

## Returns just the path from the original request URI.
map $request_uri $request_uri_path {
"~^(?P<path>[^?]*)(\?.*)?$" $path;
}

js_preload_object matches from /etc/nginx/conf.d/matches.json;
server {
listen 80 default_server;
listen [::]:80 default_server;
default_type text/html;
return 404;
}

server {
listen 80;
listen [::]:80;

server_name mathematicalsystems.com;

location /mathematicalsystems-authentication/v1/ {

set $match_key 1_0;
js_content httpmatches.redirect;

proxy_http_version 1.1;
}
location = /mathematicalsystems-authentication/v1 {

set $match_key 1_0;
js_content httpmatches.redirect;

proxy_http_version 1.1;
}
location /_ngf-internal-rule0-route0 {
internal;

proxy_http_version 1.1;
proxy_set_header Host "$gw_api_compliant_host";
proxy_set_header X-Forwarded-For "$proxy_add_x_forwarded_for";
proxy_set_header X-Real-IP "$remote_addr";
proxy_set_header X-Forwarded-Proto "$scheme";
proxy_set_header X-Forwarded-Host "$host";
proxy_set_header X-Forwarded-Port "$server_port";
proxy_set_header Upgrade "$http_upgrade";
proxy_set_header Connection "$connection_upgrade";
proxy_pass http://mathematicalsystems-spring_com-mathematicalsystems-api-authentication-service_80$request_uri;

}
location / {

return 404 "";

proxy_http_version 1.1;
}
}

server {
listen unix:/var/run/nginx/nginx-503-server.sock;
access_log off;

return 503;
}

server {
listen unix:/var/run/nginx/nginx-500-server.sock;
access_log off;

return 500;
}

upstream mathematicalsystems-spring_com-mathematicalsystems-api-authentication-service_80 {
random two least_conn;
zone mathematicalsystems-spring_com-mathematicalsystems-api-authentication-service_80 512k;

server 10.128.19.135:19000;

}

upstream invalid-backend-ref {
random two least_conn;

server unix:/var/run/nginx/nginx-500-server.sock;

}

# configuration file /etc/nginx/mime.types:

types {
text/html html htm shtml;
text/css css;
text/xml xml;
image/gif gif;
...
}

# configuration file /etc/nginx/stream-conf.d/stream.conf:

server {
listen unix:/var/run/nginx/connection-closed-server.sock;
return "";
}

nginx: configuration file /etc/nginx/nginx.conf test is successful

Additional context
$ uname -a
Linux mathematicalsystems 6.12.22-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.12.22-1 (2025-04-10) x86_64 GNU/Linux

$ cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux trixie/sid"
NAME="Debian GNU/Linux"
VERSION_CODENAME=trixie
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"


nginx-bot bot commented May 2, 2025

Hi @mathematicalsystems! Welcome to the project! 🎉

Thanks for opening this issue!
Be sure to check out our Contributing Guidelines and the Issue Lifecycle while you wait for someone on the team to take a look at this.

@nginx-bot nginx-bot bot added the community label May 2, 2025
@sjberman
Collaborator

sjberman commented May 2, 2025

Hi @mathematicalsystems, I'm not sure exactly what's going on here; it seems like a potential DNS or networking issue in the cluster that might be restricting Pod-to-Pod traffic. The nginx configuration itself looks good, so it's not that. And the fact that you can't even curl the Pod from within the nginx Pod shows that nginx itself is not the problem.

Does nslookup work for the full Service name of your backend? <service-name>.<namespace>.svc.cluster.local

@mathematicalsystems
Author

@sjberman

Does nslookup work for the full Service name of your backend? <service-name>.<namespace>.svc.cluster.local

Yes:
$ kubectl exec -it $NGINX_FABRIC_POD_NAME -c nginx -n nginx-gateway -- /bin/sh
/ $ nslookup com-mathematicalsystems-api-authentication-service.mathematicalsystems-spring.svc.cluster.local
Server: 10.96.0.10
Address: 10.96.0.10:53
Name: com-mathematicalsystems-api-authentication-service.mathematicalsystems-spring.svc.cluster.local
Address: 10.107.125.56

@sjberman
Collaborator

sjberman commented May 2, 2025

How about if you send a curl request from within the nginx pod to that Service name?

@mathematicalsystems
Author

How about if you send a curl request from within the nginx pod to that Service name?

$ kubectl exec -it $NGINX_FABRIC_POD_NAME -c nginx -n nginx-gateway -- /bin/sh
/ $ curl http://com-mathematicalsystems-api-authentication-service.mathematicalsystems-spring.svc.cluster.local
curl: (28) Failed to connect to com-mathematicalsystems-api-authentication-service.mathematicalsystems-spring.svc.cluster.local port 80 after 134611 ms: Could not connect to server

@sjberman
Collaborator

sjberman commented May 5, 2025

I'm thinking this must be a Pod networking issue, where the Pods are unable to communicate with each other. I do see that you're using a self-deployed cluster, so maybe the network plugin is not installed properly?

What I can say is that from our point of view (NGINX Gateway Fabric), everything appears to be configured properly.
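
One quick check in this situation: the curl attempts above time out rather than failing with "connection refused", which usually means packets are being silently dropped (typical of a firewall or NetworkPolicy) rather than nothing listening on the port. Listing any NetworkPolicies would surface that, for example:

$ kubectl get networkpolicy --all-namespaces
$ kubectl describe networkpolicy -n mathematicalsystems-spring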

@sindhushiv sindhushiv added the waiting for response Waiting for author's response label May 6, 2025
@mathematicalsystems
Author

@sjberman The cause was a misconfigured NetworkPolicy. That is what was blocking the connection: curl failed without returning any response code that would indicate a network policy was preventing the traffic, and that silent failure is what made me think it was a networking issue or a bug.

I do believe, from my point of view, that Fabric should surface some meaningful response or log message to draw attention to a misconfigured network policy preventing connection to a resource, something like "resource NAMESPACE is not reachable by policy". That would point to which policy is responsible, who created it, and so on.

How did that happen? The person who writes the network policies is separate from the one who implements the routing, and I'm the third person trying to find out why it fails. It is certainly poor management, but it happens. Sorry.
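
For reference, a minimal NetworkPolicy sketch that would permit this traffic, assuming a default-deny ingress policy exists in the backend namespace; the policy name and selector labels are assumptions:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-nginx-gateway        # hypothetical name
  namespace: mathematicalsystems-spring
spec:
  podSelector: {}                       # all pods in the namespace
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: nginx-gateway   # auto-applied namespace label
      ports:
        - protocol: TCP
          port: 19000                   # backend container port from above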

@github-project-automation github-project-automation bot moved this from 🆕 New to ✅ Done in NGINX Gateway Fabric May 6, 2025