!jUsknEqNdJobySpXuN:matrix.org

IC Node Providers

213 Members
The objective of this room is to facilitate communication not only between DFINITY and Node Providers, but also among Node Providers themselves and those who want to become Node Providers.
3 Servers

Sender Message Time
21 Jun 2024
@gorazd_o:matrix.org Gorazd Ocvirk here's also the version of the updated firmware that we're using, if that's of any help 08:15:59
@gorazd_o:matrix.org Gorazd Ocvirk the image installation is successful but no node can register 08:17:04
@sasa-tomic:matrix.org Saša Tomić [DFINITY] Hi Gorazd. The message is actually pretty clear in this case -- the node was still in a subnet when you wiped it and redeployed. AFAIK firmware upgrades do not require the machine to be reinstalled. 08:55:22
@sasa-tomic:matrix.org Saša Tomić [DFINITY]

eerokelly checked these few machines that you mentioned, and shared the following:

  • dztkl - Online, cannot re-join, previous record still in a subnet
  • 3zbn7 - Online, cannot re-join, previous record still in a subnet
  • eulny - It appears this node installed, upgraded successfully, then upgraded successfully again due to the latest release. It then goes offline without notice. We're still looking into it.

Can you please share any information (such as a screenshot) from eulny?

08:57:07
@gorazd_o:matrix.org Gorazd Ocvirk image.png
09:48:30
@gorazd_o:matrix.org Gorazd Ocvirk hey eerokelly, this is a screenshot from eulny from yesterday 09:49:07
@gorazd_o:matrix.org Gorazd Ocvirk we observed the same for eulny: it joined and then went down 09:51:46
@andrewbattat:matrix.org andrewbattat [DFINITY]

Sorry you're still having issues. Thank you for trying to redeploy without IPv4.

Have you tried restarting everything you can between the node machine and the internet (router, firewall, etc.)? If you haven't, can you try to? Doing this solves a great number of the networking issues we see.

After you do so, you can try to restart the node (restart, not redeploy), and that may solve your issue.

We are also working on a follow-up to get more diagnostic information into your node if this doesn't resolve your issue, but this wouldn't be included until the next release or the one after.

17:30:40
22 Jun 2024
@alexnod:matrix.org Oleksandr How can I solve the IC_PrometheusTargetMissing problem? 10:46:45
24 Jun 2024
@gorazd_o:matrix.org Gorazd Ocvirk Saša Tomić [DFINITY]: ok, we'll only update firmware from now on. What do we do with the 3 that are currently down? 06:40:33
@alexnod:matrix.org Oleksandr Last time I solved the problem by reinstalling IC-OS, and I would hate to do it again. What other options are there, and why can this happen? 10:45:33
@sasa-tomic:matrix.org Saša Tomić [DFINITY] Oleksandr: sorry, but you're going to need to share more details. Also, do you see anything on the BMC / screen? 13:39:26
@sasa-tomic:matrix.org Saša Tomić [DFINITY]

Gorazd Ocvirk: another NP reported success deploying nodes with the following image:

We're using the following image: https://download.dfinity.systems/ic/246d0ce0784d9990c06904809722ce5c2c816269/setup-os/disk-img/disk-img.tar.gz

13:41:00
@alexnod:matrix.org Oleksandr 2024-06-24 .png
16:41:08
@alexnod:matrix.org Oleksandr
In reply to @sasa-tomic:matrix.org
Oleksandr: sorry but you're going to need to share more details. Also, do you see anything on the BMC / screen?

According to the logs, everything seems to be stable, but it's not.

I've had this before: I reinstalled IC-OS, the node worked for 20 days, and now the IC_PrometheusTargetMissing problem is back.

16:44:04
@rzakrzyk:matrix.org Radek Zakrzyk [dfinity]

Hi Oleksandr, could you please verify the routing with your ISP toward our prefix 2602:fb2b:100::/48?
It looks like there is a routing loop when going from SF1 to Warsaw over AS20853 (2a00:c90::/32, eTOP sp. z o.o.).

rzakrzyk@sf1-netmgr:~$ traceroute6 -I 2a00:c90:0:7:6801:9dff:fea1:f2a8
traceroute to 2a00:c90:0:7:6801:9dff:fea1:f2a8 (2a00:c90:0:7:6801:9dff:fea1:f2a8), 30 hops max, 80 byte packets
 1  2602:fb2b:100:10::1 (2602:fb2b:100:10::1)  0.284 ms  0.253 ms  0.299 ms
 2  2602:fb2b:100:1::5 (2602:fb2b:100:1::5)  0.491 ms * *
 3  * * *
 4  * 2604:980:8001:120::1 (2604:980:8001:120::1)  3.015 ms  2.986 ms
 5  sjo-b23-link.ip.twelve99.net (2001:2035:0:11a2::1)  3.179 ms  3.335 ms  3.369 ms
 6  nyk-bb2-v6.ip.twelve99.net (2001:2034:1:b8::1)  72.739 ms  72.585 ms  72.551 ms
 7  ldn-bb1-v6.ip.twelve99.net (2001:2034:1:7a::1)  145.835 ms  146.494 ms  146.465 ms
 8  hbg-bb3-v6.ip.twelve99.net (2001:2034:1:6f::1)  159.655 ms  159.649 ms  159.642 ms
 9  * * *
10  metroport-svc071681-ic358236.ip.twelve99-cust.net (2001:2000:3080:4b2::2)  173.791 ms  173.784 ms  173.767 ms
11  2a00:c90::9 (2a00:c90::9)  169.554 ms  169.544 ms  169.481 ms
12  2a00:c90::54 (2a00:c90::54)  188.731 ms *  169.528 ms
13  2a00:c90::9 (2a00:c90::9)  169.494 ms  169.572 ms  169.535 ms
14  2a00:c90::54 (2a00:c90::54)  169.837 ms  169.813 ms  169.893 ms
15  2a00:c90::9 (2a00:c90::9)  169.724 ms  169.784 ms  169.798 ms
16  2a00:c90::54 (2a00:c90::54)  170.269 ms  170.289 ms  170.340 ms
17  2a00:c90::9 (2a00:c90::9)  169.890 ms  169.649 ms  169.615 ms
18  2a00:c90::54 (2a00:c90::54)  174.230 ms  174.189 ms  174.172 ms
19  * * *
20  2a00:c90::54 (2a00:c90::54)  174.072 ms  173.749 ms  173.702 ms
21  * 2a00:c90::9 (2a00:c90::9)  169.713 ms  169.682 ms
22  2a00:c90::54 (2a00:c90::54)  170.113 ms  170.114 ms  170.214 ms
23  2a00:c90::9 (2a00:c90::9)  169.980 ms * *
24  2a00:c90::54 (2a00:c90::54)  170.310 ms  170.277 ms  170.068 ms
25  2a00:c90::9 (2a00:c90::9)  170.032 ms  170.021 ms  169.958 ms
26  2a00:c90::54 (2a00:c90::54)  170.147 ms  170.113 ms  169.987 ms
27  2a00:c90::9 (2a00:c90::9)  169.907 ms  169.970 ms  169.977 ms
28  2a00:c90::54 (2a00:c90::54)  174.835 ms  174.803 ms  174.314 ms
29  2a00:c90::9 (2a00:c90::9)  169.984 ms  170.040 ms  170.056 ms
30  2a00:c90::54 (2a00:c90::54)  170.481 ms  170.453 ms  170.416 ms
18:21:59
@rzakrzyk:matrix.org Radek Zakrzyk [dfinity]

For comparison, here is the working path from FR1:

rzakrzyk@fr1-netmgr:~$ traceroute6 -I 2a00:c90:0:7:6801:9dff:fea1:f2a8
traceroute to 2a00:c90:0:7:6801:9dff:fea1:f2a8 (2a00:c90:0:7:6801:9dff:fea1:f2a8), 30 hops max, 80 byte packets
 1  _gateway (2602:fb2b:110:10::1)  0.254 ms  0.223 ms  0.211 ms
 2  2602:fb2b:110:1::1 (2602:fb2b:110:1::1)  0.401 ms * *
 3  * 2a0b:21c0:1002:15::1 (2a0b:21c0:1002:15::1)  12.262 ms *
 4  * * *
 5  2404:ff40:1:151::1 (2404:ff40:1:151::1)  0.456 ms  0.479 ms  0.526 ms
 6  2404:ff40:1:252::1 (2404:ff40:1:252::1)  0.406 ms  0.652 ms  0.623 ms
 7  2404:ff40:1:38::2 (2404:ff40:1:38::2)  4.755 ms  1.431 ms  1.406 ms
 8  as57463.v6.netix.net (2001:67c:29f0::5:7463:168)  0.553 ms  0.543 ms  0.558 ms
 9  2001:7f8:60::2:853:1 (2001:7f8:60::2:853:1)  17.440 ms  17.439 ms  17.394 ms
10  2a00:c90::49 (2a00:c90::49)  17.179 ms  17.159 ms  17.122 ms
11  2a00:c90:0:7:6801:9dff:fea1:f2a8 (2a00:c90:0:7:6801:9dff:fea1:f2a8)  17.631 ms * *
18:24:40
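The difference between the two traces can also be checked mechanically. The following is a minimal sketch, not a DFINITY tool: it assumes the input is plain traceroute output in the format pasted above, and the function names `hop_addresses` and `repeated_hops` are hypothetical. It reads the address in parentheses from each hop line and reports addresses that appear on several hops, which is what the alternating 2a00:c90::9 / 2a00:c90::54 pattern in the SF1 trace looks like.

```python
import re
from collections import Counter
from typing import List, Optional

# First parenthesised address on a hop line, e.g. "(2a00:c90::9)".
HOP_ADDRESS = re.compile(r"\(([0-9A-Fa-f:.]+)\)")


def hop_addresses(traceroute_output: str) -> List[Optional[str]]:
    """Return one address per hop line, or None for hops showing only '* * *'.

    Assumption: every line that starts with a digit is a hop line, so the
    input should contain traceroute output only (no chat timestamps).
    """
    hops: List[Optional[str]] = []
    for raw_line in traceroute_output.splitlines():
        line = raw_line.strip()
        if not line or not line[0].isdigit():
            continue  # skip the shell prompt, the header line, and blanks
        match = HOP_ADDRESS.search(line)
        hops.append(match.group(1) if match else None)
    return hops


def repeated_hops(hops: List[Optional[str]], min_repeats: int = 3) -> List[str]:
    """Return addresses seen on at least min_repeats hops (a likely loop)."""
    counts = Counter(addr for addr in hops if addr is not None)
    return sorted(addr for addr, seen in counts.items() if seen >= min_repeats)


# Hops 10 to 16 copied from the SF1 trace above.
SF1_EXCERPT = """\
10  metroport-svc071681-ic358236.ip.twelve99-cust.net (2001:2000:3080:4b2::2)  173.791 ms  173.784 ms  173.767 ms
11  2a00:c90::9 (2a00:c90::9)  169.554 ms  169.544 ms  169.481 ms
12  2a00:c90::54 (2a00:c90::54)  188.731 ms *  169.528 ms
13  2a00:c90::9 (2a00:c90::9)  169.494 ms  169.572 ms  169.535 ms
14  2a00:c90::54 (2a00:c90::54)  169.837 ms  169.813 ms  169.893 ms
15  2a00:c90::9 (2a00:c90::9)  169.724 ms  169.784 ms  169.798 ms
16  2a00:c90::54 (2a00:c90::54)  170.269 ms  170.289 ms  170.340 ms
"""

print(repeated_hops(hop_addresses(SF1_EXCERPT)))
```

The threshold of three repeats is an arbitrary choice for this sketch; a trace that reaches its destination, like the FR1 one, should report no repeated addresses.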
@gorazd_o:matrix.org Gorazd Ocvirk ty, will try 19:14:16
@computerbutler:matrix.org Chris Butler joined the room. 20:48:24
@computerbutler:matrix.org Chris Butler Greetings all. I'm curious, do nodes with "awaiting" status earn the same amount of rewards as nodes with "active" status? 20:52:43
@alexnod:matrix.org Oleksandr Yes 22:22:29
25 Jun 2024
@novisystems:matrix.org NoviSystems *

andrewbattat [DFINITY]: Yes, everything has been previously restarted (2 Cisco switches and a Netgate router). I tried the install again with today's IC release (IPv6 only again). I then restarted everything again (node included), and it is still not joining despite a successful installation. It is also worth mentioning that while these devices are behind a firewall, traffic to and from the node subnet is wide open and currently working fine for the two previously set up nodes.

I exported the private key again and verified that the key's principal matches the operator ID. I also made sure the BIOS options and BIOS version are the same as on the current working nodes, to further try to eliminate discrepancies. Without more info I don't think there's much else I can do on my own.

02:52:24
@sasa-tomic:matrix.org Saša Tomić [DFINITY]

Oleksandr: I just checked. The node isn't accessible from San Francisco (where we have one observability cluster), but it is accessible from Chicago (where we have another observability cluster) and Frankfurt (yet another one).
As a short-term tactical measure (next 1-2 days), we'll try to suppress the alerts from San Francisco.
As a strategic measure:

  • we will ask the San Francisco ISP to check the connectivity to your DC
  • Oleksandr, please also reach out to your ISP to check the connectivity from the IPv6 prefix of our San Francisco DC, 2602:fb2b:100::/48, from which we can't connect.
11:31:07
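The effect of suppressing one vantage point can be illustrated with a small sketch. This is not DFINITY's monitoring logic: the rule used here (a node counts as healthy only if every considered observability cluster can reach it) is an assumption, and the function name and site labels are hypothetical.

```python
from typing import Dict, FrozenSet


def node_is_healthy(
    reachable_from: Dict[str, bool],
    ignored_sites: FrozenSet[str] = frozenset(),
) -> bool:
    """Assumed rule: healthy only if all non-ignored sites can reach the node."""
    considered = {
        site: ok for site, ok in reachable_from.items() if site not in ignored_sites
    }
    # With no sites left to consult there is no evidence of health.
    return bool(considered) and all(considered.values())


# Readings as described in the message above (site labels are illustrative).
readings = {"san-francisco": False, "chicago": True, "frankfurt": True}

print(node_is_healthy(readings))
print(node_is_healthy(readings, ignored_sites=frozenset({"san-francisco"})))
```

Ignoring a site only changes what the health check reports; the missing connectivity from that site is still there and still needs to be fixed at the routing level.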
@sasa-tomic:matrix.org Saša Tomić [DFINITY] Oleksandr: We just applied the short-term tactical measure by not using the SF1 readings for evaluating node health. The node again shows up as healthy in the public dashboard, although it STILL DOES NOT HAVE connectivity from SF1.
It would be appreciated if you could reach out to your ISP and try to restore the connectivity.
11:34:13
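When raising this with an ISP, it can help to state exactly which addresses fall inside the prefixes named in this thread. A minimal sketch using Python's standard `ipaddress` module follows; the prefixes and sample addresses are copied from the messages and traces above, while the constant names and the helper `in_prefix` are hypothetical.

```python
import ipaddress

# Prefixes quoted in the thread.
SF1_PREFIX = ipaddress.ip_network("2602:fb2b:100::/48")   # DFINITY San Francisco DC
ETOP_PREFIX = ipaddress.ip_network("2a00:c90::/32")       # AS20853 eTOP sp. z o.o.


def in_prefix(address: str, prefix: ipaddress.IPv6Network) -> bool:
    """Return True if the address lies inside the given IPv6 prefix."""
    return ipaddress.ip_address(address) in prefix


# First hop of the SF1 trace, first hop of the FR1 trace, and the node itself.
print(in_prefix("2602:fb2b:100:10::1", SF1_PREFIX))
print(in_prefix("2602:fb2b:110:10::1", SF1_PREFIX))
print(in_prefix("2a00:c90:0:7:6801:9dff:fea1:f2a8", ETOP_PREFIX))
```

The FR1 source address sits in 2602:fb2b:110::/48, not in the SF1 prefix, which is consistent with only the San Francisco path being affected.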

Room Version: 9