!AhUnkatMeayVVxZprp:matrix.org

talos-support

103 Members
Talos.dev product support
2 Servers


Sender | Message | Time
2 Aug 2021
@_slack_taloscommunity_U01Q4R344CU:matrix.orgBranden Cash
In reply to@_slack_taloscommunity_U01Q4R344CU:matrix.org
Yea...I wish I had looked at those devices on older instances to see what permissions it had. Could a kernel version bump have changed what permissions are set on it?
I should be able to just boot on B slot (assuming I'm currently on A) to go back to 0.11.0 (where I was at before upgrading), right?
15:35:23
@_slack_taloscommunity_UGL0YU56H:matrix.organdrey
In reply to@_slack_taloscommunity_U01Q4R344CU:matrix.org
I should be able to just boot on B slot (assuming I'm currently on A) to go back to 0.11.0 (where I was at before upgrading), right?
yes, it should work
15:36:34
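A minimal sketch of the rollback confirmed above, assuming the installed `talosctl` provides the `rollback` subcommand and is already configured with a working talosconfig. The wrapper name `rollback_node` and the address in the usage comment are hypothetical.

```shell
# Hypothetical wrapper: ask one node to switch back to its previous
# installation (the other A/B boot slot).
# Assumes talosctl is on PATH and can already reach the node.
rollback_node() {
    node="$1"
    talosctl rollback --nodes "$node"
}

# Usage (not executed here): rollback_node 10.0.0.10
```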
@_slack_taloscommunity_U01Q4R344CU:matrix.orgBranden Cash
In reply to@_slack_taloscommunity_UGL0YU56H:matrix.org
yes, it should work
👍🏻 thanks...I'll try that in a bit to see what I can figure out.
15:37:52
@_slack_taloscommunity_U01Q4R344CU:matrix.orgBranden Cash
In reply to@_slack_taloscommunity_U01Q4R344CU:matrix.org
👍🏻 thanks...I'll try that in a bit to see what I can figure out.
same thing on 0.11.0. i’ll have to keep digging. just wish i could remember the last time i had checked for hardware transcoding. for most of the stuff we watch it isn’t an issue to do software transcoding on the cpu. but the 4k stuff needs it to prevent buffering and nearly maxing out the cpu
16:11:45
@_slack_taloscommunity_U01KV18KND9:matrix.orgBJ Badyk
if theila shows an error state for a specific server, where is it seeing that?
19:22:07
@_slack_taloscommunity_U01KV18KND9:matrix.orgBJ Badyk
i do indeed have a server that doesn't seem to want to build after its allocation is requested, and it's showing up as in an error state with theila
19:22:49
@_slack_taloscommunity_U01KV18KND9:matrix.orgBJ Badyk
i can't seem to figure out where to find the errors themselves using kubectl. there is nothing in describe server
19:23:05
@_slack_taloscommunity_U01KV18KND9:matrix.orgBJ Badyk
Maybe just Ready: false?
19:24:30
@_slack_taloscommunity_U017CK35MFA:matrix.orgartem
In reply to@_slack_taloscommunity_U01KV18KND9:matrix.org
Maybe just Ready: false?
yeah, error means Ready: false right now
19:30:05
@_slack_taloscommunity_U01KV18KND9:matrix.orgBJ Badyk
In reply to@_slack_taloscommunity_U017CK35MFA:matrix.org
yeah, error means Ready: false right now
thanks artem. where else can i look as to what can cause that?
19:31:05
@_slack_taloscommunity_U017CK35MFA:matrix.orgartem
you can see the other flags if you expand the server item. Theila can help you if server is not accepted. You can accept the server in the context menu under ... button. And if that's not the case, then any deeper investigation may require using kubectl and looking at sidero container's logs
19:33:12
@_slack_taloscommunity_U01KV18KND9:matrix.orgBJ Badyk
In reply to@_slack_taloscommunity_U017CK35MFA:matrix.org
you can see the other flags if you expand the server item. Theila can help you if server is not accepted. You can accept the server in the context menu under ... button. And if that's not the case, then any deeper investigation may require using kubectl and looking at sidero container's logs
ah i think i need to look at the logs...
19:33:43
@_slack_taloscommunity_U01KV18KND9:matrix.orgBJ Badyk
In reply to@_slack_taloscommunity_U01KV18KND9:matrix.org
ah i think i need to look at the logs...
not much useful stuff in the kubectl describe server output
19:34:20
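Looking at the Sidero container's logs, as artem suggests, might look roughly like the sketch below. The namespace `sidero-system`, the deployment `sidero-controller-manager`, and the container name `manager` are assumptions about a default Sidero install, not something stated in this thread; the helper names are hypothetical.

```shell
# Assumed defaults, not confirmed in this thread: Sidero installed in the
# sidero-system namespace, with a sidero-controller-manager deployment
# whose main container is named "manager".
SIDERO_NS="${SIDERO_NS:-sidero-system}"

# Hypothetical helper: print controller logs; extra args go to kubectl.
sidero_logs() {
    kubectl logs -n "$SIDERO_NS" deployment/sidero-controller-manager -c manager "$@"
}

# Hypothetical helper: keep only recent log lines mentioning one server UUID.
sidero_logs_for() {
    sidero_logs --tail=500 | grep -F -- "$1"
}

# Usage (not executed here): sidero_logs_for <server-uuid>
```

Filtering by the server's UUID narrows the output to the one machine that is stuck, which `kubectl describe server` alone does not explain.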
@_slack_taloscommunity_U017CK35MFA:matrix.orgartem
In reply to@_slack_taloscommunity_U01KV18KND9:matrix.org
not much useful stuff in the kubectl describe server output
Makes me feel that we need to add a way to view pods logs in Theila next.
19:36:53
@_slack_taloscommunity_U01KV18KND9:matrix.orgBJ Badyk
In reply to@_slack_taloscommunity_U017CK35MFA:matrix.org
Makes me feel that we need to add a way to view pods logs in Theila next.
that would be cool. im mostly CLI guy, but its nice to have a ui to run to in a pinch
19:37:20
@_slack_taloscommunity_U01KV18KND9:matrix.orgBJ Badyk
In reply to@_slack_taloscommunity_U01KV18KND9:matrix.org
that would be cool. im mostly CLI guy, but its nice to have a ui to run to in a pinch
also being able to view a pie chart or something of available hw based on serverclass would be nice
19:37:58
@_slack_taloscommunity_U01KV18KND9:matrix.orgBJ Badyk
In reply to@_slack_taloscommunity_U01KV18KND9:matrix.org
also being able to view a pie chart or something of available hw based on serverclass would be nice
we have a bunch of different types of hw
19:38:09
@_slack_taloscommunity_UEGUHLTR9:matrix.orgAndrew Rynhard
In reply to@_slack_taloscommunity_U01KV18KND9:matrix.org
we have a bunch of different types of hw
what makes a server "ready"?
19:38:51
@_slack_taloscommunity_UEGUHLTR9:matrix.orgAndrew Rynhard
In reply to@_slack_taloscommunity_UEGUHLTR9:matrix.org
what makes a server "ready"?
clean and accepted
19:38:56
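Since "ready" here means "clean and accepted", the two flags can be checked from the CLI. This is a sketch only: the field paths `.spec.accepted`, `.status.isClean`, and `.status.ready` are assumptions about the Sidero Server resource (verify them with `kubectl get server <id> -o yaml`), and both helper names are hypothetical.

```shell
# Hypothetical helper: print "accepted clean ready" for one Server resource.
# The jsonpath field names are assumptions about the Sidero Server CRD.
server_flags() {
    kubectl get server "$1" \
        -o jsonpath='{.spec.accepted} {.status.isClean} {.status.ready}'
}

# Name whichever precondition is missing, per "clean and accepted" above.
explain_not_ready() {
    flags=$(server_flags "$1")
    accepted=${flags%% *}     # first space-separated field
    rest=${flags#* }          # everything after the first space
    clean=${rest%% *}         # second field
    [ "$accepted" = "true" ] || echo "not accepted"
    [ "$clean" = "true" ] || echo "not clean"
}

# Usage (not executed here): explain_not_ready <server-uuid>
```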
@_slack_taloscommunity_U01KV18KND9:matrix.orgBJ Badyk
In reply to@_slack_taloscommunity_UEGUHLTR9:matrix.org
clean and accepted
this is def a server issue, but nice to know how that error state happens
19:46:44
3 Aug 2021
@_slack_taloscommunity_U01Q4R344CU:matrix.orgBranden Cash
In reply to@_slack_taloscommunity_U01Q4R344CU:matrix.org
same thing on 0.11.0. i’ll have to keep digging. just wish i could remember the last time i had checked for hardware transcoding. for most of the stuff we watch it isn’t an issue to do software transcoding on the cpu. but the 4k stuff needs it to prevent buffering and nearly maxing out the cpu
FYI andrey i’ve put up a feature request issue for this. I’ve got a workaround I can use for now until we can get something more proper figured out. https://github.com/talos-systems/talos/issues/4001
00:58:54
@_slack_taloscommunity_U01Q4R344CU:matrix.orgBranden Cash
In reply to@_slack_taloscommunity_U01Q4R344CU:matrix.org
FYI andrey i’ve put up a feature request issue for this. I’ve got a workaround I can use for now until we can get something more proper figured out. https://github.com/talos-systems/talos/issues/4001
i just realized the actual path to the current rules.d is /usr/etc/udev/rules.d rather than the typical /etc/udev/rules.d. and looks like /usr/etc/udev is an ‘overlay mount on top of the read-only filesystem’. Soooo…with that I had suspected that it was actually R/W…so i tossed a job on that node to write a udev rules file, rebooted the node, and put plex back on that node and voila i’ve got the correct permissions and plex is hardware transcoding again. So really, i think if we just updated whatever that restriction is that only allows the files in the machine config to only write to /var, so that it allows writing to /usr/etc/udev/rules.d , then we should be able to support udev rules via the files in the machine config.
---
apiVersion: batch/v1
kind: Job
metadata:
  name: udev-rules
  namespace: default
spec:
  template:
    spec:
      restartPolicy: Never
      volumes:
        - name: rulesd
          hostPath:
            path: /usr/etc/udev/rules.d
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                - key: kubernetes.io/hostname
                  operator: In
                  values:
                  - talos-192-168-32-4
      containers:
        - name: write-udev-rules
          image: alpine:3.14.0
          resources:
            requests:
              memory: 30Mi
              cpu: 20m
            limits:
              memory: 30Mi
              cpu: 20m
          volumeMounts:
            - mountPath: /usr/etc/udev/rules.d
              name: rulesd
          command:
            - sh
            - -c
            - |-
              set -eux
              echo 'SUBSYSTEM=="drm", KERNEL=="renderD*", GROUP="44", MODE="0660"' > /usr/etc/udev/rules.d/99-intel-gpu.rules
              echo 'SUBSYSTEM=="drm", KERNEL=="card*", GROUP="44", MODE="0660"' >> /usr/etc/udev/rules.d/99-intel-gpu.rules
  backoffLimit: 0
04:32:56
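To confirm after the reboot that the rules took effect, something like the following could be run from any shell that can see the node's `/dev/dri`, for example inside the container that uses the GPU. It is a sketch: the helper names are hypothetical, `stat -c` assumes GNU coreutils or BusyBox, and the expected values (mode `660`, group `44`) are simply the ones written by the Job above.

```shell
# Hypothetical helper: succeed only if the path has the given octal mode and
# numeric group id. Uses the GNU/BusyBox `stat -c` format option.
has_mode_and_group() {
    [ "$(stat -c '%a %g' "$1")" = "$2 $3" ]
}

# Report every DRM node that is present; prints nothing without /dev/dri.
check_dri() {
    for dev in /dev/dri/card* /dev/dri/renderD*; do
        [ -e "$dev" ] || continue
        if has_mode_and_group "$dev" 660 44; then
            echo "$dev: ok"
        else
            echo "$dev: unexpected mode or group"
        fi
    done
}

# Usage (not executed here): check_dri
```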
@_slack_taloscommunity_U01Q4R344CU:matrix.orgBranden Cash
(edited) ... content: :- SUBSYSTEM=="drm", KERNEL=="renderD*", GROUP="video", MODE="0666"``` => ... content: |- SUBSYSTEM=="drm", KERNEL=="renderD*", GROUP="video", MODE="0666"```
05:47:46
@_slack_taloscommunity_U01Q4R344CU:matrix.orgBranden Cash
after finding the relevant code, i realized I can actually still use the files machine config by making the path relative to /var. obviously quite hacky and potential for breakage in the future. The downside here is that the file gets written after udev has already started its thing. So for already present devices (like the intel gpu), it won’t get picked up until you reboot again after having the boot process write the file. ~Although maybe restarting the udev service would allow it to pick it up without having to reboot. I’ll try that next time…for now it’s late and I need to sleep.~ Nope: looks like udevd doesn’t support restart via API
files:
    - path: /var/../usr/etc/udev/rules.d/99-intel-gpu.rules
      permissions: 0o644
      op: create
      content: |-
        SUBSYSTEM=="drm", KERNEL=="card*", GROUP="44", MODE="0660"
        SUBSYSTEM=="drm", KERNEL=="renderD*", GROUP="44", MODE="0660"
05:59:59
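Why the `/var/../` prefix gets through can be sketched as follows, assuming (the message implies this but does not show the code) that the restriction is a literal check that the path starts with `/var`, applied before `..` components are resolved. Both helper names are hypothetical and are not Talos code.

```shell
# Hypothetical stand-in for a naive "must live under /var" check:
# it only looks at the literal start of the string.
has_var_prefix() {
    case "$1" in
        /var/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Collapse "." and ".." components lexically, without touching the filesystem.
normalize_path() {
    out=""
    old_ifs=$IFS
    IFS=/
    set -f                               # no globbing while splitting
    for part in $1; do
        case "$part" in
            ""|.) ;;                     # skip empty and "." components
            ..) out=${out%/*} ;;         # drop the last component
            *) out="$out/$part" ;;
        esac
    done
    set +f
    IFS=$old_ifs
    printf '%s\n' "${out:-/}"
}

p=/var/../usr/etc/udev/rules.d/99-intel-gpu.rules
has_var_prefix "$p" && echo "passes the literal prefix check"
echo "resolves to: $(normalize_path "$p")"
```

A check that normalized the path first would reject it, which is why the workaround may break in a future release.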
@_slack_taloscommunity_UGL0YU56H:matrix.organdrey
In reply to@_slack_taloscommunity_U01Q4R344CU:matrix.org
after finding the relevant code, i realized I can actually still use the files machine config by making the path relative to /var. obviously quite hacky and potential for breakage in the future. The downside here is that the file gets written after udev has already started its thing. So for already present devices (like the intel gpu), it won’t get picked up until you reboot again after having the boot process write the file. ~Although maybe restarting the udev service would allow it to pick it up without having to reboot. I’ll try that next time…for now it’s late and I need to sleep.~ Nope: looks like udevd doesn’t support restart via API
files:
    - path: /var/../usr/etc/udev/rules.d/99-intel-gpu.rules
      permissions: 0o644
      op: create
      content: |-
        SUBSYSTEM=="drm", KERNEL=="card*", GROUP="44", MODE="0660"
        SUBSYSTEM=="drm", KERNEL=="renderD*", GROUP="44", MODE="0660"
That was a cool hack btw, yes, I feel we should expose udev rules as the first class citizens in the machine configuration. Thanks for digging into that!
07:06:23
@_slack_taloscommunity_U01Q4R344CU:matrix.orgBranden Cash
In reply to@_slack_taloscommunity_UGL0YU56H:matrix.org
That was a cool hack btw, yes, I feel we should expose udev rules as the first class citizens in the machine configuration. Thanks for digging into that!
👍 yea…being first class citizens would be awesome 🎉
16:03:48
4 Aug 2021
@_slack_taloscommunity_U01N8S9DANQ:matrix.orgBarnabas Kovacs joined the room. 13:05:22
@_slack_taloscommunity_U02AJG7BBT3:matrix.orgMHS joined the room. 14:03:42
