23 Oct 2020 |
Marko | In reply to @mthompson:matrix.org Is there a way to use Prometheus/Grafana to push an Alert if Kafka external Producer/Consumer connection is down? Yes, there is. You need to use, I think, kafka_exporter to expose Kafka metrics on a Prometheus-format endpoint, then configure Prometheus to scrape that endpoint. Grafana comes in at that point: you can use Grafana to create the alerts, or you could do it in Prometheus too. | 15:52:09 |
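The exporter-then-scrape pipeline described above can be sketched as config. This is a minimal, hypothetical fragment assuming danielqsj/kafka_exporter is running on its default port 9308 on the same host as Prometheus; the job name, threshold, and file layout are illustrative, not a definitive setup.

```yaml
# prometheus.yml (fragment): scrape the kafka_exporter endpoint
scrape_configs:
  - job_name: kafka
    static_configs:
      - targets: ["localhost:9308"]   # kafka_exporter default port

# alerts.yml: fire if the Kafka metrics endpoint stops responding
groups:
  - name: kafka
    rules:
      - alert: KafkaExporterDown
        expr: up{job="kafka"} == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Kafka metrics endpoint is unreachable"
```

Note that `up` only covers the scrape target itself; an alert on producer/consumer connectivity specifically would need a metric your clients or the exporter actually expose (consumer-group lag is a common proxy).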
28 Apr 2021 |
| kunal kothari joined the room. | 06:05:48 |
kunal kothari | hi all, can anyone help me out? I'm having an issue with a Kafka connector | 06:06:57 |
3 May 2021 |
| Doug Whitfield joined the room. | 16:34:44 |
Doug Whitfield | In reply to @shukuu:matrix.org hi all can anyone help me out having issue in kafka connector what is the issue? | 16:35:19 |
Doug Whitfield | In reply to @nakul_riot:matrix.org is there anyway that we can fix this without increasing the ulimit value well, it depends on what you mean by ulimit, and on how the service was started. systemd has some settings, like TasksMax, that don't show up specifically in the ulimit files | 16:36:34 |
Doug Whitfield | I think TasksMax is basically enforced through slices, but I'm by no means an expert | 16:36:50 |
Doug Whitfield | but you could also put some limits on the Kafka side. I'm not sure you'd get the performance you want that way. ulimits are the obvious place to start, as you seem to know | 16:37:37 |
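As a concrete sketch of the systemd side mentioned above: a drop-in override raising TasksMax and the open-file limit for a hypothetical kafka.service unit. The unit name and values are assumptions; tune them to your system.

```ini
# /etc/systemd/system/kafka.service.d/limits.conf
[Service]
TasksMax=8192
LimitNOFILE=100000
```

After writing the drop-in, run `systemctl daemon-reload` and restart the unit; `systemctl show kafka -p TasksMax` confirms the value took effect.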
Doug Whitfield | I have an issue where performance is low. I am seeing stuff like this:
[2021-05-03 12:53:26,871] INFO [GroupMetadataManager brokerId=0] Finished loading offsets and group metadata from __consumer_offsets-28 in 1674 milliseconds, of which 1674 milliseconds was spent in the scheduler.
There is a little bit of RAM pressure on the system, but not enough for me to say for sure RAM is the issue
For comparison, here's a random entry I found trying to search for the issue online:
Finished loading offsets and group metadata from __consumer_offsets-0 in 28 milliseconds, of which 0 milliseconds was spent in the scheduler. | 16:41:01 |
Doug Whitfield | I'm less concerned with the total time than I am with the scheduler taking up so much of the time | 16:41:23 |
Doug Whitfield | I am wondering if they can talk to their zookeeper, though I am not sure how it functions at all without zookeeper since KIP-500 is not yet fully implemented | 17:34:31 |
Doug Whitfield | so, this goes some way to answering the second question: https://github.com/dpkp/kafka-python/issues/308 | 17:50:59 |
4 May 2021 |
Doug Whitfield | how do people feel about using Kafka as a cache? To me, the idea that you're going to have to learn another tool such as Redis is bogus, because you're still going to have to learn how to use Kafka as a cache, but I know people do it and I am sure they have their reasons. I guess with Redis specifically there are licensing issues. There's Infinispan, but I guess that's just Java (not going to pretend like I am a Java expert). What other caching options are there? | 14:33:55 |
Doug Whitfield | seems like there are some use cases where using Kafka as a cache is a decent idea. It's just not really clear to me what those instances are | 15:50:38 |
10 Jun 2021 |
| manujaro joined the room. | 09:44:43 |
23 Jun 2021 |
| @jonbringhurst:matrix.org joined the room. | 23:18:50 |
| @jonbringhurst:matrix.org changed their display name from jonbringhurst to jon. | 23:31:24 |
3 Aug 2021 |
| mr.mike joined the room. | 15:52:16 |
12 Aug 2021 |
| @rohanrode:matrix.org joined the room. | 04:20:18 |
@rohanrode:matrix.org | hi | 04:20:38 |
| @rohanrode:matrix.org left the room. | 04:21:49 |
3 Sep 2021 |
| Suriyadeva Sasi joined the room. | 09:03:59 |
14 Sep 2021 |
| Leonan Ferreira joined the room. | 00:53:27 |
24 Sep 2021 |
Doug Whitfield |
- for a topic with 3 partitions, there is a broker lead for each partition and a single publisher (round-robin) will publish to all three broker leaders. Correct?
- for a topic with 3 partitions and three publisher instances running, each publisher would use all three broker leaders. The three publisher instances wouldn’t be aligned specifically to one of the three broker leaders. Correct?
- Rather than developing a model for dividing the work among publishers, wouldn’t the publisher offset internal topic manage that? It is supposed to help the publisher understand where it is in publishing events and where to start looking for the next event to publish. This would only work for a three-publisher model if the publishers shared the internal publisher offset topic. I’m assuming that isn’t the case, and that each publisher instance would have its own internal offset topic?
| 13:28:11 |
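The first two bullets above can be illustrated with a toy model of how the default Kafka partitioner spreads records: keyless records rotate round-robin across all partition leaders, while keyed records hash to a stable partition. This is a simplified sketch, not the actual client implementation (modern Java clients use a "sticky" variant for keyless records).

```python
# Toy model of default Kafka partitioning: round-robin for keyless
# records, stable hashing for keyed records.

def choose_partition(key, num_partitions, counter):
    """Return (partition, new_counter) for one record."""
    if key is None:
        # keyless: rotate across all partition leaders in turn
        return counter % num_partitions, counter + 1
    # keyed: same key always lands on the same partition
    return hash(key) % num_partitions, counter

counter = 0
partitions = []
for _ in range(6):
    p, counter = choose_partition(None, 3, counter)
    partitions.append(p)
print(partitions)  # [0, 1, 2, 0, 1, 2] - one producer hits all three leaders
```

This matches the intuition in the questions: a single publisher talks to all three partition leaders, and three publisher instances each do the same rather than pairing off with one leader.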
28 Sep 2021 |
Doug Whitfield | Regarding 3, aka "Rather than developing a model for dividing the work among publishers, wouldn’t the publisher offset internal topic manage that? it is supposed to help the publisher understand where it is in publishing events and where to start looking for the next event to publish. This would only work for a three publisher model if the publishers shared the internal publisher offset topic. I’m assuming that isn’t the case? each publisher instance would have its own internal offset topic?"
After some research, I responded with this:
"If you’re referring to the idea that a publisher can store its current offset in a Kafka topic, and coordinate with other publishers working in tandem with it by relying on the committed topic offset, I’d like to clarify some terms first. If a producer is also a consumer, the consumer facet or nature of this worker/thread/app/route is what contains the offset. If you were consuming files from a network share, for example, you might move them to a hidden folder or delete them when you were done consuming them for publication onto a kafka topic, a simple but effective way of marking which files have been consumed and published and which are ready for consumption. In Kafka, the offset refers to the latest available message in the topic. So here, I’m assuming you mean your producer is also a kafka consumer, and is doing some work like enrichment before publishing the completed work to another topic for further processing.
If you want to coordinate between consumer-producers in this way, I would create a single topic with a single message type, but publish messages from each consumer-producer with a header stating which one is which. Then, on startup of the consumer-producer, you can consume a message first from the offset topic, filter for the header to determine which offset message belongs to the individual worker, and then start a stream at the offset it acquires."
Now the customer has responded with this:
"I don’t think I did a good job presenting point 3 to you. This was in the context of a Kafka Connect Publisher (Source). I didn’t do a lot of homework on this, but the idea is that Kafka Connect keeps track of where the publisher is in publishing from a source (database). https://docs.confluent.io/platform/6.2.1/connect/javadocs/javadoc/org/apache/kafka/connect/source/SourceRecord.html
I think the response to point 3 was in regards to the standard Java Publisher API and not considering connect. Can I get an alternative response considering Kafka Connect Publishers?"
Any thoughts? | 14:34:47 |
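On the Kafka Connect angle in the customer's follow-up: per the SourceRecord javadoc linked above, each record carries a sourcePartition map (which source stream, e.g. a table) and a sourceOffset map (position within it). Connect persists these to its own internal offset topic and hands the last committed offset back to the task on restart, so individual publishers do not manage their own offset topics. A rough Python sketch of that bookkeeping follows; the names and record shapes are illustrative, since real Connect does this in Java via OffsetStorageReader.

```python
# Sketch of Connect-style source offset tracking. offset_store stands in
# for Connect's internal offset topic; keys are sourcePartition maps.

offset_store = {}

def commit(source_partition, source_offset):
    # Connect keys committed offsets by the sourcePartition map
    offset_store[tuple(sorted(source_partition.items()))] = source_offset

def last_offset(source_partition):
    # roughly what OffsetStorageReader.offset() returns on task restart
    return offset_store.get(tuple(sorted(source_partition.items())))

# a JDBC-style source task emitting rows from a table
commit({"table": "orders"}, {"incrementing_id": 41})
commit({"table": "orders"}, {"incrementing_id": 42})
print(last_offset({"table": "orders"}))  # {'incrementing_id': 42}
```

Under this model, the earlier answer about a hand-rolled shared offset topic doesn't apply: a Connect source task just returns the right sourcePartition/sourceOffset maps and lets the framework do the coordination.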
Doug Whitfield | Do people understand the question? | 21:46:16 |
16 Dec 2021 |
| @dlamanov:matrix.org joined the room. | 14:51:07 |
| @dlamanov:matrix.org left the room. | 14:52:41 |
21 Dec 2021 |
| @hekmek:matrix.org joined the room. | 11:29:22 |
27 Dec 2021 |
| @hekmek:matrix.org left the room. | 21:22:19 |