The test environment was as follows:
```
CPU: Intel L5640 2.26 GHz 6 cores * 2 EA
Memory: SAMSUNG PC3-10600R 4 GB * 4 EA
HDD: TOSHIBA SAS 10,000 RPM 300 GB * 6 EA
OS: CentOS release 6.6 (Final)
Logstash 2.3.4
```
I used the following configuration:
```
input {
  kafka {
    zk_connect => '1.2.3.4:2181'
    topic_id => 'some-log'
    consumer_threads => 1
  }
}

filter {
  metrics {
    meter => "events"
    add_tag => "metric"
  }
}

output {
  if "metric" in [tags] {
    stdout {
      codec => line {
        format => "Count: %{[events][count]}"
      }
    }
  }
}
```
I got the following result:
```
./bin/logstash -f some-log-kafka.conf
Settings: Default pipeline workers: 24
Pipeline main started
Count: 9614
Count: 23080
Count: 37087
Count: 50815
Count: 64517
Count: 78296
Count: 91977
Count: 105990
```
The `metrics` filter's default `flush_interval` is 5 seconds, so this works out to roughly 14K events per 5 seconds (about 2.8K per second).
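The per-second rate can be derived from the differences between consecutive counts, since each line is a cumulative total flushed every 5 seconds:

```python
# Cumulative counts printed by the metrics filter above.
counts = [9614, 23080, 37087, 50815, 64517, 78296, 91977, 105990]

# Events per 5-second flush interval.
deltas = [b - a for a, b in zip(counts, counts[1:])]

interval = 5  # metrics filter default flush_interval in seconds
avg = sum(deltas) / len(deltas)
print(f"avg per interval: {avg:.0f}, per second: {avg / interval:.0f}")
# avg per interval: 13768, per second: 2754
```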
With `consumer_threads` set to 10, I got the following result:
```
./bin/logstash -f some-log-kafka.conf
Settings: Default pipeline workers: 24
Pipeline main started
Count: 9599
Count: 23254
Count: 37253
Count: 51029
Count: 64881
Count: 78868
Count: 92663
Count: 106267
```
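For reference, the only change from the first configuration was the thread count in the `kafka` input:

```
input {
  kafka {
    zk_connect => '1.2.3.4:2181'
    topic_id => 'some-log'
    consumer_threads => 10
  }
}
```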
It looks like increasing `consumer_threads` doesn't make much difference.
Based on a benchmark of a simple no-op consumer I built with the Kafka client Java library on the same machine, I expected around 30K events per second, and at least 10K, but this is only about a tenth of that.
I'm not sure whether this could be improved by tuning the configuration.
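For what it's worth, besides the input's `consumer_threads` (tried above), the knobs I'd expect to matter are the pipeline flags. The values below are only illustrative, and I haven't verified that they help:

```
# Illustrative only: Logstash 2.x pipeline flags.
# -w (--pipeline-workers) sets the number of filter/output worker threads,
# -b (--pipeline-batch-size) sets the number of events per worker batch.
./bin/logstash -f some-log-kafka.conf -w 24 -b 500
```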
As a baseline test, I ran Logstash with the `generator` input as follows:
```
input {
  generator { }
}

filter {
  metrics {
    meter => "events"
    add_tag => "metric"
  }
}

output {
  #stdout { }
  if "metric" in [tags] {
    stdout {
      codec => line {
        format => "Count: %{[events][count]}"
      }
    }
  }
}
```
I got the following result:
```
./bin/logstash -f generator.conf
Settings: Default pipeline workers: 24
Pipeline main started
Count: 200584
Count: 424425
Count: 651640
Count: 881605
Count: 1110150
```
This works out to roughly 220K events per 5 seconds (44K per second). It's not as good as I expected, given that my simple no-op consumer built with the Kafka client Java library consumed 30K to 50K messages per second.
What am I missing here?
References:
https://www.elastic.co/guide/en/logstash/current/plugins-filters-metrics.html
http://izeye.blogspot.kr/2016/07/benchmark-simple-no-op-kafka-consumer.html