WinterFlow.io - Netdata v2.1.0 release notes

Netdata Growth
Summary
Highlights
Acknowledgments
Contributions
Deprecation notice
Support options

Netdata Growth

1.5 million downloads per day
72.6k GitHub stars!
651M Docker Hub pulls!

Netdata continues to experience phenomenal growth, with over 1.5 million downloads daily through Cloudflare and Docker Hub, fueling observability for users worldwide.

Thanks to your unwavering support ❤️, Netdata leads the observability category of the CNCF landscape in GitHub stars, ahead of all other solutions, including Elasticsearch, Grafana, and Prometheus. This reflects the trust and admiration of our community.

This success drives rapid adoption among enterprises, reflecting the growing recognition of Netdata as the go-to observability solution for both cloud-native and on-premises environments. Our commitment remains steadfast: to deliver cutting-edge, AI-powered observability with unmatched performance and simplicity—all while being significantly more affordable.

As we evolve, our focus on empowering businesses with higher-fidelity AI insights ensures Netdata remains the easiest and fastest way to optimize infrastructure and applications at any scale. 🚀

You like Netdata? Give Netdata a ⭐ too, on GitHub!

Release Summary

This release focuses heavily on streaming functionality, enabling unprecedented scalability, reduced CPU overhead, and optimized memory utilization. Netdata has been re-architected to meet the demands of enterprise environments while maintaining its hallmark ease of use and affordability.


Release Highlights

Major Performance and Scalability Improvements

This release significantly enhances Netdata’s performance and streaming capabilities, with particular focus on multi-parent infrastructures:

Optimized CPU Usage: Streamlined ML model distribution and improved thread management reduce CPU utilization by 30–50% in parent-child setups.
Smarter Memory Management: New features prevent out-of-memory situations while maximizing cache usage for better query performance.
Enhanced Multi-Parent Scalability: Improved load balancing and connection handling for more stable operation at scale.
Optimized Query Processing: Prioritized handling of user queries ensures responsive experience even under heavy load.
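
The query prioritization described above can be pictured as a priority queue. This is a minimal sketch, assuming three hypothetical priority classes (user queries, connection operations, background tasks). The `Priority` and `TaskQueue` names are illustrative and are not Netdata's internal API.

```python
import heapq
import itertools
from enum import IntEnum


class Priority(IntEnum):
    # Lower value is served first. These classes are illustrative assumptions.
    USER_QUERY = 0
    CONNECTION = 1
    BACKGROUND = 2  # e.g. replication, ML training


class TaskQueue:
    """Serves higher-priority work first, FIFO within the same priority."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker that keeps insertion order

    def push(self, priority, name):
        heapq.heappush(self._heap, (priority, next(self._counter), name))

    def pop(self):
        _, _, name = heapq.heappop(self._heap)
        return name

    def __len__(self):
        return len(self._heap)


def drain(queue):
    """Return task names in the order they would be served."""
    order = []
    while len(queue):
        order.append(queue.pop())
    return order
```

In this sketch a user query enqueued after background work is still served first, which is the behaviour the release notes describe at a high level. The real scheduler may work quite differently.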

Detailed Technical Improvements:

Category	Feature	Benefit
CPU Optimization	ML Model Streaming	• ML models now stream between Netdata Agents alongside metric data • Options for edge or central ML training • 30-50% CPU reduction in parent-child setups • Note: Next major version will disable ML training on children by default
	Thread Management	• Streaming threads fixed to match CPU cores • Single thread handles ingestion and re-streaming per node • Reduced context switches and cross-CPU communication
Memory Management	Out-of-Memory Protection	• Dynamic cache adjustment maintains 10% system memory buffer (max 5 GiB) • Container-aware (supports cgroups v1 and v2) • Configurable via `[db].dbengine out of memory protection`
	Cache Optimization	• Option to utilize all available memory for caching • Reduced disk I/O on busy parent nodes • Enable with `[db].dbengine use all ram for caches`
	ML Training Management	• Dynamic queue management prevents memory overload • Consistent performance during heavy ML workloads
Scalability	Parent Cluster Load Distribution	• Random parent selection for load distribution • Prevents single-node bottlenecks in large deployments
	Connection Handling	• Randomized reconnection timing • Prevents connection floods • Smoother large-scale reconnect handling
Query Performance	Query Prioritization	• Immediate response to user queries under any load • Connection operations get secondary priority • Background tasks (replication, ML) yield to high-priority operations • Quick new node integration through expedited backfilling
	Real-Time Response	• Responsive user experience during heavy processing • Efficient concurrent query handling • Maintains performance during high-load background operations
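
As a rough illustration of the out-of-memory protection row in the table, the sketch below computes a reserved buffer of 10% of system memory, capped at 5 GiB. The function names, and the reading that the cap applies to the buffer itself, are assumptions. The actual Agent logic may differ.

```python
GIB = 1024 ** 3


def oom_protection_buffer(total_memory_bytes, percent=10, cap_bytes=5 * GIB):
    """Memory to keep free: a percentage of total memory, capped at cap_bytes."""
    if total_memory_bytes < 0:
        raise ValueError("total memory must be non-negative")
    return min(total_memory_bytes * percent // 100, cap_bytes)


def cache_budget(total_memory_bytes, used_by_others_bytes):
    """Upper bound for caches once the protection buffer is left free."""
    buffer_bytes = oom_protection_buffer(total_memory_bytes)
    return max(0, total_memory_bytes - used_by_others_bytes - buffer_bytes)
```

Under these assumptions a 16 GiB host keeps about 1.6 GiB free, while a 128 GiB host hits the 5 GiB cap. In a container, the total would be the cgroup limit rather than the host's physical memory.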

Cloud: Automated Room Assignment with Label-Based Rules

Netdata Cloud Dashboard introduces rule-based room assignment for nodes, a powerful new feature that transforms how you organize your infrastructure monitoring:

Dynamic Room Assignment: Nodes are automatically placed into relevant rooms based on their host labels, eliminating manual organization.
Rule-Based Management: Create flexible rules using host labels to define where nodes belong, ensuring consistent organization.
Scale-Ready Architecture: As your infrastructure grows, new nodes are automatically sorted into appropriate rooms, maintaining clean monitoring structure.
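
The idea of matching host labels against rules can be sketched as follows. This is only an illustration: the rule format, room names, and label keys are hypothetical and do not reflect how Netdata Cloud stores or evaluates its rules.

```python
def matches(rule_labels, host_labels):
    """A rule matches when every label it requires is present with the same value."""
    return all(host_labels.get(key) == value for key, value in rule_labels.items())


def assign_rooms(host_labels, rules):
    """Return the sorted list of rooms whose rules match the node's host labels."""
    return sorted({rule["room"] for rule in rules if matches(rule["labels"], host_labels)})


# Hypothetical rules: room names and label keys are made up for illustration.
RULES = [
    {"room": "production", "labels": {"env": "prod"}},
    {"room": "databases", "labels": {"role": "db"}},
    {"room": "prod-databases", "labels": {"env": "prod", "role": "db"}},
]
```

A node labeled `env=prod` and `role=db` would land in all three rooms here, and a node with no matching labels would land in none.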


Cloud: Configurable Alert Repeat Notifications

Netdata Cloud enhances alert management with customizable notification repeats:

Custom Repeat Intervals: Set how often you want to be reminded about ongoing alerts for each notification channel.
Automated Follow-ups: Receive automatic notification repeats for unresolved alerts based on your specified timeframe.
Channel-Specific Settings: Configure different repeat frequencies for each integration to match your workflow.
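
To make the per-channel repeat logic concrete, here is a small sketch that decides whether a reminder is due. The channel names, the intervals, and the `repeat_due` helper are assumptions for illustration. They are not Netdata Cloud settings or APIs.

```python
from datetime import datetime, timedelta

# Hypothetical per-channel repeat intervals; None disables repeats.
REPEAT_INTERVALS = {
    "email": timedelta(hours=24),
    "slack": timedelta(hours=1),
    "pagerduty": None,
}


def repeat_due(channel, last_sent, now, resolved=False):
    """True when an unresolved alert should be re-sent on this channel."""
    interval = REPEAT_INTERVALS.get(channel)
    if resolved or interval is None:
        return False
    return now - last_sent >= interval
```

Each channel is evaluated independently, so a chat integration can repeat hourly while email repeats once a day.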


Cloud: Pin Your Essential Charts with Dashboard Favorites

Netdata Cloud Dashboard introduces favorites pinning for faster access to your critical monitoring views:

One-Click Pinning: Select and pin your most important charts and sections directly from the dashboard.
Quick-Access Organization: Pinned items appear at the top of your Table of Contents for instant visibility.


Dynamic Configuration: Bulk Operations for Collectors and Alerts

Dynamic Configuration in Netdata now supports bulk operations on monitoring settings. You can perform the following operations on multiple collector jobs and health checks at once:

Enable/Disable
Restart
Delete

Acknowledgments

@orisano for removing a duplicated row in logging readme.

Contributions

Collectors

Improvements

Add dyncfg support for Virtual Nodes (go.d.plugin) (#19205, #19207, #19214, #19238 @ilyam8)
Add monitoring of /run/reboot-required (proc.plugin) (#19109, @ilyam8)
Add “force_http2” option to collectors that use HTTP for metrics collection (go.d.plugin) (#19047, @ilyam8)
Add support for checking full chain expiry time (go.d/x509check) (#19001, #19004 @ilyam8)
Add data collection status chart and alert (go.d.plugin) (#18981, #18989, #18990 @ilyam8)
Add cluster support for RabbitMQ collector (go.d/rabbitmq) (#18965, #18972, #18976 @ilyam8)

Bug fixes

Prevent connection leak when Ping fails (go.d/mongodb) (#19232, @ilyam8)
Properly release file locks during service reload (go.d.plugin) (#19153, #19154 @ilyam8)
Handle “HPE Smart Array” line in HPSSA collector (go.d/hpssa) (#19084, @ilyam8)
Handle missing sysName gracefully in SNMP collector (go.d/snmp) (#18970, @ilyam8)

Other

Add MegaCli64 to ndsudo (#19223, @ilyam8)
Code refactor for simplicity (go.d.plugin) (#19143, #19145, #19146, #19155 @ilyam8)
Minor Hyper-V fixes (windows/hyperv) (#19130, @ilyam8)
Reduce eBPF memory usage (#19117, @stelfrag)
Disable python example collector (python.d/example) (#19114, @ilyam8)
Disable monitoring of /run/reboot-required on non-Debian systems (proc/reboot_required) (#19110, @ilyam8)
Improve error handling in callback functions in socket package (go.d.plugin) (#19103, @ilyam8)
Correctly close idle connections in web package (go.d.plugin) (#19052, @ilyam8)
Implement terminating on QUIT command (go.d.plugin) (#19038, @ilyam8)
Preserve original process names in metrics labels (windows/netframework) (#19036, @ilyam8)
Auto-adjust GOMAXPROCS based on container CPU limits (go.d.plugin) (#19023, #19026 @ilyam8)
Code cleanup and renames (go.d.plugin) (#18987, #19081, #19087, #19090, #19180 @ilyam8)

Packaging/Installation

All changes

Add PCRE2 development library to required packages (#19217, @ilyam8)
Disable compilation of H2O (#19216, #19218, @ilyam8)
Use setuid as a fallback for static builds when setcap fails for plugins (#19215, @ilyam8)
Fix native package availability check on Debian-based systems in kickstart (#19183, @ilyam8)
Update deb repository config fetched by kickstart to the latest version (#19181, @ilyam8)
Update incorrect checksum for Golang (32-bit Linux) (#19127, @ilyam8)
Add --dev option to installer (#19034, @ktsaou)
Improve Windows installer (#18983, #19122, #19132, #19159 @thiagoftsm)

Documentation

All changes

Fix deployment command for Windows Agent nightly version (#19236, @ilyam8)
Update network requirements to use domain-based allowlisting for Cloud connectivity (#19222, @M4itee)
Add a user guide for dynamic room configuration (#19199, @kapantzak)
Remove a duplicated row in logging readme (#19190, @orisano)
Reorder silent mode and add full pipeline command examples (#19176, @Ancairon)
Fixup URLs in package repo documentation to use index files (#19174, @Ferroin)
docs: leftover links + changes on api-tokens.md (#19162, @Ancairon)
Improve Cloud Authentication and Authorization docs (#19160, @Ancairon)
Improve Cloud Plans and ACLK docs (#19140, @Ancairon)
Improve Cloud readme (#19139, @Ancairon)
Reorganize Netdata repo readme introduction for clearer project overview (#19134, @ilyam8)
Update Windows plugin metadata (#19129, #19147, #19158, #19171, #19175, #19188, #19182 @thiagoftsm @ilyam8)
Fix formatting, typos, and some simplifications in the docs/ directory (#19112, @ilyam8)
Improve Cloud On Prem docs (#19104, #19105, #19107, @Ancairon)
Improve Organize Your Infrastructure documentation (#19101, @Ancairon)
Improve readability of Claiming documentation (#19100, @Ancairon)
Improve Registry docs (#19095, @Ancairon)
Fix full-text search instructions and typos in systemd-journal plugin readme (#19093, #19066 @ilyam8)
Improve Daemon docs (#19091, @Ancairon)
Remove stale docs, and update links and optimization documentation (#19089, @Ancairon)
Remove Go windows integration (#19078, @Ancairon)
Split database overview and configuration reference (#19077, @Ancairon)
Improve database docs (#19075, @Ancairon)
Update sizing Netdata Agent pages (#19074, @Ancairon)
Simplify collector configuration page (#19072, @Ancairon)
Create a terminology dictionary for Netdata (#19071, @Ancairon)
Update terminology from “claim” to “connect” for Node connection process (#19060, @Ancairon)
Update Windows installation docs (#19054, @Ancairon)
Cleanup Securing Agents section docs (#19053, @Ancairon)
Update documentation about our native package repos (#19049, @Ferroin)
Capitalize the word “Agent” and “Cloud” (#19043, #19044, @Ancairon)
Remove references to old MSI installer from go.d/windows metadata (#19024, @ilyam8)
Add deprecation notice for go.d/windows collector (#19009, @ilyam8)
Update Windows installation and deployment documentation (#18765, #18928, @thiagoftsm)

Other Notable Changes

Improvements

Optimize metric processing for high-volume data collection from Child nodes (#18945, #19137, #19167, #19168, #19186, #19193, #19196, #19204, #19206 @ktsaou)

Bug Fixes

Dynamic updates of Virtual Host name now properly sync to Netdata Cloud (#19163, @stelfrag)

Other

Improve shutdown handling by preventing data file rotation and deferring alert state changes to startup (#19241, @stelfrag)
Rename some internal charts context for better organization (#19239, @ilyam8)
RRDHOST system-info isolation (#19235, @ktsaou)
Allow more threads to load contexts during startup (#19234, @stelfrag)
Release health summary memory when host health monitoring is disabled (#19233, @stelfrag)
Fix heap use after free in health (#19228, @ktsaou)
Fix compiler warnings on 32-bit (#19221, @ktsaou)
Remove Judy arrays (#19194, @stelfrag)
Allow recursive readers, even when writers are waiting (#19191, @ktsaou)
Send QUIT to plugins (#19166, @ktsaou)
Add units per context to /api/v3/contexts (#19165, @ktsaou)
Fixed bug in streaming sender read (#19136, @ktsaou)
Minor beautification of log messages (#19135, @ktsaou)
Update macOS identification to use consistent naming in system-info.sh (#19128, @ilyam8)
Avoid scanning charts for replication status (#19124, @stelfrag)
Move eBPF code from libnetdata to src/collector (#19121, @thiagoftsm)
Change default nice level to 0 (#19120, @ilyam8)
Fix undefined behaviour in ebpf_select_pc_prefix() (#19116, @vkalintiris)
Use system environment proxy settings by default for Cloud connection and add connection logging (#19098, @ktsaou)
Reset parameter when generating an alert snapshot (#19097, @stelfrag)
Correct reporting of metrics count, instance count, and context statistics (#19094, @ktsaou)
Add optional mimalloc allocator support at compile time (#19080, #19118, @stelfrag)
Update gorilla compression internal charts family (#19068, @ilyam8)
Do not intentionally abort on non-0 exit code (#18991, @vkalintiris)
Add /api/v3/stream_path (#18943, @ktsaou)

Deprecation notice

Important Changes in Next Major Release

This release will be the last version supporting the following legacy components:

Deprecated Components

Component Type	Versions Being Deprecated
APIs	v1, v2
Dashboards	v0, v1

What This Means

Starting with the next major release, only the v3 API and v3 Dashboard will be supported. These newer versions offer improved performance, enhanced features, and better security.
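
To prepare for this change, it can help to find scripts or integrations that still call the v1 or v2 API. The sketch below flags lines containing an `/api/v1/` or `/api/v2/` path. The helper name and the sample commands are illustrative, and the sketch does not attempt to map old endpoints to their v3 equivalents.

```python
import re

# Matches the /api/v1/ and /api/v2/ path prefixes that this notice marks as deprecated.
DEPRECATED_API = re.compile(r"/api/v([12])/")


def find_deprecated_api_calls(text):
    """Return (line_number, api_version, line) for each line using a v1 or v2 API path."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = DEPRECATED_API.search(line)
        if match:
            hits.append((number, "v" + match.group(1), line.strip()))
    return hits


# Sample input for illustration only.
SAMPLE = """\
curl http://localhost:19999/api/v1/info
curl http://localhost:19999/api/v3/contexts
curl http://localhost:19999/api/v2/nodes
"""
```

Running `find_deprecated_api_calls` over the contents of your own scripts gives a list of places to review before upgrading.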

Important Changes in Next Release

1. Removal of go.d Windows Collector

The go.d Windows collector will be removed in the next release. Users should migrate to the native Windows Netdata Agent.

2. Kubernetes Service Discovery Changes

Removed Components

The Agent Service Discovery sidecar container will be removed from the Netdata Helm chart as this functionality is now natively integrated into the go.d.plugin.

Impact on Custom Configurations

If you have custom Kubernetes service discovery configurations, you will need to update your settings in the following sections:

Old Section	New Section	Description
`discovery`	`discover`	Section for configuring the Kubernetes service discoverer
`build`	`compose`	Section for creating data collection job configurations

Example Migration

Previous Configuration Format

discovery:
  k8s:
    - tags: unknown
      role: pod
      local_mode: true
build:
  - name: "Applications"
    selector: '!unknown applications'
    tags: file
    apply:
      - selector: apache
        template: |
          - module: apache
            name: apache-{{.TUID}}
            url: http://{{.Address}}/server-status?auto

New Configuration Format

# Root sections renamed to "discover" and "compose"
discover:
  - discoverer: k8s
    k8s:
      - tags: unknown
        role: pod
        pod:
          local_mode: yes
compose:  # Renamed from "build"
  - name: "Applications"
    selector: "app"
    config:  # Renamed from "apply"
      - selector: "apache"
        template: |
          - module: apache
            name: apache-{{.TUID}}
            url: http://{{.Address}}/server-status?auto

Required Actions

Migrate to the new syntax before upgrading
Refer to the current Netdata Helm chart service discovery configuration for the updated syntax.
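
The renames in the example above can be sketched as a transformation on an already-parsed configuration (a plain dictionary, as any YAML parser would produce). This is a hedged sketch inferred only from the single example in these notes: `migrate_sd_config` is a hypothetical helper, and moving `local_mode` under a key named after the role is an assumption based on that example. Always verify the result against the current Helm chart documentation.

```python
import copy


def migrate_sd_config(old):
    """Apply the section renames shown in the example to a parsed config dict."""
    old = copy.deepcopy(old)  # never mutate the caller's data
    new = {}

    # "discovery" (mapping keyed by discoverer) -> "discover" (list of discoverers).
    discover = []
    for discoverer, entries in old.get("discovery", {}).items():
        migrated = []
        for entry in entries:
            entry = dict(entry)
            # Assumption from the example: local_mode moves under a key named after the role.
            if "local_mode" in entry and "role" in entry:
                entry[entry["role"]] = {"local_mode": entry.pop("local_mode")}
            migrated.append(entry)
        discover.append({"discoverer": discoverer, discoverer: migrated})
    if discover:
        new["discover"] = discover

    # "build" -> "compose", and each rule's "apply" -> "config".
    compose = []
    for rule in old.get("build", []):
        rule = dict(rule)
        if "apply" in rule:
            rule["config"] = rule.pop("apply")
        compose.append(rule)
    if compose:
        new["compose"] = compose
    return new
```

The sketch only renames and moves keys. The example in these notes also changes selector values and drops `tags: file`, so selectors and tags still need a manual review.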

Support options

As we grow, we stay committed to providing the best support ever seen from an open-source solution. Should you encounter an issue with any of the changes made in this release or any feature in the Netdata Agent, feel free to contact us through one of the following channels:

Premium Support: Customers who wish to have a direct channel with Netdata and prioritized support with defined SLAs can contact us.
Netdata Learn: Find documentation, guides, and reference material for monitoring and troubleshooting your systems with Netdata.
GitHub Issues: Make use of the Netdata repository to report bugs or open a new feature request.
GitHub Discussions: Join the conversation around the Netdata development process and be a part of it.
Community Forums: Visit the Community Forums and contribute to the collaborative knowledge base.
Discord Server: Jump into the Netdata Discord and hang out with like-minded sysadmins, DevOps, SREs, and other troubleshooters. More than 2000 engineers are already using it!
