Netdata
Real-time infrastructure and application monitoring platform
Alternative to: Prometheus, Grafana, Nagios, Zabbix, Datadog, New Relic, Sensu, Dynatrace
v2.2.0
2025-01-22
Netdata Growth
- 1.5 million downloads per day
- 73k GitHub stars!
- 653M Docker Hub pulls!
Netdata continues to experience phenomenal growth, with over 1.5 million downloads daily through Cloudflare and Docker Hub, fueling observability for users worldwide.
Thanks to your unwavering support ❤️, Netdata is the leader in the observability category in the CNCF landscape, ahead of all other solutions including Elasticsearch, Grafana, and Prometheus in GitHub stars, showcasing the trust and admiration of our community.
This success is driving rapid adoption among enterprises, reflecting the growing recognition of Netdata as the go-to observability solution for both cloud-native and on-premises environments. Our commitment remains steadfast: to deliver cutting-edge, AI-powered observability with unmatched performance and simplicity—all while being significantly more affordable.
As we evolve, our focus on empowering businesses with higher-fidelity AI insights ensures Netdata remains the easiest and fastest way to optimize infrastructure and applications at any scale. 🚀
You like Netdata? Give Netdata a ⭐ too, on GitHub!
Release Summary
Netdata 2.2.0 delivers major performance optimizations. Memory usage drops by 50% across all installations, and Parent nodes streaming to other Parents also use 50% less bandwidth and 20% less CPU. New auto-detected configuration profiles optimize performance for different use cases: Parent, Child, and IoT deployments.
Release Highlights
Performance and Scalability Improvements
| Area | Improvement | Technical Details |
|---|---|---|
| Memory Management | 50% reduction in memory usage (all deployments) | Eliminated majority of memory fragmentation by separating long-standing data from ephemeral data using specialized allocation algorithms. Enhanced out-of-memory protection allows cache shrinking to near-zero when needed, compatible with container and Kubernetes environments. |
| Network Efficiency | 50% reduction in bandwidth (Parent-to-Parent streaming) | Implemented message combining for multiple metric updates, enabling higher compression ratios and reduced network overhead through larger network frames when streaming between Parent nodes. |
| Processing Performance | 20% reduction in CPU usage (Parent-to-Parent streaming) | Reduced lock contention with advanced algorithms, improving fairness, responsiveness, and scalability specifically for Parents streaming to other Parents. DBEngine cache now delivers full capacity at scale. |
| Startup Performance | Significantly faster startup | Parallelized metadata loading based on available CPU cores, resulting in significantly reduced loading times. |
| Configuration Management | New profile system | Introduced auto-detected configuration profiles: Parent (optimized for data transfer between Parents), Child (minimized resource footprint), and IoT (configured for minimum resource usage). |
| Cloud Connectivity | Enhanced reliability | Reworked code to ensure stable Netdata Cloud connectivity under heavy load conditions on busy Netdata Parents. |
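The effect of message combining on compression, described in the Network Efficiency row above, can be illustrated with a minimal sketch. This is not Netdata's streaming code: the message format is invented for illustration, and `zlib` from the Python standard library stands in for whatever compressor the streaming protocol actually uses. It only shows why compressing one larger frame tends to beat compressing many small frames separately.

```python
import json
import zlib


def make_updates(count):
    """Build hypothetical metric-update messages (the format is illustrative only)."""
    return [
        json.dumps(
            {"chart": f"system.cpu{i % 8}", "dim": "user", "t": 1737500000 + i, "v": i % 100}
        ).encode()
        for i in range(count)
    ]


def compressed_size_individually(messages):
    """Compress every message as its own frame and sum the resulting sizes."""
    return sum(len(zlib.compress(m)) for m in messages)


def compressed_size_combined(messages):
    """Join all messages into one larger frame and compress it once."""
    return len(zlib.compress(b"\n".join(messages)))


updates = make_updates(200)
individual = compressed_size_individually(updates)
combined = compressed_size_combined(updates)
print(f"individual frames: {individual} bytes, combined frame: {combined} bytes")
```

Small frames give the compressor little repeated context to exploit and each one pays its own framing overhead, so the combined frame should come out noticeably smaller for repetitive metric data like this.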
Enhanced Single Node Dashboard View
Introducing an improved single-node dashboard experience! You can now open a dedicated dashboard for any node directly from the Nodes tab (or other locations), giving you instant access to all critical data related to that node.
The dashboard consolidates essential information into intuitive tabbed views, including:
- Metrics
- Top
- Logs
- Alerts
- Anomalies
- Events
With automatic data filtering specific to the selected node, you can navigate seamlessly between tabs to quickly access troubleshooting insights. When you’re done, the entire dashboard can be closed effortlessly with a single click.
This streamlined experience makes monitoring and managing individual nodes easier and faster than ever before.
Personalized Space Navigation
Customizable Space Navigation
You can now personalize the order and appearance of your space icons in the left panel:
- Reorder Spaces: Simply drag and drop space icons to organize them in the order that works best for you.
- Customize Colors: Use the color picker to assign unique colors to your space icons, making them easily distinguishable.
- Save Preferences: Your customizations are saved automatically, ensuring a consistent experience every time you access your dashboard.
If you have multiple spaces, you can now find and access the spaces you are looking for much faster and more intuitively than before.
Acknowledgments
- @enoch85 for adding instructions for setting up email notifications for Docker Compose.
Contributions
Collectors
Improvements
- Add collector for optical transceiver modules (go.d/ethtool) (#19426, #19428, #19429, #19430, #19434 @ilyam8)
- Add support for Pod exclusion using `netdata.cloud/ignore` annotation (go.d/k8s_state) (#19342, @ilyam8)
- Add container filtering via `netdata.cloud/ignore` label (go.d/docker) (#19341, @ilyam8)
- Add a configuration option to filter containers (go.d/docker) (#19337, #19339 @ilyam8)
- Add collector for YugabyteDB (go.d/yugabytedb) (#19325, @ilyam8)
- Add container filtering via `netdata.cloud/*` labels (cgroups.plugin) (#19315, #19316, #19318 @ktsaou @ilyam8)
- Add collector for NATS (go.d/nats) (#19252, #19253, #19262, #19264, #19266, #19280, #19282, #19284, #19285, #19303, #19309, #19311, @ilyam8)
- Port sensors collector from Go to C (debugfs/sensors) (#19251, #19294 @ktsaou)
Other
Packaging/Installation
All changes
- Drop Fedora 39 from CI and package builds (#19431, @Ferroin)
- Embed GPL-3 license locally instead of downloading from gnu.org for Windows packages (#19414, @ilyam8)
- Make libunwind opt-in at build time instead of auto-enabled (#19393, @Ferroin)
- Remove openSUSE 15.5 from CI and package builds (#19392, @Ferroin)
- Fix issues with $PATH and netdatacli detection (#19371, @Ferroin)
- Query systemd for unit paths instead of using hardcoded locations in installer/uninstaller (#19346, @Ferroin)
- Assorted systemd detection fixes (#19345, @Ferroin)
- Fix function name typo in prepare_offline_install in kickstart (#19323, @ilyam8)
- Add bison and flex dependencies required by vendored libsensors (#19292, @ilyam8)
- Update go toolchain to v1.23.4 (#19273, @ilyam8)
- Add `--auto-update-status` flag to display a configured auto-update mechanism to netdata-updater (#19248, @Ferroin)
Documentation
All changes
- Add instructions for setting up email notifications for Docker Compose (#19331, @enoch85)
- Improve On-Prem Cloud troubleshooting documentation clarity (#19279, @ilyam8)
- Add more Common Issues to On-Prem Cloud troubleshooting documentation (#19275, @M4itee)
- Add an alert guide for reboot required (#19260, @ralphm)
- Rename ‘Node Membership Rules’ to ‘Node Rule-Based Room Assignment’ (#19257, @ilyam8)
- Update copyright notices (#19256, @ktsaou)
Other Notable Changes
Improvements
- Improve Cloud connection reliability via async ACLK message handling (#19436, @stelfrag)
- Improve ACLK responsiveness by sending alert transitions asynchronously (#19397, @stelfrag)
- Reduce memory usage by minimizing glibc heap fragmentation (#19385, #19390, @ktsaou)
- Add metrics cardinality accounting and function (#19362, #19366, #19368, #19413 @ktsaou)
- Lower CPU overhead by batching stream compression (#19352, @ktsaou)
- Reduce memory usage by adding high-performance UUIDMap for memory-efficient dimension mapping (#19307, @ktsaou)
- Improve startup time by optimizing and cleaning up context loading (#19304, #19321, #19336, #19348, #19389, #19401, #19403, #19404, #19416, #19433 @ktsaou)
Other
- Adjust ACLK timeout (#19425, @ktsaou)
- Log stream_info payload when it cannot be parsed (#19424, @ktsaou)
- Fix coverity issues (#19422, @stelfrag)
- Use only compare-and-exchange for reference counting (#19411, @ktsaou)
- Use r/w spinlock instead of spinlock in Alert prototypes (#19410, @ktsaou)
- Fix system memory calculation for cgroups v1 (#19402, @ktsaou)
- Split RRD files and cleanup (#19399, #19405, @ktsaou)
- Fix nodes staying in initializing status (#19398, @ktsaou)
- Unify memory API (#19396, @ktsaou)
- Exclude unused memory in used_arena (#19382, @ktsaou)
- Fix mallinfo2 (#19381, @ktsaou)
- Limit the glibc unused memory (#19380, @ktsaou)
- Reduce lock contention on ARAL new page allocations (#19376, @ktsaou)
- Disable libunwind on forked children (#19374, @ktsaou)
- Fix alert entry traversal when doing cleanup (#19373, @stelfrag)
- Fix for PGC wanted_cache_size getting to zero (#19370, @ktsaou)
- Fix memory corruption in stream-thread (#19367, @ktsaou)
- Optimize replication status check frequency (#19361, @ktsaou)
- Respect flood protection configuration for daemon (#19360, @ktsaou)
- Fix os_system_memory() for concurrent use and call it from pulse (#19359, @ktsaou)
- Fix flood protection (#19358, @ktsaou)
- Allow compiling with FSANITIZE_ADDRESS (#19357, @ktsaou)
- Check cluster centers size in copy constructor of inlined kmeans (#19356, @vkalintiris)
- Fix Stream compression (#19355, @ktsaou)
- Fix compilation on Windows (#19354, @ktsaou)
- Fix logging of libunwind stack trace (#19353, @ktsaou)
- Reformat the metadata report to provide the complete picture (#19351, @ktsaou)
- Lower compression level to lower cpu resources on Parents (#19350, @ktsaou)
- Fix wanted cache size calculation and add chart for OOM protection (#19349, @ktsaou)
- Use sqlite3_status64 to prevent memory status integer overflow (#19347, @ktsaou)
- Add alert version to `netdatacli aclk-state` and `api/v1/aclk` (#19335, @stelfrag)
- Annotate logs with stack trace when libunwind is available (#19334, @ktsaou)
- Convert invalid utf8 sequences to hex characters (#19333, @ktsaou)
- Improve memory allocation failure handling: abort on fatal errors and log system memory (#19332, @vkalintiris)
- Improve performance of locks (#19314, @ktsaou)
- Improve and extend internal statistics (#19308, #19379, #19384, #19406, #19420, #19435, #19444 @ktsaou)
- Fix shutdown (#19306, @ktsaou)
- Fix mixed-up ordering in waiting queue (#19305, @ktsaou)
- Rework Waiting Queue (#19302, @ktsaou)
- Revert waiting-queue optimization (#19301, @ktsaou)
- Improve stream sending thread error message (#19300, @ilyam8)
- Optimize queuing and lock handling performance (#19299, @ktsaou)
- Implement nd_poll() fairness (#19298, @ktsaou)
- Enhance alert transition log verbosity (#19297, @ktsaou)
- Fix metric retention check and cleanup (#19278, @stelfrag)
- Skip label cleanup during metadata processing (#19274, @stelfrag)
- Optimize and stabilize key components (dbengine, webserver, streaming, libnetdata) (#19237, @ktsaou)
Deprecation notice
Changed in this release
All previously announced deprecations have been implemented in this release, except for the v1/v2 APIs and v0/v1 Dashboard versions, which remain available for now and will be removed in a future release.
Important Changes in Next Major Release
This release will be the last version supporting the following legacy components:
Deprecated Components
| Component Type | Versions Being Deprecated |
|---|---|
| APIs | v1, v2 |
What This Means
Starting with the next major release, only the v3 API and v3 Dashboard will be supported. These newer versions offer improved performance, enhanced features, and better security.
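To prepare for the removal of the v1 and v2 APIs, it may help to find scripts or integrations that still call them. The following is a minimal sketch, assuming legacy calls can be recognized by an `/api/v1/` or `/api/v2/` path segment. The function name and the sample endpoints are hypothetical and are not taken from Netdata's tooling.

```python
import re

# Assumption: legacy calls contain an /api/v1/ or /api/v2/ path segment.
LEGACY_API = re.compile(r"/api/v[12]/")


def find_legacy_api_calls(text):
    """Return (line_number, line) pairs that reference a v1 or v2 API path."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if LEGACY_API.search(line)
    ]


# Hypothetical script content; the endpoint names are placeholders.
sample = """\
curl http://localhost:19999/api/v1/example
curl http://localhost:19999/api/v3/example
"""

for number, line in find_legacy_api_calls(sample):
    print(f"line {number}: {line}")
```

Running the same function over the contents of your own scripts and dashboards gives a starting list of places to review before the next major release.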
Important Changes in Next Minor Release
Old Dashboards
v0/v1 Dashboard versions will be removed.
Collector Changes
Sensors Collector
The Go Sensors Collector will be removed. A new C implementation is already available as part of the debugfs plugin.
Important: If you export data to external Time-Series Databases (TSDB) or use custom alerts, note:
- Some metric names have changed due to this rewrite.
- Review and update your configurations to reflect the new metric names.
- Consult the collector’s documentation for a complete list of current metrics.
SNMP Collector
The default value of the create_vnode option will change from no to yes. This means SNMP devices will automatically appear as Virtual Nodes in Netdata by default.
Important: If you want to maintain the current behavior and prevent SNMP devices from appearing as Virtual Nodes:
- You must explicitly set `create_vnode: no` in your SNMP data collection job configurations.
- Review and update your configurations before upgrading to ensure continuity in your monitoring setup.
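One way to review existing jobs before upgrading is to list those that do not set `create_vnode` explicitly, since only those jobs are affected by the default change. The sketch below assumes the job definitions have already been parsed into Python dictionaries. The `name` and `hostname` keys, the sample values, and the helper function are illustrative assumptions; only the `create_vnode` option itself comes from the notes above.

```python
def jobs_missing_explicit_vnode(jobs):
    """Return the names of jobs that do not set create_vnode explicitly."""
    return [job.get("name", "<unnamed>") for job in jobs if "create_vnode" not in job]


# Hypothetical job definitions, as they might look after parsing a config file.
jobs = [
    {"name": "switch-1", "hostname": "192.0.2.10", "create_vnode": False},
    {"name": "router-1", "hostname": "192.0.2.11"},
]

for name in jobs_missing_explicit_vnode(jobs):
    print(f"{name}: create_vnode is not set, so the job will follow the new default (yes)")
```

Jobs reported by this check keep their current behavior only if you add an explicit `create_vnode: no` to them.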
Support options
As we grow, we stay committed to providing the best support ever seen from an open-source solution. Should you encounter an issue with any of the changes made in this release or any feature in the Netdata Agent, feel free to contact us through one of the following channels:
- Premium Support: Customers who wish to have a direct channel with Netdata and prioritized support with defined SLAs can contact us.
- Netdata Learn: Find documentation, guides, and reference material for monitoring and troubleshooting your systems with Netdata.
- GitHub Issues: Make use of the Netdata repository to report bugs or open a new feature request.
- GitHub Discussions: Join the conversation around the Netdata development process and be a part of it.
- Community Forums: Visit the Community Forums and contribute to the collaborative knowledge base.
- Discord Server: Jump into the Netdata Discord and hang out with like-minded sysadmins, DevOps, SREs, and other troubleshooters. More than 2000 engineers are already using it!