Release v1.32.0

The newest version of Netdata, v1.32.0, propels us toward the end of the year, and the Netdata community is positioned to grow stronger than ever in 2022. Before we get into the specifics of the new release, it’s worth reflecting on that growth.

Netdata open-source Agent growth

The open-source Netdata Agent, the best OSS node monitoring and troubleshooting tool ever, currently has:

1,000,000 unique Netdata nodes live!
330,000 engineers using the agent per month!
An open-source community growing at an amazing rate, with 3,000 new nodes and 8,000 users per day!
250,000 Docker pulls per day with 360 million total, according to DockerHub!

Netdata Cloud growth

Netdata Cloud, our infrastructure-level, distributed, real-time monitoring and troubleshooting orchestrator, is showing similar growth, with:

35,000 live Netdata nodes!
90,000 engineers signed up with 200 new sign-ups every day!
180 new spaces created every day!

We are not just pleased with this amazing adoption rate, we are inspired by it. It is you, our users, who give us the energy and confidence to move forward into a new era of high-fidelity, real-time monitoring and troubleshooting, made accessible to everyone!

Thank you for the inspiration! You rock!

Community News

As many of you know, even though we are not endorsed by CNCF, Netdata is the fourth most starred project in the CNCF landscape. We want to thank you for this expression of your appreciation. If you love Netdata and haven’t yet, consider giving us a GitHub star.

Additionally, we invite you to join us on our new Discord server, both to help continue our growth and to join in on fun and informative live conversations with our wonderful community.

v1.32.0 at a glance

The following is a high-level overview of some of the key changes made in this release, with more detailed descriptions available in subsequent sections.

New Cloud backend and Agent communication protocol: This Agent release supports our new Cloud backend. From here, we will be offering much faster and simpler communication, reliable alerts and exchange of metadata, and first-time support for the parent-child relationship of Netdata agents. This is the first Agent release that allows Netdata Cloud to use the Netdata Agent as a distributed time-series database that supports replication and query routing, for every metric!

eBPF latency monitoring, container monitoring, and more: We use eBPF to monitor all running processes, without the cooperation of the processes and without sniffing data traffic. This new release includes 13 new eBPF monitoring features, including I/O latency; BTRFS, EXT4, NFS, XFS, and ZFS latencies; IRQ latencies; extended swap monitoring; and more.

Machine learning (ML) powered anomaly detection: This release links the Netdata Agent with dlib, the popular C++ machine learning algorithms library, which we use to automatically detect anomalies out-of-the-box, at the edge! Once enabled, Netdata trains an ML model for every metric, which is then used to detect outliers in real-time. The resulting “anomaly bit” (where 0=normal, 1=anomalous) associated with each database entry is stored alongside the raw metric value with zero additional storage overhead! This feature is still in development, so it is disabled by default. If you would like to test it and provide feedback, you can enable the feature using the instructions provided in the Detailed release highlights section.

New timezone selector and time controls in the user interface: We implemented a new timezone picker and time controls to enhance administrative abilities in the dashboard.

Docker image POWER8+ support: Netdata Docker images now support recent IBM Power Systems, Raptor Talos II, and more.

And more: Four new collectors, 112 total improvements, 95 bug fixes, 49 documentation updates, and 57 packaging and installation changes!

Detailed release highlights

New Cloud backend and Agent communication protocol

It’s no secret that the best of Netdata Cloud is yet to come. After several months of developing, testing, and benchmarking a new architectural system, we have steadied ourselves for that growth. These changes should offer notable and immediate improvements in reliability and stability, but more importantly, they allow us to quickly and efficiently develop new features and enhanced functionality. Here’s what you can look for on the short-term horizon, thanks to our new architecture:

Greater capacity: The new architecture will change the communication protocol between the Agent and the Cloud to be incremental, improving our agent-handling capacity by ensuring that the Cloud uses measurably less bandwidth.
Parent/child relationships: The new architecture will allow, for the first time, the recognition of parent-child relationships in the Cloud. These changes will enable you to change storage configuration on parents, limit sent metrics, and reduce data frequency to achieve longer data retention for your nodes. On top of this, we will continue to develop the ability for you to have complex setups to scale your monitoring with parents as proxies. Ultimately, this will enable Netdata to operate as a headless connector with the lowest footprint possible on your production nodes.
Alerts: The new architecture will bring a multitude of improvements to how we present alerts over the coming months, including enhanced reliability, better alert management, alert logs collected in the Cloud, and more.
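
To illustrate what "incremental" communication means in general terms, here is a minimal Python sketch that sends only what changed since the last sync instead of the full state. It is not the actual Agent-Cloud protocol: the function name `diff_state` and the chart-metadata dictionaries are hypothetical and exist only for this example.

```python
def diff_state(previous, current):
    """Return only the changes needed to turn `previous` into `current`.

    Both arguments are plain dicts mapping a chart id to its metadata.
    Hypothetical illustration, not Netdata's wire format.
    """
    updated = {key: value for key, value in current.items()
               if key not in previous or previous[key] != value}
    removed = sorted(key for key in previous if key not in current)
    return {"updated": updated, "removed": removed}


# Hypothetical chart metadata as of the last successful sync.
last_synced = {"system.cpu": {"units": "percentage"},
               "system.ram": {"units": "MiB"}}

# Hypothetical current state: one chart added, one chart removed.
now = {"system.cpu": {"units": "percentage"},
       "disk.io": {"units": "KiB/s"}}

delta = diff_state(last_synced, now)
print(delta)
```

The unchanged `system.cpu` entry never appears in `delta`, which is where the bandwidth saving of an incremental scheme comes from.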

If you would like to be among the first to test this new architecture and provide feedback, first make sure that you have installed the latest Netdata version following our guide. Then, follow our instructions for enabling the new architecture.

eBPF container monitoring

We did a lot of work to enhance our eBPF container monitoring in this release, starting with full eBPF support for cgroups. As a refresher on just how important this update is: cgroups, together with namespaces, are the building blocks for containers, which are the dominant way of distributing monitoring applications. We use cgroups to control how much of a given key resource (CPU, memory, network, and disk I/O) can be accessed or used by a process or set of processes. Our eBPF collector now creates charts for each cgroup, which enables us to understand how a specific cgroup interacts with the Linux kernel! 🤓

This enhances our already extensive monitoring by including cgroups for memory, process, network, file access, and more.

eBPF latency monitoring

By enabling eBPF monitoring on all systems that support it, Netdata has already been established as a world-leading distributor of eBPF! We use eBPF to monitor all running processes, without the cooperation of the processes, by tracking every way an application interfaces with the system. In this release, we continue our commitment to improving eBPF by tracking latencies for disks, IRQs, and more.

Our new eBPF latency features include:

A new set of Disk I/O latency charts, which monitor the time that it takes for an I/O request to complete. As many of you may know, this is the most important metric for storage performance!
IRQ latency monitoring, which shows the time spent servicing interrupts (hard or soft).
A new Filesystem submenu that adds latency monitoring for different filesystems: BTRFS, Ext4, NFS, XFS, and ZFS. Latency monitoring covers the most common functions, such as the latency of each open request and of each sync request.
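
The idea behind an I/O latency chart can be sketched in a few lines of Python: latency is the completion time minus the issue time, and samples are commonly grouped into power-of-two buckets. This sketch assumes timestamps in microseconds, and the names `io_latency_us` and `log2_bucket` are hypothetical. It only illustrates the metric and does not reflect how the eBPF collector is implemented.

```python
from collections import Counter


def io_latency_us(issue_ts_us, complete_ts_us):
    """Latency of one I/O request: completion time minus issue time."""
    if complete_ts_us < issue_ts_us:
        raise ValueError("completion precedes issue")
    return complete_ts_us - issue_ts_us


def log2_bucket(latency_us):
    """Index i of the power-of-two bucket [2**i, 2**(i+1)) holding the latency.

    Latencies of 0 and 1 microsecond both land in bucket 0.
    """
    return max(latency_us, 1).bit_length() - 1


# Hypothetical (issue, completion) timestamp pairs in microseconds.
requests = [(100, 103), (200, 260), (300, 1400), (400, 405)]

histogram = Counter(log2_bucket(io_latency_us(start, end))
                    for start, end in requests)

for bucket in sorted(histogram):
    low, high = 2 ** bucket, 2 ** (bucket + 1)
    print(f"[{low}, {high}) us: {histogram[bucket]} request(s)")
```

Bucketing by powers of two keeps the histogram small while still separating fast requests from slow outliers.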

eBPF is a very strong addition to our monitoring tools, and we are committed to providing the best eBPF monitoring experience, observing from a distance without disrupting the data flow!

Other eBPF enhancements

But we didn’t stop there with eBPF in v1.32.0. We also provided the following updates:

We moved VFS to a Filesystem menu to simplify the visualization of filesystem events. This allows you to monitor the actions of filesystems and their latency.
Until now, Netdata had metrics that showed the amount of swap usage. ebpf.plugin now extends swap monitoring to show how a specific application group or cgroup uses swap.
We have improved process management monitoring by adding shared memory monitoring and by using tracepoints to monitor process creation and exit with more accuracy.
Netdata also adds monitoring of OOM kill events for each apps group defined on the host.

If you share our interest in eBPF monitoring, or have questions or requests, feel free to drop by our Community forum to start a discussion with us.

Machine learning (ML) powered anomaly detection

Machine learning (ML) is undeniably a wave of the future in monitoring and troubleshooting. The Netdata community is riding that wave forward together, ahead of everyone else. Netdata v1.32.0 introduces some foundational capabilities for ML-driven anomaly detection in the Agent. We have integrated the popular dlib C++ ML library to power unsupervised anomaly detection out-of-the-box.

While this functionality is still under development and subject to change, we want to develop this with you, as a team. The functionality is disabled by default while we dogfood the feature internally and build additional ML-leveraging features into Netdata Cloud. To turn on anomaly detection, go to the new [ml] section in netdata.conf and set enabled=yes. After restarting Netdata, you should see the Anomaly Detection menu with charts highlighting the overall number and percent of anomalous metrics on your node. This can be a very useful single-number summary of the state of your node.
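
As a rough illustration of the "anomaly bit" idea and of a single-number summary, the following Python sketch stores a 0/1 flag in one spare bit of an integer storage word and then computes the percentage of flagged entries. The 32-bit layout, the bit position, and all names are assumptions made for this example; the Agent's real storage format is not described in these notes.

```python
ANOMALY_BIT = 1 << 31          # assumed spare bit in a 32-bit storage word
VALUE_MASK = ANOMALY_BIT - 1   # the remaining 31 bits hold the value


def pack(value, anomalous):
    """Store a value and its anomaly flag in a single 32-bit word."""
    if not 0 <= value <= VALUE_MASK:
        raise ValueError("value does not fit in 31 bits")
    return value | (ANOMALY_BIT if anomalous else 0)


def unpack(word):
    """Recover (value, anomaly_bit) from a packed word."""
    return word & VALUE_MASK, 1 if word & ANOMALY_BIT else 0


def anomaly_rate(words):
    """Percentage of entries whose anomaly bit is set."""
    if not words:
        return 0.0
    flagged = sum(unpack(word)[1] for word in words)
    return 100.0 * flagged / len(words)


# Hypothetical samples as (value, is_anomalous) pairs.
samples = [(10, False), (12, False), (950, True), (11, False)]
stored = [pack(value, flag) for value, flag in samples]

print(unpack(stored[2]))
print(f"{anomaly_rate(stored):.2f}%")
```

Because the flag lives in a bit of the word that already holds the value, flagging an entry does not make the stored data any larger.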

Share your feedback by emailing us at analytics-ml-team@netdata.cloud or just come hang out in the 🤖-ml-powered-monitoring channel of our Discord, where we discuss all things ML and more!

And then, be on the lookout for some bigger announcements and launches relating to ML over the next couple of months.

New timezone selector and time controls in the user interface

Collaborating in a remote world across regions can be difficult, so we wanted to make it easier for you to sync with your administrative teams and your system information. Our new timezone selector allows you to select a timezone to accommodate collaboration needs within your teams and infrastructure. Additionally, we have added the following time controls to allow you to distinguish if the content you are looking at is live or historical and to refresh the content of the page when the tabs are in the background:

Play: When this option is selected, the content of the page will be automatically refreshed while the tab is in the foreground.
Pause: When this option is selected, the content of the page will not refresh, either because you manually paused it or because, for example, you are investigating data on a chart (the cursor is on top of a chart).
Force Play: When this option is selected, the content of the page will be automatically refreshed even if the tab is in the background.
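
The three modes reduce to a small decision rule, sketched here in Python. The `Mode` enum and the `should_refresh` function are hypothetical names; this is not the dashboard's code and only restates the behavior described in the list above.

```python
from enum import Enum


class Mode(Enum):
    PLAY = "play"
    PAUSE = "pause"
    FORCE_PLAY = "force play"


def should_refresh(mode, tab_in_foreground):
    """Decide whether the page content refreshes automatically."""
    if mode is Mode.FORCE_PLAY:
        return True                 # refreshes even in the background
    if mode is Mode.PLAY:
        return tab_in_foreground    # refreshes only while the tab is visible
    return False                    # paused: no automatic refresh


for mode in Mode:
    for foreground in (True, False):
        print(mode.value, foreground, should_refresh(mode, foreground))
```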

Docker image POWER8+ support

On top of all of that, we have added 64-bit little-endian POWER8+ support to our official Docker images. You can now run Netdata Docker images on recent IBM Power Systems, Raptor Talos II, and similar POWER-based hardware. This extends the list of architectures already supported by our Docker images, which includes:

32-bit and 64-bit x86
ARMv7
AArch64

Acknowledgments

@nabijaczleweli for fixing writing updater log under root.
@MikaelUrankar for fixing calculation of sysctl mib size in freebsd plugin.
@filip-plata for adding additional metrics to python.d/postgres collector.
@eltociear for fixing typos.
@gotjoshua for adding a link to python.d/httpcheck.conf.
@wangpei-nice for fixing ebpf.plugin segfault when ebpf_load_program returns null pointer.
@zanechua for adding Microsoft Teams to supported notification endpoints.
@diizzyy for adding support for Intel 2.5G and Synopsys DesignWare nic driver in freebsd plugin.
@Saruspete for fixing handling of adding slabs after discovery in slabinfo plugin.
@mjtice for adding autovacuum and tx wraparound charts to python.d/postgres.
@charoleizer for adding PostgreSQL version to requirements section.
@danmichaelo for fixing a typo in exporting docs.
@oldgiova for adding capsh check before issuing setcap cap_perfmon.
@oldgiova for adding Travis ctrl file for checking if changes happened.
@0x3333 for fixing an inconsistent status check in charts.d/apcupsd.
@etienne-napoleone for adding terra related binaries to blockchains apps plugin group.
@anayrat for fixing postgres replication_slot chart on standby.
@vpiserchia for fixing handling of null values returned by _cat/indices API in python.d/elasticsearch.
@elelayan for fixing zpool state parsing in proc/zfs.
@steffenweber for adding missing privilege to fix MySQL slave reporting.
@unhandled-exception for adding sorting of the list of databases in alphabetical order in python.d/postgres.
@78Star for updating Netdata and its dependencies versions for pfSense.
@unhandled-exception for fixing crashing of the wal query if wal-file was removed concurrently in python.d/postgres.
@rupokify for updating jQuery dependency.
@caleno for fixing a typo in streaming docs.
@rex4539 for fixing typos.

Dashboard

Add various updates to dashboard info (#11639, @ilyam8)
Add timex plugin chart descriptions (#11635, @ilyam8)
Add proc plugin zfs chart descriptions (#11630, @ilyam8)
Add proc plugin infiniband chart descriptions (#11628, @ilyam8)
Add proc plugin pagetypeinfo chart descriptions (#11627, @ilyam8)
Add proc plugin net_wireless chart descriptions (#11626, @ilyam8)
Add proc plugin net_rpc_nfs and net_rpc_nfsd chart descriptions (#11625, @ilyam8)
Add proc plugin power_supply chart descriptions (#11619, @ilyam8)
Add cgroups plugin systemd services chart descriptions (#11618, @ilyam8)
Add cgroups plugin chart descriptions (#11607, @ilyam8)
Add apps plugin chart descriptions (#11601, @ilyam8)
Add proc plugin vmstat chart descriptions (#11597, @ilyam8)
Add proc plugin ksm chart descriptions (#11595, @ilyam8)
Add proc plugin edac chart descriptions (#11589, @ilyam8)
Add proc plugin stat chart descriptions (#11586, @ilyam8)
Add proc plugin net_stat_synproxy chart descriptions (#11581, @ilyam8)
Add proc plugin softirqs chart descriptions (#11577, @ilyam8)
Add proc plugin net_stat_conntrack chart descriptions (#11576, @ilyam8)
Add proc plugin uptime chart descriptions (#11569, @ilyam8)
Add proc plugin net_sockstat and net_sockstat6 chart descriptions (#11567, @ilyam8)
Add proc plugin net_snmp6 chart descriptions (#11565, @ilyam8)
Add proc plugin net_sctp_snmp chart descriptions (#11564, @ilyam8)
Add proc plugin net_snmp chart descriptions (#11557, @ilyam8)
Add proc plugin net_netstat chart descriptions (#11554, @ilyam8)
Add proc plugin net_ip_vs_stats chart descriptions (#11546, @ilyam8)
Add proc plugin net_dev chart descriptions (#11543, @ilyam8)
Add proc plugin meminfo chart descriptions (#11541, @ilyam8)
Add proc plugin mdstat chart descriptions (#11537, @ilyam8)
Add proc plugin interrupts chart descriptions (#11532, @ilyam8)
Add proc plugin diskstats chart descriptions (#11528, @ilyam8)
Add proc plugin ipc semaphores chart descriptions (#11523, @ilyam8)
Remove ‘vernemq.queue_messages_in_queues’ from dashboard info (#11403, @ilyam8)
Move MD arrays charts under Disks (#11119, @thiagoftsm)

Collectors

New

Add Traefik collector (go.d/traefik) (#605, @ilyam8)
Add HAProxy collector (go.d/haproxy) (#599, @ilyam8)
Add MongoDB collector (go.d/mongodb) (#598, @georgeok)
Add Ethereum Node collector (go.d/geth) (#585, @odyslam)

Improvements

Add AWS to apps_groups.conf (#11826, @ilyam8)
Show stats for systemd protected mount points (diskspace plugin) (#11767, @vlvkobal)
Add support for v1.7.0+ (go.d/coredns) (#619, @georgeok)
Add “/basic_status” job nginx.conf (go.d/nginx) (#612, @ilyam8)
Add sharding metrics (go.d/mongodb) (#609, @georgeok)
Add thread operations metrics (go.d/mysql) (#607, @ilyam8)
Add replica sets metrics (go.d/mongodb) (#604, @georgeok)
Add databases metrics (go.d/mongodb) (#602, @georgeok)
Add more OS (operating system) charts (go.d/wmi) (#593, @ilyam8)
Add caddy job to prometheus.conf (go.d/prometheus) (#581, @odyslam)
Add AOF file size metrics (go.d/redis) (#578, @ilyam8)
Add openethereum/geth jobs to prometheus.conf (go.d/prometheus) (#578, @odyslam)
Update whois/whois-parser packages and add timeout configuration option (go.d/whoisquery) (#576, @ilyam8)
Disable reporting min/avg/max group uptime by default (apps plugin) (#11609, @ilyam8)
Add sorting of the list of databases in alphabetical order (python.d/postgres) (#11580, @unhandled-exception)
Add terra related binaries to blockchains group (apps plugin) (#11437, @etienne-napoleone)
Add instruction per cycle charts (perf plugin) (#11392, @thiagoftsm)
Add autovacuum and tx wraparound charts (python.d/postgres) (#11267, @mjtice)
Add support for Intel 2.5G and Synopsys DesignWare nic driver (freebsd plugin) (#11251, @diizzyy)
Add web3 and blockchains groups (apps plugin) (#11220, @odyslam)
Implement merging user/stock configuration files (python.d plugin) (#11217, @ilyam8)
Rename default job from ‘local’ to ‘anomalies’ (python.d/anomalies) (#11178, @andrewm4894)
Add standby lag and blocking transactions charts (python.d/postgres) (#11169, @filip-plata)

Bug fixes

Fix renaming for cgroups with dots in the path (cgroups plugin) (#11775, @vlvkobal)
Fix exiting on SIGPIPE (go.d plugin) (#630, @ilyam8)
Fix domain syntax validation (go.d/whoisquery) (#629, @ilyam8)
Fix missing NONE in valid request methods (go.d/squidlog) (#621, @ilyam8)
Remove wrong “queue_messages_in_queues” chart (go.d/vernemq) (#601, @ilyam8)
Fix HTTP/socket client initialization order (go.d/phpfpm) (#591, @ilyam8)
Fix scraping metrics when resources are not discovered (go.d/vsphere) (#589, @ilyam8)
Fix LTSV log format parsing (go.d/weblog) (#584, @ilyam8)
Fix expiration date parsing (go.d/whoisquery) (#575, @ilyam8)
Fix containers name resolution for crio/containerd runtime (cgroups plugin) (#11756, @ilyam8)
Add sensors to charts.d.conf and add a note on how to enable it (charts.d plugin) (#11715, @ilyam8)
Fix crashing of the wal query if wal-file was removed concurrently (python.d/postgres) (#11697, @unhandled-exception)
Fix “lsns: unknown column” logging (cgroups plugin) (#11687, @ilyam8)
Fix nfsd RPC metrics and remove unused nfsd charts and metrics (proc/nfsd) (#11632, @vlvkobal)
Fix “proc4ops” chart family (proc/nfsd) (#11623, @ilyam8)
Fix swap size calculation (cgroups plugin) (#11617, @vlvkobal)
Fix RSS memory counter for systemd services (cgroups plugin) (#11616, @vlvkobal)
Fix VBE parsing (python.d/varnish) (#11596, @ilyam8)
Remove unused synproxy chart (proc/synproxy) (#11582, @vlvkobal)
Fix zpool state parsing (proc/zfs) (#11545, @elelayan)
Fix null values returned by ‘_cat/indices’ API (python.d/elasticsearch) (#11501, @vpiserchia)
Fix replication_slot chart on standby (python.d/postgres) (#11455, @anayrat)
Fix an inconsistent status check (charts.d/apcupsd) (#11435, @0x3333)
Fix plugin name (stats.d plugin) (#11400, @vlvkobal)
Fix plugin names (freebsd and macos plugins) (#11398, @vlvkobal)
Fix lack of “module” in chart definition (all chart.d modules) (#11390, @ilyam8)
Fix various python modules charts contexts (python.d/smartd_log, mysql, zscores) (#11310, @ilyam8)
Fix current operation charts title and context (proc/mdstat) (#11289, @ilyam8)
Fix handling of adding slabs after discovery (slabinfo plugin) (#11257, @Saruspete)
Fix calculation of sysctl mib size (freebsd plugin) (#11159, @MikaelUrankar)

eBPF

New

Add MD flush calls tracking (#11681, @UmanShahzad)
Add shared memory system calls tracking (#11560, @UmanShahzad)
Add OOM kills tracking (#11470, @UmanShahzad)
Add soft IRQ latency tracking (#11445, @UmanShahzad)
Add hard IRQ latency tracking (#11410, @UmanShahzad)
Add mount/umount calls tracking (#11358, @thiagoftsm)
Add btrfs latency monitoring (#11348, @thiagoftsm)
Add ZFS latency monitoring (#11330, @thiagoftsm)
Add NFS latency monitoring (#11313, @thiagoftsm)
Add disk latency monitoring (#11276, @thiagoftsm)
Add XFS latency monitoring (#11238, @thiagoftsm)
Add ext4 latency monitoring (#11224, @thiagoftsm)
Add extended swap monitoring (#11090, @thiagoftsm)

Improvements

Add (eBPF) to submenu (#11721, @thiagoftsm)
Process monitoring cleanup and improvements (#11643, @thiagoftsm)
Add integration with cgroups plugin (socket, shared memory, cachestat) (#11642, @thiagoftsm)
Add integration with cgroups plugin (process, file descriptor, VFS, directory cache and OOMkill) (#11611, @thiagoftsm)
Add initial integration with cgroups plugin (swap) (#11573, @thiagoftsm)
Add integration with cgroups plugin (create shared memory with cgroups) (#11559, @thiagoftsm)
Update charts descriptions (#11547, @thiagoftsm)
Convert eBPF submenus to lowercase (#11511, @thiagoftsm)
Socket monitoring code improvements and update charts descriptions (#11441, @thiagoftsm)
Move file operation monitoring to a separate thread (#11401, @thiagoftsm)
Add module names for threads (#11387, @thiagoftsm)
Move repeating part of latency chart descriptions to the family level (#11363, @thiagoftsm)
Reduce plugin’s memory usage (#11256, @thiagoftsm)
Assorted improvements and fixes (#11230, @thiagoftsm)
Move VFS monitoring to a separate thread and add new charts (#11187, @thiagoftsm)

Bug fixes

Fix command line arguments (#11670, @thiagoftsm)
Fix hardirq/softirq value init logic (#11471, @UmanShahzad)
Fix VFS index reference (#11356, @thiagoftsm)
Fix a case when multiple eBPF plugins are running (#11287, @thiagoftsm)
Fix applying configuration options (#11253, @thiagoftsm)
Fix a segfault when ebpf_load_program returns null pointer (#11203, @wangpei-nice)
Fix a wrong pointer to a function and move parser to main thread (#11152, @thiagoftsm)

Health

Improvements

Remove pihole_blocked_queries alert (#11829, @Ancairon)
Improve check for supported -F parameter in sendmail (#11506, @MrZammler)
Add custom e-mail headers (#11454, @MrZammler)
Add ‘cockroachdb_underreplicated_ranges’ alarm (#11360, @ilyam8)
Disable ‘oom_kill’ alarm on k8s nodes (#11359, @ilyam8)
Add geth stock alarms (#11341, @odyslam)
Remove python.d modules specific last_collected alarms (#11307, @ilyam8)
Remove CockroachDB deprecated alarms (#11235, @ilyam8)
Add new email notification template (#11219, @MrZammler)
Add system clock synchronization state alarm (#11177, @ilyam8)
Add python.d/go.d jobs last_collected_secs alarms (#11168, @ilyam8)
Make stock alarms less sensitive (#11153, @ilyam8)

Bug fixes

Fix swap_used alarm calculation (#11672, @ilyam8)
Fix ram level alarms (#11452, @ilyam8)
Fix ‘gearman_workers_queued’ alarm (#11361, @ilyam8)
Fix sending MS Teams notifications to multiple channels (#11355, @ilyam8)
Fix sendmail ‘unrecognized option: F’ issue (#11283, @MrZammler)
Update old logo to new one (#11263, @odyslam)
Swap class and type attributes in stock alarm configurations (#11240, @MrZammler)
Fix alarm line ‘charts’ matching (#11204, @ilyam8)

Documentation

Updating ansible steps for clarity (#11823, @kickoke)
Add a note about pkg-config file location for freeipmi (#11831, @vlvkobal)
Fix broken link in charts.mdx (#11808, @DShreve2)
Fix typos (#11782, @rex4539)
Add nightly release version to readme (#11780, @andrewm4894)
Fix link to new charts (#11773, @DShreve2)
Fix typos in netdata-security.md (#11772, @jlbriston)
Update eBPF documentation (Filesystem and HardIRQ) (#11752, @UmanShahzad)
Add command for new health entity file (#11733, @DShreve2)
Remove dated contact suggestion (#11732, @DShreve2)
Add documentation about Filesystem and HardIRQ (#11752, @UmanShahzad)
Fix a typo in streaming docs (#11747, @caleno)
Update eBPF documentation (#11741, @thiagoftsm)
Fix broken link - Charts 2.0 (#11729, @DShreve2)
Fix broken link - eBPF plugin (#11728, @DShreve2)
Add Cloud sign-up link (#11714, @DShreve2)
Update claiming instructions for Docker (#11713, @DShreve2)
Fix broken links in kickstart.md (#11708, @DShreve2)
Add missing collectors to the eBPF plugin readme (#11703, @thiagoftsm)
Fix broken link - Charts 2.0 (#11701, @hugovalente-pm)
Update Netdata and dependencies versions for pfSense (#11674, @78Star)
Add a note about new release of charts on the Cloud (#11637, @hugovalente-pm)
Update optional parameters for upcoming installer (#11604, @DShreve2)
Add missing privilege to fix MySQL slave reporting (#11574, @steffenweber)
Fix broken links (#11540, @ilyam8)
Update london demo to point at london3 (#11533, @andrewm4894)
Add a note about handling backslashes in health configuration files (#11527, @ilyam8)
Improve streaming documentation wording (#11510, @siamaktavakoli)
Fix a typo in claiming docs (#11492, @car12o)
Remove broken link (#11482, @andrewm4894)
Add a note on how to find web files directory for custom dashboards (#11461, @ilyam8)
Update “Install Netdata on Synology” guide (#11449, @ilyam8)
Update installation documentation (#11442, @hugovalente-pm)
Update eBPF documentation (#11440, @thiagoftsm)
Add time controls and timezone selector description (#11433, @hugovalente-pm)
Fix broken links - Custom dashboards (#11413, @hugovalente-pm)
Fix broken links - Custom dashboards (#11405, @hugovalente-pm)
Rename claiming action to connect (#11378, @hugovalente-pm)
Fix a typo in exporting docs (#11376, @danmichaelo)
Add PostgreSQL version to requirements section (#11328, @charoleizer)
Minor fixes (#11320, @UmanShahzad)
Fix prometheus node CPU alert rule (#11309, @ilyam8)
Updated get-started.mdx (#11303, @jlbriston)
Add Legacy/NG ACLK documentation (#11243, @underhood)
Add links to data privacy page (#11226, @joelhans)
Add Microsoft Teams to supported notification endpoints (#11205, @zanechua)
Add a link to python.d/httpcheck.conf (#11182, @gotjoshua)
Fix broken links (#11175, @joelhans)
Update news about the latest release (#11165, @joelhans)

Packaging / Installation

Use pip3 when installing git-semver package (#11817, @maneamarius)
Add POWER8+ static builds (#11802, @Ferroin)
Update libbpf to v0.5.1 (#11800, @thiagoftsm)
Verify checksums of makeself deps (#11791, @vkalintiris)
Update go.d.plugin version to v0.31.0 (#11789, @ilyam8)
Add Oracle Linux 8 to CI and package builds (#11776, @Ferroin)
Fix a typo in installation script (#11766, @ShimonOhayon)
Update dashboard to v2.20.11 (#11743)
Minor improvement to CPU number function regarding macOS. (#11746, @iigorkarpov)
Add log grouping in installer and static build code when running under GitHub Actions. (#11720, @Ferroin)
Add basic telemetry to the new kickstart script. (#11718, @Ferroin)
Add eBPF plugin to static binaries (#11709, @thiagoftsm)
Fix libbpf handling in RPM package builds. (#11702, @Ferroin)
Don’t use api.github.com when checking for latest stable version (#11700, @ilyam8)
Fix handling of disabling telemetry in static installs. (#11689, @Ferroin)
Mark g++ for freebsd as NOTREQUIRED (#11678, @MrZammler)
Optimize static build and update various dependencies. (#11660, @Ferroin)
Improve installation on systems with limited RAM. (#11658, @Ferroin)
Add support for local builds to the new kickstart script. (#11654, @Ferroin)
Explicitly opt out of LTO in RPM builds. (#11644, @Ferroin)
Add flag to mark containers as created from official images in analytics. (#11606, @Ferroin)
Add POWER8+ support to our official Docker images. (#11592, @Ferroin)
Disable eBPF compilation in different platforms (#11566, @thiagoftsm)
Fix installer flag --use-system-protobuf (#11539, @underhood)
Re-add EPEL on CentOS 7. (#11525, @Ferroin)
Use the correct exit status for the updater with static updates. (#11520, @Ferroin)
Remove reset_netdata_trace.sh from netdata.service (#11517, @ilyam8)
Install basic netdata deps by default. (#11508, @Ferroin)
Fix handling of claiming in kickstart script when running as non-root. (#11507, @Ferroin)
Use system copy of protobuf in Docker images and static builds. (#11496, @Ferroin)
Add initial implementation of new kickstart script. (#11493, @Ferroin)
Add static builds for ARMv7l and ARMv8a (#11490, @Ferroin)
Add the ability to allow arbitrary options to be passed to make from netdata-installer.sh. (#11479, @Ferroin)
Embed build architecture in static build archive names. (#11463, @Ferroin)
Fix edge repository configuration DEB packages. (#11458, @Ferroin)
Add check for failed protobuf configure or make (#11450, @MrZammler)
Don’t bail early if we fail to build cloud deps with required cloud. (#11446, @Ferroin)
Change default to not using LTO for builds. (#11432, @Ferroin)
Use DebHelper compat level 9 in repoconfig packages to support Ubuntu 16.04 (#11426, @Ferroin)
Add capsh check before issuing setcap cap_perfmon (#11386, @oldgiova)
Update handling of builds of bundled dependencies. (#11375, @Ferroin)
Add support for bundling protobuf as part of the install. (#11374, @Ferroin)
Properly handle eBPF plugin in RPM packages. (#11362, @Ferroin)
Add support for claiming existing installs via kickstarter scripts. (#11350, @Ferroin)
Assorted kickstart install fixes. (#11342, @Ferroin)
Add aclk-schemas to dist_noinst_DATA (#11338, @underhood)
Auto-detect PGID in Dockerfile’s ENTRYPOINT script (#11274, @odyslam)
Add code for repository configuration packages. (#11273, @Ferroin)
Explicitly update libarchive on CentOS 8 when installing dependencies. (#11264, @Ferroin)
Fix kickstart-static64.sh install script fail when trying to access .install-type before it is created (#11262, @ilyam8)
Add openSUSE 15.3 package builds. (#11259, @Ferroin)
Fix libjudy installation on CentOS 8. (#11248, @Ferroin)
Fix install_type detection during update (#11199, @ilyam8)
Store info about the installation type for later retrieval. (#11157, @Ferroin)
Compile/Link with absolute paths for bundled/vendored deps. (#11129, @vkalintiris)
Fix writing updater log under root (#10901, @nabijaczleweli)
Add ARM binary package builds to CI. (#10769, @Ferroin)

Other Notable Changes

Improvements

Clean compilation warnings (#11810, @stelfrag)
Fix coverity issues (#11809, @stelfrag)
Add commands to check and fix database corruption (#11828, @stelfrag)
Use two digits after the decimal point for the anomaly rate. (#11804, @vkalintiris)
Always queue alerts to aclk_alert (#11806, @MrZammler)
Add some logging for cloud new architecture to access.log (#11788, @MrZammler)
Delete from aclk alerts table if ack’ed from cloud one day ago (#11779, @MrZammler)
Remove feature flag for ACLK new cloud architecture (#11774, @stelfrag)
Insert alert into aclk_alert directly instead of queuing it (#11769, @MrZammler)
Store and submit dimension delete messages for new cloud architecture (#11765, @stelfrag)
Implement cloud initiated disconnect command (#11723, @underhood)
Announce proto capability and enable if cloud supports (#11476, @underhood)
Add exit points between env and OTP (#11751, @underhood)
Improve the ACLK sync process for the new cloud architecture (#11744, @stelfrag)
Disable C++ warnings from dlib library. (#11738, @vkalintiris)
Add queue removed alerts to cloud for new architecture (#11704, @MrZammler)
Add support to stream chart labels on a parent - child setup (#11675, @MrZammler)
Add snapshot message for cloud new architecture (#11664, @MrZammler)
Add protobuf to -W buildinfo output. (#11634, @Ferroin)
Add new alarm status protocol messages (#11612, @underhood)
Add local webserver API/v1 call “aclk” (#11588, @underhood)
Make New Cloud architecture optional for ACLK-NG (#11587, @underhood)
Enable additional functionality for the new cloud architecture (#11579, @stelfrag)
Add alert message support for ACLK new architecture (#11552, @MrZammler)
Add support for Anomaly Detection MVP (#11548, @vkalintiris)
Add New Cloud Protocol files to CMake (#11536, @underhood)
Add archive uploads for dist, package build, and static build checks. (#11534, @Ferroin)
Add node message support for ACLK new architecture (#11514, @stelfrag)
Clean netdata naming (#11484, @andrewm4894)
Add aclk/cloud state command to netdatacli (#11462, @underhood)
Add chart message support for ACLK new architecture (#11447, @stelfrag)
Add Alert Related API for new protocol (#11424, @underhood)
Update SQLite version from v3.33.0 to v3.36.0 (#11423, @stelfrag)
Add SQLite unit tests (#11422, @stelfrag)
Add NodeInstanceInfo API (#11419, @underhood)
Use SQLite to store the health log and alert configurations. (#11399, @MrZammler)
Add ACLK synchronization event loop (#11396, @stelfrag)
Add HTTP basic authentication to Prometheus remote write and HTTP versions of Graphite, JSON, OpenTSDB (#11394, @vlvkobal)
Add new Cloud chart related parsers and generators (#11393, @underhood)
Remove warning when GCC 8.x is used (#11389, @thiagoftsm)
Add support to allow ACLK-NG to grow MQTT buffer (#11340, @underhood)
Add support for bundled protobuf (#11335, @underhood)
Add ACLK-NG cloud request type charts (#11326, @UmanShahzad)
Add HTTP access log messages for ACLK-NG (#11318, @UmanShahzad)
Add a log message when the page cache manager sleeps for more than 1 second. (#11314, @vkalintiris)
Add hop count for children (#11311, @stelfrag)
Remove access check for install-type file (#11288, @MrZammler)
Support TLS SNI in ACLK-NG (#11285, @underhood)
Make ACLK-NG the default if available (#11272, @underhood)
Add extra posthog attributes (#11237, @MrZammler)
Add support to ACLK-NG for new Cloud NodeInstance related msgs (#11234, @underhood)
Add support so ACLK NG and Legacy can coexist (#11225, @underhood)
Move cleanup of obsolete charts to a separate thread (#11222, @vlvkobal)
Add check to only report the exit code when anonymous statistics script fails (#11215, @MrZammler)
Reduce memory needed per dimension (#11212, @stelfrag)
Improve dbengine initialization to ignore journal files that cannot be read (#11210, @stelfrag)
Use memory mode RAM if memory mode dbengine is specified but not available (#11207, @stelfrag)
Improve return status check for the execution of anonymous statistics script (#11188, @MrZammler)
Reuse the SN_EXISTS bit to track anomaly status. (#11154, @vkalintiris)
Remove deprecated command line options (#11149, @vkalintiris)
Remove unnecessary relative paths when including headers. (#11124, @vkalintiris)
Add field to provide UTC offset in seconds and edit health config command (#11051, @MrZammler)

Bug fixes

Set NETDATA_CONTAINER_OS_DETECTION properly (#11827, @MrZammler)
Fix agent crash when ACLK sync thread is not initialized (#11820, @MrZammler)
Simple fix for the data API query (#11787, @vlvkobal)
Use the proper format specifier when logging configuration options. (#11795, @vkalintiris)
Use correct hop count if host is already in memory (#11785, @stelfrag)
Fix proc/interrupts parser (#11783, @maximethebault)
Skip sending hidden dimensions via ACLK (#11770, @stelfrag)
Fix host hop count reported to the cloud (#11768, @stelfrag)
Fix log if D_ACLK is used (#11763, @underhood)
Fix retention message duration when no local metrics are found (#11762, @stelfrag)
Fix an issue with incomplete payload served when https is enabled (#11754, @MrZammler)
Fix a typo in the popcorn information message (#11745, @underhood)
Fix /api/v1/info if ml-info is missing (#11739, @MrZammler)
Fix typo in aclk_query.c (#11737, @eltociear)
Fix online chart in NG not updated properly (#11734, @underhood)
Fix coverity CID #373610 (#11719, @MrZammler)
Fix loading old and custom dashboards (#11710, @rupokify)
Fix coverity issues 373612 & 373611 (#11684, @MrZammler)
Fix warnings from -Wformat-truncation=2 (#11676, @MrZammler)
Fix interval usage and reduce I/O (#11662, @thiagoftsm)
Fix build issue related to legacy aclk and new arch code (#11655, @MrZammler)
Fix typo in URL when calling env (#11651, @underhood)
Fix false poll timeout (#11650, @underhood)
Fix chart config overflow (#11645, @stelfrag)
Fix an overflow when unsigned integer subtracted (#11638, @vlvkobal)
Fix coverity issues 373400-373402 (#11631, @stelfrag)
Fix proper initialization struct with zeroes (#11621, @MrZammler)
Fix https client (#11608, @underhood)
Fix CID 339027 and reverse arguments (#11578, @thiagoftsm)
Fix resource leak when analytics thread stops (#11575, @MrZammler)
Fix coverity report issues CID_373247-373251 (#11549, @stelfrag)
Fix coverity issues for health config (#11535, @MrZammler)
Fix issue with log messages appearing in the terminal instead of the error.log on startup (#11524, @stelfrag)
Fix issues in Alarm API (#11491, @underhood)
Fix list corruption in ACLK sync code and remove fatal (#11444, @stelfrag)
Fix coverity reported issues 372243 - 372248 (#11429, @stelfrag)
Fix CID 372233 to CID 372236 (#11411, @underhood)
Fix bundled protobuf linkage on systems needing -latomic (#11406, @underhood)
Fix coverity issue 372222 (#11404, @stelfrag)
Fix typo in analytics.c (#11329, @eltociear)
Fix coverity errors in ACLK (#11322, @underhood)
Fix confusing error in ACLK Legacy (#11278, @underhood)
Fix an issue to send correct aclk implementation used by agent to posthog. (#11247, @MrZammler)
Fix error on --disable-cloud (#11244, @underhood)
Fix mqtt_websockets submodule version (#11196, @underhood)
Fix claiming script exit code when daemon not running and the claim was successful (#11195, @ilyam8)
Fix loading of class, component and type from health log when sufficient fields are detected. (#11193, @MrZammler)
Fix issue with mqtt_websockets on FreeBSD (#11172, @underhood)
Fix typo in aclk.c (#11170, @eltociear)
Fix mqtt_websockets on macOS (#11145, @underhood)

Deprecation notice

An upcoming stable release of the Netdata agent will include a maintainability update to our base Docker image. A small percentage of users will find that all self-compiled packages must be manually rebuilt after the update, even if relocation/SONAME errors are not encountered. As a workaround, the option --security-opt=seccomp=unconfined can be passed with no default.json, but doing so disables seccomp filtering and weakens the protection of the host against malicious code in the container.

Alternatively, users can prepare for the update by upgrading to one of the following:

runc v1.0.0-rc93
Docker 19.03.9 or greater AND libseccomp 2.4.2 or greater
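
The Docker and libseccomp requirement above can be checked with a simple version comparison. The following is a minimal sketch in Python, assuming you already have the two version strings at hand; the names `parse_version` and `docker_path_ok` are hypothetical helpers, not part of Netdata or Docker. It does not cover the runc option, because comparing pre-release tags such as rc93 needs extra rules.

```python
import re

# Minimum versions quoted in the notice above.
MIN_DOCKER = (19, 3, 9)
MIN_LIBSECCOMP = (2, 4, 2)


def parse_version(text):
    """Turn a string such as '19.03.9' or 'v2.4.2' into a tuple of ints."""
    numbers = re.findall(r"\d+", text)
    return tuple(int(n) for n in numbers[:3])


def docker_path_ok(docker_version, libseccomp_version):
    """Return True only when Docker AND libseccomp both meet the minimums."""
    return (
        parse_version(docker_version) >= MIN_DOCKER
        and parse_version(libseccomp_version) >= MIN_LIBSECCOMP
    )
```

For example, `docker_path_ok("19.03.8", "2.5.1")` is false because Docker is one patch release too old, even though libseccomp is new enough. Both conditions have to hold.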

While Netdata previously avoided making this update to minimize inconvenience to our users, we are now facing a third-party end-of-life date, and we believe the small number of affected users justifies the change.

Additionally, in a future stable release, we will be removing our legacy agent-to-cloud connection. Most users should see no change in this upgrade, but we will lose SOCKS5 proxy support for the Netdata Cloud functionality, which will affect a small number of users.

Support options

As we grow, we stay committed to providing the best support ever seen from an open-source solution. Should you encounter an issue with any of the changes made in this release or any feature in the Netdata agent, feel free to contact us through one of the following channels:

GitHub: You can use our GitHub repo to report bugs and submit feature requests.
Community forum: You can visit our community forum for questions and training.
NEW: Discord: You can jump into our Discord for interactive, synchronous help and discussion. More than 700 engineers are already using it! Join us!