Update dependency zeek/zeek to v6.0.0

renovate requested to merge renovate/zeek-zeek-6.0.x into development

This MR contains the following updates:

Package Update Change
zeek/zeek patch v6.0.0-6 -> v6.0.0

Release Notes

zeek/zeek

v6.0.0

Breaking Changes

  • Zeek now treats private address space (i.e., non-routable IP address ranges) as local by default, matching the intuition of many users that e.g. a 192.168/16 IP address should show up as local in the logs. To do this, Zeek automatically adds Site::private_address_space to Site::local_nets at startup. Subsequent runtime updates to Site::private_address_space propagate to Site::local_nets, while updates to the latter don't affect the former.

    You're free to define Site::local_nets as before and do not need to update your configurations. If you added standard private address space to Site::local_nets in the past, you no longer need to do so. This also applies to zeekctl's networks.cfg file.

    The new global Boolean Site::private_address_space_is_local, true by default, controls the behavior. A redef to false brings back Zeek's prior behavior of considering private address space an unrelated concept, which will come in handy for example when working with tests that compare results against log baselines that have not yet been updated.
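
    As a minimal sketch of the switch described above, a site that prefers the pre-6.0 behavior could add something like the following to its local.zeek. The 203.0.113.0/24 network is a placeholder chosen for illustration, and Site::is_local_addr() is assumed to be available from the default-loaded base scripts:

    ```zeek
    # Restore the old behavior: private address space is no longer
    # added to Site::local_nets automatically.
    redef Site::private_address_space_is_local = F;

    # Placeholder network; substitute your own ranges.
    redef Site::local_nets += { 203.0.113.0/24 };

    event zeek_init()
        {
        # With the flag set to F, 192.168.1.10 counts as local only if a
        # covering subnet was added to Site::local_nets explicitly.
        print Site::is_local_addr(192.168.1.10);
        }
    ```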

  • Telemetry centralization and Prometheus exposition are no longer enabled by default. Previously, the manager node would open port 9911/tcp by default and import all metrics from other nodes. For large clusters, the current implementation introduces significant processing overhead on the manager even if the Prometheus functionality is not used. While inconvenient, this functionality (assumed to be used by few as of now) is now disabled by default to preserve resources.

    The script to enable centralization and the Prometheus endpoint is now located in the policy/ folder. Re-enable the old functionality with:

    @load frameworks/telemetry/prometheus

    You may experiment with increasing Broker::metrics_export_interval (default 1s) to reduce the extra overhead and communication at the expense of stale metrics.
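
    Put together, a site configuration that re-enables the endpoint and trades metric freshness for lower overhead might look roughly like this. The 10sec value is an arbitrary example, not a recommendation from the release notes:

    ```zeek
    # Re-enable centralization and the Prometheus endpoint on the manager.
    @load frameworks/telemetry/prometheus

    # Export metrics less often than the 1s default (illustrative value).
    redef Broker::metrics_export_interval = 10sec;
    ```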

  • Custom source tarballs require a repo-info.json file.

    Note, should you be using official Zeek release tarballs only, or build Zeek solely from git checkouts, this does not affect you.

    However, if you're building your own Zeek source tarballs, it is now required that a repo-info.json file exists at the top-level. The dist target was extended to add this file and official Zeek release source tarballs will contain it going forward.

    The following command can be used to produce repo-info.json:

    python3 ./ci/collect-repo-info.py --only-git > ../path/to/tarballdir/repo-info.json

    This is required to support the new -V / --build-info option that provides information about git submodules and included plugins used during the build. The ci/collect-repo-info.py tool runs at ./configure time and either collects the required information from a git clone (when git is installed), or otherwise uses the content of a file named repo-info.json.

    If you see opportunities to extend repo-info.json with further information, please get in touch.

  • Plugin authors should raise the minimum required CMake version to 3.15 to ensure compatibility with new CMake scaffolding included in this release. Older versions will trigger a warning at configuration time and, depending on the functionality included in the plugin, may trigger subsequent errors during configuration or build.

  • Zeek container images are not pushed to the zeekurity organization anymore. Please switch to using the zeek/zeek image on DockerHub, or the images published to public.ecr.aws/zeek/zeek.

  • The IRC_Data analyzer declaration has been moved to protocols/irc/IRC.h.

  • The error message returned when using bro_init, bro_done, and bro_script_loaded events is now removed. Usage of these events has returned that error during script parsing for a few years, and the time has come to finally remove it.

New Functionality

  • Zeek now features experimental JavaScript support:

    /* hello.js */
    zeek.on('zeek_init', () => {
      console.log('Hello, Zeek!');
    });

    $ zeek ./hello.js
    Hello, Zeek!

    When building Zeek on a system that features a recent (16.13+) version of the libnode package with development headers, Zeek automatically includes the externally-maintained ZeekJS plugin (https://github.com/corelight/zeekjs) as a builtin plugin. This allows Zeek to load and execute JavaScript code located in .js or .cjs files. When no such files are passed to Zeek, the JavaScript engine and Node.js environment aren't initialized and there is no runtime impact.

    The Linux distributions Fedora 37 & 38, Ubuntu 22.10, and the upcoming Debian 12 release provide suitable packages. On other platforms, Node.js can be built from source with the --shared option.

    To disable this functionality, pass --disable-javascript to configure.

  • Zeek now comes with Spicy support built in, meaning it can now leverage any analyzers written in Spicy out of the box. While the interface layer connecting Zeek and Spicy used to be implemented through an external Zeek plugin, that code has now moved into the Zeek code base itself. We also added infrastructure to Zeek that enables its built-in standard analyzers to use Spicy instead of Binpac. As initial (simple) examples, Zeek's Syslog and Finger analyzers are now implemented in Spicy. While their legacy versions remain available as fallbacks for now in case Spicy gets explicitly disabled at build time, their use is deprecated and their code won't be maintained any further. (Some of these Spicy updates were part of Zeek 5.2 already, but hadn't been included in its NEWS section.)

  • Zeek events now hold network timestamps. For scheduled events, the timestamp represents the network time for which the event was scheduled; otherwise, it is the network time at event creation. A new bif current_event_time() retrieves the current event's network timestamp within the script layer.

    When Zeek sends events via Broker to other nodes in a cluster, an event's network timestamp is attached to the Broker messages. On a receiving Zeek node executing a handler for a remote event, current_event_time() returns the network time of the sending node at the time the event was created.

    The Broker-level implementation allows exchanging arbitrary event metadata, but Zeek's script and C++ APIs currently only expose network timestamp functionality.
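
    The following sketch illustrates the scheduled-event case. The event name my_tick and the 5sec delay are hypothetical, chosen only for this example:

    ```zeek
    # Hypothetical event used only for illustration.
    global my_tick: event();

    event my_tick()
        {
        # current_event_time() is the network time the event was scheduled
        # for; network_time() is Zeek's network time when the handler runs.
        print fmt("event time: %s, network time: %s",
                  current_event_time(), network_time());
        }

    event zeek_init()
        {
        schedule 5sec { my_tick() };
        }
    ```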

  • A new bif from_json() can be used to parse JSON strings into records.

    type A: record {
        a: addr;
    };

    local p = from_json("{\"a\": \"192.168.0.1\"}", A);
    if ( p$valid )
        print (p$v as A);

    Implicit conversion from JSON to Zeek types is implemented for bool, int, count, real, interval (number as seconds) and time (number as unix timestamp), port (strings in "80/tcp" notation), patterns, addr, subnet, enum, sets, vectors and records similar to the rules of the input framework. Optional or default record fields are allowed to be missing or null in the input.
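
    To illustrate the conversion rules for ports and optional fields, here is a small sketch. The Endpoint record and its field names are made up for this example; only from_json() and the $valid / $v result fields come from the description above:

    ```zeek
    # Hypothetical record type for illustration.
    type Endpoint: record {
        host: addr;
        p: port;
        note: string &optional;  # may be absent or null in the JSON input
    };

    event zeek_init()
        {
        # Inner quotes must be escaped inside the Zeek string literal.
        local res = from_json("{\"host\": \"10.0.0.1\", \"p\": \"80/tcp\"}", Endpoint);

        if ( ! res$valid )
            {
            print "input could not be converted to Endpoint";
            return;
            }

        local e = res$v as Endpoint;
        print e$host, e$p, e?$note;
        }
    ```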

  • Zeek now provides native "Community ID" support with a new bif called community_id_v1(). Two policy scripts, protocols/conn/community-id-logging and frameworks/notice/community-id, extend the respective logs with a community_id field in the same way the external zeek-community-id plugin does. The main difference from the external hash_conn() bif is that community_id_v1() takes a conn_id record instead of a connection.

    Loading the new policy scripts and using the external zeek-community-id plugin at the same time is unsupported.
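
    A brief sketch of both usage styles follows: loading the policy scripts for logging, and calling the bif directly with a conn_id. The choice of connection_state_remove as the hook is just an example:

    ```zeek
    # Add the community_id field to conn.log and notice.log.
    @load protocols/conn/community-id-logging
    @load frameworks/notice/community-id

    event connection_state_remove(c: connection)
        {
        # community_id_v1() takes the conn_id record (c$id),
        # not the connection itself.
        print community_id_v1(c$id);
        }
    ```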

  • ZeekControl is now multi-logger aware. When multiple logger nodes are configured in ZeekControl's node.cfg, by default the log archival logic adds a logger's name as suffix to the rotated file name:

    stats.11:18:57-11:19:00-logger-1.log.gz
    stats.11:18:57-11:19:00-logger-2.log.gz

    Previously, in a multi-logger setup, individual logger processes would overwrite each other's log files during rotation, causing data loss.

    For setups with a single logger, there's no change in behavior. The naming of the final logs can be customized by providing an alternative make-archive-name script and using the new ZEEK_ARG_LOG_SUFFIX environment variable.

  • A supervisor controlled Zeek cluster is now multi-logger aware. This avoids loggers overwriting each other's log files within a single log-queue directory. By default, a logger's name is appended to the rotated logs by zeek-archiver.

  • Introduce a new command-line option -V / --build-info. It produces verbose output in JSON format about the repository state and any included plugins.

  • The X.509 certificate parser now exposes the signature type that is given inside the signed portion of the certificate.

  • The SSL parser now parses the CertificateRequest handshake message. There is a new ssl_certificate_request event and a new parse_distinguished_name function. We also added the protocols/ssl/certificate-request-info policy script, that adds some additional information to ssl.log.

  • Add logging metrics for streams (zeek-log-stream-writes) and writers (zeek-log-writer-writes-total).

  • Add networking metrics via the telemetry framework. These are enabled when the misc/stats script is loaded.

    zeek-net-dropped-packets
    zeek-net-link-packets
    zeek-net-received-bytes
    zeek-net-packet-lag-seconds
    zeek-net-received-packets-total

    Except for lag, metrics originate from the get_net_stats() bif and are updated through the Telemetry::sync() hook every 15 seconds by default.

  • The DNS analyzer now parses RFC 2535's AD ("authentic data") and CD ("checking disabled") flags from DNS requests and responses, making them available in the dns_msg record provided by many of the dns_* events. The existing Z field remains unchanged and continues to subsume the two flags, for backward compatibility.
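
    As a rough sketch, the new flags could be inspected from any event that carries a dns_msg. The field names AD and CD are assumed here to match the flag names, and dns_message is used only as a convenient example event:

    ```zeek
    event dns_message(c: connection, is_orig: bool, msg: dns_msg, len: count)
        {
        # Assumption: the dns_msg record exposes the flags as msg$AD and msg$CD.
        if ( msg$AD || msg$CD )
            print fmt("%s: AD=%s CD=%s", c$id$orig_h, msg$AD, msg$CD);
        }
    ```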

  • The supervisor framework can now start worker nodes that read from a trace file.

  • Zeek can be prevented from updating network_time() to the current time by setting allow_network_time_forward=F. Together with set_network_time() or a custom plugin, this allows control of network_time() without Zeek interfering.

  • The setting Pcap::non_fd_timeout can be used to configure the timeout used by non-selectable packet sources in the idle case (default 20usec). This value has previously been hard-coded, but increasing it can significantly reduce idle CPU usage in low packet rate deployments.
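
    For a low packet rate deployment, the timeout could be raised along these lines. The 100usec value is purely illustrative and should be tuned per deployment:

    ```zeek
    # Raise the idle timeout for non-selectable packet sources
    # from the 20usec default (illustrative value).
    redef Pcap::non_fd_timeout = 100usec;
    ```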

  • Zeek now supports a new @pragma directive. It currently allows suppressing deprecation warnings in Zeek scripts by opening with @pragma push ignore-deprecations and closing with @pragma pop. This particularly helps in situations where use of the Zeek base scripts, for example to populate a deprecated field for API compatibility, would otherwise trigger deprecation warnings.

  • The Reporter class was extended by a Deprecation() method to use for logging deprecations rather than using ad-hoc Warning() calls.

  • The network statistics record type features a new pkts_filtered field for reporting the number of packets that the interface filtered before hand-off to Zeek. Packet source implementations are free to fill this field as feasible. The default pcap packet source does not provide this information because its availability depends on the libpcap version.

  • Packet statistics (packets received, packets dropped, bytes received, packets seen on link, and packets filtered) are now reported to the Telemetry framework, under the zeek_net prefix.

  • Zeek's cluster framework provides the new get_node_count(node_type: NodeType) function to obtain the number of nodes for a given node type as defined in the cluster layout. Furthermore, broadcast_topics was added as a collection of broker topics that can be used to reach all nodes in a cluster.
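
    A minimal sketch of querying the cluster layout, assuming the function lives in the Cluster namespace alongside the existing Cluster::WORKER node type:

    ```zeek
    event zeek_init()
        {
        # Only meaningful when running as part of a cluster.
        if ( Cluster::is_enabled() )
            print fmt("cluster layout defines %d workers",
                      Cluster::get_node_count(Cluster::WORKER));
        }
    ```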

  • The new Cluster::Experimental namespace has been introduced to Zeek's cluster framework to provide experimental features. Based on practical experiences and the adoption of an experimental feature, it may become a regular feature or be removed in future releases. Experimental features are loaded via: @load policy/frameworks/cluster/experimental

  • Zeek's cluster framework provides two new experimental events:

    • cluster_started: This event will be broadcasted from the manager once all cluster-level connections have been established based on the given cluster layout. If any node restarts (including the manager itself), the event will neither be rebroadcasted nor raised locally for the restarted node.

    • node_fully_connected: This event will be sent to the manager and raised locally once a cluster node has successfully conducted cluster-level handshakes for all its outgoing connections to other cluster nodes based on the given cluster layout.

    Note: There is no tracking of cluster node connectivity. Thus, there is no guarantee that all peerings still exist at the time of these events being raised.
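
    A handler for the first of these events might be sketched as follows. It assumes the event is exported as Cluster::Experimental::cluster_started and takes no arguments:

    ```zeek
    @load policy/frameworks/cluster/experimental

    # Assumption: the event lives in the Cluster::Experimental namespace
    # and has an empty parameter list.
    event Cluster::Experimental::cluster_started()
        {
        print "all cluster-level connections have been established";
        }
    ```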

  • The IEEE 802.11 packet analyzer gains the ability to parse encapsulated A-MSDU packets, instead of just dropping them. It also gains the ability to properly recognize CCMP-encrypted packets. These encrypted packets are currently dropped due to Zeek's inability to do anything with them.

  • Add packet analyzers for LLC, SNAP, and Novell 802.3, called from the Ethernet and VLAN analyzers by default.

  • Environment variables for the execution of log rotation postprocessors can be set via Log::default_rotation_postprocessor_cmd_env.

  • The record_field record was extended by optional and record_fields() can now be used to determine the optionality of record fields.

  • The ip4_hdr record was extended by DF, MF, offset and sum to aid packet-level analysis use-cases.

  • Zeek now supports parsing the recently standardized DTLS 1.3. Besides the protocol messages being correctly parsed and raising the typical SSL/TLS events, the biggest visible change is the newly added ssl_extension_connection_id event.

  • The NTP analyzer now recognizes when client and server mode messages disagree with the notion of "originator" and "responder" and flips the connection. This can happen in packet loss or packet re-ordering scenarios. Such connections will have a ^ added to their history.

  • New bifs for ceil() and log2() have been added.

  • Seeds for deterministic processing can now also be set through a new environment variable called ZEEK_SEED_VALUES. The format is expected to contain 21 positive numbers separated by spaces.

Changed Functionality

  • The base distribution of the Zeek container images has been upgraded to Debian 12 "bookworm" and JavaScript support was enabled.

  • When get_file_handle() is invoked for an analyzer that did not register an appropriate callback function, log a warning and return a generic handle value based on the analyzer and connection information.

  • The &on_change attribute of sets and tables is propagated through copy().

  • Revert back to old method of preallocating PortVal objects for all valid port numbers, as it was implemented prior to the Windows port. Not preallocating these objects saves a minor amount of memory for short runs of Zeek, but comes at a performance cost for having to allocate the objects every time a new port is seen plus do map lookups for each port. This memory savings is mostly lost for long runs of Zeek, since all of the ports will likely end up allocated in time.

    If the version from the Windows port is desired, a new configure option --disable-port-prealloc will disable the preallocation and enable the map lookup version.

  • The main-loop has been changed to process all ready IO sources with a zero timeout in the same loop iteration. Previously, two zero-timeout sources would require two main-loop iterations. Further, when the main-loop is polling IO sources with file descriptors, zero timeout IO sources are added to the list of sources to be processed as well.

    The intervals to decide when Zeek checks FD-based IO sources for readiness have been made configurable through io_poll_interval_default and io_poll_interval_live for ease of testing, development and debugging of the main-loop.

  • Zeek does not arbitrarily update network_time() to current time anymore. When a packet source is providing a constant stream of packets, packets drive network time. Previously, Zeek updated network time to current time in various situations, disregarding timestamps of network packets. Zeek will now update network_time() only when a packet source has been inactive/idle for an interval of packet_source_inactivity_timeout (default 100msec). When a worker process suddenly observes no packets, timer expiration may initially be delayed by packet_source_inactivity_timeout.
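
    If the default inactivity window does not suit a deployment, it can presumably be tuned with a redef such as the one below. The 500msec value is an arbitrary example:

    ```zeek
    # Wait longer before an idle packet source lets Zeek advance
    # network_time() to the current time (illustrative value).
    redef packet_source_inactivity_timeout = 500msec;
    ```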

  • Calling suspend_processing() when reading traces does not update network time to the current time anymore. Instead, Zeek keeps network_time() according to the trace file. This causes scheduled events to not fire once suspend_processing() is called, which seems more reasonable than arbitrarily setting network_time() to current time. Processing can still be continued from broker events or input readers.

  • Previously, Zeek would process and dispatch events for the very first packet in a trace file in order to initialize time, even if suspend_processing() was called in a zeek_init() handler. This has been changed such that the first packet will only be processed once continue_processing() has been invoked again. Some background around the previous behavior can be found in GH-938. Given that the network_time_init() event explicitly signals initialization of network time, this behavior seems more reasonable.

  • If an event is scheduled with a 0.0sec timeout from a zeek_init() handler that also invokes suspend_processing(), the scheduled event will fire immediately with network_time() still yielding 0.0. Previously, network_time() was set to the current time. The new behavior provides more deterministic operation and aligns with timers stopping during a suspend_processing().

  • Broker no longer initializes network time to current time when processing input. Particularly in combination with pcap processing this was not desirable behavior.

  • The IO loop's poll interval is now correctly reduced from 100 to 10 for live packet sources. This should lower CPU usage for deployments with non-selectable packet sources.

  • Zeek's CMake scaffolding has received an overhaul for modernizing the build system and to make it easier to maintain going forward. Plugins can now use a declarative interface for adding all sources, BIFs, etc. in one block instead of using the previous begin/end functions. While the old plugin functions still exist for backward compatibility, the underlying codebase requires newer CMake features. Plugin authors should raise their minimum required CMake version to 3.15, to match Zeek's.

  • The IRC data analyzer does not extract DCC acknowledgements to files anymore. Instead, irc_dcc_send_ack is raised with the bytes acknowledged by the recipient.

  • The IRC base script now uses file_sniff() instead of file_new() for DCC file transfers to capture fuid and inferred MIME type in irc.log.

  • The ignore_checksums script variable now reflects the correct value when using the -C command-line flag.

  • Support for ARUBA GRE tunnels now covers all of the known protocol type values for those tunnels.

  • The vlan field reported by the AF_PACKET packet source is now properly masked to exclude PCP and DEI bits. Previously, these bits were included and could cause invalid vlan values > 4095 to be reported.

  • Libpcap based packet source now avoids the 32bit wraparound of link and dropped packet counters as reported by users.

  • The ssl_history field in ssl.log indicates that the letter j is reserved for hello retry requests. However, this logging was never fully implemented; instead, hello retry requests were logged as a server hello (with the letter s). This oversight was fixed, and hello retry requests are now correctly logged.

  • When per-connection SMB parser state (read offsets, tree ids, ...) exceeds SMB::max_pending_messages (default 1000), Zeek discards such per-connection state and raises a new smb2_discarded_messages_state() event. This event is used to reset script-layer SMB state. This change provides protection against unbounded state growth due to partial or one-sided SMB connections.

    Setting SMB::max_pending_messages to 0 can be used to switch back to the previous behavior of not discarding state. Setting SMB::enable_state_clear to F skips the script-layer state clearing logic.
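
    For example, a site that wants the pre-6.0 behavior back could use a redef like this one (a sketch, assuming the setting is redef-able as the text implies):

    ```zeek
    # 0 disables discarding of per-connection SMB parser state,
    # which restores the previous (unbounded) behavior.
    redef SMB::max_pending_messages = 0;
    ```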

  • Fix disable_analyzer() builtin function crashing when attempting to disable connection's root analyzers.

  • Zeek script vectors now support negative indices.

    local v = vector(1, 2, 3);
    print v[-1]; # prints 3

  • Function parameters are rendered by Zeekygen as :param x rather than just :x:. This allows grouping parameters in Zeek's documentation.

Removed Functionality

  • Mixing vector and scalar operands for binary expressions, like addition, multiplication, etc., is now an error.

  • Using deprecated when semantics without capturing variables is now an error.

  • Referencing local variables in a more outer scope than where they were declared is now an error.
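
    For the when change listed above, scripts need to name the local variables they use in an explicit capture list. The sketch below shows the capture syntax; the lookup of 127.0.0.1 and the 5sec timeout are illustrative choices only:

    ```zeek
    event zeek_init()
        {
        local host = 127.0.0.1;

        # Locals used inside the when statement are listed in [ ... ].
        when [host] ( local name = lookup_addr(host) )
            {
            print fmt("%s resolves to %s", host, name);
            }
        timeout 5sec
            {
            print fmt("lookup of %s timed out", host);
            }
        }
    ```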

Deprecated Functionality

  • The cluster framework's worker_count has been deprecated in favor of the new function get_active_node_count(node_type: NodeType) that can be used to obtain the number of nodes of a given type the calling node is currently connected to.
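
    A migration away from worker_count might look roughly like the following, assuming the replacement function is exported from the Cluster namespace. Cluster::node_up is used here only as a convenient point at which to query it:

    ```zeek
    event Cluster::node_up(name: string, id: string)
        {
        # Number of workers the calling node is currently connected to.
        print fmt("%s connected; active workers: %d", name,
                  Cluster::get_active_node_count(Cluster::WORKER));
        }
    ```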

v6.0.0-rc3

v6.0.0-rc2

v6.0.0-rc1


Configuration

📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 Automerge: Enabled.

Rebasing: Whenever MR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this MR and you won't be reminded about this update again.


  • If you want to rebase/retry this MR, check this box

This MR has been generated by Renovate Bot.
