UNCLASSIFIED - NO CUI


Update zeek/zeek Docker tag to v8

This MR contains the following updates:

Package    Type             Update  Change
zeek/zeek                   major   7.2.2 -> 8.0.0
zeek/zeek  ironbank-github  major   v7.2.2 -> v8.0.0

⚠️ Warning

Some dependencies could not be looked up. Check the warning logs for more information.


Release Notes

zeek/zeek (zeek/zeek)

v8.0.0


We would like to thank @aidans111, Anthony Verez (@netantho), Baa (@Baa14453), Bhaskar Bhar (@bhaskarbhar), @dwhitemv25, EdKo (@ephikos), @edoardomich, Fupeng Zhao (@AmazingPP), hendrik.schwartke@os-s.de (@hendrikschwartke), @i2z1, Jan Grashöfer (@J-Gras), Jean-Samuel Marier, Justin Azoff (@JustinAzoff), Mario D (@mari0d), Markus Elfring (@elfring), Peter Cullen (@pbcullen), Sean Donaghy, Simeon Miteff (@simeonmiteff), Steve Smoot (@stevesmoot), @timo-mue, @wojciech-graj, and Xiaochuan Ye (@XueSongTap) for their contributions to this release.

Breaking Changes

  • Zeek by default now depends on the availability of the ZeroMQ library for building and running. This is in preparation for switching to the ZeroMQ-based cluster backend by default in future Zeek versions. On an Ubuntu-based system, the required system packages are libzmq5, libzmq3-dev and cppzmq-dev. See the Dockerfiles in the ci/ directory for other supported platforms.

  • Zeek and all of its associated submodules now require C++20-capable compilers to build. This will let us move forward in using more modern C++ features and replace some workarounds that we have been carrying. Minimum recommended versions of compilers are GCC 10, Clang 8, and Visual Studio 2022.

  • The zeek::Span class has been deprecated and the APIs in the telemetry subsystem switched to use std::span instead of zeek::Span. If your plugin instantiates counter or gauge instances using the telemetry subsystem and you've previously used zeek::Span explicitly, updates may be needed.

  • The code base underwent a big cleanup of #include usage, across almost all of the files. We tested builds of all of the existing third-party packages and only noticed one or two failures, but there is a possibility for breakage related to this cleanup.

  • The lookup_connection() and connection_exists() builtin functions now require conn_id instances as their argument, rather than internally supporting duck-type matching of conn_id-like records.

  • Network timestamps are not added to events by default anymore. Use the following redef line to enable them:

    redef EventMetadata::add_network_timestamp = T;

    The background is that event metadata has become more generic and may incur a small overhead when enabled. There are not enough users of network timestamp metadata to justify the complexity of treating it separately.

  • The ASCII writer's JSON::TS_MILLIS timestamp format was changed to produce signed integers. This matters for the representation of timestamps that are before the UNIX epoch. These are now written as negative values, while previously the negative value was interpreted as an unsigned integer, resulting in very large timestamps, potentially causing issues for downstream consumers.

    If you prefer to always have unsigned values, it's possible to revert to the previous behavior by setting:

    redef LogAscii::json_timestamps = JSON::TS_MILLIS_UNSIGNED;
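For downstream consumers, the difference between the two formats can be sketched in Python. This is an illustration only. It assumes the old output was the 64-bit two's-complement reinterpretation of the signed millisecond value, which the note above implies but does not spell out, and all function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the pre-8.0 output was the 64-bit two's-complement
# reinterpretation of the signed millisecond value.
UINT64_MODULUS = 1 << 64
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_unsigned_millis(signed_ms: int) -> int:
    """Model the old behaviour: a negative value wraps to a very large one."""
    return signed_ms % UINT64_MODULUS


def to_signed_millis(unsigned_ms: int) -> int:
    """Recover the signed value from a wrapped unsigned one."""
    if unsigned_ms >= 1 << 63:
        return unsigned_ms - UINT64_MODULUS
    return unsigned_ms


def millis_to_datetime(signed_ms: int) -> datetime:
    """Convert signed epoch milliseconds to an aware UTC datetime."""
    return EPOCH + timedelta(milliseconds=signed_ms)


# One second before the UNIX epoch, in both representations.
wrapped = to_unsigned_millis(-1000)
restored = to_signed_millis(wrapped)
```

A consumer that still receives wrapped values could apply `to_signed_millis()` before converting to a date. Values written by Zeek 8 in the default format pass through it unchanged.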

  • The "endpoint" label of metrics exposed via Prometheus or the telemetry.log was renamed to "node". This is done for consistency with cluster terminology: the label values have always been the value of Cluster::node, so it is more intuitive to call the label that. The "endpoint" name originated from a time when the telemetry framework was implemented in Broker.

    To revert to the "endpoint" label, you can do the following, but we strongly suggest migrating to the new default "node" instead:

    redef Telemetry::metrics_endpoint_label = "endpoint";
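Tooling that reads these metrics during a migration may see both label names. A minimal Python sketch of a tolerant lookup follows. The helper name and the dictionary representation of a label set are assumptions for illustration, not part of Zeek.

```python
from typing import Dict, Optional


def node_label(labels: Dict[str, str]) -> Optional[str]:
    """Return the cluster node name from a metric's label set.

    Prefers the Zeek 8 "node" label and falls back to the pre-8.0
    "endpoint" label, so data from mixed versions can be read alike.
    """
    if "node" in labels:
        return labels["node"]
    return labels.get("endpoint")
```

Once every node runs Zeek 8 with the default label, the fallback branch can be dropped.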

  • The current_event_time() builtin function as well as Event::Time() and EventMgr::CurrentEventTime() now return -1.0 if no timestamp metadata is available for the current event, or if no event is being dispatched. Previously this would've been 0.0, or the timestamp of the previously dispatched event.

  • Missing network timestamp metadata on remote events is not set to the local network time anymore by default. This potentially hid useful debugging information about another node not sending timestamp metadata. The old behavior can be re-enabled as follows:

    redef EventMetadata::add_missing_remote_network_timestamp = T;

  • The IsPacketSource() method on IOSource was removed. It was unused and incorrectly returned false on all packet sources.

  • The --with-binpac and --with-bifcl arguments for configure are now deprecated. Both arguments have for a long time just used the internal version of the tooling even if something was passed, so they were mostly useless. This may cause breakage of cross-compiling, where the binpac and bifcl tooling needs to be run on the host machine. We haven't heard from anyone that this is the case with the arguments in their currently-broken state.

  • The parsing of data for the ssl_session_ticket_handshake event was fixed. In the past, the data contained two extra bytes before the session ticket data. The event now contains only the session ticket data. You might have to adjust your scripts if you manually worked around this bug in the past.

New Functionality

  • Zeek now supports pluggable and customizable connection tracking. The default behavior remains unchanged and uses a connection's five tuple based on the IP/port pairs and proto field. Zeek 8 ships with one additional implementation, to factor VLAN tags into the connection tracking. To switch to VLAN-aware connection tracking:

    @load frameworks/conn_key/vlan_fivetuple

    By convention, additional fields used by alternative ConnKey implementations are added into the new ctx field of conn_id. The type of ctx is conn_id_ctx.

    The vlan_fivetuple script adds two additional fields to the conn_id_ctx record type, representing any VLAN tags involved. Accordingly, every log using conn_id reflects the change as well, because ctx and the VLAN fields have the &log attribute. The columns used for logging will be named id.ctx.vlan and id.ctx.inner_vlan.

    This feature does not automatically provide a notion of endpoint that corresponds with the effective connection tuple. For example, applications tracking endpoints by IP address do not somehow become VLAN-aware when enabling VLAN-aware tracking.

    Users may experiment with their own notion of endpoint by combining the orig_h or resp_h field of conn_id with the new ctx field. For example, tracking the number of connections from a given host in a VLAN-aware fashion can be done as follows:

    global connection_counts: table[conn_id_ctx, addr] of count &default=0;

    event new_connection(c: connection)
        {
        ++connection_counts[c$id$ctx, c$id$orig_h];
        }

    Note that this script snippet isn't VLAN-specific, yet it is VLAN-aware if the vlan_fivetuple script is loaded. In future Zeek versions, this pattern is likely to be used to adapt base and policy scripts for more "context awareness".

    Users may add their own plugins (for example via a zkg package) to provide alternative implementations. This involves implementing a factory for connection "keys" that factor in additional flow information. See the VLAN implementation in the src/packet_analysis/protocol/ip/conn_key/vlan_fivetuple directory for an example.
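On the log-consumer side, the new id.ctx.vlan and id.ctx.inner_vlan columns allow the same kind of VLAN-aware counting. The Python sketch below assumes JSON-formatted conn.log lines with flattened dotted keys. The sample records and the function name are hypothetical.

```python
import json
from collections import Counter
from typing import Iterable


def count_connections(lines: Iterable[str]) -> Counter:
    """Count conn.log entries per (vlan, inner_vlan, originator address)."""
    counts: Counter = Counter()
    for line in lines:
        entry = json.loads(line)
        key = (
            entry.get("id.ctx.vlan"),        # None if the column is absent
            entry.get("id.ctx.inner_vlan"),  # None if the column is absent
            entry["id.orig_h"],
        )
        counts[key] += 1
    return counts


# Hypothetical log lines: the same address seen in two different VLANs.
SAMPLE = [
    '{"id.orig_h": "10.0.0.1", "id.resp_h": "10.0.0.2", "id.ctx.vlan": 10}',
    '{"id.orig_h": "10.0.0.1", "id.resp_h": "10.0.0.2", "id.ctx.vlan": 20}',
    '{"id.orig_h": "10.0.0.1", "id.resp_h": "10.0.0.3", "id.ctx.vlan": 10}',
]

counts = count_connections(SAMPLE)
```

Because absent columns map to None, the same function also works on logs written without the vlan_fivetuple script loaded. In that case it degrades to plain per-address counting.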

  • Added support to ZeekControl for seamlessly switching to ZeroMQ as cluster backend by adding the following settings to zeekctl.cfg:

    ClusterBackend = ZeroMQ
    UseWebSocket = 1

    With the ZeroMQ cluster backend, Zeekctl must use Zeek's WebSocket API to communicate with individual nodes for the print and netstats commands. Setting the UseWebSocket option enables a WebSocket server on the manager node, listening on 127.0.0.1:27759 by default (this is configurable using the newly introduced WebSocketHost and WebSocketPort options). The UseWebSocket option can also be used when ClusterBackend is set to Broker, but isn't strictly required.

    For ZeroMQ (or other future cluster backends), setting UseWebSocket is a requirement as Zeekctl does not speak the native ZeroMQ protocol to communicate with cluster nodes for executing commands. This functionality requires the websockets Python package with version 11.0 or higher.

  • Cluster telemetry improvements. Zeek now exposes a configurable number of metrics regarding outgoing and incoming cluster events. By default, the number of events sent and received by a Zeek cluster node and any attached WebSocket clients is tracked as four individual counters. It's possible to gather more detailed information by adding Cluster::Telemetry::VERBOSE and Cluster::Telemetry::DEBUG to the variables Cluster::core_metrics and Cluster::websocket_metrics:

    redef Cluster::core_metrics += { Cluster::Telemetry::VERBOSE };
    redef Cluster::websocket_metrics += { Cluster::Telemetry::DEBUG };

    Configuring verbose adds metrics that are labeled with the event handler and topic name. Configuring debug uses histogram metrics to additionally track the distribution of the serialized event size. Additionally, when debug is selected, outgoing events are labeled with the script location from where they were published.

  • Support for the X-Application-Name HTTP header was added to the WebSocket API at v1/messages/json. A WebSocket application connecting to Zeek may set the X-Application-Name header to a descriptive identifier. The value of this header will be added to the cluster metrics as the app label. This allows gathering incoming and outgoing event metrics of a specific WebSocket application, simply by setting the X-Application-Name header.
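To show where the header goes on the wire, the following Python sketch builds the opening handshake of a WebSocket connection (RFC 6455) by hand, using only the standard library. The host, port and application name are example values, and the function name is hypothetical. A real client would normally pass the header through its WebSocket library instead.

```python
import base64
import os


def build_handshake(host: str, port: int, app_name: str) -> bytes:
    """Build an RFC 6455 opening handshake carrying X-Application-Name."""
    # The Sec-WebSocket-Key is 16 random bytes, base64-encoded.
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    lines = [
        "GET /v1/messages/json HTTP/1.1",
        f"Host: {host}:{port}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        f"X-Application-Name: {app_name}",
        "",  # the two empty entries produce the blank line
        "",  # that terminates the HTTP header block
    ]
    return "\r\n".join(lines).encode("ascii")


# Example values only; adjust to the actual WebSocket listener.
request = build_handshake("127.0.0.1", 27759, "my-dashboard")
```

With this request, events exchanged over the connection would be expected to show up in the cluster metrics with the app label set to "my-dashboard".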

  • The SMTP analyzer can now optionally forward the top-level RFC 822 message of individual SMTP transactions to the file analysis framework. This can be leveraged to extract emails in the form of .eml files from SMTP traffic to disk.

    To enable this feature, set the SMTP::enable_rfc822_msg_file_analysis option and implement an appropriate file_new() or file_over_new_connection() handler:

    redef SMTP::enable_rfc822_msg_file_analysis = T;

    event file_over_new_connection(f: fa_file, c: connection, is_orig: bool)
        {
        if ( f$id == c$smtp$rfc822_msg_fuid )
            Files::add_analyzer(f, Files::ANALYZER_EXTRACT, [$extract_filename="email"]);
        }

  • Generic event metadata support. A new EventMetadata module was added allowing to register generic event metadata types and accessing the current event's metadata using the functions current() and current_all() of this module.

  • A new plugin hook, HookPublishEvent(), has been added for intercepting publishing of Zeek events. This hook may be used for monitoring purposes, modifying or rerouting remote events.

    Plugins can implement and enable this hook by calling the following method within their Configure() implementation.

    EnableHook(HOOK_PUBLISH_EVENT)

    The signature of HookPublishEvent() is as follows.

    bool HookPublishEvent(zeek::cluster::Backend& backend, const std::string& topic, zeek::cluster::detail::Event& event);

  • Zeek now includes the Redis protocol analyzer from the evantypanski/spicy-redis project (https://github.com/evantypanski/spicy-redis). This analyzer is enabled by default. This analyzer logs Redis commands and their associated replies in redis.log.

    To disable the analyzer in case of issues, use the following snippet:

    redef Analyzer::disabled_analyzers += { Analyzer::ANALYZER_REDIS, };

  • The FTP analyzer now supports explicit TLS via AUTH TLS.

  • Two new script-level hooks in the Intel framework have been added.

    hook indicator_inserted(indicator_value: string, indicator_type: Intel::Type)

    hook indicator_removed(indicator_value: string, indicator_type: Intel::Type)

    These are reliably invoked on worker and manager nodes the first time an indicator value is inserted into the store and once it has been completely removed from the store.

  • The frameworks/intel/seen scripts have been annotated with event groups and a new frameworks/intel/seen/manage-event-groups policy script added.

    The motivation is to allow Zeek distributors to load the intel/seen scripts by default without incurring their event overhead when no Intel indicators are loaded. Corresponding event handlers are enabled once the first Intel indicator of a given Intel::Type is added. Event handlers are disabled again when the last indicator is removed.

    Note that the manage-event-groups script interacts with the Intel::seen_policy hook: If no indicators for a given Intel::Type are loaded, the Intel::seen_policy will not be invoked as the event handlers extracting indicators aren't executed.

    If you rely on the Intel::seen_policy hook to be invoked regardless of the contents of the Intel store, do not load the manage-event-groups script, or set:

    redef Intel::manage_seen_event_groups = F;

  • The DNS analyzer was extended to support NAPTR RRs (RFC 2915, RFC 3403). A corresponding dns_NAPTR_reply event was added.

  • A new get_tags_by_category BIF method was added that returns a list of tags for a specified plugin category. This can be used in lieu of calling zeek -NN and parsing the output. For example, this will return the list of all analyzer plugins currently loaded:

    get_tags_by_category("ANALYZER");

  • A new conn_generic_packet_threshold_crossed event was introduced. The event triggers for any IP-based session that reaches a given threshold. Multiple packet thresholds can be defined in ConnThreshold::generic_packet_thresholds. The generic thresholds refer to the total number of packets on a connection without taking direction into account (i.e. the event also triggers on one-sided connections).

    The event is intended as an alternative to the new_connection event that allows for ignoring short-lived connections like DNS or scans. For example, it can be used to set up traditional connection monitoring without introducing overhead for connections that would never reach a larger threshold anyway.

  • Zeek now supports extracting the PPPoE session ID. The PacketAnalyzer::PPPoE::session_id BiF can be used to get the session ID of the current packet.

    The conn/pppoe-session-id-logging.zeek policy script adds PPPoE session IDs to the connection log.

  • The get_conn_stats() function's return value now includes the number of packets that have not been processed by any analyzer. Using data from get_conn_stats() and get_net_stats(), it's possible to determine the number of packets that have been received and accepted by Zeek, but eventually discarded without processing.

  • Two new hooks, Cluster::on_subscribe() and Cluster::on_unsubscribe(), have been added to allow Zeek scripts to observe Subscribe() and Unsubscribe() calls on backends.

Changed Functionality

  • The Conn::set_conn function is now always run in new_connection, instead of only being run in connection_state_remove.

  • Logging of failed analyzers has been overhauled. dpd.log was replaced by a new analyzer.log that presents a more unified and consistent view of failed analyzers. The previous analyzer.log was renamed to analyzer_debug.log; see below for more details.

    For protocol analyzers, analyzer.log now reports initially confirmed analyzers that Zeek subsequently removed from the connection due to a protocol violation.

    For file and packet analyzers, all errors will be logged to analyzer.log.

    As part of this work, a new analyzer_failed event has been introduced. This event is raised when an analyzer is removed because of raising a violation.

  • The previous analyzer.log was renamed to analyzer_debug.log, and is no longer created by default. The log file will be created if the frameworks/analyzer/debug-logging.zeek policy script is loaded.

    Note that the namespace for options in the script changed to Analyzer::DebugLogging. Furthermore the default options changed to enable more detailed output by default.

  • Record fields with a &default attribute are now consistently re-initialized after deleting such fields. Previously, this would only work for constant expressions, but has been extended to apply to arbitrary expressions.

  • Publishing remote events with vector arguments that contain holes is now rejected. The receiver side never had a chance to figure out where these holes would have been. There's a chance this breaks scripts that accidentally published vectors with holes. A reporter error is produced at runtime when serialization of vectors with holes is attempted.
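Why holes cannot survive serialization can be modelled in a few lines of Python. This is only a model, not Zeek's implementation: a Zeek vector is represented as a list in which None marks an unset slot, and both helper names are hypothetical.

```python
from typing import List, Optional


def assign(vec: List[Optional[int]], index: int, value: int) -> None:
    """Model a vector assignment: writing past the end leaves holes."""
    while len(vec) <= index:
        vec.append(None)  # unset slots between the old end and the index
    vec[index] = value


def has_holes(vec: List[Optional[int]]) -> bool:
    """Return True if any slot is unset, i.e. publishing would be rejected."""
    return any(element is None for element in vec)


dense = [1, 2, 3]

sparse: List[Optional[int]] = [1, 2]
assign(sparse, 4, 5)  # slots 2 and 3 are now holes
```

In this model `sparse` ends up as `[1, 2, None, None, 5]`. A script that builds vectors this way would need to fill or compact them before publishing.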

  • Kerberos support on macOS has been enabled. Due to incompatibilities, however, the system-provided libkrb5 is ignored. Only versions from Homebrew are supported and picked up by default. Use --with-krb5 to point at a custom libkrb5 installation.

  • The $listen_host configuration for Cluster::listen_websocket()'s WebSocketServerOptions was deprecated. Use the new $listen_addr field instead.

  • The service_violation field of the connection record was marked as deprecated. Consider using the new failed_analyzers field of the connection record instead.

  • detect-protocol.zeek was the last non-deprecated policy script left in frameworks/dpd. It was moved to frameworks/analyzer/detect-protocol.zeek.

  • Running Zeek with Zeekygen for documentation extraction (-X|--zeekygen) now implies -a, i.e., parse-only mode.

  • The not_valid_before and not_valid_after times of X509 certificates are now logged as GMT timestamps. Before, they were logged as local times; thus the output was dependent on the timezone that your system is set to. Similarly, the related events and the Zeek data structures all interpreted times in X509 certificates as local times.
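The timezone dependency described above can be illustrated in Python. This is not Zeek's implementation, and the function name is hypothetical. It only shows that parsing a certificate time as GMT gives the same epoch value on every system.

```python
import calendar
import time


def asn1_utctime_to_epoch(value: str) -> int:
    """Parse an ASN.1 UTCTime string such as "250101000000Z" as GMT."""
    parsed = time.strptime(value, "%y%m%d%H%M%SZ")
    # calendar.timegm() interprets the struct_time as UTC, so the result
    # does not depend on the system timezone. time.mktime() would apply
    # the local timezone instead, which is the kind of dependency the
    # change above removes.
    return calendar.timegm(parsed)


# 2025-01-01 00:00:00 GMT
not_valid_before = asn1_utctime_to_epoch("250101000000Z")
```

Consumers comparing certificate validity times across logs written before and after the upgrade should expect an offset equal to the sensor's former local UTC offset.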

  • The PPPoE parser now respects the size value given in the PPPoE header. Data beyond the size given in the header will be truncated.

  • For record fields whose &default attribute initializes an empty vector, table or set instance, the initialization is now deferred until the field is accessed, potentially improving memory usage when such fields are never accessed.

Removed Functionality

  • The --with-bind argument for configure was removed. We removed the need for the BIND library from our CMake setup in the v7.2 release, but this non-functional argument was left behind.

  • The --disable-archiver argument for configure was removed. This was deprecated and scheduled to be removed in v7.1, but we apparently missed it during the cleanup for that release.

Deprecated Functionality

  • The dpd.log is now deprecated and replaced by analyzer.log (see above). dpd.log is no longer created by default, but can be re-enabled by loading the frameworks/analyzer/deprecated-dpd-log.zeek policy script.

    Relatedly, the service_violation field of the connection record is deprecated and will only be present if the frameworks/analyzer/deprecated-dpd-log.zeek policy script is loaded.

  • The protocols/http/detect-sqli.zeek script has been deprecated in favor of a new protocols/http/detect-sql-injection.zeek script. The new script places the victim host into the dst field of a notice instead of src, and the attacker host into src. Further, notices hold the uid of the first sampled connection.

    Note that the Notice::Type enumeration names remain the same. You can determine which script was used by the presence of populated uid and dst fields in the notice.log entries.

    The replacement script doesn't populate the email_body_sections anymore either.

  • Using &default and &optional together on a record field has been deprecated as it would only result in &default behavior. This will become an error starting with Zeek 8.1.

  • The zeek::Event() constructor was deprecated. Use event_mgr::Enqueue() or event_mgr::Dispatch() instead.

  • Passing ts as the last argument to EventMgr::Enqueue() has been deprecated and will lead to compile time warnings. Use EventMgr::Enqueue(detail::MetadataVectorPtr meta, ...) for populating meta accordingly.

  • For plugin authors: in the core, the constructor for Connection instances has been deprecated in favor of a new one to support pluggable connection tuples. The ConnTuple struct, used by this deprecated Connection constructor, is now deprecated as well.

  • The zeek::filesystem namespace alias is deprecated in favor of using std::filesystem directly. Similarly, the ghc::filesystem submodule stored in auxil/filesystem has been removed and the files included from it will no longer be installed with Zeek. Builds won't warn about the deprecation of zeek::filesystem due to limitations of how we can mark deprecations in C++.

  • The zeek::util::starts_with and zeek::util::ends_with functions are deprecated. std::string and std::string_view added starts_with and ends_with methods in C++20, and those should be used instead.

  • The record_type_to_vector BIF is deprecated in favor of using the newly ordered record_fields BIF.


Configuration

📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

♻️ Rebasing: Whenever MR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this MR and you won't be reminded about these updates again.


  • If you want to rebase/retry this MR, check this box

This MR has been generated by Renovate Bot.
