Process Telemetry Data Sources
    • 21 May 2024

    The agent draws on multiple datasources on an endpoint to capture every field needed to build a complete process event.

    Datasources for Linux

    The agent consumes raw, unfiltered process data from the system's audit netlink socket. 

    This raw data isn't particularly helpful on its own, as you can see below:

    type=SYSCALL msg=audit(1364481363.243:24287): arch=c000003e syscall=2 success=no exit=-13 a0=7fffd19c5592 a1=0 a2=7fffd19c4b50 a3=a items=1 ppid=2686 pid=3538 auid=500 uid=500 gid=500 euid=500 suid=500 fsuid=500 egid=500 sgid=500 fsgid=500 tty=pts0 ses=1 comm="cat" exe="/bin/cat" subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key="sshd_config"
    type=CWD msg=audit(1364481363.243:24287):  cwd="/home/shadowman"
    type=PATH msg=audit(1364481363.243:24287): item=0 name="/etc/ssh/sshd_config" inode=409248 dev=fd:00 mode=0100600 ouid=0 ogid=0 rdev=00:00 obj=system_u:object_r:etc_t:s0

     Challenges

    • Events can span multiple lines and records, requiring correlation

    • Data is in a space-delimited, key=value-like format rather than a structured one

    • Many fields are raw and would benefit from translation to something “human readable”

    • Many desired fields, including file hashes, are not available from audit
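    The first three challenges can be illustrated with a minimal Python sketch. It is not the agent's implementation: the function names (`parse_record`, `correlate`, `describe_exit`) are hypothetical, and the sample records are abridged from the event shown above. It parses each record's key=value pairs, groups records that share the same `audit(timestamp:serial)` identifier, and translates a raw syscall exit value into an errno name.

```python
import errno
import re
import shlex
from collections import defaultdict

# Matches the record header, e.g. "type=SYSCALL msg=audit(1364481363.243:24287): ..."
HEADER = re.compile(r"type=(\S+) msg=audit\((\d+\.\d+):(\d+)\):\s*(.*)")


def parse_record(line):
    """Split one audit record into (event_id, record_type, fields)."""
    match = HEADER.match(line.strip())
    if match is None:
        raise ValueError(f"not an audit record: {line!r}")
    record_type, timestamp, serial, body = match.groups()
    fields = {}
    # shlex.split honors the double quotes around values such as exe="/bin/cat".
    for token in shlex.split(body):
        key, _, value = token.partition("=")
        fields[key] = value
    return (timestamp, serial), record_type, fields


def correlate(lines):
    """Group records that share a (timestamp, serial) pair into one event."""
    events = defaultdict(dict)
    for line in lines:
        event_id, record_type, fields = parse_record(line)
        # A record type such as PATH can repeat within one event, so keep a list.
        events[event_id].setdefault(record_type, []).append(fields)
    return dict(events)


def describe_exit(raw_exit):
    """Translate a negative syscall exit value into its errno name."""
    code = int(raw_exit)
    if code < 0:
        return errno.errorcode.get(-code, raw_exit)
    return raw_exit


# Abridged copy of the three records from the example event.
SAMPLE = [
    'type=SYSCALL msg=audit(1364481363.243:24287): arch=c000003e syscall=2 '
    'success=no exit=-13 ppid=2686 pid=3538 uid=500 comm="cat" '
    'exe="/bin/cat" key="sshd_config"',
    'type=CWD msg=audit(1364481363.243:24287):  cwd="/home/shadowman"',
    'type=PATH msg=audit(1364481363.243:24287): item=0 '
    'name="/etc/ssh/sshd_config" inode=409248',
]

if __name__ == "__main__":
    for event_id, records in correlate(SAMPLE).items():
        syscall = records["SYSCALL"][0]
        print(event_id, syscall["exe"], describe_exit(syscall["exit"]))
```

    Keying on the full `(timestamp, serial)` pair rather than the serial alone is a design choice in this sketch, made so that records from different events are never merged if a serial number happens to repeat.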

    Red Canary Linux EDR addresses all of these challenges

    • Events are correlated and provided in an easily understandable, consumable JSON format

    • Additional metadata and fields are obtained from other datasources through a plug-and-play architecture we call the Event Model
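    To make the idea of combining datasources concrete, here is a small Python sketch. It is an assumption-laden illustration, not the Event Model itself: the enrichment functions, the `DATASOURCES` list, and `build_event` are hypothetical names. Only the output field names (`host_name`, `user_name`, `process_md5`, `process_sha256`) come from the comparison in this article. Each datasource is a function that adds the fields it knows about to a shared event, and the result is serialized as JSON.

```python
import hashlib
import json
import os
import pwd  # Unix-only standard library module
import socket


def add_host_name(event):
    """Hypothetical datasource: the local hostname."""
    event["host_name"] = socket.gethostname()


def add_user_name(event):
    """Hypothetical datasource: resolve user_uid via the passwd database."""
    uid = event.get("user_uid")
    if uid is None:
        return
    try:
        event["user_name"] = pwd.getpwuid(uid).pw_name
    except KeyError:
        pass  # No passwd entry for this UID; leave the field out.


def add_process_hashes(event):
    """Hypothetical datasource: hash the executable named by process_path."""
    path = event.get("process_path")
    if not path:
        return
    try:
        with open(path, "rb") as handle:
            data = handle.read()
    except OSError:
        return  # Binary is unreadable or already gone; leave the hashes out.
    event["process_md5"] = hashlib.md5(data).hexdigest()
    event["process_sha256"] = hashlib.sha256(data).hexdigest()


# Adding or removing a datasource only means editing this list.
DATASOURCES = [add_host_name, add_user_name, add_process_hashes]


def build_event(base_fields, datasources=DATASOURCES):
    """Start from audit-derived fields and let each datasource enrich them."""
    event = dict(base_fields)
    for enrich in datasources:
        enrich(event)
    return json.dumps(event, sort_keys=True)


if __name__ == "__main__":
    print(build_event({
        "process_pid": os.getpid(),
        "process_path": "/bin/cat",
        "user_uid": os.getuid(),
    }))
```

    Because every datasource shares the same signature, one that fails or is unavailable simply leaves its fields out, and the remaining fields are still emitted.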

    Here’s a comparison of process data collected from Audit (alone) vs. Red Canary’s Linux EDR (utilizing Audit):

    Field                       Audit    Red Canary Linux EDR
    timestamp
    host_name                            ✅
    user_uid                             ✅
    user_name                            ✅
    user_domain                          ✅
    user_username                        ✅
    login_user_uid                       ✅
    login_user_name                      ✅
    login_user_domain                    ✅
    process_md5                          ✅
    process_sha256                       ✅
    process_pid                          ✅
    process_name                         ✅
    process_path                         ✅
    process_command_line                 ✅
    parent_process_timestamp             ✅
    parent_process_pid                   ✅
    parent_process_name                  ✅
    parent_process_path                  ✅
    parent_process_md5                   ✅
    parent_process_sha256                ✅

    What this means to you

    • The depth and breadth of telemetry collected will often exceed that of commercial or open source solutions that rely primarily on audit (e.g., osquery, go-audit, auditbeat, and zeek-agent).

    • We aren’t married to, or held hostage by, any one datasource.

    • If operating system APIs change or new subsystems are added as operating systems evolve, we can swiftly adapt to these changes rather than requiring a complete redesign of the product.

    • Our plug-and-play architecture enables us to construct an event from any number of datasources on an endpoint, giving us flexibility and the ability to deliver the most granular telemetry possible.

