Binary input data format

Binary Parser Plugin

Use the binary input data format with user-specified configurations to parse binary protocols into Telegraf metrics.

Configuration

[[inputs.file]]
  files = ["example.bin"]

  ## Data format to consume.
  ## Each data format has its own unique set of configuration options, read
  ## more about them here:
  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
  data_format = "binary"

  ## Do not error-out if none of the filter expressions below matches.
  # allow_no_match = false

  ## Specify the endianness of the data.
  ## Available values are "be" (big-endian), "le" (little-endian) and "host",
  ## where "host" means the same endianness as the machine running Telegraf.
  # endianness = "host"

  ## Interpret input as string containing hex-encoded data.
  # hex_encoding = false

  ## Multiple parsing sections are allowed
  [[inputs.file.binary]]
    ## Optional: Metric (measurement) name to use if not extracted from the data.
    # metric_name = "my_name"

    ## Definition of the message format and the extracted data.
    ## Please note that you need to define all elements of the data in the
    ## correct order with the correct length as the data is parsed in the order
    ## given.
    ## An entry can have the following properties:
    ##  name        --  Name of the element (e.g. field or tag). Can be omitted
    ##                  for special assignments (i.e. time & measurement) or if
    ##                  entry is omitted.
    ##  type        --  Data-type of the entry. Can be "int8/16/32/64", "uint8/16/32/64",
    ##                  "float32/64", "bool" and "string".
    ##                  In case of time, this can be any of "unix" (default), "unix_ms", "unix_us",
    ##                  "unix_ns" or a valid Golang time format.
    ##  bits        --  Length in bits for this entry. If omitted, the length derived from
    ##                  the "type" property will be used. For "time" 64-bit will be used
    ##                  as default.
    ##  assignment  --  Assignment of the gathered data. Can be "measurement", "time",
    ##                  "field" or "tag". If omitted "field" is assumed.
    ##  omit        --  Omit the given data. If true, the data is skipped and not added
    ##                  to the metric. Omitted entries only need a length definition
    ##                  via "bits" or "type".
    ##  terminator  --  Terminator for dynamic-length strings. Only used for "string" type.
    ##                  Valid values are "fixed" (fixed length string given by "bits"),
    ##                  "null" (null-terminated string) or a character sequence specified
    ##                  as HEX values (e.g. "0x0D0A"). Defaults to "fixed" for strings.
    ##  timezone    --  Timezone of "time" entries. Only applies to "time" assignments.
    ##                  Can be "utc", "local" or any valid Golang timezone (e.g. "Europe/Berlin")
    entries = [
      { type = "string", assignment = "measurement", terminator = "null" },
      { name = "address", type = "uint16", assignment = "tag" },
      { name = "value",   type = "float64" },
      { type = "unix", assignment = "time" },
    ]

    ## Optional: Filter evaluated before applying the configuration.
    ## This option can be used to manage multiple configurations, each specific to
    ## a certain message type. If no filter is given, the configuration is applied.
    # [inputs.file.binary.filter]
    #   ## Filter message by the exact length in bytes (default: N/A).
    #   # length = 0
    #   ## Filter the message by a minimum length in bytes.
    #   ## Messages of equal or greater length will pass.
    #   # length_min = 0
    #   ## List of data parts to match.
    #   ## Only if all selected parts match, the configuration will be
    #   ## applied. The "offset" is the start of the data to match in bits,
    #   ## "bits" is the length in bits and "match" is the value to match
    #   ## against. Non-byte boundaries are supported, data is always right-aligned.
    #   selection = [
    #     { offset = 0, bits = 8, match = "0x1F" },
    #   ]
    #
    #

In this configuration mode, you explicitly specify the fields and tags to parse from your data.

A configuration can contain multiple binary subsections so that, for example, the file plugin processes the same binary data multiple times. Together with filters, this is useful for handling different message types.

Note: The filter section needs to be placed after the entries definitions, otherwise the entries will be assigned to the filter section.

General options and remarks

`allow_no_match` (optional)

By specifying allow_no_match you allow the parser to silently ignore data that does not match any given configuration filter. This can be useful if you only want to collect a subset of the available messages.

`endianness` (optional)

This specifies the endianness of the data. If not specified, the parser will fall back to the “host” endianness, assuming that the message and Telegraf machine share the same endianness. Alternatively, you can explicitly specify big-endian format ("be") or little-endian format ("le").
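As a minimal illustration of what this setting changes (plain Python, not Telegraf code; the helper name `decode_uint16` is hypothetical), the same two bytes decode to different integers depending on the byte order:

```python
import sys

# Map the parser's option values to Python's byte-order names.
# "host" follows the machine running the code, as described above.
BYTE_ORDER = {"be": "big", "le": "little", "host": sys.byteorder}


def decode_uint16(raw: bytes, endianness: str = "host") -> int:
    """Decode two bytes as an unsigned 16-bit integer (hypothetical helper)."""
    return int.from_bytes(raw[:2], BYTE_ORDER[endianness])


print(decode_uint16(b"\x7f\x01", "le"))  # 383
print(decode_uint16(b"\x7f\x01", "be"))  # 32513
```

With "host", the result depends on the machine, which is why an explicit "be" or "le" is the safer choice when the data source and the Telegraf machine may differ.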

`hex_encoding` (optional)

If true, the input data is interpreted as a string containing hex-encoded data such as C0 C7 21 A9. The value is case-insensitive and may contain spaces; however, prefixes such as 0x or x are not allowed.
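The rules above happen to match Python's built-in hex decoding, so a rough sketch of the conversion can look like this. The helper name `decode_hex_input` is hypothetical, and this illustrates the accepted input shape rather than Telegraf's own implementation:

```python
def decode_hex_input(text: str) -> bytes:
    """Turn a hex-encoded string into raw bytes (hypothetical helper)."""
    # bytes.fromhex accepts upper- and lower-case digits and skips
    # whitespace between byte pairs. A prefix such as "0x" is rejected
    # with ValueError, matching the restriction described above.
    return bytes.fromhex(text.strip())


raw = decode_hex_input("C0 C7 21 A9")
print(len(raw))  # 4
```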

Non-byte aligned value extraction

In both filter and entries definitions, values can be extracted at non-byte boundaries. For example, you can extract 3 bits starting at bit offset 8. In those cases, the result is masked and shifted so that the resulting byte value is right-aligned. If your 3 bits are 101, the resulting byte value is 0x05.

This is especially important when specifying the match value in the filter section.
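The mask-and-shift step can be sketched in a few lines of Python. This is an illustration under an assumption, not the parser's code: bit offsets are counted from the most significant bit of the first byte, and `extract_bits` is a hypothetical helper name.

```python
def extract_bits(data: bytes, offset: int, bits: int) -> int:
    """Return `bits` bits starting at bit `offset`, right-aligned.

    Assumes bits are counted from the most significant bit of the first byte.
    """
    total = len(data) * 8
    if offset < 0 or bits <= 0 or offset + bits > total:
        raise ValueError("requested range lies outside the data")
    value = int.from_bytes(data, "big")         # whole buffer as one integer
    shifted = value >> (total - offset - bits)  # drop the bits after the range
    return shifted & ((1 << bits) - 1)          # mask off the bits before it


# Three bits "101" at bit offset 8, surrounded by unrelated set bits.
print(hex(extract_bits(bytes([0xFF, 0b10111111]), offset=8, bits=3)))  # 0x5
```

Because the extracted value is right-aligned, a filter for these three bits would compare against 0x05, not against 0xA0.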

Entries definitions

The entries array specifies how to parse the message into the measurement name, timestamp, tags, and fields.

`measurement` specification

When setting the assignment to "measurement", the extracted value is used as the metric name, overriding other specifications. The type setting is assumed to be "string" and can be omitted, as can the name option. See string type handling for details and further options.

`time` specification

When setting the assignment to "time", the extracted value is used as the timestamp of the metric. If no time entry is defined, all created metrics use the current time.

The type setting specifies the time-format of included timestamps. Use one of the following:

unix (default)
unix_ms
unix_us
unix_ns
Go “reference time”. Consult the Go time package for details and additional examples on how to set the time format.

For the unix format and its derivatives, the underlying value is assumed to be a 64-bit integer. Use the bits setting to specify a different length. All other time formats assume a fixed-length string value to be extracted; the length of the string is determined automatically from the format given in type.

The timezone setting specifies the timezone of the extracted time. By default, the time is interpreted as utc. Other valid values are local (the local timezone configured for the machine) or any valid timezone specification (for example, Europe/Berlin).
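To make the difference between the unix-style types concrete, the following sketch converts an extracted integer into a UTC timestamp. It covers only the four unix variants, not Go time formats or the timezone setting; the names `TICKS_PER_SECOND` and `to_timestamp` are hypothetical and the code is not taken from Telegraf.

```python
from datetime import datetime, timedelta, timezone

# Number of ticks per second for each unix-style time type.
TICKS_PER_SECOND = {"unix": 1, "unix_ms": 10**3, "unix_us": 10**6, "unix_ns": 10**9}
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_timestamp(raw: int, time_type: str = "unix") -> datetime:
    """Convert an extracted integer into a UTC datetime (hypothetical helper)."""
    # Integer arithmetic avoids float rounding. Sub-microsecond digits are
    # dropped because Python datetimes have microsecond resolution.
    microseconds = raw * 10**6 // TICKS_PER_SECOND[time_type]
    return EPOCH + timedelta(microseconds=microseconds)


print(to_timestamp(1658835984).isoformat())  # 2022-07-26T11:46:24+00:00
```

Choosing the wrong type shifts the result by factors of 1000, so a value in milliseconds read as "unix" lands far in the future.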

`tag` specification

When setting the assignment to "tag", the extracted value is used as a tag. The name setting is the name of the tag and the type defaults to string. When specifying other types, the extracted value is first interpreted as the given type and then converted to string.

The bits setting can be used to specify the length of the data to extract and is required for fixed-length string types.

`field` specification

When setting the assignment to "field" or omitting the assignment setting, the extracted value is used as a field. The name setting is used as the name of the field and the type as the type of the field value.

The bits setting can be used to specify the length of the data to extract. By default the length corresponding to type is used. Please see the string and bool specific sections when using those types.

`string` type handling

Strings are assumed to be fixed-length strings by default. In this case, the bits setting is mandatory to specify the length of the string in bits.

To handle dynamic strings, the terminator setting can be used to specify characters to terminate the string. The two named options, fixed and null, specify fixed-length and null-terminated strings, respectively. Any other setting is interpreted as a hexadecimal sequence of bytes matching the end of the string. The termination sequence is removed from the result.
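The three terminator modes can be sketched as follows. This is an illustration only: `read_string` is a hypothetical helper, and it assumes UTF-8 text and that the terminator bytes are consumed along with the string.

```python
def read_string(data: bytes, terminator: str = "fixed", bits: int = 0):
    """Return (text, bytes_consumed) for one string entry (hypothetical helper)."""
    if terminator == "fixed":
        length = bits // 8  # "bits" gives the string length in bits
        return data[:length].decode("utf-8"), length
    if terminator == "null":
        end = b"\x00"
    else:
        # Any other value is a hex byte sequence such as "0x0D0A".
        digits = terminator[2:] if terminator[:2].lower() == "0x" else terminator
        end = bytes.fromhex(digits)
    pos = data.index(end)  # raises ValueError if the terminator is missing
    # The terminator is consumed but is not part of the returned text.
    return data[:pos].decode("utf-8"), pos + len(end)


print(read_string(b"cpu\x00\x01\x02", "null"))    # ('cpu', 4)
print(read_string(b"abc\r\nrest", "0x0D0A"))      # ('abc', 5)
print(read_string(b"abcdef", "fixed", bits=24))   # ('abc', 3)
```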

`bool` type handling

By default, bool types are assumed to be one bit in length. You can specify any other length by using the bits setting. When interpreting values as booleans, any zero value is false and any non-zero value is true.

Omitting data

Parts of the data can be omitted by setting omit = true. In this case, you only need to specify the length of the chunk to omit by either using the type or bits setting. All other options can be skipped.

Filter definitions

Filters can be used to match the length or the content of the data against a specified reference. See the examples section for details. You can also check multiple parts of the message by specifying multiple selection entries for a filter. Each selection is then matched separately, and all of them have to match for the configuration to be applied.

`length` and `length_min` options

Using the length option, the filter checks if the parsed data has exactly the given number of bytes. Otherwise, the configuration is not applied. Similarly, for length_min the data has to have at least the given number of bytes to generate a match.

`selection` list

Selections can be used with or without length constraints to match the content of the data. Here, the offset and bits properties specify the start and length of the data to check. Both values are in bits, allowing for non-byte-aligned value extraction. The extracted data is checked against the given match value specified in HEX.

If multiple selection entries are specified, all of the selections must match for the configuration to be applied.
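The filter logic described in this section can be sketched as below. The function names are hypothetical and the code is not Telegraf's implementation; it assumes bit offsets count from the most significant bit of the first byte and compares the right-aligned value numerically against the hex match string.

```python
def extract_bits(data: bytes, offset: int, bits: int) -> int:
    """Return `bits` bits starting at bit `offset`, right-aligned."""
    shift = len(data) * 8 - offset - bits
    return (int.from_bytes(data, "big") >> shift) & ((1 << bits) - 1)


def filter_matches(data: bytes, selection=(), length=None, length_min=None) -> bool:
    """Check length constraints and every selection entry (hypothetical helper)."""
    if length is not None and len(data) != length:
        return False  # exact length required
    if length_min is not None and len(data) < length_min:
        return False  # message too short
    for part in selection:
        if part["offset"] + part["bits"] > len(data) * 8:
            return False  # selection lies outside the message
        if extract_bits(data, part["offset"], part["bits"]) != int(part["match"], 16):
            return False  # all selections must match
    return True


# Header with the message type 0x0B in the third byte, plus a 4-byte body.
message = bytes.fromhex("02010B04DEADC0DE")
print(filter_matches(message, [{"offset": 16, "bits": 8, "match": "0x0B"}]))  # True
print(filter_matches(message, [{"offset": 16, "bits": 8, "match": "0x0A"}]))  # False
```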

Examples

In the following example, we use a binary protocol with three different messages in little-endian format:

Message A definition

+--------+------+------+--------+--------+------------+--------------------+--------------------+
| ID     | type | len  | addr   | count  | failure    | value              | timestamp          |
+--------+------+------+--------+--------+------------+--------------------+--------------------+
| 0x0201 | 0x0A | 0x18 | 0x7F01 | 0x2A00 | 0x00000000 | 0x6F1283C0CA210940 | 0x10D4DF6200000000 |
+--------+------+------+--------+--------+------------+--------------------+--------------------+

Message B definition

+--------+------+------+------------+
| ID     | type | len  | value      |
+--------+------+------+------------+
| 0x0201 | 0x0B | 0x04 | 0xDEADC0DE |
+--------+------+------+------------+

Message C definition

+--------+------+------+------------+------------+--------------------+
| ID     | type | len  | value x    | value y    | timestamp          |
+--------+------+------+------------+------------+--------------------+
| 0x0201 | 0x0C | 0x10 | 0x4DF82D40 | 0x5F305C08 | 0x10D4DF6200000000 |
+--------+------+------+------------+------------+--------------------+

All messages consist of a 4-byte header containing the message type in the 3rd byte and a message-specific body. To parse those messages, you can use the following configuration:

[[inputs.file]]
  files = ["messageA.bin", "messageB.bin", "messageC.bin"]
  data_format = "binary"
  endianness = "le"

  [[inputs.file.binary]]
    metric_name = "messageA"

    entries = [
      { bits = 32, omit = true },
      { name = "address", type = "uint16", assignment = "tag" },
      { name = "count",   type = "int16" },
      { name = "failure", type = "bool", bits = 32, assignment = "tag" },
      { name = "value",   type = "float64" },
      { type = "unix",    assignment = "time" },
    ]

    [inputs.file.binary.filter]
      selection = [{ offset = 16, bits = 8, match = "0x0A" }]

  [[inputs.file.binary]]
    metric_name = "messageB"

    entries = [
      { bits = 32, omit = true },
      { name = "value",   type = "uint32" },
    ]

    [inputs.file.binary.filter]
      selection = [{ offset = 16, bits = 8, match = "0x0B" }]

  [[inputs.file.binary]]
    metric_name = "messageC"

    entries = [
      { bits = 32, omit = true },
      { name = "x",   type = "float32" },
      { name = "y",   type = "float32" },
      { type = "unix",    assignment = "time" },
    ]

    [inputs.file.binary.filter]
      selection = [{ offset = 16, bits = 8, match = "0x0C" }]

The above configuration has one [[inputs.file.binary]] section per message type and uses a filter in each of those sections to apply the correct configuration by comparing the 3rd byte (containing the message type). This results in the following output:

messageA,address=383,failure=false count=42i,value=3.1415 1658835984000000000
messageB value=3737169374i 1658847037000000000
messageC x=2.718280076980591,y=0.0000000000000000000000000000000006626070178575745 1658835984000000000

messageB uses the parsing time as its timestamp because the data contains no time information. The other two metrics use the timestamp derived from the data.
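As a cross-check of the values above, the bytes of Message A can be decoded by hand with Python's struct module. This is only a verification sketch of the little-endian layout, not how Telegraf parses the data; the variable names are chosen to mirror the entries.

```python
import struct

# Message A exactly as listed in its definition table (28 bytes on the wire).
message_a = bytes.fromhex(
    "0201 0A 18"          # header: ID, type, len
    "7F01 2A00 00000000"  # addr, count, failure
    "6F1283C0CA210940"    # value
    "10D4DF6200000000"    # timestamp
)

# "<" selects little-endian without padding; "4x" skips the 4-byte header,
# then uint16, int16, uint32, float64 and uint64 follow in entry order.
address, count, failure, value, timestamp = struct.unpack("<4xHhIdQ", message_a)

# Any non-zero value of the 32-bit "failure" entry would be true.
print(address, count, bool(failure), value, timestamp)
# 383 42 False 3.1415 1658835984
```

The timestamp is in seconds, which is why the output line shows it multiplied to nanoseconds.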
