Traces

Altay Sansal

May 07, 2024

Defining a Trace

A TraceDescriptor defines the structure of a seismic trace as stored in a SEG-Y file. It combines a trace header descriptor and a trace data descriptor.

The TraceDescriptor has fields for the trace header, an optional extended trace header, and the trace data. It also has an optional offset field that gives the byte location where the traces begin within a binary file. In most cases this field is populated automatically.
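As a rough illustration of where such an offset comes from: in a standard SEG-Y file the first trace follows the 3200-byte textual header, the 400-byte binary header, and any 3200-byte extended textual headers. The helper below is hypothetical and not part of the library; it only reproduces that arithmetic.

```python
TEXT_HEADER_SIZE = 3200    # bytes, textual file header
BINARY_HEADER_SIZE = 400   # bytes, binary file header


def first_trace_offset(num_extended_text_headers: int = 0) -> int:
    """Return the byte offset of the first trace (hypothetical helper)."""
    if num_extended_text_headers < 0:
        raise ValueError("number of extended text headers must be >= 0")
    return (
        TEXT_HEADER_SIZE
        + BINARY_HEADER_SIZE
        + num_extended_text_headers * TEXT_HEADER_SIZE
    )


print(first_trace_offset())   # 3600
print(first_trace_offset(2))  # 10000
```

With no extended textual headers the result is 3600, which matches the example offset used later in this section.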

A custom trace descriptor can be built programmatically following a simple workflow. The same descriptor can also be built from JSON; see JSON Trace Descriptor below.

Trace Header Descriptor

Trace headers are defined using StructuredDataTypeDescriptor, and each header field is a StructuredFieldDescriptor. The workflow below walks through an example; more examples are in the Data Types documentation.

We first do the required imports and then define the header fields. Endianness is big by default, so we don’t have to declare it.

from segy.schema.data_type import StructuredFieldDescriptor

trace_header_fields = [
    StructuredFieldDescriptor(
        name="inline",
        offset=188,
        format="int32",
    ),
    StructuredFieldDescriptor(
        name="crossline",
        offset=192,
        format="int32",
    ),
]

Then we create a StructuredDataTypeDescriptor for the trace headers. SEG-Y trace headers must be 240 bytes, so we declare that as the item size. This ensures we read and write with the correct padding.

from segy.schema.data_type import StructuredDataTypeDescriptor

trace_header_descriptor = StructuredDataTypeDescriptor(
    fields=trace_header_fields,
    item_size=240,
)
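To see what the offsets and the 240-byte item size mean at the byte level, here is a minimal sketch that uses only Python's standard struct module rather than the library. It assumes the zero-based offsets 188 and 192 from the example above (bytes 189-192 and 193-196 in the one-based numbering of the SEG-Y standard); the values 101 and 205 are made up.

```python
import struct

HEADER_SIZE = 240  # same as item_size in the descriptor above

# Build a blank header and write two big-endian int32 values into it.
header = bytearray(HEADER_SIZE)
struct.pack_into(">i", header, 188, 101)  # inline (made-up value)
struct.pack_into(">i", header, 192, 205)  # crossline (made-up value)

# Read them back from the same byte offsets.
inline = struct.unpack_from(">i", header, 188)[0]
crossline = struct.unpack_from(">i", header, 192)[0]

print(len(header), inline, crossline)  # 240 101 205
```

Every byte outside the two declared fields is padding from the descriptor's point of view, which is why the item size has to be stated explicitly.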

Trace Data Descriptor

Trace data is described using TraceDataDescriptor. It is characterized mainly by its data type (endianness and format) and its number of samples.

Continuing our previous example, we build the data descriptor. We assume that the samples are encoded in ‘ibm32’ format and that they are big endian (again, the default).

from segy.schema.trace import TraceDataDescriptor

trace_data_descriptor = TraceDataDescriptor(
    format="ibm32",
    samples=360,
)
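The ‘ibm32’ format is the 4-byte IBM hexadecimal floating point encoding, which NumPy has no native type for. The sketch below decodes a single sample with the standard library to show the bit layout; ibm32_to_float is an illustrative helper, not a library function, and how the library itself performs the conversion is not covered here.

```python
import struct


def ibm32_to_float(raw: bytes) -> float:
    """Decode one big-endian 4-byte IBM hexadecimal float (illustrative helper)."""
    (word,) = struct.unpack(">I", raw)
    sign = -1.0 if word >> 31 else 1.0
    exponent = (word >> 24) & 0x7F   # 7-bit exponent, base 16, biased by 64
    fraction = word & 0x00FFFFFF     # 24-bit fraction, no hidden bit
    return sign * (fraction / 2**24) * 16.0 ** (exponent - 64)


print(ibm32_to_float(bytes.fromhex("42640000")))  # 100.0
print(ibm32_to_float(bytes.fromhex("C276A000")))  # -118.625
```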

Trace Descriptor

Finally, since we have all the components, we can create a descriptor for a trace.

from segy.schema.trace import TraceDescriptor

trace_descriptor = TraceDescriptor(
    header_descriptor=trace_header_descriptor,
    data_descriptor=trace_data_descriptor,
    offset=3600,  # just an example of a possible offset
)

If we look at the NumPy data type of the trace, we can see how it will be decoded from raw bytes:

>>> trace_descriptor.dtype
dtype([('header', {'names': ['inline', 'crossline'], 'formats': ['>i4', '>i4'], 'offsets': [188, 192], 'itemsize': 240}), ('data', '>u4', (360,))])
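From this dtype we can work out the size of one trace and where each trace starts in the file. The sketch below assumes fixed-length traces, no extended trace headers, and the example offset of 3600; trace_start is a hypothetical helper, not part of the library.

```python
HEADER_SIZE = 240          # item size of the header struct
NUM_SAMPLES = 360          # samples per trace
SAMPLE_SIZE = 4            # '>u4' is four bytes
FIRST_TRACE_OFFSET = 3600  # example offset from the descriptor

TRACE_SIZE = HEADER_SIZE + NUM_SAMPLES * SAMPLE_SIZE


def trace_start(index: int) -> int:
    """Byte offset of the trace at a zero-based index (hypothetical helper)."""
    return FIRST_TRACE_OFFSET + index * TRACE_SIZE


print(TRACE_SIZE)       # 1680
print(trace_start(0))   # 3600
print(trace_start(10))  # 20400
```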

JSON Trace Descriptor

We can define the exact same trace descriptor using JSON. The JSON can be supplied as a string or read from a file. Let’s write the JSON.

{
  "headerDescriptor": {
    "fields": [
      {
        "format": "int32",
        "name": "inline",
        "offset": 188
      },
      {
        "format": "int32",
        "name": "crossline",
        "offset": 192
      }
    ],
    "itemSize": 240
  },
  "dataDescriptor": {
    "format": "ibm32",
    "samples": 360
  },
  "offset": 3600
}

If the JSON is stored as a string in the variable json_str, we can generate the same descriptor with all fields validated. Any error in the JSON raises a validation error.

>>> trace_descriptor_from_json = TraceDescriptor.model_validate_json(json_str)
>>> trace_descriptor_from_json == trace_descriptor
True
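Note that the JSON keys are camelCase (headerDescriptor, itemSize) while the Python keyword arguments are snake_case (header_descriptor, item_size). The sketch below only illustrates that naming convention; to_camel is a hypothetical helper, and how the library implements the aliasing internally is an assumption.

```python
def to_camel(snake: str) -> str:
    """Convert a snake_case name to camelCase (illustrative helper)."""
    first, *rest = snake.split("_")
    return first + "".join(word.capitalize() for word in rest)


for name in ("header_descriptor", "data_descriptor", "item_size", "offset"):
    print(f"{name} -> {to_camel(name)}")
```

This is useful to keep in mind when translating a programmatic descriptor into JSON by hand.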

Reference

pydantic model segy.schema.trace.TraceDescriptor

A descriptor class for a Trace (Header + Data).

JSON schema:
{
   "title": "TraceDescriptor",
   "description": "A descriptor class for a Trace (Header + Data).",
   "type": "object",
   "properties": {
      "description": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "Description of the field.",
         "title": "Description"
      },
      "headerDescriptor": {
         "allOf": [
            {
               "$ref": "#/$defs/StructuredDataTypeDescriptor"
            }
         ],
         "description": "Trace header descriptor."
      },
      "extendedHeaderDescriptor": {
         "anyOf": [
            {
               "$ref": "#/$defs/StructuredDataTypeDescriptor"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "Extended trace header descriptor."
      },
      "sampleDescriptor": {
         "allOf": [
            {
               "$ref": "#/$defs/TraceSampleDescriptor"
            }
         ],
         "description": "Trace data descriptor."
      },
      "offset": {
         "anyOf": [
            {
               "type": "integer"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "Starting offset of the trace.",
         "title": "Offset"
      },
      "endianness": {
         "anyOf": [
            {
               "$ref": "#/$defs/Endianness"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "Endianness of traces and headers."
      }
   },
   "$defs": {
      "Endianness": {
         "description": "Enumeration class with three possible endianness values.\n\nExamples:\n    >>> endian = Endianness.BIG\n    >>> print(endian.symbol)\n    >",
         "enum": [
            "big",
            "little",
            "native"
         ],
         "title": "Endianness",
         "type": "string"
      },
      "ScalarType": {
         "description": "A class representing scalar data types.",
         "enum": [
            "ibm32",
            "int64",
            "int32",
            "int16",
            "int8",
            "uint64",
            "uint32",
            "uint16",
            "uint8",
            "float64",
            "float32",
            "float16"
         ],
         "title": "ScalarType",
         "type": "string"
      },
      "StructuredDataTypeDescriptor": {
         "description": "A class representing a descriptor for a structured data-type.\n\nExamples:\n    Let's build a structured data type from scratch!\n\n    We will define three fields with different names, data-types, and\n    starting offsets.\n\n    >>> field1 = StructuredFieldDescriptor(\n    >>>     name=\"foo\",\n    >>>     format=\"int32\",\n    >>>     offset=0,\n    >>> )\n    >>> field2 = StructuredFieldDescriptor(\n    >>>     name=\"bar\",\n    >>>     format=\"int16\",\n    >>>     offset=4,\n    >>> )\n    >>> field3 = StructuredFieldDescriptor(\n    >>>     name=\"fizz\",\n    >>>     format=\"int32\",\n    >>>     offset=16,\n    >>> )\n\n    Note that the fields span the following byte ranges:\n\n    * `field1` between bytes `[0, 4)`\n    * `field2` between bytes `[4, 6)`\n    * `field3` between bytes `[16, 20)`\n\n    The gap between `field2` and `field3` will be padded with `void`. In\n    this case we expect to see an item size of 20-bytes (total length of\n    the struct).\n\n    >>> struct_dtype = StructuredDataTypeDescriptor(\n    >>>     fields=[field1, field2, field3],\n    >>> )\n\n    Now let's look at its data type:\n\n    >>> struct_dtype.dtype\n    dtype({'names': ['foo', 'bar', 'fizz'], 'formats': ['<i4', '<i2', '<i4'], 'offsets': [0, 4, 16], 'itemsize': 20})\n\n    If we wanted to pad the end of the struct (to fit a specific byte range),\n    we would provide the item_size in the descriptor. If we set it to 30,\n    this means that we padded the struct by 10 bytes at the end.\n\n    >>> struct_dtype = StructuredDataTypeDescriptor(\n    >>>     fields=[field1, field2, field3],\n    >>>     item_size=30,\n    >>> )\n\n    Now let's look at its data type:\n\n    >>> struct_dtype.dtype\n    dtype({'names': ['foo', 'bar', 'fizz'], 'formats': ['<i4', '<i2', '<i4'], 'offsets': [0, 4, 16], 'itemsize': 30})\n\n    To see what's going under the hood, we can look at a lower level numpy\n    description of the `dtype`. Here we observe all the gaps (void types).\n\n    >>> struct_dtype.dtype.descr\n    [('foo', '<i4'), ('bar', '<i2'), ('', '|V10'), ('fizz', '<i4'), ('', '|V10')]",
         "properties": {
            "description": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Description of the field.",
               "title": "Description"
            },
            "fields": {
               "description": "A list of descriptors for a structured data-type.",
               "items": {
                  "$ref": "#/$defs/StructuredFieldDescriptor"
               },
               "title": "Fields",
               "type": "array"
            },
            "itemSize": {
               "anyOf": [
                  {
                     "type": "integer"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Expected size of the struct.",
               "title": "Itemsize"
            },
            "offset": {
               "anyOf": [
                  {
                     "minimum": 0,
                     "type": "integer"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Starting byte offset.",
               "title": "Offset"
            },
            "endianness": {
               "anyOf": [
                  {
                     "$ref": "#/$defs/Endianness"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Endianness of structured data type."
            }
         },
         "required": [
            "fields"
         ],
         "title": "StructuredDataTypeDescriptor",
         "type": "object"
      },
      "StructuredFieldDescriptor": {
         "description": "A class representing a descriptor for a structured data-type field.\n\nExamples:\n    A named float at offset 8-bytes:\n\n    >>> data_type = StructuredFieldDescriptor(\n    >>>     name=\"my_var\",\n    >>>     format=\"float32\",\n    >>>     offset=8,\n    >>> )\n\n    The name and offset fields will only be used if the structured\n    field is used within the context of a :class:`StructuredDataTypeDescriptor`.\n\n    >>> data_type.name\n    my_var\n    >>> data_type.offset\n    8\n\n    The `dtype` property is inherited from :class:`DataTypeDescriptor`.\n\n    >>> data_type.dtype\n    dtype('float32')",
         "properties": {
            "description": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Description of the field.",
               "title": "Description"
            },
            "format": {
               "allOf": [
                  {
                     "$ref": "#/$defs/ScalarType"
                  }
               ],
               "description": "The data type of the field."
            },
            "name": {
               "description": "The short name of the field.",
               "title": "Name",
               "type": "string"
            },
            "offset": {
               "description": "Starting byte offset.",
               "minimum": 0,
               "title": "Offset",
               "type": "integer"
            }
         },
         "required": [
            "format",
            "name",
            "offset"
         ],
         "title": "StructuredFieldDescriptor",
         "type": "object"
      },
      "TraceSampleDescriptor": {
         "description": "A descriptor class for a Trace Samples.",
         "properties": {
            "description": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Description of the field.",
               "title": "Description"
            },
            "format": {
               "allOf": [
                  {
                     "$ref": "#/$defs/ScalarType"
                  }
               ],
               "description": "Format of trace samples."
            },
            "samples": {
               "anyOf": [
                  {
                     "type": "integer"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Number of samples in trace. It can be variable, then it must be read from each trace header.",
               "title": "Samples"
            }
         },
         "required": [
            "format"
         ],
         "title": "TraceSampleDescriptor",
         "type": "object"
      }
   },
   "required": [
      "headerDescriptor",
      "sampleDescriptor"
   ]
}

field headerDescriptor: StructuredDataTypeDescriptor [Required]

Trace header descriptor.

field extendedHeaderDescriptor: StructuredDataTypeDescriptor | None = None

Extended trace header descriptor.

field sampleDescriptor: TraceSampleDescriptor [Required]

Trace data descriptor.

field offset: int | None = None

Starting offset of the trace.

field endianness: Endianness | None = None

Endianness of traces and headers.
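As a rough picture of how the three endianness values relate to the byte-order characters seen in the dtype listings, here is a stand-in enum built with the standard library. The member values come from the JSON schema, and the schema's docstring shows BIG mapping to ">"; the "<" and "=" symbols follow the usual NumPy convention and are an assumption about the library, as is the implementation itself.

```python
from enum import Enum


class Endianness(str, Enum):
    """Stand-in for the library's enum; values taken from the JSON schema."""

    BIG = "big"
    LITTLE = "little"
    NATIVE = "native"

    @property
    def symbol(self) -> str:
        """Byte-order character in the NumPy/struct convention."""
        return {"big": ">", "little": "<", "native": "="}[self.value]


print(Endianness.BIG.symbol)         # >
print(Endianness("little").symbol)   # <
```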

property dtype: dtype[Any]

Get numpy dtype.