Data Types

Altay Sansal

Feb 12, 2026


Intro

ScalarType

A class representing scalar data types.

HeaderSpec

A class representing a header specification.

HeaderField

A class representing header field spec.

Endianness

Enumeration class with two possible endianness values.

class segy.schema.ScalarType

A class representing scalar data types.

IBM32 = 'ibm32'
INT64 = 'int64'
INT32 = 'int32'
INT16 = 'int16'
INT8 = 'int8'
UINT64 = 'uint64'
UINT32 = 'uint32'
UINT16 = 'uint16'
UINT8 = 'uint8'
FLOAT64 = 'float64'
FLOAT32 = 'float32'
FLOAT16 = 'float16'
STRING8 = 'S8'
property dtype: np.dtype[Any]

Return numpy dtype of the format.
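As a rough illustration of what the dtype property resolves to, most of the string values above are already valid NumPy dtype strings. The sketch below uses plain NumPy only and does not import segy. Note that 'ibm32' is not a native NumPy type, so the library has to handle it separately; how it does so is not stated here.

```python
import numpy as np

# Most ScalarType values can be handed to np.dtype directly.
native = ["int64", "int32", "int16", "int8", "uint8", "float32", "float16", "S8"]
sizes = {name: np.dtype(name).itemsize for name in native}
print(sizes["int32"], sizes["float16"], sizes["S8"])  # 4 2 8

# 'ibm32' is not understood by NumPy itself.
try:
    np.dtype("ibm32")
    ibm_is_native = True
except TypeError:
    ibm_is_native = False
print(ibm_is_native)  # False
```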

pydantic model segy.schema.HeaderSpec

A class representing a header specification.

Examples

Let’s build a header from scratch!

We will define three fields with different names, data-types, and start byte locations.

>>> field1 = HeaderField(
>>>     name="foo",
>>>     format="int32",
>>>     byte=1,
>>> )
>>> field2 = HeaderField(
>>>     name="bar",
>>>     format="int16",
>>>     byte=5,
>>> )
>>> field3 = HeaderField(
>>>     name="fizz",
>>>     format="int32",
>>>     byte=17,
>>> )

Note that the fields span the following byte ranges:

  • field1 between bytes [0, 4)

  • field2 between bytes [4, 6)

  • field3 between bytes [16, 20)

The gap between field2 and field3 will be padded with void. In this case we expect to see an item size of 20-bytes (total length of the header struct).

>>> header = HeaderSpec(
>>>     fields=[field1, field2, field3],
>>> )

Now let’s look at its data type:

>>> header.dtype
dtype({'names': ['foo', 'bar', 'fizz'], 'formats': ['<i4', '<i2', '<i4'], 'offsets': [0, 4, 16], 'itemsize': 20})
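The offsets and item size above follow from simple arithmetic on the one-based byte locations. The following is a minimal sketch of that arithmetic in plain Python; the layout helper and the tuple representation are hypothetical and are not part of segy.

```python
# Hypothetical helper, not part of segy: derive zero-based, half-open byte
# ranges and the minimum struct size from one-based start bytes and widths.
fields = [
    ("foo", 1, 4),    # name, one-based start byte, width in bytes (int32)
    ("bar", 5, 2),    # int16
    ("fizz", 17, 4),  # int32
]


def layout(fields):
    """Return {name: (start, stop)} ranges and the minimum item size."""
    ranges = {name: (byte - 1, byte - 1 + width) for name, byte, width in fields}
    item_size = max(stop for _, stop in ranges.values())
    return ranges, item_size


ranges, item_size = layout(fields)
print(ranges)     # {'foo': (0, 4), 'bar': (4, 6), 'fizz': (16, 20)}
print(item_size)  # 20
```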

If we wanted to pad the end of the struct (to fit a specific byte range), we would provide the item_size in the spec. If we set it to 30, this means that we padded the struct by 10 bytes at the end.

>>> header = HeaderSpec(
>>>     fields=[field1, field2, field3],
>>>     item_size=30,
>>> )

Now let’s look at its data type:

>>> header.dtype
dtype({'names': ['foo', 'bar', 'fizz'], 'formats': ['<i4', '<i2', '<i4'], 'offsets': [0, 4, 16], 'itemsize': 30})

To see what’s going on under the hood, we can look at a lower-level numpy description of the dtype. Here we observe all the gaps (void types).

>>> header.dtype.descr
[('foo', '<i4'), ('bar', '<i2'), ('', '|V10'), ('fizz', '<i4'), ('', '|V10')]
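The same layout can be reproduced with NumPy alone, which may help when checking a spec by hand. This sketch does not import segy; it builds the dtype from the names, formats, offsets, and item size shown above and then inspects the raw bytes of one record.

```python
import numpy as np

# Plain NumPy, no segy import: the same layout as the padded example above.
dtype = np.dtype(
    {
        "names": ["foo", "bar", "fizz"],
        "formats": ["<i4", "<i2", "<i4"],
        "offsets": [0, 4, 16],
        "itemsize": 30,  # 20 bytes of fields and gap, plus 10 bytes of tail padding
    }
)

print(dtype.itemsize)  # 30
print(dtype.descr)     # named fields interleaved with unnamed void gaps

# One zero-filled record; the padding bytes belong to no named field.
record = np.zeros(1, dtype=dtype)
record["fizz"] = 7
raw = record.tobytes()
print(len(raw))                              # 30
print(int.from_bytes(raw[16:20], "little"))  # 7
```

The value written to fizz lands at bytes 16 to 19 of the record, which matches the offset reported by the dtype.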

Show JSON schema
{
   "title": "HeaderSpec",
   "description": "A class representing a header specification.\n\nExamples:\n    Let's build a header from scratch!\n\n    We will define three fields with different names, data-types, and\n    start byte locations.\n\n    >>> field1 = HeaderField(\n    >>>     name=\"foo\",\n    >>>     format=\"int32\",\n    >>>     byte=1,\n    >>> )\n    >>> field2 = HeaderField(\n    >>>     name=\"bar\",\n    >>>     format=\"int16\",\n    >>>     byte=5,\n    >>> )\n    >>> field3 = HeaderField(\n    >>>     name=\"fizz\",\n    >>>     format=\"int32\",\n    >>>     byte=17,\n    >>> )\n\n    Note that the fields span the following byte ranges:\n\n    * `field1` between bytes `[0, 4)`\n    * `field2` between bytes `[4, 6)`\n    * `field3` between bytes `[16, 20)`\n\n    The gap between `field2` and `field3` will be padded with `void`. In\n    this case we expect to see an item size of 20-bytes (total length of\n    the header struct).\n\n    >>> header = HeaderSpec(\n    >>>     fields=[field1, field2, field3],\n    >>> )\n\n    Now let's look at its data type:\n\n    >>> header.dtype\n    dtype({'names': ['foo', 'bar', 'fizz'], 'formats': ['<i4', '<i2', '<i4'], 'offsets': [0, 4, 16], 'itemsize': 20})\n\n    If we wanted to pad the end of the struct (to fit a specific byte range),\n    we would provide the item_size in the spec. If we set it to 30, this means\n    that we padded the struct by 10 bytes at the end.\n\n    >>> header = HeaderSpec(\n    >>>     fields=[field1, field2, field3],\n    >>>     item_size=30,\n    >>> )\n\n    Now let's look at its data type:\n\n    >>> header.dtype\n    dtype({'names': ['foo', 'bar', 'fizz'], 'formats': ['<i4', '<i2', '<i4'], 'offsets': [0, 4, 16], 'itemsize': 30})\n\n    To see what's going under the hood, we can look at a lower level numpy\n    description of the `dtype`. Here we observe all the gaps (void types).\n\n    >>> header.dtype.descr\n    [('foo', '<i4'), ('bar', '<i2'), ('', '|V10'), ('fizz', '<i4'), ('', '|V10')]",
   "type": "object",
   "properties": {
      "fields": {
         "description": "List containing multiple header field spec instances.",
         "items": {
            "$ref": "#/$defs/HeaderField"
         },
         "title": "Fields",
         "type": "array"
      },
      "itemSize": {
         "anyOf": [
            {
               "type": "integer"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "Expected size of the struct.",
         "title": "Itemsize"
      },
      "offset": {
         "anyOf": [
            {
               "minimum": 0,
               "type": "integer"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "Starting byte offset.",
         "title": "Offset"
      },
      "endianness": {
         "anyOf": [
            {
               "$ref": "#/$defs/Endianness"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "Endianness of structured data type."
      }
   },
   "$defs": {
      "Endianness": {
         "description": "Enumeration class with three possible endianness values.\n\nAttributes:\n    BIG: Big endian.\n    LITTLE: Little endian.\n\nExamples:\n    >>> endian = Endianness.BIG\n    >>> print(endian.symbol)\n    >",
         "enum": [
            "big",
            "little"
         ],
         "title": "Endianness",
         "type": "string"
      },
      "HeaderField": {
         "description": "A class representing header field spec.\n\nExamples:\n    A named float starting at byte location 9:\n\n    >>> field = HeaderField(\n    >>>     name=\"my_var\",\n    >>>     format=\"float32\",\n    >>>     byte=9,\n    >>> )\n\n    The name, byte, and offset fields will only be used if the structured\n    field is used within the context of a :class:`HeaderSpec`. Offset is\n    calculated automatically from byte location.\n\n    >>> field.name\n    my_var\n    >>> field.byte\n    9\n    >>> field.offset\n    8\n\n    The `dtype` property is inherited from :class:`DataFormat`.\n\n    >>> field.dtype\n    dtype('float32')",
         "properties": {
            "name": {
               "description": "The short name of the field.",
               "title": "Name",
               "type": "string"
            },
            "byte": {
               "description": "Field's start byte location.",
               "minimum": 1,
               "title": "Byte",
               "type": "integer"
            },
            "format": {
               "$ref": "#/$defs/ScalarType",
               "description": "The data type of the field."
            }
         },
         "required": [
            "name",
            "byte",
            "format"
         ],
         "title": "HeaderField",
         "type": "object"
      },
      "ScalarType": {
         "description": "A class representing scalar data types.",
         "enum": [
            "ibm32",
            "int64",
            "int32",
            "int16",
            "int8",
            "uint64",
            "uint32",
            "uint16",
            "uint8",
            "float64",
            "float32",
            "float16",
            "S8"
         ],
         "title": "ScalarType",
         "type": "string"
      }
   },
   "required": [
      "fields"
   ]
}

field fields: list[HeaderField] [Required]

List containing multiple header field spec instances.

field itemSize: int | None = None

Expected size of the struct.

field offset: int | None = None

Starting byte offset.

Constraints:
  • ge = 0

field endianness: Endianness | None = None

Endianness of structured data type.

property dtype: dtype[Any]

Converts the names, data types, and offsets of the object into a NumPy dtype.

property names: list[str]

Get the names of the fields.

property formats: list[dtype[Any]]

Get the formats of the fields.

property offsets: list[int]

Get the offsets of the fields.

add_field(field, overwrite=False)

Add a field to the structured data type.

Parameters:

field (HeaderField)

overwrite (bool)

Return type:

None

remove_field(name)

Remove a field from the structured data type by name.

Parameters:

name (str)

Return type:

None

customize(fields)

Customizes existing HeaderSpec fields with new header fields, handling overlaps.

It first handles name conflicts. Then it handles byte-range intersections.

This ensures no byte-range intersections between new and existing fields. Assumes new fields do not overlap each other (validated first) and existing fields do not overlap each other.

Parameters:

fields (HeaderField | list[HeaderField]) – List of new header fields.

Return type:

None
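The two-step order described for customize (names first, then byte ranges) can be sketched in plain Python. This is an illustration only, not the library's implementation: the Field class and the overlaps and customize_sketch functions are hypothetical, and the policy of dropping the conflicting existing field is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for HeaderField: name, one-based byte, width."""

    name: str
    byte: int
    width: int

    @property
    def range(self):
        # Right half-open (start, stop) range in one-based byte locations.
        return self.byte, self.byte + self.width


def overlaps(a, b):
    """True if two half-open ranges intersect."""
    return a[0] < b[1] and b[0] < a[1]


def customize_sketch(existing, new):
    """Return a field list in which new fields win every conflict (assumed policy)."""
    new_names = {f.name for f in new}
    # Step 1: name conflicts. Drop existing fields that are being redefined.
    kept = [f for f in existing if f.name not in new_names]
    # Step 2: byte-range intersections. Drop existing fields hit by a new field.
    kept = [f for f in kept if not any(overlaps(f.range, n.range) for n in new)]
    return sorted(kept + list(new), key=lambda f: f.byte)


existing = [Field("foo", 1, 4), Field("bar", 5, 2), Field("fizz", 17, 4)]
new = [Field("bar", 9, 2), Field("buzz", 3, 4)]
result = customize_sketch(existing, new)
print([f.name for f in result])  # ['buzz', 'bar', 'fizz']
```

Here the old bar is replaced because of the name conflict, and foo is dropped because buzz intersects its byte range.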

property itemsize: int

Number of bytes for the data type.

pydantic model segy.schema.HeaderField

A class representing header field spec.

Examples

A named float starting at byte location 9:

>>> field = HeaderField(
>>>     name="my_var",
>>>     format="float32",
>>>     byte=9,
>>> )

The name, byte, and offset fields will only be used if the structured field is used within the context of a HeaderSpec. Offset is calculated automatically from byte location.

>>> field.name
'my_var'
>>> field.byte
9
>>> field.offset
8

The dtype property is inherited from DataFormat.

>>> field.dtype
dtype('float32')

Show JSON schema
{
   "title": "HeaderField",
   "description": "A class representing header field spec.\n\nExamples:\n    A named float starting at byte location 9:\n\n    >>> field = HeaderField(\n    >>>     name=\"my_var\",\n    >>>     format=\"float32\",\n    >>>     byte=9,\n    >>> )\n\n    The name, byte, and offset fields will only be used if the structured\n    field is used within the context of a :class:`HeaderSpec`. Offset is\n    calculated automatically from byte location.\n\n    >>> field.name\n    my_var\n    >>> field.byte\n    9\n    >>> field.offset\n    8\n\n    The `dtype` property is inherited from :class:`DataFormat`.\n\n    >>> field.dtype\n    dtype('float32')",
   "type": "object",
   "properties": {
      "name": {
         "description": "The short name of the field.",
         "title": "Name",
         "type": "string"
      },
      "byte": {
         "description": "Field's start byte location.",
         "minimum": 1,
         "title": "Byte",
         "type": "integer"
      },
      "format": {
         "$ref": "#/$defs/ScalarType",
         "description": "The data type of the field."
      }
   },
   "$defs": {
      "ScalarType": {
         "description": "A class representing scalar data types.",
         "enum": [
            "ibm32",
            "int64",
            "int32",
            "int16",
            "int8",
            "uint64",
            "uint32",
            "uint16",
            "uint8",
            "float64",
            "float32",
            "float16",
            "S8"
         ],
         "title": "ScalarType",
         "type": "string"
      }
   },
   "required": [
      "name",
      "byte",
      "format"
   ]
}

field name: str [Required]

The short name of the field.

field byte: int [Required]

Field’s start byte location.

Constraints:
  • ge = 1

field format: ScalarType [Required]

The data type of the field.

property offset: int

Return zero based offset from one based byte location.

property dtype: dtype[Any]

Converts the data type of the object into a NumPy dtype.

property range: tuple[int, int]

Return the start and stop byte location of the field.

Note: The returned range is Fortran-style (one-based) and right half-open: [start, stop).
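The relationship between byte, offset, and range can be sketched with two small functions. These helpers are hypothetical and not part of segy; the sketch assumes that "Fortran-style" means the range is expressed in one-based byte locations.

```python
def offset_from_byte(byte):
    """Zero-based offset for a one-based start byte (byte must be >= 1)."""
    if byte < 1:
        raise ValueError("byte locations are one-based and must be >= 1")
    return byte - 1


def byte_range(byte, itemsize):
    """One-based, right half-open (start, stop) range (assumed convention)."""
    return byte, byte + itemsize


# A float32 (4 bytes) starting at byte location 9, as in the example above.
print(offset_from_byte(9))  # 8
print(byte_range(9, 4))     # (9, 13)
```

Under this assumption the field occupies byte locations 9 through 12, and the stop value 13 is excluded.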

property itemsize: int

Number of bytes for the data type.

class segy.schema.Endianness

Enumeration class with two possible endianness values.

BIG

Big endian.

LITTLE

Little endian.

Examples

>>> endian = Endianness.BIG
>>> print(endian.symbol)
>

BIG = 'big'
LITTLE = 'little'
property symbol: Literal['<', '>', '=']

Get the numpy symbol for the endianness from mapping.
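A string-valued enum with a lookup table is one way to get this behavior. The sketch below is an illustrative re-creation, not the library's source; the class name EndiannessSketch and the mapping dictionary are assumptions.

```python
from enum import Enum


class EndiannessSketch(str, Enum):
    """Illustrative re-creation of a two-valued endianness enum."""

    BIG = "big"
    LITTLE = "little"

    @property
    def symbol(self):
        # Map each member to the byte-order character NumPy uses in dtype strings.
        return {"big": ">", "little": "<"}[self.value]


print(EndiannessSketch.BIG.symbol)          # >
print(EndiannessSketch("little").symbol)    # <
print(EndiannessSketch.BIG.symbol + "i4")   # >i4
```

Prefixing a type code with the symbol, as in the last line, gives a byte-order-qualified dtype string such as '>i4' for a big-endian 32-bit integer.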