Data Types

Altay Sansal

May 07, 2024

Intro

ScalarType

A class representing scalar data types.

StructuredDataTypeDescriptor

A class representing a descriptor for a structured data-type.

StructuredFieldDescriptor

A class representing a descriptor for a structured data-type field.

Endianness

Enumeration class with three possible endianness values.

class segy.schema.data_type.ScalarType

A class representing scalar data types.

IBM32 = 'ibm32'
INT64 = 'int64'
INT32 = 'int32'
INT16 = 'int16'
INT8 = 'int8'
UINT64 = 'uint64'
UINT32 = 'uint32'
UINT16 = 'uint16'
UINT8 = 'uint8'
FLOAT64 = 'float64'
FLOAT32 = 'float32'
FLOAT16 = 'float16'

property char: str

Returns the NumPy character code for the data type.
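As a rough illustration of what a NumPy character code is, the sketch below looks up `dtype.char` for a few of the format strings listed above using plain NumPy. It does not call `ScalarType` itself, and `ibm32` is left out because NumPy has no built-in IBM floating point type; how the library handles that case is not shown here.

```python
import numpy as np

# Format strings taken from the ScalarType values above (ibm32 is omitted
# because NumPy has no built-in IBM floating point type).
formats = ["int16", "uint8", "float32", "float64"]

# np.dtype(...).char is NumPy's single-character type code.
codes = {fmt: np.dtype(fmt).char for fmt in formats}
print(codes)
```

The 32-bit and 64-bit integer codes are deliberately not listed, since they can differ between platforms.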

pydantic model segy.schema.data_type.StructuredDataTypeDescriptor

A class representing a descriptor for a structured data-type.

Examples

Let’s build a structured data type from scratch!

We will define three fields with different names, data-types, and starting offsets.

>>> field1 = StructuredFieldDescriptor(
...     name="foo",
...     format="int32",
...     offset=0,
... )
>>> field2 = StructuredFieldDescriptor(
...     name="bar",
...     format="int16",
...     offset=4,
... )
>>> field3 = StructuredFieldDescriptor(
...     name="fizz",
...     format="int32",
...     offset=16,
... )

Note that the fields span the following byte ranges:

  • field1 between bytes [0, 4)

  • field2 between bytes [4, 6)

  • field3 between bytes [16, 20)

The gap between field2 and field3 (bytes [6, 16)) will be padded with void. In this case we expect an item size of 20 bytes (the total length of the struct).

>>> struct_dtype = StructuredDataTypeDescriptor(
...     fields=[field1, field2, field3],
... )

Now let’s look at its data type:

>>> struct_dtype.dtype
dtype({'names': ['foo', 'bar', 'fizz'], 'formats': ['<i4', '<i2', '<i4'], 'offsets': [0, 4, 16], 'itemsize': 20})
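For comparison, the same layout can be written directly as a NumPy dictionary-style dtype. This is a sketch in plain NumPy, not the descriptor's own implementation; it only shows that NumPy derives the 20-byte item size from the field offsets.

```python
import numpy as np

# Same names, formats, and offsets as field1..field3 in the example.
dtype = np.dtype(
    {
        "names": ["foo", "bar", "fizz"],
        "formats": ["<i4", "<i2", "<i4"],
        "offsets": [0, 4, 16],
    }
)

# With no explicit itemsize, NumPy uses the end of the last field.
print(dtype.itemsize)

# dtype.fields maps each field name to a (dtype, byte offset) tuple.
offsets = {name: dtype.fields[name][1] for name in dtype.names}
print(offsets)
```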

To pad the end of the struct (to fit a specific byte range), provide item_size in the descriptor. Setting it to 30 pads the struct with 10 bytes at the end.

>>> struct_dtype = StructuredDataTypeDescriptor(
...     fields=[field1, field2, field3],
...     item_size=30,
... )

Now let’s look at its data type:

>>> struct_dtype.dtype
dtype({'names': ['foo', 'bar', 'fizz'], 'formats': ['<i4', '<i2', '<i4'], 'offsets': [0, 4, 16], 'itemsize': 30})
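One way to see why the padded item size matters is to decode fixed-length records from raw bytes. The sketch below builds the 30-byte dtype in plain NumPy and reads two records packed with the standard `struct` module. The helper `make_record` and the sample values are made up for illustration.

```python
import struct

import numpy as np

record_dtype = np.dtype(
    {
        "names": ["foo", "bar", "fizz"],
        "formats": ["<i4", "<i2", "<i4"],
        "offsets": [0, 4, 16],
        "itemsize": 30,
    }
)


def make_record(foo: int, bar: int, fizz: int) -> bytes:
    """Pack one 30-byte record; 'x' bytes stand in for the padded gaps."""
    # "<" = little-endian without alignment, i = int32, h = int16,
    # 10x = ten padding bytes.
    return struct.pack("<ih10xi10x", foo, bar, fizz)


buffer = make_record(1, 2, 3) + make_record(-4, 5, 6)

# Each 30-byte slice of the buffer becomes one structured record.
records = np.frombuffer(buffer, dtype=record_dtype)

print(len(buffer), records.shape)
print(records["foo"], records["bar"], records["fizz"])
```

Without the trailing 10 bytes in the item size, the second record would be read from the wrong position in the buffer.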

To see what’s going on under the hood, we can look at a lower-level NumPy description of the dtype. Here we observe all the gaps (void types).

>>> struct_dtype.dtype.descr
[('foo', '<i4'), ('bar', '<i2'), ('', '|V10'), ('fizz', '<i4'), ('', '|V10')]
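The void entries follow from simple arithmetic on the offsets. The pure-Python sketch below reproduces that bookkeeping; `describe_layout` is a hypothetical helper written for this illustration, not part of the library or of NumPy.

```python
def describe_layout(fields, item_size=None):
    """Return a descr-style list, inserting void entries for unused bytes.

    `fields` is a list of (name, format, size_in_bytes, offset) tuples.
    """
    descr = []
    cursor = 0  # first byte not yet accounted for
    for name, fmt, size, offset in sorted(fields, key=lambda f: f[3]):
        if offset < cursor:
            raise ValueError(f"field {name!r} overlaps the previous field")
        if offset > cursor:
            descr.append(("", f"|V{offset - cursor}"))  # gap before field
        descr.append((name, fmt))
        cursor = offset + size
    if item_size is not None:
        if item_size < cursor:
            raise ValueError("item_size is smaller than the fields")
        if item_size > cursor:
            descr.append(("", f"|V{item_size - cursor}"))  # trailing pad
    return descr


fields = [("foo", "<i4", 4, 0), ("bar", "<i2", 2, 4), ("fizz", "<i4", 4, 16)]
layout = describe_layout(fields, item_size=30)
print(layout)
```

The first `|V10` covers bytes [6, 16) between bar and fizz; the second covers bytes [20, 30) added by the item size of 30.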

JSON schema:
{
   "title": "StructuredDataTypeDescriptor",
   "description": "A class representing a descriptor for a structured data-type.\n\nExamples:\n    Let's build a structured data type from scratch!\n\n    We will define three fields with different names, data-types, and\n    starting offsets.\n\n    >>> field1 = StructuredFieldDescriptor(\n    >>>     name=\"foo\",\n    >>>     format=\"int32\",\n    >>>     offset=0,\n    >>> )\n    >>> field2 = StructuredFieldDescriptor(\n    >>>     name=\"bar\",\n    >>>     format=\"int16\",\n    >>>     offset=4,\n    >>> )\n    >>> field3 = StructuredFieldDescriptor(\n    >>>     name=\"fizz\",\n    >>>     format=\"int32\",\n    >>>     offset=16,\n    >>> )\n\n    Note that the fields span the following byte ranges:\n\n    * `field1` between bytes `[0, 4)`\n    * `field2` between bytes `[4, 6)`\n    * `field3` between bytes `[16, 20)`\n\n    The gap between `field2` and `field3` will be padded with `void`. In\n    this case we expect to see an item size of 20-bytes (total length of\n    the struct).\n\n    >>> struct_dtype = StructuredDataTypeDescriptor(\n    >>>     fields=[field1, field2, field3],\n    >>> )\n\n    Now let's look at its data type:\n\n    >>> struct_dtype.dtype\n    dtype({'names': ['foo', 'bar', 'fizz'], 'formats': ['<i4', '<i2', '<i4'], 'offsets': [0, 4, 16], 'itemsize': 20})\n\n    If we wanted to pad the end of the struct (to fit a specific byte range),\n    we would provide the item_size in the descriptor. If we set it to 30,\n    this means that we padded the struct by 10 bytes at the end.\n\n    >>> struct_dtype = StructuredDataTypeDescriptor(\n    >>>     fields=[field1, field2, field3],\n    >>>     item_size=30,\n    >>> )\n\n    Now let's look at its data type:\n\n    >>> struct_dtype.dtype\n    dtype({'names': ['foo', 'bar', 'fizz'], 'formats': ['<i4', '<i2', '<i4'], 'offsets': [0, 4, 16], 'itemsize': 30})\n\n    To see what's going under the hood, we can look at a lower level numpy\n    description of the `dtype`. Here we observe all the gaps (void types).\n\n    >>> struct_dtype.dtype.descr\n    [('foo', '<i4'), ('bar', '<i2'), ('', '|V10'), ('fizz', '<i4'), ('', '|V10')]",
   "type": "object",
   "properties": {
      "description": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "Description of the field.",
         "title": "Description"
      },
      "fields": {
         "description": "A list of descriptors for a structured data-type.",
         "items": {
            "$ref": "#/$defs/StructuredFieldDescriptor"
         },
         "title": "Fields",
         "type": "array"
      },
      "itemSize": {
         "anyOf": [
            {
               "type": "integer"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "Expected size of the struct.",
         "title": "Itemsize"
      },
      "offset": {
         "anyOf": [
            {
               "minimum": 0,
               "type": "integer"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "Starting byte offset.",
         "title": "Offset"
      },
      "endianness": {
         "anyOf": [
            {
               "$ref": "#/$defs/Endianness"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "Endianness of structured data type."
      }
   },
   "$defs": {
      "Endianness": {
         "description": "Enumeration class with three possible endianness values.\n\nExamples:\n    >>> endian = Endianness.BIG\n    >>> print(endian.symbol)\n    >",
         "enum": [
            "big",
            "little",
            "native"
         ],
         "title": "Endianness",
         "type": "string"
      },
      "ScalarType": {
         "description": "A class representing scalar data types.",
         "enum": [
            "ibm32",
            "int64",
            "int32",
            "int16",
            "int8",
            "uint64",
            "uint32",
            "uint16",
            "uint8",
            "float64",
            "float32",
            "float16"
         ],
         "title": "ScalarType",
         "type": "string"
      },
      "StructuredFieldDescriptor": {
         "description": "A class representing a descriptor for a structured data-type field.\n\nExamples:\n    A named float at offset 8-bytes:\n\n    >>> data_type = StructuredFieldDescriptor(\n    >>>     name=\"my_var\",\n    >>>     format=\"float32\",\n    >>>     offset=8,\n    >>> )\n\n    The name and offset fields will only be used if the structured\n    field is used within the context of a :class:`StructuredDataTypeDescriptor`.\n\n    >>> data_type.name\n    my_var\n    >>> data_type.offset\n    8\n\n    The `dtype` property is inherited from :class:`DataTypeDescriptor`.\n\n    >>> data_type.dtype\n    dtype('float32')",
         "properties": {
            "description": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Description of the field.",
               "title": "Description"
            },
            "format": {
               "allOf": [
                  {
                     "$ref": "#/$defs/ScalarType"
                  }
               ],
               "description": "The data type of the field."
            },
            "name": {
               "description": "The short name of the field.",
               "title": "Name",
               "type": "string"
            },
            "offset": {
               "description": "Starting byte offset.",
               "minimum": 0,
               "title": "Offset",
               "type": "integer"
            }
         },
         "required": [
            "format",
            "name",
            "offset"
         ],
         "title": "StructuredFieldDescriptor",
         "type": "object"
      }
   },
   "required": [
      "fields"
   ]
}

field fields: list[StructuredFieldDescriptor] [Required]

A list of descriptors for a structured data-type.

field itemSize: int | None = None

Expected size of the struct.

field offset: int | None = None

Starting byte offset.

Constraints:
  • ge = 0

field endianness: Endianness | None = None

Endianness of structured data type.

property dtype: dtype[Any]

Converts the names, data types, and offsets of the object into a NumPy dtype.

property itemsize: int

Number of bytes for the data type.

field description: str | None = None

Description of the field.

pydantic model segy.schema.data_type.StructuredFieldDescriptor

A class representing a descriptor for a structured data-type field.

Examples

A named 32-bit float at a byte offset of 8:

>>> data_type = StructuredFieldDescriptor(
...     name="my_var",
...     format="float32",
...     offset=8,
... )

The name and offset fields are only used when the structured field is part of a StructuredDataTypeDescriptor.

>>> data_type.name
'my_var'
>>> data_type.offset
8

The dtype property is inherited from DataTypeDescriptor.

>>> data_type.dtype
dtype('float32')
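The dtype above carries no explicit byte order. As a sketch of how a byte order combines with a scalar type in plain NumPy (how the descriptor applies endianness internally is not shown in this page and is not assumed here), `dtype.newbyteorder` returns a copy with the requested order:

```python
import numpy as np

base = np.dtype("float32")

# newbyteorder returns a copy of the dtype with the requested byte order.
big = base.newbyteorder(">")
little = base.newbyteorder("<")

# dtype.str spells out the byte order, type kind, and size in bytes.
print(big.str, little.str, big.itemsize)
```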

JSON schema:
{
   "title": "StructuredFieldDescriptor",
   "description": "A class representing a descriptor for a structured data-type field.\n\nExamples:\n    A named float at offset 8-bytes:\n\n    >>> data_type = StructuredFieldDescriptor(\n    >>>     name=\"my_var\",\n    >>>     format=\"float32\",\n    >>>     offset=8,\n    >>> )\n\n    The name and offset fields will only be used if the structured\n    field is used within the context of a :class:`StructuredDataTypeDescriptor`.\n\n    >>> data_type.name\n    my_var\n    >>> data_type.offset\n    8\n\n    The `dtype` property is inherited from :class:`DataTypeDescriptor`.\n\n    >>> data_type.dtype\n    dtype('float32')",
   "type": "object",
   "properties": {
      "description": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "Description of the field.",
         "title": "Description"
      },
      "format": {
         "allOf": [
            {
               "$ref": "#/$defs/ScalarType"
            }
         ],
         "description": "The data type of the field."
      },
      "name": {
         "description": "The short name of the field.",
         "title": "Name",
         "type": "string"
      },
      "offset": {
         "description": "Starting byte offset.",
         "minimum": 0,
         "title": "Offset",
         "type": "integer"
      }
   },
   "$defs": {
      "ScalarType": {
         "description": "A class representing scalar data types.",
         "enum": [
            "ibm32",
            "int64",
            "int32",
            "int16",
            "int8",
            "uint64",
            "uint32",
            "uint16",
            "uint8",
            "float64",
            "float32",
            "float16"
         ],
         "title": "ScalarType",
         "type": "string"
      }
   },
   "required": [
      "format",
      "name",
      "offset"
   ]
}
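To make the constraints in this schema concrete, the sketch below checks a JSON payload against the required keys, the `ScalarType` values, and the `offset` minimum listed above. `check_field` is a hypothetical stand-alone helper using only the standard library; it is not how the pydantic model validates input.

```python
import json

# Constraints copied from the schema above.
REQUIRED = ("format", "name", "offset")
SCALAR_TYPES = {
    "ibm32", "int64", "int32", "int16", "int8", "uint64",
    "uint32", "uint16", "uint8", "float64", "float32", "float16",
}


def check_field(payload: dict) -> list:
    """Return a list of problems; an empty list means the checks passed."""
    problems = [
        f"missing required key: {key}" for key in REQUIRED if key not in payload
    ]
    if "format" in payload and payload["format"] not in SCALAR_TYPES:
        problems.append(f"unknown format: {payload['format']}")
    if "offset" in payload:
        offset = payload["offset"]
        if not isinstance(offset, int) or offset < 0:
            problems.append("offset must be an integer >= 0")
    return problems


good = json.loads('{"name": "my_var", "format": "float32", "offset": 8}')
bad = json.loads('{"name": "my_var", "format": "float128", "offset": -1}')

print(check_field(good))
print(check_field(bad))
```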

field name: str [Required]

The short name of the field.

field offset: int [Required]

Starting byte offset.

Constraints:
  • ge = 0

property dtype: dtype[Any]

Converts the byte order and data type of the object into a NumPy dtype.

property itemsize: int

Number of bytes for the data type.

field format: ScalarType [Required]

The data type of the field.

field description: str | None = None

Description of the field.

class segy.schema.data_type.Endianness

Enumeration class with three possible endianness values.

Examples

>>> endian = Endianness.BIG
>>> print(endian.symbol)
>

BIG = 'big'
LITTLE = 'little'
NATIVE = 'native'

property symbol: Literal['<', '>', '=']

Get the NumPy byte-order symbol for the endianness.
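The mapping from value to symbol can be sketched with a small stand-alone enum. `ByteOrder` below is a hypothetical stand-in written for illustration, not the library's `Endianness` class; it assumes the usual NumPy byte-order characters, which match the `'<'`, `'>'`, `'='` literals in the property signature.

```python
from enum import Enum


class ByteOrder(Enum):
    """Hypothetical stand-in for Endianness, for illustration only."""

    BIG = "big"
    LITTLE = "little"
    NATIVE = "native"

    @property
    def symbol(self) -> str:
        # NumPy byte-order characters: ">" big, "<" little, "=" native.
        return {"big": ">", "little": "<", "native": "="}[self.value]


print(ByteOrder.BIG.symbol)
print(ByteOrder("little").symbol)
```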