SEG-Y File¶

Altay Sansal

May 07, 2024

SEG-Y Descriptor: A Conceptual Overview¶

The SegyDescriptor is a model that defines the structure and content of a SEG-Y file. SEG-Y is a standard file format used in the geophysical industry for recording digital seismic data. In essence, the model serves as a blueprint for what a SEG-Y file should look like.

This class and its components provide a well-defined yet flexible way to work with SEG-Y seismic data files programmatically, from defining the file structure and read/write operations to customizing the layout for specialised use cases.

Conceptually, a SEG-Y Revision 0 file looks like this on disk:

┌──────────────┐  ┌─────────────┐  ┌────────────────────┐        ┌────────────────────┐
│ Textual File │  │ Binary File │  │       Trace 1      │        │       Trace N      │
│ Header 3200B │─►│ Header 400B │─►│ Header 240B + Data │─ ... ─►│ Header 240B + Data │
└──────────────┘  └─────────────┘  └────────────────────┘        └────────────────────┘
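
The layout above implies simple arithmetic for locating a trace. The following is a minimal sketch, not part of the library: the function name `trace_offset` is hypothetical, and it assumes fixed-length traces and 4-byte samples (variable-length traces would need a scan through the trace headers instead).

```python
# Block sizes taken from the Revision 0 diagram above.
TEXT_HEADER_BYTES = 3200
BINARY_HEADER_BYTES = 400
TRACE_HEADER_BYTES = 240


def trace_offset(index: int, samples_per_trace: int, bytes_per_sample: int = 4) -> int:
    """Return the byte offset at which the trace with 0-based `index` starts.

    Assumes every trace has the same number of samples (hypothetical helper).
    """
    if index < 0:
        raise ValueError("index must be non-negative")
    trace_bytes = TRACE_HEADER_BYTES + samples_per_trace * bytes_per_sample
    return TEXT_HEADER_BYTES + BINARY_HEADER_BYTES + index * trace_bytes


first = trace_offset(0, samples_per_trace=1000)
second = trace_offset(1, samples_per_trace=1000)
print(first, second)  # 3600 7840
```

The first trace always starts right after the two file headers (3200 + 400 bytes); each later trace is shifted by one full trace length (header plus samples).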

Key Components¶

This descriptor model consists of several important components. Each of these components represents a particular section of a SEG-Y file.

SEG-Y Standard¶

This attribute, segy_standard, corresponds to the specific SEG-Y standard that is being used. SEG-Y files can be of different revisions or standards, including custom ones.

It must be set to one of the allowed SegyStandard values.

Text File Header¶

The text_file_header stores the information required to parse the textual file header of the SEG-Y file. This includes important metadata that pertains to the seismic data in human-readable format.
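
To make the rows, columns, and encoding idea concrete, here is a small standard-library sketch that does not use the library's API. It assumes the conventional 40 rows by 80 columns layout (3200 bytes in total) and uses Python's `cp037` codec as a stand-in for EBCDIC; the helper names are hypothetical.

```python
ROWS, COLS = 40, 80  # 40 * 80 = 3200 bytes, matching the textual file header size


def encode_text_header(lines: list[str], encoding: str = "cp037") -> bytes:
    """Pad or truncate lines to a ROWS x COLS block and encode it (hypothetical helper)."""
    padded = [line[:COLS].ljust(COLS) for line in lines[:ROWS]]
    padded += [" " * COLS] * (ROWS - len(padded))
    return "".join(padded).encode(encoding)


def decode_text_header(raw: bytes, encoding: str = "cp037") -> list[str]:
    """Decode a textual header and split it into ROWS strings of COLS characters."""
    if len(raw) != ROWS * COLS:
        raise ValueError(f"expected {ROWS * COLS} bytes, got {len(raw)}")
    text = raw.decode(encoding)
    return [text[i * COLS:(i + 1) * COLS] for i in range(ROWS)]


raw = encode_text_header(["C 1 CLIENT: EXAMPLE", "C 2 LINE: DEMO"])
rows = decode_text_header(raw)
print(len(raw), rows[0].rstrip())
```

Passing `encoding="ascii"` to both helpers handles the ASCII variant in the same way.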

Binary File Header¶

The binary_file_header item describes the binary file header of the SEG-Y file. This header holds structured information about the data in the file, stored in binary format so that machines can read and process it efficiently.

Binary headers are defined as StructuredDataTypeDescriptors and are built by specifying header fields in the StructuredFieldDescriptor format.
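
The idea of describing a header as a list of named fields with a format and a byte offset can be illustrated with the standard library alone. This sketch does not use StructuredDataTypeDescriptor; the field names and the `unpack_fields` helper are hypothetical, and the offsets assume the conventional positions of these three values inside the 400-byte binary header.

```python
import struct

# (name, struct format code, byte offset within the 400-byte binary header).
# "h" is a 2-byte signed integer; names here are illustrative only.
FIELDS = [
    ("sample_interval", "h", 16),
    ("samples_per_trace", "h", 20),
    ("data_sample_format", "h", 24),
]


def unpack_fields(buffer: bytes, fields, byte_order: str = ">") -> dict:
    """Read each field from `buffer` at its offset; ">" means big-endian."""
    return {
        name: struct.unpack_from(byte_order + fmt, buffer, offset)[0]
        for name, fmt, offset in fields
    }


# Build a synthetic header so the example is self-contained.
header = bytearray(400)
struct.pack_into(">h", header, 16, 4000)
struct.pack_into(">h", header, 20, 1500)
struct.pack_into(">h", header, 24, 1)

values = unpack_fields(bytes(header), FIELDS)
print(values)
```

Bytes between the listed fields are simply skipped, which mirrors how gaps in a structured data type are treated as padding.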

Extended Text Header¶

The extended_text_header is an optional attribute that provides space for extra information that does not fit in the regular text file header. It can be used for additional human-readable metadata about the data.

Note

Extended text headers were added in SEG-Y Revision 1.0.

Trace¶

The trace component is a descriptor for both the trace headers and the associated data. Trace headers contain specific information about each individual seismic trace in the dataset, and the trace data contains the actual numerical seismic data.

The Customize Method¶

The customize method lets users tailor an existing SEG-Y descriptor to their specific requirements. It is an optional tool for updating the various parts of the descriptor, including the text header, binary header, extended text header, trace header, and trace data. Note that the SEG-Y standard is always set to custom when using this method.
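
The "override some parts, then mark the result as custom" pattern can be sketched with a toy model. Everything below is hypothetical and is not the library's implementation: `ToyDescriptor`, its fields, the `"custom"` marker string, and the choice to return a copy rather than modify in place are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ToyDescriptor:
    """A deliberately tiny stand-in for a file descriptor (hypothetical)."""

    segy_standard: str
    text_encoding: str
    sample_format: str

    def customize(
        self,
        text_encoding: Optional[str] = None,
        sample_format: Optional[str] = None,
    ) -> "ToyDescriptor":
        """Return a copy with the given parts replaced and the standard set to custom."""
        updates = {"segy_standard": "custom"}
        if text_encoding is not None:
            updates["text_encoding"] = text_encoding
        if sample_format is not None:
            updates["sample_format"] = sample_format
        return replace(self, **updates)


base = ToyDescriptor(segy_standard="rev1", text_encoding="ebcdic", sample_format="ibm32")
custom = base.customize(sample_format="float32")
print(custom)
```

Parts that are not passed keep their original values, while the standard is marked custom even if nothing else changes.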

Reference¶

pydantic model segy.schema.segy.SegyDescriptor¶

A descriptor class for a SEG-Y file.

Show JSON schema

{
   "title": "SegyDescriptor",
   "description": "A descriptor class for a SEG-Y file.",
   "type": "object",
   "properties": {
      "segyStandard": {
         "anyOf": [
            {
               "$ref": "#/$defs/SegyStandard"
            },
            {
               "type": "null"
            }
         ],
         "description": "SEG-Y Revision / Standard. Can also be custom."
      },
      "textFileHeader": {
         "allOf": [
            {
               "$ref": "#/$defs/TextHeaderDescriptor"
            }
         ],
         "description": "Textual file header descriptor."
      },
      "binaryFileHeader": {
         "allOf": [
            {
               "$ref": "#/$defs/StructuredDataTypeDescriptor"
            }
         ],
         "description": "Binary file header descriptor."
      },
      "extendedTextHeader": {
         "anyOf": [
            {
               "$ref": "#/$defs/TextHeaderDescriptor"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "Extended textual header descriptor."
      },
      "trace": {
         "allOf": [
            {
               "$ref": "#/$defs/TraceDescriptor"
            }
         ],
         "description": "Trace header + data descriptor."
      },
      "endianness": {
         "anyOf": [
            {
               "$ref": "#/$defs/Endianness"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "Endianness of SEG-Y file."
      }
   },
   "$defs": {
      "Endianness": {
         "description": "Enumeration class with three possible endianness values.\n\nExamples:\n    >>> endian = Endianness.BIG\n    >>> print(endian.symbol)\n    >",
         "enum": [
            "big",
            "little",
            "native"
         ],
         "title": "Endianness",
         "type": "string"
      },
      "ScalarType": {
         "description": "A class representing scalar data types.",
         "enum": [
            "ibm32",
            "int64",
            "int32",
            "int16",
            "int8",
            "uint64",
            "uint32",
            "uint16",
            "uint8",
            "float64",
            "float32",
            "float16"
         ],
         "title": "ScalarType",
         "type": "string"
      },
      "SegyStandard": {
         "description": "Allowed values for SEG-Y standards in SegyDescriptor.",
         "enum": [
            0.0,
            1.0,
            2.0,
            2.1
         ],
         "title": "SegyStandard",
         "type": "numeric"
      },
      "StructuredDataTypeDescriptor": {
         "description": "A class representing a descriptor for a structured data-type.\n\nExamples:\n    Let's build a structured data type from scratch!\n\n    We will define three fields with different names, data-types, and\n    starting offsets.\n\n    >>> field1 = StructuredFieldDescriptor(\n    >>>     name=\"foo\",\n    >>>     format=\"int32\",\n    >>>     offset=0,\n    >>> )\n    >>> field2 = StructuredFieldDescriptor(\n    >>>     name=\"bar\",\n    >>>     format=\"int16\",\n    >>>     offset=4,\n    >>> )\n    >>> field3 = StructuredFieldDescriptor(\n    >>>     name=\"fizz\",\n    >>>     format=\"int32\",\n    >>>     offset=16,\n    >>> )\n\n    Note that the fields span the following byte ranges:\n\n    * `field1` between bytes `[0, 4)`\n    * `field2` between bytes `[4, 6)`\n    * `field3` between bytes `[16, 20)`\n\n    The gap between `field2` and `field3` will be padded with `void`. In\n    this case we expect to see an item size of 20-bytes (total length of\n    the struct).\n\n    >>> struct_dtype = StructuredDataTypeDescriptor(\n    >>>     fields=[field1, field2, field3],\n    >>> )\n\n    Now let's look at its data type:\n\n    >>> struct_dtype.dtype\n    dtype({'names': ['foo', 'bar', 'fizz'], 'formats': ['<i4', '<i2', '<i4'], 'offsets': [0, 4, 16], 'itemsize': 20})\n\n    If we wanted to pad the end of the struct (to fit a specific byte range),\n    we would provide the item_size in the descriptor. If we set it to 30,\n    this means that we padded the struct by 10 bytes at the end.\n\n    >>> struct_dtype = StructuredDataTypeDescriptor(\n    >>>     fields=[field1, field2, field3],\n    >>>     item_size=30,\n    >>> )\n\n    Now let's look at its data type:\n\n    >>> struct_dtype.dtype\n    dtype({'names': ['foo', 'bar', 'fizz'], 'formats': ['<i4', '<i2', '<i4'], 'offsets': [0, 4, 16], 'itemsize': 30})\n\n    To see what's going under the hood, we can look at a lower level numpy\n    description of the `dtype`. Here we observe all the gaps (void types).\n\n    >>> struct_dtype.dtype.descr\n    [('foo', '<i4'), ('bar', '<i2'), ('', '|V10'), ('fizz', '<i4'), ('', '|V10')]",
         "properties": {
            "description": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Description of the field.",
               "title": "Description"
            },
            "fields": {
               "description": "A list of descriptors for a structured data-type.",
               "items": {
                  "$ref": "#/$defs/StructuredFieldDescriptor"
               },
               "title": "Fields",
               "type": "array"
            },
            "itemSize": {
               "anyOf": [
                  {
                     "type": "integer"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Expected size of the struct.",
               "title": "Itemsize"
            },
            "offset": {
               "anyOf": [
                  {
                     "minimum": 0,
                     "type": "integer"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Starting byte offset.",
               "title": "Offset"
            },
            "endianness": {
               "anyOf": [
                  {
                     "$ref": "#/$defs/Endianness"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Endianness of structured data type."
            }
         },
         "required": [
            "fields"
         ],
         "title": "StructuredDataTypeDescriptor",
         "type": "object"
      },
      "StructuredFieldDescriptor": {
         "description": "A class representing a descriptor for a structured data-type field.\n\nExamples:\n    A named float at offset 8-bytes:\n\n    >>> data_type = StructuredFieldDescriptor(\n    >>>     name=\"my_var\",\n    >>>     format=\"float32\",\n    >>>     offset=8,\n    >>> )\n\n    The name and offset fields will only be used if the structured\n    field is used within the context of a :class:`StructuredDataTypeDescriptor`.\n\n    >>> data_type.name\n    my_var\n    >>> data_type.offset\n    8\n\n    The `dtype` property is inherited from :class:`DataTypeDescriptor`.\n\n    >>> data_type.dtype\n    dtype('float32')",
         "properties": {
            "description": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Description of the field.",
               "title": "Description"
            },
            "format": {
               "allOf": [
                  {
                     "$ref": "#/$defs/ScalarType"
                  }
               ],
               "description": "The data type of the field."
            },
            "name": {
               "description": "The short name of the field.",
               "title": "Name",
               "type": "string"
            },
            "offset": {
               "description": "Starting byte offset.",
               "minimum": 0,
               "title": "Offset",
               "type": "integer"
            }
         },
         "required": [
            "format",
            "name",
            "offset"
         ],
         "title": "StructuredFieldDescriptor",
         "type": "object"
      },
      "TextHeaderDescriptor": {
         "description": "A descriptor class for SEG-Y textual headers.",
         "properties": {
            "description": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Description of the field.",
               "title": "Description"
            },
            "rows": {
               "description": "Number of rows in text header.",
               "title": "Rows",
               "type": "integer"
            },
            "cols": {
               "description": "Number of columns in text header.",
               "title": "Cols",
               "type": "integer"
            },
            "encoding": {
               "allOf": [
                  {
                     "$ref": "#/$defs/TextHeaderEncoding"
                  }
               ],
               "description": "String encoding."
            },
            "format": {
               "allOf": [
                  {
                     "$ref": "#/$defs/ScalarType"
                  }
               ],
               "description": "Type of string."
            },
            "offset": {
               "anyOf": [
                  {
                     "minimum": 0,
                     "type": "integer"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Starting byte offset.",
               "title": "Offset"
            }
         },
         "required": [
            "rows",
            "cols",
            "encoding",
            "format"
         ],
         "title": "TextHeaderDescriptor",
         "type": "object"
      },
      "TextHeaderEncoding": {
         "description": "Supported textual header encodings.",
         "enum": [
            "ascii",
            "ebcdic"
         ],
         "title": "TextHeaderEncoding",
         "type": "string"
      },
      "TraceDescriptor": {
         "description": "A descriptor class for a Trace (Header + Data).",
         "properties": {
            "description": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Description of the field.",
               "title": "Description"
            },
            "headerDescriptor": {
               "allOf": [
                  {
                     "$ref": "#/$defs/StructuredDataTypeDescriptor"
                  }
               ],
               "description": "Trace header descriptor."
            },
            "extendedHeaderDescriptor": {
               "anyOf": [
                  {
                     "$ref": "#/$defs/StructuredDataTypeDescriptor"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Extended trace header descriptor."
            },
            "sampleDescriptor": {
               "allOf": [
                  {
                     "$ref": "#/$defs/TraceSampleDescriptor"
                  }
               ],
               "description": "Trace data descriptor."
            },
            "offset": {
               "anyOf": [
                  {
                     "type": "integer"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Starting offset of the trace.",
               "title": "Offset"
            },
            "endianness": {
               "anyOf": [
                  {
                     "$ref": "#/$defs/Endianness"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Endianness of traces and headers."
            }
         },
         "required": [
            "headerDescriptor",
            "sampleDescriptor"
         ],
         "title": "TraceDescriptor",
         "type": "object"
      },
      "TraceSampleDescriptor": {
         "description": "A descriptor class for a Trace Samples.",
         "properties": {
            "description": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Description of the field.",
               "title": "Description"
            },
            "format": {
               "allOf": [
                  {
                     "$ref": "#/$defs/ScalarType"
                  }
               ],
               "description": "Format of trace samples."
            },
            "samples": {
               "anyOf": [
                  {
                     "type": "integer"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "Number of samples in trace. It can be variable, then it must be read from each trace header.",
               "title": "Samples"
            }
         },
         "required": [
            "format"
         ],
         "title": "TraceSampleDescriptor",
         "type": "object"
      }
   },
   "required": [
      "segyStandard",
      "textFileHeader",
      "binaryFileHeader",
      "trace"
   ]
}

field segyStandard: SegyStandard | None [Required]¶: SEG-Y Revision / Standard. Can also be custom.

field textFileHeader: TextHeaderDescriptor [Required]¶: Textual file header descriptor.

field binaryFileHeader: StructuredDataTypeDescriptor [Required]¶: Binary file header descriptor.

field extendedTextHeader: TextHeaderDescriptor | None = None¶: Extended textual header descriptor.

field trace: TraceDescriptor [Required]¶: Trace header + data descriptor.

field endianness: Endianness | None = None¶: Endianness of SEG-Y file.

customize(text_header_spec=None, binary_header_fields=None, extended_text_spec=None, trace_header_fields=None, trace_data_spec=None)¶

Customize an existing SEG-Y descriptor.

Parameters:

text_header_spec (TextHeaderDescriptor | None) – New text header specification.
binary_header_fields (list[StructuredFieldDescriptor] | None) – List of custom binary header fields.
extended_text_spec (TextHeaderDescriptor | None) – New extended text header specification.
trace_header_fields (list[StructuredFieldDescriptor] | None) – List of custom trace header fields.
trace_data_spec (TraceSampleDescriptor | None) – New trace data specification.
self (SegyDescriptor)

Returns:

A modified SEG-Y descriptor with “custom” segy standard.

Return type:

SegyDescriptor

pydantic model segy.schema.header.TextHeaderDescriptor¶

A descriptor class for SEG-Y textual headers.

Show JSON schema

{
   "title": "TextHeaderDescriptor",
   "description": "A descriptor class for SEG-Y textual headers.",
   "type": "object",
   "properties": {
      "description": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "Description of the field.",
         "title": "Description"
      },
      "rows": {
         "description": "Number of rows in text header.",
         "title": "Rows",
         "type": "integer"
      },
      "cols": {
         "description": "Number of columns in text header.",
         "title": "Cols",
         "type": "integer"
      },
      "encoding": {
         "allOf": [
            {
               "$ref": "#/$defs/TextHeaderEncoding"
            }
         ],
         "description": "String encoding."
      },
      "format": {
         "allOf": [
            {
               "$ref": "#/$defs/ScalarType"
            }
         ],
         "description": "Type of string."
      },
      "offset": {
         "anyOf": [
            {
               "minimum": 0,
               "type": "integer"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "Starting byte offset.",
         "title": "Offset"
      }
   },
   "$defs": {
      "ScalarType": {
         "description": "A class representing scalar data types.",
         "enum": [
            "ibm32",
            "int64",
            "int32",
            "int16",
            "int8",
            "uint64",
            "uint32",
            "uint16",
            "uint8",
            "float64",
            "float32",
            "float16"
         ],
         "title": "ScalarType",
         "type": "string"
      },
      "TextHeaderEncoding": {
         "description": "Supported textual header encodings.",
         "enum": [
            "ascii",
            "ebcdic"
         ],
         "title": "TextHeaderEncoding",
         "type": "string"
      }
   },
   "required": [
      "rows",
      "cols",
      "encoding",
      "format"
   ]
}

field rows: int [Required]¶: Number of rows in text header.

field cols: int [Required]¶: Number of columns in text header.

field encoding: TextHeaderEncoding [Required]¶: String encoding.

field format: ScalarType [Required]¶: Type of string.

field offset: int | None = None¶

Starting byte offset.

Constraints:

ge = 0

property dtype: dtype[Any]¶: Get numpy dtype.

property itemsize: int¶: Number of bytes for the data type.

field description: str | None = None¶: Description of the field.

class segy.schema.segy.SegyStandard¶

Allowed values for SEG-Y standards in SegyDescriptor.

REV0 = 0.0¶

REV1 = 1.0¶

REV2 = 2.0¶

REV21 = 2.1¶

pydantic model segy.schema.segy.SegyInfo¶

Concise and useful information about SEG-Y files.

Show JSON schema

{
   "title": "SegyInfo",
   "description": "Concise and useful information about SEG-Y files.",
   "type": "object",
   "properties": {
      "uri": {
         "description": "URI of the SEG-Y file.",
         "title": "Uri",
         "type": "string"
      },
      "segyStandard": {
         "anyOf": [
            {
               "$ref": "#/$defs/SegyStandard"
            },
            {
               "type": "null"
            }
         ],
         "description": "SEG-Y Revision / Standard. Can also be custom."
      },
      "numTraces": {
         "description": "Number of traces.",
         "title": "Numtraces",
         "type": "integer"
      },
      "samplesPerTrace": {
         "description": "Trace length in number of samples.",
         "title": "Samplespertrace",
         "type": "integer"
      },
      "sampleInterval": {
         "anyOf": [
            {
               "type": "integer"
            },
            {
               "type": "number"
            }
         ],
         "description": "Sampling rate from binary header.",
         "title": "Sampleinterval"
      },
      "fileSize": {
         "description": "File size in bytes.",
         "title": "Filesize",
         "type": "integer"
      }
   },
   "$defs": {
      "SegyStandard": {
         "description": "Allowed values for SEG-Y standards in SegyDescriptor.",
         "enum": [
            0.0,
            1.0,
            2.0,
            2.1
         ],
         "title": "SegyStandard",
         "type": "numeric"
      }
   },
   "required": [
      "uri",
      "segyStandard",
      "numTraces",
      "samplesPerTrace",
      "sampleInterval",
      "fileSize"
   ]
}

field uri: str [Required]¶: URI of the SEG-Y file.

field segyStandard: SegyStandard | None [Required]¶: SEG-Y Revision / Standard. Can also be custom.

field numTraces: int [Required]¶: Number of traces.

field samplesPerTrace: int [Required]¶: Trace length in number of samples.

field sampleInterval: int | float [Required]¶: Sampling rate from binary header.

field fileSize: int [Required]¶: File size in bytes.
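
The SegyInfo fields are related by the on-disk layout: for fixed-length traces, the file size determines the number of traces. The sketch below is a hedged illustration, not how the library computes `numTraces`; the `count_traces` helper is hypothetical, and it assumes 4-byte samples and no extended text headers.

```python
FILE_HEADER_BYTES = 3200 + 400  # textual + binary file headers
TRACE_HEADER_BYTES = 240


def count_traces(file_size: int, samples_per_trace: int, bytes_per_sample: int = 4) -> int:
    """Derive the trace count from the file size, assuming fixed-length traces."""
    trace_bytes = TRACE_HEADER_BYTES + samples_per_trace * bytes_per_sample
    payload = file_size - FILE_HEADER_BYTES
    if payload < 0 or payload % trace_bytes != 0:
        raise ValueError("file size is inconsistent with fixed-length traces")
    return payload // trace_bytes


# A file with 10 traces of 1000 four-byte samples each.
size = FILE_HEADER_BYTES + 10 * (TRACE_HEADER_BYTES + 1000 * 4)
print(size, count_traces(size, samples_per_trace=1000))  # 46000 10
```

A remainder after the division usually points to a truncated file, extra headers, or a wrong sample size assumption, which is why the sketch raises instead of rounding.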