5. For Developers

This section elaborates on the design of THUNER, describing some key internal classes and functions.

5.1. Input Records

A core challenge I found when working with MINT, and other tracking algorithms, was the need to iterate over multiple lists of files corresponding to distinct datasets. Tracking algorithms really need to iterate over time steps, but a single file sometimes contains multiple time steps; the following classes address this.

pydantic model thuner.track._utils.BaseInputRecord[source]

Bases: BaseModel

Base input record class. An input record will be defined for each dataset, and store the appropriate grids and files during tracking or tagging.

Fields:
field dataset: DataArray | Dataset | None = None

Dataset from which to draw grids, which is updated as needed as the run progresses. In this context, a ‘dataset’ is an xarray.DataArray or xarray.Dataset corresponding to a single file. A grid is a single time step extracted from a dataset.

field filepaths: list[str] | dict | None = None

The relevant dataset filepaths used for the run.

field name: str [Required]

Name of the input dataset being recorded.

field weights_filepath: Callable | None = None

The regridder function for this dataset. This should be left as None and inferred during tracking.

field write_interval: timedelta64 = np.timedelta64(1,'h')

How often to move attribute data from working memory to hard disk.
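The file-versus-time-step problem described at the start of this subsection can be illustrated with a minimal sketch. It uses only the standard library; SimpleInputRecord, FAKE_FILES and iterate_grids are hypothetical names invented for illustration and are not part of THUNER, and plain lists stand in for xarray datasets. The record keeps the currently open "dataset" and replaces it only when the next file is needed, while the caller simply iterates over time steps.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional, Tuple

# Hypothetical stand-in for files on disk: each filepath maps to the
# time steps that file contains. Real files would be opened with xarray.
FAKE_FILES = {
    "radar_0000.nc": ["00:00", "00:10", "00:20"],
    "radar_0030.nc": ["00:30", "00:40"],
}


@dataclass
class SimpleInputRecord:
    """Illustrative only; a simplified analogue of an input record."""

    name: str
    filepaths: List[str]
    dataset: Optional[List[str]] = None  # contents of the currently open file
    files_opened: int = 0


def iterate_grids(record: SimpleInputRecord) -> Iterator[Tuple[str, str]]:
    """Yield (filepath, time) pairs, opening each file only once."""
    for filepath in record.filepaths:
        # Update the record's dataset only when a new file is required.
        record.dataset = FAKE_FILES[filepath]
        record.files_opened += 1
        # A "grid" is a single time step extracted from the dataset.
        for time in record.dataset:
            yield filepath, time


record = SimpleInputRecord(name="radar", filepaths=list(FAKE_FILES))
grids = list(iterate_grids(record))
```

Here five time steps are produced from two files, so the tracking loop never needs to know how time steps are distributed across files.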

pydantic model thuner.track._utils.TrackInputRecord[source]

Bases: BaseInputRecord

Input record class for datasets used for tracking.

Fields:
Validators:
  • _initialize_deques » all fields

field boundary_coodinates: deque | None = None

Deque of current/previous boundary coordinates.

field boundary_masks: deque[DataArray | Dataset] | None = None

Deque of current/previous boundary masks.

field deque_length: int = 2

Number of grids/masks to keep in memory.

field domain_masks: deque[DataArray | Dataset] | None = None

Deque of current/previous domain masks.

field grids: deque[DataArray | Dataset] | None = None

Deque of current/previous grids.

field next_boundary_coordinates: Dict | None = None

The next grid’s boundary coordinates.

field next_boundary_mask: DataArray | Dataset | None = None

The next grid’s boundary mask, i.e. mask of boundary pixels.

field next_domain_mask: DataArray | Dataset | None = None

The domain mask, i.e. region of valid values, for the next grid.

field next_grid: DataArray | Dataset | None = None

Next grid to carry out detection/matching. A ‘grid’ in thuner is a single time step.

field synthetic_base_dataset: DataArray | Dataset | None = None

Synthetic base dataset. See thuner.data.synthetic.

field synthetic_objects: list[dict] | None = None

Dictionaries describing synthetic objects. See thuner.data.synthetic.
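The relationship between the next_grid field and the grids deque can be sketched as follows. This is an assumption about how such a rolling window might behave, not a description of THUNER's internals: RollingGrids and its update method are hypothetical, and strings stand in for xarray grids. A deque with maxlen equal to deque_length discards the oldest entry automatically when a new one is appended.

```python
from collections import deque


class RollingGrids:
    """Illustrative rolling window of current/previous grids."""

    def __init__(self, deque_length: int = 2):
        # maxlen makes the deque drop its oldest item on overflow.
        self.grids = deque(maxlen=deque_length)
        self.next_grid = None

    def update(self, new_grid):
        """Move next_grid into the deque, then store new_grid as next_grid."""
        if self.next_grid is not None:
            self.grids.append(self.next_grid)
        self.next_grid = new_grid


rolling = RollingGrids(deque_length=2)
for step in ["t0", "t1", "t2", "t3"]:
    rolling.update(step)
```

After four updates, the deque holds the two most recent past grids and next_grid holds the newest one, so memory use stays bounded regardless of run length.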

pydantic model thuner.track._utils.InputRecords[source]

Bases: BaseModel

Class for managing the input records for all the datasets of a given run.

Fields:
Validators:
  • _initialize_input_records » all fields

field data_options: DataOptions [Required]

Options for the datasets.

field tag: Dict[str, BaseInputRecord] = {}

Dictionary containing the input records for tagging datasets.

field track: Dict[str, TrackInputRecord] = {}

Dictionary containing the input records for tracking datasets.

5.2. Tracks

We also need classes that collect attributes and other data for each object as the tracking run proceeds. First we need a class to store the attributes of each object as they are collected. Attributes are stored in dictionaries, with the dictionaries cleared periodically when data is written to disk.

pydantic model thuner.attribute.utils.AttributesRecord[source]

Bases: BaseModel

Class for storing attributes recorded during the tracking process.

Fields:
Validators:
  • _check_name » all fields

  • _initialize_attributes » all fields

field attribute_options: Attributes [Required]
field attribute_types: dict | None = None
field member_attributes: dict | None = None
field name: str = None
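The pattern of storing attributes in dictionaries and clearing them when data is written to disk can be sketched as below. This is a simplified illustration under stated assumptions: AttributeBuffer, record and flush are hypothetical names, a list (written_rows) stands in for a file on disk, and the flush is triggered once the elapsed time reaches a write interval of one hour.

```python
from datetime import datetime, timedelta


class AttributeBuffer:
    """Illustrative in-memory attribute store with periodic flushing."""

    def __init__(self, write_interval: timedelta = timedelta(hours=1)):
        self.write_interval = write_interval
        self.attributes = {"time": [], "area": []}
        self.last_write = None
        self.written_rows = []  # stand-in for data written to disk

    def record(self, time: datetime, area: float) -> None:
        if self.last_write is None:
            self.last_write = time
        self.attributes["time"].append(time)
        self.attributes["area"].append(area)
        # Flush once the write interval has elapsed since the last write.
        if time - self.last_write >= self.write_interval:
            self.flush(time)

    def flush(self, time: datetime) -> None:
        rows = zip(self.attributes["time"], self.attributes["area"])
        self.written_rows.extend(rows)
        # Clear the dictionaries so working memory does not grow unboundedly.
        for values in self.attributes.values():
            values.clear()
        self.last_write = time


buffer = AttributeBuffer()
start = datetime(2020, 1, 1)
for i in range(8):
    # One record every 10 minutes, from 00:00 to 01:10.
    buffer.record(start + timedelta(minutes=10 * i), area=float(i))
```

In this run the first seven records are flushed together at 01:00, and the record at 01:10 remains in memory until the next flush.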

Next we need classes to manage each object and level. These classes are nested in a manner analogous to the thuner.option.track.ObjectOptions, thuner.option.track.LevelOptions and thuner.option.track.TrackOptions classes, and each tracking class also stores its corresponding options.

pydantic model thuner.track._utils.ObjectTracks[source]

Bases: BaseModel

Class for recording the attributes, grids, etc. for tracking a particular object.

Fields:
Validators:
  • _check_name » all fields

  • _initialize_attributes » all fields

  • _initialize_deques » all fields

field attributes: AttributesRecord | None = None

Attributes for the object.

field current_attributes: AttributesRecord | None = None

Attributes for the object collected during current iteration.

field deque_length: int = 2

Number of current/previous objects to keep in memory.

field gridcell_area: DataArray | Dataset | None = None

Area of each grid cell in km^2.

field grids: deque[DataArray | Dataset] | None = None

Deque of current/previous grids.

field masks: deque[DataArray | Dataset] | None = None

Deque of current/previous masks.

field match_record: Dict | None = None

Current match record.

field matched_masks: deque[DataArray | Dataset] | None = None

Deque of current/previous matched masks.

field name: str | None = None

Name of the object to be tracked.

field next_grid: DataArray | Dataset | None = None

Next grid for tracking.

field next_mask: DataArray | Dataset | None = None

Next mask for tracking.

field next_matched_mask: DataArray | Dataset | None = None

Next matched mask for tracking.

field next_time: datetime64 | None = None

Next time for tracking.

field next_time_interval: timedelta64 | None = None

Interval between current and next grids.

field object_count: int = 0

Running count of the number of objects tracked.

field object_options: BaseObjectOptions [Required]

Options for the object to be tracked.

field previous_match_records: deque[Dict] | None = None

Deque of previous match records.

field previous_time_interval: deque | None = None

Interval between current and previous grids.

field times: deque[datetime64] | None = None

Deque of current/previous times.

pydantic model thuner.track._utils.LevelTracks[source]

Bases: BaseModel

Class for recording the attributes, grids, etc. for tracking a particular hierarchy level.

Fields:
Validators:
  • _initialize_objects » all fields

field level_options: LevelOptions [Required]

Options for the given level of the hierarchy.

field objects: dict[str, ObjectTracks] = {}

Objects to be tracked.

pydantic model thuner.track._utils.Tracks[source]

Bases: BaseModel

Class for recording tracks of all hierarchy levels.

Fields:
Validators:
  • _initialize_levels » all fields

field levels: list[LevelTracks] = []

Tracks for each hierarchy level.

field track_options: TrackOptions [Required]

Options for tracking.
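The nesting of the tracking classes, mirroring the nesting of the options classes, can be sketched with standard-library dataclasses. All class names below (ObjectOpts, LevelOpts, TrackOpts, ObjTracks, LvlTracks, AllTracks) and the object names are hypothetical stand-ins chosen for illustration; the actual THUNER classes are pydantic models that populate themselves through validators, for which __post_init__ is used here as a rough analogue.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ObjectOpts:
    name: str


@dataclass
class LevelOpts:
    objects: List[ObjectOpts]


@dataclass
class TrackOpts:
    levels: List[LevelOpts]


@dataclass
class ObjTracks:
    object_options: ObjectOpts
    object_count: int = 0


@dataclass
class LvlTracks:
    level_options: LevelOpts
    objects: Dict[str, ObjTracks] = field(default_factory=dict)

    def __post_init__(self):
        # Build one object-tracks entry per object in the level options.
        for opts in self.level_options.objects:
            self.objects[opts.name] = ObjTracks(object_options=opts)


@dataclass
class AllTracks:
    track_options: TrackOpts
    levels: List[LvlTracks] = field(default_factory=list)

    def __post_init__(self):
        # Build one level-tracks entry per level in the track options.
        self.levels = [
            LvlTracks(level_options=level) for level in self.track_options.levels
        ]


options = TrackOpts(
    levels=[
        LevelOpts(objects=[ObjectOpts("cell"), ObjectOpts("anvil")]),
        LevelOpts(objects=[ObjectOpts("mcs")]),
    ]
)
tracks = AllTracks(track_options=options)
```

Because each tracking object keeps a reference to its own options, code operating on a single object or level does not need the full options tree passed in separately.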