trailpack.packing.export_service

Export service for converting UI data to Frictionless Data Package in Parquet.

Classes

DataPackageExporter

Service for exporting UI data to Frictionless Data Package in Parquet.

Module Contents

class trailpack.packing.export_service.DataPackageExporter(df: pandas.DataFrame, column_mappings: Dict[str, str], general_details: Dict[str, Any], sheet_name: str, file_name: str, suggestions_cache: Dict[str, List] = None, column_descriptions: Dict[str, str] = None, standard_version: str = '1.0.0')

Service for exporting UI data to Frictionless Data Package in Parquet.

Initialize with UI session state data.

Parameters:
  • df – Pandas DataFrame with the actual data

  • column_mappings – Mapping of column names to PyST concept IDs

  • general_details – Metadata from the general details form

  • sheet_name – Name of the Excel sheet

  • file_name – Original file name

  • suggestions_cache – Cache of PyST suggestions with id and label

  • column_descriptions – User-provided descriptions/comments for columns

  • standard_version – Trailpack standard version to validate against

_find_label_for_id(concept_id: str) -> str | None

Find label for a PyST concept ID from suggestions cache.

_format_validation_errors(validation_result) -> str

Format validation errors for better readability.

_infer_field_type(series: pandas.Series) -> str

Infer Frictionless field type from pandas Series.
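The inference rules are not listed on this page. As a rough illustration only, a dtype-based mapping to Frictionless field type names could look like the sketch below. The function name infer_field_type_sketch and the exact mapping are assumptions, not the behaviour of _infer_field_type itself.

```python
import pandas as pd
from pandas.api import types as ptypes


def infer_field_type_sketch(series: pd.Series) -> str:
    """Hypothetical dtype-to-Frictionless mapping (an assumption, not Trailpack's code)."""
    # Check booleans first so they are never reported as numeric.
    if ptypes.is_bool_dtype(series):
        return "boolean"
    if ptypes.is_integer_dtype(series):
        return "integer"
    if ptypes.is_float_dtype(series):
        return "number"
    if ptypes.is_datetime64_any_dtype(series):
        return "datetime"
    # Anything else (text, mixed objects) falls back to string.
    return "string"
```

The fallback to "string" keeps the sketch total: every Series yields some field type, even when the dtype is not recognised.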

_sanitize_resource_name(name: str) -> str

Sanitize resource name to match the pattern ^[a-z0-9-_.]+$.

The resource name must contain only:
  • Lowercase letters (a-z)

  • Numbers (0-9)

  • Hyphens (-)

  • Underscores (_)

  • Dots (.)

Parameters:

name – Raw name string to sanitize

Returns:

Sanitized name matching the required pattern
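One way to satisfy the pattern ^[a-z0-9-_.]+$ is sketched below. The replacement character (underscore) and the fallback name "resource" for empty results are assumptions made for illustration; the actual method may handle these cases differently.

```python
import re


def sanitize_resource_name_sketch(name: str) -> str:
    """Hypothetical sanitizer; replacement and fallback choices are assumptions."""
    lowered = name.strip().lower()
    # Collapse each run of disallowed characters into a single underscore.
    cleaned = re.sub(r"[^a-z0-9\-_.]+", "_", lowered)
    # Drop underscores left at either end by the substitution.
    cleaned = cleaned.strip("_")
    # The pattern requires at least one character, so fall back to a fixed name.
    return cleaned or "resource"
```

For example, "My Sheet (2024).xlsx" becomes "my_sheet_2024_.xlsx" under these rules.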

_validate_dataframe_for_parquet(df: pandas.DataFrame) -> None

Validate DataFrame is compatible with Arrow/Parquet format.

Raises:

ValueError – If data quality issues are found (e.g., mixed types in columns)
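Mixed types matter because Arrow expects a single type per column. The sketch below shows one possible way to detect such columns before writing. The helper name find_mixed_type_columns is hypothetical, and the real method may check more than this.

```python
import pandas as pd


def find_mixed_type_columns(df: pd.DataFrame) -> dict[str, list[str]]:
    """Hypothetical check: report object columns holding more than one Python type."""
    mixed: dict[str, list[str]] = {}
    for col in df.columns:
        # Only object columns can hold values of several Python types.
        if df[col].dtype != object:
            continue
        # Ignore missing values, then collect the distinct value types.
        type_names = sorted({type(v).__name__ for v in df[col].dropna()})
        if len(type_names) > 1:
            mixed[str(col)] = type_names
    return mixed
```

A caller could raise ValueError with the returned mapping in the message when it is non-empty, which mirrors the documented failure mode.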

build_fields() -> List[trailpack.packing.datapackage_schema.Field]

Convert column mappings to Field definitions.

build_metadata(resource: trailpack.packing.datapackage_schema.Resource) -> Dict[str, Any]

Build complete metadata using MetaDataBuilder.

build_resource(fields: List[trailpack.packing.datapackage_schema.Field]) -> trailpack.packing.datapackage_schema.Resource

Create Resource definition with fields.

export(output_path: str, validate_standard: bool = True) -> Tuple[str, str | None, Any | None]

Execute full export workflow and write Parquet.

Parameters:
  • output_path – Path where Parquet file will be written

  • validate_standard – Whether to validate against Trailpack standard (default: True)

Returns:

Tuple of (output_path, quality_level, validation_result):
  • output_path: Path to exported Parquet file

  • quality_level: Validation level (“STRICT”, “STANDARD”, “BASIC”, “INVALID”), or None if validation skipped

  • validation_result: Full ValidationResult object for report generation, or None if validation skipped

Raises:

ValueError – If validation fails or data quality issues found
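Because two of the three returned values may be None, callers need to branch on quality_level before using validation_result. The sketch below illustrates that handling on a plain tuple. The helper describe_export and its message format are hypothetical and not part of Trailpack.

```python
from typing import Any, Optional, Tuple

# Shape of the value returned by export(), as documented above.
ExportResult = Tuple[str, Optional[str], Optional[Any]]


def describe_export(result: ExportResult) -> str:
    """Hypothetical helper that summarises an export() result for display."""
    output_path, quality_level, validation_result = result
    # quality_level is None only when validation was skipped.
    if quality_level is None:
        return f"Wrote {output_path} (validation skipped)"
    return f"Wrote {output_path} (quality level: {quality_level})"
```

When quality_level is not None, validation_result can be passed on to generate_validation_report to produce the downloadable report.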

generate_validation_report(validation_result) -> str

Generate a complete validation report for download.

Includes errors, warnings, and info (data quality metrics).

Parameters:

validation_result – ValidationResult object from validation

Returns:

Formatted report as string

validate() -> Tuple[bool, List[str]]

Validate all inputs before processing.

column_descriptions
column_mappings
df
file_name
general_details
schema
sheet_name
suggestions_cache
validator