trailpack.packing.export_service
Export service for converting UI data to Frictionless Data Package in Parquet.
Classes
DataPackageExporter – Service for exporting UI data to Frictionless Data Package in Parquet.
Module Contents
- class trailpack.packing.export_service.DataPackageExporter(df: pandas.DataFrame, column_mappings: Dict[str, str], general_details: Dict[str, Any], sheet_name: str, file_name: str, suggestions_cache: Dict[str, List] = None, column_descriptions: Dict[str, str] = None, standard_version: str = '1.0.0')
Service for exporting UI data to Frictionless Data Package in Parquet.
Initialize with UI session state data.
- Parameters:
df – Pandas DataFrame with the actual data
column_mappings – Mapping of column names to PyST concept IDs
general_details – Metadata from the general details form
sheet_name – Name of the Excel sheet
file_name – Original file name
suggestions_cache – Cache of PyST suggestions with id and label
column_descriptions – User-provided descriptions/comments for columns
standard_version – Trailpack standard version to validate against
- _find_label_for_id(concept_id: str) -> str | None
Find label for a PyST concept ID from suggestions cache.
- _format_validation_errors(validation_result) -> str
Format validation errors for better readability.
- _infer_field_type(series: pandas.Series) -> str
Infer Frictionless field type from pandas Series.
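The inference rules of `_infer_field_type` are not documented here. As a rough illustration only, a dtype-based mapping could look like the sketch below. The function name `infer_field_type_sketch` and the exact dtype-to-type mapping are assumptions, not the library's implementation; the returned strings are standard Frictionless field type names.

```python
import pandas as pd
from pandas.api import types as pdt


def infer_field_type_sketch(series: pd.Series) -> str:
    """Hypothetical mapping from a pandas dtype to a Frictionless field type."""
    # Booleans are checked first so they are never reported as numeric.
    if pdt.is_bool_dtype(series):
        return "boolean"
    if pdt.is_integer_dtype(series):
        return "integer"
    if pdt.is_float_dtype(series):
        return "number"
    if pdt.is_datetime64_any_dtype(series):
        return "datetime"
    # Assumption: anything else (object/string columns) is exported as text.
    return "string"
```

Falling back to `"string"` is a conservative choice for this sketch; the real method may distinguish more types.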
- _sanitize_resource_name(name: str) -> str
Sanitize resource name to match the pattern ^[a-z0-9-_.]+$.
The resource name must contain only:
  - Lowercase letters (a-z)
  - Numbers (0-9)
  - Hyphens (-)
  - Underscores (_)
  - Dots (.)
- Parameters:
name – Raw name string to sanitize
- Returns:
Sanitized name matching the required pattern
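One way to satisfy the pattern `^[a-z0-9-_.]+$` is sketched below. This is a minimal illustration, not the method's actual code: the name `sanitize_resource_name_sketch`, the choice of underscore as the replacement character, and the `"resource"` fallback for empty results are all assumptions.

```python
import re


def sanitize_resource_name_sketch(name: str) -> str:
    """Hypothetical sanitizer producing names that match ^[a-z0-9-_.]+$."""
    # Lowercase first so uppercase letters are kept rather than replaced.
    lowered = name.strip().lower()
    # Collapse each run of disallowed characters into a single underscore.
    cleaned = re.sub(r"[^a-z0-9\-_.]+", "_", lowered)
    # Drop underscores at either end (assumed cosmetic choice).
    cleaned = cleaned.strip("_")
    # Assumed fallback so the result is never empty.
    return cleaned or "resource"
```

For example, `sanitize_resource_name_sketch("My Sheet (2024).xlsx")` returns `"my_sheet_2024_.xlsx"`.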
- _validate_dataframe_for_parquet(df: pandas.DataFrame) -> None
Validate DataFrame is compatible with Arrow/Parquet format.
- Raises:
ValueError – If data quality issues are found (e.g., mixed types in columns)
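The mixed-type case mentioned above can be illustrated with a small check. This is a sketch under assumptions: the helper names `find_mixed_type_columns` and `validate_for_parquet_sketch` are hypothetical, and the real method may test for other problems as well.

```python
import pandas as pd


def find_mixed_type_columns(df: pd.DataFrame) -> dict:
    """Return {column: sorted type names} for object columns holding several Python types."""
    mixed = {}
    for column in df.columns:
        series = df[column]
        # Only object-dtype columns can hold values of different Python types.
        if series.dtype != object:
            continue
        # Missing values are ignored when comparing types.
        type_names = sorted({type(value).__name__ for value in series.dropna()})
        if len(type_names) > 1:
            mixed[column] = type_names
    return mixed


def validate_for_parquet_sketch(df: pd.DataFrame) -> None:
    """Raise ValueError when any column mixes types (hypothetical stand-in)."""
    mixed = find_mixed_type_columns(df)
    if mixed:
        details = "; ".join(
            f"{column}: {', '.join(names)}" for column, names in mixed.items()
        )
        raise ValueError(f"Mixed types in columns: {details}")
```

Reporting every offending column in one message lets the user fix the data in a single pass rather than one error at a time.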
- build_fields() -> List[trailpack.packing.datapackage_schema.Field]
Convert column mappings to Field definitions.
- build_metadata(resource: trailpack.packing.datapackage_schema.Resource) -> Dict[str, Any]
Build complete metadata using MetaDataBuilder.
- build_resource(fields: List[trailpack.packing.datapackage_schema.Field]) -> trailpack.packing.datapackage_schema.Resource
Create Resource definition with fields.
- export(output_path: str, validate_standard: bool = True) -> Tuple[str, str | None, Any | None]
Execute full export workflow and write Parquet.
- Parameters:
output_path – Path where Parquet file will be written
validate_standard – Whether to validate against Trailpack standard (default: True)
- Returns:
Tuple of (output_path, quality_level, validation_result):
  - output_path: Path to exported Parquet file
  - quality_level: Validation level (“STRICT”, “STANDARD”, “BASIC”, “INVALID”) or None if validation skipped
  - validation_result: Full ValidationResult object for report generation, or None if validation skipped
- Raises:
ValueError – If validation fails or data quality issues found
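A caller needs to handle the case where validation was skipped and `quality_level` is None. The helper below is a hypothetical sketch of unpacking the returned tuple; `describe_export_result` is not part of the module, and the tuple values in the usage note are made up for illustration.

```python
from typing import Any, Optional, Tuple


def describe_export_result(result: Tuple[str, Optional[str], Optional[Any]]) -> str:
    """Summarize the (output_path, quality_level, validation_result) tuple."""
    output_path, quality_level, validation_result = result
    # quality_level is None when export() was called with validate_standard=False.
    if quality_level is None:
        return f"Wrote {output_path} (validation skipped)"
    return f"Wrote {output_path} (quality level: {quality_level})"
```

With an exporter instance this might be used as `describe_export_result(exporter.export("out.parquet"))`, wrapped in a `try`/`except ValueError` block, since `export` raises on validation failures and data quality issues.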