trailpack.excel.reader
======================

.. py:module:: trailpack.excel.reader

.. autoapi-nested-parse::

   Excel reader module for loading and inspecting Excel files.

   This module provides an ExcelReader class that:

   - Loads only the structure (sheets and columns) into memory
   - Provides access to sheet names
   - Provides access to column names for mapping


Classes
-------

.. autoapisummary::

   trailpack.excel.reader.ExcelReader


Module Contents
---------------

.. py:class:: ExcelReader(file_path: Union[str, pathlib.Path], header_row: int = 1)

   Excel file reader that loads only sheet structure (sheets and columns) into memory.

   This class provides methods to inspect Excel file structure without loading
   all the data, making it memory-efficient for large files:

   - List available sheets
   - Get column names from specific sheets

   .. rubric:: Example

   >>> reader = ExcelReader("data.xlsx")
   >>> sheet_names = reader.sheets()
   >>> columns = reader.columns("Sheet1")

   Initialize ExcelReader and load sheet structure (sheets and columns) into memory.

   Only the sheet names and column headers are loaded, not the actual data.
   This makes it memory-efficient for large Excel files.

   :param file_path: Path to the Excel file (.xlsx, .xlsm, .xltx, .xltm)
   :param header_row: Row number containing column headers (1-indexed). Defaults to 1.

   :raises FileNotFoundError: If the file does not exist
   :raises ValueError: If the file is not a valid Excel file


   .. py:method:: _load_structure()

      Load sheet structure (sheet names and column headers) from Excel file.

      Opens the workbook in read-only mode, extracts structure, then closes it.
      Only loads metadata, not actual data, for memory efficiency.


   .. py:method:: columns(sheet_name: Optional[str] = None) -> List[str]

      Get list of column names from a specific sheet.

      :param sheet_name: Name of the sheet to read columns from.
                         If None, uses the first sheet.

      :returns: List of column names as strings. Empty cells are returned as empty strings.

      :raises ValueError: If sheet_name doesn't exist in the workbook

      .. rubric:: Example

      >>> reader = ExcelReader("data.xlsx")
      >>> columns = reader.columns("Sheet1")
      >>> print(columns)
      ['ID', 'Name', 'Value', 'Date']


   .. py:method:: get_structure() -> Dict[str, List[str]]

      Get the complete sheet structure as a dictionary.

      :returns: Dictionary mapping sheet names to their column lists

      .. rubric:: Example

      >>> reader = ExcelReader("data.xlsx")
      >>> structure = reader.get_structure()
      >>> print(structure)
      {'Sheet1': ['ID', 'Name', 'Value'], 'Sheet2': ['Date', 'Amount']}


   .. py:method:: reload()

      Reload the sheet structure from the Excel file.

      Useful if the file has been modified and you want to refresh the structure.


   .. py:method:: sheets() -> List[str]

      Get list of all sheet names in the workbook.

      :returns: List of sheet names as strings

      .. rubric:: Example

      >>> reader = ExcelReader("data.xlsx")
      >>> sheet_names = reader.sheets()
      >>> print(sheet_names)
      ['Sheet1', 'Sheet2', 'Data']


   .. py:attribute:: _sheet_columns
      :type: Dict[str, List[str]]


   .. py:attribute:: file_path


   .. py:attribute:: header_row
      :value: 1
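
As a minimal sketch of the column-mapping use case, the dictionary returned by ``get_structure()`` can be searched with plain Python. ``sheets_with_column`` is a hypothetical helper, not part of trailpack, and its case-insensitive, whitespace-trimmed comparison is an assumption. The dictionary literal stands in for a real ``reader.get_structure()`` call and reuses the values from the ``get_structure()`` example.

```python
from typing import Dict, List


def sheets_with_column(structure: Dict[str, List[str]], column: str) -> List[str]:
    """Return the names of sheets whose header row contains ``column``.

    Hypothetical helper for illustration; not part of trailpack.
    Matching ignores case and surrounding whitespace (an assumption).
    """
    wanted = column.strip().lower()
    return [
        sheet
        for sheet, columns in structure.items()
        if any(name.strip().lower() == wanted for name in columns)
    ]


# Stand-in for reader.get_structure(); values taken from the documented example.
structure = {
    "Sheet1": ["ID", "Name", "Value"],
    "Sheet2": ["Date", "Amount"],
}

print(sheets_with_column(structure, "amount"))   # ['Sheet2']
print(sheets_with_column(structure, "Missing"))  # []
```

Because the helper only touches the in-memory structure, it never reopens the workbook. If the file may have changed on disk, calling ``reload()`` before ``get_structure()`` would refresh the data it searches.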