smart_reader
============

.. py:module:: smart_reader


Classes
-------

.. autoapisummary::

   smart_reader.SmartDataReader


Module Contents
---------------

.. py:class:: SmartDataReader(file_path: Union[str, pathlib.Path])

   Adaptive data reader that chooses optimal technology based on file size.

   Strategy:

   - <10MB: pandas (simplicity, compatibility)
   - 10-500MB: polars (speed, memory efficiency)
   - >500MB: polars lazy (streaming, minimal memory)
   - CSV always: pyarrow or polars (much faster than pandas)


   .. py:method:: _choose_engine() -> str

      Choose optimal engine based on file size.


   .. py:method:: _read_pandas(sheet_name: Optional[str] = None) -> pandas.DataFrame

      Small files: Use pandas.


   .. py:method:: _read_pandas_chunked(sheet_name: Optional[str] = None, chunk_size: int = 10000) -> pandas.DataFrame

      Read large Excel files in chunks, return first chunk for preview.


   .. py:method:: _read_polars(sheet_name: Optional[str] = None) -> pandas.DataFrame

      Medium files: Use polars, convert to pandas.


   .. py:method:: _read_polars_lazy(sheet_name: Optional[str] = None) -> pandas.DataFrame

      Large files: Use lazy evaluation, process in chunks.


   .. py:method:: _read_pyarrow() -> pandas.DataFrame

      CSV with PyArrow (fastest CSV reader).


   .. py:method:: estimate_memory() -> str

      Estimate memory usage.

      :returns: Human-readable memory estimate string


   .. py:method:: read(sheet_name: Optional[str] = None) -> pandas.DataFrame

      Read file using optimal engine, always return pandas DataFrame.

      :param sheet_name: Sheet name for Excel files (optional)
      :returns: pandas DataFrame with file contents

      Why pandas output?

      - Rest of codebase expects pandas
      - Can convert polars → pandas at end
      - Only final result in memory


   .. py:attribute:: LARGE_FILE
      :value: 524288000


   .. py:attribute:: SMALL_FILE
      :value: 10485760


   .. py:attribute:: engine
      :value: 'polars'


   .. py:attribute:: file_path


   .. py:attribute:: file_size
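
The size-based strategy described for ``SmartDataReader`` can be sketched as a standalone function. This is a minimal illustration, not the module's actual ``_choose_engine`` implementation: the function name ``choose_engine``, its parameters, and the engine labels ``"pandas"``, ``"polars_lazy"`` and ``"pyarrow"`` are assumptions (only ``'polars'`` appears in the documented ``engine`` attribute). The thresholds are the documented ``SMALL_FILE`` and ``LARGE_FILE`` values, and the handling of the exact boundary sizes is a guess.

```python
from pathlib import Path
from typing import Union

# Thresholds copied from the documented class attributes.
SMALL_FILE = 10_485_760    # 10 * 1024 * 1024 bytes
LARGE_FILE = 524_288_000   # 500 * 1024 * 1024 bytes


def choose_engine(file_path: Union[str, Path], file_size: int) -> str:
    """Hypothetical stand-in for SmartDataReader._choose_engine()."""
    # Assumption: CSV files always bypass pandas; "pyarrow" is an assumed label.
    if Path(file_path).suffix.lower() == ".csv":
        return "pyarrow"
    # Small files: pandas for simplicity and compatibility.
    if file_size < SMALL_FILE:
        return "pandas"
    # Medium files: eager polars for speed and memory efficiency.
    if file_size <= LARGE_FILE:
        return "polars"
    # Large files: lazy polars to keep memory use low.
    return "polars_lazy"


if __name__ == "__main__":
    samples = [
        ("report.xlsx", 2_000_000),
        ("export.xlsx", 50_000_000),
        ("archive.xlsx", 600_000_000),
        ("log.csv", 1_000),
    ]
    for name, size in samples:
        print(name, choose_engine(name, size))
```

Passing the size in explicitly keeps the sketch testable without touching the filesystem; the real class presumably derives it from its ``file_size`` attribute.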