quends.base.ensemble#
Classes#
Manages an ensemble of DataStream instances, enabling multi-stream analysis. |
Module Contents#
- class quends.base.ensemble.Ensemble(data_streams)#
Manages an ensemble of DataStream instances, enabling multi-stream analysis.
- Provides methods for:
Simple accessors (.head, .get_member, .members).
Identifying common variables across streams.
Generating an average-ensemble stream aligned to the shortest time grid.
Applying DataStream methods (mean, uncertainty, CI, ESS) at the ensemble level via three techniques: average-ensemble, aggregate-then-statistics, and weighted.
Tracking per-stream and ensemble metadata histories for reproducibility.
- Parameters:
data_streams (List[quends.base.data_stream.DataStream]) –
- data_streams#
- head(n=5)#
Retrieve the first n rows from each DataStream member.
Parameters#
- nint
Number of rows to return per stream.
Returns#
- Dict[int, pandas.DataFrame]
Mapping from member index to its DataFrame head.
- get_member(index)#
Fetch a specific ensemble member by index.
Parameters#
- indexint
Zero-based index of the DataStream in the ensemble.
Returns#
DataStream
Raises#
- IndexError
If index is out of bounds.
- common_variables()#
Identify variable columns shared by all members, excluding ‘time’.
Returns#
List[str]
- summary()#
Print and return a structured summary of ensemble members.
Includes each member’s sample count, column list, and head rows.
Returns#
- dict
- { ‘n_members’: int,
‘common_variables’: List[str], ‘members’: { ‘Member i’: { ‘n_samples’: int,
‘columns’: List[str], ‘head’: dict } } }
- compute_average_ensemble(members=None)#
Build a DataStream whose columns are the elementwise mean across members, aligned on the shortest time grid.
Parameters#
- membersList[DataStream], optional
Subset of streams to average; defaults to all.
Returns#
DataStream
Raises#
- ValueError
If no streams are provided.
- Parameters:
members (List[quends.base.data_stream.DataStream]) –
- resample_to_short_intervals(short_df, long_df)#
Align long_df onto short_df.time by block-averaging between boundaries.
Parameters#
- short_dfpandas.DataFrame
Reference DataFrame with the shortest time series.
- long_dfpandas.DataFrame
Stream to resample.
Returns#
- pandas.DataFrame
Resampled data matching short_df.time.
- Parameters:
short_df (pandas.DataFrame) –
long_df (pandas.DataFrame) –
- static collect_histories(ds_list)#
Gather _history lists from each DataStream in ds_list.
Parameters#
- ds_listList[DataStream]
Streams whose histories to collect.
Returns#
List[List[dict]]
- Parameters:
ds_list (List[quends.base.data_stream.DataStream]) –
- trim(column_name, batch_size=10, start_time=0.0, method='std', threshold=None, robust=True)#
- is_stationary(columns)#
Test stationarity for columns across all members.
Returns#
- dict
- { ‘results’: {Member i: {col: bool or error}},
‘metadata’: {Member i: history} }
- Return type:
Dict
- effective_sample_size(column_names=None, alpha=0.05, technique=0)#
- Compute classic ESS via three techniques:
0 - on average-ensemble 1 - on concatenated aggregate 2 - per-member then aggregate
Returns#
- dict
{ ‘results’: …, ‘metadata’: … }
- Parameters:
alpha (float) –
technique (int) –
- Return type:
Dict
- ess_robust(column_names=None, rank_normalize=True, min_samples=8, return_relative=False, technique=0)#
Compute robust ESS (rank-based) via three techniques.
Returns#
- dict
{ ‘results’: …, ‘metadata’: … }
- mean(column_name=None, method='non-overlapping', window_size=None, technique=0)#
- Compute ensemble mean via three techniques:
0 - average-ensemble 1 - aggregate-then-statistics 2 - weighted per-member
Returns#
- dict
{ ‘results’: …, ‘metadata’: … }
- mean_uncertainty(column_name=None, ddof=1, method='non-overlapping', window_size=None, technique=0)#
Compute SEM via three techniques (0: average, 1: aggregate, 2: weighted).
Returns#
dict