depictr.summary_table

depictr.summary_table(data, vars=None, group=None, digits=1, missing=True, max_levels=20)

Build a “Table 1” descriptive summary.

The first row always reports the sample size (N) overall and per group. Numeric variables are summarised as mean (SD); categorical variables get one row per level with count (percent). A Missing, n (%) row follows any variable that has missing values, unless missing is False.
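The two cell formats described above can be sketched in plain pandas. This is an illustration only, not depictr's implementation: the helper names mean_sd and count_pct are hypothetical, and the sketch assumes the sample standard deviation (pandas' default, ddof=1), a percent denominator equal to the number of rows, a trailing percent sign, and that digits also applies to the percentage.

```python
import pandas as pd


def mean_sd(values: pd.Series, digits: int = 1) -> str:
    """Format a numeric column as 'mean (SD)' (hypothetical helper)."""
    # Series.std() uses ddof=1, i.e. the sample standard deviation (assumption).
    return f"{values.mean():.{digits}f} ({values.std():.{digits}f})"


def count_pct(values: pd.Series, level, digits: int = 1) -> str:
    """Format one categorical level as 'count (percent)' (hypothetical helper)."""
    n = int((values == level).sum())
    # Assumption: the denominator is the total number of rows.
    pct = 100 * n / len(values)
    return f"{n} ({pct:.{digits}f}%)"


age = pd.Series([30.0, 40.0, 50.0])
sex = pd.Series(["F", "F", "M", "F"])

print(mean_sd(age))         # 40.0 (10.0)
print(count_pct(sex, "F"))  # 3 (75.0%)
```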

Parameters:
  • data (pandas.DataFrame) – The data to summarise.

  • vars (list of str, optional) – Columns to summarise. If None, every column except group is used, with high-cardinality identifier-like columns skipped (see max_levels).

  • group (str, optional) – A grouping column; one summary column is produced per level, alongside an overall column.

  • digits (int) – Decimal places for the numeric summaries.

  • missing (bool) – Add a Missing, n (%) row for variables that contain missing values.

  • max_levels (int) – When vars is None, a non-numeric column whose distinct-value count is at least max_levels and exceeds half the number of rows is treated as an identifier and skipped, so it does not explode into one row per value.
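The identifier rule described for max_levels can be written out as a small predicate. This is a minimal sketch of the stated condition, not the library's own code: looks_like_identifier is a hypothetical name, and the sketch assumes missing values are not counted as a distinct value (the default behaviour of Series.nunique).

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def looks_like_identifier(column: pd.Series, max_levels: int = 20) -> bool:
    """Return True if a column would be skipped under the rule above."""
    # The rule only applies to non-numeric columns.
    if is_numeric_dtype(column):
        return False
    # Assumption: NaN is not counted as a distinct value.
    n_distinct = column.nunique()
    # Both conditions must hold: at least max_levels distinct values,
    # and more distinct values than half the number of rows.
    return n_distinct >= max_levels and n_distinct > len(column) / 2


df = pd.DataFrame({
    "patient_id": [f"P{i:03d}" for i in range(40)],  # 40 distinct strings
    "site": ["A", "B"] * 20,                         # 2 distinct strings
    "age": list(range(40)),                          # numeric, never skipped
})

skipped = [name for name in df.columns if looks_like_identifier(df[name])]
print(skipped)  # ['patient_id']
```

Because both conditions must hold, a categorical column with many levels in a large data set (for example 25 sites across 1,000 rows) is still summarised, while a column with fewer than max_levels distinct values is never skipped, however few rows there are.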

Returns:

A table with the columns variable, statistic and Overall, plus one column per group level when group is given. The first row reports N. The variable name is blanked on its repeated rows for readability.

Return type:

pandas.DataFrame
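Because the variable name is blanked on repeated rows, filtering the result by variable needs the names restored first. The sketch below mimics the blanking on a hand-built frame and then undoes it. The statistic labels and cell values are made up for illustration, and the sketch assumes the blank is an empty string; check the actual output before relying on this.

```python
import pandas as pd

# Hand-built stand-in for a summary table (values are illustrative only).
long_form = pd.DataFrame({
    "variable": ["N", "sex", "sex", "age"],
    "statistic": ["n", "F", "M", "mean (SD)"],
    "Overall": ["4", "3 (75.0%)", "1 (25.0%)", "41.5 (9.3)"],
})

# Blank every repeat of a variable name, keeping its first occurrence.
repeated = long_form["variable"].duplicated()
display_form = long_form.assign(variable=long_form["variable"].mask(repeated, ""))
print(display_form["variable"].tolist())  # ['N', 'sex', '', 'age']

# Restore the names: turn blanks into missing values, then forward-fill.
names = display_form["variable"]
restored = names.where(names != "").ffill()
print(restored.tolist())  # ['N', 'sex', 'sex', 'age']
```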