# Getting started

This guide works through a short analysis with depictr, from a first look at the data to a fitted model and its diagnostics. Every function returns a [plotnine](https://plotnine.org) object, so anything shown here can be refined further with the usual `+` syntax.

## Install

```bash
pip install depictr          # core: plotnine, pandas, numpy, matplotlib, scipy
pip install depictr[all]     # plus the optional computation back-ends
```

The exploratory, theme and accessibility tools work with the core install. The model, classification and survival plots delegate their computation to statsmodels, scikit-learn and lifelines respectively, each an optional extra (`depictr[models]`, `depictr[classification]`, `depictr[survival]`).

## The idea

depictr gives the whole workflow one theme, one colourblind-safe palette (the Okabe-Ito set) and one calling convention. Where a specialist package already computes a quantity well, depictr hands the work to it and redraws the result under the shared theme, so a ROC curve, a coefficient plot and a survival curve all share the same visual language.

```python
import depictr as dp
```

Each call returns a plotnine object. In a Jupyter notebook it renders on display; in a script, call `.show()`, or save it with `dp.save_plot(p, "fig.png")`.

## A first look at the data

depictr ships a few reproducibly simulated datasets. Here is a lexical-decision experiment with reaction times in two priming conditions.

```python
ld = dp.lexical_decision()
dp.explore_distribution(ld, "RT", group="condition", kind="density",
                        legend_inside=True)
```

The legend sits inside the panel, in the corner the distribution leaves empty. For a wider survey, a correlation heatmap and a missing-data map give a quick overview:

```python
wb = dp.wellbeing_survey()
dp.correlation_heatmap(wb)
dp.missingness_map(wb)
```

## Fitting and reading a model

depictr does not fit models; it reads a model you have fitted, or a tidy table of estimates.
Fit an ordinary least-squares model with statsmodels, then read it from several angles.

```python
import statsmodels.formula.api as smf

cy = dp.crop_yield()
# Q() quotes "yield" because it is a Python keyword.
model = smf.ols('Q("yield") ~ fertiliser + rainfall + soil_ph + treatment', cy).fit()

dp.coefficient_plot(model, title="Drivers of crop yield")
dp.effects_plot(model, "fertiliser")
dp.residual_diagnostics_plot(model)
```

`coefficient_plot` also accepts a plain data frame of estimates (with columns `term`, `estimate`, `conf_low`, `conf_high`), so estimates from any source -- a Bayesian fit, a bootstrap, a table copied from a paper -- plot the same way.

## Survival and classification

These families delegate to lifelines and scikit-learn. Kaplan-Meier curves with a log-rank test and a number-at-risk table are one call:

```python
ct = dp.clinical_trial()
dp.survival_plot(ct["time"], ct["event"], group=ct["arm"],
                 risk_table=True, legend_inside=True)
dp.roc_curve_plot(ct["adverse_event"], ct["biomarker"])
```

## Accessibility, checked rather than asserted

The default palette is the Okabe-Ito set, and that choice is verified rather than assumed. A Machado-2009 simulator and a CIE-Lab distance test report how far apart the palette's colours stay under each form of colour-vision deficiency.

```python
dp.palette_safety()
# {'min_delta_e': ..., 'safe': True, 'by_condition': {...}, ...}
```

## Extending and composing

Because every function returns a plotnine object, the grammar-of-graphics extensions apply:

```python
from plotnine import labs

dp.roc_curve_plot(ct["adverse_event"], ct["biomarker"]) + labs(title="Adverse event")
```

To place several plots in one figure, use `arrange_plots`:

```python
dp.arrange_plots(dp.qq_plot(model), dp.influence_plot(model), ncol=2)
```

## Where next

- The [gallery](auto_examples/index) renders a worked example from every family.
- The [API reference](api) documents each function and its options.
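
For readers curious what a CIE-Lab distance check of the kind mentioned under "Accessibility, checked rather than asserted" involves, here is a minimal standalone sketch. It is not depictr's implementation: the function names are hypothetical, the distance used is the simple Delta E 1976 formula (an assumption, since the guide does not say which formula `palette_safety` uses), and the colour-vision-deficiency simulation step is omitted, so it measures separation under normal vision only.

```python
from itertools import combinations

# Okabe-Ito palette, hex values as commonly published.
OKABE_ITO = {
    "orange": "#E69F00",
    "sky blue": "#56B4E9",
    "bluish green": "#009E73",
    "yellow": "#F0E442",
    "blue": "#0072B2",
    "vermillion": "#D55E00",
    "reddish purple": "#CC79A7",
    "black": "#000000",
}


def _lab_f(t):
    """Piecewise transfer function from the CIE-Lab definition."""
    delta = 6 / 29
    return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29


def hex_to_lab(hex_colour):
    """Convert an sRGB hex string such as '#E69F00' to CIE-Lab (D65 white)."""
    rgb = [int(hex_colour[i:i + 2], 16) / 255 for i in (1, 3, 5)]
    # Undo the sRGB gamma encoding to obtain linear-light values.
    r, g, b = [
        c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4 for c in rgb
    ]
    # Linear sRGB -> CIE XYZ (D65).
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b
    # Normalise by the D65 reference white before the Lab transform.
    fx, fy, fz = _lab_f(x / 0.95047), _lab_f(y / 1.0), _lab_f(z / 1.08883)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def delta_e76(lab1, lab2):
    """Delta E 1976: Euclidean distance between two Lab colours."""
    return sum((a - b) ** 2 for a, b in zip(lab1, lab2)) ** 0.5


def min_pairwise_delta_e(palette):
    """Return (distance, name_a, name_b) for the closest pair in the palette."""
    labs = {name: hex_to_lab(h) for name, h in palette.items()}
    return min(
        (delta_e76(labs[a], labs[b]), a, b) for a, b in combinations(labs, 2)
    )


if __name__ == "__main__":
    distance, a, b = min_pairwise_delta_e(OKABE_ITO)
    print(f"closest pair: {a} / {b}, Delta E76 = {distance:.1f}")
```

A fuller check in the spirit of the guide would first pass each colour through a colour-vision-deficiency simulation and then repeat this distance test per condition, reporting the smallest value found.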