Tools cookbook
Pixie ships with 21 working tools that exercise nearly every input and output type. They’re real (no toy demos) and they double as canonical references when you’re writing your own.
| Category | Tools |
|---|---|
| Finance & quant | backtest-engine, black-scholes-greeks, markowitz-portfolio, stock-monte-carlo, example-compound-interest |
| Machine learning | live-mlp-training, style-transfer, vit-classifier-gradcam |
| NLP | bertopic-modelling, rag-with-citations, sentiment-over-time |
| Audio | coqui-tts, demucs-separation, whisper-transcription |
| Science & dynamics | cellular-automata, lorenz-ode-solver, n-body-simulator |
| Data & statistics | image-segmentation, time-series-forecast, yolo-object-detection |
| Agents | llm-tool-use-agent |
Finance and quant
example-compound-interest
The canonical reference. Computes compound-interest growth on a principal with monthly contributions over a horizon. Inputs: principal, annual rate, years, compounding frequency, monthly contribution, inflation-adjustment toggle. Outputs: final balance (formatted currency), growth chart, year-by-year table. No dependencies beyond FastAPI / uvicorn, so it’s the lightest possible Pixie tool. Read its source first if you’re new.
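The core calculation is a short loop. A minimal sketch, assuming monthly compounding with end-of-month contributions (the function name and signature are illustrative, not the tool's actual API):

```python
def compound_growth(principal: float, annual_rate: float, years: int,
                    monthly_contribution: float = 0.0) -> float:
    """Monthly-compounded growth with a contribution added at each month end."""
    r = annual_rate / 12.0            # per-month rate
    balance = principal
    for _ in range(years * 12):
        balance = balance * (1 + r) + monthly_contribution
    return balance
```

With no contributions this reduces to the familiar closed form `P * (1 + r/12) ** (12 * years)`; with contributions it matches the future-value-of-an-annuity formula.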
backtest-engine
Long-only moving-average crossover strategy on OHLC data. Takes fast/slow MA periods, starting cash, and per-side commission in basis points. Returns an equity curve overlaid against buy-and-hold, per-trade log, and Sharpe / max drawdown / return metrics. Useful for teaching backtesting.
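The signal logic behind a crossover strategy can be sketched in a few lines of numpy; this is an illustrative reimplementation of the idea, not the tool's actual code (it ignores commissions and uses simple moving averages):

```python
import numpy as np

def sma(x: np.ndarray, window: int) -> np.ndarray:
    """Simple moving average; NaN until a full window is available."""
    out = np.full(x.shape, np.nan)
    csum = np.cumsum(np.insert(x, 0, 0.0))
    out[window - 1:] = (csum[window:] - csum[:-window]) / window
    return out

def crossover_positions(close: np.ndarray, fast: int, slow: int) -> np.ndarray:
    """1 = long, 0 = flat: long whenever the fast MA sits above the slow MA.
    Bars without a full slow window stay flat."""
    f, s = sma(close, fast), sma(close, slow)
    pos = np.where(f > s, 1.0, 0.0)   # NaN comparisons are False -> flat
    return pos
```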
black-scholes-greeks
Closed-form European option pricer. Inputs: spot, strike, days to
expiry, risk-free rate, volatility, call/put. Returns price plus all
five first-order Greeks as kv and parametric charts (price vs spot,
price vs days, 2-D heatmap of price across strike × time). Pure
analytical — no approximation.
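The closed form itself fits in a dozen lines; a minimal sketch of the Black-Scholes price (the Greeks follow the same pattern, differentiating with respect to each input):

```python
from math import log, sqrt, exp, erf

def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_price(spot: float, strike: float, t_years: float,
             rate: float, vol: float, call: bool = True) -> float:
    """Closed-form Black-Scholes European option price."""
    d1 = (log(spot / strike) + (rate + 0.5 * vol ** 2) * t_years) / (vol * sqrt(t_years))
    d2 = d1 - vol * sqrt(t_years)
    if call:
        return spot * norm_cdf(d1) - strike * exp(-rate * t_years) * norm_cdf(d2)
    return strike * exp(-rate * t_years) * norm_cdf(-d2) - spot * norm_cdf(-d1)
```

A quick sanity check is put-call parity: `C - P == S - K * exp(-rT)` holds exactly in the closed form.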
markowitz-portfolio
Computes the efficient frontier from a CSV of historical asset returns. Inputs: returns matrix, risk-free rate, sampling frequency, optional target return. Outputs: frontier scatter, tangency portfolio weights (max Sharpe), weights at target return, comparison table. Uses scipy’s constrained minimisation.
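The tool uses scipy's constrained minimiser, but the underlying maths is worth seeing in isolation. A sketch of the unconstrained global minimum-variance portfolio, which has a closed form (shorting allowed; the real tool adds constraints via scipy):

```python
import numpy as np

def min_variance_weights(cov: np.ndarray) -> np.ndarray:
    """Global minimum-variance weights: w is proportional to inv(Sigma) @ 1,
    normalised so the weights sum to 1 (short positions permitted)."""
    ones = np.ones(cov.shape[0])
    w = np.linalg.solve(cov, ones)    # avoids forming the explicit inverse
    return w / w.sum()
```

Intuitively, lower-variance assets receive larger weights; with a diagonal covariance the weights are inversely proportional to each asset's variance.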
stock-monte-carlo
Geometric Brownian motion simulation. Inputs: spot, drift, volatility,
horizon (trading days), path count, seed. Outputs: VaR / CVaR at 95%,
sample paths with percentile bands, per-day percentile table, terminal
distribution moments. Fully vectorised — all paths in one
numpy.random.randn call.
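The vectorised-simulation idea can be sketched as follows (this uses the newer `default_rng` API for seeding rather than the tool's `numpy.random.randn`, and the function names are illustrative):

```python
import numpy as np

def gbm_terminal(spot: float, mu: float, sigma: float,
                 days: int, n_paths: int, seed: int = 0) -> np.ndarray:
    """Terminal prices of GBM paths; every shock for every path/day is
    drawn in a single call, then cumulated along the time axis."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / 252.0
    z = rng.standard_normal((n_paths, days))
    log_paths = np.cumsum((mu - 0.5 * sigma ** 2) * dt
                          + sigma * np.sqrt(dt) * z, axis=1)
    return spot * np.exp(log_paths[:, -1])

def var_95(terminal: np.ndarray, spot: float) -> float:
    """95% value-at-risk of the terminal loss distribution."""
    losses = spot - terminal
    return float(np.percentile(losses, 95))
```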
Machine learning
live-mlp-training
Trains a small MLP on a CSV dataset with user-configurable hidden
sizes, learning rate, batch size, and epoch count. Streams loss and
accuracy curves per epoch — outputs declare streaming: true. Ships
with a pure-numpy fallback so it validates without PyTorch; the torch
runtime is an optional dependency group (uv sync --extra runtime).
style-transfer
Classical Gatys-style style transfer. Inputs: content image, style
image, style strength, iteration count, working image size. Streams the
loss curve per iteration. Pure-numpy / PIL fallback returns a smoothed
content image; torch+torchvision optional path runs the real loss.
concurrent: false because optimisation isn’t thread-safe.
vit-classifier-gradcam
ViT-B/16-224 classification with a Grad-CAM-style attention saliency
overlay (image_compare showing original vs heatmap). Top-K
predictions as kv. Model downloaded on first run (~350 MB).
concurrent: false.
NLP
bertopic-modelling
Topic modelling over a CSV of documents. Primary path: sentence embeddings + HDBSCAN + UMAP. Fallback: TF-IDF + KMeans + TruncatedSVD when BERTopic isn’t available (CI, lightweight environments). Handles tiny corpora (<8 docs) and dimension mismatches without crashing. Outputs: topic labels, top words, document scatter plot, detailed topic table.
rag-with-citations
Embeds uploaded PDFs/text files, retrieves top-k via TF-IDF cosine,
outputs an extractive markdown answer with citation markers, and
streams an LLM-synthesised answer if ANTHROPIC_API_KEY is set.
Without the key, returns only the extractive answer.
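The retrieval step is plain TF-IDF cosine similarity. A self-contained sketch of the idea in pure Python (the tool's real tokenisation and weighting may differ; the smoothed-IDF choice here is an assumption):

```python
import math
from collections import Counter

def tfidf_rank(docs: list[str], query: str, k: int = 2) -> list[int]:
    """Indices of the top-k documents by TF-IDF cosine similarity to the query."""
    tokenised = [d.lower().split() for d in docs]
    n = len(docs)
    df = Counter(t for doc in tokenised for t in set(doc))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}   # smoothed IDF

    def vec(tokens):
        tf = Counter(tokens)
        return {t: tf[t] * idf.get(t, 0.0) for t in tf}

    def cos(a, b):
        dot = sum(a[t] * b.get(t, 0.0) for t in a)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    q = vec(query.lower().split())
    scores = [cos(q, vec(doc)) for doc in tokenised]
    return sorted(range(n), key=lambda i: -scores[i])[:k]
```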
sentiment-over-time
VADER sentiment on a dated CSV. Aggregates per-row scores into rolling windows (configurable days). Outputs: rolling-mean sentiment chart and per-window summary table. No model download — runs fully offline.
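The windowing step reduces to a trailing rolling mean over per-day scores. A numpy sketch, assuming the VADER compound scores (in [-1, 1]) have already been aggregated to one value per day:

```python
import numpy as np

def rolling_mean(scores: np.ndarray, window: int) -> np.ndarray:
    """Trailing-window mean over daily sentiment scores; only full windows
    are returned (output length = len(scores) - window + 1)."""
    kernel = np.ones(window) / window
    return np.convolve(scores, kernel, mode="valid")
```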
Audio
coqui-tts
Offline text-to-speech via pyttsx3 (SAPI on Windows, espeak on Linux,
NSSpeechSynthesizer on macOS). Inputs: text, voice, speech rate. Output:
22 kHz WAV data URL. Runtime typically <5 s. (Folder name kept for
historical reasons; uses pyttsx3 not Coqui.)
demucs-separation
Splits stereo/mono audio into vocals / drums / bass / other stems.
Frequency-band heuristic fallback when Demucs isn’t installed
(centre-channel vs sideband decomposition); swaps in the real model
when it is. concurrent: false. Outputs: four WAV data URLs.
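The centre-channel heuristic rests on mid/side decomposition: vocals are usually mixed to the centre (identical in both channels), while wide instruments differ between channels. A sketch of the reversible transform underlying the fallback:

```python
import numpy as np

def mid_side(left: np.ndarray, right: np.ndarray):
    """Split a stereo pair into mid (centre) and side channels.
    Perfectly reversible: left = mid + side, right = mid - side."""
    mid = 0.5 * (left + right)
    side = 0.5 * (left - right)
    return mid, side
```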
whisper-transcription
CPU-only faster-whisper (tiny model, ~75 MB downloaded on first
run). 8 languages with auto-detect. Streams transcript segments as
they’re decoded. concurrent: false. Includes a naive gap-based
speaker diarisation.
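"Naive gap-based diarisation" means: when the silence between two consecutive segments exceeds a threshold, assume the speaker changed. A sketch of that heuristic (the two-speaker toggle and 1.5 s default are illustrative assumptions, not the tool's exact parameters):

```python
def gap_diarise(segments, gap: float = 1.5):
    """Label (start, end, text) segments with alternating speaker tags
    whenever the inter-segment silence exceeds `gap` seconds."""
    labelled, speaker, prev_end = [], 0, None
    for start, end, text in segments:
        if prev_end is not None and start - prev_end > gap:
            speaker = 1 - speaker          # naive two-speaker toggle
        labelled.append((f"SPEAKER_{speaker}", start, end, text))
        prev_end = end
    return labelled
```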
Science and dynamics
cellular-automata
Conway’s Life or Wolfram 1-D rules (30/90/110/184). Inputs: rule, grid width, generations, seed. Evolved grid rendered to PNG and returned as data URL. Live-cell count plotted over time.
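An elementary-CA generation is just a table lookup indexed by each cell's three-neighbour pattern. A numpy sketch with wrap-around boundaries (the tool's boundary handling may differ):

```python
import numpy as np

def wolfram_step(row: np.ndarray, rule: int) -> np.ndarray:
    """One generation of a Wolfram elementary CA. Each cell's next state is
    bit (4*left + 2*centre + right) of the 8-bit rule number."""
    left, right = np.roll(row, 1), np.roll(row, -1)
    idx = 4 * left + 2 * row + right
    table = (rule >> np.arange(8)) & 1     # rule number unpacked into a lookup table
    return table[idx]
```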
lorenz-ode-solver
Integrates the Lorenz equations for given σ, ρ, β. Outputs: time
series (x/y/z vs time), phase portrait (x vs z), Poincaré section
(z=27 plane crossings). The phase portrait is the canonical
chart_scatter example — series of points, not points at top level.
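The system and a fixed-step integrator are compact enough to sketch directly; this uses classic RK4, which may differ from the tool's actual integrator:

```python
import numpy as np

def lorenz_rhs(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz system."""
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def integrate_rk4(state, dt=0.01, steps=1000):
    """Fixed-step RK4; returns the full (steps + 1, 3) trajectory."""
    traj = np.empty((steps + 1, 3))
    traj[0] = state
    for i in range(steps):
        s = traj[i]
        k1 = lorenz_rhs(s)
        k2 = lorenz_rhs(s + 0.5 * dt * k1)
        k3 = lorenz_rhs(s + 0.5 * dt * k2)
        k4 = lorenz_rhs(s + dt * k3)
        traj[i + 1] = s + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return traj
```

A useful correctness check: at the fixed point C+ = (√(β(ρ−1)), √(β(ρ−1)), ρ−1) the right-hand side vanishes exactly.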
n-body-simulator
Symplectic leapfrog (velocity Verlet) gravitational N-body. Inputs: body count (2–20), mass range, step count, step size, seed. Outputs: final positions, total energy over time, trajectory traces. Softening prevents close-approach blow-up.
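The kick-drift-kick step and the softened pairwise force can be sketched in vectorised numpy (G = 1 units; the softening constant here is an illustrative default, not the tool's):

```python
import numpy as np

def accelerations(pos: np.ndarray, mass: np.ndarray, eps: float = 1e-2) -> np.ndarray:
    """Pairwise gravitational accelerations with Plummer softening eps,
    which prevents the force from blowing up on close approaches."""
    diff = pos[None, :, :] - pos[:, None, :]           # diff[i, j] = r_j - r_i
    dist2 = (diff ** 2).sum(-1) + eps ** 2
    inv3 = dist2 ** -1.5
    np.fill_diagonal(inv3, 0.0)                        # no self-force
    return (diff * (mass[None, :, None] * inv3[:, :, None])).sum(axis=1)

def leapfrog_step(pos, vel, mass, dt):
    """One velocity-Verlet (kick-drift-kick) step."""
    vel = vel + 0.5 * dt * accelerations(pos, mass)
    pos = pos + dt * vel
    vel = vel + 0.5 * dt * accelerations(pos, mass)
    return pos, vel
```

Because the pairwise forces are exactly antisymmetric, total momentum is conserved to floating-point precision, which makes a convenient regression test.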
Data and statistics
image-segmentation
Foreground/background. Pure-numpy Otsu threshold fallback on
luminance; optional rembg path for higher-quality neural
segmentation. Outputs: segmented image (RGBA cutout or grayscale mask)
plus a small statistics table.
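Otsu's method picks the grey level that maximises between-class variance of the histogram. A pure-numpy sketch of the fallback's core idea (an illustrative reimplementation, not the tool's exact code):

```python
import numpy as np

def otsu_threshold(gray: np.ndarray) -> int:
    """Otsu threshold for an 8-bit grayscale image: the level that
    maximises between-class variance of the intensity histogram."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    omega = np.cumsum(p)                       # class-0 (background) probability
    mu = np.cumsum(p * np.arange(256))         # class-0 cumulative mean
    mu_t = mu[-1]                              # global mean
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1 - omega))
    sigma_b[~np.isfinite(sigma_b)] = 0.0       # empty classes contribute nothing
    return int(np.argmax(sigma_b))
```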
time-series-forecast
ARIMA / SARIMA / Holt-Winters / auto-ARIMA. Inputs: CSV (date + value), horizon, model, seasonality, confidence level. Outputs: history + forecast + confidence band line chart, forecast table, backtest metrics (MAE, RMSE, MAPE).
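The backtest metrics are standard definitions; a small numpy sketch (MAPE assumes no zero values in the actuals):

```python
import numpy as np

def backtest_metrics(actual: np.ndarray, forecast: np.ndarray) -> dict:
    """MAE, RMSE, and MAPE (in %) of a held-out forecast."""
    err = actual - forecast
    return {
        "mae": float(np.mean(np.abs(err))),
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "mape": float(np.mean(np.abs(err / actual)) * 100.0),
    }
```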
yolo-object-detection
YOLOv8 (Ultralytics) on an input image. Model downloaded on first run. Outputs: annotated image with bounding boxes, detection table (class, confidence, bbox), summary card (top 3 classes).
Agents
llm-tool-use-agent
A conversational agent with calculator and sandboxed web-search stubs.
Routes through Claude 3.5 Haiku if ANTHROPIC_API_KEY is set; uses a
deterministic local intent matcher otherwise, so it always works
offline. Streams the reply token-by-token; tool calls land in a log
output.
What you can learn from each
| If you want to learn… | Read this tool |
|---|---|
| The simplest possible Pixie tool | example-compound-interest |
| File upload as input | backtest-engine, image-segmentation |
| Multiple chart types in one tool | black-scholes-greeks |
| Streaming text output | whisper-transcription, llm-tool-use-agent |
| Streaming chart output | live-mlp-training, style-transfer |
| Optional dependency groups with a fallback | bertopic-modelling, live-mlp-training |
| Optional API key with an offline fallback | rag-with-citations, llm-tool-use-agent |
| chart_scatter with the correct series shape | lorenz-ode-solver |
| Image output (PNG data URL) | cellular-automata |
| Audio output | coqui-tts, demucs-separation |
| concurrent: false for thread-unsafe model state | every ML/NLP/audio tool |
| Reference fixtures | most tools have a reference/ folder |
If you’re stuck while authoring your own tool, the closest cookbook entry is almost always a copy-pasteable starting point.