penpal-quality-assurance
A small MLX ResNet that scores a 256×256 grayscale handwriting raster on
[0, 1]: 1 = legible, human-style handwriting, 0 = corrupted or
illegible output. Trained to filter synthetic handwriting produced by
Graves-style generative models before it's used downstream.
- ~36k parameters (channels 4 / 8 / 16 / 32 / 32)
- Single-file safetensors weights
- Apple Silicon / MLX
Inputs
- Shape [B, 256, 256, 1] (MLX NHWC), float32
- 0.0 = background, 1.0 = ink
- The renderer in render.py (or graves_handwriting_mlx.quality.render_strokes) fits each stroke bbox isotropically into the canvas with 12 px padding
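The isotropic fit described above can be sketched in plain Python. This is not the code in render.py: the function name `fit_isotropic`, the centering of the scaled bounding box, and the handling of a single-point stroke are all assumptions made for illustration. Only the 256 px canvas, the 12 px padding, and the single shared scale factor come from the description.

```python
def fit_isotropic(points, canvas=256, padding=12):
    """Map (x, y) points into a square canvas while preserving aspect ratio.

    Hypothetical helper: a sketch of an isotropic bbox fit, not render.py.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    width, height = max(xs) - min_x, max(ys) - min_y

    usable = canvas - 2 * padding  # 232 px with the defaults
    longest = max(width, height)
    # One scale factor for both axes, so the handwriting is never stretched.
    # Assumption: a degenerate (single-point) bbox is left unscaled.
    scale = usable / longest if longest > 0 else 1.0

    # Assumption: the scaled bbox is centered on the canvas.
    off_x = (canvas - width * scale) / 2
    off_y = (canvas - height * scale) / 2
    return [((x - min_x) * scale + off_x, (y - min_y) * scale + off_y)
            for x, y in points]
```

Because the longer bbox side always fills the usable area, longer texts are drawn with a smaller scale factor than short ones.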
Output
Raw logits. Apply mx.sigmoid for a probability in [0, 1].
Usage
With the graves-handwriting-mlx package installed:
import mlx.core as mx
from graves_handwriting_mlx.quality import QualityClassifier, render_strokes
model = QualityClassifier.from_pretrained("breitburg/penpal-quality-assurance")
# `strokes` is the project's nested word -> stroke -> point schema
image = render_strokes(strokes) # [256, 256, 1]
score = mx.sigmoid(model(mx.array(image)[None]))[0] # float in [0, 1]
Without the package, download the weights directly:
from huggingface_hub import hf_hub_download
weights_path = hf_hub_download("breitburg/penpal-quality-assurance", "weights.safetensors")
Training data
Real (label 1.0) and corrupted-synthetic (label 0.0) strokes are
rasterized through the same renderer so the classifier cannot use
rendering style as a shortcut.
- Positive: real human handwriting strokes (IAM-OnDB-derived collections)
- Negative: strokes generated by the Graves model with internal state corruption applied during sampling (attention κ scale, attention β floor, hidden-state Gaussian noise) in a 10 / 70 / 20 mixture of very mild / mild / gibberish corruption ranges
- Mid (label 0.5): clean samples from breitburg/penpal, which sit between the real and corrupted clusters
Loss is BCE-with-logits over the soft {0.0, 0.5, 1.0} labels.
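For reference, a per-sample BCE-with-logits loss with a soft target can be written as below. This is a minimal sketch in pure Python using the standard numerically stable formulation; the function name `bce_with_logits` is hypothetical and the actual training code may differ (for example in reduction or batching).

```python
import math

def bce_with_logits(logit, target):
    """Binary cross-entropy on a raw logit, with target in [0, 1].

    Stable form of -(t * log(s) + (1 - t) * log(1 - s)), s = sigmoid(logit):
        max(z, 0) - z * t + log(1 + exp(-|z|))
    """
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))

# With a soft 0.5 target the loss is smallest at logit 0 (probability 0.5),
# so such samples are pulled toward the middle of the score range.
mid_losses = {z: bce_with_logits(z, 0.5) for z in (-1.0, 0.0, 1.0)}
```

With hard targets (0.0 or 1.0) the loss keeps decreasing as the logit moves toward the correct extreme, whereas a 0.5 target has a finite minimum of log 2 at logit 0.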
Evaluation
Distribution of scores on 500 random rows from each source:
| Source | Mean | Median | p10 | p25 | p75 | p90 | ≥0.3 | ≥0.5 | ≥0.7 | ≥0.9 |
|---|---|---|---|---|---|---|---|---|---|---|
| held-out real handwriting | 0.675 | 0.669 | 0.390 | 0.500 | 0.881 | 0.969 | 96.4 % | 75.0 % | 46.0 % | 22.8 % |
| breitburg/penpal (clean synthetic) | 0.418 | 0.396 | 0.321 | 0.352 | 0.452 | 0.529 | 100 % | 13.8 % | 3.0 % | 0.6 % |
The lowest-scoring penpal rows are genuinely degraded; the highest-scoring rows look indistinguishable from real handwriting. A residual length / scale bias exists (longer texts render smaller and tend to score lower). This is acceptable for filtering, but worth knowing.
Suggested thresholds
- 0.3 (lenient): keeps essentially all of penpal, drops only the obvious failures
- 0.5 (balanced): drops ~86 % of penpal, keeps 75 % of real
- 0.7 (strict): keeps only confidently human-looking rows (~46 % of real)
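Since the model returns raw logits and the sigmoid is monotonic, a probability threshold can also be applied directly in logit space. The sketch below is plain Python with hypothetical helper names (`logit_threshold`, `keep_mask`); it assumes the logits have already been pulled out of MLX into ordinary floats.

```python
import math

def logit_threshold(p):
    """Logit value whose sigmoid equals probability p (0 < p < 1)."""
    return math.log(p / (1.0 - p))

def keep_mask(logits, threshold=0.5):
    """True for each sample whose sigmoid(logit) is at least `threshold`."""
    cut = logit_threshold(threshold)
    # sigmoid(z) >= p  is equivalent to  z >= log(p / (1 - p))
    return [z >= cut for z in logits]

# Example: the strict 0.7 threshold corresponds to a logit of about 0.85.
strict_mask = keep_mask([-2.0, 0.0, 1.0, 3.0], threshold=0.7)
```

Filtering on probabilities after mx.sigmoid gives the same result. The logit form only saves the extra activation.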
Files
- weights.safetensors: trained parameters
- config.json: architecture widths and input contract
- model.py: QualityClassifier / BasicBlock reference implementation
- render.py: render_strokes for stroke → 256×256 raster
License
MIT.