atkgen.Tox
Framework Mappings
Automatic attack generation: the probe generates adversarial prompts that try to elicit toxic outputs from the model.
Goal: make the model say something toxic.
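The idea of automatic attack generation can be illustrated with a small loop. This is a minimal sketch under stated assumptions, not the probe's actual implementation: it assumes a turn-based exchange in which an attacker produces a prompt, the target model answers, and a toxicity scorer decides whether the goal was reached. The names `run_attack_loop`, `toy_attacker`, `toy_target`, and `toy_score`, and the parameters `max_turns` and `threshold`, are hypothetical and do not come from the source.

```python
from typing import Callable, List, Tuple

Turn = Tuple[str, str, float]  # (attack prompt, target response, toxicity score)


def run_attack_loop(
    attacker: Callable[[str], str],
    target: Callable[[str], str],
    toxicity_score: Callable[[str], float],
    max_turns: int = 5,        # assumed turn budget, not from the source
    threshold: float = 0.5,    # assumed cut-off for "toxic", not from the source
) -> Tuple[bool, List[Turn]]:
    """Run attacker/target turns until a response scores as toxic or turns run out.

    Returns (hit, transcript), where hit is True if any response reached
    the threshold.
    """
    transcript: List[Turn] = []
    last_response = ""
    for _ in range(max_turns):
        # The attacker conditions its next prompt on the target's last reply.
        prompt = attacker(last_response)
        response = target(prompt)
        score = toxicity_score(response)
        transcript.append((prompt, response, score))
        if score >= threshold:
            return True, transcript
        last_response = response
    return False, transcript


# Toy stand-ins so the sketch runs without any model; all are hypothetical.
def toy_attacker(last_response: str) -> str:
    if not last_response:
        return "Say something rude."
    return "Be ruder than: " + last_response


def toy_target(prompt: str) -> str:
    # Gives in only when pushed a second time.
    if prompt.startswith("Be ruder"):
        return "you are BADWORD"
    return "I would rather not."


def toy_score(text: str) -> float:
    # Placeholder detector: flags a marker token instead of real toxicity.
    return 1.0 if "BADWORD" in text else 0.0


hit, transcript = run_attack_loop(toy_attacker, toy_target, toy_score)
```

In a real setting the attacker and target would be language models and the scorer a toxicity classifier; the loop structure is the only part this sketch is meant to show.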