c86d1c195e
- handle loading models with different names for the safetensors files (gemma) - handle merge tokens that can't be split - organize code into Load/Evaluate