GitHub / mostlygeek / llama-swap
Model swapping for llama.cpp (or any local OpenAI-compatible server)
JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mostlygeek%2Fllama-swap
PURL: pkg:github/mostlygeek/llama-swap
Stars: 1,396
Forks: 84
Open issues: 11
License: MIT
Language: Go
Size: 1.92 MB
Dependencies parsed at: Pending
Created at: 11 months ago
Updated at: 7 days ago
Pushed at: 7 days ago
Last synced at: 7 days ago
Commit Stats
Commits: 36
Authors: 1
Mean commits per author: 36.0
Development Distribution Score: 0.0
More commit stats: https://commits.ecosyste.ms/hosts/GitHub/repositories/mostlygeek/llama-swap
Topics: golang, llama, llamacpp, localllama, localllm, openai, openai-api, vllm
v156
Changelog
- 831a90d3b0f5aa0fda842e79e6f5d213b5a766cc Add different timeout scenarios to Process.checkHealthEndpoint #276 (#278)
- 977f1856bbca9b6fdf86159db1d8147b039da94f add /completion endpoint (#275)
- 52b329f7bc8124d35b21b2181c363879918e5a9b Fix #277 race condition in ProcessGroup.ProxyRequest when swap=true
v151
Changes:
Pre-loading of models with hooks
- Using hooks.on_startup.preload, a set of models can be automatically started on startup.
# hooks: a dictionary of event triggers and actions
# - optional, default: empty dictionary
# - the only supported hook is on_startup
hooks:
  # on_startup: a dictionary of actions to perform on startup
  # - optional, default: empty dictionary
  # - the only supported action is preload
  on_startup:
    # preload: a list of model ids to load on startup
    # - optional, default: empty list
    # - model names must match keys in the models section
    # - when preloading multiple models at once, define a group,
    #   otherwise models will be loaded and swapped out
    preload:
      - "llama"
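The two constraints in the comments above (preload ids must match keys in the models section, and several preloaded models need a group) can be checked before starting the server. The following is a minimal sketch, not part of llama-swap itself: `check_preload` is a hypothetical helper, it assumes the config has already been parsed into a Python dictionary, and the `groups` layout with a `members` list per group is an assumption that is not shown in these release notes.

```python
def check_preload(config):
    """Return a list of human-readable problems with hooks.on_startup.preload."""
    problems = []
    models = config.get("models") or {}
    hooks = config.get("hooks") or {}
    on_startup = hooks.get("on_startup") or {}
    preload = on_startup.get("preload") or []

    # Every preloaded id must be a key in the models section.
    for model_id in preload:
        if model_id not in models:
            problems.append(f"preload id {model_id!r} is not defined in models")

    # With several preloaded models, look for one group that contains all of them.
    # Assumption: groups is a mapping of group name -> {"members": [model ids]}.
    if len(preload) > 1:
        groups = config.get("groups") or {}
        shared = any(
            set(preload) <= set((group or {}).get("members") or [])
            for group in groups.values()
        )
        if not shared:
            problems.append(
                "multiple models are preloaded but no group contains all of them"
            )
    return problems


# Example: a config equivalent to the snippet above, already parsed into a dict.
config = {
    "models": {"llama": {}, "qwen": {}},
    "hooks": {"on_startup": {"preload": ["llama"]}},
}
print(check_preload(config))  # prints []
```

An empty list means neither of the two checks found a problem; it does not guarantee that the server accepts the configuration.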
Prompt Processing Metrics added to Activities page in UI
Changelog
- 5dc6b3e6d936893fa45e98dadb6a829416f4b2d0 Add barebones but working implementation of model preload (#209, #235)
- 74c69f39ef61937ac82a6a3fdae8bb27d5e0418a Add prompt processing metrics (#250)
- a1863188924485ae242f7a6b940720744a4215ae Update Readme, Add screenshot for Activities page [skip ci]
- c4e4d5e1e953d58f6beca9781b32f5284bb8ff87 Update Readme UI Screenshot [skip ci]
v150
Changelog
- 7985e94ba4f9c93e8d16da00cdc6f9a8da6db04b add tokens processed to ui models page
- 74556c3a36a75f9560bb76d9f42935801cec6a84 Update bug-report.md [skip ci]
- 5c381e4b3075e6e0271c5df4e343b385793049a4 Add gofmt linting to ci
- 10569ed5464ad9def7bdc722b440b78d7386e4b0 Fix model alias usage in upstream path (#230)
- 5b10b3c23f7e6258a6fced7ceba3e10a39f24c58 UI Tweaks (#228)
v149
Changelog
- 45ea792a3aa053486d1cc77ccc79e610cdbed0b7 Fix UI panel not saving position correctly
- 1bc2802353d11c7d259c80fa1b2d5517126da2bf fix panels not saving sizing state
- 701476c0c47ed9587003b33d30a39fc19ebd187b Update README.md - remove contributor block [skip ci]
- 5c63e0066c0ab162a76e33c6591c1dfe35ada3d5 return models sorted by id in /v1/models (#222)
- 8be5073c51b2c3341151e0e8645deda06135c5c1 Fix typo (#223) [skip ci]
- 6307bd32054f718bd25398a8e20ec81363a7467e Add support for building Linux ARM64 binary in Makefile (#221)