v0.11.0
Release date: 2024-05-11 17:41:09
What's new in 0.11.0 (2024-05-11)
These are the changes in inference v0.11.0.
New features
- FEAT: Support Mixtral-8x22b-instruct-v0.1 by @qinxuye in https://github.com/xorbitsai/inference/pull/1340
- feat: add phi-3-mini series by @orangeclk in https://github.com/xorbitsai/inference/pull/1379
- FEAT: add Starling model by @boy-hack in https://github.com/xorbitsai/inference/pull/1384
- FEAT: support qwen1.5 110b by @qinxuye in https://github.com/xorbitsai/inference/pull/1388
- FEAT: Support query engine with cmdline by @Ago327 in https://github.com/xorbitsai/inference/pull/1380
- FEAT: Ascend support by @qinxuye in https://github.com/xorbitsai/inference/pull/1408
- FEAT: Audio support verbose_json and timestamp by @codingl2k1 in https://github.com/xorbitsai/inference/pull/1402
- FEAT: [UI] Add engine option when launching LLM by @yiboyasss in https://github.com/xorbitsai/inference/pull/1456
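The audio feature above (#1402) brings the transcription endpoint closer to the OpenAI-style `verbose_json` response, which carries per-segment timestamps alongside the full text. A minimal sketch of what such a response looks like; the field names follow OpenAI's audio API shape, and xinference's exact output may differ:

```python
# Sketch of a `verbose_json` transcription response with segment timestamps
# (illustrative values; field names follow the OpenAI audio API convention).
response = {
    "task": "transcribe",
    "language": "english",
    "duration": 3.2,
    "text": "hello world",
    "segments": [
        {"id": 0, "start": 0.0, "end": 1.5, "text": "hello"},
        {"id": 1, "start": 1.5, "end": 3.2, "text": "world"},
    ],
}

# Segment timestamps let callers align transcribed text with audio spans.
for seg in response["segments"]:
    print(f'{seg["start"]:.1f}-{seg["end"]:.1f}: {seg["text"]}')
```

Compared with the plain `json` format, `verbose_json` is what subtitle generation and audio-search use cases typically need.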
Enhancements
- ENH: add custom image model by @amumu96 in https://github.com/xorbitsai/inference/pull/1312
- ENH: Support more quantization with VLLM by @amumu96 in https://github.com/xorbitsai/inference/pull/1372
- ENH: Update chatglm3 6b model version by @codingl2k1 in https://github.com/xorbitsai/inference/pull/1401
- ENH: make qwen_vl support streaming output by @Minamiyama in https://github.com/xorbitsai/inference/pull/1425
- ENH: Remove the max tokens limitation and boost performance by avoiding unnecessary repeated CUDA device detection by @mikeshi80 in https://github.com/xorbitsai/inference/pull/1429
- ENH: Improve benchmark and add long context generate by @frostyplanet in https://github.com/xorbitsai/inference/pull/1423
- ENH: make yi_vl support streaming output by @Minamiyama in https://github.com/xorbitsai/inference/pull/1443
- ENH: Some minor changes by @frostyplanet in https://github.com/xorbitsai/inference/pull/1453
- ENH: make deepseek_vl support streaming output by @Minamiyama in https://github.com/xorbitsai/inference/pull/1444
- ENH: Rename `model_engine` for a clearer inference backend name by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1466
- BLD: Use self-hosted AWS machine to build Docker image by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1405
- CLN: Remove actor client by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1436
- CLN: Remove all speculative-related codes by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1435
- REF: Query for engine by @Ago327 in https://github.com/xorbitsai/inference/pull/1342
- REF: [UI] Refactor register model by @yiboyasss in https://github.com/xorbitsai/inference/pull/1368
- REF: Add the `model_engine` parameter for the launching process by @hainaweiben in https://github.com/xorbitsai/inference/pull/1367
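Several changes in this release (#1342, #1367, #1456, #1466) revolve around choosing the inference backend explicitly via `model_engine` when launching a model. As a minimal sketch of what a launch request could carry, with the model name and other values assumed for illustration rather than taken from the PRs:

```python
import json

# Hypothetical launch request pinning the inference backend via the new
# `model_engine` field; treat the schema as an illustration, not the exact API.
launch_request = {
    "model_name": "qwen1.5-chat",   # assumed model name
    "model_engine": "vllm",         # new in 0.11.0: explicit backend choice
    "model_format": "pytorch",
    "size_in_billions": 7,
}

print(json.dumps(launch_request, indent=2))
```

Making the backend an explicit parameter removes the guesswork of which engine (e.g. vLLM vs. the default transformers path) will serve a given model.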
Bug fixes
- BUG: Fix llama3-instruct 70B filename error by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1370
- BUG: Fix error when there is no `role: user` message or the content is empty by @liuzhenghua in https://github.com/xorbitsai/inference/pull/1378
- BUG: fix file template of andrewcanis/c4ai-command-r-v01-GGUF by @emulated24 in https://github.com/xorbitsai/inference/pull/1389
- BUG: Fix using extra GPUs due to match in `__init__` by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1400
- BUG: Fix qwen tool call parameter empty issue by @codingl2k1 in https://github.com/xorbitsai/inference/pull/1381
- BUG: Fix tool calls return invalid usage by @codingl2k1 in https://github.com/xorbitsai/inference/pull/1420
- BUG: Fix tools ability by @mikeshi80 in https://github.com/xorbitsai/inference/pull/1447
- BUG: Fix install error on macOS due to `auto-gptq` by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1457
- BUG: Fix some issues in query engine interface by @Ago327 in https://github.com/xorbitsai/inference/pull/1442
Tests
- TST: Pin `huggingface-hub` to pass CI since it has some breaking changes by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1427
Documentation
- DOC: update readme & fix Mac CI by @qinxuye in https://github.com/xorbitsai/inference/pull/1385
- DOC: Worker address should be specified for `xinference-worker` by @amumu96 in https://github.com/xorbitsai/inference/pull/1397
- DOC: Update Docker doc on using xinference by @qinxuye in https://github.com/xorbitsai/inference/pull/1417
- DOC: add the missing backslash in shell command by @mikeshi80 in https://github.com/xorbitsai/inference/pull/1451
- DOC: Usage of `model_engine` by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1468
Others
- BUG: Fix metrics being empty when calling `/v1/chat/completions` by @amumu96 in https://github.com/xorbitsai/inference/pull/1406
New Contributors
- @liuzhenghua made their first contribution in https://github.com/xorbitsai/inference/pull/1378
- @emulated24 made their first contribution in https://github.com/xorbitsai/inference/pull/1389
- @orangeclk made their first contribution in https://github.com/xorbitsai/inference/pull/1379
- @boy-hack made their first contribution in https://github.com/xorbitsai/inference/pull/1384
- @frostyplanet made their first contribution in https://github.com/xorbitsai/inference/pull/1423
Full Changelog: https://github.com/xorbitsai/inference/compare/v0.10.3...v0.11.0