v0.11.0
Release date: 2024-05-11 17:41:09
What's new in 0.11.0 (2024-05-11)
These are the changes in inference v0.11.0.
New features
- FEAT: Support Mixtral-8x22b-instruct-v0.1 by @qinxuye in https://github.com/xorbitsai/inference/pull/1340
- feat: add phi-3-mini series by @orangeclk in https://github.com/xorbitsai/inference/pull/1379
- FEAT: add Starling model by @boy-hack in https://github.com/xorbitsai/inference/pull/1384
- FEAT: support qwen1.5 110b by @qinxuye in https://github.com/xorbitsai/inference/pull/1388
- FEAT: Support query engine with cmdline by @Ago327 in https://github.com/xorbitsai/inference/pull/1380
- FEAT: Ascend support by @qinxuye in https://github.com/xorbitsai/inference/pull/1408
- FEAT: Audio support verbose_json and timestamp by @codingl2k1 in https://github.com/xorbitsai/inference/pull/1402
- FEAT: [UI] Add engine option when launching LLM by @yiboyasss in https://github.com/xorbitsai/inference/pull/1456
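The audio feature above (#1402) brings the transcription endpoint closer to the OpenAI-style `verbose_json` response, which carries per-segment timestamps alongside the full text. A minimal sketch of what such a response looks like; the field names follow OpenAI's audio API shape, and xinference's exact output may differ:

```python
# Sketch of a `verbose_json` transcription response with segment timestamps
# (illustrative values; field names follow the OpenAI audio API convention).
response = {
    "task": "transcribe",
    "language": "english",
    "duration": 3.2,
    "text": "hello world",
    "segments": [
        {"id": 0, "start": 0.0, "end": 1.5, "text": "hello"},
        {"id": 1, "start": 1.5, "end": 3.2, "text": "world"},
    ],
}

# Segment timestamps let callers align transcribed text with audio spans.
for seg in response["segments"]:
    print(f'{seg["start"]:.1f}-{seg["end"]:.1f}: {seg["text"]}')
```

Compared with the plain `json` format, `verbose_json` is what subtitle generation and audio-search use cases typically need.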
Enhancements
- ENH: add custom image model by @amumu96 in https://github.com/xorbitsai/inference/pull/1312
- ENH: Support more quantization with VLLM by @amumu96 in https://github.com/xorbitsai/inference/pull/1372
- ENH: Update chatglm3 6b model version by @codingl2k1 in https://github.com/xorbitsai/inference/pull/1401
- ENH: make qwen_vl support streaming output by @Minamiyama in https://github.com/xorbitsai/inference/pull/1425
- ENH: Remove the max tokens limitation and boost performance by avoiding unnecessary repeated CUDA device detection by @mikeshi80 in https://github.com/xorbitsai/inference/pull/1429
- ENH: Improve benchmark and add long context generate by @frostyplanet in https://github.com/xorbitsai/inference/pull/1423
- ENH: make yi_vl support streaming output by @Minamiyama in https://github.com/xorbitsai/inference/pull/1443
- ENH: Some minor changes by @frostyplanet in https://github.com/xorbitsai/inference/pull/1453
- ENH: make deepseek_vl support streaming output by @Minamiyama in https://github.com/xorbitsai/inference/pull/1444
- ENH: Rename `model_engine` for a clearer inference backend name by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1466
- BLD: Use self-hosted AWS machine to build Docker image by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1405
- CLN: Remove actor client by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1436
- CLN: Remove all speculative-related codes by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1435
- REF: Query for engine by @Ago327 in https://github.com/xorbitsai/inference/pull/1342
- REF: [UI] Refactor register model by @yiboyasss in https://github.com/xorbitsai/inference/pull/1368
- REF: Add the `model_engine` parameter for the launching process by @hainaweiben in https://github.com/xorbitsai/inference/pull/1367
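Several changes in this release (#1342, #1367, #1456, #1466) revolve around choosing the inference backend explicitly via `model_engine` when launching a model. As a minimal sketch of what a launch request could carry, with the model name and other values assumed for illustration rather than taken from the PRs:

```python
import json

# Hypothetical launch request pinning the inference backend via the new
# `model_engine` field; treat the schema as an illustration, not the exact API.
launch_request = {
    "model_name": "qwen1.5-chat",   # assumed model name
    "model_engine": "vllm",         # new in 0.11.0: explicit backend choice
    "model_format": "pytorch",
    "size_in_billions": 7,
}

print(json.dumps(launch_request, indent=2))
```

Making the backend an explicit parameter removes the guesswork of which engine (e.g. vLLM vs. the default transformers path) will serve a given model.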
Bug fixes
- BUG: Fix llama3-instruct 70B filename error by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1370
- BUG: Fix error when there is no `role: user` message or the content is empty by @liuzhenghua in https://github.com/xorbitsai/inference/pull/1378
- BUG: fix file template of andrewcanis/c4ai-command-r-v01-GGUF by @emulated24 in https://github.com/xorbitsai/inference/pull/1389
- BUG: Fix using extra GPUs due to match in `__init__` by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1400
- BUG: Fix qwen tool call parameter empty issue by @codingl2k1 in https://github.com/xorbitsai/inference/pull/1381
- BUG: Fix tool calls return invalid usage by @codingl2k1 in https://github.com/xorbitsai/inference/pull/1420
- BUG: Fix tools ability by @mikeshi80 in https://github.com/xorbitsai/inference/pull/1447
- BUG: Fix install error on macOS due to `auto-gptq` by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1457
- BUG: Fix some issues in query engine interface by @Ago327 in https://github.com/xorbitsai/inference/pull/1442
Tests
- TST: Pin `huggingface-hub` to pass CI since it has some breaking changes by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1427
Documentation
- DOC: update readme & fix Mac CI by @qinxuye in https://github.com/xorbitsai/inference/pull/1385
- DOC: Worker address should be specified for `xinference-worker` by @amumu96 in https://github.com/xorbitsai/inference/pull/1397
- DOC: Update Docker doc on using xinference by @qinxuye in https://github.com/xorbitsai/inference/pull/1417
- DOC: add the missing backslash in shell command by @mikeshi80 in https://github.com/xorbitsai/inference/pull/1451
- DOC: Usage of `model_engine` by @ChengjieLi28 in https://github.com/xorbitsai/inference/pull/1468
Others
- BUG: Fix metrics being empty when calling `/v1/chat/completions` by @amumu96 in https://github.com/xorbitsai/inference/pull/1406
New Contributors
- @liuzhenghua made their first contribution in https://github.com/xorbitsai/inference/pull/1378
- @emulated24 made their first contribution in https://github.com/xorbitsai/inference/pull/1389
- @orangeclk made their first contribution in https://github.com/xorbitsai/inference/pull/1379
- @boy-hack made their first contribution in https://github.com/xorbitsai/inference/pull/1384
- @frostyplanet made their first contribution in https://github.com/xorbitsai/inference/pull/1423
Full Changelog: https://github.com/xorbitsai/inference/compare/v0.10.3...v0.11.0