
Pull requests: HabanaAI/vllm-fork

Add interleave sliding window by using fusedsdpa kernel.
#725 opened Jan 22, 2025 by libinta
Allow tests to run in t.compile
#724 opened Jan 22, 2025 by Kacper-Pietkun
Delayed sampling
#720 opened Jan 22, 2025 by mfylcek Draft
Rebase 2025.01.21 rebase
#714 opened Jan 21, 2025 by kzawora-intel
Fix LoRA test
#711 opened Jan 21, 2025 by SanjuCSudhakaran
multi-image support for llama3.2
#705 opened Jan 20, 2025 by yma11
add force_greedy_sample
#704 opened Jan 20, 2025 by jikunshang
Rebase 2025-01-19
#703 opened Jan 19, 2025 by kzawora-intel
Add pip upgrade to installation steps
#699 opened Jan 17, 2025 by michalkuligowski
[WIP] Merge lazy and t.compile jenkins tests
#693 opened Jan 16, 2025 by afierka-intel
Enabled and optimized GLM-4v-9b on Gaudi
#691 opened Jan 16, 2025 by gyou2021
Bump jinja2 from 3.1.4 to 3.1.5 (dependencies)
#679 opened Jan 12, 2025 by dependabot bot
add renormalize param for FusedMOE
#671 opened Jan 9, 2025 by tangleintel
Draft: Delayed prompts
#659 opened Dec 20, 2024 by kamil-kaczor Draft
Chunked Prefill
#656 opened Dec 20, 2024 by hlahkar Draft