jp6/cu126/: minference versions
Because this project isn't in the mirror_whitelist, no releases from root/pypi are included.
The latest version on this stage is 0.1.6.0.
To speed up long-context LLM inference, MInference computes attention with approximate, dynamic sparse methods, reducing inference latency by up to 10x for pre-filling on an A100 while maintaining accuracy.
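For context, here is a minimal usage sketch based on the patching API described in the upstream MInference project's README; the model name is illustrative only and is not part of this index page:

```python
# Sketch: patching a Hugging Face causal LM with MInference's
# approximate dynamic sparse attention, per the upstream README.
# The model name below is illustrative, not prescribed by this page.
from transformers import AutoModelForCausalLM, AutoTokenizer
from minference import MInference

model_name = "gradientai/Llama-3-8B-Instruct-262k"  # illustrative long-context model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

# Replace the model's attention with MInference's sparse kernels;
# pre-filling of long prompts then runs through the patched attention.
minference_patch = MInference("minference", model_name)
model = minference_patch(model)
```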
| Index | Version | Documentation |
|---|---|---|
| jp6/cu126 | 0.1.6.0 | |
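To install this build, pip can be pointed at the jp6/cu126 index, e.g. `pip install minference==0.1.6.0 --index-url https://<your-mirror>/jp6/cu126/+simple/`; the mirror host here is a placeholder, substitute your stage's actual URL.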