Bug description
I modified generate/base.py so that one inference run uses the KV cache and the other does not.
Both runs use temperature=0, top_k=None and top_p=0,
the same seed,
and the same model (qwen2.5-0.5b-instruct).
The only change I made is in the function generate_fn:
    if prefill_token:
        tmp_x = token.view(1, -1)
    else:
        tmp_x = torch.cat(all_tokens, dim=0).view(1, -1)
    token = next_token(
        model,
        input_pos=None,
        x=tmp_x,
        input_pos_maxp1=None,
        temperature=temperature,
        top_k=top_k,
        top_p=top_p,
        prefill_token=prefill_token,
        count=count,
    )
However, I found that the logits differ between the two runs.
I'm not sure whether this difference is expected.
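
To make the size of the difference concrete, the two sets of last-position logits could be compared along these lines. This is only a minimal sketch: `compare_logits` and the `atol` tolerance are hypothetical names chosen for illustration, not part of LitGPT, and the numbers in the example are made up rather than measured. It works on plain Python sequences, so a tensor would first need to be converted, for example with `.tolist()`.

```python
def compare_logits(logits_a, logits_b, atol=1e-3):
    """Compare two equally sized logit vectors given as sequences of floats.

    Hypothetical helper: reports the largest element-wise gap, whether it is
    within `atol` (an assumed tolerance), and whether greedy decoding would
    pick the same token from both vectors.
    """
    if len(logits_a) != len(logits_b):
        raise ValueError("logit vectors must have the same length")
    if len(logits_a) == 0:
        raise ValueError("logit vectors must not be empty")

    # Largest absolute element-wise difference between the two runs.
    max_abs_diff = max(abs(a - b) for a, b in zip(logits_a, logits_b))

    # Index of the highest logit in each vector (the greedy token choice).
    argmax_a = max(range(len(logits_a)), key=logits_a.__getitem__)
    argmax_b = max(range(len(logits_b)), key=logits_b.__getitem__)

    return {
        "max_abs_diff": max_abs_diff,
        "within_tolerance": max_abs_diff <= atol,
        "same_argmax": argmax_a == argmax_b,
    }


# Illustrative values only, not taken from an actual run.
with_cache = [2.0, -1.0, 0.5]
without_cache = [2.0004, -1.0002, 0.4999]
result = compare_logits(with_cache, without_cache)
print(result)
```

If `same_argmax` is true at every step, greedy decoding (temperature=0) produces the same tokens in both runs even when the raw logits are not bit-identical.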
Reproduced in studio
No response
What operating system are you using?
Unknown
LitGPT Version
No response