Local AI model from Facebook, leaked. Runs offline and on the CPU. https://github.com/ggerganov/llama.cpp
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
mkdir build && cd build
cmake ..
cmake --build . --config Release
main.exe -m <path-to-model> -p <prompt>
Get models from Hugging Face (https://huggingface.co). Usually all GGML models work. GGML is named after its creator Georgi Gerganov (GG) plus ML for Machine Learning.
See also llama-cpp-python
pip install llama-cpp-python[server]
python -m llama_cpp.server --model <model-path> --host 0.0.0.0 --port 8000
curl -X POST -H "accept: application/json" -H "Content-Type: application/json" -d "{ \"prompt\": \"\n\n### Instructions:\nWhat is general relativity?\n\n### Response:\", \"stop\": [ \"###\" ], \"max_tokens\": 4096 }" http://192.168.0.51:8000/v1/completions
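The same request can be made from Python with only the standard library. A minimal sketch, assuming the server above is reachable at 192.168.0.51:8000 (adjust BASE_URL to match your --host/--port) and returns the OpenAI-style completions schema:

```python
import json
import urllib.request

# Assumption: matches the host/port you passed to llama_cpp.server.
BASE_URL = "http://192.168.0.51:8000"

def build_completion_payload(question: str, max_tokens: int = 4096) -> dict:
    """Build the JSON body for the /v1/completions endpoint,
    using the same Instructions/Response prompt template as the curl example."""
    return {
        "prompt": f"\n\n### Instructions:\n{question}\n\n### Response:",
        "stop": ["###"],  # stop when the model starts the next section marker
        "max_tokens": max_tokens,
    }

def complete(question: str) -> str:
    """POST the payload and return the generated text."""
    data = json.dumps(build_completion_payload(question)).encode("utf-8")
    req = urllib.request.Request(
        f"{BASE_URL}/v1/completions",
        data=data,
        headers={"Content-Type": "application/json", "accept": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-compatible schema: generated text sits in choices[0]["text"]
    return body["choices"][0]["text"]
```

Usage, with the server running: `complete("What is general relativity?")`.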