Local AI model from Facebook, leaked. Runs offline and on the CPU. https://github.com/ggerganov/llama.cpp
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
mkdir build && cd build
cmake ..
cmake --build . --config Release
main.exe -m <path-to-model> -p <prompt>
Get models from Hugging Face (https://huggingface.co). Usually all GGML models work. GGML is named after its creator Georgi Gerganov (GG) plus ML for Machine Learning.
See also llama-cpp-python
pip install llama-cpp-python[server]
python -m llama_cpp.server --model <model-path> --host 0.0.0.0 --port 8000
curl -X POST -H "accept: application/json" -H "Content-Type: application/json" -d "{ \"prompt\": \"\n\n### Instructions:\nWhat is general relativity?\n\n### Response:\", \"stop\": [ \"###\" ], \"max_tokens\": 4096 }" http://192.168.0.51:8000/v1/completions
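The same request can be made from Python with only the standard library. A minimal sketch, assuming the server above is reachable at 192.168.0.51:8000 (adjust BASE_URL to match your --host/--port) and returns the OpenAI-style completions schema:

```python
import json
import urllib.request

# Assumption: matches the host/port you passed to llama_cpp.server.
BASE_URL = "http://192.168.0.51:8000"

def build_completion_payload(question: str, max_tokens: int = 4096) -> dict:
    """Build the JSON body for the /v1/completions endpoint,
    using the same Instructions/Response prompt template as the curl example."""
    return {
        "prompt": f"\n\n### Instructions:\n{question}\n\n### Response:",
        "stop": ["###"],  # stop when the model starts the next section marker
        "max_tokens": max_tokens,
    }

def complete(question: str) -> str:
    """POST the payload and return the generated text."""
    data = json.dumps(build_completion_payload(question)).encode("utf-8")
    req = urllib.request.Request(
        f"{BASE_URL}/v1/completions",
        data=data,
        headers={"Content-Type": "application/json", "accept": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-compatible schema: generated text sits in choices[0]["text"]
    return body["choices"][0]["text"]
```

Usage, with the server running: `complete("What is general relativity?")`.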