Local "chatGPT-like" ,llama.cpp, need help :)

magicmars

I have managed to easily run a local ChatGPT-like model based on llama.cpp (https://github.com/ggerganov/llama.cpp), after reading this article:

The results and my impressions are very good: on a PC with only 4 GB of RAM it responds at about 4-5 words per second, and the replies are worth it!




If you want to easily try running your own language model locally on your PC (no GPU needed, just the CPU!), you only need:

- 4 GB of RAM
- 4.2 GB of drive space
- Some CPU

You can download the Windows version here:

Link : Download (rar, 3.5 GB)

- Just unrar it and run chat.exe.

The available options are (an example command follows the list):

-i, --interactive run in interactive mode
--interactive-start run in interactive mode and poll user input at startup
-r PROMPT, --reverse-prompt PROMPT In interactive mode, poll user input upon seeing PROMPT
--color colorise output to distinguish prompt and user input from generations
-s SEED, --seed SEED RNG seed (default: -1)
-t N, --threads N number of threads to use during computation (default: 4)
-p PROMPT, --prompt PROMPT Prompt to start generation with (default: random)
-f FNAME, --file FNAME Prompt file to start generation.
-n N, --n_predict N number of tokens to predict (default: 128)
--top_k N top-k sampling (default: 40)
--top_p N top-p sampling (default: 0.9)
--repeat_last_n N last n tokens to consider for penalize (default: 64)
--repeat_penalty N penalize repeat sequence of tokens (default: 1.3)
-c N, --ctx_size N size of the prompt context (default: 2048)
--temp N temperature (default: 0.1)
-b N, --batch_size N batch size for prompt processing (default: 8)
-m FNAME, --model FNAME Model path (default: ggml-alpaca-7b-q4.bin)

  • temperature (optional): Controls the randomness of the generated text. Higher values produce more diverse results, while lower values produce more deterministic results.
  • top_p (optional): The cumulative probability threshold for token sampling. The model samples only from the smallest set of highest-probability tokens whose cumulative probability reaches this threshold; for example, with top_p 0.9 and token probabilities 0.5, 0.3, 0.15, 0.05, only the first three tokens (cumulative 0.95) are kept.
  • top_k (optional): The number of top tokens to consider when sampling. The model will only consider the top_k highest-probability tokens.
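
For example, assuming the model file ggml-alpaca-7b-q4.bin sits next to the executable (the values below are just a sketch, not a recommendation), an interactive session could be started from a command prompt like this:

chat.exe -m ggml-alpaca-7b-q4.bin -t 4 -n 256 --temp 0.7 --top_k 40 --top_p 0.9 --repeat_penalty 1.3 --color -i

This starts an interactive, colorised session that uses 4 threads and predicts up to 256 tokens per reply.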


Sources :
https://github.com/antimatter15/alpaca.cpp

I've seen someone try to port it to iOS:

Anyone interested in wrapping it for B4X? :D

Android

You can easily run llama.cpp on an Android device with Termux. First, obtain the Android NDK and then build with CMake:

$ mkdir build-android
$ cd build-android
$ export NDK=<your_ndk_directory>
$ cmake -DCMAKE_TOOLCHAIN_FILE=$NDK/build/cmake/android.toolchain.cmake -DANDROID_ABI=arm64-v8a -DANDROID_PLATFORM=android-23 -DCMAKE_C_FLAGS=-march=armv8.4a+dotprod ..
$ make

Install Termux on your device and run termux-setup-storage to get access to your SD card. Finally, copy the llama binary and the model files to your device storage. Here is a demo of an interactive session running on a Pixel 5 phone:
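
Once the binary and a model file are on the device, launching it from a Termux shell might look roughly like this (the folder and file names are only assumptions; adapt them to wherever you copied the files):

$ cd ~/llama
$ chmod +x llama
$ ./llama -m ggml-model-q4_0.bin -t 4 -n 128 --color -i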
 

magicmars

Sorry, I had to remove the download link.

I don't want any problems 😅

PM me for more info.
 