
gpt4-x-alpaca: Equipping Alpaca with GPT4

gpt4-x-alpaca is a fine-tuned LLaMA model with 13 billion parameters. Many users have reported excellent conversational results. In this post, you will learn:

  • What gpt4-x-alpaca is
  • How to run it on Mac
  • How to run it on Windows

What is gpt4-x-alpaca?

gpt4-x-alpaca is a 13B LLaMA model that can follow instructions, such as answering questions.

gpt4-x-alpaca’s HuggingFace page states that it is based on the Alpaca 13B model, fine-tuned with GPT4 responses for 3 epochs.

That’s all the information I can find! This seems to be a community effort.

Users generally speak highly of its performance. See the following discussions.

Well, at least the performance should be similar to Vicuna 13B.

Install and run gpt4-x-alpaca on Mac

There are two ways to run gpt4-x-alpaca on Mac: (1) llama.cpp and (2) text-generation-webui. You will see instructions for both.

llama.cpp (Mac)

llama.cpp is a command line program for running LLaMA models. Make sure you have installed llama.cpp before proceeding.

We will use model weights from this repository. The following instructions install the q4_1 4-bit quantized version. Several other quantized gpt4-x-alpaca models are available in the repository.
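As a rough guide to how much disk space and memory a 4-bit 13B model needs, the sketch below computes a lower bound from the parameter count and bit width alone. min_size_gb is a hypothetical helper written for this post; real quantized files are somewhat larger because they also store per-block scaling data.

```shell
# Hypothetical helper: lower-bound size in GB for a model with
# $1 billion parameters stored at $2 bits per weight.
min_size_gb() {
  awk -v p="$1" -v b="$2" 'BEGIN { printf "%.1f\n", p * b / 8 }'
}

# 13 billion parameters at 4 bits each.
min_size_gb 13 4
```

Treat the result as a floor, not the exact file size.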

Step 1. Open the Terminal app. Navigate to the llama.cpp directory. Create the model folder.

mkdir models/anon8231489123_gpt4-x-alpaca-13b-native-4bit-128g

Step 2. Download the model weights to the newly created folder.

wget -P models/anon8231489123_gpt4-x-alpaca-13b-native-4bit-128g
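Before moving on, it can help to confirm that the weights actually landed in the folder. This is a minimal sketch, assuming the downloaded file is named ggml-model-q4_1.bin (the name used in the run command in Step 3). check_model is a hypothetical helper, not part of llama.cpp.

```shell
# Hypothetical helper: succeeds only if the given file exists and is non-empty.
check_model() {
  if [ -s "$1" ]; then
    echo "found: $1"
  else
    echo "missing or empty: $1" >&2
    return 1
  fi
}

# Assumed file name, taken from the run command in Step 3.
check_model models/anon8231489123_gpt4-x-alpaca-13b-native-4bit-128g/ggml-model-q4_1.bin \
  || echo "Download the weights again before running."
```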

Step 3. Run the model.

Since it is an Alpaca model, make sure to use the --instruct parameter to turn on instruct mode. This mode inserts ### Instruction: before your prompt and ### Response: after it. Without it, the model tends to produce nonsense.

./main -m models/anon8231489123_gpt4-x-alpaca-13b-native-4bit-128g/ggml-model-q4_1.bin -t 4 -c 2048 -n 2048 --color -i --instruct
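To illustrate what instruct mode does to your prompt, the sketch below wraps a question in the Alpaca template. wrap_instruct is a hypothetical helper for illustration only; llama.cpp does this internally, and the exact whitespace it inserts is an assumption here.

```shell
# Hypothetical helper: wraps a user prompt in the Alpaca instruction template.
# The blank lines between the parts are an assumption, not llama.cpp's exact output.
wrap_instruct() {
  printf '### Instruction:\n\n%s\n\n### Response:\n\n' "$1"
}

wrap_instruct "Name three primary colors."
```

The model was fine-tuned on text in this shape, which is why a bare prompt without the markers tends to give poor results.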

text-generation-webui (Mac)

The same repository can be used for text-generation-webui because it calls llama.cpp under the hood.

Make sure you have installed text-generation-webui before proceeding.

Step 1. Open the Terminal app. Navigate to the text-generation-webui directory. Create the model folder.

mkdir models/anon8231489123_gpt4-x-alpaca-13b-native-4bit-128g

Step 2. Download the model.

wget -P models/anon8231489123_gpt4-x-alpaca-13b-native-4bit-128g

Step 3. Open text-generation-webui normally.

If you followed the installation guide, first activate the virtual environment.

source ./venv/bin/activate

You should see the label venv in front of the command prompt.
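If the label is hard to spot, the environment can also be checked from the shell. This is a small sketch, assuming a standard Python virtual environment, which sets the VIRTUAL_ENV variable on activation. venv_status is a hypothetical helper name.

```shell
# Hypothetical helper: reports whether a Python virtual environment is active.
# A standard venv sets VIRTUAL_ENV when activated and unsets it on deactivate.
venv_status() {
  if [ -n "${VIRTUAL_ENV:-}" ]; then
    echo "venv active: $VIRTUAL_ENV"
  else
    echo "venv not active"
  fi
}

venv_status
```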

python --chat

Step 4. Go to the Model page. In the Model dropdown menu, select anon8231489123_gpt4-x-alpaca-13b-native-4bit-128g.

You should see a confirmation message at the bottom right of the page saying the model was loaded successfully.

Now you can chat with gpt4-x-alpaca on the text-generation page.

Install and run gpt4-x-alpaca on Windows

We will use the 4-bit GPTQ model from this repository.

System requirements

You should have text-generation-webui installed on your Windows PC. Follow this guide to install.

You should have a Windows PC with a GPU.

Step-by-step guide (Windows with GPU)

Step 1. Start text-generation-webui normally.

Step 2. Navigate to the Model page. In the Download custom model or LoRA text box, enter


Press the Download button. Wait for the download to complete.


Step 3. Click the refresh icon next to the Model dropdown menu.

Step 4. Delete the following file and folder in the model’s folder. (We will only use the cuda version.)

  • gpt4-x-alpaca-13b-native-4bit-128g (folder)

Step 5. In the Model dropdown menu, select anon8231489123_gpt4-x-alpaca-13b-native-4bit-128g. Ignore the error message.

Step 6. Fill in the following values in the GPTQ parameters section.

  • wbits: 4
  • Groupsize: 128
  • model_type: llama
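The first two values mirror the model's name: the suffix 4bit-128g corresponds to wbits 4 and groupsize 128. As an illustration only, the sketch below pulls both numbers out of the folder name with sed, written in the same shell syntax as the Mac section. The naming pattern is a community convention rather than something text-generation-webui guarantees, and the variable names are made up for this example.

```shell
name="anon8231489123_gpt4-x-alpaca-13b-native-4bit-128g"

# Extract the digits before "bit" and the digits before the trailing "g".
wbits=$(printf '%s\n' "$name" | sed -n 's/.*-\([0-9][0-9]*\)bit-.*/\1/p')
groupsize=$(printf '%s\n' "$name" | sed -n 's/.*-\([0-9][0-9]*\)g$/\1/p')

echo "wbits=$wbits groupsize=$groupsize"
```

If a model's name does not follow this pattern, check its repository page for the correct values instead.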

Step 7. Click Save settings for this model so that you don’t need to put in these values the next time you use this model.

The automatic parameter loading will only be effective after you restart the GUI. You don’t need to restart now.

Step 8. Click Reload the model.

Step 9. Start chatting with the model on the text-generation page.

1 thought on “gpt4-x-alpaca: Equipping Alpaca with GPT4”

  1. Hi!

    Thank you for your tutorials.

    I’ve tried to install gpt4-x-alpaca locally on my MacBooks (Intel and M2) and the result is always the same:

    Traceback (most recent call last):
      File “/Users/shpilkin/Downloads/textAI/ooba_manual/text-generation-webui/”, line 68, in load_model_wrapper
        shared.model, shared.tokenizer = load_model(shared.model_name, loader)
      File “/Users/shpilkin/Downloads/textAI/ooba_manual/text-generation-webui/modules/”, line 78, in load_model
        output = load_func_maploader
      File “/Users/shpilkin/Downloads/textAI/ooba_manual/text-generation-webui/modules/”, line 290, in AutoGPTQ_loader
        import modules.AutoGPTQ_loader
      File “/Users/shpilkin/Downloads/textAI/ooba_manual/text-generation-webui/modules/”, line 3, in
        from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
    ModuleNotFoundError: No module named ‘auto_gptq’

    I also tried other tutorials but still get the error. Do you have any idea why this is happening? Maybe there are other tutorials available already?
