Meta, the company behind Facebook, is gaining attention with its large-scale language model “Llama 2.” While many AI enthusiasts are familiar with OpenAI’s GPT-4 and ChatGPT, Meta’s Llama 2 is emerging as a strong competitor in the field.
The 13-billion-parameter version of Llama 2 often responds in English even when addressed in Japanese. However, it does understand Japanese and can hold a meaningful conversation.
What sets Llama 2 apart is its performance, which is said to rival GPT-3.5, the model used in the free version of OpenAI’s ChatGPT. The fact that a model of this caliber was released openly made it all the more impactful.
Additionally, Llama 2 combines high performance with a comparatively small model size. While GPT-3 and GPT-3.5 are estimated at 175 billion and 355 billion parameters, respectively, Llama 2 delivers comparable performance with only 70 billion.
With fewer parameters, the required GPU memory also shrinks. While GPT-3.5 is designed to run on data center server GPUs, Llama 2 might be able to run on a home PC. To test this, Llama 2 was tried out on a personal computer using the “Text Generation web UI.”
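Whether a model fits on a consumer GPU largely comes down to simple arithmetic: the parameter count times the bytes used per parameter. The following Python sketch is illustrative only and ignores activation memory, KV cache, and framework overhead, but it shows why a 13B model in 16-bit precision is plausible on a 24GB card while 175B-class models are not.

```python
# Rough VRAM estimate: parameter count x bytes per parameter.
# Back-of-the-envelope only; real usage adds activation memory,
# KV cache, and framework overhead.

def weight_memory_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate memory needed for model weights, in GB."""
    return params_billion * 1e9 * bytes_per_param / 1024**3

for name, params in [("Llama 2 13B", 13), ("Llama 2 70B", 70), ("GPT-3 175B", 175)]:
    for precision, bytes_per_param in [("fp16", 2), ("int8", 1), ("int4", 0.5)]:
        print(f"{name} @ {precision}: ~{weight_memory_gb(params, bytes_per_param):.0f} GB")
```

By this rough math, a 13B model in fp16 needs around 24GB just for its weights, which is right at the limit of a high-end consumer GPU, while the 70B and 175B-class models need quantization or multiple GPUs.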
The “Text Generation web UI” is a tool that simplifies the use of various large-scale language models through a web app-like user interface. Much like AUTOMATIC1111’s well-known “Stable Diffusion web UI” for image generation AI, it makes it easy to load models such as Llama 2 and generate text from a browser.
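For reference, loading Llama 2 without the web UI, directly through the Hugging Face transformers library, looks roughly like the sketch below. This is a minimal example, not part of the web UI workflow; the meta-llama checkpoints are gated, so it assumes you have accepted Meta’s license on Hugging Face and logged in beforehand.

```python
# Minimal sketch: load the 13B chat variant of Llama 2 with transformers.
# Assumes access to the gated meta-llama repo and a CUDA-capable GPU.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-2-13b-chat-hf",  # 13B chat checkpoint
    torch_dtype=torch.float16,               # fp16 to fit in ~24GB of VRAM
    device_map="auto",                       # place layers on the GPU automatically
)

print(generator("Briefly introduce yourself.", max_new_tokens=100)[0]["generated_text"])
```

The web UI essentially wraps this kind of loading and generation behind a point-and-click interface, which is what makes it attractive for quick experiments.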
The installation process proved surprisingly simple: download the installer for your operating system from the GitHub page, extract the zip file, and place the extracted folder on the C: drive.
Running “start_windows.bat” launches the installation process, which includes selecting the GPU. NVIDIA GPUs have become the de facto standard for generative AI, so most PCs used for this kind of work will have one. The borrowed “DAIV FX-A9G90” used here is equipped with the flagship consumer “GeForce RTX 4090” with 24GB of VRAM, making NVIDIA the obvious choice.
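Before or after installation, it can be reassuring to confirm that the GPU is actually visible with its full VRAM. A quick way to do this is the PyTorch snippet below; it assumes you run it in an environment with CUDA-enabled PyTorch installed (the web UI’s installer manages its own Python environment).

```python
# Check that PyTorch can see the NVIDIA GPU and report its VRAM.
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU:  {props.name}")
    print(f"VRAM: {props.total_memory / 1024**3:.1f} GB")
else:
    print("No CUDA-capable GPU detected")
```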
The installation process automatically downloads various files from the internet, which may take around 20 minutes depending on the network environment. Once the installation is complete, running “start_windows.bat” in the future will quickly launch the program.
Upon launching, the terminal displays the URL for the Text Generation web UI: “Running on local URL: http://127.0.0.1:7860”. Opening this URL in a browser brings up the web UI. As with the Stable Diffusion web UI, the program itself runs in the terminal, so it is important not to close the terminal window while the web UI is in use.
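If the browser page does not load, a quick scripted check of the local URL can confirm whether the server is actually up. The sketch below simply requests the address printed in the terminal; note that 7860 is Gradio’s default port and may differ if that port is already in use.

```python
# Confirm the Text Generation web UI server is reachable on its local URL.
import requests

resp = requests.get("http://127.0.0.1:7860", timeout=5)
print("Text Generation web UI is running" if resp.ok else f"Unexpected status: {resp.status_code}")
```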