Posts

Latent Problem Modeling: A Pragmatic Form of Artificial Intelligence

Chinese Version: 潜变量问题建模:一种务实的人工智能形式

Why Latent Problem Modeling?

Instead of pursuing the broad and often vague ambition of Artificial General Intelligence (AGI), which aims to emulate the entirety of human cognition, it may be more effective to focus on a more actionable and testable alternative: latent problem modeling. Latent Problem Modeling does not define intelligence as mimicking humans. Rather, it frames intelligence as the capacity to define a problem, understand the structure that defines it, and discover solutions within that structure. It models the latent structure of the objectives, inputs, and outputs that together form a problem instance. World modeling can be understood as a particular form of Latent Problem Modeling, in which the "problem" being modeled is understanding and predicting the structure of an external environment or reality. By centering AI research on problem-solving, this approach enables clearer goals, interpretable reasoning, and appli...

Praising the Nobel Prize Committee for Breaking Down the Wall between Physics and Computer Science

The Nobel Prize committee is brave and progressive in breaking down the wall between physics and computer science. After all, knowledge is knowledge; once learnt, it all becomes a unified entity in one's mind. The separation of "disciplines" only reflects the material limitation of the human brain: it is impossible for one person to learn every piece of knowledge in a single lifetime. However, this wasn't the case when rational thinking first became a thing. Back then, everyone was able to learn every piece of knowledge, until knowledge exploded and humans had to invent the concept of "disciplines" to preserve knowledge and to teach it. It will not be the case in the future either, when AI becomes a universal tool for problem solving, rendering "scientific understanding" unnecessary and impossible. The separation of disciplines in the Nobel Prize will be outdated as well when that happens.

Thoughts on AIGC for Non-AI Industries

1. AIGC is a paradigm shift from goal-oriented problem solving to free-form interactive engineering. It is time to expand our imagination to products that can talk and draw with customers, on top of completing their own tasks through these interactions. Your fridge can help order groceries when asked, but it can also answer generic questions like ChatGPT does. There is no reason to limit the AI to what its shell product is designed to do. For manufacturers of these products, this means better customer stickiness.

2. The entire AIGC economy is in its infancy because right now the paying customers are tech-savvy people who can afford a few tens of bucks in subscription fees every month. To make it truly ubiquitous in every product and every place, the cost of serving AI models must be reduced by a factor of thousands. When that is achieved, products like GitHub Copilot might just be free, like Bing search. At that time, every product that is capable of accessing the Inter...

Serving Llama-2 7B using llama.cpp with NVIDIA CUDA on Ubuntu 22.04

This blog post is a step-by-step guide for running the Llama-2 7B model using llama.cpp, with NVIDIA CUDA on Ubuntu 22.04. llama.cpp is a C/C++ library for the inference of Llama/Llama-2 models. It has grown insanely popular along with the boom in large language model applications. Throughout this guide, we assume the user home directory (usually /home/username) is the working directory.

Install NVIDIA CUDA

To start, let's install NVIDIA CUDA on Ubuntu 22.04. The steps presented here are the same as those on the CUDA Toolkit download page provided by NVIDIA, but I deviate a little by installing CUDA 11.8 instead of the latest version. At the time of writing, the stable PyTorch 2.0 release is built for CUDA 11.8, and I find it convenient to keep my deployed CUDA version in sync with that.

$ wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
$ sudo dpkg -i cuda-keyring_1.1-1_all.deb
$ sudo apt update
$ sudo apt install cuda-11-8

After ...

A Perplexity Benchmark of llama.cpp

Without further ado, here are the results (explanations and discussions later):

Table 1: Perplexity on wikitext-2 test set.

Model \ Quantization   q4_0     q4_1     q5_0     q5_1     q8_0     fp16
llama-7b               6.157    6.0915   5.9846   5.948    5.9063   5.68
llama-13b              5.385    5.3608   5.285    5.2702   5.2547   5.09
llama-30b              4.2707   -        -        -        -        4.1
alpaca-30b             4.4521   -        -        -        -        -
llama-2-7b             5.9675   6.0398   5.8328   5.8435   5.7897   -
llama-2-7b-chat        7.7641   7.7853   7.5055   7.5392   7.5014   -
llama-2-13b            5.2172   5.2115   5.1343   5.1289   5.1005   -
llama-2-13b-chat ...
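
As a rough aid for reading Table 1, here is a minimal Python sketch of the standard definition of perplexity (the exponential of the negative mean log-probability per token), plus a helper that expresses a quantized model's perplexity as a relative increase over fp16. The function names `perplexity` and `relative_increase` are hypothetical, introduced only for illustration. The exact evaluation procedure behind the table (context length, chunking) is not shown in this excerpt, so this is not a reproduction of the benchmark.

```python
import math


def perplexity(token_logprobs):
    """Perplexity = exp(-mean natural-log probability per token)."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def relative_increase(quantized_ppl, baseline_ppl):
    """Fractional perplexity increase of a quantized model over a baseline."""
    return (quantized_ppl - baseline_ppl) / baseline_ppl


if __name__ == "__main__":
    # Toy input (an assumption, not benchmark data): four tokens,
    # each assigned probability 0.25 by the model.
    toy_logprobs = [math.log(0.25)] * 4
    print(f"toy perplexity: {perplexity(toy_logprobs):.3f}")

    # llama-7b row of Table 1, compared against its fp16 entry.
    llama_7b = {"q4_0": 6.157, "q4_1": 6.0915, "q5_0": 5.9846,
                "q5_1": 5.948, "q8_0": 5.9063}
    fp16 = 5.68
    for name, ppl in llama_7b.items():
        print(f"{name}: {relative_increase(ppl, fp16):+.1%} vs fp16")
```

Lower perplexity is better, so a smaller relative increase means the quantization loses less quality compared with the fp16 model.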