
Thoughts on AIGC for Non-AI Industries

1. AIGC is a paradigm shift from goal-oriented problem solving to free-form interactive engineering. It is time to expand our imagination to products that can talk and draw with their customers, on top of completing their own tasks through these interactions. Your fridge can order groceries when asked, but it can also answer generic questions the way ChatGPT does. There is no reason to limit the AI to what its shell product was designed to do. For manufacturers of these products, this means better customer stickiness.

2. The entire AIGC economy is in its infancy, because right now the paying customers are tech-savvy people who can afford a few tens of dollars in subscription fees every month. To make AIGC truly ubiquitous, in every product and every place, the cost of serving AI models must fall by a factor of thousands. When that is achieved, products like GitHub Copilot might simply be free, like Bing search. At that point, every product capable of accessing the Internet will have a chance to tap into some AIGC capability. The good news is that the AI industry and academia are making very fast progress on serving costs.

3. The real business barrier for AIGC is the use case and its accompanying data. Technology cannot be the barrier as long as open source AI development continues at its current pace. OSS will outpace most AIGC technology providers, and of the hundreds of new companies we are seeing now, we will eventually end up with just a few that are really good. However, if you are a non-tech company with your own products, be aware that the AIGC use cases of those products are the true business barriers of the future. Do not send data to any AIGC service provider that does not promise not to use your data to train its own models (yes, cuing OpenAI and Google here).

4. The development of AIGC has to be tightly coupled with direct user feedback. There are two reasons for this: 1) the initial deployment of any AIGC model cannot be the best solution, because there is not yet enough use case data to fine-tune the model; 2) because AIGC models operate in free form, users can frequently teach us how to use generative models, even though we are their creators. It is impossible to think of AIGC as a product that we can purchase once and deploy somewhere without change. It has to be a joint, iterative development process involving both the product and its users.

5. In spite of its paradigm-shifting capabilities, AIGC did not come from nowhere. Humankind finds new scientific theories and technologies when the physical scales of experiments are pushed to their extremes. For example, observations at planetary scale confirmed the general theory of relativity, and quantum mechanics was discovered when we dug down to particle scales. For computer science, the physical scale is computational power. It is quite natural that this generative AI boom came about when people started training generative models at much larger scales than before. There is no mystery here.
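
To make point 4 a little more concrete, here is a minimal sketch of what a user feedback loop might look like in code. Everything in it is an assumption for illustration: the Interaction record, the +1/-1 rating scheme, and the prompt/response JSONL layout are hypothetical and not tied to any particular product or fine-tuning service. The idea is simply that each deployed interaction gets logged along with the user's reaction, and the well-received ones become candidate fine-tuning data for the next iteration.

```python
import json
from dataclasses import dataclass


@dataclass
class Interaction:
    """One logged exchange between a user and the deployed model (hypothetical schema)."""
    prompt: str    # what the user asked the product
    response: str  # what the model generated
    rating: int    # assumed feedback signal: +1 helpful, -1 not helpful


def select_for_finetuning(log, min_rating=1):
    """Keep only interactions the user rated at or above min_rating."""
    return [item for item in log if item.rating >= min_rating]


def to_jsonl(interactions):
    """Serialize prompt/response pairs, one JSON object per line."""
    return "\n".join(
        json.dumps({"prompt": item.prompt, "response": item.response})
        for item in interactions
    )


# Made-up interactions from a hypothetical smart fridge.
log = [
    Interaction("Order milk", "Added 1 carton of milk to your cart.", 1),
    Interaction("What is AIGC?", "I can only order groceries.", -1),
    Interaction("Suggest a dinner recipe", "How about a vegetable stir-fry?", 1),
]

kept = select_for_finetuning(log)
dataset = to_jsonl(kept)
print(dataset)
```

The poorly rated answer is just as informative as the good ones: it shows a use case the users wanted and the first deployment did not cover, which is exactly the kind of thing they end up teaching us.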

