Artificial General Problem Solver

Chinese Version: 通用人工问题求解器:一种生成式框架 - 知乎

OpenAI ChatGPT 4o assisted in refining the text of this article.

Why an Artificial General Problem Solver?

Instead of aiming for the grandiose and often vague goal of Artificial General Intelligence, which aspires to replicate the full breadth of human-like cognition, it can be far more productive to focus on developing an Artificial General Problem Solver. This approach defines intelligence in actionable, testable terms: the capacity to define a problem, learn a world model of its objectives, inputs, and outputs, and find solutions systematically within that model.

By concentrating on problem-solving as the core function, AI research can remain grounded in clear objectives, interpretable reasoning, and practical applications, while avoiding the pitfalls of anthropomorphism and overgeneralization. In essence, an Artificial General Problem Solver reframes the quest for “intelligence” into building systems that understand, model, and solve the worlds of problems we care about, offering a more disciplined and achievable path toward truly useful general-purpose AI.

General Definition of a Problem: The \([v, w, x]\) Triplet

To make problem solving truly general and systematic, we need a clear, structured definition of what a "problem" is. One way to do this is to represent any problem as a triplet \([v,w,x]\):

  • \(v\) represents the objectives to be minimized—these quantify what it means for a solution to be good or optimal, providing a target for learning or optimization.
  • \(w\) represents the outputs—the actual solutions or actions produced by the model, which must satisfy the objectives within the context of the problem.
  • \(x\) is the collection of inputs—the information, conditions, or context describing the specific instance of the problem.

By framing problems this way, we enable a general problem solver to learn a world model that disentangles the relationships between objectives, inputs and outputs. This structured view allows the system to train on large numbers of instances of the problem, capturing the underlying relationships and constraints that define the "world" of the problem. It turns problem solving into a process of modeling and navigating a well-defined space of objectives, inputs, and outputs—a foundation for building truly general, adaptable AI systems.
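As a minimal illustration of this structured view, the triplet can be represented as a plain container type; the field names and array shapes below are assumptions chosen for clarity rather than part of the formulation itself.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class ProblemInstance:
    """One sampled instance of a problem's world as a [v, w, x] triplet."""
    v: np.ndarray  # objectives to be minimized
    w: np.ndarray  # outputs: the candidate solution or action
    x: np.ndarray  # inputs: the context describing this instance

# A toy instance (all values are made up for illustration).
instance = ProblemInstance(
    v=np.array([0.42]),            # e.g. a scalar tracking error
    w=np.array([1.0, -0.5]),       # e.g. a two-dimensional action
    x=np.array([0.1, 0.2, 0.3]),   # e.g. a three-dimensional context
)
```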

General Formulation of Generative Models: Problem World Modeling

A key approach for building an Artificial General Problem Solver is generative modeling, where the system produces solutions by sampling from a learned distribution over the problem space. In generative modeling, we typically express the process as:

\[y = g(z), \quad z \sim \mathbb{D}\]

where \(y\) is a sample from data, \(g\) is a generator neural network written as a function, and \(z\) follows a pre-defined distribution \(\mathbb{D}\).

It is easy to see that Generative Adversarial Networks (GANs) naturally result in models of this form, learning to map noise variables \(z\) to realistic outputs matching the data distribution. Autoregressive language models (such as transformers) can also be viewed in this framework, where \(z\) is the concatenation of all random sampling variables used during the decoding process. In diffusion models, \(z\) is the concatenation of all the noise injected during the diffusion process, which is progressively transformed into coherent outputs.
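The generic form can be made concrete with a short sketch: a small feedforward network standing in for \(g\) and a standard Gaussian standing in for \(\mathbb{D}\). The architecture and dimensions below are arbitrary assumptions; in practice \(g\) would be realized as a GAN generator, an autoregressive decoder, or a diffusion model.

```python
import torch
import torch.nn as nn

# Minimal sketch of y = g(z), z ~ D, with D = N(0, I) and g a small MLP.
# The dimensions and architecture are illustrative assumptions only.
latent_dim, output_dim = 16, 8

g = nn.Sequential(
    nn.Linear(latent_dim, 64),
    nn.ReLU(),
    nn.Linear(64, output_dim),
)

z = torch.randn(128, latent_dim)  # a batch of latent samples z ~ N(0, I)
y = g(z)                          # generated samples y = g(z)
print(y.shape)                    # torch.Size([128, 8])
```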

For the Artificial General Problem Solver, we can extend this generative modeling idea to our structured problem definition, writing:

\[[v,w,x] = g(z), \quad z \sim \mathbb{D}.\]

Here, the generator learns to produce instances of the problem, including objectives \(v\), outputs \(w\), and inputs \(x\), by sampling from the latent space. This formulation enables the model to replicate the underlying relationships that define the problem's world. By learning these structured dependencies, the system effectively builds a world model that it can use to generate coherent, effective solutions across a wide range of problem definitions.
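Continuing the sketch above, the only change needed for the problem-world form is to interpret the generator's output as the concatenation of the three components and split it back into \(v\), \(w\), and \(x\); again, all dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Sketch of [v, w, x] = g(z): the generator output is split into the triplet.
latent_dim, v_dim, w_dim, x_dim = 16, 1, 2, 3

g = nn.Sequential(
    nn.Linear(latent_dim, 64),
    nn.ReLU(),
    nn.Linear(64, v_dim + w_dim + x_dim),
)

z = torch.randn(256, latent_dim)
v, w, x = torch.split(g(z), [v_dim, w_dim, x_dim], dim=-1)
# v, w, and x are generated jointly, so the dependencies the model has learned
# between objectives, outputs, and inputs are preserved in every sample.
```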

Solving the Problem by Sampling-based Inference

Once the generative model of the problem world has been learned, solving a specific problem instance becomes a matter of performing sampling-based inference. Given new inputs \(x\) and objectives \(v\), the goal is to generate outputs \(w\) that minimize \(v\) while satisfying the structure learned from data. This can be framed as sampling latent variables \(z\) such that the generated triplet

\[[v,w,x]=g(z)\]

respects the desired problem definition and yields low objective values for the given inputs.

A naive form of sampling-based inference can proceed as follows: (1) sample a large number of triplets \([v,w,x']\) from the model; (2) select a subset of top triplets where the sampled \(x'\) is closest to the given input \(x\) under some similarity metric, ensuring that the solutions remain relevant to the target inputs; (3) from this filtered subset, take a small number of top triplets with the lowest objective values \(v\), which ensures that \(v\) is relatively minimized in this collection; and (4) output the average of all corresponding \(w\) in this final small set as the generated solution. While simple, this method leverages both input conditioning and objective minimization to make effective use of the structure learned by the generative world model.
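The four steps translate almost directly into code. The sketch below assumes a hypothetical `generate_triplets(n)` callable that draws `n` triplets from the trained model and returns NumPy arrays (`v` of shape `(n,)`, `w` of shape `(n, w_dim)`, `x` of shape `(n, x_dim)`); the candidate counts and the Euclidean similarity metric are likewise assumptions.

```python
import numpy as np

def sampling_based_inference(generate_triplets, x_query,
                             n_samples=10_000, k_nearest=200, k_best=20):
    """Naive sampling-based inference over a learned problem-world model."""
    v, w, x = generate_triplets(n_samples)              # (1) sample many triplets

    # (2) keep the triplets whose sampled x' is closest to the query input x.
    distances = np.linalg.norm(x - x_query, axis=-1)
    nearest = np.argsort(distances)[:k_nearest]

    # (3) among those, keep the triplets with the lowest objective values v.
    best = nearest[np.argsort(v[nearest])[:k_best]]

    # (4) average the corresponding outputs w as the proposed solution.
    return w[best].mean(axis=0)
```

A call might look like `sampling_based_inference(model.sample, x_new)`, where `model.sample` stands for whatever sampling interface the trained generative model exposes, and the candidate counts would be tuned to the problem at hand.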

Limitations and Data Considerations

The triplet formulation \([v,w,x]\) offers a general and systematic way to define many scientific and engineering problems, capturing their essential structure in terms of objectives, outputs, and inputs. In fact, a wide range of problems can be cast into this definition. For example, large language models can be seen as minimizing perplexity \(v\) over generated text \(w\) given context \(x\). Similarly, reinforcement learning problems often aim to minimize cumulative cost (equivalently, negated reward) \(v\) by selecting actions \(w\) in states \(x\). Many tasks in optimization, control, and design across science and engineering can also be represented naturally in this form. By defining problems this way, we unify diverse domains under a shared framework that enables systematic learning and solution generation. If readers can think of problems that do not fit this general definition, they are welcome to share their thoughts in the comments or contact us directly.

Additionally, it is important to recognize that the data provided for learning the generative model of the problem world do not need to contain only optimal solutions—those with minimized \(v\). In fact, deliberately including bad solutions in the training data can be beneficial. By exposing the model to a wide range of solutions with varying objective values, it learns the landscape of the problem more completely. This diversity helps the model understand how to minimize \(v\) given \(x\), rather than simply memorizing high-quality outputs \(w\). In this way, the learning process becomes about modeling the entire structure of the problem's world, equipping the Artificial General Problem Solver to generate new, improved solutions even in novel scenarios.

Finally, it is worth clarifying how this approach differs from the traditional symbolic notion of a General Problem Solver. The classic symbolic system relied on explicit, human-crafted search strategies and rules to transform problem representations into solutions, typically operating on well-defined, logical structures. By contrast, our generative formulation leverages data-driven learning to approximate the mapping

\[[v,w,x]=g(z)\]

across many sampled instances, capturing implicit structure without hand-coded heuristics. Rather than encoding search procedures manually, the model learns to generate solutions that reflect the underlying problem world, offering greater flexibility, scalability, and adaptability to complex real-world domains.

Comparison with Discriminative Formulations

It is instructive to compare the generative approach we describe with a more traditional discriminative formulation, where one might try to learn the mapping

\[v = g(w, x).\]

In this discriminative setup, the goal is to predict the objective value \(v\) given an output \(w\) and inputs \(x\). While this may seem appealing for evaluating solutions, it cannot fully resolve the ambiguities inherent in many problem settings. Specifically, for many problems there may be multiple valid or plausible objective values \(v\) corresponding to the same \(x\) and \(w\) pair, due to factors like unmodeled constraints, stochastic effects, or hidden variables in the problem's world.

Such ambiguity means that the mapping \(v=g(w,x)\) is often not a well-defined, deterministic function in practice. Even if learned as a conditional expectation, it can fail to capture the true range of possible objectives for given inputs and outputs. By contrast, our generative approach models the joint distribution over \([v,w,x]\) directly. It learns to represent the entire space of possible objectives, outputs, and inputs, capturing their dependencies without assuming that \(v\) can be uniquely or cleanly predicted from \(w\) and \(x\) alone. This richer, more flexible representation enables the Artificial General Problem Solver to handle the inherent uncertainty and complexity found in real-world problem definitions.
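A tiny numerical example makes the ambiguity concrete. Suppose the same \((w, x)\) pair is repeatedly observed with two different objective values because of a hidden variable; the data below are fabricated purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# The same (w, x) pair observed many times with two plausible outcomes for v,
# e.g. because of an unmodeled constraint or hidden state.
v_observed = rng.choice([0.1, 0.9], size=1_000)

# A deterministic regressor v = g(w, x) trained with squared error can at best
# converge to the conditional mean for this pair ...
print(v_observed.mean())          # roughly 0.5, a value that never actually occurs

# ... whereas a model of the joint distribution over [v, w, x] can keep both
# modes and report how often each one occurs.
values, counts = np.unique(v_observed, return_counts=True)
print(dict(zip(values, counts)))  # roughly {0.1: ~500, 0.9: ~500}
```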

Moreover, even extending the discriminative formulation to include latent variables, as in

\[v = g(w, x, z), \quad z \sim \mathbb{D},\]

does not make it fully general. While introducing \(z\) can help model uncertainty and multi-modality in \(v\), it still assumes that objectives \(v\) are always well-defined for a given problem. In practice, many problems may not have explicit objective functions at all, or may operate in settings where \(v\) is undefined or irrelevant. Our generative formulation, by modeling the joint distribution over \([v, w, x]\), naturally accommodates these cases by allowing for missing or undefined components without enforcing that \(v\) must always be computable as a function of \(w\), \(x\), and \(z\).

Example: Simulated Motor Control

To illustrate the concept of an Artificial General Problem Solver in practice, consider the task of optimizing motor control algorithms in a MATLAB simulation environment. Here, the problem can be naturally defined using the triplet \([v,w,x]\)¹:

  • \(v\) represents the objective to be minimized: specifically, the difference between the current motor speed and the target motor speed, capturing tracking error.
  • \(x\) consists of inputs from current sensors, including phase signals that can be used for estimating motor speed.
  • \(w\) represents the outputs, which are the voltages applied to the motor's exciting coils to produce torque and control speed.

The optimization process begins with an unoptimized PID-based Field-Oriented Control (FOC) algorithm. This initial controller is used to run the motor in simulation, collecting control data: sensor inputs \(x\), applied voltages \(w\), and the resulting speed errors \(v\). This dataset forms the first epoch of training data for the generative model.

Subsequently, an evolutionary training process is initiated. The generative model trained on the first epoch proposes new control strategies by sampling candidate \(w\) values given \(x\) that are predicted to reduce \(v\). These candidates are then tested in the simulation environment, generating new data for the next epoch of model training. By iteratively repeating this cycle—sampling, evaluating, and retraining—the Artificial General Problem Solver gradually learns to generate control inputs that better minimize speed error.
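A sketch of this evolutionary loop is shown below. Every callable is a hypothetical stand-in rather than an actual MATLAB or toolbox interface: `simulate(controller)` is assumed to run the motor simulation and return logged \((v, w, x)\) triplets, `fit_generative_model(dataset)` to train the world model, and `propose_candidates(model, n)` to sample candidate controllers whose outputs \(w\) are predicted to reduce \(v\).

```python
def evolve_controller(simulate, fit_generative_model, propose_candidates,
                      initial_controller, n_epochs=10, n_candidates=64):
    """Sketch of the sample-evaluate-retrain loop described above."""
    # Epoch 0: collect data by running the unoptimized PID-based FOC controller.
    dataset = list(simulate(initial_controller))
    best = initial_controller

    for _ in range(n_epochs):
        model = fit_generative_model(dataset)                  # retrain the world model
        candidates = propose_candidates(model, n_candidates)   # sample new control strategies

        # Evaluate every candidate in simulation and grow the training set.
        results = []
        for candidate in candidates:
            triplets = list(simulate(candidate))
            dataset.extend(triplets)
            mean_error = sum(v for v, _, _ in triplets) / len(triplets)
            results.append((mean_error, candidate))

        # Adopt the candidate with the lowest mean speed error v in this epoch.
        best = min(results, key=lambda r: r[0])[1]

    return best
```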

Experimental results show that after several such epochs, the evolved controller achieves significantly lower overshoot and reduced energy cost when starting the motor from zero speed. This example demonstrates how the Artificial General Problem Solver can systematically improve control policies through data-driven, generative modeling of the problem’s world, without relying on manually tuned heuristics or fixed PID gains as traditionally done.




¹ For illustrative purposes, this is greatly simplified from what is actually done.
