Blog Post

[3 - 2 FNN/ANFIS] Implanting 'Intuition' into Neural Networks: The Learning Fuzzy System

Published on May 17, 2026

Hello, this is MiTornAve.

In our last session, we looked into RBFNNs, which respond sensitively in localized regions using overwhelming "speed" as their primary weapon. In the world of control systems where data changes in milliseconds, the simple and fast structure of RBFNNs was an excellent solution.

However, real-world problems do not always resolve into razor-sharp, exact numbers and formulas. "The room feels a bit warm, could you turn down the AC appropriately?" This sentence, which is perfectly natural to humans, reads like an error-inducing alien language to a computer that only understands 0 and 1. What exactly do "a bit" and "appropriately" translate to in numbers?

In the previous post, we broke down this cold, binary world and discovered the powerful weapon of Fuzzy Logic, which implants human "ambiguity" and "intuition" into machines. We structured human knowledge into IF-THEN rules to flexibly solve complex non-linear problems.

But do you remember the question we posed at the very end of our last session? However closely Fuzzy Logic (F) resembles human intuition, engineers were thrown back into agony by a massive wall: "expert dependency" and the "rule explosion" caused by having to manually tune membership functions and handwrite thousands of rules.

On the other hand, Artificial Neural Networks (T), which learn effortlessly as long as they are fed data, are incredibly smart. However, they possess a fatal flaw—they are a "black box" structure, meaning that if you look inside, it is impossible to understand why they reached a certain conclusion.

"Fuzzy Logic (F) gets a perfect score for understanding and intuition but cannot adapt on its own. Neural Networks (T) possess data learning capabilities. Can't we just combine them in a completely slick way?"

The technology born from this bold and perfect engineering fusion is FNN (Fuzzy Neural Network). Today, we will dive deep into the world of ANFIS (Adaptive Neuro-Fuzzy Inference System), the most prominent player in neuro-fuzzy systems and the pinnacle of intelligent control that evolves by reducing its own errors.

1. The Fusion of Concepts: What is an FNN?

Have you ever imagined two people with polar opposite MBTI types meeting and perfectly filling in each other's flaws? The exact same thing happened in the world of engineering: the marriage of Fuzzy Logic (F) with human linguistic intuition and Artificial Neural Networks (T) with cold data analysis capabilities. Let's break down step-by-step why this combination was necessary and how it is achieved.

1.1 The Complementary Blank Spaces of Fuzzy and Neural Networks

First, traditional fuzzy systems express human knowledge through clear rules like "IF (temperature is high) THEN (turn on the AC strongly)," making their internal reasoning completely transparent. In other words, every conclusion can be traced back to a rule that a human can read. However, they had a fatal weakness. If the environment changed, a human had to manually adjust the center (c) or width (\sigma) of the Gaussian membership functions. It was, in essence, a static system incapable of self-improvement.

Conversely, Artificial Neural Networks (ANN) take data and use the Backpropagation algorithm to tune thousands of parameters automatically, steadily reducing the error between their output and the target. Their learning ability is remarkable. However, the answer remains tightly locked away inside numerous layers and weights (w), creating a "black box" structure where humans cannot decipher why the machine made a particular judgment.

1.2 FNN (Fuzzy Neural Network): The Dawn of Engineering Fusion

Combining only the strengths of these two systems is the FNN (Fuzzy Neural Network).

The essence of an FNN is straightforward: it maps the mathematical mechanisms of a Fuzzy Inference System (FIS) directly into the hierarchical layer structure of an artificial neural network. The "membership functions" and "IF-THEN rules" that formed the core of the fuzzy system are mapped onto the nodes and weights of the neural network.

Structuring the system this way yields incredible synergy:

Learning Ability Granted to the Fuzzy System: The parameters of the membership functions, which used to be hand-tuned by humans, are now optimized automatically during training as the neural network's backpropagation algorithm propagates the output error back to them.
Explainability Granted to the Neural Network: The chronic "black box" problem of artificial neural networks is largely turned into a white box. Once learning is complete, the trained parameters of the FNN can be read off directly as human-readable fuzzy rules and membership functions.

1.3 The Academic Value of Neuro-Fuzzy Systems

"By utilizing the optimization algorithms of neural networks, the neuro-fuzzy model features an innovative architecture that automatically identifies the premise and consequent parameters of fuzzy rules."

Ultimately, the FNN serves as a flawless engineering milestone that bridges ambiguous human linguistic expressions into a mathematical structure computers can parse, while allowing data to independently finalize the finer details of that structure.

2. Core Architecture: Dissecting the 5 Layers of ANFIS

The hero that most perfectly realized the massive concept of FNN in reality is ANFIS (Adaptive Neuro-Fuzzy Inference System).

ANFIS takes the internal components of fuzzy logic—"linguistic variables," "membership functions," and "IF-THEN rules"—and maps them cleanly into five distinct neural network layers. To make this easy to understand, let's dissect what each layer does based on the most classic ANFIS structure, which features two input variables (x, y) and two rules based on the widely used Sugeno fuzzy model.

For reference, the two rules we will look at are as follows:

Rule 1: IF x is A_1 and y is B_1, THEN f_1 = p_1 x + q_1 y + r_1
Rule 2: IF x is A_2 and y is B_2, THEN f_2 = p_2 x + q_2 y + r_2

2.1 Layer 1: Fuzzification Layer

The first layer acts as a translator, converting the crisp, exact real-world data coming from the outside (e.g., current temperature x) into degrees of membership for fuzzy sets. Every node in this layer is an adaptive node, meaning its shape changes depending on its parameters.

\nu_{1,i} = \mu_{A_i}(x) \quad (i = 1, 2), \qquad \nu_{1,i} = \mu_{B_{i-2}}(y) \quad (i = 3, 4)

If we use the Gaussian membership function we learned about in the previous session, the formula is expressed as follows:

\mu_{A_i}(x) = \exp\left(-\frac{(x - c_i)^2}{2\sigma_i^2}\right)

Neural Network Interpretation: The parameters (c_i, \sigma_i) belonging to the nodes in this layer are called premise parameters. They play the role that trainable weights play in an ordinary neural network: as data flows in, the center points (c) and widths (\sigma) are refined step by step by the backpropagation algorithm. (They are not the w_i of Layer 2, which denotes a rule's firing strength.)
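
As a quick illustration of the formula above, here is a minimal NumPy sketch of a Gaussian membership function. The fuzzy set "warm" and its values (center 26, width 3) are made up for this example and do not come from any trained model.

```python
import numpy as np

def gaussian_mf(x, c, sigma):
    """Degree of membership of x in a Gaussian fuzzy set with center c and width sigma."""
    x = np.asarray(x, dtype=float)
    return np.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))

# Hypothetical fuzzy set "warm": center 26 degrees C, width 3 degrees C (illustrative only).
temps = np.array([20.0, 26.0, 29.0])
memberships = gaussian_mf(temps, c=26.0, sigma=3.0)
print(memberships)  # membership is exactly 1 at the center and falls off on both sides
```

Shifting c slides the bell curve left or right, and changing \sigma makes it narrower or wider, which is exactly what training adjusts.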

2.2 Layer 2: Rule Layer

The second layer is the stage where the input conditions combine to complete a single rule. It consists of fixed, circular nodes that perform a multiplication (\Pi) operation on all incoming input signals.

\nu_{2,i} = w_i = \mu_{A_i}(x) \cdot \mu_{B_i}(y) \quad (i = 1, 2)

Meaning: Remember from the last post when we multiplied the membership degrees of two ambiguous situations, such as "the temperature is 80% warm and the humidity is 50% high"? Here that product would be 0.8 × 0.5 = 0.4. This multiplied result (w_i) is called the firing strength of each rule. It quantifies how strongly that specific rule applies to the current situation.
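
The multiplication in Layer 2 can be sketched in a couple of lines. The membership degrees below are illustrative numbers chosen by hand, and the function name `firing_strengths` is just a label for this example.

```python
import numpy as np

def firing_strengths(mu_a, mu_b):
    """Layer 2: multiply the two premise memberships of each rule (product T-norm)."""
    return np.asarray(mu_a, dtype=float) * np.asarray(mu_b, dtype=float)

# Hypothetical membership degrees for the two rules (illustrative numbers).
mu_a = [0.8, 0.2]   # mu_{A_1}(x), mu_{A_2}(x)
mu_b = [0.5, 0.5]   # mu_{B_1}(y), mu_{B_2}(y)
w = firing_strengths(mu_a, mu_b)
print(w)  # rule 1: 0.8 * 0.5 = 0.4, rule 2: 0.2 * 0.5 = 0.1
```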

2.3 Layer 3: Normalization Layer

The third layer also consists of fixed, circular nodes (N). It calculates the ratio of an individual rule's firing strength relative to the sum of all rules' firing strengths. In essence, it normalizes the scales.

\nu_{3,i} = \bar{w}_i = \frac{w_i}{w_1 + w_2} \quad (i = 1, 2)

Meaning: Dividing each firing strength by the sum of all firing strengths guarantees that the normalized strengths always add up to 1. Each \bar{w}_i can then be read as the relative share of the final answer that rule i is responsible for, regardless of the absolute scale of the raw firing strengths.
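
To see the normalization with concrete numbers, suppose (purely as an illustration) that the two rules fired with strengths w_1 = 0.4 and w_2 = 0.1:

\bar{w}_1 = \frac{0.4}{0.4 + 0.1} = 0.8, \qquad \bar{w}_2 = \frac{0.1}{0.4 + 0.1} = 0.2, \qquad \bar{w}_1 + \bar{w}_2 = 1

Rule 1 therefore carries 80% of the say in the final answer and Rule 2 the remaining 20%, and the ratio would be the same if both raw strengths were ten times smaller.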

2.4 Layer 4: Defuzzification Layer

The fourth layer returns to adaptive nodes where parameters can be modified. Here, the system multiplies the normalized firing strength (\bar{w}_i) by the consequent function (f_i) of that corresponding rule.

\nu_{4,i} = \bar{w}_i f_i = \bar{w}_i (p_i x + q_i y + r_i) \quad (i = 1, 2)

Neural Network Interpretation: The p_i, q_i, r_i appearing in this formula are called consequent parameters. This concept aligns perfectly with the weights right before the output layer in a standard neural network. They adjust the precision of the final conclusion derived by the rules, refining themselves sharply through data learning.

2.5 Layer 5: Output Summation Layer

The final fifth layer consists of a single fixed node (\Sigma). It sums up all the individual rule conclusions passed down from the previous layer to finally output a single, clear control value (Crisp Output) that computers and external machinery can execute.

\nu_{5,1} = \text{Overall Output} = \sum_{i} \bar{w}_i f_i = \frac{\sum_{i} w_i f_i}{\sum_{i} w_i}

"ANFIS perfectly replicates the fuzzy inference mechanism through a multilayer feedforward neural network structure, serving as an integrated linear-nonlinear model where the nonlinear premise parameters and linear consequent parameters are organically linked."

Ultimately, a fuzzy system—which once looked like a messy, tangled web of countless rules and membership functions—gets sorted into clear mathematical operations as it passes through the well-engineered, 5-stage neural network conveyor belt of ANFIS.
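
The five layers above can be chained into one short forward pass. What follows is only a sketch for the two-input, two-rule Sugeno structure used in this post. The function name `anfis_forward`, the dictionary layout, and all parameter values are assumptions made for illustration.

```python
import numpy as np

def gaussian_mf(x, c, sigma):
    return np.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))

def anfis_forward(x, y, premise, consequent):
    """Forward pass of a two-input, two-rule first-order Sugeno ANFIS.

    premise:    dict with arrays 'cA', 'sA', 'cB', 'sB' (one entry per rule)
    consequent: array of shape (2, 3) holding [p_i, q_i, r_i] per rule
    """
    # Layer 1: fuzzification of both inputs
    mu_a = gaussian_mf(x, premise["cA"], premise["sA"])
    mu_b = gaussian_mf(y, premise["cB"], premise["sB"])
    # Layer 2: firing strength of each rule (product)
    w = mu_a * mu_b
    # Layer 3: normalization so the strengths sum to 1
    w_bar = w / w.sum()
    # Layer 4: each rule's linear consequent, weighted by its normalized strength
    f = consequent[:, 0] * x + consequent[:, 1] * y + consequent[:, 2]
    rule_out = w_bar * f
    # Layer 5: summation into one crisp output
    return rule_out.sum(), w_bar

# Hypothetical parameters, chosen only to make the numbers easy to follow.
premise = {"cA": np.array([0.0, 10.0]), "sA": np.array([5.0, 5.0]),
           "cB": np.array([0.0, 10.0]), "sB": np.array([5.0, 5.0])}
consequent = np.array([[1.0, 1.0, 0.0],    # f_1 = x + y
                       [2.0, 0.0, 1.0]])   # f_2 = 2x + 1
output, w_bar = anfis_forward(0.0, 0.0, premise, consequent)
print(output, w_bar)
```

At the input (0, 0), Rule 1 sits right on its centers, so it receives almost all of the normalized weight and the output stays close to f_1 = 0.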

3. Hybrid Learning

The primary reason ANFIS is so widely loved across industrial sites and academic research is its "overwhelmingly fast learning speed."

In truth, if you train every parameter with plain backpropagation, as in a typical neural network, convergence tends to be slow: gradient descent needs many small iterations, and it has to search over the premise and consequent parameters all at once. ANFIS sidesteps this issue with a brilliant algorithm that exploits a structural property of the model (once the premise parameters are fixed, the output is linear in the consequent parameters): the Hybrid Learning Algorithm. It splits each training epoch into a forward pass and a backward pass, applying a different mathematical tool to each.

3.1 Forward Pass: High-Speed Linear Calculation of the Consequent Part

When the input data flows forward from Layer 1 to Layer 4, ANFIS temporarily treats the premise parameters (c_i, \sigma_i) in Layer 1 as fixed constants.

By locking the front end, the final output equation simplifies beautifully into a linear equation with respect to the consequent parameters (p_i, q_i, r_i) located in Layer 4.

\text{Output} = \bar{w}_1 (p_1 x + q_1 y + r_1) + \bar{w}_2 (p_2 x + q_2 y + r_2)

What is the fastest and most definitive way to solve for linear variables? It is the ultimate cheat code of linear algebra: the Least Squares Estimator (LSE). (Granted, plenty of other methods exist.) By stacking the training samples into a matrix equation and applying the pseudo-inverse, the system pinpoints the consequent parameters that minimize the squared error in a single matrix calculation (One-shot), so the consequent part needs no gradient descent iterations at all.
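
Here is a hedged sketch of that one-shot solve. With the premise parameters frozen, every training sample gives one row [\bar{w}_1 x, \bar{w}_1 y, \bar{w}_1, \bar{w}_2 x, \bar{w}_2 y, \bar{w}_2], and the consequent parameters are the least-squares solution of the resulting linear system. The helper names, the synthetic data, and the use of `np.linalg.lstsq` (rather than an explicit pseudo-inverse or a recursive estimator) are choices made for this example.

```python
import numpy as np

def gaussian_mf(x, c, sigma):
    return np.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))

def normalized_strengths(X, Y, cA, sA, cB, sB):
    """Layers 1-3 for a batch; returns w_bar with shape (n_samples, n_rules)."""
    w = gaussian_mf(X[:, None], cA, sA) * gaussian_mf(Y[:, None], cB, sB)
    return w / w.sum(axis=1, keepdims=True)

def solve_consequents(X, Y, targets, w_bar):
    """Least-squares fit of [p_i, q_i, r_i] per rule, premise parameters frozen."""
    inputs = np.column_stack([X, Y, np.ones_like(X)])   # columns: x, y, 1
    # Rule i contributes the columns w_bar_i * x, w_bar_i * y, w_bar_i.
    A = np.hstack([w_bar[:, [i]] * inputs for i in range(w_bar.shape[1])])
    theta, *_ = np.linalg.lstsq(A, targets, rcond=None)
    return theta.reshape(-1, 3)

# Synthetic check with made-up parameters: generate targets from known
# consequents, then see whether a single least-squares solve recovers them.
rng = np.random.default_rng(0)
X, Y = rng.uniform(0, 10, 200), rng.uniform(0, 10, 200)
cA, sA = np.array([2.0, 8.0]), np.array([2.0, 2.0])
cB, sB = np.array([2.0, 8.0]), np.array([2.0, 2.0])
true_conseq = np.array([[1.0, 2.0, 3.0], [-1.0, 0.5, 4.0]])

w_bar = normalized_strengths(X, Y, cA, sA, cB, sB)
f = X[:, None] * true_conseq[:, 0] + Y[:, None] * true_conseq[:, 1] + true_conseq[:, 2]
targets = (w_bar * f).sum(axis=1)

estimated = solve_consequents(X, Y, targets, w_bar)
print(np.round(estimated, 4))
```

Because the targets here are noise-free and generated by the same model, the estimate should match the true consequents up to numerical precision. Real data would leave a residual error for the backward pass to work on.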

3.2 Backward Pass: Precise Fine-Tuning of the Premise Part

With the newly found optimal consequent parameters locked in place, the system calculates the error between the final output value and the actual target. Then, it initiates the Backpropagation process to send this error backward.

The targets for refinement during this phase are the premise parameters (c_i, \sigma_i) in Layer 1, which dictate the shape of the membership functions. Because these parameters sit inside the exponent of the Gaussian function, the output depends on them nonlinearly, and they cannot be solved for in a single shot.

Therefore, we deploy our old friend, Gradient Descent (GD). The system rides the error gradient downward, shaving and reshaping the center points and widths of the membership functions bit by bit.

\theta_{\text{premise}}(t+1) = \theta_{\text{premise}}(t) - \eta \frac{\partial E}{\partial \theta_{\text{premise}}}
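
To make the update rule concrete, the sketch below applies the chain rule to the two-rule model for a single training sample. It assumes a squared error E = 0.5 (output - target)^2, which the text above does not spell out, and for brevity it only updates the premise parameters attached to input x (the ones for y follow the same pattern). All names and numbers are illustrative.

```python
import numpy as np

def forward(x, y, cA, sA, cB, sB, conseq):
    """Returns the crisp output, raw firing strengths w, and rule consequents f."""
    w = np.exp(-((x - cA) ** 2) / (2 * sA ** 2)) * np.exp(-((y - cB) ** 2) / (2 * sB ** 2))
    f = conseq[:, 0] * x + conseq[:, 1] * y + conseq[:, 2]
    return (w * f).sum() / w.sum(), w, f

def premise_gradients(x, y, target, cA, sA, cB, sB, conseq):
    """Gradients of E = 0.5 * (out - target)^2 w.r.t. the centers and widths on input x."""
    out, w, f = forward(x, y, cA, sA, cB, sB, conseq)
    dE_dout = out - target
    dout_dw = (f - out) / w.sum()          # derivative of the weighted average w.r.t. w_i
    dw_dc = w * (x - cA) / sA ** 2         # Gaussian derivative w.r.t. its center
    dw_ds = w * (x - cA) ** 2 / sA ** 3    # Gaussian derivative w.r.t. its width
    return dE_dout * dout_dw * dw_dc, dE_dout * dout_dw * dw_ds

# Hypothetical sample and parameters (consequents are held fixed in this pass).
x, y, target = 4.0, 5.0, 10.0
cA, sA = np.array([2.0, 8.0]), np.array([2.0, 2.0])
cB, sB = np.array([2.0, 8.0]), np.array([2.0, 2.0])
conseq = np.array([[1.0, 2.0, 3.0], [-1.0, 0.5, 4.0]])

eta = 0.01  # learning rate, picked arbitrarily for the example
grad_c, grad_s = premise_gradients(x, y, target, cA, sA, cB, sB, conseq)
new_cA = cA - eta * grad_c   # theta(t+1) = theta(t) - eta * dE/dtheta
new_sA = sA - eta * grad_s
print(new_cA, new_sA)
```

A practical implementation would average these gradients over a batch and also keep \sigma from collapsing toward zero, which this sketch does not attempt.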

3.3 The Synergy of Hybrid Learning

"Hybrid learning, which fuses the Least Squares Estimator (LSE) for the forward pass and Gradient Descent (GD) for the backward pass, accelerates convergence speed dozens of times over compared to standard neural networks that rely purely on gradient descent. Furthermore, it significantly lowers the risk of getting trapped in local minima."

If this were a legacy fuzzy system, human experts would have spent months relying on trial and error to fine-tune the boundaries of membership functions and output rules. ANFIS instead alternates these two passes, epoch after epoch, letting the data settle on a well-fitting configuration on its own, often within seconds for a small model.

4. The True Value of FNN/ANFIS: "Explainable AI (XAI)"

No matter how massive and intelligent modern Deep Learning technologies become, there is a definitive reason why medical fields, financial institutions, and manufacturing plants dealing with precision control hesitate to fully adopt them: the unopenable "Black-box" architecture.

Imagine a deep learning model with dozens of layers suddenly slams on the brakes of an autonomous vehicle. A collision was avoided, but when the developer asks, "Why did you brake at that exact microsecond?", deep learning can only present an array of millions of weights (w), failing to offer an explanation in human language. In fields where safety and accountability are legally binding, this is a fatal vulnerability.

ANFIS, however, is cut from a different cloth. Once data learning is complete, opening up the internal parameters of the model reveals something extraordinary.

Visualized Membership Functions: You can visually confirm that the center points (c) and widths (\sigma) of the Gaussian functions—which were arbitrarily set before training—have shifted to match the characteristics of the data. Humans can intuitively understand: "Ah, after looking at the data, the machine lowered its baseline center for 'hot' from 31°C to 29.3°C!"
Extractable IF-THEN Rules: The learned parameters can be translated directly back into human-readable linguistic rules. The model can show its evidence, for example: "Rule 3 (IF temperature is very high AND humidity is normal, THEN set output to 85%) accounted for 92% of the normalized firing strength for this input."
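
Reading the rules back out can be as simple as formatting the learned parameters into sentences. The helper `describe_rules` and every number below are hypothetical, and how you turn a center value into a word like "hot" is a naming decision left to the engineer.

```python
def describe_rules(cA, sA, cB, sB, conseq, names=("x", "y")):
    """Render the parameters of a first-order Sugeno model as readable rule strings."""
    rules = []
    for i, (ca, sa, cb, sb, (p, q, r)) in enumerate(zip(cA, sA, cB, sB, conseq), start=1):
        rules.append(
            f"Rule {i}: IF {names[0]} is about {ca:.1f} (width {sa:.1f}) "
            f"AND {names[1]} is about {cb:.1f} (width {sb:.1f}) "
            f"THEN f_{i} = {p:.2f}*{names[0]} + {q:.2f}*{names[1]} + {r:.2f}"
        )
    return rules

# Hypothetical "trained" parameters for a two-rule temperature/humidity model.
rules = describe_rules(
    cA=[22.0, 29.3], sA=[3.0, 2.5],
    cB=[40.0, 70.0], sB=[10.0, 8.0],
    conseq=[[0.50, 0.10, 5.00], [2.00, 0.30, 10.00]],
    names=("temperature", "humidity"),
)
for line in rules:
    print(line)
```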

"Neuro-fuzzy systems secure nonlinear mapping capabilities on par with artificial neural networks while explicitly surfacing the causal relationships between inputs and outputs in the form of fuzzy rules. Consequently, they offer powerful practical advantages as early models of Explainable AI (XAI)."

Ultimately, ANFIS provides the ultimate solution to engineers who used to compromise on interpretability for the sake of performance, or vice versa, by placing formidable performance (T) and transparent explainability (F) right into their hands simultaneously.

5. Conclusion and Next Session Preview (Outro)

Over two comprehensive sessions, we have pushed hard from Fuzzy Logic—which translated ambiguous human senses into math—all the way to ANFIS (FNN), which combines neural network backpropagation and LSE to evolve on its own.

These technologies have flawlessly proven what beautiful synergy occurs when the "know-how of a seasoned expert" hidden in the human mind merges with the "sensor data" that computers thrive on.

Yet, when engineers attempt to deploy these smart and sophisticated intelligent systems into actual industrial environments, they run into a completely different ambush: the overwhelming "flood of data."

When dozens of variables like pressure, temperature, vibration, and flow rate surge in all at once, even the mighty ANFIS cannot escape the threat of a 'rule explosion' and computational overload. To make matters worse, that data is riddled with a chaotic mix of "noise" and redundant, useless information.
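
A rough count shows how quickly this gets out of hand. The sketch assumes grid partitioning (one rule for every combination of membership functions) with Gaussian membership functions and first-order Sugeno consequents. Other partitioning schemes produce fewer rules.

```python
def grid_rule_count(n_inputs, mfs_per_input):
    """Number of rules when every combination of membership functions gets a rule."""
    return mfs_per_input ** n_inputs

def parameter_count(n_inputs, mfs_per_input):
    """Premise plus consequent parameters of a first-order Sugeno ANFIS on a full grid."""
    premise = 2 * n_inputs * mfs_per_input   # c and sigma for every Gaussian
    consequent = (n_inputs + 1) * grid_rule_count(n_inputs, mfs_per_input)  # linear terms plus bias
    return premise + consequent

# Three membership functions per input, growing number of inputs.
for n in (2, 4, 8, 12):
    print(n, grid_rule_count(n, 3), parameter_count(n, 3))
```

With three membership functions per input, two inputs need 9 rules, while twelve inputs already need 531,441.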

How can we strip away the superficial fluff from this massive sea of variables and extract only the "pure, vital information" absolutely necessary to control the system?

In our next session, we will explore the art of dimensionality reduction and optimization, which efficiently compresses data down to its core skeleton: [PLS/Optimization] Extracting Only the 'Real' Info from a Flood of Data: Dimensionality Reduction Techniques.
