Google AI researchers propose a new training method called “DeepCTRL” that embeds rules into deep learning
Deep neural networks (DNNs) produce increasingly accurate outputs as the size and coverage of their training data grow. Investing in large-scale, high-quality labeled datasets is one way to improve models; another is to use prior knowledge, referred to as “rules”: reasoning heuristics, equations, associative logic, or constraints. Consider a classical physics problem in which a model must predict the next state of a double pendulum system. The model can learn to estimate the total energy of the system at a given time from empirical data alone, but it will frequently overestimate the energy unless it is also given an equation that reflects known physical constraints, such as conservation of energy. On its own, the model fails to capture such well-established physical principles. How can such rules be taught effectively, so that DNNs absorb the relevant knowledge rather than learning from the data alone?
In “Controlling Neural Networks with Rule Representations”, published at NeurIPS 2021, researchers present Deep Neural Networks with Controllable Rule Representations (DeepCTRL), an approach for providing rules to a model that is agnostic to data type and model architecture and can be applied to any kind of rule defined over inputs and outputs. DeepCTRL makes models follow rules more closely while also improving accuracy on downstream tasks, which improves model reliability and user trust. The main advantage of DeepCTRL is that it does not require retraining to adjust the rule strength: at inference, the user can change the strength of the rule to reach the desired operating point for accuracy. We also provide a novel input perturbation method that allows DeepCTRL to be applied to non-differentiable constraints. We illustrate the utility of DeepCTRL in teaching rules to deep learning models in real-world domains where rules are crucial, such as physics and healthcare. DeepCTRL also enables additional use cases, including hypothesis testing of rules on data samples and unsupervised adaptation based on rules shared between datasets.
The benefits of rule-based learning are:
- Rules can provide additional information for low-data scenarios, improving test accuracy.
- A major obstacle to wider use of DNNs is the lack of insight into the rationale behind their reasoning and into their inconsistencies. By reducing inconsistencies, rules can increase the reliability of DNNs and user trust in them.
- DNNs are sensitive to small input changes that are imperceptible to humans. Rules can reduce the impact of these changes because they further constrain the model's search space, which reduces underspecification.
Learning Jointly from Rules and Tasks
The conventional way to enforce rules is to include them in the calculation of the loss. This technique has three shortcomings that we want to address: (i) the strength of the rule must be fixed before training, so the trained model cannot operate flexibly depending on how well the data satisfies the rule; (ii) the strength of the rule cannot be adapted to the target data at inference if there is any mismatch with the training setup; and (iii) the rule-based objective must be differentiable with respect to the learnable parameters (to allow learning from labeled data).
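To make the first shortcoming concrete, here is a minimal sketch of the conventional approach, in which a rule penalty is added to the task loss with a weight chosen before training. The function names, the toy "energy must not increase" rule, and the weight `lam` are illustrative assumptions and do not come from the paper.

```python
def task_loss(pred, target):
    """Squared error between a prediction and its label."""
    return (pred - target) ** 2


def rule_penalty(energy_next, energy_now):
    """Hypothetical rule: predicted energy must not increase.

    Returns the size of the violation, or 0.0 if the rule holds.
    """
    return max(0.0, energy_next - energy_now)


def combined_loss(pred, target, energy_next, energy_now, lam):
    """Conventional objective: task loss plus a rule penalty.

    The rule weight `lam` is fixed before training, so changing it
    afterwards means training the model again.
    """
    return task_loss(pred, target) + lam * rule_penalty(energy_next, energy_now)


# Same prediction, two different rule strengths chosen up front.
loss_weak = combined_loss(1.2, 1.0, energy_next=5.5, energy_now=5.0, lam=0.1)
loss_strong = combined_loss(1.2, 1.0, energy_next=5.5, energy_now=5.0, lam=10.0)
print(loss_weak, loss_strong)
```

Because `lam` is baked into the objective, a model trained with one value cannot be switched to another operating point without retraining, which is the limitation DeepCTRL targets.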
DeepCTRL modifies canonical training by creating rule representations alongside data representations, which is the key to controlling the strength of rules at inference time. During training, these representations are stochastically concatenated into a single representation using a control parameter, denoted by α. Increasing the value of α increases the strength of the rule on the output decision. By modifying α during inference, users can adjust the behavior of the model to accommodate unseen data.
DeepCTRL pairs a data encoder with a rule encoder, producing two latent representations that are each coupled with a corresponding objective. The relative weight of each encoder is set by the control parameter α, which can be adjusted at inference.
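The combination step can be sketched as follows. This is a simplified illustration, not the authors' implementation: the two encoder functions are hypothetical stand-ins for learned networks, the control parameter is written `alpha`, and drawing `alpha` from a Beta distribution during training is an assumption made for this sketch.

```python
import random


def rule_encoder(x):
    """Hypothetical stand-in for a learned rule encoder."""
    return [0.5 * v for v in x]


def data_encoder(x):
    """Hypothetical stand-in for a learned data encoder."""
    return [v + 1.0 for v in x]


def combine(z_rule, z_data, alpha):
    """Scale each representation by its weight, then concatenate them."""
    return [alpha * v for v in z_rule] + [(1.0 - alpha) * v for v in z_data]


def sample_alpha(rng, beta=0.1):
    """Draw alpha in [0, 1]; the Beta(beta, beta) choice is an assumption."""
    return rng.betavariate(beta, beta)


def training_objective(loss_rule, loss_task, alpha):
    """Weight the rule and task objectives with the same alpha."""
    return alpha * loss_rule + (1.0 - alpha) * loss_task


rng = random.Random(0)
x = [1.0, 2.0]

# Training: alpha is random, so the decoder sees many rule strengths.
alpha_train = sample_alpha(rng)
z_train = combine(rule_encoder(x), data_encoder(x), alpha_train)

# Inference: the user picks alpha directly, with no retraining.
z_rule_only = combine(rule_encoder(x), data_encoder(x), alpha=1.0)
z_data_only = combine(rule_encoder(x), data_encoder(x), alpha=0.0)
print(z_rule_only, z_data_only)
```

Because the downstream decision block is exposed to many values of `alpha` during training, it can respond sensibly to whichever value the user supplies later.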
Using Input Perturbations to Embed Rules
Training with rule-based objectives requires those objectives to be differentiable with respect to the learnable parameters of the model. Many valuable rules, however, are non-differentiable with respect to the input. “A blood pressure of 140 or higher is associated with an increased risk of cardiovascular disease,” for example, is a rule that is hard to combine with conventional DNNs. We also present a novel input perturbation method that extends DeepCTRL to non-differentiable constraints: it adds small perturbations (random noise) to the input features and builds a rule-based constraint from whether the output moves in the desired direction.
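A minimal sketch of the perturbation idea, applied to the blood pressure example, might look like the following. The one-feature `toy_risk_model`, the perturbation range, and the hinge-style penalty are assumptions chosen for illustration; the source describes the method only at a high level.

```python
import math
import random


def toy_risk_model(bp, weight):
    """Hypothetical one-feature model: sigmoid of centered blood pressure."""
    return 1.0 / (1.0 + math.exp(-weight * (bp - 140.0) / 10.0))


def perturbation_rule_loss(model, bp_values, weight, rng, max_delta=5.0):
    """Average penalty for cases where raising blood pressure
    fails to raise the predicted risk."""
    total = 0.0
    for bp in bp_values:
        delta = rng.uniform(0.1, max_delta)   # small positive perturbation
        y = model(bp, weight)
        y_perturbed = model(bp + delta, weight)
        # The rule expects y_perturbed >= y; penalize the shortfall.
        total += max(0.0, y - y_perturbed)
    return total / len(bp_values)


rng = random.Random(0)
bps = [110.0, 135.0, 150.0, 170.0]

# A model whose risk rises with blood pressure satisfies the rule.
loss_consistent = perturbation_rule_loss(toy_risk_model, bps, weight=1.0, rng=rng)
# A model whose risk falls with blood pressure violates it.
loss_violating = perturbation_rule_loss(toy_risk_model, bps, weight=-1.0, rng=rng)
print(loss_consistent, loss_violating)
```

The penalty depends only on the model's outputs for the original and perturbed inputs, so it can be optimized with respect to the model parameters even though the rule itself is stated as a threshold on an input.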
We evaluate DeepCTRL on machine learning use cases in physics and healthcare, where rules are particularly important.
- Improved reliability based on the principles of physics:
We measure the reliability of a model with the verification ratio, which is the proportion of output samples that satisfy the rules. Operating at a higher verification ratio can be useful, especially when the rules are known to always hold, as in the natural sciences. By adjusting the control parameter α, a higher rule verification ratio, and therefore more reliable predictions, can be achieved.
- Adapting to distribution shifts in healthcare:
The strength of some rules may vary across the subsets of data to which they apply. For example, when predicting disease, the link between cardiovascular disease and high blood pressure is stronger in older people than in younger people. When the task is shared but the data distribution and the validity of rules differ between datasets, DeepCTRL can adapt to the distribution shift by adjusting the control parameter α.
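The verification ratio and the idea of choosing a rule strength per dataset can be sketched together. All numbers below are made-up placeholders, not results from the paper, and picking the rule strength by verification ratio alone is an assumption made here for illustration; in practice task accuracy would be weighed as well.

```python
def verification_ratio(outputs, satisfies_rule):
    """Fraction of output samples that satisfy the rule."""
    if not outputs:
        raise ValueError("outputs must be non-empty")
    return sum(1 for o in outputs if satisfies_rule(o)) / len(outputs)


def energy_not_increasing(sample):
    """Hypothetical physics rule on (energy_now, energy_next) pairs."""
    energy_now, energy_next = sample
    return energy_next <= energy_now


# Placeholder predictions at two rule strengths (illustrative only).
predictions_by_alpha = {
    0.0: [(5.0, 5.2), (4.0, 4.1), (3.0, 2.9), (6.0, 6.3)],
    1.0: [(5.0, 4.9), (4.0, 4.0), (3.0, 2.9), (6.0, 6.1)],
}

ratios = {
    alpha: verification_ratio(preds, energy_not_increasing)
    for alpha, preds in predictions_by_alpha.items()
}
# Pick the rule strength whose outputs satisfy the rule most often.
best_alpha = max(ratios, key=ratios.get)
print(ratios, best_alpha)
```

The same loop could be run separately on each target dataset, so that a dataset where the rule holds weakly ends up with a different rule strength than one where it holds strongly.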
Learning from rules can be essential for building interpretable, robust, and reliable DNNs. We propose DeepCTRL, a new method for embedding rules into DNNs that are learned from data. DeepCTRL allows the strength of rules to be controlled during inference without retraining. We also propose a novel perturbation-based rule encoding method for integrating arbitrary rules into meaningful representations. We demonstrate DeepCTRL in three use cases: improving reliability based on well-known principles, examining candidate rules, and domain adaptation based on rule strength.