Title: Evaluation and Comparison of Neural Network Architectures for Evaluating Linear and Non-Linear Mathematical Models
Abstract
This project implements and trains different types and configurations of neural network architectures on a range of mathematical equations and problems, followed by a comparative study of how the architectures respond to each set of problems. The problems range from simple, straightforward equations (e.g. [a + b = c]) to more complex ones (e.g. [∫ a² cos(a) da = b]). Neural networks play an extremely important role in today's landscape of AI and data science tools, so understanding the underlying capability of these algorithms under different architectures is equally important. This thesis aims to expose the underlying mathematical equations that drive these architectures, including Perceptrons, Feed-Forward neural networks, Multi-Layer Feed-Forward neural networks, Convolutional Neural Networks (CNN) and many more.
Additionally, this thesis explores the impact of different architectures on a neural network's ability to recognize and capture complex patterns and to generalize effectively. It also investigates how well these architectures adapt to various mathematical problems, while shedding light on their fitness for different applications. The outcomes of this research provide insight into the selection and design of neural network architectures based on the mathematical traits of the problem being solved. As the field of deep learning continues to advance, this thesis aims to guide researchers and scholars in making informed decisions about the choice of neural network architecture for a given problem, thereby advancing the state of the art in artificial intelligence and data science.
Aims
This project aims to implement and train different types and configurations of neural networks on mathematical equations and problems, ranging from simple, straightforward equations to more complex ones.
Some examples of mathematical equations:
- a+b
- a!
- ∫a²cos(a) da
- and many more.
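The example problems above can be turned into controlled synthetic training data. The following is a minimal sketch under stated assumptions, not the thesis's actual data pipeline: the names `make_dataset` and `antiderivative`, the sampling ranges, and the choice to set the constant of integration to zero are all hypothetical. For the integral it uses the closed form ∫a²cos(a) da = a²sin(a) + 2a cos(a) − 2 sin(a) + C, which can be checked by differentiating the right-hand side.

```python
import math
import random


def antiderivative(a):
    """Closed form of the integral of a^2 * cos(a), with the constant C set to 0 (assumption)."""
    return a * a * math.sin(a) + 2 * a * math.cos(a) - 2 * math.sin(a)


def make_dataset(problem, n_samples, seed=0):
    """Return a list of (inputs, target) pairs for one example problem.

    The problem names and sampling ranges are illustrative assumptions.
    """
    rng = random.Random(seed)  # seeded so the data set is reproducible
    data = []
    for _ in range(n_samples):
        if problem == "add":
            a, b = rng.uniform(-10, 10), rng.uniform(-10, 10)
            data.append(((a, b), a + b))
        elif problem == "factorial":
            a = rng.randint(0, 10)  # small integers keep the targets bounded
            data.append(((a,), float(math.factorial(a))))
        elif problem == "integral":
            a = rng.uniform(-math.pi, math.pi)
            data.append(((a,), antiderivative(a)))
        else:
            raise ValueError(f"unknown problem: {problem}")
    return data
```

Because the targets are computed exactly from the inputs, any error a trained network shows on such data comes from the model and its training rather than from label noise.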
Some examples of neural network architectures:
- Perceptron
- Feed Forward Neural networks
- Convolutional neural networks
- and many more.
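To make the simplest entry in the list above concrete, here is a hedged sketch of a single neuron trained on the a + b problem. It is not the thesis's implementation. The classic perceptron uses a step activation for classification; this sketch swaps in an identity (linear) output and squared-error gradient descent so that the neuron can regress a numeric target. The function names, learning rate, and epoch count are assumptions chosen for illustration.

```python
import random


def predict(weights, bias, x):
    """Output of a single neuron with identity activation."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def train_linear_neuron(samples, lr=0.01, epochs=200, seed=0):
    """Fit one linear neuron with per-sample gradient descent on squared error."""
    rng = random.Random(seed)
    n_inputs = len(samples[0][0])
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
    bias = 0.0
    for _ in range(epochs):
        for x, y in samples:
            err = predict(weights, bias, x) - y
            # Gradient of 0.5 * err**2 with respect to each weight is err * x_i.
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


# Synthetic a + b data with inputs drawn from [-1, 1] (illustrative range).
rng = random.Random(1)
train = []
for _ in range(200):
    a, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
    train.append(((a, b), a + b))

weights, bias = train_linear_neuron(train)
```

Since a + b is itself a linear function, the learned weights should approach 1 for both inputs and the bias should approach 0. Non-linear problems such as a! need hidden layers with non-linear activations, which is where the multi-layer architectures in the list come in.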
A comparative study then examines how the different architectures respond to these problems and which architectures are generally better suited to solving them.
During the evaluation of the various neural networks, we found that even fairly basic architectures, such as feed-forward neural networks and convolutional neural networks, produced very promising results when trained on controlled synthetic mathematical data.
On some of the mathematical problems we observed extremely high accuracy together with very low values on the error metrics.
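The text here does not say which accuracy or error measures were used. As a hedged illustration only, the sketch below computes three common choices for regression-style problems: mean absolute error, root mean squared error, and a tolerance-based accuracy. The function name `regression_report`, the `tol` parameter, and the relative tolerance rule are assumptions made for this example, not the thesis's definitions.

```python
import math


def regression_report(y_true, y_pred, tol=0.01):
    """Return MAE, RMSE, and the share of predictions within a relative tolerance.

    A prediction counts as correct when its absolute error is at most
    tol * max(1, |target|); this rule is an illustrative assumption.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    errors = [p - t for t, p in zip(y_true, y_pred)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    within = sum(
        1 for e, t in zip(errors, y_true) if abs(e) <= tol * max(1.0, abs(t))
    )
    return {"mae": mae, "rmse": rmse, "accuracy": within / n}
```

Reporting an error measure alongside accuracy matters for numeric targets, because a model can be within tolerance on most samples and still make a few very large mistakes that only MAE or RMSE reveal.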
In terms of speed, the perceptron was the fastest because it is the simplest model; among the multi-layer models, feed-forward neural networks were the fastest of all the architectures evaluated.
Interpretability, however, is a major challenge even for simple models such as the perceptron and the feed-forward neural network: explaining "how they reached the final solution" is extremely tedious, and it becomes harder still for more complex problems.
On the other hand, models that require sequential data, such as recurrent neural networks (RNNs) and long short-term memory (LSTM) based RNNs, displayed relatively low performance when tested on the controlled synthetic data sets. This suggests that these architectures might not be a good fit for mathematical problems.
It also suggests that solving even a basic mathematical problem might require more parallel computation than originally expected. The surprise standouts among these architectures on mathematical problems were convolutional neural networks. Although these models were not originally designed for solving mathematical problems, they still achieved good accuracy and low error on many problems.
These models might not be as fast as feed-forward neural networks, but they still displayed respectable performance on the various kinds of mathematical problems. Meanwhile, many advanced state-of-the-art neural network architectures displayed higher overall performance on the established generic data sets.
For example, models like ConvNeXt, VGG16, and LSTMs achieved extremely high accuracy on the image classification data set, while models like Transformers and LSTMs also displayed high accuracy on the text classification data set. Although these models displayed high overall performance, it quickly became apparent that they are very hard to train because of their highly complex design and large number of layers.
This thesis can be extended along several avenues. Some of them are as follows:
- New architectures: As the number of neural network architectures grows, the scope of this project grows with it.
- More complex problems: We can also include more complex mathematical problems in the evaluation of these models.
- Real-world data: This project would benefit greatly from the integration of real-world data, as this would allow these machine learning models to operate and be evaluated in real-world scenarios.
- XAI: This project could also benefit from the relatively new but impressive concept of explainable AI (XAI), as explainable AI methods make it much easier to interpret the decision making of these neural network models, removing the black-box aspect of these architectures.
- Heterogeneous architectures: Though not explored much in this thesis, we can also investigate the impact of heterogeneous architectures, as these may help machine learning models overcome their common shortcomings.
- Multi-modal architectures: We can also explore neural network architectures that are multi-modal in nature. Because such models can handle different kinds of data, they might also be very successful at handling mathematical problem data.