When you are starting out with nonlinear methods, it helps to understand how these algorithms approach the problem.
The idea is that a nonlinear problem can often be converted to a linear one with some manipulation. The simplest option is to treat it as a linear problem anyway and accept an approximate solution. This works surprisingly often (images, audio and other complex data are the usual exceptions) and can give you a working solution, unless you are a data scientist chasing the best possible accuracy.
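To make the "just treat it as linear" idea concrete, here is a minimal sketch. The curve y = x² and the range 0 to 1 are assumptions picked purely for illustration; the point is only that a straight line can land reasonably close to a mildly curved relationship.

```python
# Fit a straight line y = slope * x + intercept to data from y = x**2
# using the closed-form least-squares formulas (no libraries needed).
xs = [i / 10 for i in range(11)]   # 0.0, 0.1, ..., 1.0
ys = [x ** 2 for x in xs]          # the "true" nonlinear relationship (illustrative)

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# slope = covariance(x, y) / variance(x)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
intercept = mean_y - slope * mean_x

# Worst-case gap between the line and the curve on the sample points.
max_error = max(abs(slope * x + intercept - y) for x, y in zip(xs, ys))
print(f"slope={slope:.2f} intercept={intercept:.2f} max_error={max_error:.2f}")
```

On this toy range the line stays within about 0.15 of the curve, which may or may not be good enough depending on what you need the prediction for.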
Otherwise, find a way to transform the input into an intermediate form that can be solved linearly. This is usually called a feature transformation (or feature mapping). For example, a problem with a nonlinear solution in Cartesian coordinates (x, y) could actually be a linear problem in polar coordinates (r, theta). If you are working with plain numbers and have domain expertise, you can sometimes find this transformation yourself, for example by squaring one of the inputs.
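Here is a small sketch of the Cartesian-to-polar example. The two rings (radius 1 and radius 3), the helper names `ring` and `predict`, and the threshold of 2 are all made up for illustration. In (x, y) no straight line separates an inner ring from the ring around it, but in polar coordinates a single cut on r does the job.

```python
import math

def to_polar(x, y):
    """Convert Cartesian (x, y) to polar (r, theta)."""
    return math.hypot(x, y), math.atan2(y, x)

def ring(radius, count=12):
    """Evenly spaced points on a circle of the given radius (toy data)."""
    return [
        (radius * math.cos(2 * math.pi * k / count),
         radius * math.sin(2 * math.pi * k / count))
        for k in range(count)
    ]

inner = ring(1.0)   # class 0
outer = ring(3.0)   # class 1

def predict(x, y, threshold=2.0):
    """A linear rule in polar space: compare r against a fixed threshold."""
    r, _theta = to_polar(x, y)
    return 1 if r > threshold else 0

inner_preds = [predict(x, y) for x, y in inner]
outer_preds = [predict(x, y) for x, y in outer]
print(inner_preds, outer_preds)
```

Squaring the inputs would work here too: x² + y² is just r², so a threshold on that sum is the same rule without the square root.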
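The function used to transform the data is loosely called a “kernel”, and the methods built around it are called kernel-based methods. (Strictly speaking, the kernel is the function that computes inner products in the transformed space, so the transformation itself never has to be written out.) Neural networks are successful in large part because they learn the transformation along with the prediction parameters.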
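As a rough illustration of how kernel methods use such a transformation, the sketch below compares two routes to the same number. The degree-2 polynomial kernel and the helper names (`feature_map`, `poly_kernel`) are choices made for this example only. One route transforms both points explicitly and takes their dot product; the other applies the kernel directly to the original points.

```python
import math

def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def feature_map(v):
    """Explicit degree-2 transformation of a 2-D point (illustrative)."""
    x1, x2 = v
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def poly_kernel(u, v):
    """Degree-2 polynomial kernel, computed in the original 2-D space."""
    return dot(u, v) ** 2

a = (1.0, 2.0)
b = (3.0, 4.0)

explicit = dot(feature_map(a), feature_map(b))  # transform first, then dot product
shortcut = poly_kernel(a, b)                    # never builds the 3-D vectors
print(explicit, shortcut)
```

Both routes agree up to floating-point rounding, which is why kernel methods can work in a richer space without ever constructing it.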
Sometimes a single transformation is not enough to capture the complexity of the problem, and you have to apply transformations again and again, one on top of another. This is what a multilayer neural network does.
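A tiny sketch of stacking transformations, using XOR, which no single linear rule on the raw inputs can compute. The weights below are hand-picked for illustration, not learned, and the function names are hypothetical. A real network would find such weights by training.

```python
def relu(z):
    """Rectified linear unit: the nonlinearity applied between layers."""
    return max(0.0, z)

def hidden_layer(x1, x2):
    """First transformation: two hand-picked linear units followed by ReLU."""
    return relu(x1 + x2), relu(x1 + x2 - 1.0)

def output_layer(h1, h2):
    """Second transformation: a plain linear combination of the hidden units."""
    return h1 - 2.0 * h2

def network(x1, x2):
    """Compose the two layers: output_layer(hidden_layer(x))."""
    return output_layer(*hidden_layer(x1, x2))

results = {(x1, x2): network(x1, x2) for x1 in (0, 1) for x2 in (0, 1)}
print(results)
```

The hidden layer reshapes the inputs so that the final step is linear again, which is the same trick as before, just applied in stages.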
Hope this is helpful as you are starting out.