Data analysis is the process of inspecting, cleaning, transforming, and modeling raw data in order to discover useful information that can help with future decisions. It comprises many techniques and approaches for doing so.
Relation to machine learning
Now, many might say that data analysis is just another name for the data gathering, data preprocessing, or modeling step of machine learning. Well, that's not entirely wrong. In fact, some consider the two to be the same. Others describe data analysis as a broader field of study and treat it as the more practical term. On that view, where machine learning is a way of learning from processed data, data analysis is responsible for providing that processed data.
Well, I don't think you should worry about telling them apart. You can define data analysis in whichever way works for you. I use the two terms interchangeably myself.
Components of data analysis
- Gathering the data – This covers the sources of the data you'll be working on and the way you obtain it.
- Cleaning the data – This refers to filtering the data. The raw data you collect might have missing values, useless features, etc., which must be removed before you work on the data further. Basically, it means converting the data into a more usable format: the format required for the work that follows.
- Models and algorithms – Considered a part of data analytics, this step focuses on the advanced transformations and changes required to better interpret the raw data and discover patterns in it.
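To make the three components a bit more concrete, here is a minimal sketch in plain Python. Everything in it is made up for illustration: the `RAW_CSV` data, the column names, and the helper functions `gather`, `clean`, and `fit_line` are all hypothetical, and a simple least-squares line stands in for the "models and algorithms" step. Real projects would usually pull data from files or APIs and reach for a dedicated library instead.

```python
import csv
import io

# Hypothetical raw data; in practice this would come from a file, API, or database.
RAW_CSV = """id,hours_studied,exam_score,notes
1,2,55,
2,4,65,late
3,,70,
4,6,75,
5,8,,absent
6,10,95,
"""


def gather(raw_text):
    """Gathering: read the raw records into a list of dicts."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def clean(rows, features=("hours_studied", "exam_score")):
    """Cleaning: keep only the useful features and drop rows with missing values."""
    cleaned = []
    for row in rows:
        values = [row[name].strip() for name in features]
        if all(values):  # skip rows where any kept feature is empty
            cleaned.append(tuple(float(v) for v in values))
    return cleaned


def fit_line(points):
    """Modeling: ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


rows = gather(RAW_CSV)            # 1. gather the raw data
data = clean(rows)                # 2. clean it into a usable format
slope, intercept = fit_line(data) # 3. model it to find a pattern

print(f"{len(rows)} raw rows, {len(data)} clean rows")
print(f"exam_score ~ {slope:.2f} * hours_studied + {intercept:.2f}")
```

Here the cleaning step throws away the `id` and `notes` columns as useless features and drops the two rows with missing values, so the modeling step only ever sees complete numeric pairs. Dropping rows is just one possible choice; it keeps the sketch short, but it is an assumption of this example rather than a rule.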