A Step by Step Guide for Data Scientists
In this step-by-step guide for data scientists, I am going to share the baby steps towards becoming a successful data scientist or machine learning developer.
Step 1: Basics of Maths
Before creation, God did just pure mathematics. Then he thought it would be a pleasant change to do some applied.
-John Edensor Littlewood
Data science is a game of numbers; you should be comfortable with the following maths concepts:
- Linear Algebra
- Calculus
- Statistics
Links:- https://www.khanacademy.org
https://ocw.mit.edu/courses/mathematics/18-06-linear-algebra-spring-2010/
Step 2: Choose programming language of your comfort
You can choose between Python, R, Java, and C/C++.
Links:- https://learnxinyminutes.com/
Step 3: Learn basic libraries
- NumPy
- Pandas
- Matplotlib, etc.
Links:- http://www.numpy.org/
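To give a feel for what these libraries do, here is a minimal sketch using NumPy and Pandas. The names, heights, and cities are made up purely for illustration; they are not from any real dataset.

```python
import numpy as np
import pandas as pd

# NumPy: fast, vectorised maths on arrays (illustrative values)
heights_cm = np.array([150.0, 160.0, 170.0, 180.0])
mean_height = heights_cm.mean()   # arithmetic mean of the array
heights_m = heights_cm / 100      # element-wise division, no loop needed

# Pandas: labelled, tabular data (hypothetical rows)
df = pd.DataFrame({
    "name": ["Asha", "Ben", "Chen", "Dara"],
    "height_cm": heights_cm,
    "city": ["Pune", "Oslo", "Pune", "Oslo"],
})
tall = df[df["height_cm"] > 155]                       # boolean filtering
avg_by_city = df.groupby("city")["height_cm"].mean()   # aggregate per group

print(mean_height)
print(tall["name"].tolist())
print(avg_by_city.to_dict())
```

Matplotlib can then be used to plot columns such as `height_cm` once you are comfortable with arrays and data frames.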
Step 4: Data Pre-processing
Data! Data! Data!
Data mostly comes in two types: structured (tabular) and unstructured (text, voice, image).
When developing a machine learning model, most of the effort goes into data pre-processing rather than into building the model itself: approximately 80% of the total.
Data pre-processing is basically cleaning the data and making it ready for the model. The basic steps include:
- Data Cleaning
- Data Integration
- Data Transformation
- Data Reduction
- Data Discretization
Links:-
https://towardsdatascience.com/data-pre-processing-techniques-you-should-know-8954662716d6
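The five steps listed above can be sketched in a few lines of Pandas. This is only one possible way to do each step, under simple assumptions: the `customers` and `regions` tables, their column names, and the age buckets are all hypothetical and invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical raw tables, invented for illustration
customers = pd.DataFrame({
    "customer_id": [1, 2, 2, 3, 4],
    "age": [25.0, np.nan, np.nan, 40.0, 58.0],
    "income": [30000.0, 52000.0, 52000.0, 61000.0, 87000.0],
})
regions = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "region": ["north", "south", "south", "east"],
})

# 1. Data cleaning: drop duplicate rows, fill missing ages with the median
clean = customers.drop_duplicates().copy()
clean["age"] = clean["age"].fillna(clean["age"].median())

# 2. Data integration: combine the two sources on a shared key
merged = clean.merge(regions, on="customer_id", how="left")

# 3. Data transformation: min-max scale income to the range [0, 1]
income = merged["income"]
merged["income_scaled"] = (income - income.min()) / (income.max() - income.min())

# 4. Data reduction: keep only the columns the model will use
reduced = merged[["age", "income_scaled", "region"]].copy()

# 5. Data discretization: bucket continuous age into labelled ranges
reduced["age_group"] = pd.cut(
    reduced["age"], bins=[0, 30, 50, 120], labels=["young", "middle", "senior"]
)

print(reduced)
```

In a real project each step usually needs more care (for example, choosing how to fill missing values depends on why they are missing), so treat this as a starting point rather than a recipe.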
Step 5: Machine Learning Models... A Big Weapon (scikit-learn)
There are two main types of learning:
1.) Supervised Learning
- Classification
- Regression
Models: Linear Regression, Multiple Linear Regression, Random Forest (an ensemble of CART decision trees), Support Vector Machine (SVM), Logistic Regression, Naive Bayes, etc.
2.) Unsupervised Learning
- Clustering
Models: K-Means Clustering
Links:- https://medium.com/
https://www.analyticsvidhya.com/
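To make the two types of learning concrete, here is a small sketch with scikit-learn. It assumes the built-in Iris dataset and picks Logistic Regression and K-Means as examples; the split size, `random_state`, and number of clusters are illustrative choices, not recommendations.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Built-in toy dataset: 150 flowers, 4 features, 3 classes
X, y = load_iris(return_X_y=True)

# Supervised learning (classification): learn from labelled examples
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)
accuracy = accuracy_score(y_test, clf.predict(X_test))
print(f"Logistic regression accuracy: {accuracy:.2f}")

# Unsupervised learning (clustering): group samples without using the labels
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)
print("Samples per cluster:", np.bincount(cluster_ids).tolist())
```

Note that the classifier is scored on held-out test data, while K-Means never sees `y` at all; that is the practical difference between the two settings.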
Step 6: Measuring Performance of the Model and Adjusting Hyperparameters
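As a rough sketch of what this step can look like in practice, the example below tunes a Random Forest with scikit-learn's `GridSearchCV` and then measures it on held-out data. The dataset (scikit-learn's built-in breast cancer data), the parameter grid, and the choice of F1 as the metric are all assumptions made for illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, f1_score
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Hypothetical search grid: every combination is tried with 3-fold cross-validation
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0), param_grid, cv=3, scoring="f1"
)
search.fit(X_train, y_train)

# Evaluate the best model on data it has never seen
y_pred = search.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
test_f1 = f1_score(y_test, y_pred)

print("Best hyperparameters:", search.best_params_)
print("Confusion matrix:\n", cm)
print(f"Test F1 score: {test_f1:.2f}")
```

The key idea is that hyperparameters are chosen using cross-validation on the training data only, and the test set is kept aside for the final measurement.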
In my next article, I will share steps for learning deep learning and natural language processing.
Links:-
https://towardsdatascience.com/metrics-to-evaluate-your-machine-learning-algorithm-f10ba6e38234